Week 33, 2019 - AWS Lake Formation; Aurora Multi-Master; GitHub Actions Update

By (4 minutes read)

AWS released Lake Formation and support for multiple masters in Aurora, while GitHub announced updates about GitHub Actions.

AWS Lake Formation

AWS is still rolling out features from re:Invent, this time with Lake Formation. If the name makes you think of CloudFormation, I suspect that’s not an accident. Like CloudFormation, Lake Formation is a tool to allow you to more easily set up and configure what you need for your data lake. In case you’re not familiar with the term, a data lake is a single place where you collect multiple sources of data for easy access. Usually, this means a LOT of data that isn’t as easily structured and combined.

While I haven’t had a chance to play with it myself1, my understanding is that Lake Formation eases the set up of that central repository. It also allows you to manage the data sources from a single place and offers fine-grained controls over who can see the data.

Aurora Multi-Master

Another one from re:Invent, but not the last one, is Aurora Multi-Master. This was announced in 20172, but it has now finally arrived. It does exactly what you expect it too. Instead of having a single instance that can write, you can now spread out the writes over multiple instances. This helps both with uptime (no need for failing over if another instance can take it over) and having a write instance in each AZ means your application doesn’t need to talk to a different AZ. This blogpost goes a bit deeper into how it works as well.

All great, but please take note that this is still very limited. As usual, it’s only available in several regions, but multi-master is also only available for the MySQL 5.6 compatible clusters. So, not for MySQL 5.73 or Postgres.

GitHub Actions Update

GitHub has been silent regarding GitHub Actions for a while, but last week they had a special event where they announced updates about GitHub Actions. In case you don’t remember what I’m talking about, GitHub Actions is GitHub’s built-in automation system. It was announced last year in beta, and as I got access pretty early I wrote a review of it. Tl;dr: I liked how it worked. It wasn’t perfect, of course. Moreover, as I continued using it, there were some issues I didn’t raise in my original review such as a lack of notifications on failures.

In last week’s announcement however, they changed a lot about how it works and how you configure it. First, there are two types of actions: those running in Docker and those written in Javascript. The biggest advantage of the Javascript solution is that it works in other runtimes than Linux. You can use both macOS and Windows as runtimes for your actions.

There are also significant changes to the syntax of the workflows. For one, you can now define them in separate files4 , and the language changed to YAML instead of HCL. While more readable, this new syntax has some limitations as well. A major one is that you can no longer define parallel actions within a single workflow, but instead need to define parallel workflows. I didn’t really use that much, so it’s not a real concern but will require some rewriting of the workflows.

Which brings me to the last item about the changes, if you’re already in the beta, you will need to rewrite your actions. There is a migration tool that does this for you, but I’ve found a couple of issues with my migrated workflows so far that I’m still working through. I’ll save all of that for a separate post though.

More importantly for most people is that they announced both pricing and availability. If you can’t get into the beta, you will have to wait until the next GitHub Universe conference in November for it to become generally available. As for the pricing the news is pretty good. For public repos it will be free to use, while private repos get 2000 minutes a month for free in the free tier and more than that for the various paid levels. In addition you can pay for additional minutes5.

  1. I don’t tend to generate enough data by myself to make a data lake useful. [return]
  2. I almost gave up on it ever happening. [return]
  3. I honestly don’t get why the more modern version constantly seems to be an afterthought. [return]
  4. Which was really missing in the original. [return]
  5. Pricing differs per runtime, with macOS 10 times the price of Linux. [return]
comments powered by Disqus