Airbyte and Meltano are both open-source data integration tools with cloud offerings. Which one is better, or a better fit for you? This is Jove, cofounder at Timeplus, a streaming database company (with a cloud offering). I’ve been using both for over a year. Not every day, not every week. Maybe 1–3 times a month, as an end user AND a connector developer. If you are as lazy as me, here is a guide on how to choose one over the other.
Let’s start with the cover image:
I do have some connections (I met their founders and teams in person at a few tech conferences), but this is not an endorsement or sponsored content. I just want to share, as a beginner user of both Airbyte and Meltano, what I like and what I struggle with.
You read that right. I am leaning more towards Meltano, even though they haven’t raised as much 💰 as Airbyte. Well, as a user and a potential paying customer, should I care about the VC 💰? Maybe a little (I don’t want the tool to be sunset after I have set everything up), but I shouldn’t care too much.
I don’t expect to write a 10-page report. Let’s go over those rows in my comparison table now.
Open Source License
Let’s face it. Not everyone is happy with Airbyte’s license, or even with how they organize the source code. The platform code is under the Elastic License v2, which basically means the source is available: you can use it and change it, but you cannot turn it into a SaaS and charge others. Everyone can contribute connectors for Airbyte, and the code will then be maintained by Airbyte, not the original developers. This is an interesting design. It is supposed to solve both the long-tail issue and the lack of maintenance for such a large number of connectors.
meltano/LICENSE at main · meltano/meltano
Meltano simply chooses MIT for everything.
Both of them can be deployed locally or on-prem, with Docker or k8s. If you don’t want to set them up and keep them updated, you can consider purchasing their cloud offerings.
Pricing is a big topic and there are just too many options. After a few iterations, Airbyte Cloud now accepts signups from everyone. If you only use non-GA connectors, it’s FREE. For example, the Timeplus destination connector is in alpha stage, so you can use it for free in Airbyte Cloud. But if you need to use a GA connector like HubSpot (as I do), you need to add a credit card and purchase some credits.
Pricing | Airbyte - Open-source data integration
Buy credits before you run any GA sync:
The minimum purchase is 20 credits, i.e. USD 50. This lets you sync roughly 1.3 million rows.
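To put those numbers in perspective, here is a quick back-of-the-envelope sketch. It only uses the three figures quoted above (20 credits, USD 50, ~1.3 million rows). The helper estimate_cost_usd is hypothetical and assumes cost scales linearly with row count, which may not match how Airbyte actually bills.

```python
# Figures quoted in this post; real pricing may have changed since.
MIN_CREDITS = 20           # smallest credit bundle you can buy
MIN_PURCHASE_USD = 50.0    # price of that bundle
APPROX_ROWS = 1_300_000    # rough number of rows that bundle syncs

usd_per_credit = MIN_PURCHASE_USD / MIN_CREDITS
rows_per_credit = APPROX_ROWS / MIN_CREDITS
usd_per_million_rows = MIN_PURCHASE_USD / APPROX_ROWS * 1_000_000


def estimate_cost_usd(rows: int) -> float:
    """Hypothetical estimate: linear in rows (an assumption),
    never less than the minimum bundle price."""
    linear_cost = rows / APPROX_ROWS * MIN_PURCHASE_USD
    return max(MIN_PURCHASE_USD, linear_cost)


if __name__ == "__main__":
    print(f"USD per credit:       {usd_per_credit:.2f}")
    print(f"Rows per credit:      {rows_per_credit:,.0f}")
    print(f"USD per million rows: {usd_per_million_rows:.2f}")
    print(f"Estimate for 5M rows: {estimate_cost_usd(5_000_000):.2f}")
```

So one credit is about USD 2.50, and a small pipeline still costs you the full minimum bundle up front.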
For Meltano, the pricing model is more complicated. Daily jobs and hourly jobs are charged differently.
BTW, there is a discount if you purchase Meltano Cloud credits while it’s still in beta.
Web UI, CLI and YAML
Clearly Airbyte’s web UI is way better than Meltano’s. And clearly Meltano’s CLI is way better than Airbyte’s.
At the very beginning, I was not used to the meltano-cloud commands to add plugins, set config, dry-run, and manage the cloud deployments and schedules. Later on, I came to enjoy using the CLI.
For Airbyte, my understanding is that most of the configuration needs to be done via the web UI. The octavia CLI is still in alpha phase.
I like the Infrastructure-as-Code (IaC) approach of Meltano a lot. You can define almost everything in the YAML file. In fact, in most cases the CLI just helps you update the YAML file, except for sensitive values such as API keys.
For Meltano Cloud, I just need to link the GitHub repo to the cloud account, make most of my changes in the YAML, and trigger some runs with
Quality vs Quantity
There are 350+ connectors maintained by the Airbyte team.
On hub.meltano.com the number is bigger.
So it seems that Meltano wins.
I don’t have strong proof, but from my limited data points, I think Airbyte connectors are generally of better quality. For example, one of my pipelines sends HubSpot data to my Timeplus workspace.
With Airbyte, I can get many properties from HubSpot objects, whether built-in or custom-defined.
On Meltano, the list is much shorter.
But still on the quality topic, I am not impressed by the Airbyte platform.
Maybe it’s just me, but more than 50% of the time when I try something with Airbyte, it fails. I was surprised by the variety of errors:
- The OOTB Docker Compose setup can be missing a file.
- The OOTB k8s Helm chart doesn’t really work.
It almost drove me to purchase credits on Airbyte Cloud. So maybe it’s by design, not a bug?
On the Meltano side, the documentation is a bit overwhelming, but I rarely hit issues when I try something new.
Both of them maintain a Slack community. I don’t want to list the number of members, since I don’t care. Actually, you will see a lot of issues reported in the Airbyte Slack, and most of them go unanswered. There is a new AI bot that tries to be helpful, but I have my doubts.
The Meltano community is smaller but much nicer/closer.
To me, it’s a culture thing, a founder thing.
(PS: I built a data connector for my company and submitted it to both Airbyte and Meltano. One took 6 months, and the other took 12 hours, to review it and list it in the catalog. I am sure you can guess which one is which.)
If you have read this far, hopefully you see why I like Meltano more now. Here is a text version of the table:
As I keep saying, choose the one that fits you, not the one that is supposedly the best in the world. So here is my guide for a lazy data engineer who doesn’t want to spend a lot of time and money looking into the details:
- If you are not comfortable using the command line or editing YAML, or are not that technical, just use Airbyte OSS or even the cloud.
- If you think a nice UI is optional and you are looking for a lot of parameters and pipeline tuning, choose Meltano.
- If you are building a new data source, or a new database or destination, build for both platforms. Essentially they use the same or a similar Singer SDK (I could be wrong, but 90% of my connector code for Airbyte and Meltano is the same).
Again, this is a personal tech blog. I don’t want to put our company-wide relationships with Airbyte/Meltano at risk. I personally enjoy the free options to sync lots of data to my Timeplus databases, such as HubSpot, GitHub, CSV, and databases, and build charts there.
BTW, check this if you are wondering why this is on my personal Medium, not the company blog.
July 26: changed references of “meltano” to “Meltano”