Crafting Connectors with Airbyte's No-Code Magic: A Tutorial for Building Custom Connectors
Introduction: Building Data Bridges with Airbyte Connectors
In today's data-driven world, seamless integration of diverse data sources has become a fundamental aspect of modern businesses. As organizations strive to gain actionable insights from their data, the need for efficient data integration tools has risen exponentially. This is where Airbyte comes into play.
1. What is Airbyte and Its Purpose: Airbyte is an open-source data integration platform designed to simplify the process of moving data from various sources to destinations. Whether you're dealing with databases, APIs, file systems, or other data origins, Airbyte provides a unified solution to extract, transform, and load (ETL) data with ease. By automating and streamlining these processes, Airbyte empowers businesses to focus on deriving insights rather than grappling with complex data plumbing.
2. The Role of Connectors in Data Integration: At the heart of Airbyte's functionality are connectors, which serve as the bridges connecting data sources to destinations. Connectors are integral components that enable seamless communication between diverse systems, ensuring data flows smoothly and accurately. These connectors abstract away the technical intricacies of working with various APIs, databases, and file formats, making the integration process accessible to users with varying technical backgrounds.
3. Understanding Key Terminology: Before we embark on our connector-building journey, let's clarify a few key terms that will be pivotal throughout this tutorial:
Connectors: These are modules responsible for fetching data from sources and pushing it to destinations.
Source: The origin of data, which can be an API, a database, or a file system.
Destination: The target location where data is transferred, such as a data warehouse.
Replication: The ongoing process of keeping the destination data up-to-date with changes from the source.
API: Application Programming Interface, a set of rules allowing different software applications to communicate with each other.
With a clear understanding of these terms, we're well-prepared to embark on our journey of building an Airbyte connector using the Todoist API as our guiding example. So, let's dive in and explore the exciting world of data integration and connector development!
Getting to Know the Connector Builder UI
Before we dive into the specifics of the new beta feature, it's essential to grasp the different ways you can build connectors in Airbyte. There are four primary methods:
Python CDK: This method uses Airbyte's Python Connector Development Kit (CDK) to create connectors. It's suitable for those who are well-versed in programming concepts and Python.
Java: Similar to the Python CDK, the Java method requires a strong foundation in Java programming.
Low Code: The low code approach is designed to simplify connector creation. With this method, you mainly deal with writing configurations rather than complex code. It's an excellent choice for those who want a more user-friendly experience.
Connector Builder UI: Airbyte's Connector Builder UI is a powerful tool for building connectors with minimal coding. It's designed for individuals who might not be proficient programmers but still want to harness the capabilities of connectors.
The Connector Builder UI is particularly noteworthy as it allows you to create connectors without the need to write YAML code. You can interact with input fields and buttons, combined with a basic understanding of your source API, to automatically generate the necessary YAML. This approach is incredibly valuable considering the vast landscape of APIs that need integration. Not everyone has an extensive programming background, and this builder serves as a bridge to enable a wider range of users to create connectors effortlessly.
The Beta Feature: Compatibility Guide and Questions
However, it's important to note that while this feature offers a no-code approach, not every API can be seamlessly integrated using it. To determine if your chosen source API is compatible with the Connector Builder UI, you'll need to answer a set of crucial questions. You can find these questions and the detailed compatibility guide here.
Answering the Compatibility Questions
Let's address these questions in the context of our reference Todoist API to understand if we can utilize the Connector Builder UI to create a no-code connector.
Is it an HTTP API returning a collection of records synchronously?
- Yes. The Todoist API answers plain HTTP requests and returns the requested records in the same response, so it is synchronous.
Are data endpoints fixed?
- Yes, the Todoist endpoints are fixed.
What type of authentication is required?
- The Todoist API supports Bearer token authentication, which is supported by the Connector Builder UI.
Is the data returned as JSON?
- The Todoist API returns data in JSON format, aligning with the requirements of the Connector Builder UI.
How are records paginated?
- Todoist has no pagination involved, which aligns with the requirements.
Are the required parameters of the integration key-value pairs?
- Yes, this also matches the Todoist API docs.
With these answers, we can determine whether the Todoist API is suitable for building a no-code connector using the Connector Builder UI. If it meets all the compatibility criteria, we can take advantage of this innovative feature to streamline our data integration process.
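One quick way to sanity-check the "JSON" and "collection of records" questions is to look at a sample response body. The sketch below is only an illustration and not part of Airbyte: the function name `looks_like_record_collection` and both sample payloads are hypothetical, and the check only covers the case where the records sit at the top level of the response.

```python
import json


def looks_like_record_collection(body: str) -> bool:
    """Return True if `body` is valid JSON whose top level is a list of objects."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all (for example an HTML error page).
        return False
    return isinstance(parsed, list) and all(isinstance(item, dict) for item in parsed)


# Hypothetical sample bodies, shaped like a "list projects" style response.
sample_ok = '[{"id": "1", "name": "Inbox"}, {"id": "2", "name": "Work"}]'
sample_html = "<html>not an API response</html>"

print(looks_like_record_collection(sample_ok))
print(looks_like_record_collection(sample_html))
```

If the records were nested under a key instead of sitting at the top level, the API could still be compatible; the builder's record selector handles that case.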
In the next section, we will set up an environment in which we can create connectors using the Connector Builder UI.
Getting Started: Setting Up Your Environment
To embark on your Airbyte connector-building journey, you have two avenues of access: the Airbyte Cloud or a self-hosted setup. Let's explore how to set up each option step by step.
1. Airbyte Cloud
The Airbyte Cloud offers a straightforward way to access the Connector Builder. Follow these steps to get started:
Sign up for an account to gain access to the Connector Builder and create new connectors seamlessly.
2. Self-Hosted Setup
If you prefer a self-hosted setup, Airbyte's open-source platform is at your disposal. Here's how you can set it up:
Step 1: Install Docker. Ensure you have Docker installed on your workstation. Refer to the Docker documentation for installation instructions. Confirm that you're using the latest version of docker-compose as well.
Step 2: Deploy Airbyte In your terminal, execute the following commands:
git clone https://github.com/airbytehq/airbyte.git
cd airbyte
./run-ab-platform.sh
Once you spot the Airbyte banner, the UI is ready to roll at http://localhost:8000. To access it, use the default credentials: username - airbyte, password - password.
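The platform can take a while to start, so if you would rather not watch the terminal for the banner, a small polling script can tell you when the UI begins answering. This is a minimal sketch under assumptions: the helper names `ui_is_up` and `wait_for_ui` are made up for illustration, and treating any HTTP status below 500 (including the login prompt) as "up" is a simplification, not an official health check.

```python
import time
import urllib.error
import urllib.request


def ui_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if something answers at `url` with a non-server-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        # A 401 login prompt still means the web server is running.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: not ready yet.
        return False


def wait_for_ui(url, attempts=30, delay=10.0, probe=ui_is_up, sleep=time.sleep):
    """Poll `url` up to `attempts` times, pausing `delay` seconds between tries."""
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False


# Example call (commented out so that importing this file does no network I/O):
# wait_for_ui("http://localhost:8000")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.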
Alternative Hosting with GitHub Codespaces
If you find yourself dealing with limited local resources, just like me, and you're fortunate enough to have access to GitHub Codespaces, here's a seamless way to host Airbyte:
Click on "New Codespace."
In the repository field, search for "airbytehq" and select the repository.
Keep the branch as "master" and choose a 4-core machine type for smoother performance.
Click "Create" to set up your Codespace.
Access your Codespace through a web browser or within Visual Studio Code.
Now open the terminal and run the command
./run-ab-platform.sh
Once a pop-up appears, click the "Open in Browser" button, then log in with the default credentials: username - airbyte, password - password.
Remember, Airbyte can be hosted in various environments. For more hosting options and details, refer to docs. With your environment ready, you're all set to dive into building connectors using Airbyte's powerful capabilities. Stay tuned for the upcoming sections where we'll delve deeper into creating connectors effortlessly!
Using the Connector Builder
Launch Airbyte: Upon opening Airbyte, you'll notice a toolbar logo button on the left-hand side. Give it a click to get started.
Choose Your Path: Once clicked, you'll be presented with two options: "Import a YAML Manifest" and "Start from Scratch." For our tutorial, where we're building a connector from the ground up, select the second option: "Start from Scratch."
Authorization
Begin by giving your connector a meaningful name. Replace the "Untitled" placeholder at the top with a name that reflects the purpose of your connector. For our illustration, let's name it "Todoist."
Base URL: Refer to your API documentation to find the base URL. For instance, in the Todoist REST API, the base URL is https://api.todoist.com/rest/v2/. Input this base URL into the designated field.
Authorization: Explore the API documentation for authorization techniques. In the Todoist example, you'd discover that the supported technique is a "Bearer token" placed in the header. From the dropdown menu, select the appropriate option. In this scenario, it's "Bearer."
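To make the two settings concrete, here is roughly what a request built from that base URL and a Bearer token looks like. This is a hedged sketch using only Python's standard library, not the code Airbyte runs internally; `build_request` is a hypothetical helper and `YOUR_API_TOKEN` is a placeholder.

```python
import urllib.request

BASE_URL = "https://api.todoist.com/rest/v2/"  # base URL used in this tutorial


def build_request(path: str, token: str) -> urllib.request.Request:
    """Build a GET request for `path` with the token in the Authorization header."""
    url = BASE_URL.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder token: replace it with your own before actually sending anything.
request = build_request("/projects", "YOUR_API_TOKEN")
print(request.full_url)
print(request.get_header("Authorization"))
```

Nothing is sent here; passing `request` to `urllib.request.urlopen` with a real token would perform the call.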
Adding Streams
You'll notice a glowing "+" icon asking you to add streams. These streams represent the endpoints within the API that provide the data you want to integrate.
In the realm of Airbyte, streams are categorized into two types:
Full Refresh Streams: These streams involve sending the entire dataset every time you sync data. While this method is straightforward, it can be resource-intensive. For more in-depth insights, refer here.
Incremental Streams: With this approach, only new or updated data is transmitted during synchronization. The process picks up from where it left off in the previous sync, using a cursor field as a reference point. For additional details, explore here.
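The difference between the two sync modes can be sketched in a few lines. This is a simplified model, not Airbyte's implementation: the `updated_at` cursor field and the sample records are hypothetical, and the cursor values are assumed to be ISO 8601 timestamps in one consistent format so that plain string comparison orders them correctly.

```python
from typing import Any, Dict, List, Optional, Tuple

Record = Dict[str, Any]


def full_refresh(source: List[Record]) -> List[Record]:
    """Full refresh: every sync emits the whole dataset."""
    return list(source)


def incremental(
    source: List[Record], cursor_field: str, state: Optional[str]
) -> Tuple[List[Record], Optional[str]]:
    """Incremental: emit only records newer than the saved cursor value."""
    if state is None:
        new_records = list(source)  # first sync, nothing saved yet
    else:
        new_records = [r for r in source if r[cursor_field] > state]
    # The new state is the highest cursor value seen, or the old state if nothing is new.
    new_state = max((r[cursor_field] for r in new_records), default=state)
    return new_records, new_state


# Hypothetical records with an assumed `updated_at` cursor field.
source = [
    {"id": "1", "updated_at": "2023-08-01T10:00:00Z"},
    {"id": "2", "updated_at": "2023-08-02T09:30:00Z"},
]
first_batch, state = incremental(source, "updated_at", None)

source.append({"id": "3", "updated_at": "2023-08-03T12:00:00Z"})
second_batch, state = incremental(source, "updated_at", state)

print(len(full_refresh(source)), len(first_batch), len(second_batch))
```

The second incremental sync emits only the record added after the first one, while a full refresh would send all three again.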
Now, let's put this into action with an example using the Todoist API. Here, we'll add two streams: "projects" and "tasks." First, let's identify the API endpoints: /projects and /tasks respectively.
Let's start by adding the "projects" stream. Simply click on the glowing "+" icon to begin the process. Then enter the stream name and the endpoint found in the API documentation.
Having successfully added the "projects" stream, it's time to delve deeper into the process. On the upper right-hand side of the page, you'll spot the "Testing values" button. Click on it and input the required fields. Then, hit "Test" to receive real-time responses. You can even edit the connector definition while testing. Notably, the builder automatically generates a schema based on the response, simplifying your workflow.
While examining the response, you might notice that the "id" serves as the primary key. To align with this, input "id" in the primary key field.
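To get a feel for what "generating a schema from the response" and "choosing a primary key" mean, here is a rough sketch. It is an assumption-laden simplification rather than the builder's actual algorithm: `infer_schema` and `is_valid_primary_key` are hypothetical helpers, and the sample records are invented for the example.

```python
from typing import Any, Dict, List

# Map Python types to JSON Schema type names.
JSON_TYPES = {
    str: "string",
    bool: "boolean",
    int: "integer",
    float: "number",
    list: "array",
    dict: "object",
    type(None): "null",
}


def infer_schema(records: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Collect, per field, the JSON types observed across all records."""
    seen: Dict[str, set] = {}
    for record in records:
        for field, value in record.items():
            seen.setdefault(field, set()).add(JSON_TYPES.get(type(value), "string"))
    properties = {field: {"type": sorted(types)} for field, types in seen.items()}
    return {"type": "object", "properties": properties}


def is_valid_primary_key(records: List[Dict[str, Any]], key: str) -> bool:
    """A usable primary key is present on every record and never repeats."""
    values = [record.get(key) for record in records]
    return None not in values and len(set(values)) == len(values)


# Invented sample records for illustration only.
records = [
    {"id": "101", "name": "Inbox", "is_favorite": False},
    {"id": "102", "name": "Work", "is_favorite": True},
]
schema = infer_schema(records)
print(schema["properties"]["id"])
print(is_valid_primary_key(records, "id"))
```

A field such as `is_favorite` would fail the primary key check as soon as two records share the same value, which is why "id" is the natural choice here.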
Now, let's proceed to add the "tasks" stream, similar to how we added the "projects" stream.
Explore the stream page options:
For each option, hover over the "i" symbol for a quick overview and click the book logo for in-depth documentation. This will empower you to fine-tune your connector's behaviour to suit your needs.
Record Selector: Use the "Record selector" field to specify the response object's property holding the record. More info here.
Primary Key: Specifies a unique record identifier. More info here.
Query Parameters: Customize query parameters for specific data.
Pagination: Set up pagination handling for your connector. More info here.
Incremental Sync: Configure fetching data incrementally based on a time field. More info here.
Partitioning: Define how to partition and iterate over stream data. More info here.
Error Handler: Determine error handling strategies. Default includes retries for server errors. More info here.
Transformations: Apply transformations (add or remove fields) to output records. More info here.
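Two of these options, the record selector and transformations, are easy to picture as small functions. The following is an illustrative sketch only: the helper names, the nested response shape, and the field names are hypothetical, and the real builder expresses the same ideas as configuration rather than Python.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def select_records(response: Any, field_path: List[str]) -> List[Record]:
    """Follow `field_path` into the response and return the records found there."""
    node = response
    for key in field_path:
        node = node[key]
    # A single object is wrapped so callers always receive a list.
    return node if isinstance(node, list) else [node]


def add_field(record: Record, name: str, value: Any) -> Record:
    """Return a copy of the record with one extra field."""
    return {**record, name: value}


def remove_field(record: Record, name: str) -> Record:
    """Return a copy of the record without the given field."""
    return {key: value for key, value in record.items() if key != name}


# Hypothetical response where the records are nested under data.items.
response = {"data": {"items": [{"id": 1, "internal_note": "x"}, {"id": 2, "internal_note": "y"}]}}

records = select_records(response, ["data", "items"])
cleaned = [remove_field(add_field(r, "source", "todoist"), "internal_note") for r in records]
print(cleaned)
```

With an empty path, `select_records` treats the whole response as the record collection, which matches APIs that return a top-level array.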
Using the Connector
Let's put your connector to the test. At the workspace's bottom-left corner, you'll find two buttons: "Export YAML" and "Publish to Workspace."
Export YAML: Clicking "Export YAML" will download a YAML file containing all your connector's configurations. You can even access the YAML file within the workspace by clicking the "YAML" button at the top, beside the UI. This autogenerated YAML code is truly remarkable!
Publish to Workspace: The "Publish to Workspace" button does exactly what it says. It publishes your custom connector to your workspace, making it accessible for your use. Here's how to proceed:
Click "Publish to Workspace."
Navigate to the "Sources" tab.
Search for the custom source you've created.
Open it and provide the necessary fields.
Voila! Your source is now ready to roll. You can use it seamlessly with your preferred destination. The ability to test and publish your connector within the Airbyte workspace ensures a smooth transition from creation to integration.
Contributing Your Connector
For open-source enthusiasts looking to contribute a new connector to Airbyte, here's a step-by-step guide:
Begin by launching your terminal or command prompt.
Navigate to the root directory of the Airbyte project. This is usually the location where you cloned the Airbyte repository from GitHub.
Then run these commands:
cd airbyte-integrations/connector-templates/generator
./generate.sh
After successful execution, a list of options will appear. Use the up and down arrow keys to navigate. In this case, select "Configuration Based Source."
Next, provide a name for your source. For instance, let's use "Todoist."
Your template is now ready. Navigate to it with this command:
cd airbyte-integrations/connectors/source-todoist
Here is a directory structure of the connector folder
│   .dockerignore
│   build.gradle
│   Dockerfile
│   icon.svg
│   main.py
│   metadata.yaml
│   README.md
│   requirements.txt
│   setup.py
│
├───integration_tests
│       acceptance.py
│       configured_catalog.json
│       invalid_config.json
│       sample_config.json
│       __init__.py
│
├───secrets
│       config.json
│
└───source_todoist
    │   manifest.yaml
    │   source.py
    │   __init__.py
    │
    └───schema
            customers.json
            employees.json
            TODO.md
Copy the content from your todoist.yaml file. Paste this content into the manifest.yaml file. Since the schema information is now present in the manifest.yaml file, you can safely delete the entire schema folder.
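If you prefer to script this copy-and-clean-up step, it could look something like the sketch below. Treat the paths as assumptions: `install_manifest` is a hypothetical helper, it expects the YAML file you exported from the builder, and it assumes the package folder is named `source_todoist` with a `schema` subfolder as in the listing above.

```python
import shutil
from pathlib import Path


def install_manifest(
    exported_yaml: Path, connector_dir: Path, package: str = "source_todoist"
) -> Path:
    """Overwrite manifest.yaml with the exported YAML and drop the schema folder."""
    package_dir = connector_dir / package
    manifest = package_dir / "manifest.yaml"

    # Replace the template manifest with the one generated by the builder.
    shutil.copyfile(exported_yaml, manifest)

    # The schemas now live inside the manifest, so the template folder is redundant.
    shutil.rmtree(package_dir / "schema", ignore_errors=True)
    return manifest


# Example call, run from the Airbyte repository root (paths are assumptions):
# install_manifest(
#     Path("todoist.yaml"),
#     Path("airbyte-integrations/connectors/source-todoist"),
# )
```

The call is left commented out because it modifies files; uncomment it once the paths match your checkout.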
Testing the Connector
Testing your source connector is an essential phase to ensure its proper functioning, reliability, and compatibility with Airbyte.
Building the Docker Image of the Source:
Execute the following commands to build the Docker image of your source connector:
docker build . -t airbyte/source-todoist:dev
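The image tag should match the name of the connector you generated, which is easy to get wrong when copying commands between connectors. As a small, hedged convenience, the sketch below derives the build command from the connector name; `docker_build_command` and `build_image` are hypothetical helpers, and the `airbyte/source-<name>:dev` pattern simply follows the command used in this tutorial.

```python
import subprocess
from typing import List


def docker_build_command(connector_name: str, tag: str = "dev") -> List[str]:
    """Compose the docker build command for a source connector."""
    return ["docker", "build", ".", "-t", f"airbyte/source-{connector_name}:{tag}"]


def build_image(connector_dir: str, connector_name: str) -> None:
    """Run the build inside the connector folder; raises if docker fails."""
    subprocess.run(docker_build_command(connector_name), cwd=connector_dir, check=True)


# Printing the command is safe; calling build_image() requires Docker to be installed.
print(" ".join(docker_build_command("todoist")))
```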
Running the Acceptance Test:
Use the following command to run the acceptance test:
python -m pytest integration_tests -p integration_tests.acceptance
Before running the acceptance tests, ensure that you modify the acceptance-test-config.yaml file to align with your source configuration.
Alternatively, you can opt for Docker to run the acceptance test:
./acceptance-test-docker.sh
Once the tests pass, you can create a pull request following the conventions used by Airbyte, which you can read here.
Conclusion
Imagine having a specific API you want to integrate, replicate data from, or simply connect to, but it's missing from Airbyte's connector catalog. You might think you need to wait for a contributor to pick up the task, or that you have to possess strong programming skills to craft a new connector. Well, think again!
Enter Airbyte's Connector Builder tool, designed to empower you to create and utilize your own connectors without requiring in-depth coding knowledge. The process is remarkably user-friendly, even allowing you to contribute your creation to Airbyte's connector catalog, benefiting fellow users. This also opens doors for aspiring contributors to engage in the open-source community without feeling limited by programming expertise.
The Airbyte Builder UI truly simplifies the connector creation process, making it accessible to everyone. Don't hesitate any longer: start crafting your connectors, enrich the connector catalog, and contribute to Airbyte's remarkable open-source ecosystem. It's time to harness the power of data connectivity, all at your fingertips.
Great Resources to Explore
In this section, I will provide some references that can help you while creating connectors:
Airbyte Docs: The official Airbyte documentation is your ultimate guide, offering comprehensive insights into all aspects of connector creation.
Connector Builder Docs: Delve deeper into the Connector Builder UI with its dedicated documentation, offering an in-depth understanding of its functionalities.
YAML Reference: Understand the manifest's components and properties as they relate to your connector.
Advanced YAML: Explore the potential of YAML with advanced features like $parameters, references, string interpolation, and custom components. These elements can supercharge your connector.
Slack: Don't hesitate to turn to the Airbyte Slack community for support. It's a hub of knowledge where you can ask questions, clear doubts, and gain insights from fellow creators.
Written by Parthiv Makwana