Mannlowe Ops - Airbyte CDK Development steps
By Gaurav Arora on March 1, 2023
Steps Airbyte:
1. $ cd airbyte-integrations/connector-templates/generator # assumes you are starting from the root of the Airbyte project.
2. $ ./generate.sh
3. cd ../../connectors/source-<name>
python -m venv .venv # Create a virtual environment in the .venv directory
source .venv/bin/activate # enable the venv
pip install -r requirements.txt
4. Let's verify everything is working as intended. Run:
python main.py spec
5. Run the source using Docker
# First build the container
docker build . -t airbyte/source-<name>:dev
# Then use the following command to run it
docker run --rm airbyte/source-<name>:dev spec
6. Create a spec.yaml file at source_<name>/spec.yaml that describes your connector's inputs according to the ConnectorSpecification schema.
7. Note that the user-supplied configuration passed to the connector has the values described in spec.yaml filled in. In other words, if spec.yaml says the source
requires a username and password, the config object might be { "username": "airbyte", "password": "password123" }.
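To make the link between spec.yaml and the config object concrete, here is a minimal sketch in plain Python. The SPEC dictionary is a hypothetical stand-in for what a parsed spec.yaml requiring a username and password might contain, and missing_required_fields is an illustrative helper, not part of the Airbyte CDK.

```python
import json

# Hypothetical parsed contents of a spec.yaml that requires a username and password.
SPEC = {
    "connectionSpecification": {
        "type": "object",
        "required": ["username", "password"],
        "properties": {
            "username": {"type": "string"},
            "password": {"type": "string", "airbyte_secret": True},
        },
    }
}


def missing_required_fields(config: dict, spec: dict) -> list:
    """Return the required spec fields that are absent from the config."""
    required = spec["connectionSpecification"].get("required", [])
    return [name for name in required if name not in config]


# The example config from the step above; in practice it lives in secrets/config.json.
config = json.loads('{ "username": "airbyte", "password": "password123" }')
print(missing_required_fields(config, SPEC))
```

An empty list means the config supplies every field the spec marks as required.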
8. In source.py we'll find the following autogenerated source:
class SourcePythonHttpTutorial(AbstractSource):
    def check_connection(self, logger, config) -> Tuple[bool, any]:
        ...  # replace with a real connectivity check
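As a rough illustration of what check_connection could do, the sketch below returns (True, None) on success and (False, <error message>) on failure. It is a standalone class so it runs without the CDK installed; the generated class extends AbstractSource instead. The class name and the username/password fields are assumptions carried over from the config example above.

```python
import logging
from typing import Any, Mapping, Optional, Tuple


class SourcePythonHttpTutorialSketch:
    """Standalone stand-in; the generated class extends AbstractSource instead."""

    def check_connection(
        self, logger: logging.Logger, config: Mapping[str, Any]
    ) -> Tuple[bool, Optional[str]]:
        # Assumption: the spec requires "username" and "password".
        for field in ("username", "password"):
            if not config.get(field):
                return False, f"Missing required config field: {field}"
        # A real connector would usually also make a cheap API request here.
        logger.info("Config looks complete")
        return True, None


source = SourcePythonHttpTutorialSketch()
ok, error = source.check_connection(
    logging.getLogger("airbyte"),
    {"username": "airbyte", "password": "password123"},
)
print(ok, error)
```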
9. Test the connection check against your config:
python main.py check --config secrets/config.json
10. The discover method of the Airbyte Protocol returns an AirbyteCatalog: an object which declares all the streams output by a connector and their schemas.
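For orientation, the shape of an AirbyteCatalog can be sketched as a plain Python dictionary. The stream name "customers" and its two fields are hypothetical; a real catalog is built by the CDK from the stream classes and schema files.

```python
import json

# Hypothetical stream named "customers" with a two-field schema.
catalog = {
    "streams": [
        {
            "name": "customers",
            "json_schema": {
                "$schema": "http://json-schema.org/draft-07/schema#",
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "email": {"type": "string"},
                },
            },
            "supported_sync_modes": ["full_refresh"],
        }
    ]
}

# discover emits a message of type CATALOG that wraps this object.
message = {"type": "CATALOG", "catalog": catalog}
print(json.dumps(message))
```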
11. Declaring streams and their schemas is straightforward with the Airbyte CDK. For each stream in our connector we'll need to:
Create a python class in source.py which extends HttpStream.
Place a <stream_name>.json file in the source_<name>/schemas/ directory. The name of the file should be the snake_case name of the stream whose schema it describes,
and its contents should be the JsonSchema describing the output from that stream.
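The naming convention can be sketched as follows. camel_to_snake below is an approximation written for illustration, not the CDK's own helper; the class name ExchangeRates, the directory source_example, and the schema are all hypothetical, and a temporary directory stands in for the connector root.

```python
import json
import re
import tempfile
from pathlib import Path


def camel_to_snake(name: str) -> str:
    """Approximate the class-name to stream-name conversion (illustrative only)."""
    step = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", step).lower()


def write_schema(schemas_dir: Path, stream_class_name: str, schema: dict) -> Path:
    """Write <stream_name>.json into the schemas directory and return its path."""
    schemas_dir.mkdir(parents=True, exist_ok=True)
    path = schemas_dir / f"{camel_to_snake(stream_class_name)}.json"
    path.write_text(json.dumps(schema, indent=2))
    return path


# Hypothetical stream class name and schema; a temp dir stands in for the connector root.
root = Path(tempfile.mkdtemp())
schema = {"type": "object", "properties": {"id": {"type": "integer"}}}
schema_path = write_schema(root / "source_example" / "schemas", "ExchangeRates", schema)
print(schema_path.name)
```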
12. With the .json schema file in place, let's see if the connector can now find this schema and produce a valid catalog:
python main.py discover --config secrets/config.json # this is not a mistake: the schema file is found via the snake_case naming convention specified above
13. Read Data
To do this, we'll need a ConfiguredCatalog. We've prepared one here -- download this and place it in sample_files/configured_catalog.json. Then run:
python main.py read --config secrets/config.json --catalog sample_files/configured_catalog.json
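If the prepared file is not at hand, a ConfiguredCatalog can be written by hand. The sketch below shows the general shape, assuming a hypothetical "customers" stream; the stream entry must match one that discover returns for your connector. A temporary directory stands in for the connector root.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stream; it must match a stream that discover returns.
stream = {
    "name": "customers",
    "json_schema": {"type": "object", "properties": {"id": {"type": "integer"}}},
    "supported_sync_modes": ["full_refresh"],
}

configured_catalog = {
    "streams": [
        {
            "stream": stream,
            "sync_mode": "full_refresh",
            "destination_sync_mode": "overwrite",
        }
    ]
}

# A temp dir stands in for the connector root here.
target = Path(tempfile.mkdtemp()) / "sample_files" / "configured_catalog.json"
target.parent.mkdir(parents=True)
target.write_text(json.dumps(configured_catalog, indent=2))
print(target.read_text())
```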
Docker commands for the source-airside-reports connector:
sudo docker build . -t airbyte/source-airside-reports:dev
sudo docker run --rm airbyte/source-airside-reports:dev spec
sudo docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-airside-reports:dev check --config /secrets/config.json
sudo docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-airside-reports:dev discover --config /secrets/config.json
sudo docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/sample_files:/sample_files airbyte/source-airside-reports:dev read --config /secrets/config.json --catalog /sample_files/configured_catalog.json
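The four run commands above differ only in their volume mounts and flags, so they can be generated from one helper. This is a sketch: docker_run_args is a hypothetical function, the root path is a placeholder for $(pwd), and sudo is left out (prepend it if your Docker setup requires it). The commands are printed rather than executed.

```python
import shlex
from pathlib import Path
from typing import List

IMAGE = "airbyte/source-airside-reports:dev"


def docker_run_args(command: str, root: Path, with_catalog: bool = False) -> List[str]:
    """Build the docker run argument list for spec, check, discover, or read."""
    args = ["docker", "run", "--rm"]
    if command != "spec":
        # Every command except spec needs the config mounted from ./secrets.
        args += ["-v", f"{root}/secrets:/secrets"]
    if with_catalog:
        args += ["-v", f"{root}/sample_files:/sample_files"]
    args += [IMAGE, command]
    if command != "spec":
        args += ["--config", "/secrets/config.json"]
    if with_catalog:
        args += ["--catalog", "/sample_files/configured_catalog.json"]
    return args


# Print the commands instead of running them; pass a list to subprocess.run to execute.
root = Path("/path/to/connector")  # placeholder for $(pwd)
for name in ("spec", "check", "discover"):
    print(shlex.join(docker_run_args(name, root)))
print(shlex.join(docker_run_args("read", root, with_catalog=True)))
```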
14. Add the connector to the API/UI
Open the following file: airbyte-config/init/src/main/resources/seed/source_definitions.yaml
Add an entry for your connector, following the pattern of the existing entries.
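A new entry needs its own unique ID. The sketch below generates a UUIDv4 and prints a candidate entry. The field names are an assumption based on the usual pattern in that file and the display name is hypothetical, so compare against the existing entries before pasting.

```python
import uuid

# Field names are assumed from the pattern of existing entries; verify against the file.
entry = {
    "name": "Airside Reports",  # hypothetical display name
    "sourceDefinitionId": str(uuid.uuid4()),  # a new, unique UUIDv4
    "dockerRepository": "airbyte/source-airside-reports",
    "dockerImageTag": "dev",
}

# Print as simple YAML-style lines for pasting into the list.
lines = [f"{key}: {value}" for key, value in entry.items()]
print("- " + "\n  ".join(lines))
```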