How to: Use Code Components to Pull API Data into a Bragi Pipeline
Bragi pipelines support custom functionality when SQL alone isn't enough, whether that's scraping from an API, transforming with external logic, or integrating with a non-SQL system.
In this example, we'll use Bragi's code component to scrape public transport data from an API, combine it with local weather metrics from a CSV file, and prep the data for further modelling. The result is a staged table that can be used to explore possible links between rainfall, temperature, and bus usage.
Introduction
We'll walk through how to set up a code component to scrape data from data.gg, archive the results, load a static weather CSV file, and stage the two datasets together for analysis.
Tips for Writing Code Components in Bragi:
✏️ Use bragiCodeUtil.GetHttpClientFactory() to obtain a preconfigured, secure HTTP client (handles proxies, headers, and security best practices automatically).
✏️ Your code should return a DataTable, not raw JSON. Bragi will automatically store, track, and version the data with full observability, so you don't need to implement any custom wiring or persistence logic.
This code component will run consistently across environments (Dev → Test → Prod) and can be monitored like any other pipeline step.
1. Scraping Public Transport Data
The dataset we're working with is Guernsey's public dataset of monthly bus usage.
We'll use this as our external input and pull it into a structured SQL-compatible format (dbo.buses) so we can treat it like any other internal table.
| Source | URL | Format |
|---|---|---|
| Guernsey bus usage | https://data.gg/api/1.0/buses/usage | JSON |
The endpoint returns standard JSON.
We'll write a small Bragi code component in C# that:
Calls the API
Deserialises the response
Transforms the data into a DataTable with rows by year and columns for each month
Returns it to Bragi for archiving and staging
Let's look at the code. It's standard C# combined with some of Bragi's time-saving helper methods.
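As a minimal sketch of the transformation step, the C# below pivots a JSON payload into a DataTable with one row per year and one column per month. It rests on assumptions that are not confirmed by the API: the payload is taken to be a JSON array of objects with hypothetical `year`, `month` (1-12), and `passengers` fields, and the class and method names are illustrative only. The HTTP call is shown as a comment because it depends on Bragi's runtime; the return type of `GetHttpClientFactory()` is assumed to be an `IHttpClientFactory`.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.Text.Json;

public static class BusUsageTransform
{
    // Inside Bragi, the JSON would come from the preconfigured client, e.g.
    // (ASSUMPTION: GetHttpClientFactory() returns an IHttpClientFactory):
    //   var client = bragiCodeUtil.GetHttpClientFactory().CreateClient();
    //   string json = await client.GetStringAsync("https://data.gg/api/1.0/buses/usage");

    // ASSUMPTION: the payload is a JSON array of objects with hypothetical
    // "year", "month" (1-12) and "passengers" fields. Check the real response
    // and adjust the property names before using this.
    public static DataTable ToYearByMonthTable(string json)
    {
        var table = new DataTable("buses");
        DataColumn yearColumn = table.Columns.Add("Year", typeof(int));
        table.PrimaryKey = new[] { yearColumn };

        // One nullable integer column per month: Jan, Feb, ..., Dec.
        string[] monthNames = CultureInfo.InvariantCulture.DateTimeFormat.AbbreviatedMonthNames;
        for (int m = 0; m < 12; m++)
        {
            table.Columns.Add(monthNames[m], typeof(int));
        }

        using JsonDocument doc = JsonDocument.Parse(json);
        foreach (JsonElement item in doc.RootElement.EnumerateArray())
        {
            int year = item.GetProperty("year").GetInt32();
            int month = item.GetProperty("month").GetInt32();
            int passengers = item.GetProperty("passengers").GetInt32();

            // Find the row for this year, creating it the first time it is seen.
            var row = table.Rows.Find(year);
            if (row == null)
            {
                row = table.NewRow();
                row["Year"] = year;
                table.Rows.Add(row);
            }

            // Write the value into the column for that month.
            row[monthNames[month - 1]] = passengers;
        }

        return table;
    }
}
```

Under these assumptions, a payload holding two 2023 entries for months 1 and 2 produces a single row with Year = 2023, the Jan and Feb columns filled, and the remaining month columns left as DBNull.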
When the code is plugged into Bragi, Bragi automatically parses and compiles it, infers the shape of the data it returns, and prepares everything needed to bulk load that data into a load table.
2. Archiving the Scraped Data
Once the code component is built, we archive the output using a traditional archive with the following settings:
| Setting | Value | Description |
|---|---|---|
| Schema | dbo | Database schema where the archive is stored |
| Name | arc_buses | Archive name |
| Description | (optional) | Description for the archive |
| Type | Tracker Traditional | Archive type specifying change tracking method |
| Update Non-Change Tracked Column | true | Enables update of columns not tracked by change tracking |
| Include Expiry Logic | true | Enables record expiry logic in the archive |
| Intra-Day Update | enabled | Allows intra-day updates of the archived data |
We don't expect the API data to change retroactively, but enabling Update Non-Change Tracked Column catches corrections if it does. Including expiry logic helps manage the lifecycle of archived records, and intra-day updates allow the archive to be refreshed more than once within a day.
This gives us historical tracking and the ability to diff across time if needed.
3. Loading Weather Data
We're pairing the bus usage data with a simple weather.csv file containing just a few columns:
Year
Month
Rainfall (mm)
Temperature (°C)
To load it:
Head to Load Configs
Create a Text / Excel File Load
Upload the file or link to it via a file source
Bragi will auto-detect the headers and map the columns for you.
This creates a load config (e.g. load_weather) you can then archive like any other dataset.
4. Archiving the Weather Data
Just like with the buses, create a Traditional Archive using the following settings:
| Setting | Value |
|---|---|
| Schema | dbo |
| Name | arc_weather |
| Description | (optional) |
| Type | Tracker Traditional |
| Update Non-Change Tracked Column | true |
| Include Expiry Logic | true |
| Intra-Day Update | enabled |
Now both datasets are in the archive with proper versioning and lifecycle management.
5. Staging the Combined Dataset
With both sources in place, we can build a simple stage that joins the two together on Year and Month.
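One detail worth spelling out: the bus table is wide (one row per year, one column per month) while the weather file is long (one row per year and month), so the bus data has to be unpivoted before the two can be matched on Year and Month. The C# below is only an illustration of that join logic, not how a Bragi stage is defined. The column names (`Jan` to `Dec`, `Rainfall`, `Temperature`), the numeric 1-12 month encoding, and the `stg_bus_weather` name are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

public static class BusWeatherJoin
{
    // ASSUMPTIONS: 'buses' is wide (Year, Jan..Dec) and 'weather' is long with
    // hypothetical columns Year, Month (1-12), Rainfall, Temperature, holding
    // at most one row per (Year, Month).
    public static DataTable Stage(DataTable buses, DataTable weather)
    {
        CultureInfo inv = CultureInfo.InvariantCulture;
        string[] monthNames = inv.DateTimeFormat.AbbreviatedMonthNames;

        // Index the weather rows by the composite key (Year, Month).
        Dictionary<(int, int), DataRow> weatherByKey = weather.Rows
            .Cast<DataRow>()
            .ToDictionary(w => (Convert.ToInt32(w["Year"]), Convert.ToInt32(w["Month"])));

        var staged = new DataTable("stg_bus_weather");
        staged.Columns.Add("Year", typeof(int));
        staged.Columns.Add("Month", typeof(int));
        staged.Columns.Add("Passengers", typeof(int));
        staged.Columns.Add("Rainfall", typeof(double));
        staged.Columns.Add("Temperature", typeof(double));

        foreach (DataRow bus in buses.Rows)
        {
            int year = Convert.ToInt32(bus["Year"]);

            // Unpivot: visit each month column of the wide bus row in turn.
            for (int month = 1; month <= 12; month++)
            {
                string column = monthNames[month - 1];
                if (bus.IsNull(column)) continue; // no bus figure for this month

                // Inner join: keep only months that also have weather data.
                if (!weatherByKey.TryGetValue((year, month), out var w)) continue;

                staged.Rows.Add(
                    year,
                    month,
                    Convert.ToInt32(bus[column]),
                    Convert.ToDouble(w["Rainfall"], inv),
                    Convert.ToDouble(w["Temperature"], inv));
            }
        }

        return staged;
    }
}
```

The result has one row per year and month present in both inputs, which is the shape most correlation or forecasting work expects.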
You can now model seasonality, test correlations, or feed the result into forecasting logic!
6. Deploy to Test
Once the code, archives, and stages are configured, you can deploy into Test with one click. Once deployed, Bragi:
Compiles and packages the code
Binds the environment context
Makes it repeatable and traceable
If the API structure changes, Bragi will alert you to the failure, so you'll know exactly which version broke and can quickly roll out a fix.
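A code component can make such failures easier to diagnose by checking the payload's shape before transforming it. The sketch below is one possible approach, not part of Bragi itself: it assumes a JSON array of objects with hypothetical `year`, `month`, and `passengers` fields, and it assumes that an unhandled exception in the component surfaces as a failed pipeline step.

```csharp
using System;
using System.Text.Json;

public static class BusUsageSchemaCheck
{
    // ASSUMPTION: hypothetical field names; replace with the real ones.
    private static readonly string[] RequiredFields = { "year", "month", "passengers" };

    // Throws a descriptive exception if the payload no longer has the expected shape.
    public static void EnsureExpectedShape(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);

        if (doc.RootElement.ValueKind != JsonValueKind.Array)
        {
            throw new InvalidOperationException(
                $"Expected a JSON array but got {doc.RootElement.ValueKind}.");
        }

        int index = 0;
        foreach (JsonElement item in doc.RootElement.EnumerateArray())
        {
            foreach (string field in RequiredFields)
            {
                // Check the kind first: TryGetProperty throws on non-objects.
                if (item.ValueKind != JsonValueKind.Object || !item.TryGetProperty(field, out _))
                {
                    throw new InvalidOperationException(
                        $"Item {index} is missing required field '{field}'.");
                }
            }
            index++;
        }
    }
}
```

Calling `EnsureExpectedShape` at the top of the component means a renamed or dropped field fails with a message naming the field, rather than with a less specific error later in the transformation.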
Summary
This is a minimal example of how to use Bragi's custom functionality to pull in third-party data and make it part of a repeatable, governed pipeline.
The same pattern applies to:
Calling internal services or APIs
Referencing ML models or lookup logic
Performing data reshaping not suited to SQL
If something can output a DataTable, you can plug it into Bragi.