Development

Prerequisites

  • Docker + Docker Compose
  • uv — Python package manager

Local setup

# Clone and install
git clone https://github.com/18for0/eirgrid-downloader
cd eirgrid-downloader
uv sync           # installs all deps (including dev) into .venv

Start the full stack

make up           # starts MinIO + runs pipeline
make api          # starts FastAPI (in a second terminal)

The pipeline prints a summary on completion:

{
  "duration_seconds": 42.1,
  "chunks_uploaded": 148,
  "chunks_failed": 0,
  "errors": []
}
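
A wrapper script or CI step could read this summary to decide whether a run succeeded. The sketch below assumes the summary is available as a JSON string with exactly the fields shown above; the function name `run_succeeded` is hypothetical and not part of the project.

```python
import json


def run_succeeded(summary_json: str) -> bool:
    """Return True when the summary reports no failed chunks and no errors."""
    summary = json.loads(summary_json)
    return summary["chunks_failed"] == 0 and not summary["errors"]


# Example summary, copied from the output format shown above.
example = """
{
  "duration_seconds": 42.1,
  "chunks_uploaded": 148,
  "chunks_failed": 0,
  "errors": []
}
"""

print(run_succeeded(example))
```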

Makefile reference

Pipeline

Command                         Description
make up                         Build images + start MinIO + run pipeline
make minio                      Start MinIO only
make run                        Re-run pipeline without rebuilding
make run-source SOURCE=eirgrid  Run a single source
make logs                       Tail logs/pipeline.log
make shell                      Bash inside the pipeline container
make down                       Stop all containers
make clean                      Stop + wipe volumes and logs
make status                     Show container status

API service

Command           Description
make api          Start FastAPI on port 6502 (MinIO must be running)
make api-rebuild  Rebuild API image + start
make api-logs     Tail API logs

Local (no Docker)

Command          Description
make install     uv sync
make test        Run pytest
make lint        ruff check
make run-local   Run pipeline locally (requires S3_BUCKET in env)
make gap-check   Run gap-detection + backfill dispatch locally (write-only: no dispatch unless PIPELINE_LAMBDA_ARN is set)
make docs        Serve the MkDocs site on http://localhost:8000
make docs-build  Build the static docs site into site/

Running tests

Tests run entirely without infrastructure (S3 and HTTP are mocked):

uv run pytest            # full suite
uv run pytest -v         # verbose
uv run pytest tests/test_s3.py   # single file

Test files:

File                    What it tests
test_api.py             FastAPI endpoints (TestClient, cache mocked)
test_eirgrid_source.py  EirGrid fetch logic, HTTP mocked
test_gaps.py            Gap detection, backfill ledger, dispatch
test_healthcheck.py     Healthcheck probe + SNS alert path
test_orchestrator.py    run_pipeline with fake sources
test_python_version.py  Python 3.13 runtime + cross-config consistency
test_s3.py              S3Storage merge-on-write, dedup
test_summarise.py       _to_records, _dumps, _s3_pattern
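
To illustrate what "S3 is mocked" can look like in practice, here is a minimal sketch of a test that replaces the S3 client with a `unittest.mock.MagicMock`. The helper `upload_chunk` is hypothetical and exists only for this example; the project's real tests and storage class may be structured differently.

```python
from unittest.mock import MagicMock


def upload_chunk(client, bucket: str, key: str, body: bytes) -> str:
    """Hypothetical helper: write one object via an S3-style client, return its key."""
    client.put_object(Bucket=bucket, Key=key, Body=body)
    return key


def test_upload_chunk_calls_put_object() -> None:
    # MagicMock stands in for a boto3 S3 client, so no network or bucket is needed.
    fake_client = MagicMock()

    key = upload_chunk(fake_client, "my-bucket", "area=x/data.parquet", b"bytes")

    assert key == "area=x/data.parquet"
    fake_client.put_object.assert_called_once_with(
        Bucket="my-bucket", Key="area=x/data.parquet", Body=b"bytes"
    )
```

Because the client is injected as an argument, the same function works unchanged against a real client outside the test suite.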

Hot-deploying static files

FastAPI's StaticFiles reads from disk on each request, so you can update the dashboard without rebuilding the API image by copying the changed files into the running eirgrid_api container:

docker cp src/api/static/dashboard.js  eirgrid_api:/app/src/api/static/dashboard.js
docker cp src/api/static/dashboard.css eirgrid_api:/app/src/api/static/dashboard.css
docker cp src/api/static/index.html    eirgrid_api:/app/src/api/static/index.html
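
If the list of static files grows, the three commands can be generated instead of typed by hand. The following is a small sketch, assuming the container name and paths shown above; `docker_cp_commands` and `hot_deploy` are hypothetical helpers, not part of the repository.

```python
import subprocess
from pathlib import PurePosixPath

STATIC_DIR = PurePosixPath("src/api/static")
CONTAINER = "eirgrid_api"  # container name taken from the commands above
FILES = ["dashboard.js", "dashboard.css", "index.html"]


def docker_cp_commands(files: list[str] = FILES) -> list[list[str]]:
    """Build one `docker cp` argument list per static file."""
    return [
        ["docker", "cp", str(STATIC_DIR / name), f"{CONTAINER}:/app/{STATIC_DIR / name}"]
        for name in files
    ]


def hot_deploy() -> None:
    """Copy every static file into the running container; raises if a copy fails."""
    for cmd in docker_cp_commands():
        subprocess.run(cmd, check=True)


# Print the commands without running them; call hot_deploy() to execute.
for cmd in docker_cp_commands():
    print(" ".join(cmd))
```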

Dependency management

Dependencies are declared in pyproject.toml and pinned in uv.lock. Always commit uv.lock.

uv add <package>         # add a runtime dependency
uv add --group dev <pkg> # add a dev-only dependency
uv lock                  # regenerate uv.lock after manual edits
uv sync                  # install / sync from uv.lock

Adding a new data source

The pipeline is designed to accept multiple independent sources. To add one:

flowchart LR
    A["Create src/sources/<name>.py\nclass MyDataSource(BaseSource)"] --> B["Implement async fetch()\nyields DataChunk objects"]
    B --> C["Register in\nsrc/sources/__init__.py\nREGISTERED_SOURCES"]
    C --> D["Add export summary\nin summarise.py\n(optional)"]

1. Implement BaseSource

# src/sources/mydata.py
from typing import AsyncIterator
from sources.base import BaseSource, DataChunk, SourceConfig

class MyDataSource(BaseSource):
    source_id = "mydata"

    def __init__(self, config: SourceConfig) -> None:
        self.config = config

    async def fetch(self) -> AsyncIterator[DataChunk]:
        # ... fetch data and build a DataFrame named df ...
        yield DataChunk(
            source_id=self.source_id,
            s3_key_suffix="area=myarea/region=ROI/year=2026/month=03/data.parquet",
            df=df,
            metadata={"area": "myarea", "rows": len(df)},
        )

2. Register the source

# src/sources/__init__.py
from sources.eirgrid import EirGridSource
from sources.mydata import MyDataSource

REGISTERED_SOURCES = [EirGridSource, MyDataSource]

That's it. The orchestrator, storage, and retry logic require no changes.
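
The two steps can be tried end to end without the real project code. The sketch below uses stand-in versions of `BaseSource` and `DataChunk` (the real ones live in `sources.base` and may differ), replaces the DataFrame with a plain list of rows, and omits the `SourceConfig` constructor argument. `collect_chunks` is a hypothetical helper that only imitates how a consumer might iterate the registry; it is not the orchestrator.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator


@dataclass
class DataChunk:
    """Stand-in for sources.base.DataChunk (assumed shape)."""
    source_id: str
    s3_key_suffix: str
    df: Any
    metadata: dict[str, Any] = field(default_factory=dict)


class BaseSource:
    """Stand-in for sources.base.BaseSource (assumed shape)."""
    source_id: str = ""

    def fetch(self) -> AsyncIterator[DataChunk]:
        raise NotImplementedError


class MyDataSource(BaseSource):
    source_id = "mydata"

    async def fetch(self) -> AsyncIterator[DataChunk]:
        # A list of dicts stands in for the DataFrame a real source would build.
        rows = [{"ts": "2026-03-01T00:00:00Z", "value": 1.0}]
        yield DataChunk(
            source_id=self.source_id,
            s3_key_suffix="area=myarea/region=ROI/year=2026/month=03/data.parquet",
            df=rows,
            metadata={"area": "myarea", "rows": len(rows)},
        )


REGISTERED_SOURCES = [MyDataSource]


async def collect_chunks() -> list[DataChunk]:
    """Instantiate every registered source and drain its async generator."""
    chunks: list[DataChunk] = []
    for source_cls in REGISTERED_SOURCES:
        async for chunk in source_cls().fetch():
            chunks.append(chunk)
    return chunks


chunks = asyncio.run(collect_chunks())
print([(c.source_id, c.s3_key_suffix) for c in chunks])
```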

EirGrid API quirks

Quirk                             Details
fuelMix is snapshot-only          Always returns current values regardless of the requested date range. The pipeline stamps each fetch with the run timestamp.
frequency/NI returns 403          Only ROI is requested. Restricted in DASHBOARD_AREA_REGIONS.
co2Emission/NI is unreliable      Only ROI is requested.
demandActual/ALL returns 403      ALL demand is computed in the export layer as ROI + NI.
DashboardService 503s             Retried automatically on 5xx and network errors. 4xx gives up immediately.
Frequency is 5-second resolution  ~51k rows/month. Downsampled to hourly mean in DuckDB at export time.
vargen date cap                   The end timestamp is capped to yesterday to avoid empty responses for the current day.
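
Three of these rules are simple enough to express as pure functions. The sketch below is illustrative only: the function names are hypothetical, the date cap is assumed to work at whole-day granularity, and the ROI + NI sum is shown in plain Python rather than in the export layer where the project actually computes it.

```python
from datetime import date, timedelta
from typing import Optional


def should_retry(status_code: Optional[int]) -> bool:
    """Retry on 5xx and on network errors (modelled here as None); never on 4xx."""
    if status_code is None:
        return True
    return 500 <= status_code <= 599


def cap_end_date(end: date, today: date) -> date:
    """Clamp the requested end date so it is never later than yesterday."""
    return min(end, today - timedelta(days=1))


def all_demand(roi: dict[str, float], ni: dict[str, float]) -> dict[str, float]:
    """Compute ALL demand as ROI + NI for timestamps present in both series."""
    return {ts: roi[ts] + ni[ts] for ts in roi if ts in ni}


print(should_retry(503), should_retry(403))
print(cap_end_date(date(2026, 3, 10), today=date(2026, 3, 10)))
print(all_demand({"t1": 1.0, "t2": 2.0}, {"t1": 0.5}))
```

Dropping timestamps that are missing from either region is one possible choice; treating a missing value as zero would understate ALL demand silently.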

Docs

This documentation is built with MkDocs Material and served from the API Lambda image at /guide/ (see ADR-007). The site ships in the image — there is no separate deployment target.

make docs               # live preview at http://localhost:8000
make docs-build         # build the static site into site/

mkdocs-material lives in a dedicated docs dependency group (uv sync --group docs), which the Dockerfile.api build pulls in for the docs-build stage only. The runtime image never installs mkdocs.

The site/ directory is gitignored locally. The image build runs mkdocs build --strict in a separate stage and copies /build/site into /app/site/ — a broken internal link therefore fails the image build (and CI).