Upstream Data Sources
All grid data served by this project comes from three public EirGrid "Smart Grid Dashboard" backends. This page documents every HTTP call the pipeline makes, the parameters we send, the shape of the response rows we store, and how those raw rows flow through Parquet into the JSON summaries served by the API.
API backends at a glance
| Backend | Base URL | Areas covered | Date format |
|---|---|---|---|
| vargen | https://www.vargen.smartgriddashboard.com/api/export | WIND, SOLAR | YYYYMMDDHHII (path segment) |
| interconn | https://www.interconn.smartgriddashboard.com/api/Interconnector | interconnector flows (EWIC, Greenlink, Moyle, Net) | YYYYMMDDHHII (path segment) |
| dashboard | https://www.smartgriddashboard.com/DashboardService.svc/data | demandActual, fuelMix, co2Emission, frequency, SnspAll | DD-Mon-YYYY (query string) |
The dashboard backend is known to be flaky: 5xx responses are common. The pipeline retries only on transient errors (HTTP 5xx or network transport errors); 4xx responses are treated as terminal. See _is_retryable for the exact rule.
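The retry rule can be sketched as a small predicate. This is an illustration only: the real _is_retryable lives in the pipeline source and may inspect library-specific exception types, so the function name, signature, and the built-in exception classes used below are assumptions.

```python
from typing import Optional


# Hypothetical stand-in for the pipeline's _is_retryable; the real signature may differ.
def is_retryable(status_code: Optional[int] = None,
                 error: Optional[BaseException] = None) -> bool:
    """Return True only for transient failures that are worth retrying."""
    if error is not None:
        # Network transport problems: resets, refused connections, timeouts.
        return isinstance(error, (ConnectionError, TimeoutError))
    if status_code is not None:
        # 5xx is treated as transient; 4xx and everything else is terminal.
        return 500 <= status_code <= 599
    return False
```

Under this sketch a 503 from the dashboard backend is retried, while a 403 is surfaced immediately.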
Supported (area, region) combinations on the dashboard backend are declared in DASHBOARD_AREA_REGIONS:
| Area | Allowed regions | Notes |
|---|---|---|
| demandActual | ROI, NI | ALL is derived in export from ROI + NI |
| fuelMix | ROI | Snapshot-only; ignores date range |
| co2Emission | ROI | ALL/NI return 403 or null |
| frequency | ROI | Fetched in weekly chunks; monthly requests time out |
| SnspAll | ALL | Only meaningful for the whole island |
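For illustration, the table above could be declared as a plain mapping and expanded into request pairs. The exact type and contents of DASHBOARD_AREA_REGIONS in the source are not shown on this page, so the structure below is an assumption transcribed from the table, and the two helper functions are hypothetical.

```python
# Assumed shape of DASHBOARD_AREA_REGIONS, transcribed from the table above.
DASHBOARD_AREA_REGIONS: dict[str, tuple[str, ...]] = {
    "demandActual": ("ROI", "NI"),
    "fuelMix": ("ROI",),
    "co2Emission": ("ROI",),
    "frequency": ("ROI",),
    "SnspAll": ("ALL",),
}


def dashboard_requests() -> list[tuple[str, str]]:
    """Expand the mapping into the (area, region) pairs that may be requested."""
    return [
        (area, region)
        for area, regions in DASHBOARD_AREA_REGIONS.items()
        for region in regions
    ]


def is_supported(area: str, region: str) -> bool:
    """Check a pair against the mapping; unknown areas are unsupported."""
    return region in DASHBOARD_AREA_REGIONS.get(area, ())
```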
Fetches for the vargen backend cover all three regions (ROI, NI, ALL).
1. vargen: wind and solar generation
Endpoint
Path / query parameters
| Parameter | Source | Example | Notes |
|---|---|---|---|
| start | path | 202603010000 | YYYYMMDDHHII; always 0000 for the start of day |
| end | path | 202603312359 | YYYYMMDDHHII; capped at yesterday 23:59 to avoid empty responses |
| region | path | ROI, NI, ALL | Fetched for all three |
| area | path | WIND, SOLAR | One call per area + region pair |
| FORMAT | query | JSON | Always JSON |
The pipeline issues 6 calls per calendar month (2 areas × 3 regions) and iterates across months from date.today() - DAYS_BACK to date.today().
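A minimal sketch of how the monthly windows and their path timestamps might be derived. It assumes that windows are aligned to calendar months and that the final window is capped at yesterday 23:59, as the parameter notes describe; month_windows and its arguments are hypothetical names, not taken from src/sources/eirgrid.py.

```python
from datetime import date, datetime, time, timedelta


def month_windows(today: date, days_back: int) -> list[tuple[str, str]]:
    """Return (start, end) pairs in YYYYMMDDHHII form, one per calendar month."""
    first_day = today - timedelta(days=days_back)
    last_day = today - timedelta(days=1)  # cap at yesterday
    windows = []
    cursor = first_day.replace(day=1)  # align to the start of the month
    while cursor <= last_day:
        # Day 28 plus four days always lands in the following month.
        next_month = (cursor.replace(day=28) + timedelta(days=4)).replace(day=1)
        month_end = min(next_month - timedelta(days=1), last_day)
        start = datetime.combine(cursor, time(0, 0))
        end = datetime.combine(month_end, time(23, 59))
        windows.append((start.strftime("%Y%m%d%H%M"), end.strftime("%Y%m%d%H%M")))
        cursor = next_month
    return windows


AREAS = ("WIND", "SOLAR")
REGIONS = ("ROI", "NI", "ALL")


def vargen_call_count(today: date, days_back: int) -> int:
    """Six calls (2 areas x 3 regions) for every monthly window."""
    return len(AREAS) * len(REGIONS) * len(month_windows(today, days_back))
```

With a 31-day lookback the range almost always straddles a month boundary, which is why a run typically covers two windows.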
Response row shape
{
  "Rows": [
    {
      "EffectiveTime": "01-Mar-2026 00:00",
      "FieldName": "WIND_FCAST",
      "Value": 615.2,
      "Region": "ROI",
      "Effective_Date": "01-Mar-2026 00:00"
    }
  ]
}
FieldName distinguishes actual vs forecast measurements (e.g. WIND_ACTUAL / WIND_FCAST). The summarise.py export keys output by FieldName. Value is megawatts at 15-minute resolution.
Mapping
| Stage | Path |
|---|---|
| Parquet key | grid-data/eirgrid/area=wind/region={ROI\|NI\|ALL}/year={YYYY}/month={MM}/data.parquet (same pattern for area=solar) |
| JSON summary | grid-data/summary/wind/latest.json, grid-data/summary/solar/latest.json |
| API endpoint | GET /api/wind, GET /api/solar |
2. interconn: cross-border interconnector flows
Endpoint
GET https://www.interconn.smartgriddashboard.com/api/Interconnector/{start}/{end}/{region}?format=json
Path / query parameters
| Parameter | Source | Example | Notes |
|---|---|---|---|
| start | path | 202603010000 | Same YYYYMMDDHHII format as vargen |
| end | path | 202603312359 | Capped at yesterday 23:59 |
| region | path | ALL | Hardcoded to ALL; the backend embeds each connector's own region in the response |
| format | query | json | Always json |
One call per calendar month covers every interconnector.
Response row shape
{
  "Rows": [
    {
      "Effective_Date": "01-Mar-2026 00:00",
      "Field_Name": "INTER_EWIC",
      "Value": 250.0,
      "Region": "ROI"
    }
  ]
}
- Field_Name is one of INTER_EWIC, INTER_GRNLK, INTER_MOYLE, INTER_NET.
- The Region column varies per connector (EWIC/GRNLK → ROI, MOYLE → NI, NET → ALL); the export groups by Field_Name, never by Region.
- summarise.py strips the INTER_ prefix: the JSON summary exposes EWIC, GRNLK, MOYLE, NET.
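The grouping and prefix stripping described above might look roughly like the following. This is a sketch over plain dictionaries rather than the DuckDB query that summarise.py actually runs; group_interconnector_rows is a hypothetical name and the output point shape is assumed from the summary format documented later on this page.

```python
from collections import defaultdict


def group_interconnector_rows(rows: list[dict]) -> dict[str, list[dict]]:
    """Group rows by Field_Name (never by Region) and drop the INTER_ prefix."""
    series: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        # str.removeprefix requires Python 3.9+.
        name = row["Field_Name"].removeprefix("INTER_")
        series[name].append({"t": row["Effective_Date"], "value": row["Value"]})
    return dict(series)
```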
Mapping
| Stage | Path |
|---|---|
| Parquet key | grid-data/eirgrid/area=interconnection/region=ALL/year={YYYY}/month={MM}/data.parquet |
| JSON summary | grid-data/summary/interconnection/latest.json |
| API endpoint | GET /api/interconnection |
3. dashboard: demand, generation mix, CO₂, frequency, SNSP
Endpoint
GET https://www.smartgriddashboard.com/DashboardService.svc/data?area={area}&region={region}&datefrom={DD-Mon-YYYY}&dateto={DD-Mon-YYYY}
All requests share the same four query parameters — only area and region differ.
| Parameter | Source | Example | Notes |
|---|---|---|---|
| area | query | demandActual | See the Areas table below |
| region | query | ROI, NI, ALL | Constrained by DASHBOARD_AREA_REGIONS |
| datefrom | query | 01-Mar-2026 | DD-Mon-YYYY (locale-independent month abbreviation) |
| dateto | query | 31-Mar-2026 | If the backend rejects a request with dateto = today (HTTP 403), the pipeline retries once with dateto = yesterday; see _fetch_dashboard_with_fallback |
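The date formatting and the 403 fallback can be sketched together. Everything named below is hypothetical: the Forbidden exception stands in for however the pipeline signals an HTTP 403, and fetch is an injected callable so the example does not depend on a particular HTTP client. A fixed month tuple is used because strftime("%b") follows the process locale.

```python
from datetime import date, timedelta
from typing import Callable

_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def dashboard_date(d: date) -> str:
    """Format a date as DD-Mon-YYYY without relying on the locale."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year:04d}"


class Forbidden(Exception):
    """Hypothetical marker for an HTTP 403 response."""


def fetch_with_fallback(fetch: Callable[[str, str], list],
                        date_from: date, date_to: date, today: date) -> list:
    """Try the requested range; on 403 with dateto = today, retry once with yesterday."""
    try:
        return fetch(dashboard_date(date_from), dashboard_date(date_to))
    except Forbidden:
        if date_to != today:
            raise  # the fallback only applies when dateto is today
        yesterday = today - timedelta(days=1)
        return fetch(dashboard_date(date_from), dashboard_date(yesterday))
```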
Areas
| Area | Region(s) | Cadence | Fetch strategy | Notes |
|---|---|---|---|---|
| demandActual | ROI, NI | 15-min | Per month | ALL is computed in export (ROI + NI) |
| co2Emission | ROI | 15-min | Per month | ALL/NI variants return 403 or Value: null |
| SnspAll | ALL | 30-min | Per month | System Non-Synchronous Penetration (%) |
| frequency | ROI | 5-second | Weekly chunks | ~51k rows/month; monthly requests time out; summary downsamples to hourly mean |
| fuelMix | ROI | Snapshot | Per month (date range ignored) | Always returns 5 rows reflecting the current fuel split |
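The weekly chunking used for frequency could be sketched as below. The seven-day inclusive chunk and the choice to key each chunk by its first day are assumptions consistent with the tables on this page; weekly_chunks is a hypothetical helper name.

```python
from datetime import date, timedelta


def weekly_chunks(start: date, end: date) -> list[tuple[date, date]]:
    """Split an inclusive date range into consecutive chunks of at most 7 days."""
    chunks = []
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=6), end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end + timedelta(days=1)
    return chunks
```

A 31-day month yields five chunks under this scheme, the last one covering only three days.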
Response row shape
{
  "Rows": [
    {
      "EffectiveTime": "01-Mar-2026 00:00",
      "FieldName": "FUEL_GAS",
      "Value": 24000.0,
      "Region": "ROI"
    }
  ]
}
Column names vary slightly by area: the pipeline accepts any of EffectiveTime, Effective_Time, or Effective_Date as the timestamp column when parsing the DataFrame.
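One way to express that tolerance is to probe the candidate names in a fixed order. The order of preference below is an assumption, and timestamp_column is a hypothetical helper; the pipeline itself does this while parsing a DataFrame.

```python
from typing import Iterable

# Candidate timestamp columns named above; the preference order is assumed.
TIMESTAMP_COLUMNS = ("EffectiveTime", "Effective_Time", "Effective_Date")


def timestamp_column(columns: Iterable[str]) -> str:
    """Return the first recognised timestamp column present in `columns`."""
    available = set(columns)
    for name in TIMESTAMP_COLUMNS:
        if name in available:
            return name
    raise KeyError(f"no timestamp column among {sorted(available)}")
```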
fuelMix: special handling
- The datefrom/dateto parameters are ignored by the upstream API; it always returns the current 5 fuel rows (FUEL_GAS, FUEL_RENEW, FUEL_COAL, FUEL_OTHER_FOSSIL, FUEL_NET_IMPORT).
- Values are cumulative MWh, not MW. summarise.py normalises them to percentages.
- The upstream EffectiveTime is stale, so the pipeline overwrites it with the run timestamp (datetime.utcnow()) before persisting, letting daily-partitioned snapshots form a time series.
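The two fuelMix adjustments above can be illustrated as follows. Both function names are hypothetical, the sample handling of a zero total is an assumption, and the sketch uses a timezone-aware datetime.now(timezone.utc) in place of the datetime.utcnow() call mentioned above, which is deprecated from Python 3.12.

```python
from datetime import datetime, timezone
from typing import Optional


def stamp_snapshot(rows: list[dict], run_time: Optional[datetime] = None) -> list[dict]:
    """Return copies of the rows with EffectiveTime replaced by the run timestamp."""
    # Swapped in for datetime.utcnow(): an aware UTC timestamp.
    run_time = run_time or datetime.now(timezone.utc)
    return [{**row, "EffectiveTime": run_time} for row in rows]


def fuel_percentages(rows: list[dict]) -> dict[str, float]:
    """Normalise cumulative MWh values to a percentage of the snapshot total."""
    total = sum(row["Value"] for row in rows)
    if total == 0:
        # Assumed behaviour: report zeros rather than divide by zero.
        return {row["FieldName"]: 0.0 for row in rows}
    return {row["FieldName"]: 100.0 * row["Value"] / total for row in rows}
```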
Mapping
| Area | Parquet key | JSON summary | API endpoint |
|---|---|---|---|
| demandActual | area=demandactual/region={ROI\|NI}/year=.../month=... (monthly) | summary/demand/latest.json | GET /api/demand |
| co2Emission | area=co2emission/region=ROI/year=.../month=... (monthly) | summary/co2/latest.json | GET /api/co2 |
| SnspAll | area=snspall/region=ALL/year=.../month=... (monthly) | summary/snsp/latest.json | GET /api/snsp |
| frequency | area=frequency/region=ROI/year=.../month=.../day={DD}/data.parquet (daily: one per weekly chunk, keyed by week-start) | summary/frequency/latest.json (hourly avg) | GET /api/frequency |
| fuelMix | area=fuelmix/region=ROI/year=.../month=.../day={DD}/data.parquet (daily: one per pipeline run) | summary/generation/latest.json | GET /api/generation |
Call volume per pipeline run
With the default DAYS_BACK=31, a single pipeline run issues roughly:
| Backend | Calls per month | Monthly windows | Total |
|---|---|---|---|
| vargen | 6 (2 areas × 3 regions) | 2 (spans month boundary) | 12 |
| interconn | 1 | 2 | 2 |
| dashboard (non-frequency) | 5 (demandActual×2 + fuelMix + co2Emission + SnspAll) | 2 | 10 |
| dashboard (frequency) | 1 per week | ~5 | 5 |
Typical total: ~29 HTTP calls per run, executed concurrently up to MAX_CONCURRENCY (default 10). See configuration.md for environment variables that tune concurrency, retry count, and request timeout.
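Bounded concurrency of this kind is commonly implemented with a semaphore. The sketch below assumes an asyncio-based fetcher, which this page does not confirm; run_bounded is a hypothetical helper and each job is any zero-argument coroutine function.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

MAX_CONCURRENCY = 10  # default quoted above


async def run_bounded(jobs: Sequence[Callable[[], Awaitable]],
                      limit: int = MAX_CONCURRENCY) -> list:
    """Run all jobs, never allowing more than `limit` to be in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable]):
        async with semaphore:
            return await job()

    # gather preserves the order of the input jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With roughly 29 calls and a limit of 10, the run proceeds in about three waves, so the slowest backend in each wave dominates the total wall time.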
Transforming upstream rows into JSON summaries
src/export/summarise.py reads the Parquet partitions with DuckDB and produces eight JSON files under grid-data/summary/. The mapping per area:
| Summary JSON | Source area(s) | Key transformation |
|---|---|---|
| wind/latest.json | vargen WIND × (ROI, NI, ALL) | Split by FieldName into actual / forecast per region |
| solar/latest.json | vargen SOLAR × (ROI, NI, ALL) | Same as wind |
| demand/latest.json | dashboard demandActual × (ROI, NI) | ALL = merge-on-timestamp of ROI + NI values |
| interconnection/latest.json | interconn (ALL) | Grouped by Field_Name; INTER_ prefix stripped |
| co2/latest.json | dashboard co2Emission/ROI | Region-keyed (ROI only; no upstream ALL/NI series) |
| frequency/latest.json | dashboard frequency/ROI | date_trunc('hour', EffectiveTime) + avg(Value) (DuckDB query) |
| snsp/latest.json | dashboard SnspAll/ALL | Passed through, filtered to Region = 'ALL' |
| generation/latest.json | dashboard fuelMix/ROI (daily snapshots) | Grouped by FieldName; values normalised to % of total MWh |
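As an illustration of the demand row in the table above, the ALL series can be derived by joining ROI and NI on timestamp and summing. The inner-join behaviour (dropping timestamps missing from either side or carrying a null value) is an assumption, and demand_all is a hypothetical name; the real export does this in DuckDB.

```python
def demand_all(roi: list[dict], ni: list[dict]) -> list[dict]:
    """Sum ROI and NI demand for every timestamp present in both series."""
    ni_by_t = {point["t"]: point["value"] for point in ni}
    merged = []
    for point in roi:
        ni_value = ni_by_t.get(point["t"])
        # Skip timestamps that are missing or null on either side.
        if point["value"] is None or ni_value is None:
            continue
        merged.append({"t": point["t"], "value": point["value"] + ni_value})
    return merged
```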
All summaries follow the shape:
{
  "generated_at": "2026-04-21T15:22:34Z",
  "window_days": 30,
  "series": {
    "ROI": [ { "t": "2026-04-20T14:30:00", "value": 3866 }, ... ]
  }
}
Wind and solar swap "value" for "actual"/"forecast". Generation keys each series by fuel type rather than region.
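A rough sketch of how vargen rows for one region might be folded into wind points. It assumes that each point carries both "actual" and "forecast" keys when both are available, that forecast fields end in _FCAST as in the example row earlier, and that timestamps are emitted in ISO 8601 form; wind_points is a hypothetical name. Note that strptime("%b") expects English month abbreviations only under the C or an English locale.

```python
from datetime import datetime


def wind_points(rows: list[dict]) -> list[dict]:
    """Merge actual and forecast rows into one point per timestamp, oldest first."""
    merged: dict[str, dict] = {}
    for row in rows:
        stamp = datetime.strptime(row["EffectiveTime"], "%d-%b-%Y %H:%M").isoformat()
        kind = "forecast" if row["FieldName"].endswith("_FCAST") else "actual"
        merged.setdefault(stamp, {"t": stamp})[kind] = row["Value"]
    # ISO 8601 strings sort chronologically.
    return [merged[stamp] for stamp in sorted(merged)]
```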
See also
- Data Model: S3 partitioning strategy and schemas
- API Reference: FastAPI response shapes
- Configuration: DAYS_BACK, MAX_CONCURRENCY, MAX_RETRIES
- Source of truth: src/sources/eirgrid.py, src/export/summarise.py