/beta/query_datalinks#
This guide details available datalinks, table schemas, sample queries, and field definitions for
querying SynMax’s datasets via the unified /v4/beta/query_datalinks endpoint.
Key Benefits:
Query multiple datasets with a single endpoint (Hyperion, Vulcan, Leviaton)
Use standard SQL syntax across all datalinks
Access real-time and historical data in one place
Warning
Beta API Notice: This endpoint is currently in beta and may change as we refine the feature.
Endpoint#
Method |
POST |
URL |
|
Request Examples#
curl -X POST "https://hyperion.api.synmax.com/v4/beta/query_datalinks" \
-H "Access-Key: $API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": "SELECT * FROM hdl.production_by_well LIMIT 10"
}'
from synmax.hyperion.v4 import HyperionApiClient
SYNMAX_ACCESS_TOKEN = "<YOUR_API_TOKEN>"
client = HyperionApiClient(api_key=SYNMAX_ACCESS_TOKEN)
query = "SELECT * FROM hdl.production_by_well LIMIT 10"
response_generator = client.beta_query_datalinks(query=query)
response_df = response_generator.df()
print(response_df.head())
# Example of a Windows PowerShell request to the beta_query_datalinks endpoint.
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("access_key", "<YOUR_API_TOKEN>")
$headers.Add("Content-Type", "application/json")
$body = @"
{`"query`":`"SELECT * FROM 1005_v_pipeline_flow WHERE gas_day >= '2024-01-01' LIMIT 100`",`"session_id`":`"123`"}
"@
$response = Invoke-RestMethod 'https://hyperion.api.synmax.com/v4/beta/query_datalinks' -Method 'POST' -Headers $headers -Body $body
$response | ConvertTo-Json
Hyperion#
Comprehensive upstream oil and gas intelligence covering the entire well lifecycle from drilling to production across U.S. and Canadian basins. Includes historical state-reported monthly production (1960s-present), well attributes, and short/long-term forecasts with 411M+ production records and 4.6M+ wells. Daily tracking of drilling rigs, frac crews, completions, and DUC inventory provides real-time visibility into well development activity (coverage begins Sept 2020 for most U.S. regions; May 2024 for Canada). Also includes daily interstate pipeline nomination and capacity data tracking natural gas movements across the U.S. transmission network at point-by-point level with scheduled flows, operating capacities, and location metadata.
hdl.production_by_well#
Well-level monthly oil and gas production data derived from state and provincial regulatory reporting, covering US states (TX, PA, CO, ND, NM, LA, OK, OH, WV, etc.) and Canadian provinces (AB, BC). Each row represents one well’s production for one specific month, with pre-calculated daily rates and production month indexing for decline curve analysis. Supports IP rate calculations using production_month = 2 for first full month performance and regional aggregation via region_natgas and sub_region_natgas fields.
Warning
Production data has significant reporting lag (4 months to 3+ years depending on state). Recent months appear to show declining production due to incomplete reporting—exclude the most recent 6-12 months for trend analysis.
Table ID: hdl.production_by_well
Sample Queries#
-- Regional gas production time series (Lower 48)
SELECT
date_prod,
region_natgas,
SUM(prod_wet_gas_mcf_day) / 1000000.0 AS prod_gas_bcf_day,
COUNT(DISTINCT well_id) AS well_count
FROM hdl.production_by_well
WHERE date_prod BETWEEN '2023-01-01' AND '2024-06-01'
AND region_natgas NOT IN ('Alaska', 'Canada')
GROUP BY date_prod, region_natgas
ORDER BY date_prod, region_natgas;
-- IP rates for new wells (month 2 = first full production month)
SELECT
well_id,
date_prod,
prod_wet_gas_mcf_day AS ip_gas_mcf_day,
prod_oil_bbl_day AS ip_oil_bbl_day,
operator,
sub_region_natgas
FROM hdl.production_by_well
WHERE production_month = 2
AND date_prod >= '2024-01-01'
AND prod_wet_gas_mcf_month > 10000
ORDER BY prod_wet_gas_mcf_day DESC
LIMIT 50;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
well_id |
varchar |
Unique well identifier (API number). Example: |
well_api |
varchar |
Well API number (alternate identifier) |
well_name |
varchar |
Well name |
well_no |
varchar |
Well number. Example: |
field_name |
varchar |
Name of the field that the well is located in. Example: |
date_prod |
date |
Production month (first day of month). Example: |
production_month |
integer |
Months since first production (1-indexed). Example: |
prod_wet_gas_mcf_month |
integer |
Monthly gas production in MCF/month |
prod_oil_bbl_month |
integer |
Monthly oil production in BBL/month |
prod_water_bbl_month |
integer |
Monthly water production in BBL/month |
prod_wet_gas_mcf_day |
decimal |
Daily gas production rate in MCF/day (pre-calculated) |
prod_oil_bbl_day |
decimal |
Daily oil production rate in BBL/day (pre-calculated) |
prod_water_bbl_day |
decimal |
Daily water production rate in BBL/day (pre-calculated) |
operator |
varchar |
Operating company name |
state_code |
varchar |
State/province code (TX, PA, CO, ND, AB, BC, etc.) |
county |
varchar |
County name |
region_natgas |
varchar |
Natural gas region (gulf, northeast, midwest, west, Canada) |
sub_region_natgas |
varchar |
Sub-region (Haynesville - TX, SW PA, Permian-NM, etc.) |
formation |
varchar |
Producing formation/reservoir |
township |
varchar |
Township location |
range |
varchar |
Range location |
section |
varchar |
Section location |
hdl.wells#
Comprehensive well master containing 4.6M+ wells (1900-2025) with detailed information from regulatory filings, permits, completion reports, and production records. Includes standardized drilling/completion dates, lateral lengths, wellpad assignments, coordinates, and geological data across 7,552 formations and 14 basins. Use well_id as primary key and wellpad_id to analyze multi-well pad development patterns.
Table ID: hdl.wells
Sample Queries#
-- Recent horizontal wells in Permian Basin
SELECT
well_id,
operator,
horizontal_length,
date_first_production,
produced_formation
FROM hdl.wells
WHERE sub_region_natgas = 'Permian-NM'
AND wellbore_type = 'Horizontal'
AND date_spud >= '2023-01-01'
ORDER BY horizontal_length DESC
LIMIT 20;
-- Count wells by operator and basin for 2024 completions
SELECT
operator,
sub_region_natgas,
COUNT(*) AS well_count,
AVG(horizontal_length) AS avg_lateral_ft
FROM hdl.wells
WHERE date_completion >= '2024-01-01'
AND date_completion < '2025-01-01'
GROUP BY operator, sub_region_natgas
ORDER BY well_count DESC
LIMIT 20;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
well_id |
nvarchar |
Unique well identifier/API number (primary key). Example: |
wellpad_id |
bigint |
Groups wells that share the same drilling pad location |
operator |
nvarchar |
Operating company name. Example: |
county |
nvarchar |
County or municipal district name |
state_code |
nvarchar |
State/province code (TX, PA, OH, WV, NM, OK, LA, ND, CO, AB, BC) |
region_natgas |
nvarchar |
Natural gas region (gulf, northeast, midwest, west, Alaska, Canada) |
sub_region_natgas |
nvarchar |
Sub-region/basin (Permian-NM, Haynesville - LA, SW PA, etc.) |
lat_surface |
decimal |
Surface latitude where drilling begins (decimal degrees) |
lon_surface |
decimal |
Surface longitude where drilling begins (decimal degrees) |
lat_bottomhole |
decimal |
Bottomhole latitude where horizontal wellbore ends |
lon_bottomhole |
decimal |
Bottomhole longitude where horizontal wellbore ends |
date_permit |
date |
Date drilling permit was issued. Example: |
date_spud |
date |
Date when drilling operations began (spud date) |
date_completion |
date |
Date when well completion operations finished |
date_first_production |
date |
Date when well first produced hydrocarbons |
depth_measured |
decimal |
Total measured depth of wellbore in feet |
depth_tvd |
decimal |
True vertical depth in feet |
horizontal_length |
decimal |
Length of horizontal/lateral section in feet. Example: |
wellbore_type |
nvarchar |
Well configuration (Horizontal, Vertical, Directional, Unknown) |
produced_formation |
nvarchar |
Primary producing formation (NIOBRARA, WOLFCAMP, MARCELLUS, etc.) |
produced_basin |
nvarchar |
Producing basin classification (PERMIAN BASIN, DENVER-CHEYENNE, etc.) |
hdl.long_term_forecast#
Three-year regional natural gas production forecasts at sub-region-quarter grain in BCF/d and MBD units. Based on public producer earnings for the public/reported portion of production, with growth rates adjusted by public/private frac activity ratios at the sub_region_natgas level. Updated quarterly to support long-term supply planning and regional production comparisons.
Table ID: hdl.long_term_forecast
Sample Queries#
-- Long-term forecast by region (time series)
SELECT
date_prod,
region_natgas,
SUM(prod_dry_gas_bcf_day) AS total_gas_bcf_day,
SUM(prod_oil_bbl_day) AS total_oil_bbl_day
FROM hdl.long_term_forecast
WHERE date_prod BETWEEN '2025-06-01' AND '2026-12-31'
GROUP BY date_prod, region_natgas
ORDER BY date_prod;
-- Sub-region production comparison (West and Northeast)
SELECT
date_prod,
sub_region_natgas,
prod_dry_gas_bcf_day
FROM hdl.long_term_forecast
WHERE region_natgas IN ('west', 'northeast')
AND date_prod >= '2025-06-01'
ORDER BY date_prod, prod_dry_gas_bcf_day DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_prod |
date |
Production forecast date. Example: |
sub_region_natgas |
varchar |
Natural gas sub-region. Example: |
region_natgas |
varchar |
Natural gas region. Example: |
prod_dry_gas_bcf_day |
float |
Forecasted dry gas production (BCF/day). Example: |
prod_oil_bbl_day |
float |
Forecasted oil production (BBL/day). Example: |
hdl.long_term_forecast_history#
Historical archive of all long-term production forecast vintages, enabling comparison of how forecasts have evolved over time. Each row represents a forecast for a specific sub-region and production month, tagged with forecast_run_date indicating when that forecast was generated. Contains 19 forecast vintages from December 2022 to February 2026. Updated quarterly as new forecasts are generated.
The forecast methodology is based on public producer earnings guidance for the portion of production that is publicly reported. That growth rate is then adjusted by the ratio of public-to-private frac activity at the sub-region level — the growth rate increases when there is more private than public frac activity and decreases in the opposite case.
Note
The prod_oil_bbl_day and prod_dry_gas_bcf_day columns may contain NULLs in some forecast vintages. Not every vintage covers every sub-region or commodity. Filter by forecast_run_date to retrieve a single vintage.
Warning
Gas and oil forecasts are updated on different schedules. The latest forecast_run_date for gas and oil will often differ (e.g., gas updated 2026-02-17, oil updated 2025-11-21). A naive WHERE forecast_run_date = (SELECT MAX(forecast_run_date) ...) will return only the commodity that was updated most recently, with NULLs for the other. Always find the latest forecast_run_date independently per commodity. See the “Latest forecast” example below.
Warning
Values are daily rates (BCF/day for gas, BBL/day for oil), not monthly volumes. Use AVG() when aggregating across time periods and SUM() when aggregating across sub-regions.
Note
Sub-regions “L48” and “SD” are excluded from this table.
Table ID: hdl.long_term_forecast_history
Sample Queries#
-- Latest forecast (handles different gas/oil vintage dates)
-- IMPORTANT: Gas and oil may have different latest forecast_run_dates.
-- You must find the latest forecast_run_date independently for each commodity.
WITH latest_gas AS (
SELECT date_prod, sub_region_natgas, region_natgas, prod_dry_gas_bcf_day
FROM hdl.long_term_forecast_history
WHERE forecast_run_date = (
SELECT MAX(forecast_run_date)
FROM hdl.long_term_forecast_history
WHERE prod_dry_gas_bcf_day IS NOT NULL
)
AND prod_dry_gas_bcf_day IS NOT NULL
),
latest_oil AS (
SELECT date_prod, sub_region_natgas, region_natgas, prod_oil_bbl_day
FROM hdl.long_term_forecast_history
WHERE forecast_run_date = (
SELECT MAX(forecast_run_date)
FROM hdl.long_term_forecast_history
WHERE prod_oil_bbl_day IS NOT NULL
)
AND prod_oil_bbl_day IS NOT NULL
)
SELECT
COALESCE(g.date_prod, o.date_prod) AS date_prod,
COALESCE(g.sub_region_natgas, o.sub_region_natgas) AS sub_region_natgas,
COALESCE(g.region_natgas, o.region_natgas) AS region_natgas,
g.prod_dry_gas_bcf_day,
o.prod_oil_bbl_day
FROM latest_gas g
FULL OUTER JOIN latest_oil o
ON g.date_prod = o.date_prod
AND g.sub_region_natgas = o.sub_region_natgas
ORDER BY date_prod;
-- Compare two forecast vintages for Haynesville - LA
SELECT
a.date_prod,
a.prod_dry_gas_bcf_day AS gas_vintage_1,
b.prod_dry_gas_bcf_day AS gas_vintage_2,
b.prod_dry_gas_bcf_day - a.prod_dry_gas_bcf_day AS gas_revision
FROM hdl.long_term_forecast_history a
JOIN hdl.long_term_forecast_history b
ON a.date_prod = b.date_prod
AND a.sub_region_natgas = b.sub_region_natgas
WHERE a.forecast_run_date = '2025-06-06'
AND b.forecast_run_date = '2026-02-17'
AND a.sub_region_natgas = 'Haynesville - LA'
ORDER BY a.date_prod;
-- List all available forecast vintage dates
-- Gas and oil forecasts may have different vintage dates, so you may want to filter by commodity when listing vintages.
SELECT DISTINCT forecast_run_date
FROM hdl.long_term_forecast_history
ORDER BY forecast_run_date;
-- To list vintages for gas forecasts only:
SELECT DISTINCT forecast_run_date
FROM hdl.long_term_forecast_history
WHERE prod_dry_gas_bcf_day IS NOT NULL
ORDER BY forecast_run_date;
-- Aggregate by time period and geography for a specific vintage:
SELECT date_prod, region_natgas,
SUM(prod_dry_gas_bcf_day) AS total_gas_bcf_day,
SUM(prod_oil_bbl_day) AS total_oil_bbl_day
FROM hdl.long_term_forecast_history
WHERE forecast_run_date = '2025-06-15'
AND date_prod BETWEEN '2025-06-01' AND '2026-12-31'
GROUP BY date_prod, region_natgas
ORDER BY date_prod;
-- Track how a specific region's forecast has evolved over time:
SELECT forecast_run_date, date_prod,
SUM(prod_dry_gas_bcf_day) AS total_gas_bcf_day
FROM hdl.long_term_forecast_history
WHERE region_natgas = 'northeast'
AND date_prod = '2026-06-01'
GROUP BY forecast_run_date, date_prod
ORDER BY forecast_run_date;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_prod |
date |
Forecast target month. Example: |
sub_region_natgas |
varchar |
Sub-region. Example: |
region_natgas |
varchar |
High-level region. Example: |
forecast_run_date |
date |
Date the forecast was produced. Example: |
prod_dry_gas_bcf_day |
float |
as production forecast in BCF/day. Example: |
prod_oil_bbl_day |
float |
Oil production forecast in barrels/day. Example: |
hdl.tils#
TIL (Turned In Line) monitoring data tracking when wellpads transition from completed to producing status. Each row represents one wellpad being turned in line on a specific date, with fields for frac end date, TIL date, operator, and initial production rate. Use for analyzing completion-to-production timing, TIL inventory calculations, and production ramp-up analysis.
Warning
This dataset only covers Expand Energy, Apex, and EQT in Haynesville - LA and Northeast regions for the period beginning March 1, 2024. Do not use for analysis outside this scope.
Table ID: hdl.tils
Sample Queries#
-- TIL lag analysis by operator
SELECT
operator,
sub_region_natgas,
COUNT(*) AS wellpad_count,
SUM(wells) AS total_wells,
AVG(date_til - date_frac_end) AS avg_days_to_til
FROM hdl.tils
WHERE date_til IS NOT NULL
GROUP BY operator, sub_region_natgas
ORDER BY total_wells DESC;
-- Pending TIL inventory (completed but not yet producing)
SELECT
operator,
sub_region_natgas,
wellpad_id,
date_frac_end,
wells,
prod_wet_gas_bcf_day
FROM hdl.tils
WHERE date_til IS NULL
ORDER BY date_frac_end;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
wellpad_id |
varchar |
Wellpad identifier. Example: |
wells |
bigint |
Number of wells on the wellpad. Example: |
date_frac_end |
date |
Date frac operations ended. Example: |
date_til |
date |
Turned-in-line date (NULL if not yet TIL). Example: |
operator |
varchar |
Operator name (EXPAND, APEX, EQT). Example: |
sub_region_natgas |
varchar |
Natural gas sub-region (Haynesville - LA, NorthEast). Example: |
prod_wet_gas_bcf_day |
float |
Initial production rate (BCF/day). Example: |
hdl.dry_gas_factors#
Monthly dry gas conversion factors by subregion, representing the ratio of dry gas to wet gas production. Each row is one subregion-month combination. Use for converting wet gas volumes to dry gas equivalents in production analysis and forecasting. Multiply wet gas production by dry_gas_factor to get dry gas equivalent. Factors typically range from 0.7 to 1.0 depending on gas composition in the subregion.
Table ID: hdl.dry_gas_factors
Sample Queries#
-- Latest dry gas factors by subregion
SELECT sub_region_natgas, date_factor, dry_gas_factor
FROM hdl.dry_gas_factors
WHERE date_factor >= '2024-01-01'
ORDER BY sub_region_natgas, date_factor;
-- Latest month's dry gas factors
SELECT *
FROM hdl.dry_gas_factors
WHERE date_factor = (SELECT MAX(date_factor) FROM hdl.dry_gas_factors);
-- Latest factor per subregion (if subregions update differently)
WITH latest AS (
SELECT sub_region_natgas, MAX(date_factor) AS max_date
FROM hdl.dry_gas_factors
GROUP BY sub_region_natgas
)
SELECT d.*
FROM hdl.dry_gas_factors d
INNER JOIN latest l
ON d.sub_region_natgas = l.sub_region_natgas
AND d.date_factor = l.max_date;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
date_factor |
date |
The month during which this dry gas factor applies |
dry_gas_factor |
float |
The dry gas factor (ratio of dry to wet gas, typically 0.7–1.0) |
hdl.oil_production#
Daily modeled oil production rates by subregion. Each row is one subregion-day combination. Analogous to hdl.daily_production but for oil instead of gas. Use for oil supply monitoring, trend analysis, and basin-level oil production comparisons.
Warning
Critical Aggregation Rules: Use AVG() for time aggregation (monthly/quarterly/annual) since values are daily RATES, not volumes. Use SUM() for spatial aggregation across sub-regions on a single date.
Table ID: hdl.oil_production
Sample Queries#
-- Daily oil production by subregion
SELECT date_prod, sub_region_natgas, prod_oil_bbl_day
FROM hdl.oil_production
WHERE date_prod >= '2024-01-01'
ORDER BY date_prod, sub_region_natgas;
-- Latest date's oil production by subregion
SELECT *
FROM hdl.oil_production
WHERE date_prod = (SELECT MAX(date_prod) FROM hdl.oil_production);
-- Monthly average oil production by subregion (time aggregation - use AVG)
SELECT
DATE_TRUNC('month', date_prod) as month,
sub_region_natgas,
AVG(prod_oil_bbl_day) as avg_daily_bbl
FROM hdl.oil_production
WHERE date_prod >= '2024-01-01'
GROUP BY DATE_TRUNC('month', date_prod), sub_region_natgas
ORDER BY 1, 3 DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_prod |
date |
The date of production in YYYY-MM-DD format |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
prod_oil_bbl_day |
float |
Daily oil production in barrels |
hdl.short_term_forecast#
Monthly well-level production forecasts (2-3 months ahead) with oil/gas volumes in daily and monthly units. Each row represents a forecast for a specific well and production month, with date_forecast_run indicating when the forecast model was executed for tracking forecast vintages. Supports aggregation by county, operator, region, and sub-region for supply planning and trend analysis.
Note
This table uses MCF for gas units. To convert to BCF: prod_dry_gas_mcf_day / 1000000. When querying US-only production, filter with WHERE region_natgas != 'Canada'.
Table ID: hdl.short_term_forecast
Sample Queries#
-- Forecast production by sub-region (U.S. only)
SELECT
date_prod,
sub_region_natgas,
SUM(prod_dry_gas_mcf_day) AS total_gas_mcf_day,
SUM(prod_oil_bbl_day) AS total_oil_bbl_day
FROM hdl.short_term_forecast
WHERE region_natgas != 'Canada'
AND date_prod >= CURRENT_DATE
GROUP BY date_prod, sub_region_natgas
ORDER BY date_prod, total_gas_mcf_day DESC
LIMIT 50;
-- Operator forecast comparison
SELECT
operator,
date_prod,
SUM(prod_oil_bbl_day) AS total_oil_bbl_day,
SUM(prod_dry_gas_mcf_day) AS total_gas_mcf_day,
COUNT(DISTINCT well_id) AS well_count
FROM hdl.short_term_forecast
WHERE operator IN ('EOG RESOURCES', 'PIONEER NATURAL')
AND date_prod >= CURRENT_DATE
GROUP BY operator, date_prod
ORDER BY date_prod, operator;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
well_id |
varchar |
Unique well identifier. Example: |
date_prod |
date |
Production forecast month. Example: |
date_forecast_run |
date |
Date the forecast model was executed. Example: |
prod_dry_gas_mcf_day |
float |
Forecasted daily dry gas production (MCF/day). Example: |
prod_dry_gas_mcf_month |
float |
Forecasted monthly dry gas production (MCF/month). Example: |
prod_oil_bbl_day |
float |
Forecasted daily oil production (BBL/day) |
prod_oil_bbl_month |
float |
Forecasted monthly oil production (BBL/month) |
operator |
varchar |
Well operator. Example: |
state_code |
varchar |
State/province code. Example: |
county |
varchar |
County name |
region_natgas |
varchar |
Natural gas region (west, northeast, gulf, Canada) |
sub_region_natgas |
varchar |
Sub-region (NE PA, West - TX, Haynesville - LA, AB) |
hdl.short_term_forecast_declines#
Well-level decline curves combining state-reported production history with modeled forecasts. Each row is one well-month observation with both production volumes and metadata about whether values are state-reported or modeled. Use for type-curve analysis, decline rate calculations, EUR estimation, and distinguishing between actual vs forecasted production. Requires filtering by date_forecast_run to select a specific forecast vintage.
Note
Use the modeled field to distinguish between state-reported actuals (false) and forecast values (true). Use hdl.short_term_forecast_history_dates to discover available date_forecast_run values.
Table ID: hdl.short_term_forecast_declines
Sample Queries#
-- Well decline curves for a subregion
SELECT production_month, AVG(prod_wet_gas_mcf_day) as avg_gas_day
FROM hdl.short_term_forecast_declines
WHERE sub_region_natgas = 'Haynesville - LA'
AND production_month BETWEEN 1 AND 60
GROUP BY production_month
ORDER BY production_month;
-- Latest forecast vintage's decline data
SELECT *
FROM hdl.short_term_forecast_declines
WHERE date_forecast_run = (
SELECT MAX(date_forecast_run)
FROM hdl.short_term_forecast_declines
)
LIMIT 100;
-- Compare latest vs previous forecast vintage for a subregion
WITH vintages AS (
SELECT DISTINCT date_forecast_run
FROM hdl.short_term_forecast_declines
ORDER BY date_forecast_run DESC
LIMIT 2
)
SELECT d.date_forecast_run, d.production_month,
AVG(d.prod_wet_gas_mcf_day) AS avg_gas_day
FROM hdl.short_term_forecast_declines d
WHERE d.date_forecast_run IN (SELECT date_forecast_run FROM vintages)
AND d.sub_region_natgas = 'Haynesville - LA'
GROUP BY d.date_forecast_run, d.production_month
ORDER BY d.date_forecast_run, d.production_month;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
well_id |
varchar |
A well identifier |
date_prod |
date |
The date of production in YYYY-MM-DD format |
prod_wet_gas_mcf_month |
float |
Monthly wet gas production in thousand cubic feet |
prod_oil_bbl_month |
float |
Monthly oil production in barrels |
prod_wet_gas_mcf_day |
float |
Daily wet gas production in thousand cubic feet |
prod_oil_bbl_day |
float |
Daily oil production in barrels |
date_completion |
date |
Date when well completion operations finished |
first_production_month |
date |
Date of first production month |
production_month |
int |
The number of months this well has been producing |
county |
varchar |
The county the well is in |
operator |
varchar |
A well operator |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
date_forecast_run |
date |
The date the short term forecast was simulated on in YYYY-MM-DD format |
modeled |
bit |
Indicates whether the production values come from the forecast model (true) or state-reported data (false) |
state_reported |
bit |
Whether the state published any production numbers for this well |
state_code |
varchar |
A state abbreviation |
hdl.short_term_forecast_history#
Historical snapshots of well-level short-term production forecasts (backcasts). Each row is one well-month from a specific forecast run, allowing comparison of how forecasts evolved over time. Requires a date_forecast_run filter. Use for forecast accuracy analysis, model validation, and tracking how production estimates changed between forecast vintages.
Note
Use hdl.short_term_forecast_history_dates to discover available date_forecast_run values before filtering this table.
Table ID: hdl.short_term_forecast_history
Sample Queries#
-- Backcast for a specific forecast run
SELECT date_prod, sub_region_natgas,
SUM(prod_wet_gas_mcf_day) as total_gas_mcf_day
FROM hdl.short_term_forecast_history
WHERE date_forecast_run = '2025-01-25'
GROUP BY date_prod, sub_region_natgas
ORDER BY date_prod;
-- Aggregated latest vintage by subregion
SELECT date_prod, sub_region_natgas,
SUM(prod_wet_gas_mcf_day) AS total_gas_mcf_day,
SUM(prod_oil_bbl_day) AS total_oil_bbl_day
FROM hdl.short_term_forecast_history
WHERE date_forecast_run = (
SELECT MAX(date_forecast_run)
FROM hdl.short_term_forecast_history
)
GROUP BY date_prod, sub_region_natgas
ORDER BY date_prod, sub_region_natgas;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
well_id |
varchar |
A well identifier |
date_prod |
date |
The date of production in YYYY-MM-DD format |
prod_wet_gas_mcf_month |
float |
Monthly wet gas production in thousand cubic feet |
prod_oil_bbl_month |
float |
Monthly oil production in barrels |
prod_wet_gas_mcf_day |
float |
Daily wet gas production in thousand cubic feet |
prod_oil_bbl_day |
float |
Daily oil production in barrels |
date_forecast_run |
date |
The date the short term forecast was simulated on in YYYY-MM-DD format |
state_code |
varchar |
A state abbreviation |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
county |
varchar |
The county the well is in |
operator |
varchar |
A well operator |
hdl.short_term_forecast_aggregated_history#
Pre-aggregated subregion-level history of all short-term forecast runs. Each row is one subregion-month from a specific forecast run. Use for quickly comparing regional production forecast trajectories across multiple forecast vintages without needing to aggregate well-level data.
Note
Use hdl.short_term_forecast_history_dates to discover available date_forecast_run values.
Table ID: hdl.short_term_forecast_aggregated_history
Sample Queries#
-- Compare forecast vintages for a subregion
SELECT date_forecast_run, date_prod, prod_dry_gas_bcf_day
FROM hdl.short_term_forecast_aggregated_history
WHERE sub_region_natgas = 'Haynesville - LA'
AND date_prod >= '2025-01-01'
ORDER BY date_forecast_run, date_prod;
-- Latest forecast vintage by subregion
SELECT *
FROM hdl.short_term_forecast_aggregated_history
WHERE date_forecast_run = (
SELECT MAX(date_forecast_run)
FROM hdl.short_term_forecast_aggregated_history
);
-- Compare latest two vintages side by side for a subregion
WITH vintages AS (
SELECT DISTINCT date_forecast_run
FROM hdl.short_term_forecast_aggregated_history
ORDER BY date_forecast_run DESC
LIMIT 2
)
SELECT a.date_forecast_run, a.date_prod, a.sub_region_natgas,
a.prod_dry_gas_bcf_day, a.prod_oil_mbbl_day
FROM hdl.short_term_forecast_aggregated_history a
WHERE a.date_forecast_run IN (SELECT date_forecast_run FROM vintages)
AND a.sub_region_natgas = 'Haynesville - LA'
ORDER BY a.sub_region_natgas, a.date_prod, a.date_forecast_run;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_prod |
datetime |
The date of production in YYYY-MM-DD format |
prod_dry_gas_bcf_day |
float |
Daily dry gas production in billion cubic feet |
prod_oil_mbbl_day |
float |
Daily oil production in thousand barrels |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
date_forecast_run |
date |
The date the short term forecast was simulated on in YYYY-MM-DD format |
hdl.short_term_forecast_history_dates#
Reference table listing all available short-term forecast run dates. Each row is one date on which the STF model was executed. Use for discovering available forecast vintages before querying hdl.short_term_forecast_history, hdl.short_term_forecast_aggregated_history, or hdl.short_term_forecast_declines.
Table ID: hdl.short_term_forecast_history_dates
Sample Queries#
-- List available forecast run dates
SELECT date_forecast_run
FROM hdl.short_term_forecast_history_dates
ORDER BY date_forecast_run DESC;
-- Latest 5 forecast run dates
SELECT date_forecast_run
FROM hdl.short_term_forecast_history_dates
ORDER BY date_forecast_run DESC
LIMIT 5;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_forecast_run |
date |
The date the short term forecast was simulated on in YYYY-MM-DD format |
hdl.complete_report_date#
Data completeness dates by subregion, indicating the last date for which each data type (pipeline scrapes, completions, gas production, oil production) is considered complete and reliable. Use for determining safe date ranges for analysis and understanding data latency by region. Query this before analyzing production data to avoid using incomplete reporting periods.
Table ID: hdl.complete_report_date
Sample Queries#
-- Check data completeness by subregion
SELECT sub_region_natgas, date_pipe_scrape_report,
date_gas_prod_report, date_oil_prod_report
FROM hdl.complete_report_date
ORDER BY sub_region_natgas;
-- Use in a subquery to cap other tables' date ranges
SELECT o.*
FROM hdl.oil_production o
INNER JOIN hdl.complete_report_date c
ON o.sub_region_natgas = c.sub_region_natgas
WHERE o.date_prod <= CAST(c.date_oil_prod_report AS date);
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
date_pipe_scrape_report |
date |
The complete date for state data used by the daily production regression model which is derived from daily pipe scrapes |
date_completion_report |
varchar |
Well completions information up to this date is complete |
date_gas_prod_report |
varchar |
Date up to which state-reported gas production data is complete |
date_oil_prod_report |
varchar |
Date up to which state-reported oil production data is complete |
hdl.lower_48_stf_filled#
Regional short-term forecast with plug regions filled in, providing daily production by subregion for both gas and oil. This table provides subregion-level breakdowns including modeled “plug” regions that estimate production for states not covered by the core STF model. Because it contains every subregion, a simple SUM() across all subregions replicates the national total that was previously available in the separate lower_48_total_stf endpoint. Use for regional supply analysis, comparing STF production across subregions, and deriving headline U.S. gas supply figures.
Note
Replicating lower_48_total_stf: To get the exact same national total previously available from /v4/lower_48_total_stf, simply SUM(prod_dry_gas_bcf_day) grouped by date_prod across all subregions.
Table ID: hdl.lower_48_stf_filled
Sample Queries#
-- Regional STF with plug regions
SELECT date_prod, sub_region_natgas, region_natgas,
prod_dry_gas_bcf_day, prod_oil_mbbl_day
FROM hdl.lower_48_stf_filled
WHERE date_prod >= '2025-01-01'
ORDER BY date_prod, sub_region_natgas;
-- Exact replica of lower_48_total_stf (national daily total)
SELECT date_prod,
SUM(prod_dry_gas_bcf_day) AS prod_dry_gas_bcf_day
FROM hdl.lower_48_stf_filled
GROUP BY date_prod
ORDER BY date_prod;
-- Latest national total
SELECT date_prod,
SUM(prod_dry_gas_bcf_day) AS prod_dry_gas_bcf_day
FROM hdl.lower_48_stf_filled
WHERE date_prod = (SELECT MAX(date_prod) FROM hdl.lower_48_stf_filled)
GROUP BY date_prod;
-- Monthly average national total (temporal aggregation uses AVG of daily sums)
WITH daily_totals AS (
SELECT date_prod,
SUM(prod_dry_gas_bcf_day) AS total_bcf_day
FROM hdl.lower_48_stf_filled
GROUP BY date_prod
)
SELECT DATE_TRUNC('month', date_prod) AS month,
AVG(total_bcf_day) AS avg_bcf_day
FROM daily_totals
GROUP BY DATE_TRUNC('month', date_prod)
ORDER BY month;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_prod |
date |
The date of production in YYYY-MM-DD format |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
prod_dry_gas_bcf_day |
float |
Daily dry gas production in billion cubic feet |
prod_oil_mbbl_day |
float |
Daily oil production in thousand barrels |
hdl.geowells#
Lightweight well master with geographic coordinates and key attributes. A slimmer alternative to hdl.wells focused on location data and essential metadata. Each row is one well. Use for geographic mapping, spatial queries, and lightweight well lookups when the full wells table is not needed.
Table ID: hdl.geowells
Sample Queries#
-- Well locations for mapping
SELECT well_id, lat_surface, lon_surface, operator,
sub_region_natgas, well_status
FROM hdl.geowells
WHERE region_natgas = 'gulf'
AND date_spud >= '2024-01-01'
LIMIT 1000;
-- Wells spud in the last 90 days
SELECT *
FROM hdl.geowells
WHERE date_spud >= CURRENT_DATE - INTERVAL '90 days'
ORDER BY date_spud DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
well_id |
varchar |
A well identifier |
lat_surface |
decimal(9,6) |
Latitude of the wellbore surface location |
lon_surface |
decimal(9,6) |
Longitude of the wellbore surface location |
county |
varchar |
The county the well is in |
date_spud |
date |
Spud date from well drilling permit in YYYY-MM-DD format |
depth_tvd |
decimal(10,2) |
True vertical depth of the wellbore |
horizontal_length |
decimal(10,2) |
Horizontal length of the wellbore |
operator |
varchar |
A well operator |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
state_code |
varchar |
A state abbreviation |
wellpad_id |
bigint |
The unique identifier for the wellpad |
well_status |
varchar |
The status of the well |
hdl.rigs#
Daily drilling rig activity at wellpad-day grain tracking active rigs across North American oil and gas operations. Each row represents one active drilling rig working on a specific wellpad for a single day. Average drilling duration is 44-73 days per wellpad. Drilling is the first step in well development, leading production by 3-6 months. Use COUNT(*) grouped by date_observed for daily rig counts; use COUNT(DISTINCT wellpad_id) to count unique wellpads drilled.
Warning
Regional Coverage Varies: Gulf/West regions have data from Sept 2020; Northeast from March 2021; Canada from May 2024. Always verify date coverage by region_natgas before cross-regional comparisons.
Table ID: hdl.rigs
Sample Queries#
-- Daily active rig count (time series)
SELECT
date_observed,
COUNT(*) as active_rigs
FROM hdl.rigs
WHERE date_observed >= '2024-01-01'
GROUP BY date_observed
ORDER BY date_observed;
-- Permian basin rig activity by operator
SELECT
operator,
COUNT(*) as rig_days,
COUNT(DISTINCT wellpad_id) as wellpads_drilled
FROM hdl.rigs
WHERE sub_region_natgas IN ('West - TX', 'Permian-NM')
AND date_observed >= '2024-01-01'
GROUP BY operator
ORDER BY rig_days DESC
LIMIT 20;
Note
Recreating /rigs/latest: The /rigs/latest API endpoint returns only the most recent date’s rig data per region. Each region independently returns its most current data, accounting for staggered update schedules (e.g., Canada may lag the U.S.).
-- Exact equivalent of /rigs/latest (per-region latest)
WITH latest_dates AS (
SELECT region_natgas, MAX(date_observed) AS max_date
FROM hdl.rigs
GROUP BY region_natgas
)
SELECT r.*
FROM hdl.rigs r
INNER JOIN latest_dates ld
ON r.region_natgas = ld.region_natgas
AND r.date_observed = ld.max_date;
-- Latest U.S.-only (single date, excludes Canada)
SELECT *
FROM hdl.rigs
WHERE date_observed = (
SELECT MAX(date_observed)
FROM hdl.rigs
WHERE region_natgas != 'Canada'
);
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
wellpad_id |
varchar |
Unique identifier for the wellpad being drilled. Example: |
date_observed |
date |
Date of rig activity. Example: |
lat |
float |
Wellpad latitude in decimal degrees. Example: |
long |
float |
Wellpad longitude in decimal degrees. Example: |
operator |
varchar |
E&P company operating the well (mineral rights owner). Example: |
county |
varchar |
County name. Example: |
state_code |
varchar |
State/province 2-letter abbreviation. Example: |
sub_region_natgas |
varchar |
Detailed basin/area classification. Example: |
region_natgas |
varchar |
High-level geographic region (gulf, west, midwest, northeast, Canada). Example: |
rig_class |
varchar |
Well target commodity type (oil, gas). Blank for Canada. Example: |
hdl.fraccrews#
Daily hydraulic fracturing (frac) crew activity at wellpad-day grain tracking active frac crews across North American oil and gas operations. Each row represents one active frac crew working on a specific wellpad for a single day. Average frac duration is 13-21 days per wellpad. Fracking is the completion phase that follows drilling, typically occurring 1-3 months after drilling and leading production by 1-3 months. Use COUNT(*) grouped by date_frac for daily crew counts.
Warning
Regional Coverage Varies: Gulf/West regions have data from Sept 2020; Northeast from March 2021; Canada from May 2024. Canadian records may have blank service_company and frac_class fields.
Table ID: hdl.fraccrews
Sample Queries#
-- Daily active frac crew count (time series)
SELECT
date_frac,
COUNT(*) as active_crews
FROM hdl.fraccrews
WHERE date_frac >= '2024-01-01'
GROUP BY date_frac
ORDER BY date_frac;
-- Frac activity by service company
SELECT
service_company,
COUNT(*) as crew_days,
COUNT(DISTINCT wellpad_id) as wellpads_fracked
FROM hdl.fraccrews
WHERE date_frac >= '2024-01-01'
AND service_company != ''
GROUP BY service_company
ORDER BY crew_days DESC
LIMIT 15;
Note
Recreating /fraccrews/latest: The /fraccrews/latest API endpoint returns only the most recent date’s frac crew data per region. Each region independently returns its most current data, accounting for staggered update schedules (e.g., Canada may lag the U.S.).
-- Exact equivalent of /fraccrews/latest (per-region latest)
WITH latest_dates AS (
SELECT region_natgas, MAX(date_frac) AS max_date
FROM hdl.fraccrews
GROUP BY region_natgas
)
SELECT f.*
FROM hdl.fraccrews f
INNER JOIN latest_dates ld
ON f.region_natgas = ld.region_natgas
AND f.date_frac = ld.max_date;
-- Latest U.S.-only (single date, excludes Canada)
SELECT *
FROM hdl.fraccrews
WHERE date_frac = (
SELECT MAX(date_frac)
FROM hdl.fraccrews
WHERE region_natgas != 'Canada'
);
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
wellpad_id |
varchar |
Unique identifier for the wellpad being fracked. Example: |
date_frac |
date |
Date of frac crew activity. Example: |
lat |
float |
Wellpad latitude in decimal degrees. Example: |
long |
float |
Wellpad longitude in decimal degrees. Example: |
operator |
varchar |
E&P company operating the well (mineral rights owner). Example: |
service_company |
varchar |
Frac service provider performing the work. Blank for Canada. Example: |
county |
varchar |
County name. Example: |
state_code |
varchar |
State/province 2-letter abbreviation. Example: |
sub_region_natgas |
varchar |
Detailed basin/area classification. Example: |
region_natgas |
varchar |
High-level geographic region (gulf, northeast, west, midwest, Canada). Example: |
frac_class |
varchar |
Well target commodity type (oil, gas). Blank for Canada. Example: |
hdl.completions#
Well-level completion records showing individual wells fractured as part of wellpad-level frac jobs. Each row represents a single well that was completed during a wellpad frac operation. The start_date and end_date represent the timing of the frac job at the wellpad level—multiple wells from the same wellpad share these dates. Always filter by end_date to measure completion activity over time, as this is when wells become ready for production.
Note
Date Field Usage: Use end_date (not start_date) for all completion activity trending and counts. This represents when the wellpad frac job was finished and wells are ready for production.
Table ID: hdl.completions
Sample Queries#
-- Monthly completions by region (use end_date)
SELECT
DATE_TRUNC('month', end_date) as completion_month,
region,
COUNT(*) as wells_completed
FROM hdl.completions
WHERE end_date >= '2024-01-01'
AND end_date < '2025-01-01'
GROUP BY DATE_TRUNC('month', end_date), region
ORDER BY completion_month, wells_completed DESC;
-- Top operators by completion count
SELECT
operator_name,
completion_class,
COUNT(*) as wells_completed
FROM hdl.completions
WHERE end_date >= '2024-01-01'
GROUP BY operator_name, completion_class
ORDER BY wells_completed DESC
LIMIT 20;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
well_id |
bigint |
Unique identifier for each individual well completed. Example: |
completion_class |
varchar |
Oil or gas classification based on EIA-defined production ratios. Example: |
county |
varchar |
County where the frac job occurred. Example: |
date_completion_end |
date |
Date when the wellpad frac job ended (use for activity trending). Example: |
date_completion_start |
date |
Date when the wellpad frac job started. Example: |
operator |
varchar |
Company that performed the frac job. Example: |
region_natgas |
varchar |
Natural gas production region. Example: |
state_code |
varchar |
State abbreviation. Example: |
sub_region_natgas |
varchar |
Specific sub-region/basin. Example: |
wellpad_id |
varchar |
Identifier for the wellpad where the frac job was performed. Example: |
hdl.ducs_by_operator#
Daily snapshot of Drilled but Uncompleted (DUC) well inventory at county-operator-day grain. Each row represents the count of DUCs for a single operator in a specific county on a given observation date. DUCs are wells that have been drilled but not yet completed (fractured). Rising DUC inventory indicates drilling outpacing completions (rig/frac ratio >3); falling inventory indicates completions outpacing drilling (ratio <2.5). Use SUM(ducs_count) for aggregation and preserve daily granularity for time-series visualization.
Note
Time-Series Visualization: Never aggregate DUC inventory into weekly/monthly sums—this is a daily snapshot, not a flow. Display as a daily time-series or use AVG() for period averages.
Table ID: hdl.ducs_by_operator
Sample Queries#
-- Current total DUC inventory (most recent snapshot)
SELECT
SUM(ducs_count) as total_ducs
FROM hdl.ducs_by_operator
WHERE date_observed = (SELECT MAX(date_observed) FROM hdl.ducs_by_operator);
-- DUC inventory time-series by region (daily granularity)
SELECT
date_observed,
region_natgas,
SUM(ducs_count) as duc_inventory
FROM hdl.ducs_by_operator
WHERE date_observed >= '2024-01-01'
GROUP BY date_observed, region_natgas
ORDER BY date_observed, duc_inventory DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
county |
varchar |
County name. Example: |
date_observed |
date |
Observation date for the DUC inventory snapshot. Example: |
ducs_count |
int |
Number of drilled but uncompleted wells for this operator/county. Example: |
operator |
varchar |
Operator name. Example: |
region_natgas |
varchar |
High-level geographic region. Example: |
state_code |
varchar |
State abbreviation. Example: |
sub_region_natgas |
varchar |
Detailed basin/area classification. Example: |
hdl.daily_fracked_feet#
Daily fracked feet by subregion, tracking the rate of hydraulic fracturing completion activity. Each row represents the total lateral feet fracked on a given day within a subregion. Use for monitoring frac intensity trends, comparing basin-level completion rates, and correlating frac activity with production outcomes.
Table ID: hdl.daily_fracked_feet
Sample Queries#
-- Daily fracked feet by subregion
SELECT date_frac, sub_region_natgas, fracked_feet
FROM hdl.daily_fracked_feet
WHERE date_frac >= '2024-01-01'
ORDER BY date_frac, sub_region_natgas;
-- Latest day's fracked feet by subregion
SELECT *
FROM hdl.daily_fracked_feet
WHERE date_frac = (SELECT MAX(date_frac) FROM hdl.daily_fracked_feet);
-- Latest per subregion (if subregions update at different times)
WITH latest AS (
SELECT sub_region_natgas, MAX(date_frac) AS max_date
FROM hdl.daily_fracked_feet
GROUP BY sub_region_natgas
)
SELECT d.*
FROM hdl.daily_fracked_feet d
INNER JOIN latest l
ON d.sub_region_natgas = l.sub_region_natgas
AND d.date_frac = l.max_date;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_frac |
date |
The date observed via satellite that the well crew completed fracking in YYYY-MM-DD format |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
fracked_feet |
float |
Daily length of feet fracked for wells |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
hdl.operator_classification#
Reference table classifying well operators as public or private companies, with stock ticker symbols for public operators. Use for segmenting drilling/production analysis by operator type and linking to financial data via ticker. Join to other hdl tables on the operator column to add public/private segmentation to any analysis.
Table ID: hdl.operator_classification
Sample Queries#
-- List all public operators with tickers
SELECT operator, ticker
FROM hdl.operator_classification
WHERE public_or_private = 'public'
ORDER BY operator;
-- Count operators by classification
SELECT public_or_private, COUNT(*) AS operator_count
FROM hdl.operator_classification
GROUP BY public_or_private;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
operator |
varchar |
A well operator name |
ticker |
varchar |
The stock ticker of the operator (null for private companies) |
public_or_private |
varchar |
Whether the operator is public or private |
hdl.wellpad_status#
Daily interpolated wellpad lifecycle status (Drilling or Fracking) by date, operator, and geography. Each row represents the status of a wellpad on a given day. Use for tracking wellpad progression through the drilling-to-completion lifecycle, counting active wellpads by status, and analyzing operator development timing.
Table ID: hdl.wellpad_status
Sample Queries#
-- Count of wellpads by status on a given date
SELECT date_status, wellpad_status, region_natgas, COUNT(*) as pad_count
FROM hdl.wellpad_status
WHERE date_status >= '2024-01-01'
GROUP BY date_status, wellpad_status, region_natgas
ORDER BY date_status;
-- Latest status snapshot (global max date)
SELECT *
FROM hdl.wellpad_status
WHERE date_status = (SELECT MAX(date_status) FROM hdl.wellpad_status);
-- Latest status per region (if regions update at different times)
WITH latest AS (
SELECT region_natgas, MAX(date_status) AS max_date
FROM hdl.wellpad_status
GROUP BY region_natgas
)
SELECT w.*
FROM hdl.wellpad_status w
INNER JOIN latest l
ON w.region_natgas = l.region_natgas
AND w.date_status = l.max_date;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_status |
date |
The date the wellpad status was applied in YYYY-MM-DD format |
wellpad_status |
varchar |
The status of the wellpad such as Drilling or Fracking |
operator |
varchar |
A well operator |
county |
varchar |
The county the wellpad is in |
state_code |
varchar |
A state abbreviation |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
wellpad_id |
varchar |
The unique identifier for the wellpad |
hdl.regions#
Reference table mapping counties to SynMax natural gas region and subregion classifications. Each row represents a unique county-to-region mapping. Use for joining geographic context to other datasets, building region filter dropdowns, or looking up which region/subregion a county belongs to.
Table ID: hdl.regions
Sample Queries#
-- List all subregions in a region
SELECT DISTINCT sub_region_natgas, state_code
FROM hdl.regions
WHERE region_natgas = 'gulf'
ORDER BY sub_region_natgas;
-- Find which region a county belongs to
SELECT state_code, county, sub_region_natgas, region_natgas
FROM hdl.regions
WHERE county = 'WEBB'
AND state_code = 'TX';
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
state_code |
varchar |
A state abbreviation |
sub_region_natgas |
varchar |
Subregion divisions based on natural gas production profiles |
region_natgas |
varchar |
Region divisions based on natural gas production profiles |
county |
varchar |
The county name |
hdl.glossary#
Reference table of available filter parameter values for all Hyperion API endpoints. Each row represents a unique combination of filter values available for a given target_function (API endpoint). Use this table to discover valid operators, regions, counties, and other filter values before querying the corresponding data table. Not all columns are populated for every target_function; irrelevant columns will be NULL (e.g., service_company is only populated for fraccrews).
Table ID: hdl.glossary
Sample Queries#
-- Get all valid operators for rigs data
SELECT DISTINCT operator
FROM hdl.glossary
WHERE target_function = 'rigs'
AND operator IS NOT NULL
ORDER BY operator;
-- Get all valid sub-regions for short term forecast
SELECT DISTINCT sub_region_natgas
FROM hdl.glossary
WHERE target_function = 'shorttermforecast'
AND sub_region_natgas IS NOT NULL
ORDER BY sub_region_natgas;
-- List all target functions (endpoints) available
SELECT DISTINCT target_function
FROM hdl.glossary
ORDER BY target_function;
-- Get available forecast run dates
SELECT DISTINCT date_forecast_run
FROM hdl.glossary
WHERE target_function = 'shorttermforecast'
AND date_forecast_run IS NOT NULL
ORDER BY date_forecast_run DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
target_function |
nvarchar(50) |
The API endpoint / data table these filter values apply to (e.g. ‘rigs’, ‘fraccrews’, ‘wells’, ‘shorttermforecast’) |
region_natgas |
nvarchar(15) |
Region divisions based on natural gas production profiles |
state_code |
nvarchar(15) |
A state abbreviation |
sub_region_natgas |
nvarchar(30) |
Subregion divisions based on natural gas production profiles |
county |
nvarchar(50) |
The county the well is in |
operator |
nvarchar(120) |
A well operator |
service_company |
nvarchar(120) |
The service company for the well |
completion_class |
nvarchar(6) |
Oil/Gas classification based on EIA-defined gas/oil production ratios |
rig_class |
nvarchar(6) |
Oil/Gas classification based on EIA-defined gas/oil production ratios |
frac_class |
nvarchar(6) |
Oil/Gas classification based on EIA-defined gas/oil production ratios |
date_forecast_run |
nvarchar(10) |
The date the short term forecast was simulated on in YYYY-MM-DD format |
modeled |
nvarchar(6) |
Whether the record is modeled (‘True’) or state-reported (‘False’) |
rec_del |
float |
Record deletion flag |
category |
nvarchar(20) |
Pipeline location category (e.g. Production, LDC, Power, Storage, LNG, Interconnect) |
source |
nvarchar(20) |
Pipeline data source/cycle (e.g. Timely, Evening, ID1, ID2, ID3) |
hdl.pipeline_flow#
Simplified, cleaned daily pipeline nomination dataset containing flow volumes, capacities, and location metadata for interstate natural gas pipeline points across the United States. Each row represents one day’s best flow data for a specific pipeline location, enabling supply/demand analysis via location_category filtering (Production, LNG, Power, Industrial, Storage, Interconnect), directional flow separation via rec_del (1=receipt into pipeline, -1=delivery out), and time-series tracking via smx_location_id. Supports regional aggregation, capacity utilization analysis, and LNG export feedgas monitoring.
Table ID: hdl.pipeline_flow
Sample Queries#
-- Daily LNG export feedgas by terminal
SELECT
gas_day,
pipeline_name,
location_name,
SUM(total_scheduled_quantity) AS total_feedgas
FROM hdl.pipeline_flow
WHERE location_category = 'LNG'
AND rec_del = -1
AND gas_day >= '2024-01-01'
GROUP BY gas_day, pipeline_name, location_name
ORDER BY gas_day, pipeline_name;
-- Identify constrained points (high utilization)
SELECT
gas_day,
pipeline_name,
location_name,
location_category,
total_scheduled_quantity,
operating_capacity,
(total_scheduled_quantity / NULLIF(operating_capacity, 0)) * 100 AS utilization_pct
FROM hdl.pipeline_flow
WHERE operating_capacity > 0
AND gas_day = '2024-12-01'
AND (total_scheduled_quantity / NULLIF(operating_capacity, 0)) > 0.9
ORDER BY utilization_pct DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
smx_location_id |
int |
Unique SynMax location identifier. One |
gas_day |
date |
The gas day this data represents (standard pipeline day). |
cycle_lookup |
varchar |
Standardized nomination cycle (Timely, Evening, Intraday1, etc.). |
cycle_desc |
varchar |
Raw pipeline-reported cycle description. |
posting_dt |
datetime |
When pipeline published this data. |
created_dt |
datetime |
When SynMax ingested this data. |
pipeline_name |
varchar |
Name of the pipeline. Example: |
location_name |
varchar |
Name of the specific point. Example: |
smx_tsp_id |
varchar |
SynMax-standardized pipeline ID. |
location_county |
varchar |
County name. |
location_state_ab |
varchar |
State abbreviation (TX, LA, PA, etc.). |
location_county_composite |
varchar |
Combined “County (State)” format. |
location_latitude |
float |
Latitude coordinate. |
location_longitude |
float |
Longitude coordinate. |
region_nat_gas |
varchar |
Regional classification (reflects regional source of gas for production points). |
sub_region_nat_gas |
varchar |
Sub-regional classification. |
rec_del |
int |
Canonical receipt/delivery flag: |
flow_indicator |
varchar |
Directional context from pipeline: |
flow_indicator_long |
varchar |
Full description (e.g., “Receipt Location”, “Delivery Location”). |
location_category |
varchar |
Functional role: Production, Interconnect, LDC, Power, Industrial, Storage, LNG, Compressor, etc. |
total_scheduled_quantity |
decimal |
Primary flow metric - nominated/scheduled gas volume for this day. |
design_capacity |
decimal |
Theoretical maximum built capacity. |
operating_capacity |
decimal |
Allowed maximum for this gas day. |
operationally_available_capacity |
decimal |
Remaining schedulable capacity (≈ operating_capacity − total_scheduled_quantity). |
mean_basis_desc |
varchar |
Unit of measurement: |
interconnect_location_id |
varchar |
ID of interconnected pipeline/facility. |
interconnect_company_name |
varchar |
Name of interconnected company. |
interconnect_location_name |
varchar |
Name of interconnected location. |
source |
varchar |
Data source identifier (pipeline-specific). |
hdl.pipeline_locations#
Master reported flow point dataset containing location metadata for interstate natural gas pipeline points across the United States. Each row represents one reported data point for a specific pipeline location with one smx_location_id per physical point. Use this table for location lookups and filtering, then join to hdl.pipeline_flow_only for flow data. Metadata is updated periodically as points are better classified or geographic data is added, providing the latest location info without re-querying flow data.
Table ID: hdl.pipeline_locations
Sample Queries#
-- Search for points by location name pattern
SELECT DISTINCT
smx_location_id,
pipeline_name,
location_name,
location_category,
location_county_composite
FROM hdl.pipeline_locations
WHERE location_name ILIKE '%STRATTON%'
OR location_name ILIKE '%FREEPORT%'
LIMIT 50;
-- Count locations by category and state
SELECT
location_category,
location_state_ab,
COUNT(*) AS location_count
FROM hdl.pipeline_locations
WHERE location_category IN ('Production', 'LNG', 'Power', 'Storage')
GROUP BY location_category, location_state_ab
ORDER BY location_category, location_count DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
smx_location_id |
int |
Unique SynMax location identifier. One |
pipeline_name |
varchar |
Name of the pipeline. Example: |
smx_tsp_id |
varchar |
SynMax-standardized pipeline ID. Example: |
location_name |
varchar |
Name of the specific point. Example: |
pipeline_location_id |
varchar |
Pipeline-specific location identifier as reported by operator. Example: |
flow_indicator |
varchar |
Directional context: |
flow_indicator_short |
varchar |
Short flow indicator code. Example: |
flow_indicator_long |
varchar |
Full description (e.g., “Receipt Location”, “Delivery Location”, “Bi-Directional”). |
location_category |
varchar |
Functional role: Production, Interconnect, LDC, Power, Industrial, Storage, LNG, Compressor, etc. Example: |
rec_del |
int |
Canonical receipt/delivery flag: |
region_nat_gas |
varchar |
SynMax regional classification. Example: |
sub_region_nat_gas |
varchar |
SynMax sub-regional classification. Example: |
location_county_composite |
varchar |
Combined “County (State)” format. Example: |
location_county |
varchar |
County name. Example: |
location_state_ab |
varchar |
State abbreviation (TX, LA, PA, etc.). |
location_latitude |
float |
Latitude coordinate. Example: |
location_longitude |
float |
Longitude coordinate. Example: |
effective_dt |
datetime |
Date when this location became effective as reported by pipeline. Example: |
inactive_dt |
datetime |
Date when this location was deactivated (if applicable). Example: |
loc_status_code |
varchar |
Location status code (e.g., |
interconnect_location_id |
varchar |
ID of interconnected pipeline/facility. Example: |
interconnect_company_name |
varchar |
Name of interconnected company. Example: |
interconnect_location_name |
varchar |
Name of interconnected location. Example: |
update_dt |
datetime |
When this location metadata was last updated in SynMax. Example: |
hdl.pipeline_flow_only#
Simplified, cleaned daily pipeline nomination dataset containing flow volumes and capacities for natural gas pipeline points across the United States without location metadata. Each row represents one day’s best flow data for a specific pipeline location (gas_day). This table provides the same core data as hdl.pipeline_flow but with a streamlined schema—join to hdl.pipeline_locations using smx_location_id for location details. Ideal for time-series analysis, capacity utilization tracking, and regional aggregation workflows.
Note
Zero Values: If all four capacity metrics (total_scheduled_quantity, design_capacity, operating_capacity, operationally_available_capacity) are zero, interpret as inactive/not applicable, not an error.
Table ID: hdl.pipeline_flow_only
Sample Queries#
-- Track LNG export feedgas with location join
SELECT
f.gas_day,
l.pipeline_name,
l.location_name,
SUM(f.total_scheduled_quantity) AS total_feedgas
FROM hdl.pipeline_flow_only f
JOIN hdl.pipeline_locations l ON f.smx_location_id = l.smx_location_id
WHERE l.location_category = 'LNG'
AND l.rec_del = -1
AND f.gas_day >= '2024-01-01'
GROUP BY f.gas_day, l.pipeline_name, l.location_name
ORDER BY f.gas_day, l.pipeline_name;
-- Pipeline capacity utilization with constrained points
SELECT
f.gas_day,
l.pipeline_name,
l.location_name,
l.location_category,
f.total_scheduled_quantity,
f.operating_capacity,
(f.total_scheduled_quantity / NULLIF(f.operating_capacity, 0)) * 100 AS utilization_pct
FROM hdl.pipeline_flow_only f
JOIN hdl.pipeline_locations l ON f.smx_location_id = l.smx_location_id
WHERE f.operating_capacity > 0
AND f.gas_day = '2024-12-01'
AND (f.total_scheduled_quantity / NULLIF(f.operating_capacity, 0)) > 0.9
ORDER BY utilization_pct DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
smx_location_id |
int |
Unique location identifier (join key to |
gas_day |
date |
The gas day this data represents (standard pipeline day in CT). Example: |
cycle_lookup |
varchar |
Standardized nomination cycle (Timely, Evening, Intraday1, etc.). Example: |
cycle_desc |
varchar |
Raw pipeline-reported cycle description. Example: |
total_scheduled_quantity |
decimal |
Primary flow metric - nominated/scheduled gas volume for this day. Example: |
design_capacity |
decimal |
Theoretical maximum built capacity. Example: |
operating_capacity |
decimal |
Allowed maximum for this gas day. Example: |
operationally_available_capacity |
decimal |
Remaining schedulable capacity (≈ operating_capacity − total_scheduled_quantity). Example: |
mean_basis_desc |
varchar |
Unit of measurement: |
source |
varchar |
Data source identifier (pipeline-specific). Example: |
created_dt |
datetime |
When SynMax ingested this data (in UTC). Example: |
posting_dt |
datetime |
When pipeline published this data (in UTC). Example: |
Vulcan#
Satellite-verified construction monitoring for energy infrastructure including power generation projects, data centers, and LNG facilities across the United States and globally. Tracks construction milestones from land clearing through commissioning via satellite imagery analysis, providing independent predictions of online dates that often differ from official EIA or developer estimates. Supports capacity forecasting, supply planning, and construction pipeline analysis with project-level visibility into solar, wind, battery, natural gas, hyperscale data centers, and LNG terminals.
vdl.under_construction#
Power generation projects (solar, wind, battery, natural gas) under construction with satellite-verified progress monitoring at the generator level. Each row represents a single generator with construction milestones, timeline predictions, and comparison of EIA-reported vs satellite-observed status. Supports technology filtering via technology, construction likelihood scoring via project_rank (0-7 scale), and schedule comparison via days_eia_minus_vulcan_earliest_online. Use vulcan_status = 'Confirmed' for satellite-verified projects and COUNT(DISTINCT plant_id) for unique facility counts since plants may have multiple generators.
Table ID: vdl.under_construction
Sample Queries#
-- Confirmed capacity under construction by technology
SELECT
technology,
COUNT(DISTINCT plant_id) AS plant_count,
COUNT(*) AS generator_count,
SUM(nameplate_capacity) AS total_capacity_mw
FROM vdl.under_construction
WHERE vulcan_status = 'Confirmed'
GROUP BY technology
ORDER BY total_capacity_mw DESC;
-- High-likelihood projects (rank >= 5) expected online in 12 months
SELECT
plant_name,
technology,
nameplate_capacity,
state_code,
date_vulcan_earliest_online,
project_rank,
days_eia_minus_vulcan_earliest_online
FROM vdl.under_construction
WHERE project_rank >= 5
AND vulcan_status = 'Confirmed'
AND date_vulcan_earliest_online BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL '365 days'
ORDER BY date_vulcan_earliest_online
LIMIT 25;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
synmax_id |
varchar |
Unique key combining plant_id + generator_id. Example: |
plant_id |
varchar |
EIA plant identifier. Example: |
plant_name |
varchar |
EIA plant name. Example: |
generator_id |
varchar |
EIA generator identifier. Example: |
entity_name |
text |
Operating entity/owner name |
total_generators |
integer |
Total count of generators at this plant |
technology |
text |
Generation technology (Solar Photovoltaic, Batteries, Natural Gas, Onshore Wind). Example: |
vulcan_tech |
varchar |
Simplified tech category for construction time estimates |
plant_status |
text |
EIA plant status with detail |
vulcan_status |
text |
Satellite verification status (Confirmed, none). Example: |
project_rank |
numeric |
Construction likelihood score (0-7 scale, higher = more likely). Example: |
state_code |
text |
2-letter state code. Example: |
balancing_authority |
text |
Grid balancing authority code. Example: |
latitude |
double precision |
Plant latitude. Example: |
longitude |
double precision |
Plant longitude. Example: |
nameplate_capacity |
bigint |
Generator rated capacity in MW. Example: |
nameplate_capacity_unit |
text |
Capacity unit (typically MW) |
date_planned_operation |
date |
EIA expected online date (normalized to 15th of month). Example: |
date_vulcan_earliest_online |
date |
Earliest predicted operational date based on construction progress. Example: |
date_vulcan_latest_online |
date |
Latest predicted operational date (conservative estimate). Example: |
date_vulcan_earliest_plus_7 |
date |
Earliest date if construction starts within 1 week |
date_land_cleared |
date |
First land clearing observed via satellite |
date_first_structures |
date |
First permanent structures identified |
date_construction_start |
date |
Construction start date |
date_construction_50_percent_complete |
text |
Visually assessed as 50% complete |
date_construction_completed |
text |
Construction complete per satellite imagery |
date_vulcan_status_change |
date |
Date status changed to under construction |
date_image |
date |
Latest satellite image date (data freshness indicator) |
observation |
text |
Satellite observation notes/comments |
days_eia_minus_vulcan_earliest_online |
integer |
Days difference: EIA - Vulcan earliest (positive = EIA later/more conservative). Example: |
days_eia_minus_vulcan_latest_online |
integer |
Days difference: EIA - Vulcan latest (negative = EIA earlier/more optimistic) |
vdl.datacenters#
Data center facility projects (hyperscale, co-location, enterprise, crypto mining) with satellite-verified construction progress, capacity forecasts, and power demand timelines at the unit level. Each facility (plant_id) may have multiple units (synmax_id), so aggregate to facility level when counting unique data centers. Supports filtering by data_center_type, vulcan_status = 'Confirmed' for verified construction, and owner analysis via owner_name. Use balancing_authority for grid-level power demand analysis.
Warning
Deduplication Required: This table contains historical snapshots. Each unit (synmax_id) may have multiple rows with different modified_at timestamps. Always filter for the latest modified_at per synmax_id using ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY modified_at DESC) before analysis.
Note
State Code Inconsistency: The state_code field contains mixed formats—both full names (“Virginia”, “Texas”) and 2-letter codes (“VA”, “TX”). Normalize before aggregating by state.
Table ID: vdl.datacenters
Sample Queries#
-- Confirmed data centers under construction by type (deduplicated)
WITH latest_datacenters AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY modified_at DESC) AS rn
FROM vdl.datacenters
)
SELECT
data_center_type,
COUNT(DISTINCT plant_id) AS facility_count,
SUM(unit_capacity) AS total_capacity_mw
FROM latest_datacenters
WHERE rn = 1
AND vulcan_status = 'Confirmed'
GROUP BY data_center_type
ORDER BY total_capacity_mw DESC;
-- Top data center owners by capacity (deduplicated)
WITH latest_datacenters AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY modified_at DESC) AS rn
FROM vdl.datacenters
)
SELECT
owner_name,
COUNT(DISTINCT plant_id) AS facility_count,
SUM(unit_capacity) AS total_capacity_mw
FROM latest_datacenters
WHERE rn = 1
AND vulcan_status = 'Confirmed'
GROUP BY owner_name
ORDER BY total_capacity_mw DESC
LIMIT 15;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
synmax_id |
text |
Unique key: plant_id + unit_id. Example: |
plant_id |
bigint |
IIR plant identifier |
plant_name |
text |
IIR facility/plant name |
unit_id |
bigint |
IIR unit identifier |
unit_name |
text |
IIR unit name |
owner_name |
text |
Data center owner/operator |
data_center_type |
text |
Type: Hyperscale Data Center, Co-Location Data Center, Enterprise Data Center, Crypto Mining Data Center |
plant_status |
text |
IIR facility status (Under Construction, Planned, Operational, Engineered) |
unit_status |
text |
Unit status (Engineered, Planned, On Hold, Under Construction) |
vulcan_status |
text |
Satellite verification status (Confirmed, none). Example: |
state_code |
text |
State name or 2-letter code (mixed format—normalize before aggregation). Example: |
balancing_authority |
text |
Grid balancing authority (ERCOT, MISO, PJM, SPP). Example: |
unit_capacity |
numeric |
Unit power capacity in MW. Example: |
date_planned_operation |
date |
IIR expected online date |
date_vulcan_earliest_online |
date |
Earliest predicted operational date based on construction progress |
date_vulcan_latest_online |
date |
Latest predicted operational date (conservative estimate) |
date_vulcan_median_online |
date |
Median predicted operational date (middle estimate) |
date_vulcan_earliest_plus_7 |
text |
Earliest online if construction starts 1 week after last update |
date_land_cleared |
date |
First land clearing observed via satellite |
date_first_structures |
date |
First permanent structures identified |
date_construction_start |
date |
Construction start date |
date_construction_50_percent_complete |
text |
Visually assessed as 50% complete |
date_construction_completed |
text |
Construction complete per satellite imagery |
date_vulcan_status_change |
date |
Date status changed to under construction |
date_image |
date |
Latest satellite image date (data freshness indicator) |
observation |
text |
Satellite observation notes |
days_iir_minus_vulcan_earliest_online |
text |
Days difference: IIR - Vulcan earliest (positive = IIR later/more conservative) |
days_iir_minus_vulcan_latest_online |
text |
Days difference: IIR - Vulcan latest (negative = IIR earlier/more optimistic) |
source |
text |
Data source identifier. Example: |
created_at |
text |
Record creation timestamp |
modified_at |
date |
Last modification date (use for deduplication) |
vdl.lng_projects#
Global LNG facility development projects (liquefaction and regasification) with satellite-verified construction progress, capacity forecasts, and milestone tracking. Each row represents one project phase—use plant_name + phase_number as the composite unique key. Supports technology filtering via technology (Liquefaction for export terminals, Regasification for import terminals), schedule analysis via Vulcan date predictions, and capacity aggregation in MTPA (million tonnes per annum).
Table ID: vdl.lng_projects
Sample Queries#
-- LNG capacity under construction by technology type
SELECT
technology,
COUNT(*) AS project_count,
SUM(nameplate_capacity) AS total_capacity_mtpa,
SUM(trains) AS total_trains
FROM vdl.lng_projects
WHERE vulcan_status = 'Confirmed'
GROUP BY technology
ORDER BY total_capacity_mtpa DESC;
-- LNG projects expected online in next 24 months
SELECT
plant_name,
phase_number,
technology,
nameplate_capacity,
capacity_unit,
date_vulcan_earliest_online,
date_vulcan_latest_online,
entity_name
FROM vdl.lng_projects
WHERE vulcan_status = 'Confirmed'
AND date_vulcan_earliest_online BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL '730 days'
ORDER BY date_vulcan_earliest_online
LIMIT 20;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
synmax_id |
text |
Currently TBD for all LNG projects—do not use |
plant_name |
text |
LNG facility name (part of composite key). Example: |
phase_number |
integer |
Project phase number (part of composite key). Example: |
entity_name |
text |
Project owner/operator |
technology |
text |
Liquefaction (export) or Regasification (import). Example: |
vulcan_status |
text |
Satellite verification status (Confirmed, none). Example: |
latitude |
double precision |
Facility latitude |
longitude |
double precision |
Facility longitude |
nameplate_capacity |
double precision |
Rated capacity value. Example: |
capacity_unit |
text |
Capacity unit (typically mtpa = million tonnes per annum) |
trains |
integer |
Number of LNG trains or storage tanks |
date_planned_operation |
date |
Officially planned/expected operational date |
date_vulcan_earliest_online |
date |
Earliest predicted operational date based on construction progress |
date_vulcan_latest_online |
date |
Latest predicted operational date (conservative estimate) |
date_land_cleared |
date |
First land clearing observed via satellite |
date_first_structures |
date |
First permanent structures identified |
date_construction_start |
date |
Construction start date |
date_construction_50_percent_complete |
text |
Visually assessed as 50% complete |
date_construction_completed |
date |
Construction complete per satellite imagery |
date_vulcan_status_change |
date |
Date status changed to under construction |
date_image |
date |
Latest satellite image date (data freshness indicator) |
observation |
text |
Satellite observation notes/comments |
days_planned_operation_minus_vulcan_earliest_online |
text |
Days difference: planned - Vulcan earliest (positive = ahead of schedule) |
days_planned_operation_minus_vulcan_latest_online |
text |
Days difference: planned - Vulcan latest (positive = ahead of schedule) |
created_at |
date |
Record creation date |
modified_at |
date |
Last modification date |
vdl.project_rankings#
Construction likelihood rankings (0-7 scale) for power generation projects under development, providing probability scores and forecasted online dates. Each row represents a ranking snapshot for a specific project, with final_rank indicating construction likelihood (6-7 = very high, 4-5 = moderate, 2-3 = lower, 0-1 = unlikely). Use date_vulcan_proposed_v2_online for near-term (2-year) seasonal forecasts and date_vulcan_proposed_online for longer-term (5-year) conservative forecasts. Join with vdl.under_construction on synmax_id for full project details.
Warning
Deduplication Required: This table contains historical snapshots. Each project (synmax_id) may have multiple rows with different date_updated timestamps. Always filter for the latest date_updated per synmax_id using ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY date_updated DESC) before analysis.
Table ID: vdl.project_rankings
Sample Queries#
-- High-likelihood projects (rank >= 5) deduplicated
WITH latest_ranks AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY date_updated DESC) AS rn
FROM vdl.project_rankings
)
SELECT
synmax_id,
plant_id,
generator_id,
final_rank,
date_vulcan_proposed_v2_online,
date_updated
FROM latest_ranks
WHERE rn = 1
AND final_rank >= 5
ORDER BY final_rank DESC, date_vulcan_proposed_v2_online
LIMIT 25;
-- Rank distribution across all projects (deduplicated)
WITH latest_ranks AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY date_updated DESC) AS rn
FROM vdl.project_rankings
)
SELECT
FLOOR(final_rank) AS rank_bucket,
COUNT(*) AS project_count
FROM latest_ranks
WHERE rn = 1
GROUP BY FLOOR(final_rank)
ORDER BY rank_bucket DESC;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
synmax_id |
character varying |
Unique project identifier (plant_id + generator_id). Example: |
plant_id |
bigint |
EIA plant identifier. Example: |
generator_id |
character varying |
EIA generator identifier. Example: |
final_rank |
real |
Construction likelihood score (0-7 scale, higher = more likely). Example: |
date_vulcan_proposed_online |
date |
Conservative forecast (5-year outlook). Example: |
date_vulcan_proposed_v2_online |
date |
Seasonal forecast (2-year outlook). Example: |
date_updated |
date |
Last ranking update date (use for deduplication). Example: |
created_at |
text |
Record creation timestamp (may be TBD) |
modified_at |
text |
Last modification timestamp (may be TBD) |
vdl.metadata_history#
Historical EIA metadata tracking for power generation projects containing one record per project per EIA monthly update. Enables analysis of capacity changes, schedule slippage, status transitions, and project lifecycle tracking over time. Use days_planned_operation_minus_first_seen_planned_operation to identify delayed projects (positive values indicate schedule slip in days). Supports technology analysis, state-level aggregation, and identifying projects that have retired or been cancelled.
Warning
Deduplication Required: This table contains multiple snapshots per project. Each project (synmax_id) appears once per EIA monthly update (date_eia_updated). Always filter for the latest date_eia_updated per synmax_id using ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY date_eia_updated DESC) before analysis.
Table ID: vdl.metadata_history
Sample Queries#
-- Latest metadata for all projects (deduplicated)
WITH latest_metadata AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY date_eia_updated DESC) AS rn
FROM vdl.metadata_history
)
SELECT
plant_name,
technology,
nameplate_capacity,
state_code,
date_planned_operations,
plant_status
FROM latest_metadata
WHERE rn = 1
ORDER BY nameplate_capacity DESC
LIMIT 25;
-- Projects with significant schedule slip (> 1 year delay)
WITH latest_metadata AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY synmax_id ORDER BY date_eia_updated DESC) AS rn
FROM vdl.metadata_history
)
SELECT
plant_name,
technology,
nameplate_capacity,
state_code,
days_planned_operation_minus_first_seen_planned_operation AS schedule_slip_days
FROM latest_metadata
WHERE rn = 1
AND days_planned_operation_minus_first_seen_planned_operation > 365
ORDER BY schedule_slip_days DESC
LIMIT 25;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
synmax_id |
text |
Unique project identifier (plant_id + generator_id) |
plant_id |
bigint |
EIA plant identifier |
plant_name |
varchar |
EIA plant name |
generator_id |
varchar |
EIA generator identifier |
generator_unit_code |
text |
Generator unit code |
entity_id |
bigint |
Operating entity identifier |
entity_name |
text |
Operating entity/owner name |
plant_status |
text |
Current EIA plant status |
first_seen_status |
varchar |
Status when first reported in EIA data |
last_seen_status |
varchar |
Status when last seen in EIA data |
state_code |
varchar |
2-letter state code. Example: |
county |
text |
County name |
latitude |
double precision |
Plant latitude |
longitude |
double precision |
Plant longitude |
balancing_authority_code |
text |
Grid balancing authority code |
technology |
text |
Generation technology (Solar PV, Natural Gas Combined Cycle, etc.) |
energy_source_code |
text |
Primary energy source (SUN, NG, WND) |
prime_mover_code |
text |
Prime mover type (BA = battery, CT = combustion turbine) |
sector |
text |
Ownership sector code |
sector_name |
text |
Ownership sector description |
nameplate_capacity |
double precision |
Rated/nameplate capacity in MW |
nameplate_capacity_unit |
text |
Capacity unit (typically MW) |
net_summer_capacity |
double precision |
Summer peak capacity (derated) |
net_winter_capacity |
double precision |
Winter peak capacity (derated) |
nameplate_energy_capacity |
double precision |
Storage capacity for batteries (MWh) |
date_eia_updated |
date |
EIA update date (use for deduplication) |
date_planned_operations |
date |
Planned/scheduled operation date |
date_operating |
date |
Actual operation date |
date_retirement |
date |
Retirement date |
date_first_seen |
date |
First appearance in EIA dataset |
date_last_seen |
date |
Last appearance in EIA dataset |
date_first_seen_planned_operations |
date |
First reported planned operation date |
date_planned_derate |
date |
Planned capacity reduction date |
planned_derate_of_summer_capacity |
double precision |
Planned reduction amount (MW) |
date_planned_uprate |
date |
Planned capacity increase date |
planned_uprate_of_summer_capacity |
double precision |
Planned increase amount (MW) |
date_planned_retirement |
date |
Planned retirement date |
days_last_seen_minus_first_seen |
double precision |
Days project has been in EIA dataset |
days_operation_minus_first_seen |
double precision |
Days from first seen to operation |
days_planned_operation_minus_first_seen_planned_operation |
double precision |
Schedule slip in days (positive = delayed). Example: |
created_at |
text |
Record creation timestamp |
modified_at |
text |
Last modification timestamp |
Daily Nat Gas Production#
Modeled daily dry gas production by sub-region (basin level) covering all major U.S. natural gas producing areas from January 2018 to present. Provides daily production rate estimates in Bcf/day with 60-90 day timeliness advantage over state-reported monthly production.
hdl.daily_production#
Daily dry gas production by sub-region for all major U.S. producing areas. Modeled daily production rates derived from pipeline flow data and other sources, providing more timely estimates than state-reported monthly production which lags by 60-90 days. Data represents dry gas (processed natural gas with NGLs removed) measured in Bcf/day across 30 sub-regions covering all major U.S. natural gas producing basins. Use date_prod for production date, sub_region_natgas for basin-level aggregation, and region_natgas for regional groupings. Supports real-time supply monitoring, basin-level trend analysis, and correlating upstream activity with production response.
Warning
Critical Aggregation Rules: Use AVG() for time aggregation (monthly/quarterly/annual) since values are daily RATES, not volumes. Use SUM() for spatial aggregation across sub-regions on a single date. Data is already in Bcf/day—no unit conversion needed.
Table ID: hdl.daily_production
Sample Queries#
-- Total U.S. daily production (spatial aggregation - use SUM)
SELECT
date_prod,
SUM(prod_dry_gas_bcf_day) as total_us_bcf_per_day
FROM hdl.daily_production
WHERE date_prod >= '2024-01-01'
GROUP BY date_prod
ORDER BY date_prod DESC
LIMIT 30;
-- Monthly basin production (time aggregation - use AVG)
SELECT
DATE_TRUNC('month', date_prod) as month,
sub_region_natgas,
AVG(prod_dry_gas_bcf_day) as avg_daily_bcf
FROM hdl.daily_production
WHERE date_prod >= '2024-01-01'
GROUP BY DATE_TRUNC('month', date_prod), sub_region_natgas
ORDER BY 1, 3 DESC
LIMIT 50;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
date_prod |
date |
Production date in YYYY-MM-DD format. Example: |
sub_region_natgas |
varchar |
Basin-level sub-region name. 30 unique sub-regions covering all major U.S. natural gas producing basins. Example: |
region_natgas |
text |
Regional grouping for sub-regions (gulf, northeast, west, midwest, southeast). Example: |
prod_dry_gas_bcf_day |
numeric |
Daily dry gas production rate in Bcf/day. Already in standard units—no conversion needed. Example: |
Leviaton#
Global LNG trade flow intelligence tracking cargo transactions, vessel movements, and fleet status for 921 LNG carriers worldwide. Provides vessel-level visibility into LNG supply chains from production terminals to consumption markets.
ldl.transaction_history_raw#
Comprehensive global LNG cargo tracking dataset recording every loading and offloading transaction for LNG carriers worldwide. Tracks complete cargo journeys from production facilities to consumption terminals with BCF volumes, enabling trade flow analysis via loading_origin_country for true production source identification and imo for individual vessel journey tracking.
Table ID: ldl.transaction_history_raw
Sample Queries#
-- US LNG exports by terminal (last 30 days)
SELECT
terminal,
COUNT(*) as transaction_count,
SUM(volume) as total_bcf
FROM ldl.transaction_history_raw
WHERE transaction_type = 'loading'
AND country = 'USA'
AND timestamp >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY terminal
ORDER BY total_bcf DESC
LIMIT 10;
-- Global trade flows by origin country
SELECT
loading_origin_country,
country as destination_country,
COUNT(*) as shipments,
SUM(volume) as total_volume_bcf
FROM ldl.transaction_history_raw
WHERE transaction_type = 'offloading'
AND loading_origin_country IS NOT NULL
AND timestamp >= '2024-01-01'
GROUP BY loading_origin_country, country
ORDER BY total_volume_bcf DESC
LIMIT 20;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
timestamp |
timestamp |
Exact time of the current transaction (loading/offloading). Example: |
imo |
bigint |
International Maritime Organization number - unique ship identifier. Example: |
terminal |
text |
Name of the current terminal where transaction occurred. Example: |
terminal_id |
text |
Unique identifier for the current terminal. Example: |
country |
text |
Country where current terminal is located. Example: |
country_code |
text |
Two-letter country code. Example: |
origin_timestamp |
timestamp |
Previous transaction timestamp (87% populated). Example: |
origin_terminal |
text |
Previous terminal in the cargo’s journey. Example: |
origin_terminal_id |
text |
Unique ID for previous terminal. Example: |
origin_country |
text |
Country of previous terminal. Example: |
origin_country_code |
text |
Country code of previous location. Example: |
loading_origin_timestamp |
timestamp |
When cargo was originally loaded (42% populated). Example: |
loading_origin_terminal |
text |
Original loading terminal (production facility). Example: |
loading_origin_terminal_id |
text |
Unique ID for original loading terminal. Example: |
loading_origin_country |
text |
Country where LNG was produced. Example: |
loading_origin_country_code |
text |
Producer country code. Example: |
volume |
double precision |
LNG cargo volume in BCF (billion cubic feet). Example: |
volume_unit |
text |
Volume unit, always |
transaction_type |
text |
Type of event: |
modified_at |
timestamp |
When this record was last updated in the database |
ldl.ais_history_raw#
Comprehensive global LNG vessel tracking database containing 26.7 million AIS records for 921 unique LNG vessels spanning January 2019 to present with real-time updates. Captures precise GPS coordinates, speed, heading, draught (loading indicator), and captain’s destination with standardized port codes for trade flow mapping, voyage optimization, and port congestion analysis.
Table ID: ldl.ais_history_raw
Sample Queries#
-- Track specific vessel journey (last 30 days)
SELECT
timestamp_dynamic,
latitude,
longitude,
speed,
captains_destination,
draught
FROM ldl.ais_history_raw
WHERE imo = 9636723
AND timestamp_dynamic >= CURRENT_DATE - INTERVAL '30 days'
ORDER BY timestamp_dynamic
LIMIT 100;
-- Daily active vessel count
SELECT
DATE(timestamp_dynamic) as date,
COUNT(DISTINCT vessel_id) as active_vessels
FROM ldl.ais_history_raw
WHERE timestamp_dynamic >= '2024-01-01'
GROUP BY DATE(timestamp_dynamic)
ORDER BY date
LIMIT 30;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
vessel_id |
varchar |
Unique internal vessel identifier. Example: |
imo |
integer |
International Maritime Organization number (7-digit unique vessel ID). Example: |
name |
varchar |
Vessel name. Example: |
mmsi |
integer |
Maritime Mobile Service Identity (some nulls present). Example: |
callsign |
varchar |
Radio call sign. Example: |
timestamp_dynamic |
timestamp with time zone |
Actual position timestamp for real-time tracking. Example: |
timestamp_static |
timestamp with time zone |
Static information timestamp. Example: |
longitude |
double precision |
Precise vessel longitude (-176° to 180°). Example: |
latitude |
double precision |
Precise vessel latitude (-36° to 78°). Example: |
location |
USER-DEFINED |
GeoJSON Point geometry with EPSG:4326 CRS |
speed |
double precision |
Vessel speed in knots (0-68.7 knots range). Example: |
heading |
double precision |
Vessel heading in degrees (0-511, where 511 = no data). Example: |
course |
double precision |
Course over ground in degrees (0-511). Example: |
captains_destination |
varchar |
Standardized port codes (100% complete). Example: |
captains_eta |
timestamp with time zone |
Captain’s estimated time of arrival. Example: |
draught |
double precision |
Vessel draught in meters (loading indicator, 0-19.1m). Example: |
flag |
varchar |
Flag state (currently 0% complete) |
ais_nav_status |
varchar |
Navigation status (currently 0% complete) |
rate_of_turn |
varchar |
Rate of turn data (mostly null) |
msg_collection_method |
varchar |
AIS message collection method (mostly null) |
created_at |
timestamp with time zone |
Record creation timestamp |
modified_at |
timestamp with time zone |
Record modification timestamp |
ldl.vessel_info_latest#
Real-time snapshot of 903 active LNG vessels combining latest AIS position with complete vessel specifications (capacity, dimensions, type) and forecasted destinations. Each vessel appears exactly once with its most current information, enabling fleet monitoring, supply forecasting, and load status analysis via draught thresholds. Use imo to join with ldl.ais_history_raw for historical tracking or ldl.transaction_history_raw for cargo movements.
Table ID: ldl.vessel_info_latest
Sample Queries#
-- Current fleet snapshot by vessel type
SELECT
vessel_type,
COUNT(*) as vessel_count,
AVG(maximum_capacity) as avg_capacity_cbm,
AVG(speed) as avg_speed_knots
FROM ldl.vessel_info_latest
WHERE synmax_status = 'ACTIVE'
GROUP BY vessel_type
ORDER BY vessel_count DESC;
-- Vessels heading to specific terminal (next 30 days)
SELECT
name,
imo,
maximum_capacity,
forecast_destination_terminal,
forecast_destination_timestamp,
speed,
draught
FROM ldl.vessel_info_latest
WHERE forecast_destination_terminal LIKE '%Sabine%'
AND forecast_destination_timestamp BETWEEN NOW() AND NOW() + INTERVAL '30 days'
ORDER BY forecast_destination_timestamp;
Field Reference#
Field Name |
Data Type |
Description |
|---|---|---|
vessel_id |
varchar |
Unique internal vessel identifier. Example: |
imo |
integer |
IMO number (7-digit unique vessel ID). Example: |
mmsi |
integer |
Maritime Mobile Service Identity. Example: |
name |
varchar |
Vessel name. Example: |
callsign |
varchar |
Radio callsign. Example: |
flag |
varchar |
Flag state. Example: |
vessel_type |
text |
Type of LNG vessel (LNG_CARRIER, FSRU, BUNKERING_TANKER, etc.). Example: |
commodity |
text |
Commodity type, always |
synmax_status |
text |
Vessel operational status. Example: |
maximum_capacity |
double precision |
Maximum cargo capacity in cubic meters. Example: |
maximum_capacity_unit |
text |
Capacity unit, always |
length |
integer |
Vessel length in meters. Example: |
width |
integer |
Vessel width in meters. Example: |
draught |
double precision |
Current vessel draft in meters (load indicator: <8m unladen, >10m loaded). Example: |
speed |
double precision |
Current speed in knots. Example: |
heading |
double precision |
Current heading in degrees. Example: |
course |
double precision |
Current course over ground. Example: |
latitude |
double precision |
Current latitude. Example: |
longitude |
double precision |
Current longitude. Example: |
location |
geometry |
GeoJSON Point geometry with EPSG:4326 CRS |
timestamp_dynamic |
timestamp with time zone |
Timestamp of latest position. Example: |
timestamp_static |
timestamp with time zone |
Static information timestamp |
captains_destination |
varchar |
Captain-reported destination. Example: |
captains_eta |
timestamp with time zone |
Captain-reported ETA. Example: |
forecast_destination_terminal_id |
text |
Unique terminal identifier for forecasted destination. Example: |
forecast_destination_terminal |
text |
SynMax forecasted destination terminal (85% populated). Example: |
forecast_destination_timestamp |
timestamp |
Predicted arrival time. Example: |
is_loaded |
boolean |
Load status indicator (unpopulated - use draught instead) |
current_charterer |
text |
Charter company (0% populated) |
ais_nav_status |
varchar |
Navigation status |
naval_flag_country_code |
char(2) |
Naval flag country code |
first_active_date |
timestamp with time zone |
First active date |
first_operation_date |
timestamp with time zone |
First operation date |
ais_created_at |
timestamp with time zone |
AIS record creation timestamp |
ais_modified_at |
timestamp with time zone |
AIS record modification timestamp |
vessel_info_created_at |
timestamp with time zone |
Vessel info creation timestamp |
vessel_info_modified_at |
timestamp with time zone |
Vessel info modification timestamp |
last_loading_terminal |
varchar |
Last terminal where vessel loaded cargo |
last_loading_country |
varchar |
Country of last loading |
last_loading_timestamp |
timestamp |
Timestamp of last loading |
ais_nav_status |
varchar |
AIS navigation status |
build_year |
int |
Year vessel was built |
modified_at |
timestamp |
Record modification timestamp |
created_at |
timestamp |
Record creation timestamp |