🚧 SportsPerp is currently live on devnet. Mainnet target: before Jun 12, 2026 (World Cup kickoff).

Data Pipeline

SportsPerp’s index is only as good as its inputs. This page documents the full path from a football match being played in a Premier League stadium to an updated oracle price on Solana.

Data sources

Two independent feeds from our institutional data partner power the engine:

| Source | Mode | Protocol | Latency | What it provides |
| --- | --- | --- | --- | --- |
| Post-match REST feed | Post-match, batch | REST | ~30 minutes post-match | Official OBV, season aggregates, match outcomes |
| Live event feed | In-match, streaming | GraphQL subscriptions | 5–15 seconds per event | Raw event stream (passes, shots, defensive actions, etc.) |

The REST feed is the canonical source — its numbers are what the partner publishes and what professional clubs consume. The live feed delivers event-level data during matches so SportsPerp can price markets in-play rather than freezing between fixtures.

The data partner’s identity is not disclosed publicly for competitive reasons.

Post-match pipeline (the batch path)

The batch path is the simpler of the two pipelines and is the authoritative source for index values between matches.

Post-match REST feed
      │
      ▼
 [ crank.ts ]  ◄─── every 5 min (fetchIntervalMs = 300000)
      │
      ├─► season team stats       (team_season_obv_pg, …)
      ├─► season player stats     (player_season_obv_90, …)
      └─► recent match results    (W/D/L, dates)
      │
      ▼
 [ form-calculator.ts ]  ◄── exponential-decay form, PPG
      │
      ▼
 [ index-calculator.ts ] ◄── z-score per population, compose, scale
      │
      ├─► candle-store.ts  (SQLite 1m OHLC; aggregated 1H/4H/1D on read)
      ├─► ws-server.ts     (broadcasts ticks to connected clients)
      └─► oracle-pusher    (update_oracle on-chain)

Each crank cycle:

  1. Fetch season stats for all 20 teams and eligible players via REST.
  2. Compute form scores (exponential decay over last 6 matches) and PPG (average over last 10).
  3. Calculate z-scored composite indices for teams (league-wide) and players (within-position).
  4. Persist ticks to SQLite candles and broadcast on WebSocket.
  5. Push to the on-chain oracle only if the change exceeds the threshold. Thresholds are tunable; the off-chain pipeline has separate gates at the crank layer (per-cycle) and the pusher layer (final on-chain gate) to keep transaction costs bounded.
  6. Heartbeat any market that hasn’t seen an on-chain push within the pusher’s heartbeat interval, so the on-chain staleness protection never triggers under normal operation.
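Step 2 of the cycle can be pictured with a short sketch. The window sizes (6 and 10) come from the list above; everything else is an assumption made for illustration: the decay factor of 0.8, the normalisation to a weighted points average, the most-recent-first ordering, and the function names formScore and pointsPerGame. The 3/1/0 mapping is the standard league points system.

```typescript
type Result = "W" | "D" | "L";

// Standard league points per result.
const POINTS: Record<Result, number> = { W: 3, D: 1, L: 0 };

/**
 * Exponentially decayed form over the most recent `window` matches.
 * `results` must be ordered most recent first.
 * `decay` (0.8 here) is a placeholder; the engine's real value is not documented.
 */
function formScore(results: Result[], window = 6, decay = 0.8): number {
  const recent = results.slice(0, window);
  if (recent.length === 0) return 0;
  let weightedPoints = 0;
  let totalWeight = 0;
  recent.forEach((result, age) => {
    const weight = Math.pow(decay, age); // newest match has weight 1
    weightedPoints += weight * POINTS[result];
    totalWeight += weight;
  });
  return weightedPoints / totalWeight; // weighted points per game, in [0, 3]
}

/** Plain (unweighted) points per game over the most recent `window` matches. */
function pointsPerGame(results: Result[], window = 10): number {
  const recent = results.slice(0, window);
  if (recent.length === 0) return 0;
  const total = recent.reduce((sum, result) => sum + POINTS[result], 0);
  return total / recent.length;
}
```

Because newer matches carry more weight, a team whose latest result is a win scores higher on form than on PPG over the same matches, which is the behaviour an exponential-decay form signal is meant to capture.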

The crank is self-healing: if a REST call fails, it logs the error and tries again on the next cycle. The on-chain oracle never receives a partial or inconsistent update.
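Steps 5 and 6 of the cycle describe a change gate plus a heartbeat. The sketch below models a single gate rather than the separate crank-layer and pusher-layer gates, and it assumes the threshold is expressed as a relative change. The names decidePush, PushState, and PushPolicy are hypothetical, as are any numeric values passed in.

```typescript
interface PushState {
  lastPushedValue: number;
  lastPushedAtMs: number;
}

interface PushPolicy {
  minRelativeChange: number; // e.g. 0.01 = 1% (placeholder value)
  heartbeatMs: number;       // longest allowed gap between pushes (placeholder value)
}

type PushDecision = "push-change" | "push-heartbeat" | "skip";

/** Decide whether a freshly computed index value should go on-chain. */
function decidePush(
  state: PushState | undefined,
  value: number,
  nowMs: number,
  policy: PushPolicy
): PushDecision {
  // Nothing has been pushed for this market yet.
  if (state === undefined) return "push-change";

  const base = Math.abs(state.lastPushedValue);
  const change = Math.abs(value - state.lastPushedValue);
  // Guard the division when the last pushed value was exactly zero.
  const relative = base === 0 ? (change === 0 ? 0 : Infinity) : change / base;

  if (relative > policy.minRelativeChange) return "push-change";
  // Quiet market: push anyway once the heartbeat interval has elapsed.
  if (nowMs - state.lastPushedAtMs >= policy.heartbeatMs) return "push-heartbeat";
  return "skip";
}
```

Under this shape, tightening the threshold during live play would amount to passing a policy with a smaller minRelativeChange while a match is in progress.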

Live pipeline (the streaming path)

During live matches, the engine runs a parallel streaming path alongside the batch path, layering in-match event data on top of the batch baseline.

Live event feed (GraphQL subscriptions)
      │
      ▼
 [ live-processor.ts ]  ◄── subscribes to match event stream
      │
      ▼
 [ id-bridge.ts ]  ◄── translates live IDs to canonical IDs (fail-closed)
      │
      ▼
 [ obv-engine (Python sidecar) ]
      │      PV-GF and PV-GA XGBoost models
      │      annotates each event with OBV delta
      │      consumed via POST /api/live-obv/matches/{id}/{start,events,end}
      │
      ▼
 [ obv-store.ts ]
      │      per-match authoritative state
      │      per-(category × {net, gf, ga}) cross-tab
      │
      ▼
 [ live-index-overlay.ts ]
      │      pinned z-score population from last batch cycle
      │      only playing teams/players get overlay updates
      │
      ▼
 [ fallback-chain.ts ]
      │      authoritative / aggregated / heuristic
      │
      ▼
 same downstream (candles, WS, oracle push)

Live↔Canonical ID bridge

The partner’s REST and live feeds use independent ID spaces. For example, Arsenal might be REST id 1 but live id 21; a player might be REST 39461 but live 106232. roster.json is keyed by canonical (REST) id, so every live event must be translated to its canonical entity before it can be attributed to a market.

The ID bridge is a fail-closed translator: any unmapped live id drops the event and increments a counter rather than mis-attributing it. Operator overrides live in a companion config file. Behaviour is gated by USE_LIVE_REST_BRIDGE_V1, USE_LIVE_REST_BRIDGE_V2, SHADOW_DROP_DIAG, and ID_BRIDGE_OVERRIDE_STRICT environment variables. V2 is the current production setting and is verified periodically by a systemd timer on the production host.

Note: live-side IDs are not guaranteed stable across matches for the same entity — the bridge resolves the mapping against each match’s lineup, not against a fixed cross-match table.
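A fail-closed translator of this kind might look like the sketch below. It is an illustration, not the contents of id-bridge.ts: the IdBridge class, the event shapes, and the assumption that every event carries both a team and a player id are hypothetical. The mapping tables would be built per match from that match's lineup, as the note above describes.

```typescript
interface LiveEvent {
  liveTeamId: number;
  livePlayerId: number;
  type: string;
}

interface CanonicalEvent {
  teamId: number;   // canonical (REST) team id
  playerId: number; // canonical (REST) player id
  type: string;
}

class IdBridge {
  /** Number of events dropped because an id could not be mapped. */
  droppedEvents = 0;
  private readonly teams: Map<number, number>;
  private readonly players: Map<number, number>;

  // Both maps go from live id to canonical id and are scoped to one match.
  constructor(teams: Map<number, number>, players: Map<number, number>) {
    this.teams = teams;
    this.players = players;
  }

  /** Returns the canonical event, or null (and counts the drop) if any id is unmapped. */
  translate(event: LiveEvent): CanonicalEvent | null {
    const teamId = this.teams.get(event.liveTeamId);
    const playerId = this.players.get(event.livePlayerId);
    if (teamId === undefined || playerId === undefined) {
      this.droppedEvents += 1; // fail closed: never guess an attribution
      return null;
    }
    return { teamId, playerId, type: event.type };
  }
}
```

Using the example ids above, a bridge built with team mapping 21 → 1 and player mapping 106232 → 39461 translates a matching event to canonical ids, while an event with any unknown live id yields null and increments droppedEvents.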

Python OBV sidecar HTTP contract

The TypeScript bridge talks to the Python obv-engine over four HTTP endpoints, base URL OBV_ENGINE_BASE_URL (default http://127.0.0.1:8100):

| Endpoint | Purpose |
| --- | --- |
| POST /api/live-obv/matches/{matchId}/start | Register a kicked-off match and lock in the rosters |
| POST /api/live-obv/matches/{matchId}/events | Push normalized live events for scoring |
| POST /api/live-obv/matches/{matchId}/end | Mark match complete; close out per-match state |
| GET /api/live-obv/matches/{matchId}/snapshot | Read authoritative team & player totals |
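The four routes share one URL shape, so a client could derive the method and URL from the action name. The helper below is a hypothetical sketch (sidecarRequest and SidecarAction are not names from the codebase); the paths, methods, and default base URL are taken from the table and text above. Request and response bodies are not documented here and are left out.

```typescript
type SidecarAction = "start" | "events" | "end" | "snapshot";
type HttpMethod = "GET" | "POST";

// Method per action, as listed in the endpoint table.
const HTTP_METHOD: Record<SidecarAction, HttpMethod> = {
  start: "POST",
  events: "POST",
  end: "POST",
  snapshot: "GET",
};

// Default documented for OBV_ENGINE_BASE_URL.
const DEFAULT_BASE_URL = "http://127.0.0.1:8100";

/** Build the method and URL for one sidecar call. */
function sidecarRequest(
  matchId: string | number,
  action: SidecarAction,
  baseUrl: string = DEFAULT_BASE_URL
): { method: HttpMethod; url: string } {
  const base = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  const id = encodeURIComponent(String(matchId));
  return {
    method: HTTP_METHOD[action],
    url: `${base}/api/live-obv/matches/${id}/${action}`,
  };
}
```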

The sidecar bridge is enabled by setting ENABLE_REALTIME_OBV=true. When it is disabled, or when the sidecar is unreachable, the live processor falls back to a heuristic impact-estimation path, so live tracking degrades rather than fails.
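The fallback-chain.ts stage in the diagram above names three tiers: authoritative, aggregated, and heuristic. One way such a chain could be structured is to try each source in order and label the result with the tier that produced it. This is a sketch under assumptions: resolveWithFallback, TierSource, and the idea that each tier is a function returning a number or null are all hypothetical.

```typescript
type Tier = "authoritative" | "aggregated" | "heuristic";

interface TierSource {
  tier: Tier;
  // Returns an OBV estimate, or null when this tier cannot produce one.
  estimate: () => number | null;
}

interface LabeledValue {
  tier: Tier;
  value: number;
}

/** Try sources in the given order; the first finite value wins and keeps its tier label. */
function resolveWithFallback(sources: TierSource[]): LabeledValue | null {
  for (const source of sources) {
    let value: number | null = null;
    try {
      value = source.estimate();
    } catch {
      value = null; // e.g. sidecar unreachable: fall through to the next tier
    }
    if (value !== null && isFinite(value)) {
      return { tier: source.tier, value };
    }
  }
  return null; // no tier could produce a value
}
```

Carrying the tier label alongside the value is what lets downstream consumers see which quality level they are pricing against.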

Key design decisions:

  • Pinned z-score population. During a live match, the mean and stdev used for z-scoring are frozen at the last completed batch cycle’s values. Only teams and players actually on the pitch move; everyone else’s index is mathematically invariant. This prevents a single live match from re-anchoring the entire league’s pricing.
  • Tiered fallback. If the Python OBV sidecar (obv-engine on the production host, port 8100) is unreachable, the engine degrades gracefully: authoritative per-event OBV → aggregated season rate → heuristic from shot/goal events. Each tier is clearly labeled in the data stream so consumers know the quality of what they’re pricing against.
  • Tighter change thresholds during live matches. The oracle push threshold tightens during live play so in-play price movement reaches the chain promptly.
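The effect of pinning the z-score population can be shown with a small sketch. It assumes a population (not sample) standard deviation and uses hypothetical names (Population, fitPopulation, zScore); the engine's actual statistics are not specified here.

```typescript
interface Population {
  mean: number;
  stdev: number;
}

/** Fit mean and population standard deviation from one batch cycle's values. */
function fitPopulation(values: number[]): Population {
  const n = values.length;
  if (n === 0) throw new Error("cannot fit an empty population");
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const variance = values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / n;
  return { mean, stdev: Math.sqrt(variance) };
}

/** Z-score a value against a (possibly pinned) population. */
function zScore(value: number, population: Population): number {
  // A degenerate population carries no spread information; treat it as neutral.
  return population.stdev === 0 ? 0 : (value - population.mean) / population.stdev;
}
```

Suppose the last batch cycle saw values [0, 0, 4, 4], giving mean 2 and stdev 2. If one playing team's value moves from 4 to 8 during a match, its z-score against the pinned population moves from 1 to 3, while a non-playing team at 0 stays at exactly -1. Re-fitting the population on [0, 0, 4, 8] would instead shift that idle team to roughly -0.90, which is the re-anchoring the pinned approach avoids.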

See the Real-Time vs Post-Match page for how the live estimate reconciles with the official post-match OBV.

The raw fields we consume

The engine’s index calculation reads the following fields per entity. These are the partner’s canonical REST field names, preserved verbatim for auditability.

Team season stats:

  • team_season_obv_pg — aggregate OBV per match (primary signal)
  • team_season_obv_pass_pg, team_season_obv_shot_pg, team_season_obv_defensive_action_pg, team_season_obv_dribble_carry_pg, team_season_obv_gk_pg — per-category breakdowns (surfaced to traders, not currently weighted into the composite)
  • team_season_matches, team_season_gd, team_season_xgd, team_season_goals_pg, team_season_goals_conceded_pg

Player season stats:

  • player_season_obv_90 — OBV per 90 (primary signal)
  • player_season_obv_pass_90, player_season_obv_shot_90, player_season_obv_defensive_action_90, player_season_obv_dribble_carry_90, player_season_obv_gk_90
  • player_season_minutes, primary_position
  • player_season_goals_90, player_season_assists_90, player_season_np_xg_90, player_season_xa_90, player_season_tackles_90, player_season_interceptions_90, player_season_aerial_wins_90 (feed into position-specific form)

The engine preserves the partner’s native field names end-to-end. This means any claim SportsPerp makes about a market’s price can be audit-traced back to a specific set of source fields from a specific data version, without translation.
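Preserving the native names end-to-end could mean the typed records use them verbatim. The sketch below does this for the team fields listed above. The field names come from this page; the numeric types, the TeamSeasonStats and TeamIndexInputs interfaces, and the teamIndexInputs helper are assumptions for illustration.

```typescript
// Field names are the partner's REST names as listed above; number types are assumed.
interface TeamSeasonStats {
  team_season_obv_pg: number;
  team_season_obv_pass_pg: number;
  team_season_obv_shot_pg: number;
  team_season_obv_defensive_action_pg: number;
  team_season_obv_dribble_carry_pg: number;
  team_season_obv_gk_pg: number;
  team_season_matches: number;
  team_season_gd: number;
  team_season_xgd: number;
  team_season_goals_pg: number;
  team_season_goals_conceded_pg: number;
}

// Per-category breakdowns: surfaced to traders, not weighted into the composite.
const TEAM_CATEGORY_FIELDS = [
  "team_season_obv_pass_pg",
  "team_season_obv_shot_pg",
  "team_season_obv_defensive_action_pg",
  "team_season_obv_dribble_carry_pg",
  "team_season_obv_gk_pg",
] as const;

interface TeamIndexInputs {
  primary: number;                    // team_season_obv_pg
  categories: Record<string, number>; // keyed by the source field name
}

/** Split a stats record into the primary signal and the display-only breakdowns. */
function teamIndexInputs(stats: TeamSeasonStats): TeamIndexInputs {
  const categories: Record<string, number> = {};
  for (const field of TEAM_CATEGORY_FIELDS) {
    categories[field] = stats[field];
  }
  return { primary: stats.team_season_obv_pg, categories };
}
```

Because the keys in categories are the source field names themselves, a value shown to a trader can be matched against the partner's payload without a lookup table.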

What traders see downstream

Once the index value is computed, it surfaces in three places:

  1. On-chain oracle. A 10^6-scaled fixed-point price, updated via the update_oracle instruction. This is what positions are marked against.
  2. REST candle API. GET /api/candles/{marketKey}?timeframe=1m|1H|4H|1D returns OHLC bars from the SQLite candle store.
  3. WebSocket feed. wss://…/ws streams real-time ticks, candle updates, and (in live matches) per-event overlay deltas so charts can render live.
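The candle store is described as holding 1m bars and aggregating 1H/4H/1D on read. A minimal sketch of that aggregation follows, assuming buckets aligned to the Unix epoch and a hypothetical Candle shape and aggregateCandles function; the real store's schema is not documented here.

```typescript
interface Candle {
  openTimeMs: number; // start of the bar, Unix epoch milliseconds
  open: number;
  high: number;
  low: number;
  close: number;
}

/** Roll 1-minute candles up into bars of `bucketMs` width (e.g. 3_600_000 for 1H). */
function aggregateCandles(oneMinute: Candle[], bucketMs: number): Candle[] {
  // Sort a copy so the caller's array is left untouched.
  const sorted = oneMinute.slice().sort((a, b) => a.openTimeMs - b.openTimeMs);
  const out: Candle[] = [];
  for (const candle of sorted) {
    const bucketStart = Math.floor(candle.openTimeMs / bucketMs) * bucketMs;
    const last = out[out.length - 1];
    if (last !== undefined && last.openTimeMs === bucketStart) {
      // Same bucket: extend the range and move the close forward.
      last.high = Math.max(last.high, candle.high);
      last.low = Math.min(last.low, candle.low);
      last.close = candle.close;
    } else {
      // New bucket: the first 1m bar sets the open.
      out.push({
        openTimeMs: bucketStart,
        open: candle.open,
        high: candle.high,
        low: candle.low,
        close: candle.close,
      });
    }
  }
  return out;
}
```

Aggregating on read keeps a single source of truth (the 1m bars), at the cost of recomputing the larger timeframes per request.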

All three are kept consistent: the oracle push, the candle write, and the WS broadcast are triggered by the same calculation step within a crank cycle. A trader cannot see a WebSocket tick that disagrees with the oracle, by construction.
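The 10^6 scaling used for the on-chain price can be illustrated with a pair of conversion helpers. This is a sketch only: the rounding mode, the integer width and signedness of the on-chain field, and the names toFixedPoint and fromFixedPoint are assumptions.

```typescript
// Scale factor stated for the on-chain oracle price.
const ORACLE_SCALE = 1000000;

/** Convert a floating-point index value to a 10^6-scaled integer. */
function toFixedPoint(indexValue: number): number {
  if (!isFinite(indexValue)) {
    throw new Error("index value must be finite");
  }
  const scaled = Math.round(indexValue * ORACLE_SCALE); // rounding mode is an assumption
  if (Math.abs(scaled) > Number.MAX_SAFE_INTEGER) {
    throw new Error("scaled value exceeds the safe integer range");
  }
  return scaled;
}

/** Convert a 10^6-scaled integer back to a display value. */
function fromFixedPoint(raw: number): number {
  return raw / ORACLE_SCALE;
}
```

If the WebSocket and candle values were derived from the same scaled integer rather than from the unrounded float, all three surfaces would agree to the last representable digit.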

Credentials and environment

Data-partner credentials are read from environment variables — never hardcoded — and live on the production server in a systemd EnvironmentFile= excluded from the repository via .gitignore.

Further reading