QUANT_API
COMPARE · CATEGORY GUIDE

Resolved features vs raw data feeds

There are two legitimate ways to source crypto microstructure data: subscribe to raw websocket feeds (L2 books, trades, funding) and build everything yourself, or buy features already resolved point-in-time at the API seam. Neither is universally better. This page lays out the real trade-off, including the rows where the raw feed wins.

01 · THE HIDDEN WORK

What sits between a raw feed and a usable feature

A raw feed gives you everything the venue emits and nothing else: no normalization, no history, no time discipline. To turn it into something a model can train on, four layers of work have to exist somewhere — on your payroll or inside your vendor.

01 · CAPTURE, 24/7, EVERY VENUE

A raw feed is a firehose that assumes someone is always holding it. Websockets drop, venues rate-limit and rotate symbols, snapshots desync from deltas. Capture means reconnect logic, gap accounting and deduplication running around the clock — in our case across Binance, OKX, Bybit and Hyperliquid.
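
As a rough illustration of what "gap accounting and deduplication" involves, here is a minimal sketch. It assumes each stream carries a strictly increasing integer sequence number, which is a simplification: real venues differ in how they number updates, and the `SequenceTracker` name is hypothetical, not part of any API described on this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SequenceTracker:
    """Per-stream tracker: drops repeats and records missing sequence ranges."""

    last_seq: Optional[int] = None
    gaps: List[Tuple[int, int]] = field(default_factory=list)

    def accept(self, seq: int) -> bool:
        """Return True if the message is new; record any range that was skipped."""
        if self.last_seq is not None and seq <= self.last_seq:
            # Duplicate, replay after a reconnect, or a late out-of-order message.
            return False
        if self.last_seq is not None and seq > self.last_seq + 1:
            # Messages last_seq+1 .. seq-1 never arrived. Record the hole so the
            # book can be re-snapshotted and the gap stays visible downstream.
            self.gaps.append((self.last_seq + 1, seq - 1))
        self.last_seq = seq
        return True


tracker = SequenceTracker()
accepted = [seq for seq in [1, 2, 2, 3, 6, 5, 7] if tracker.accept(seq)]
print(accepted)      # [1, 2, 3, 6, 7]
print(tracker.gaps)  # [(4, 5)]
```

Note the design choice: a message that arrives late (5 after 6) is discarded rather than spliced back in, so the recorded gap stays on the books. A production capture layer would more likely trigger a fresh snapshot at that point.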

02 · NORMALIZATION

Every venue speaks its own dialect: different symbols, units, depth conventions, funding intervals, timestamp semantics. Before any feature exists, all of it has to be reconciled into one canonical, consistently-timestamped series per asset and signal.
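
To make the reconciliation concrete, the sketch below maps two invented venue payloads into one canonical trade record. Everything venue-specific here is an assumption for illustration: the venue names, field names, timestamp units and the contract size of 0.01 are made up and do not describe any real exchange's message format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Trade:
    """Canonical trade: one asset name, UTC timestamp, size in base units."""

    venue: str
    asset: str
    ts: datetime
    price: float
    size_base: float


# Hypothetical venue symbols mapped to one canonical asset name.
SYMBOL_MAP = {
    ("venue_a", "BTCUSDT"): "BTC",
    ("venue_b", "BTC-USDT-SWAP"): "BTC",
}


def from_venue_a(msg: dict) -> Trade:
    # Assumed dialect: millisecond timestamps, size already in base units.
    return Trade(
        venue="venue_a",
        asset=SYMBOL_MAP[("venue_a", msg["s"])],
        ts=datetime.fromtimestamp(msg["t_ms"] / 1_000, tz=timezone.utc),
        price=float(msg["p"]),
        size_base=float(msg["q"]),
    )


def from_venue_b(msg: dict, contract_size: float = 0.01) -> Trade:
    # Assumed dialect: microsecond timestamps, size quoted in contracts.
    return Trade(
        venue="venue_b",
        asset=SYMBOL_MAP[("venue_b", msg["instrument"])],
        ts=datetime.fromtimestamp(msg["ts_us"] / 1_000_000, tz=timezone.utc),
        price=float(msg["price"]),
        size_base=float(msg["contracts"]) * contract_size,
    )


a = from_venue_a({"s": "BTCUSDT", "t_ms": 1700000000000, "p": "35000.5", "q": "0.2"})
b = from_venue_b({"instrument": "BTC-USDT-SWAP", "ts_us": 1700000000000000,
                  "price": "35001.0", "contracts": "20"})
print(a.asset == b.asset, a.ts == b.ts)
```

After normalization the two records are directly comparable: same asset name, same timestamp type, same size unit.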

03 · POINT-IN-TIME RESOLUTION

The step most pipelines get wrong. Every value must be computed strictly from data stamped at or before the requested as_of — our serving store caps every read at ts ≤ as_of, so a later row physically cannot leak into a backtest. The full discipline is documented on /methodology.
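
The read cap can be sketched in a few lines. This is an illustrative in-memory version, not the serving store itself: the `PointInTimeSeries` class and its methods are hypothetical names, and timestamps are plain numbers for brevity.

```python
from bisect import bisect_right
from typing import Iterable, List, Optional, Tuple


class PointInTimeSeries:
    """Sorted (ts, value) rows; every read is capped at ts <= as_of."""

    def __init__(self, rows: Iterable[Tuple[float, float]]) -> None:
        ordered = sorted(rows, key=lambda row: row[0])
        self._ts: List[float] = [ts for ts, _ in ordered]
        self._values: List[float] = [value for _, value in ordered]

    def window(self, as_of: float, lookback: float) -> List[float]:
        """Values with as_of - lookback < ts <= as_of, and nothing later."""
        hi = bisect_right(self._ts, as_of)             # first index with ts > as_of
        lo = bisect_right(self._ts, as_of - lookback)  # first index inside the window
        return self._values[lo:hi]

    def mean(self, as_of: float, lookback: float) -> Optional[float]:
        values = self.window(as_of, lookback)
        return sum(values) / len(values) if values else None


series = PointInTimeSeries([(10, 1.0), (20, 2.0), (30, 3.0), (40, 10.0)])
print(series.window(as_of=30, lookback=20))  # [2.0, 3.0]
print(series.mean(as_of=30, lookback=20))    # 2.5
```

The row stamped 40 exists in the store but can never appear in a read with `as_of=30`, because the slice ends at the last index where `ts <= as_of`. That is the property a backtest depends on.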

04 · ONE CODE PATH, LIVE AND HISTORICAL

The subtle failure mode of in-house pipelines: a live path and a backfill path that quietly drift apart. Here a live call is literally a historical call with as_of = now — same resolver, same transforms — so what the model trained on is what it gets in production.
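
One way to picture "live is historical with as_of = now" is a resolver where the live path is nothing more than a default argument. The sketch below is an assumption-laden toy: `make_resolver`, the in-memory store and the `btc.funding` key are hypothetical, and the clock is injected so the demo is deterministic.

```python
import time
from bisect import bisect_right
from typing import Callable, Dict, List, Optional, Tuple

# key -> [(ts, value)], each list sorted by ts
Series = Dict[str, List[Tuple[float, float]]]


def make_resolver(store: Series, clock: Callable[[], float] = time.time):
    """Build one resolve() function that serves both live and historical reads."""

    def resolve(key: str, as_of: Optional[float] = None) -> Optional[float]:
        # The only difference between live and historical: where as_of comes from.
        ts = clock() if as_of is None else as_of
        rows = store.get(key, [])
        idx = bisect_right([row_ts for row_ts, _ in rows], ts)
        # Latest value stamped at or before ts, or None if nothing qualifies.
        return rows[idx - 1][1] if idx else None

    return resolve


store = {"btc.funding": [(100.0, 0.01), (200.0, 0.02), (300.0, 0.03)]}
resolve = make_resolver(store, clock=lambda: 250.0)  # frozen clock for the demo

live = resolve("btc.funding")                   # as_of defaults to "now" (250.0)
replayed = resolve("btc.funding", as_of=250.0)  # explicit historical read
print(live == replayed)
```

Because there is no second code path, there is nothing that can drift: any transform change applies to both reads at once.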

02 · THE HONEST TABLE

What you own vs what we resolve

A comparison table where the vendor wins every row is marketing, not analysis. Raw feeds genuinely win on control, latency and granularity — if those rows decide your use case, you should run raw feeds.

| Dimension | Verdict | Raw feed: you own the pipeline | Resolved features: we own it |
|---|---|---|---|
| Schema & flexibility | RAW WINS | Entirely yours. Any shape, any encoding, any feature you can imagine. | A fixed catalog of signals, windows (1s to 24h) and transforms. Expressive, but defined by us. |
| Latency floor | RAW WINS | Your colocation, your network stack: as low as you are willing to pay for. | An HTTPS API round-trip. Built for bar-level models, the wrong tool for HFT execution. |
| Granularity | RAW WINS | Every tick, every L2 delta, exactly as the venue emitted it. | Resolved values at defined windows, not raw ticks. Aggregation choices are ours. |
| Vendor dependency | RAW WINS | None beyond the venues themselves. | You depend on our uptime and roadmap. Exports you make are yours to keep, under license. |
| Engineering cost | RESOLVED WINS | Capture, storage, normalization, monitoring and the point-in-time discipline are your headcount, indefinitely. | Included in the subscription. Your team spends its time on models, not plumbing. |
| Look-ahead safety | RESOLVED WINS | Yours to design, enforce and prove: the hardest part to get right and the easiest to get silently wrong. | Enforced by construction (ts ≤ as_of on every read), with a public protocol to falsify it. |
| Live / backtest parity | RESOLVED WINS | Two pipelines to keep byte-identical, forever. | One resolver answers both; live is historical with as_of = now. |
| Historical depth | HONEST TIE | Exactly what you have recorded, or what you can buy and trust. | Our live archive is young and we say so; every analytics response declares its real covered window. |

03 · WHEN TO BUY WHICH

Buy raw when…

  • You have a dedicated data-engineering team — and keeping it is part of your edge.
  • Your signals need tick-level or custom L2 constructions no catalog will ever ship.
  • Execution latency matters more than research throughput.
  • Full schema control and zero vendor dependency are hard requirements.

Buy resolved when…

  • You want leak-free inputs ready to model on, not a pipeline project.
  • Your horizons are bar-level — seconds to daily — and the signal catalog covers what you trade.
  • Live/backtest parity matters more to you than schema control.
  • Your team’s time is better spent on models than on reconnect logic.

In practice many desks do both: raw feeds where execution lives, resolved features where research lives. If you are weighing the leak-free question specifically, start with the leak-free backtesting guide — it walks through the failure modes the point-in-time discipline exists to prevent.

04 · DON’T TRUST — VERIFY

“Resolved point-in-time” is a checkable claim, not a slogan. Record a live response at time T, replay the same keys historically with as_of = T, and compare — if they ever differ, our claim is broken and we treat it as a critical bug. The full protocol is on /methodology, and the sample needs no account at all.
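
The comparison step of that protocol can be sketched as a small diff function. How the two responses are fetched is left out on purpose, since the request shape is not specified on this page; the field names in the demo (`ofi_1m`, `spread_bps`) are hypothetical placeholders.

```python
from typing import Any, Dict, Tuple

_MISSING = object()  # sentinel so a key absent on one side counts as a mismatch


def diff_responses(
    live: Dict[str, Any], replayed: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (live_value, replayed_value)} for every key that differs."""
    mismatches: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(live.keys() | replayed.keys()):
        a = live.get(key, _MISSING)
        b = replayed.get(key, _MISSING)
        if a != b:
            mismatches[key] = (
                None if a is _MISSING else a,
                None if b is _MISSING else b,
            )
    return mismatches


# Response recorded live at time T, then the same keys replayed with as_of = T.
recorded = {"as_of": 1700000000, "ofi_1m": 0.42, "spread_bps": 1.3}
replay_ok = {"as_of": 1700000000, "ofi_1m": 0.42, "spread_bps": 1.3}
replay_bad = {"as_of": 1700000000, "ofi_1m": 0.45, "spread_bps": 1.3}

print(diff_responses(recorded, replay_ok))   # {}
print(diff_responses(recorded, replay_bad))  # {'ofi_1m': (0.42, 0.45)}
```

The comparison is exact rather than tolerance-based, because the claim being tested is that a replay reproduces the live value, not that it lands close to it.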

Download the sample CSV ↓
GET /v1/sample.csv · public · no key required

IF RESOLVED IS YOUR SIDE OF THE TABLE

Every new account starts with a 14-day trial of the Signal plan — no card required. Browse the signal catalog, check pricing, and run the verification protocol before you pay anything.

Start the free trial · Read the methodology