Match · Finance · 2026 obside.com

One LLM call, repeated for every new headline. Replaced.

Obside is a French fintech that lets retail traders write strategies in plain English and fires the order when matching news arrives. Every headline × every active intent was a frontier-LLM call. TRACER moved 95% of it off GPT-5, without losing accuracy.

Saved vs teacher 95% $0.0000094 / call
vs $0.00019 on GPT-5
Routed accuracy 99.9% at 95% ML coverage
parity-gated, never silent
Partition 38 cells, from 26,591
labelled pairs
Obside TRACER dashboard, density heatmap of 38 partition cells. Blue regions are in-distribution traffic handled by the ML surrogate, orange regions defer to a cheap LLM menu, the red region defers to the GPT-5 teacher. Visible cell labels include 'a stock within the Dow Jones', 'announcement from Donald Trump', 'Apple announces a new phone', 'OPEC announces a production change', 'Tencent or another big player', 'Renault announces a dividend', 'any kind of global virus', 'earthquake or big wildfire', 'Retail Sales in Estonia', 'Bardella is through the second round'.
Production TRACER dashboard · Obside · v 20260511T053435Z · 38 cells
The workflow shape

A cross product of user intents and news headlines

Each Obside user defines one or more strategies as natural-language intents. As news comes in, every headline must be checked against every active intent. The output is a single bit per pair: match or no_match.

The volume grows with both user count and news velocity, fast enough that paying a frontier model per check stops making sense within months of launch. This is the canonical TRACER shape: one repeated decision with structured output, at scale, where the LLM is overqualified for the average case but indispensable on the long tail.

The approach

Teacher → 38-cell surrogate, with a tiered fallback

26,591 intent-news pairs labelled by the GPT-5 teacher Obside was already running. A surrogate trained against the teacher's labels on openai-3-small embeddings, partitioned into 38 cells via density clustering, gated by a second-stage acceptor. The result is a tiered routing menu:

  • 84% 32 cells → trained surrogate In-distribution traffic. Sub-10ms inference, near-zero cost. Auditable per cell.
  • 11% Ambiguous cells → cheap-LLM menu GPT-4o-mini plus two siblings, for regions where the surrogate's confidence is uneven.
  • 5% OOD zone → GPT-5 teacher Novel events the partition hasn't seen. Full frontier capacity, deferred only when needed.

Nothing was rebuilt in Obside's stack. The system sits behind an OpenAI-compatible endpoint (model="tracer-auto"), production version 20260511T053435Z took over routing without an outage. Three versions have shipped to prod since.

The cells, named by users

38 cells emerge from the data, not from a taxonomy

The partition comes from the embedding space, each cell is a cluster of intents Obside users have actually written. The granularity is the user's, not ours. A few labels lifted directly from production:

a stock within the Dow Jones announcement from Donald Trump Apple announces a new phone OPEC announces a production change Tencent or another big player Renault announces a dividend any kind of global virus earthquake or big wildfire Retail Sales in Estonia Bardella is through the second round
Sample pairs

What the classifier sees

Sampled from the production training corpus. The surrogate matches at the semantic level, not on string overlap.

match
Intent A stock within the Dow Jones announces a buyback of more than 5% of shares outstanding.
News $MMM told the market it can now repurchase nearly 7.5% of shares outstanding under a newly approved plan.
match
Intent An announcement from the US president that might be considered risky for the stock market.
News POTUS told advisers he wants to revive a tariff campaign against foreign steel and would not rule out broader duties if negotiations stall.
match
Intent Coinbase delists a major stablecoin.
News The digital-asset exchange said trading in a major stablecoin on Coinbase will be stopped, with the token to be removed entirely after a brief wind-down.
The technical insight

Why frontier reasoning is overqualified here

Intent-news matching is a semantic similarity decision on a small, self-similar distribution of user-written queries. With a teacher's labels on tens of thousands of pairs, an embedding-space surrogate closes the gap to within 0.1 accuracy points, and the residual uncertainty becomes a deferral signal rather than a quality cost.

Paying for frontier reasoning on every headline is paying for a capacity that only the OOD tail actually needs.

"We moved the LLM out of the loop on 95% of matches and kept it for the calls that genuinely needed it. The savings are auditable per cell, not aggregated away."

Does your stack look like this?

If you can write the sentence "we use a frontier LLM to [classify / route / tag / match / triage / detect / score / choose] something that happens thousands of times per day," TRACER fits. Start with the open-source SDK, graduate to hosted when you're ready.