Obside is a French fintech (founded 2018, Laval) that lets retail traders write trading strategies in plain English and automatically fires the order when matching news is detected. As of January 2026 the company has raised €515k seed plus a €500k follow-on round at a €2.5M valuation.

How much LLM cost did Obside save with TRACER?

Obside reduced its intent-vs-news matching cost by 95% versus the GPT-5 teacher baseline. The cost per match call dropped from $0.00019 to $0.0000094 (about 20× cheaper), while routed accuracy held at 99.9% at 95% machine-learning coverage.

How does the TRACER surrogate handle ambiguous or novel events?

Three tiers. The trained surrogate handles 32 of the 38 partition cells, covering in-distribution traffic. A cheap-LLM menu (GPT-4o-mini plus two siblings) handles ambiguous cells where the surrogate's confidence is uneven. The out-of-distribution zone defers to the GPT-5 teacher. The routing decision is auditable per cell.

How many labelled examples did Obside need to train the surrogate?

26,591 labelled intent-news pairs, all generated by the GPT-5 teacher Obside was already running in production. No manual labelling. The teacher's existing classification traces became the training set.

Match · Finance · 2026 obside.com

One LLM call, repeated for every new headline. Replaced.

Q: What workflow shape did TRACER optimize for Obside?

Intent-vs-news matching: every incoming financial news headline must be checked against every active user intent. The output is a single bit per pair, match or no_match. The volume is the cross product of users, intents, and news velocity, which makes per-call frontier-LLM pricing unsustainable.

Obside is a French fintech that lets retail traders write strategies in plain English and fires the order when matching news arrives. Every headline × every active intent was a frontier-LLM call. TRACER moved 95% of it off GPT-5, without losing accuracy.

Saved vs teacher 95% $0.0000094 / call
vs $0.00019 on GPT-5

Routed accuracy 99.9% at 95% ML coverage
parity-gated, never silent

Partition 38 cells, from 26,591
labelled pairs

Obside TRACER dashboard, density heatmap of 38 partition cells. Blue regions are in-distribution traffic handled by the ML surrogate, orange regions defer to a cheap LLM menu, the red region defers to the GPT-5 teacher. Visible cell labels include 'a stock within the Dow Jones', 'announcement from Donald Trump', 'Apple announces a new phone', 'OPEC announces a production change', 'Tencent or another big player', 'Renault announces a dividend', 'any kind of global virus', 'earthquake or big wildfire', 'Retail Sales in Estonia', 'Bardella is through the second round'. — Production TRACER dashboard · Obside · v 20260511T053435Z · 38 cells

The workflow shape

A cross product of user intents and news headlines

Each Obside user defines one or more strategies as natural-language intents. As news comes in, every headline must be checked against every active intent. The output is a single bit per pair: match or no_match.

The volume grows with both user count and news velocity, fast enough that paying a frontier model per check stops making sense within months of launch. This is the canonical TRACER shape: one repeated decision with structured output, at scale, where the LLM is overqualified for the average case but indispensable on the long tail.

The approach

Teacher → 38-cell surrogate, with a tiered fallback

26,591 intent-news pairs labelled by the GPT-5 teacher Obside was already running. A surrogate trained against the teacher's labels on openai-3-small embeddings, partitioned into 38 cells via density clustering, gated by a second-stage acceptor. The result is a tiered routing menu:

84% 32 cells → trained surrogate In-distribution traffic. Sub-10ms inference, near-zero cost. Auditable per cell.
11% Ambiguous cells → cheap-LLM menu GPT-4o-mini plus two siblings, for regions where the surrogate's confidence is uneven.
5% OOD zone → GPT-5 teacher Novel events the partition hasn't seen. Full frontier capacity, deferred only when needed.

Nothing was rebuilt in Obside's stack. The system sits behind an OpenAI-compatible endpoint (model="tracer-auto"), production version 20260511T053435Z took over routing without an outage. Three versions have shipped to prod since.

The cells, named by users

38 cells emerge from the data, not from a taxonomy

The partition comes from the embedding space, each cell is a cluster of intents Obside users have actually written. The granularity is the user's, not ours. A few labels lifted directly from production:

a stock within the Dow Jones announcement from Donald Trump Apple announces a new phone OPEC announces a production change Tencent or another big player Renault announces a dividend any kind of global virus earthquake or big wildfire Retail Sales in Estonia Bardella is through the second round

Sample pairs

What the classifier sees

Sampled from the production training corpus. The surrogate matches at the semantic level, not on string overlap.

match

Intent A stock within the Dow Jones announces a buyback of more than 5% of shares outstanding.

News $MMM told the market it can now repurchase nearly 7.5% of shares outstanding under a newly approved plan.

match

Intent An announcement from the US president that might be considered risky for the stock market.

News POTUS told advisers he wants to revive a tariff campaign against foreign steel and would not rule out broader duties if negotiations stall.

match

Intent Coinbase delists a major stablecoin.

News The digital-asset exchange said trading in a major stablecoin on Coinbase will be stopped, with the token to be removed entirely after a brief wind-down.

The technical insight

Why frontier reasoning is overqualified here

Intent-news matching is a semantic similarity decision on a small, self-similar distribution of user-written queries. With a teacher's labels on tens of thousands of pairs, an embedding-space surrogate closes the gap to within 0.1 accuracy points, and the residual uncertainty becomes a deferral signal rather than a quality cost.

Paying for frontier reasoning on every headline is paying for a capacity that only the OOD tail actually needs.

"We moved the LLM out of the loop on 95% of matches and kept it for the calls that genuinely needed it. The savings are auditable per cell, not aggregated away."

Does your stack look like this?

If you can write the sentence "we use a frontier LLM to [classify / route / tag / match / triage / detect / score / choose] something that happens thousands of times per day," TRACER fits. Start with the open-source SDK, graduate to hosted when you're ready.

Try the hosted version → View the OSS repo All case studies

Adam Rida Founder · TRACER · github.com/adrida