Evaluating Market-Maker KPIs: What Good Looks Like

Chris Newhouse/April 1, 2026

The Part of the Proposal Most Founders Skip

You've reviewed the loan terms, stressed the option tranches at different vol levels, and negotiated the strikes higher. You understand what you're giving away. But there's a second page to every market-making proposal that often gets less attention: the KPI commitments. Spread targets, depth at various distances from mid-price, uptime percentages, venue coverage.

Most founders skim this section because the numbers feel abstract. What does "15bp spread across T1 venues with $100K depth at 200bp" actually mean for your token? Is that good?

It's the most important question you can ask, because KPIs are the only enforceable accountability mechanism in a market-making agreement. The loan terms define what you pay. The KPIs define what you get back. If the commitments are vague, aggregated, or set well below achievable benchmarks, you've given away call options on your treasury in exchange for a promise that nobody can hold the market maker to.

What the Metrics Actually Measure

Three metrics form the core of any market-making KPI framework, and understanding how they interact is essential to evaluating whether a proposal represents genuine liquidity commitment.

Spread is the difference between the best bid and best ask, measured in basis points relative to the mid-price. A 10bp spread on a $1.00 token means the MM is quoting $0.9995 bid and $1.0005 ask. Tighter spreads reduce execution cost for traders and signal an active, confident market maker. But spread alone is misleading — a market maker can quote a 5bp spread with $100 of depth on each side and technically meet a spread target while providing no real liquidity.
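As a quick check on the arithmetic above, here is a minimal sketch of the spread calculation. The function name `spread_bps` is illustrative and not taken from any exchange API.

```python
def spread_bps(best_bid: float, best_ask: float) -> float:
    """Quoted spread in basis points, relative to the mid-price."""
    if best_bid <= 0 or best_ask < best_bid:
        raise ValueError("expected 0 < best_bid <= best_ask")
    mid = (best_bid + best_ask) / 2
    return (best_ask - best_bid) / mid * 10_000


# The $1.00 token example: $0.9995 bid, $1.0005 ask.
print(f"{spread_bps(0.9995, 1.0005):.2f} bp")  # prints "10.00 bp"
```

Note that the function says nothing about how much size sits behind those two prices, which is exactly why a spread target on its own is easy to game.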

Depth is the cumulative dollar value of resting orders within specified distances of mid-price, typically measured at 25bp (very tight), 50bp, 100bp (1% from mid), and 200bp (2% from mid). Depth at 25bp captures the order book right at the top, which is what market orders hit first. Depth at 200bp captures the broader book that provides a cushion against larger trades or sudden moves. A well-structured book has meaningful depth at each band, not just at the widest one.
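To make the band definition concrete, the sketch below computes cumulative depth per side from an order-book snapshot. It assumes the book arrives as lists of (price, size in tokens) tuples. The function name, the input shape, and the sample book are all made up for illustration.

```python
BANDS_BPS = (25, 50, 100, 200)


def depth_by_band(bids, asks, bands_bps=BANDS_BPS):
    """Cumulative resting notional (USD) within each band of mid, per side.

    bids and asks are lists of (price, size_in_tokens) tuples.
    """
    if not bids or not asks:
        raise ValueError("need at least one bid and one ask")
    mid = (max(p for p, _ in bids) + min(p for p, _ in asks)) / 2
    result = {}
    for band in bands_bps:
        floor = mid * (1 - band / 10_000)    # lowest bid price inside the band
        ceiling = mid * (1 + band / 10_000)  # highest ask price inside the band
        result[band] = {
            "bid_usd": sum(p * q for p, q in bids if p >= floor),
            "ask_usd": sum(p * q for p, q in asks if p <= ceiling),
        }
    return result


# Illustrative book around a $1.00 mid (made-up numbers).
bids = [(0.999, 10_000), (0.996, 20_000), (0.991, 30_000), (0.981, 50_000)]
asks = [(1.001, 10_000), (1.004, 20_000), (1.009, 30_000), (1.019, 50_000)]
for band, sides in depth_by_band(bids, asks).items():
    print(f"{band:>3}bp  bid ${sides['bid_usd']:,.0f}  ask ${sides['ask_usd']:,.0f}")
```

Because the bands are cumulative, depth at 200bp always includes everything counted at 25bp. A large 200bp number therefore says nothing about the inner book unless the tighter bands are reported alongside it.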

This is where the interaction matters. Tight spreads with thin depth make a phantom book: it looks good on a dashboard but provides no protection against any real trading flow. Deep books with wide spreads aren't providing useful liquidity either, because nobody wants to cross a 50bp spread to access depth that sits far from the current price. The combination of tight spreads AND deep books at multiple bands is what constitutes real market making. When you see a proposal that specifies only a spread target without depth bands, or depth only at the widest band (200bp) without inner levels, that's a proposal designed to be easy to satisfy, not one designed to provide quality liquidity.

Uptime is the percentage of time the MM's quotes are live on the exchange. A 95% uptime commitment sounds reasonable until you realize that 5% downtime is 36 hours in a 30-day month. If that downtime coincides with volatile periods, which is precisely when liquidity matters most, the effective coverage is much worse than the number suggests. Daily uptime granularity (measured per 24-hour period) is more meaningful than monthly averages, because monthly averaging allows the MM to be offline for extended stretches during stress and make up the numbers during quiet periods.
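The gap between monthly and daily measurement is easy to see with a small sketch. It assumes a 30-day month and a log of live quoting hours per day. The function names and the sample month are hypothetical.

```python
def downtime_hours(uptime_pct: float, days: int = 30) -> float:
    """Hours of permitted downtime implied by an uptime percentage."""
    return (100 - uptime_pct) / 100 * days * 24


def uptime_report(live_hours_per_day, daily_target_pct=95.0):
    """Return (monthly uptime %, list of 1-indexed days below the daily target)."""
    total_hours = len(live_hours_per_day) * 24
    monthly_pct = sum(live_hours_per_day) * 100 / total_hours
    failing_days = [
        day
        for day, hours in enumerate(live_hours_per_day, start=1)
        if hours * 100 / 24 < daily_target_pct
    ]
    return monthly_pct, failing_days


# Made-up month: fully live except 18 hours offline on each of two stress days.
hours = [24.0] * 30
hours[9] = hours[10] = 6.0
monthly, failing = uptime_report(hours)
print(f"allowed downtime at 95%: {downtime_hours(95):.0f}h")
print(f"monthly uptime: {monthly:.1f}%  failing days: {failing}")
```

In this example the MM meets a 95% monthly target exactly while being absent for most of two consecutive days. A daily measurement would flag both days.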

Why Exchange Tier Matters

One of the most common errors in KPI evaluation is applying a single benchmark across all venues. What constitutes strong performance varies dramatically by exchange, and the reason is structural: different exchanges have different maker fee schedules, different native liquidity profiles, different hedging infrastructure, and different user bases.

The way I frame this for founders is a four-tier system based on native liquidity and institutional infrastructure:

Tier 1 exchanges — Binance, OKX, Coinbase, Kraken, and Bybit — have the deepest native liquidity, the most sophisticated maker fee tiers, and the broadest hedging infrastructure (perps, margin, lending). A competent market maker should be able to quote 10bp spreads and support up to $500,000 of depth at 200bp on these venues for a liquid mid-cap token. The baseline expectation is highest here because the tools available to the MM are best.

Tier 2 — KuCoin, Gate.io, HTX, MEXC, and Bitget — have less native liquidity and narrower maker incentives. Base spread targets widen to roughly 15bp, and depth caps drop to around $250,000 at 200bp. These venues are still important for global coverage, but benchmarks need to reflect the reality that making markets here is harder and more capital-intensive relative to the liquidity environment.

Tier 3 — Bitfinex, Crypto.com, Bithumb, LBank, XT.COM — are tertiary venues where base spreads of 25bp are typical and depth caps are around $100,000 at 200bp. The MM's ability to hedge efficiently on these exchanges is limited, so the cost of providing tight, deep liquidity is higher.

Beyond these tiers, any exchange not in the above categories is effectively a Tier 4 venue where 35bp spreads and $50,000 depth caps are realistic benchmarks. Proposing ambitious KPIs on an exchange where the MM can't efficiently hedge is either uninformed optimism or a commitment designed to be missed quietly.
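The tier framing above can be captured as a simple lookup. This is a sketch of the article's reference numbers only; the names `TierBenchmark` and `benchmark_for` are illustrative, and the figures are starting points rather than contractual targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierBenchmark:
    tier: int
    base_spread_bps: float
    depth_cap_usd_200bp: float


TIERS = {
    1: TierBenchmark(1, 10, 500_000),
    2: TierBenchmark(2, 15, 250_000),
    3: TierBenchmark(3, 25, 100_000),
    4: TierBenchmark(4, 35, 50_000),
}

VENUE_TIER = {
    **dict.fromkeys(["binance", "okx", "coinbase", "kraken", "bybit"], 1),
    **dict.fromkeys(["kucoin", "gate.io", "htx", "mexc", "bitget"], 2),
    **dict.fromkeys(["bitfinex", "crypto.com", "bithumb", "lbank", "xt.com"], 3),
}


def benchmark_for(venue: str) -> TierBenchmark:
    """Look up the reference benchmark; unlisted venues fall back to Tier 4."""
    return TIERS[VENUE_TIER.get(venue.strip().lower(), 4)]


for name in ("Binance", "KuCoin", "LBank", "SomeOtherExchange"):
    b = benchmark_for(name)
    print(f"{name}: T{b.tier}, {b.base_spread_bps}bp, ${b.depth_cap_usd_200bp:,.0f} at 200bp")
```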

Figure: Live orderbook depth across exchanges, illustrating the natural variation in native liquidity by venue tier. T1 exchanges consistently support deeper books.

Figure: Bid-ask spreads by exchange tier. Reference lines show target benchmarks for T1 and T2 venues; anything significantly above should prompt questions.

Benchmarking: What the Numbers Should Look Like

Concrete benchmarks depend on two things: the exchange tier and the token's hedgability profile. A token with active perpetual futures on multiple exchanges, high average daily volume relative to market cap, and listings across many centralized venues is fundamentally easier to make markets in than one that trades spot-only on three exchanges with no perps and low daily turnover.

The mechanics are straightforward. A market maker hedges their spot inventory using perps, other spot venues, or options. If the hedging infrastructure is robust, meaning perp volume at 3x or more of spot average daily volume (ADV), multiple perps venues, and deep DEX liquidity, the MM can quote tighter and deeper because their risk per unit of depth is lower. If hedging is thin, every dollar of depth the MM commits represents more unhedged directional exposure, which gets priced into wider spreads and shallower books.

To illustrate, consider a token with $5 million average daily spot volume across all venues, moderate hedging infrastructure (hedgability composite around 0.6 on a 0-1 scale), and listings on both T1 and T2 exchanges. On a T1 venue like Binance, a reasonable benchmark for depth at 200bp from mid is approximately $500,000 (capped at the venue's structural limit). On a T2 venue like KuCoin, the same token's benchmark drops to around $250,000 — the depth is shallower because the hedging cost per dollar of inventory is higher on a less liquid exchange.

Spread targets follow a similar logic. With moderate hedgability, a T1 venue should support spreads in the 8-12bp range. A T2 venue, 12-18bp. These aren't aspirational numbers — they're what a competent MM with adequate hedging infrastructure can sustain without taking on excessive inventory risk.

Reference benchmarks by exchange tier. These are starting points — actual targets should be calibrated to the token's specific hedgability profile and trading volume.

There's an important aggregate constraint worth understanding. The total depth committed across all venues shouldn't exceed a reasonable fraction of the token's hedgeable volume. If a market maker commits $2 million of aggregate depth at 200bp but the token's total ADV is only $5 million, that's a 40% utilization rate, which implies the MM would need to deploy a large share of its inventory as resting orders. For tokens with strong hedging infrastructure, aggregate depth can safely reach 15-20% of reference ADV. For tokens with limited hedging, 5-10% is more realistic. Proposals that implicitly exceed these envelopes are making promises they'll struggle to keep during volatile periods.
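The aggregate constraint reduces to one division and a comparison. The sketch below uses the upper end of each range quoted above as the ceiling (20% with strong hedging, 10% with limited hedging). That choice, the function names, and the per-venue split in the example are assumptions for illustration.

```python
def aggregate_utilization(depth_200bp_by_venue_usd, reference_adv_usd):
    """Total committed 200bp depth as a fraction of reference ADV."""
    if reference_adv_usd <= 0:
        raise ValueError("reference ADV must be positive")
    return sum(depth_200bp_by_venue_usd.values()) / reference_adv_usd


def within_envelope(utilization, strong_hedging):
    """Compare against the upper end of the stated range (assumed ceiling)."""
    ceiling = 0.20 if strong_hedging else 0.10
    return utilization <= ceiling


# The example from the text: $2M aggregate depth against $5M ADV.
proposal = {"venue_a": 1_000_000, "venue_b": 500_000, "venue_c": 500_000}
u = aggregate_utilization(proposal, 5_000_000)
print(f"utilization: {u:.0%}, within envelope: {within_envelope(u, strong_hedging=True)}")
```

At 40%, the example proposal is double the ceiling even for a token with strong hedging infrastructure.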

Red Flags in KPI Proposals

Having reviewed hundreds of market-making proposals across different deal structures, I've found that certain patterns consistently signal weak commitments:

Aggregated metrics across venues. This is the single most common red flag. A proposal that commits to "$500K aggregate depth at 200bp across 6 exchanges" sounds reasonable until you consider that the MM could concentrate all their depth on one exchange (the easiest to make markets on) and provide almost nothing on the other five. Per-venue commitments with specific targets for each exchange are the minimum acceptable standard.
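A short sketch shows how an aggregate target can pass while most venues are nearly empty. The venue names, observed depths, and per-venue targets below are all hypothetical. The per-venue targets were chosen to sum to the same $500K so the two checks are comparable.

```python
AGGREGATE_TARGET_USD = 500_000  # "$500K aggregate depth at 200bp across 6 exchanges"

# Hypothetical observed 200bp depth: almost everything sits on one venue.
observed = {"venue_a": 480_000, "venue_b": 4_000, "venue_c": 4_000,
            "venue_d": 4_000, "venue_e": 4_000, "venue_f": 4_000}

# Hypothetical per-venue commitments summing to the same $500K.
per_venue_target = {"venue_a": 150_000, "venue_b": 70_000, "venue_c": 70_000,
                    "venue_d": 70_000, "venue_e": 70_000, "venue_f": 70_000}


def aggregate_ok(observed_usd, target_usd):
    """True when total depth across venues meets the aggregate target."""
    return sum(observed_usd.values()) >= target_usd


def per_venue_shortfalls(observed_usd, targets_usd):
    """Map each venue that misses its own target to the size of the miss."""
    return {
        venue: target - observed_usd.get(venue, 0)
        for venue, target in targets_usd.items()
        if observed_usd.get(venue, 0) < target
    }


print("aggregate met:", aggregate_ok(observed, AGGREGATE_TARGET_USD))
print("venues short:", per_venue_shortfalls(observed, per_venue_target))
```

The aggregate check passes, yet five of the six venues are $66,000 short of their individual targets.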

No inner depth bands specified. A proposal that mentions only depth at 200bp (the widest band) without committing to tighter levels is optimizing for the easiest metric to hit. Depth at 200bp is the cheapest to provide because those orders are unlikely to get filled in normal trading. Depth at 25bp and 50bp, where actual market orders execute, is harder and more expensive to maintain. Proposals should specify depth at multiple bands.

Uptime measured monthly, not daily. Monthly uptime averaging allows extended offline periods during stress events to be masked by uptime during quiet periods. Daily granularity means the MM must be present every day, including the days when liquidity matters most.

No reporting cadence or format defined. If the proposal doesn't specify how and when KPI performance will be reported — weekly reports, daily dashboards, specific metrics included — you have no mechanism to monitor whether commitments are being met. By the time you notice underperformance, months may have passed.

No penalty or clawback for misses. A KPI commitment without a consequence for failure is a suggestion, not a commitment. The strongest deals include provisions where sustained underperformance triggers loan reduction, forfeiture of option tranches, or early termination rights. Without teeth, KPIs are aspirational.

Spreads above 25bp on T1 exchanges. For any token with reasonable volume, a competent market maker should quote tighter than 25bp on Binance or OKX. If the proposal's spread targets on T1 venues start at 30bp or higher, either the token is genuinely difficult to make markets in (very low volume, no perps, no hedging infrastructure) or the commitments have been set conservatively to ensure they're always met.

What to Demand

Based on what good actually looks like, here's the concrete framework for what to require in a market-making agreement:

Per-venue reporting. Every exchange should have its own spread target, depth targets at multiple bands, and uptime commitment. Aggregate metrics are supplementary context, not the primary accountability layer.

Daily granularity. Uptime and spread compliance measured daily, reported weekly. The MM should provide either a dashboard or regular reports that show performance at the venue level for each trading day.

Depth at multiple bands. At minimum, commitments at 50bp, 100bp, and 200bp from mid-price. The ratios between these bands tell you about the book's shape. A reasonable structure is roughly 25% of 200bp depth at 50bp, 60% at 100bp, and 100% at 200bp — a gradual build from tight to wide. If the proposal only commits to the widest band, the inner book may be hollow.
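The band ratios translate directly into per-band targets. In this sketch the 25% / 60% / 100% shares come from the paragraph above. The function names are illustrative, and the strict comparison is a simplification, since the ratios are stated as rough guides and a real check would allow some tolerance.

```python
# Reference share of 200bp depth expected at each band, in percent.
REFERENCE_SHARE_PCT = {50: 25, 100: 60, 200: 100}


def band_targets(depth_200bp_usd):
    """Per-band depth targets implied by the reference shape."""
    return {band: depth_200bp_usd * pct / 100
            for band, pct in REFERENCE_SHARE_PCT.items()}


def hollow_bands(proposed_usd):
    """Bands that are missing from a proposal or fall below the reference shape.

    proposed_usd maps band (bp) to committed depth and must include the 200bp band.
    """
    targets = band_targets(proposed_usd[200])
    return [band for band, target in targets.items()
            if proposed_usd.get(band, 0) < target]


print(band_targets(500_000))
print(hollow_bands({200: 500_000}))               # only the widest band committed
print(hollow_bands({50: 130_000, 100: 310_000, 200: 500_000}))
```

For a $500,000 commitment at 200bp, the reference shape implies $125,000 at 50bp and $300,000 at 100bp. A proposal that names only the 200bp figure leaves both inner bands unaccounted for.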

Clear penalty mechanisms. Define what happens when KPIs are missed: for how many consecutive days, by how much, and what the remedy is. Graduated penalties — warning, reduced allocation, tranche forfeiture — create real incentives for consistent performance.
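One way to express a graduated penalty is as a function of the longest run of consecutive missed days. The 3, 7, and 14 day thresholds below are hypothetical placeholders, not recommended values. The actual numbers, and any test of miss magnitude, belong in the negotiated agreement.

```python
# Hypothetical escalation ladder: (minimum consecutive missed days, remedy).
ESCALATION = [
    (14, "tranche forfeiture"),
    (7, "reduced allocation"),
    (3, "warning"),
]


def longest_miss_streak(daily_pass):
    """Longest run of consecutive days on which the KPI was missed."""
    longest = current = 0
    for passed in daily_pass:
        current = 0 if passed else current + 1
        longest = max(longest, current)
    return longest


def remedy(streak_days):
    """Most severe remedy whose threshold the streak has reached."""
    for min_days, action in ESCALATION:
        if streak_days >= min_days:
            return action
    return "none"


# Made-up compliance log: True means the venue met its KPI that day.
log = [True, True, False, False, False, True, False, True]
streak = longest_miss_streak(log)
print(streak, remedy(streak))
```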

Transparent methodology. How is spread measured — at a snapshot, or as a time-weighted average? How is depth measured — bid-side only, ask-side only, or total? Is the reporting based on the MM's own monitoring or independent verification? These methodological details determine whether the numbers you see reflect reality.
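The snapshot versus time-weighted question matters more than it might seem. The sketch below assumes spread observations arrive as (timestamp in seconds, spread in bp) pairs and that each observation holds until the next one. The function name and the sample hour are made up for illustration.

```python
def time_weighted_spread(samples, end_ts):
    """Time-weighted average spread over [first sample, end_ts].

    samples is a time-sorted list of (timestamp_seconds, spread_bps);
    each spread is assumed to hold until the next sample.
    """
    if not samples:
        raise ValueError("no samples")
    boundaries = [ts for ts, _ in samples[1:]] + [end_ts]
    weighted = total = 0.0
    for (ts, spread), next_ts in zip(samples, boundaries):
        dt = next_ts - ts
        if dt < 0:
            raise ValueError("samples must be sorted and end_ts must not precede them")
        weighted += spread * dt
        total += dt
    if total == 0:
        raise ValueError("window has zero length")
    return weighted / total


# One hour: 10bp for 50 minutes, 60bp for 9 minutes, back to 10bp for the last minute.
samples = [(0, 10.0), (3000, 60.0), (3540, 10.0)]
print("snapshot at end of hour:", samples[-1][1])
print("time-weighted average:", time_weighted_spread(samples, end_ts=3600))
```

An end-of-hour snapshot reports 10bp here, while the time-weighted figure is 17.5bp because the nine minutes at 60bp are counted. A report built on well-timed snapshots can hide exactly the periods a founder cares about.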

Independent verification where possible. Several third-party services now provide market-making monitoring. Where available, independent verification of spread, depth, and uptime adds a layer of accountability that removes reliance on the MM's self-reported data.

The negotiation isn't adversarial. Market makers want to know what success looks like so they can allocate resources appropriately. Specific, measurable, per-venue KPIs with clear reporting and penalty provisions benefit both sides: the founder gets accountability for their option value transfer, and the market maker gets clear targets that define their obligations. The deals that work best are the ones where both parties know exactly what they're agreeing to.

Chris Newhouse is the founder of Thalassa Labs, an independent research and education initiative focused on advancing public understanding of crypto derivatives and market structure. Learn more at thalassalabs.xyz.