Why Dolphin Isn't On The Dashboard Yet
A widely shared X thesis on Dolphin AI is doing the rounds. The architecture is real. The numbers in the post deserve a closer reading than they're getting. Here's the verification audit and the catalyst we're waiting on.
A long, detailed bull thesis for Dolphin AI ($POD) circulated on X on 12 May 2026, against a backdrop of multi-week accumulation and rising on-chain volume that the post itself walks through in detail. It is doing what good bull theses do.
We are not adding Dolphin to the dashboard.
Watchlist status is a statement about our verification process. The question of whether Dolphin is doing good work is separate, and largely positive. The point of writing this up is to make the process legible while the trade is hot, rather than after.
What The Bull Thesis Got Right
Steelman first. Most of the architectural story in the X post holds up under a closer read.
Dolphin is the AI lab behind Venice’s uncensored chat. The relationship is not a marketing line. Venice’s own product blog confirms the Dolphin Mistral 24B Venice Edition is the project’s flagship uncensored model, and dphn.ai/summary documents the inference flow at roughly 60,000 prompts per hour across Venice’s user base. Two independent attestations, one from each side.
The peer-to-pool architecture is the right diagnosis of why prior decentralised GPU compute networks have struggled. We made the same observation in our own coverage of inference privacy: 1:1 GPU rental forces consumer hardware operators into uptime commitments that consumer GPU owners will not give. Pooling supply behind a load balancer is the architectural concession that makes consumer GPUs usable. Dolphin is engineering against the right failure mode.
The training infrastructure claim checks out. Targon (Bittensor subnet 4) is independently described across CoinMarketCap, IQ.wiki and simplytao.ai as running over 1,500 H200 GPUs and processing 20B+ paid inference tokens per day. Manifold Labs, Targon’s operator, co-authored a confidential-compute whitepaper with Intel published 23 March 2026. The compute backbone behind Dolphin’s flagship model training is real, well-funded, and independently visible. This is one of the genuinely verified pillars of the thesis.
The economic design is also worth taking seriously. Operating as a DAO with no equity, routing 100% of network revenue into $POD buybacks, and using ETH-style slashable bonds for node operator alignment is a clean tokenomic structure on paper. Whether the cash actually flows through it is the open question. We will come back to that.
Fact: Targon (Bittensor SN4) operates 1,500+ H200 GPUs processing 20B+ paid inference tokens daily, independently corroborated across three third-party sources as of May 2026.
Take: Of all the numbers in the X post, the Targon compute story is the most solidly verified. It tells you Dolphin has the training infrastructure to ship frontier-grade uncensored models. It does not tell you the inference network is paying for itself yet.
What The Post Glides Over
Now the verification audit. Three claims in the thesis read more confidently than the primary sources support.
Hugging Face downloads. The post and the team’s own project summary both reference “5M+ monthly Hugging Face downloads.” Fetched today, the Hugging Face API for the dphn org returns roughly 4.8M cumulative downloads across 56 models. One model, dolphin-2.9.1-yi-1.5-34b, accounts for 4.7M of that total. The flagship “Dolphin-Mistral-24B-Venice-Edition” the post centres on shows 14,400 downloads. A cumulative total concentrated in one legacy model is a different statistic from monthly downloads across the lineup. The cumulative figure is accurate. Framing it as monthly is what doesn’t hold up.
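To make the concentration concrete, the figures quoted above can be run through a quick share calculation. This is a minimal sketch using only the rounded numbers in this section; the constant names and the `share` helper are ours, and the inputs are the approximate totals as quoted, not live API output.

```python
# Rounded figures quoted in this section (not live API output).
ORG_TOTAL = 4_800_000      # cumulative downloads across the dphn org
LEGACY_MODEL = 4_700_000   # dolphin-2.9.1-yi-1.5-34b
FLAGSHIP = 14_400          # Dolphin-Mistral-24B-Venice-Edition


def share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


legacy_share = share(LEGACY_MODEL, ORG_TOTAL)
flagship_share = share(FLAGSHIP, ORG_TOTAL)

print(f"Legacy model share of org downloads: {legacy_share:.1f}%")
print(f"Flagship share of org downloads:     {flagship_share:.1f}%")
```

On these rounded inputs, one legacy model carries roughly 98% of the org total and the flagship about 0.3%. That is why the distinction between cumulative and monthly matters here.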
The 0% refusal claim. The X post cites “0% refusals on Venice’s 45-question benchmark” for Dolphin’s “newest model.” Venice’s own published benchmark, documented when the Dolphin Mistral 24B Venice Edition launched, scores Dolphin at 2.2% refusal. For context, Claude scores 71% and ChatGPT 64% on the same benchmark. The directional claim, that Dolphin leads the uncensored open-model field by a wide margin, is verified. The specific “0% on our newest” figure is uncited and may refer to an unreleased successor model not yet on the public benchmark. Treat as team-reported.
The OpenRouter listing as a future catalyst. This one is the most material. The bull thesis frames “OpenRouter listing 4-6 weeks out” as the catalyst that activates the buyback flywheel. Two things are true that the post does not pull apart. Dolphin Mistral 24B Venice Edition is already on OpenRouter, served by Venice as the provider. That is not the listing the thesis is talking about. The catalyst the team is describing is Dolphin Network listed as its own inference provider on OpenRouter, routing inference through its peer-to-pool network rather than Venice’s stack. That is a real and material event when it happens. It is also not on the public dphn.ai/roadmap with a date. The “4-6 weeks” figure comes from team chatter quoted in the X post, not the public roadmap, which currently marks only Stage 1 of 10 as complete.
The distinction matters because it determines what is actually being bought.
What Hasn’t Been Verified Outside Team Materials
The harder gates are the technical claims that bear the load of the long thesis.
The Proof-of-Weights verification stack is described as running with 0.1% overhead versus 100% for full re-inference, a 100-1000× efficiency edge. If true, this is a meaningful piece of cryptographic engineering for distributed inference. As of this writing there is no third-party audit, no independent latency benchmark, and no peer-reviewed paper documenting the claim. The technical post explaining it is the team’s own.
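The arithmetic behind the efficiency range is worth spelling out. A minimal sketch, assuming the multiple is simply the re-inference overhead divided by the verification overhead. That is our reading, and the team's post may define it differently. The function name is ours.

```python
def efficiency_multiple(verify_overhead_pct: float,
                        reinference_overhead_pct: float = 100.0) -> float:
    """How many times cheaper verification is than full re-inference."""
    if verify_overhead_pct <= 0:
        raise ValueError("overhead must be positive")
    return reinference_overhead_pct / verify_overhead_pct


# 0.1% is the figure quoted by the team. 1.0% is a hypothetical value,
# chosen only to show where the low end of a 100-1000x range would sit.
for overhead in (0.1, 1.0):
    print(f"{overhead}% overhead -> {efficiency_multiple(overhead):.0f}x")
```

On that reading, the quoted 0.1% maps to the top of the range, and the 100× end would correspond to roughly 1% overhead. An independent benchmark would need to pin the overhead down to better than an order of magnitude to say which end applies.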
The same applies to the cost benchmarks. The 6,200 tokens per second number for Llama 3.1 8B on an RTX 4090, the 363 concurrent request batching, the 10× performance-per-dollar advantage versus H100s, all originate from Dolphin’s own bench infrastructure. The architectural direction is plausible. Consumer GPUs have been competitive with datacentre GPUs on per-dollar inference for small-to-medium models for a couple of years. The specific multiples need independent replication before we treat them as evidence rather than marketing.
The on-chain flow analysis in the bull thesis, $6.61M buy / $5.66M sell / 773 net long wallets, comes from a single analyst’s wallet tagging methodology that we have not been able to reproduce from raw BaseScan data in the time we have spent on this. Holder concentration on BaseScan beyond the headline 2,383 figure was not extractable from the public page without API access we don’t have set up. These are tractable verifications, not refutations. They just haven’t been done.
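For scale, the headline flow figures reduce to a simple netting. This sketch uses only the two dollar totals quoted from the thesis. It does not reproduce the analyst's wallet tagging, which is the part we could not verify, and the variable names are ours.

```python
BUY_USD = 6_610_000    # tagged buy volume quoted in the thesis
SELL_USD = 5_660_000   # tagged sell volume quoted in the thesis

net_usd = BUY_USD - SELL_USD
buy_sell_ratio = BUY_USD / SELL_USD
net_share_of_gross = 100.0 * net_usd / (BUY_USD + SELL_USD)

print(f"Net inflow:          ${net_usd:,}")
print(f"Buy/sell ratio:      {buy_sell_ratio:.2f}")
print(f"Net as % of gross:   {net_share_of_gross:.1f}%")
```

Taken at face value, the quoted totals imply a net inflow of about $0.95M, or under 8% of gross tagged volume. The conclusion is sensitive to how wallets are tagged, which is why reproducing the methodology matters.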
The Gates We Want Closed
For Dolphin to move from active watch to the dashboard, three things need to land. None of them are speculative. All of them are verifiable when they happen.
First, OpenRouter listing as a Dolphin Network provider, with public pricing and observable on-chain buyback flow against the POD contract. This converts the 100% revenue-to-buyback design from a tokenomic claim into a measured cash flow. It is the test that separates revenue mechanism from revenue activation.
Second, independent benchmark replication of the Proof-of-Weights overhead claim, ideally with the methodology published in a form a third-party engineer can run against a Dolphin worker node. The verification stack is the technical claim the entire thesis rests on. If it doesn’t hold, the peer-to-pool architecture loses its trust guarantee and the cost advantage gets eaten by reputational drag as soon as a node is caught serving degraded inference.
Third, presence on DeFiLlama or a comparable independent fee dashboard. The team can show what’s flowing through the network. They can also publish charts. An independent fee tracker is what makes “100% of revenue routes to buyback” auditable rather than asserted.
When two of those three land, we run the full nine-step research protocol and consider dashboard inclusion. When all three land, we don’t need to think about it. The case writes itself.
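The rule above is simple enough to write down as a check. A minimal sketch: the class, field names, and return strings are illustrative labels of ours, not part of any published protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gates:
    """The three gates listed above; field names are illustrative."""
    provider_listing_with_buybacks: bool  # OpenRouter listing as Dolphin Network
    proof_of_weights_replicated: bool     # independent overhead benchmark
    independent_fee_dashboard: bool       # DeFiLlama or a comparable tracker

    def closed(self) -> int:
        """Count how many gates have landed."""
        return sum((self.provider_listing_with_buybacks,
                    self.proof_of_weights_replicated,
                    self.independent_fee_dashboard))


def next_step(gates: Gates) -> str:
    """Map the number of closed gates to the editorial action."""
    n = gates.closed()
    if n == 3:
        return "dashboard case is made"
    if n == 2:
        return "run full nine-step research protocol"
    return "stay on watchlist"


# Today's state as described in this piece: no gate closed yet.
print(next_step(Gates(False, False, False)))
```

Price action is deliberately absent from the inputs.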
Position
Watchlist, with a real upgrade path. The architectural story is one of the better ones in decentralised inference right now. The peer-to-pool diagnosis is correct, the Venice integration is meaningful, the Targon compute backbone is verified, and the tokenomic design is clean. What’s missing is independent revenue activation, an audited verification stack, and a fee dashboard. Each of those is tractable. Each of those is the bar.
The momentum trade is a separate question from the editorial question. We don’t gate dashboard inclusion on price action in either direction, and we don’t add projects to the dashboard because the market is bidding them up. If the bid trumps verification, the dashboard becomes a leaderboard for narrative velocity, which is the opposite of what it’s for.
If you read the X thesis and bought, that’s your call. If you read it and want to know whether we think the project deserves a dashboard slot today, the answer is no, and the answer is conditional. The conditions are above. Two of them are likely landing inside the next 90 days. The third has been asked for in this space for two years and might keep being asked for.
Watch the OpenRouter listing. Watch dphn.ai/network for live worker counts that can be cross-referenced against on-chain activity. Watch for any third-party engineering write-up of Proof-of-Weights that isn’t on the dphn.ai domain. When two of the three show up, we’ll be back.
The press releases lag the on-chain data, and the X threads lag both.