NeuronFeed
Best Foundation Models AI Tools

109 tools compared · 2026

DeepSeek-V3 reset inference pricing in a single quarter — every other lab is still recalibrating

109 foundation-model startups tracked, with the largest concentration in the US. Total tracked funding: $384.6B.

  • Tracked: 109
  • Total Raised: $384.6B
  • Countries: 17
  • Active Deals: 9

Funding by year — Foundation Models

2019 → 2026
  • 2019: $11M
  • 2021: $1.6B
  • 2022: $1.1B
  • 2023: $24.9B
  • 2024: $28.9B
  • 2025: $89.6B
  • 2026: $235.9B
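
As a rough illustration of how steep the curve above is, the yearly figures can be turned into year-over-year multiples. This is a minimal sketch, not NeuronFeed's own methodology: the dictionary simply copies the values from the chart, the name `yoy_multiples` is hypothetical, and 2020 is left out because the chart does not list it.

```python
# Yearly totals in billions of USD, copied from the chart above.
# 2020 is not listed in the chart, so it is omitted rather than guessed.
funding_by_year = {
    2019: 0.011,
    2021: 1.6,
    2022: 1.1,
    2023: 24.9,
    2024: 28.9,
    2025: 89.6,
    2026: 235.9,
}


def yoy_multiples(series):
    """Return {year: value / prior-year value} for years whose prior year is listed."""
    return {
        year: series[year] / series[year - 1]
        for year in sorted(series)
        if year - 1 in series
    }


multiples = yoy_multiples(funding_by_year)
total = sum(funding_by_year.values())
share_2026 = funding_by_year[2026] / total

for year, multiple in multiples.items():
    print(f"{year}: {multiple:.2f}x vs {year - 1}")
print(f"Sum of listed years: ${total:.1f}B; 2026 share: {share_2026:.0%}")
```

The sum of the listed years need not match the headline total exactly, since the chart skips 2020.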

Market overview

Start with the price chart. In late 2024, DeepSeek-V3 shipped a 671B-parameter MoE that matched GPT-4-class quality at roughly 1/30th the inference cost, and DeepSeek-R1 followed with reasoning at frontier parity. OpenAI cut GPT-4o pricing twice in the next two quarters, Anthropic released Claude 4 and a Haiku tier priced to compete, and Google's Gemini 2.5 / 3 family undercut both on long-context pricing. 68 published labs now sit under that pricing pressure, with $293B in cumulative disclosed funding, which makes this the most capital-intensive category on the platform.

The two-lab gravity well

OpenAI ($193.3B raised, ~$852B valuation, Series E) and Anthropic ($67.6B raised, Series G) absorb most of the western enterprise spend. Mistral AI ($6.3B raised including debt, France) holds the European open-weight position; Black Forest Labs ships frontier image generation from Germany. China runs a parallel stack: Zhipu AI ($1.49B, GLM-4 family) and MiniMax ($2.2B, IPO'd, Hailuo video plus Talkie) compete inside markets the western labs cannot serve directly. Behind them, Cohere ($1.77B Series E ext.), AI21 Labs ($336M Series C), 01.AI ($200M Series A, Yi family), and Reka AI ($180M Series B) defend narrower enterprise and multimodal slices.

The cost compression below

DeepSeek's reported $5.6M training run for V3 — even with the usual caveats about hardware accounting — broke the assumption that frontier-class quality required nine-figure compute spend. Llama 4 from Meta extended the open-weight pressure. Liquid AI ($300M Series B) is going the other direction with state-space architectures aimed at edge and on-device deployment. Skild AI ($2.2B Series C) and Physical Intelligence ($735M Series B) are training models for robotics, where the data bottleneck is the moat, not the GPU budget. 20 disclosed rounds in the trailing 12 months averaged $9.5B — the highest of any NeuronFeed category and an order of magnitude above the platform median.

What 2026 actually tests

The test is whether scaling laws hold above $1B per training run. Ineffable Intelligence raised $1.1B at seed on a pure scaling thesis, and Safe Superintelligence raised a $3B Series B on the same bet. If quality-per-dollar keeps compressing the way DeepSeek and Mistral have shown, the value moves to whoever owns distribution, fine-tuning, and proprietary data. If the next training generation produces another step-change, capital concentration tightens further.

Key trends 2026

  • DeepSeek-V3/R1 reset the cost curve. A reported $5.6M training run for V3 plus reasoning parity from R1 forced GPT-4o, Claude, and Gemini price cuts in 2025 — every economics deck in the category was rewritten.
  • Open-weight pressure is now a constant. Mistral ($6.3B), Llama 4, DeepSeek, and 01.AI's Yi family ship competitive weights on staggered cadence; closed-source labs price against the best open release each quarter.
  • Robotics models are the next data moat. Skild AI ($2.2B Series C) and Physical Intelligence ($735M) train on physical-world data nobody else has — the bottleneck is sensors and demonstrations, not GPUs.
  • Average round size dwarfs every other category. $9.5B average across 20 disclosed rounds — frontier-model training is still the single most capital-intensive activity in tech.
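
The average-round figure in the last trend implies a trailing-12-month total, since a mean multiplied by its count recovers the underlying sum. The sketch below works that out from the numbers quoted on this page. It assumes the 20 disclosed rounds are all included in the $293B cumulative figure, which the page does not state explicitly, and the variable names are illustrative only.

```python
# Figures quoted in the text above, in billions of USD.
disclosed_rounds = 20          # rounds disclosed in the trailing 12 months
average_round_b = 9.5          # average size of those rounds
cumulative_funding_b = 293.0   # cumulative disclosed funding for the category

# A mean times its count recovers the total the mean was computed from.
implied_trailing_total_b = disclosed_rounds * average_round_b

# Assumption: every trailing-12-month round is counted in the cumulative figure.
trailing_share = implied_trailing_total_b / cumulative_funding_b

print(f"Implied trailing-12-month total: ${implied_trailing_total_b:.0f}B")
print(f"Share of cumulative funding: {trailing_share:.0%}")
```

Under that assumption, the share shows how much of the category's cumulative capital arrived in the most recent twelve months.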

Benchmarks vs global

  • Total funding tracked: $293B (largest of any category by capital)
  • Avg round (last 12mo): $9.5B (order of magnitude above platform median)
  • DeepSeek-V3 reported training cost: ~$5.6M (forced GPT-4o / Claude / Gemini price cuts)
  • Companies tracked: 68 (24 US HQs, 37%; 6 China)

Stage breakdown

Latest round type
  • Series A 20
  • Series B 17
  • Seed 13
  • Series C 8
  • Series E 6
  • Series A Extension 2
  • Other 2
  • IPO 2

FAQ

Did DeepSeek-V3 actually train for $5.6M?
The figure DeepSeek published refers to the final pretraining run on H800s and excludes prior research, ablations, and infrastructure amortization. Independent estimates put the all-in cost at $50M-$100M+, still an order of magnitude below frontier western labs. The 2025 pricing reaction from OpenAI, Anthropic, and Google confirms the market took the implied efficiency seriously regardless of how the headline number is parsed.
Where does GPT-5 fit relative to Claude 4 and Gemini 2.5 / 3?
GPT-5 leads on agent-trace reasoning benchmarks; Claude 4 (Opus and Sonnet 4.5) leads on coding-specific eval suites including SWE-Bench Verified; Gemini 2.5 / 3 leads on long-context retrieval and multimodal reasoning at scale. Pricing has converged: all three frontier families now ship Haiku/mini/Flash tiers priced within roughly 2x of each other for comparable capability.
Why is Mistral the only European frontier lab at scale?
Capital and timing. Mistral closed €600M+ rounds early enough to assemble a frontier training team and shipped open-weight Mixtral and Mistral Large before EU AI Act compliance overhead made greenfield labs harder to fund. Black Forest Labs (Germany) holds image generation; Aleph Alpha pivoted to enterprise sovereignty plays. The European thesis now rests on open weights, sovereign deployment, and regulatory positioning rather than chasing frontier text directly.
Will the frontier club stay small?
The capital trend says yes; the efficiency trend says no. $9.5B average rounds and $293B cumulative funding favor incumbents. Distillation, MoE efficiency, DeepSeek-class training tricks, and open-weight catch-up cycles work the other way. The 2026 question is whether the gap to frontier compresses faster than the cost to reach it grows. So far in 2025-26, the compression has been winning.
Where do robotics foundation models like Skild and Physical Intelligence fit?
They sit in this category because they train large pretrained models from scratch — just on physical-world data instead of text. Skild AI ($2.2B Series C) is building a generalist robot brain; Physical Intelligence ($735M Series B) ships pi-class models for manipulation. The economics differ: data acquisition (teleoperation, sim-to-real) is the bottleneck rather than GPU spend, which is why their cap tables look more like deep-tech rounds than text-LLM mega-rounds.

Recent rounds in Foundation Models

Date | Startup | Round | Amount
Jun 2026 | Generalist AI | Series B | $400M
May 2026 | Anthropic | Series H | $65B
May 2026 | Decart | Other | $300M
May 2026 | Cerebras Systems | Other | $5.5B
May 2026 | Recursive Superintelligence | Seed | $650M
May 2026 | Moonshot AI | Series C | $2B
Apr 2026 | Ineffable Intelligence | Seed | $1.1B
Apr 2026 | Cohere | Series E | $600M
