NeuronFeed
CATEGORY

AI Agents startups (2026)

Coding agents are clearing 70%+ on SWE-Bench Verified — the rest of the agent stack is still arguing about reliability

414 AI agent startups tracked, with the largest concentration in the US. Total tracked funding: $107.9B.

Tracked: 414
Total Raised: $107.9B
Countries: 34
Active Deals: 12


Funding by year — AI Agents

2020 → 2026
2020: $3.1M
2021: $52M
2022: $357M
2023: $3.2B
2024: $23.0B
2025: $45.2B
2026: $29.0B

Market overview

The SWE-Bench Verified leaderboard turned over four times in 2025 alone. Anthropic's Claude Sonnet 4.5 pushed the public ceiling past 77%, OpenAI's GPT-5 agent traces matched it, and Cognition's Devin shipped solo runs that closed real GitHub issues at scale. Cursor ($7.6B raised, $29.3B valuation) and GitHub Copilot still own the IDE seat, but the open question for 132 tracked startups is whether anything outside coding ever reaches the same reliability bar.

The reliability ceiling

GAIA, WebArena, and OS-World show the gap. Coding clears 70%+ on Verified; general computer-use agents — Anthropic's Computer Use beta, OpenAI's Operator, Manus AI's autonomous browser — sit in the 40-60% band on multi-step web tasks. Decagon ($296M Series B) and Sierra ($1.27B Series C) sidestep the ceiling by narrowing scope: they run customer-facing dialogue inside a tight tool surface where every action is logged and reversible. Cognition's Devin chose the harder path and competes head-on with Cursor and Augment Code ($252M) on async engineering work where a wrong commit costs real money.

Where capital is concentrating

The top of the funding table is mostly cross-listed labs — xAI's $56B and Mistral AI's $6.3B carry over from Foundation Models because both ship agent surfaces. Among pure agent plays, Cursor sits ahead of Cognition ($1.14B Series C, $10.2B valuation), Glean ($750M Series F), Notion AI ($343M Series C), and Imbue ($232M Series B). Andreessen Horowitz, Sequoia, Thrive, and General Catalyst lead most rounds. Cline raised $4M seed and ships a VS Code agent that competes with $7B-funded incumbents — a useful proof that distribution and BYO-model flexibility still beat raw capital in this segment.

What 2026 actually decides

Whether agent reliability transfers. Coding works because compilers fail loudly, tests are cheap, and rollback is git reset. Booking a flight, filing an expense, or modifying a Salesforce record has none of those properties. The startups building eval harnesses, recovery loops, and explicit human-handoff UX — not the ones shipping flashier browser demos — are the ones likely to be alive in 18 months.
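
The recovery-loop and human-handoff pattern described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: `Step`, `RunResult`, and `run_with_recovery` are hypothetical names, and the sketch assumes every action can supply its own cheap verification check and its own undo, which is exactly the property that coding has and most business workflows lack.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One agent action (hypothetical structure for illustration)."""
    name: str
    run: Callable[[], None]      # performs the side effect
    verify: Callable[[], bool]   # cheap check, analogous to running tests
    undo: Callable[[], None]     # rollback, analogous to `git reset`


@dataclass
class RunResult:
    completed: List[str] = field(default_factory=list)
    handed_off_at: Optional[str] = None  # step name where a human took over


def run_with_recovery(steps: List[Step], max_retries: int = 2) -> RunResult:
    """Run steps in order; retry failures, then hand off to a human."""
    result = RunResult()
    for step in steps:
        for _attempt in range(max_retries + 1):
            try:
                step.run()
                ok = step.verify()
            except Exception:
                ok = False  # treat a crash like a failed check
            if ok:
                result.completed.append(step.name)
                break
            step.undo()  # roll back before retrying or handing off
        else:
            # All attempts failed: stop and escalate instead of continuing.
            result.handed_off_at = step.name
            return result
    return result
```

In this sketch a step that never verifies stops the run and records where a human should take over, rather than letting later steps build on an unverified state.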

Key trends 2026

  • SWE-Bench Verified is the benchmark the category leans on most. Claude Sonnet 4.5 and GPT-5 agent traces both cleared 77%, but only on coding; general computer-use scores on GAIA and OS-World sit in the 40-60% band, roughly 17 to 37 points lower.
  • Anthropic Computer Use, Operator, and Manus AI compete on the same OS-World task set. Public scores hover 40-60%; nobody has shipped a horizontal agent at coding-grade reliability yet.
  • Vertical narrowing beats horizontal autonomy commercially. Decagon ($296M) and Sierra ($1.27B) win deals by constraining tool surface; Devin and Augment Code win by going deep on engineering specifically.
  • Open-source agents punch above weight. Cline raised $4M seed, runs in VS Code, and ships against products with 1000x the capital — distribution still routes around funding.

Benchmarks vs global

SWE-Bench Verified ceiling: 77%+ (Claude Sonnet 4.5 / GPT-5 agent traces)
OS-World general computer-use: 40-60% (Operator, Computer Use, Manus AI band)
Total funding tracked: $87.9B (second-largest category by capital)
Companies tracked: 132 (43 US HQs, or 33%; 4 China)


Stage breakdown

Latest round type
  • Seed 182
  • Series A 65
  • Series B 33
  • Pre-Seed 23
  • Series C 21
  • Series D 10
  • Series A Extension 3
  • Corporate 3


FAQ

Frequently asked

Which AI agents actually pass SWE-Bench Verified above 70%?
As of late 2025, Anthropic's Claude Sonnet 4.5 and OpenAI's GPT-5 both posted public agent traces above 77% on SWE-Bench Verified. Cognition's Devin reports comparable numbers on its full agent harness, and Cursor's Composer mode clears 60%+ in independent runs. Outside coding, GAIA and OS-World numbers for Operator, Computer Use, and Manus AI still sit in the 40-60% range, which puts the reliability gap between code and general computer-use at roughly 17 to 37 points.
What separates AI Agents from AI Chatbots in this directory?
Chatbots reply; agents commit side effects. A chatbot drafts the email; an agent sends it, files the response in CRM, and books the follow-up. The same Claude or GPT model can power both — the difference is the scaffolding around it: tool registries, browser control, persistent memory, and recovery logic. NeuronFeed lists 132 agent startups separately from 50 chatbot startups because the buyer expectations and reliability bars are not comparable.
Why do investors keep funding coding agents specifically?
Coding is the one domain where agent failure is cheap and detectable. Tests run, compilers throw, and `git revert` exists. Cursor at $29.3B, Cognition at $10.2B, Augment Code at $977M, and Imbue at $1B reflect that revenue actually shows up — engineering teams will pay $20-40 per seat for a Copilot that ships PRs. No other agent vertical has a comparable feedback loop yet.
Is Manus AI competitive with Operator and Computer Use?
On public OS-World numbers, all three sit in roughly the same band. Manus AI scaled fastest on consumer demand in early 2025; OpenAI's Operator has the GPT-5 reasoning trace; Anthropic's Computer Use is the API path enterprises actually deploy. None of them are reliable enough yet for unattended 50-step workflows on production data — that gap is the central commercial bet of the category.
Where are agent startups concentrated geographically?
The US holds 43 of 132 published HQs (33%). China contributes 4 entries including Zhipu AI and Moonshot AI on the foundation side. Israel has 3, with Sweden, the Netherlands, and Czechia (JetBrains) at 1 each. The pattern mirrors AI Developer Tools — most of the agentic frontier is currently coding-focused, and that work tracks where the model labs and IDE incumbents already operate.
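
One way to see why unattended 50-step workflows are such a high bar, as noted in the Manus AI answer above, is to compound per-step reliability. The sketch below is a back-of-the-envelope model, not a measurement: it assumes every step succeeds independently with the same probability, and the per-step rates used are hypothetical inputs rather than benchmark scores. The function names are illustrative.

```python
def workflow_success(per_step: float, steps: int) -> float:
    """Chance that all steps succeed, assuming independent, equal-odds steps."""
    return per_step ** steps


def required_per_step(target: float, steps: int) -> float:
    """Per-step reliability needed to hit a target end-to-end success rate."""
    return target ** (1.0 / steps)


if __name__ == "__main__":
    # Hypothetical per-step rates, chosen only to show the compounding effect.
    for p in (0.99, 0.95):
        print(f"per-step {p:.2f} over 50 steps: {workflow_success(p, 50):.3f}")
    print(f"per-step rate needed for 90% over 50 steps: {required_per_step(0.9, 50):.4f}")
```

Under these assumptions, a step that works 99% of the time completes a 50-step run only about 61% of the time, and a 95% step completes it about 8% of the time. Real agents can recover from some failures, so actual numbers may be better, but the direction of the effect is the point.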

Recent rounds in AI Agents

Date      Startup       Round     Amount
Jun 2026  Lassie        Series A  $35M
May 2026  Cognition     Series D  $1B
May 2026  Flick         Seed      $6M
May 2026  Outmarket AI  Series A  $17M
May 2026  Vapi          Series B  $50M
May 2026  Exaforce      Series B  $125M
May 2026  Moonshot AI   Series C  $2B
May 2026  Fazeshift     Series A  $17M

All AI Agents startups

Page 14

Integral AI

JP est. 2021

Self-learning AI models for industrial robots and autonomous systems

Raised
$15.5M
Stage
Seed
43

ActionPower

KR est. 2016

Multimodal AI workflow automation built for Korean enterprises

Raised
$13.7M
Stage
S-B
43

AppliedAI

AE est. 2023

Agentic AI for back-office automation in regulated industries

Raised
$55M
Stage
S-A
43

Anterior

US est. 2022

Clinician-led AI platform that automates prior authorization and clinical administrative work for health plans.

Raised
$64M
Stage
S-C
43

/dev/agents

US est. 2024

An operating system purpose-built for the era of AI agents.

Raised
$56M
Stage
Seed
43

Gradient Labs

GB est. 2023

Procedural AI agents that deliver specialist customer support for regulated industries.

Raised
$13M
Stage
S-A
43

Wordsmith AI

GB est. 2023

Fleets of legal AI agents that turn in-house legal teams into revenue drivers.

Raised
$30M
Stage
S-A
43

Tana

US est. 2021

AI-native workspace built around a live knowledge graph, Supertags, and voice-driven capture.

Raised
$25M
Stage
S-A
43

Patagon AI

AR est. 2024

Generative AI sales agents built for Latin American enterprises

Raised
$3.9M
Stage
Seed
42

Sitch

US est. 2024

AI matchmaker app that blends LLM curation with human matchmaking expertise.

Raised
$6.7M
Stage
Seed
42

Elastics

PL est. 2025

AI-native operating system for prediction markets - trade with words, not order forms.

Raised
$2M
Stage
Pre-S
42

Coval

US est. 2024

Simulation and evaluation platform for voice and chat AI agents, modeled on self-driving car testing.

Raised
$3.3M
Stage
Seed
42

Forethought

US est. 2017

AI for customer support automation

Raised
$25M
Stage
S-D
38

Avoca AI

US est. 2023

The AI Front Office for Home Services, providing always-on agents to answer calls, fill schedules, and streamline operations.

Raised
$125M
Stage
S-B
37

JUPUS

DE est. 2022

Cologne-based legal AI that handles client intake, scheduling, and document drafting for German law firms.

Raised
$8.5M
Stage
Seed
37

Carecode

BR est. 2024

AI agents for healthcare contact centers, accessed via WhatsApp

Raised
$4.3M
Stage
Pre-S
37

Leta

KE est. 2021

AI-driven logistics platform making last-mile delivery cheaper in Africa

Raised
$8M
Stage
Seed
37

Qme

EG est. 2022

AI-driven booking and customer-journey infrastructure for MENA

Raised
$3M
Stage
Seed
37

Caveduck

KR est. 2023

Korean AI character chat platform with creator revenue sharing, backed by a $2.85M Series A.

Raised
$2.9M
Stage
S-A
37

Netomi

US est. 2016

The only agentic AI platform built for what comes after the pilot, designed for enterprises that can't afford to get it wrong.

Raised
$110M
Stage
S-C
36

Refact.ai

NL est. 2022

Self-hosted AI coding assistant — run on your own infra, own your data

Stage
Seed
35

PolyAI

GB est. 2017

The world's most lifelike voice AI agents for enterprise customer service.

Raised
$170M
Stage
S-C
35

Tavily

US est. 2023

Connect your AI agents to the web with real-time search, extraction, research, and web crawling through a single, secure API.

Raised
$20M
Stage
S-A
35

Factory

US est. 2023

The only software development agents that work everywhere you do, from IDE to CI/CD.

Raised
$200M
Stage
S-C
35