AI startup news.
Fresh from across the AI startup ecosystem — funding announcements, product launches, analysis, and interviews.
OpenAI Improves ChatGPT Safety with Context-Aware Risk Detection
OpenAI launched new safety features that help ChatGPT recognize evolving risks across conversations by analyzing subtle warning signs and maintaining safety summaries for high-risk situations.
OpenAI brings Codex AI coding assistant to mobile phones
OpenAI launched a Codex mobile preview in the ChatGPT apps, letting developers manage AI coding work remotely across laptops and cloud environments with real-time sync.
Hugging Face Spaces hits 1.3 million AI apps as platform becomes go-to directory
Hugging Face's Spaces platform now hosts over 1.3 million AI applications, establishing itself as the largest directory for machine learning demos and tools.
Anthropic launches Claude Code agentic coding system
Anthropic released Claude Code, an autonomous coding agent that reads codebases, makes multi-file changes, runs tests, and commits code without line-by-line guidance.
Higgsfield Launches "Supercomputer" — One Chat, 30+ AI Models, Zero Tool-Hopping
Higgsfield AI just unveiled Supercomputer, a single chat-driven workspace that picks the right video and image model for you, plans the shot, and renders it. It routes between Sora 2, Kling 3.0, Veo 3.1, Seedance 2.0, Cinema Studio 3.5, Soul, Nano Banana Pro, and more.
Perplexity upgrades Search API with better extraction and dynamic benchmarks
Perplexity enhanced its Search API with improved snippet quality, span-level evaluation, and multi-query support while outperforming competitors on time-sensitive retrieval benchmarks.
Perplexity Expands Computer AI Platform Across Personal and Enterprise Markets
Perplexity launches Personal Computer running on dedicated Mac minis, Computer for Enterprise with app connectors, and new APIs as it positions AI as the operating system.
Perplexity launches Agent API for managed agentic workflows
Perplexity released Agent API, a managed runtime that orchestrates search, tool execution, and multi-model operations in a single integration point.
Anthropic Economic Index shows Claude usage diversifying across lower-wage tasks
Anthropic's latest Economic Index report reveals Claude usage has diversified to lower-wage tasks as adoption broadens, while experienced users achieve 10% higher success rates through learned strategies.
Researchers Build Compliance-Grade LLM Stack for Fraud Detection and AML
New research demonstrates a workload-aware LLM serving architecture that improved fraud detection throughput from 650 to 3,600 requests per hour while reducing latency by 80%.
ReVision Method Cuts Computer-Use Agent Token Usage by 46%
New research introduces ReVision, a technique that reduces visual token consumption in computer-use agents by nearly half while improving performance across three benchmarks.
ReAD Framework Uses Reinforcement Learning to Optimize AI Model Distillation
Researchers propose ReAD, a reinforcement-guided capability distillation framework that uses contextual bandits to efficiently compress large language models while preserving task-relevant abilities and reducing harmful capability spillover.
PIVOT Framework Cuts AI Agent Planning Failures by 94%
Researchers introduce PIVOT, a self-supervised framework that iteratively refines AI agent trajectories through environment feedback, achieving up to 94% improvement in constraint satisfaction while using 3-5x fewer tokens than competing methods.
Stanford Researchers Develop Analogical Reasoning Method to Boost LLM Scientific Creativity
Stanford researchers created an analogical reasoning approach that improves large language model diversity in scientific problem-solving by 90-173%, generating novel solutions over 50% of the time compared to 1.6% for baseline methods.
Vision-Language Models Show Systematic Bias From Embedded Numbers in Images
New research reveals that numeric anchors embedded in images create systematic bias in Vision-Language Model quality judgments, with effects 2.5 times larger than those of severe image degradation across six different models.
Researchers Test Vision-Language Models on Physics Puzzle Games
The new VLATIM benchmark reveals significant gaps between AI reasoning and execution capabilities when solving point-and-click physics puzzles from The Incredible Machine 2.