How is DeepInfra priced?

Inference is billed on a usage basis, typically per token for LLMs and per compute time for image and audio models, positioned below hyperscaler list prices.

Is the API compatible with OpenAI?

Yes, DeepInfra exposes OpenAI-compatible endpoints, so many applications can switch by changing the base URL and key.

Can I run a private or custom model?

Yes. DeepInfra offers dedicated deployments that run a specific model on reserved GPUs for isolated, predictable performance.

What kinds of models are available?

The catalog includes open-source LLMs, embedding models, text-to-image models, and speech-to-text systems.

Who has invested in DeepInfra?

Investors include 500 Global, Georges Harik, Felicis, NVIDIA, A.Capital Ventures, Crescent Cove, Peak6, Samsung Next, Supermicro, and Upper90, totaling over $130 million.

Startups AI Infrastructure DeepInfra

DeepInfra

Active

High-throughput, low-cost AI inference cloud

📍 Palo Alto, United States 📅 Founded 2022 👥 11-50 🏷 AI Infrastructure

Visit website

Total raised

$125M

2 rounds

Stage

Series A

Team

11-50

since 2022

Pricing

Freemium

free plan

Founded

2022

Palo Alto, United States

Agent-ready

API

Score 55/100

About DeepInfra

DeepInfra was founded in September 2022 in Palo Alto by Nikola Borisov, Yessenzhar Kanapin, and Georgios Papoutsis, engineers with deep backgrounds in large-scale networking and distributed systems. Their thesis was straightforward: as open-source models matured, the bottleneck for most companies would not be access to weights but the cost and operational burden of serving those models reliably at high throughput. DeepInfra was built to make running inference as simple as calling an API while keeping the per-token price as low as possible.

The platform exposes a large catalog of open models, including popular LLMs, embedding models, text-to-image models, and speech-to-text systems, each available behind an OpenAI-compatible endpoint. Developers pay only for what they use, billed by tokens or by compute time for image and audio workloads. Behind the API, DeepInfra manages GPU clusters, batching, and autoscaling, absorbing the complexity of capacity planning and keeping utilization high enough to sustain aggressive pricing.

For teams that need isolation or custom models, DeepInfra also offers dedicated deployments where a specific model runs on reserved GPUs. This gives predictable latency and throughput for production traffic while retaining the simplicity of the managed platform. The company emphasizes throughput and price-performance as its core differentiators against both hyperscalers and other inference startups.

DeepInfra's funding accelerated alongside demand for inference capacity. It raised an $8 million seed, an $18 million Series A in April 2025 led by Felicis and Georges Harik, and a $107 million Series B co-led by 500 Global and Georges Harik, with participation from A.Capital Ventures, Crescent Cove, Felicis, NVIDIA, Peak6, Samsung Next, Supermicro, and Upper90, bringing total funding above $130 million. NVIDIA's participation reflects the strategic importance of inference-focused neoclouds that drive GPU consumption.

The company sits in a competitive segment alongside other serverless inference providers, but its emphasis on raw price-per-token and a broad open-model catalog makes it attractive to developers and startups building high-volume AI products who want to avoid both vendor lock-in and the overhead of self-hosting GPUs.

Key capabilities

Pay-per-token API for hundreds of open models

OpenAI-compatible inference endpoints

Autoscaling managed GPU infrastructure

LLM, embedding, image, and speech model catalog

Dedicated deployments on reserved GPUs

Low price-per-token positioning

High-throughput batched serving

Simple REST and Python access

Agent readiness

55/100

Developing

MCP server

Public API

Webhooks

OAuth 2.0

SDKs

Funding history

2 · $125M

— Series A $18M Undisclosed

— Series B $107M Undisclosed

Key operators

Georgios Papoutsis

Founder

Nikola Borisov

Founder

Yessenzhar Kanapin

Founder

Alternatives

6 All →

Nebius

Full-stack AI cloud with large-scale GPU clusters for training and inference

Foundation ModelsAI Infrastructure

Celestial AI

Photonic Fabric optical interconnect for AI infrastructure

AI Infrastructure

d-Matrix

Digital in-memory compute (DIMC) chiplet-based hardware purpose-built for AI inference in the

AI Infrastructure

Zyphra

Open-science AI lab building efficient multimodal models and the Maia superagent

Foundation ModelsOpen Source AI

Chainguard

Secure, minimal container images for software and AI supply chains

AI InfrastructureAI for Cyber Defense

Alcatraz AI

AI facial authentication that replaces the access badge with your face

AI InfrastructureAI for Cyber Defense

Frequently asked

How is DeepInfra priced?: Inference is billed on a usage basis, typically per token for LLMs and per compute time for image and audio models, positioned below hyperscaler list prices.
Is the API compatible with OpenAI?: Yes, DeepInfra exposes OpenAI-compatible endpoints, so many applications can switch by changing the base URL and key.
Can I run a private or custom model?: Yes. DeepInfra offers dedicated deployments that run a specific model on reserved GPUs for isolated, predictable performance.
What kinds of models are available?: The catalog includes open-source LLMs, embedding models, text-to-image models, and speech-to-text systems.
Who has invested in DeepInfra?: Investors include 500 Global, Georges Harik, Felicis, NVIDIA, A.Capital Ventures, Crescent Cove, Peak6, Samsung Next, Supermicro, and Upper90, totaling over $130 million.

Discussion

Watching

Get DeepInfra updates

New funding, product launches, and team changes — to your inbox.

Follow startup

Claim ownership

Verify with your work email to manage this listing.

Explore more around DeepInfra

Contextual paths to related AI startups, deals and rankings.

Similar to DeepInfra

Country

United States AI startups

Compare

Alternatives

All alternatives to DeepInfra

DeepInfra

Claim DeepInfra

Enter your code

Claim approved

Claim received

Claim DeepInfra

Enter your code

Claim approved

Claim received

About DeepInfra

Key capabilities

Agent readiness

Funding history

Key operators

Georgios Papoutsis

Nikola Borisov

Yessenzhar Kanapin

Alternatives

Nebius

Celestial AI

d-Matrix

Zyphra

Chainguard

Alcatraz AI

Frequently asked

Explore more around DeepInfra

Similar to DeepInfra

Categories

Country

Compare

Alternatives

Rankings