BentoML

Active

Unified inference platform to run AI at scale

📍 San Francisco, United States 📅 Founded 2019 👥 11-50 🏷 AI Infrastructure

Total raised: $18M (2 rounds)

Stage: Seed

Team: 11-50 (since 2019)

Pricing: Freemium (free plan)

Founded: 2019, San Francisco, United States

Agent-ready: API (score 55/100)

About BentoML

BentoML was founded in 2019 by Chaoyu Yang, growing out of the widely adopted open-source BentoML project that became a standard way for machine learning teams to package and serve models. The company's premise is that getting a model into reliable, scalable production is still one of the hardest parts of applied AI, and that developers need a unified, framework-agnostic way to turn any model plus its surrounding code into a deployable service.

The open-source framework lets developers wrap models from any library, along with custom pre- and post-processing logic, into a standardized unit called a Bento. That artifact captures the model, dependencies, and serving code so it can run consistently anywhere. On top of this, BentoML offers the Bento Inference Cloud, a managed platform that deploys these services with GPU autoscaling, scale-to-zero, fast cold starts, and observability, removing much of the infrastructure work involved in production serving.
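To make "GPU autoscaling" and "scale-to-zero" concrete, here is a minimal sketch of a concurrency-based replica decision. It is only an illustration under stated assumptions, not the Bento Inference Cloud's actual policy: the function name `desired_replicas` and every parameter are hypothetical.

```python
import math


def desired_replicas(
    in_flight: int,
    target_per_replica: int,
    idle_s: float,
    idle_timeout_s: float,
    max_replicas: int,
) -> int:
    """Return how many replicas a service should run (hypothetical policy).

    in_flight          -- requests currently being processed or queued
    target_per_replica -- concurrency one replica is expected to handle
    idle_s             -- seconds since the last request was seen
    idle_timeout_s     -- idle time after which the service scales to zero
    max_replicas       -- upper bound on replicas
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if in_flight == 0:
        # No traffic: keep one warm replica until the idle timeout passes,
        # then release everything (scale-to-zero).
        return 0 if idle_s >= idle_timeout_s else 1
    # Traffic present: enough replicas to cover the load, capped at the limit.
    return min(max_replicas, math.ceil(in_flight / target_per_replica))
```

Scaling to zero saves GPU cost while idle, which is why fast cold starts matter: the first request after an idle period has to wait for a replica to come back.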

BentoML is built for the full range of modern AI workloads. It supports LLM and generative model serving, multi-model inference pipelines, adaptive request batching for throughput, and composition of several models behind a single endpoint. Teams can deploy on the Bento Inference Cloud or bring the platform to their own Kubernetes and cloud environments, choosing between fully managed and self-hosted operation.
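The idea behind request batching can be sketched in a few lines. The example below is a simplified, fixed-window micro-batcher written with the standard library only; it is not BentoML's implementation, and `MicroBatcher`, `max_batch_size`, and `max_wait_s` are hypothetical names. An adaptive batcher would additionally tune those two limits from observed latency.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class MicroBatcher:
    """Groups single requests into batches (hypothetical helper, not a BentoML class)."""

    predict_batch: Callable[[Sequence[float]], List[float]]
    max_batch_size: int = 4
    max_wait_s: float = 0.010
    _pending: List[float] = field(default_factory=list, init=False)
    _oldest_arrival: float = field(default=0.0, init=False)

    def submit(self, item: float, now: float) -> List[float]:
        """Queue one request; run the model if the batch is full, else return []."""
        if not self._pending:
            self._oldest_arrival = now  # start the wait window for this batch
        self._pending.append(item)
        if len(self._pending) >= self.max_batch_size:
            return self.flush()
        return []

    def tick(self, now: float) -> List[float]:
        """Flush a partial batch once its oldest request has waited long enough."""
        if self._pending and now - self._oldest_arrival >= self.max_wait_s:
            return self.flush()
        return []

    def flush(self) -> List[float]:
        """Run the model once over everything queued and clear the queue."""
        batch, self._pending = self._pending, []
        return self.predict_batch(batch) if batch else []
```

Passing `now` explicitly keeps the sketch deterministic; a real server would read a monotonic clock and route each result back to the caller that submitted it. The trade-off is the usual one: larger batches raise accelerator throughput, while the wait limit bounds the latency added to any single request.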

The company raised a $9 million seed round in 2023 from investors including DCM Ventures and Bow Capital. It followed with a $9 million Series A, reported at a roughly $50 million valuation, with participation from firms such as Greylock and Bessemer Venture Partners. That brings total funding to $18 million across two rounds. Its large open-source community has been a key driver of adoption and a funnel into the commercial cloud.

BentoML competes with serverless GPU and inference platforms, but its framework-agnostic packaging model, strong open-source roots, and support for complex multi-model pipelines make it especially appealing to ML engineering teams that want portable, production-grade serving without committing to a single proprietary runtime.

Key capabilities

Open-source framework to package any model as a service

Bento Inference Cloud with GPU autoscaling

Scale-to-zero and fast cold starts

Adaptive request batching for throughput

Multi-model inference pipelines

LLM and generative model serving

Self-hosted deployment on Kubernetes and cloud

Built-in observability and monitoring

Agent readiness

55/100

Developing

MCP server

Public API

Webhooks

OAuth 2.0

SDKs

Funding history

2 · $18M

— Seed $9M Undisclosed

— Series A $9M Undisclosed

Key operators

Chaoyu Yang

Founder

News & coverage

Critical Starlette Vulnerability Exposes Thousands of AI Applications to Auth Bypass

Hacker News 2mo ago

Alternatives

Nebius

Full-stack AI cloud with large-scale GPU clusters for training and inference

Foundation Models · AI Infrastructure

Celestial AI

Photonic Fabric optical interconnect for AI infrastructure

AI Infrastructure

d-Matrix

Digital in-memory compute (DIMC) chiplet-based hardware purpose-built for AI inference in the

AI Infrastructure

Zyphra

Open-science AI lab building efficient multimodal models and the Maia superagent

Foundation Models · Open Source AI

Chainguard

Secure, minimal container images for software and AI supply chains

AI Infrastructure · AI for Cyber Defense

Alcatraz AI

AI facial authentication that replaces the access badge with your face

AI Infrastructure · AI for Cyber Defense

Frequently asked

What is a Bento?

A Bento is a standardized, deployable unit that packages a model together with its dependencies and serving code so it runs consistently anywhere.

Is BentoML open source?

Yes. The core BentoML framework is open source, and the company offers the managed Bento Inference Cloud on top of it.

Can BentoML serve LLMs?

Yes. It supports serving LLMs and generative models, multi-model pipelines, and adaptive batching for high throughput.

Can I deploy on my own infrastructure?

Yes. BentoML can run on the Bento Inference Cloud or be self-hosted on your own Kubernetes and cloud environments.

Who funded BentoML?

Investors across its two rounds include DCM Ventures, Bow Capital, Greylock Partners, and Bessemer Venture Partners, for $18 million in total.
