Inferact is commercializing vLLM, the widely adopted open-source LLM inference engine that originated at UC Berkeley's Sky Computing Lab in 2023. Founded by vLLM's core maintainers, the company aims to build a next-generation commercial inference engine and a 'universal inference layer' while continuing to support and fund the open-source project.
The company plans to offer a paid serverless version of vLLM, adding the operational capabilities enterprises need to run inference reliably in production, such as observability, troubleshooting, and disaster recovery, likely orchestrated on Kubernetes. Rather than competing with existing inference providers, Inferact aims to collaborate across the ecosystem by refining the shared inference layer everyone builds on.
Inferact's founding team includes vLLM core maintainers Simon Mo, Woosuk Kwon, Kaichao You, and Roger Wang, alongside Databricks co-founder and Berkeley professor Ion Stoica, who directs the Sky Computing Lab. That academic and open-source pedigree gives the company unusual technical depth in inference performance.
In January 2026, Inferact launched with $150M in seed funding at an $800M valuation. The round was co-led by Andreessen Horowitz and Lightspeed Venture Partners, with Sequoia Capital, Altimeter Capital, Redpoint Ventures, ZhenFund, and Databricks' venture arm participating.
By turning the de facto open-source inference standard into a supported commercial platform, Inferact positions itself as core infrastructure for organizations that rely on vLLM and want enterprise-grade reliability without abandoning the open ecosystem.