What SiliconFlow does

SiliconFlow is an AI inference platform built around the idea of a 'token factory': one OpenAI-compatible API for 200+ language and multimodal models, including DeepSeek, Qwen, Kimi, GLM, and Gemma, with context windows up to 1M tokens. It runs models on NVIDIA H100/H200, AMD MI300, and other GPUs, and offers serverless pay-per-use inference, reserved and elastic GPUs, fine-tuning, and an AI Gateway for routing and rate limiting. Token-based pricing is competitive, with examples in the cents-per-million-tokens range.

Scale and funding

Founded in August 2023 and headquartered in Beijing, SiliconFlow has grown to millions of users and thousands of enterprise customers generating hundreds of billions of daily tokens. It raised a Series A led by Alibaba in mid-2025 and a large Series B of roughly $294M in 2026, with investors including China Renaissance-advised parties, Jinko, Nio Capital, and China Unicom, making it one of the best-funded inference startups in Asia and a major alternative for serving open-source models at scale.