StepFun

Active

StepFun offers a suite of AI tools including knowledge base Q&A, image creation, and advanced multimodal reasoning models.

Location: China
Category: AI Agents

Total raised: not listed
Stage: not listed
Team: not listed
Pricing: Freemium
Founded: not listed
Agent-ready: not listed

About StepFun

What StepFun does

StepFun (Jieyue Xingchen) is a China-based AI company that develops foundation models and AI tools. Headquartered in Shanghai, it builds large language models and multimodal systems spanning language, vision, video, and audio, and offers applications and an AI assistant built on those models.

Key capabilities

StepFun has released a series of Step foundation models, including a large Mixture-of-Experts language model and multimodal models. It has open-sourced models such as Step-Video for video generation and Step-Audio for speech interaction. Its product offerings include knowledge-base question answering, image creation, and multimodal reasoning capabilities.

Who it's for

StepFun serves developers and organizations building AI applications that require language understanding, multimodal generation, and reasoning. As a foundation-model and AI agent provider, it supports use cases ranging from content creation to interactive assistants. StepFun is among the prominent Chinese AI model startups.

Key capabilities

Knowledge Base Q&A

Image Creation

Agent Studio with specialized AI agents

Multi-source data cross-verification (Diligence Check)

Image editing capabilities (Step-3o Vision)

Multimodal reasoning (Step-R1-V-mini)

Lightning-fast automatic speech recognition (StepAudio Studio)

Deep research, currently in beta testing (StepClaw)

Technology stack

2 detected, May 30, 2026

Est. monthly stack spend ~$100/mo

Framework: Express

Infra: Nginx

Agent readiness

35/100 (Early)

MCP server

Public API

Webhooks

OAuth 2.0

SDKs

No public agent surfaces detected yet.

Alternatives

OpenAI

Creator of ChatGPT and GPT-4, and a leading frontier AI lab.

AI Chatbots, AI Developer Tools

Anthropic

AI safety lab building Claude, a helpful, harmless, honest AI assistant.

AI Chatbots, Foundation Models

Databricks

The data + AI company

AI Agents, AI Infrastructure

Safe Superintelligence

Building safe superintelligence

Foundation Models, AI Safety

Perplexity

AI-powered answer engine delivering real-time, cited responses to complex queries.

AI Search, AI Productivity

xAI

AI designed to understand the universe

AI Chatbots, AI Agents

Frequently asked

What is StepFun?

StepFun is an AI platform offering a suite of tools including knowledge-base Q&A, image creation and editing, and multimodal reasoning models. It provides specialized agents and models aimed at improving productivity across diverse tasks.

What agents and models does StepFun provide?

StepFun features specialized agents such as StepClaw for deep research, Diligence Check for data verification, and Step-3o Vision for image editing. It also offers multimodal models like Step-R1-V-mini for high-precision visual perception and complex reasoning.

Can StepFun handle images?

Yes, StepFun includes image creation and editing capabilities, with tools like Step-3o Vision for image editing and multimodal models for visual perception. This makes it suited to tasks that combine text and visual understanding.

What is StepClaw used for?

StepClaw is StepFun's agent for deep research, designed to help users investigate topics and gather information in depth. It is one of several task-specific agents on the platform.

What is StepFun best suited for?

StepFun targets users who want multimodal AI for productivity tasks, spanning research, data verification, knowledge-base question answering, and image work. Its combination of specialized agents and reasoning models is meant to assist with a broad range of activities.
