What\u2019s new at Groq
Fastest AI inference on the planet
- feature major
Prompt Caching added
Prompt caching automatically reuses computation from recent requests when they share a common prefix, delivering 50% cost savings for cached portions and improved response times. The feature works automatically with no code changes required and data expires within hours for privacy.
- improvement minor
Python SDK v0.31.1, TypeScript SDK v0.32.0 changed
Updated Python SDK to v0.31.1 and TypeScript SDK to v0.32.0 with improved chat completion message type definitions and added support for new Groq Compound tools. Fixes compatibility issues with different message formats.
- feature major
Moonshot AI Kimi K2 Instruct 0905 added
Kimi K2-0905 brings Moonshot AI's cutting-edge model to GroqCloud with day zero support, featuring a 256K context window and enhanced agentic coding capabilities. The model offers prompt caching with up to 50% cost savings and runs at 200+ tokens/second.
- feature major
Prompt Caching
Automatic prompt caching feature that reuses computation from recent requests with common prefixes, delivering 50% cost savings on cached tokens and improved response times. Initially available for Kimi K2 model.
- improvement minor
Python SDK v0.31.1, TypeScript SDK v0.32.0
SDK updates with improved chat completion message type definitions for better OpenAI compatibility and added support for new Groq Compound tools including Wolfram Alpha and Browser Automation.
- feature major
Groq Compound and Compound Mini
Production-ready agentic AI systems moved from beta to general availability, integrating web search, code execution, and browser automation in a single API call. Built on GPT-OSS-120B and Llama models.
- feature major
Moonshot AI Kimi K2 Instruct 0905
New model release with 256K context window, prompt caching support, and enhanced agentic coding capabilities. Offers 200+ t/s performance at $1.50/M tokens blended pricing.
- feature major
Remote Model Context Protocol (MCP)
Beta release of Remote MCP server integration on GroqCloud, enabling AI models to connect to thousands of external tools through Anthropic's open MCP standard. Fully compatible with OpenAI APIs for seamless migration.
- improvement minor
Python SDK v0.31.1, TypeScript SDK v0.32.0 changed
Updated SDKs with improved chat completion message type definitions for better OpenAI compatibility and added support for new Groq Compound tools including Wolfram Alpha and Browser Automation.
- feature major
Groq Compound and Compound Mini added
Compound and Compound Mini are Groq's production-ready agentic AI systems that integrate web search, code execution, and browser automation into a single API call. Moving from beta to general availability, these systems deliver ~25% higher accuracy and ~50% fewer mistakes across benchmarks.
- feature major
Moonshot AI Kimi K2 Instruct 0905 added
Kimi K2-0905 brings Moonshot AI's cutting-edge model to GroqCloud with enhanced agentic coding capabilities and a 256K context window. The model delivers improved frontend development performance and includes prompt caching for up to 50% cost savings.
- feature major
Remote Model Context Protocol (MCP) added
Remote Model Context Protocol (MCP) server integration is now available in Beta on GroqCloud, connecting AI models to thousands of external tools through Anthropic's open MCP standard. Developers can connect any remote MCP server to models hosted on GroqCloud with zero code changes from OpenAI.
- improvement minor
Python SDK v0.31.1, TypeScript SDK v0.32.0
Updated SDKs with improved chat completion message type definitions for better OpenAI compatibility and added support for new Groq Compound tools including Wolfram Alpha and Browser Automation.
- feature major
Groq Compound and Compound Mini
Compound and Compound Mini are Groq's production-ready agentic AI systems that integrate web search, code execution, and browser automation into a single API call. Moving from beta to general availability, these systems deliver ~25% higher accuracy and ~50% fewer mistakes across benchmarks.
- feature major
Moonshot AI Kimi K2 Instruct 0905
Kimi K2-0905 brings Moonshot AI's cutting-edge model to GroqCloud with day zero support, featuring a 256K context window and enhanced agentic coding capabilities. The model offers prompt caching with up to 50% cost savings and runs at 200+ t/s at $1.50/M tokens blended pricing.
- feature major
Remote Model Context Protocol (MCP)
Remote Model Context Protocol (MCP) server integration is now available in Beta on GroqCloud, connecting AI models to thousands of external tools through Anthropic's open MCP standard. Developers can connect any remote MCP server to models hosted on GroqCloud with zero code changes from OpenAI.