Call the Kimi K2.7 Code model through the OpenAI-compatible API or DashScope SDK on Qwen Cloud.
This guide shows how to call the Kimi K2.7 Code model via the OpenAI-compatible API or DashScope SDK.
kimi-k2.7-code is the most capable Kimi model for coding. It follows long-context instructions more reliably and achieves higher success rates on programming tasks. Supports text, image, and video input, thinking mode, conversation, and agent tasks.
The Kimi series are large language models from Moonshot AI.
If a model call fails and returns an error message, see Error codes.
Quick start
kimi-k2.7-code is the most capable Kimi model for coding. It follows long-context instructions more reliably and achieves higher success rates on programming tasks. Supports text, image, and video input, thinking mode, conversation, and agent tasks.
kimi-k2.7-code is a thinking-only model: thinking mode is always enabled (enable_thinking defaults to true and cannot be disabled), and preserve_thinking defaults to true.
Before you begin, get an API key and set it as an environment variable. If you call the model through an SDK, install the OpenAI or DashScope SDK.
- OpenAI compatible
- DashScope
The
enable_thinking parameter is not part of the standard OpenAI API. In the OpenAI Python SDK, pass it through extra_body. In the Node.js SDK, pass it as a top-level parameter.- Python
- Node.js
- curl
Supported features
| Feature | kimi-k2.7-code |
|---|---|
| Multi-turn conversation | ✓ |
| Deep thinking | ✓ (always on) |
| Function calling | ✓ |
| Structured output | — |
| Web search | — |
| Context cache | ✓ |
Parameter defaults
| Parameter | kimi-k2.7-code |
|---|---|
| enable_thinking | true (thinking mode only) |
| temperature | 1.0 |
| top_p | 0.95 |
| presence_penalty | 0.0 |
Models and billing
The Kimi series are large language models from Moonshot AI.
- kimi-k2.7-code: The most capable Kimi model for coding. It follows long-context instructions more reliably and achieves higher success rates on programming tasks. Supports text, image, and video input, thinking mode, conversation, and agent tasks.
In thinking mode, the chain of thought counts as output tokens.