Desktop LLM client with MCP
Cherry Studio is a desktop AI client that combines chat, MCP (Model Context Protocol) tools, and local knowledge bases. Connect it directly to Qwen Cloud's pay-as-you-go API endpoints for powerful AI capabilities with external tool integration.
Quick start
Get running in a few minutes:
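Cherry Studio talks to Qwen Cloud over an OpenAI-compatible API, so you can sanity-check your key and model ID outside the app first. A minimal sketch of the request the client sends (the model ID and prompt are examples; DASHSCOPE_API_KEY is an assumed environment variable you set yourself):

```python
import json
import os

# OpenAI-compatible endpoint used by the Alibaba Cloud provider in Cherry Studio
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble the URL, headers, and chat-completions payload for one message."""
    return {
        "url": f"{BASE_URL}/chat/completions",
        "headers": {
            # Key is read from the environment; never hard-code it
            "Authorization": f"Bearer {os.environ.get('DASHSCOPE_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        "payload": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_chat_request("qwen3.5-plus", "Explain neural networks")
print(json.dumps(req["payload"], indent=2))
# Send it with e.g. requests.post(req["url"], headers=req["headers"], json=req["payload"])
```

If the same payload works from a script but not in Cherry Studio, the problem is the app configuration rather than the key or model ID.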
Configuration
Basic setup
Configure Cherry Studio to use Qwen Cloud:
- API endpoint: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
- Authentication: API key required
- Model selection: Enter any Qwen model ID
Step-by-step configuration
1. Open settings: Click the settings button in the upper-right corner.
2. Find Alibaba Cloud: In the Model Provider list, find Alibaba Cloud.
3. Enter credentials:
   - API Key: Your API key
   - API Address: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
4. Add provider: Click Add.
5. Add models: Enter a Model ID (e.g., qwen3.5-plus, qwen3.5-flash).

Embedding model setup (for knowledge bases)
To use local knowledge bases, add an embedding model:
1. Find Qwen Cloud: In the Model Service section, find Qwen Cloud.
2. Add embedding model: Add text-embedding-v4 as the embedding model.

Note: The multimodal embedding model multimodal-embedding-v1 is not supported in Cherry Studio.

Limitations
- Thinking mode: Some Qwen3 models require thinking mode always on
- Embedding: Only text-embedding-v4 is supported, not multimodal embeddings
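text-embedding-v4 is served through the same OpenAI-compatible endpoint, so you can verify it before wiring it into a knowledge base. A sketch of the embeddings request (field names follow the standard OpenAI embeddings API; the sample text is an assumption):

```python
import json

BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"

def build_embedding_request(texts: list) -> dict:
    """Build an embeddings payload for the model Cherry Studio supports."""
    return {
        "url": f"{BASE_URL}/embeddings",
        # Only text-embedding-v4 works here; multimodal-embedding-v1 does not
        "payload": {"model": "text-embedding-v4", "input": texts},
    }

req = build_embedding_request(["chunk one of a local document"])
print(json.dumps(req["payload"]))
```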
Examples
- Basic chat: Enter a message to chat with Qwen models. Use the thinking mode toggle for thinking-capable models like qwen3.5-plus or qwen3-max.
- MCP tools integration: Select a tool, then ask questions that require it.
- Local knowledge base: Add a knowledge base, then query your local documents.
Troubleshooting
"enable_thinking is restricted to True"
Solution: Some Qwen3 models require thinking mode always on. Either update Cherry Studio or keep thinking mode enabled for these models.Charges despite free quota
Solution:Model not responding
- Free quota is region and model specific
- Each model has independent quota (qwen-max ≠ qwen-max-latest)
- Quota display updates hourly, not real-time
- Check Free quota details for coverage
- Enable Free quota only to prevent unexpected charges
Solution: Verify API key is valid and has quota. Check Model list for supported model IDs.Knowledge base embedding fails
Solution: Ensure you've added text-embedding-v4 in Model Service section and have API quota.
Advanced features
Thinking mode models
Models like qwen3.5-plus and qwen3-max support thinking mode for complex reasoning:
- Toggle thinking mode with the button in the input box
- Some models require thinking mode always on
- View thinking process in responses
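At the API level, the thinking-mode toggle maps to the enable_thinking flag (the name appears in the troubleshooting error above; the exact per-model semantics are an assumption). A sketch of how the toggle changes the request payload:

```python
def build_chat_payload(model: str, prompt: str, thinking: bool) -> dict:
    """Chat payload with the thinking-mode flag set explicitly."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Models whose thinking mode is always on reject enable_thinking=False
        # with "enable_thinking is restricted to True"
        "enable_thinking": thinking,
    }

on = build_chat_payload("qwen3.5-plus", "Prove sqrt(2) is irrational.", thinking=True)
off = build_chat_payload("qwen3.5-flash", "Quick summary, please.", thinking=False)
print(on["enable_thinking"], off["enable_thinking"])
```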
MCP server options
ModelScope provides various MCP servers:
- Fetch: Web scraping and content extraction
- File operations: Local file management
- API connectors: External service integration
- No self-hosting required
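Hosted ModelScope servers are typically added in Cherry Studio's MCP settings by URL, with no local process to run. The shape below follows the common mcpServers convention; the server name and URL are placeholders you replace with the values shown on the server's ModelScope page:

```json
{
  "mcpServers": {
    "fetch": {
      "type": "sse",
      "url": "<sse-url-from-the-modelscope-server-page>"
    }
  }
}
```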
Knowledge base tips
- All data stored locally on your machine
- Supports PDF, TXT, MD, and web URLs
- Use for documentation, research papers, or internal docs
- Combine with MCP tools for powerful workflows
Related resources
- Models: Available models and pricing →
- Thinking mode: Reasoning models guide →
- API docs: OpenAI-compatible reference →
- MCP servers: ModelScope MCP collection →