
Add visual understanding capabilities

Vision for coding models

Models such as qwen3.5-plus and kimi-k2.5 support image understanding natively. Text-only models (glm-5, MiniMax-M2.5) need a local skill for visual capabilities.
Image understanding skills consume Coding Plan quota; no additional charges apply.

Prerequisites

  1. Subscribe to Coding Plan. See Getting started.
  2. Set up Coding Plan. See your tool's setup guide.

Visual support status

Native support: qwen3.5-plus and kimi-k2.5 accept images directly; no configuration is needed.
Via skill or agent: qwen3-max-2026-01-23, qwen3-coder-next, qwen3-coder-plus, glm-5, glm-4.7, and MiniMax-M2.5 require a skill or agent for visual capabilities.

Method 1: Switch to a model with native visual support

If you frequently work with images, switch to qwen3.5-plus or kimi-k2.5:
  • Claude Code: /model qwen3.5-plus or /model kimi-k2.5
  • OpenCode: /models, then search for and select qwen3.5-plus or kimi-k2.5
  • Qwen Code: /model, then select qwen3.5-plus or kimi-k2.5
Qwen Code uses an OpenAI-compatible API that doesn't support image input. For image understanding tasks, use Claude Code or OpenCode instead.
Reference image paths directly or drag images into conversations.
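Under the hood, tools built on the Anthropic-style Messages API pass images as base64 content blocks. The sketch below builds such a payload by hand; the exact schema accepted by the Coding Plan endpoint is an assumption based on the standard Anthropic message format, and `image/png` is assumed as the media type:

```python
import base64


def build_image_message(image_path: str, prompt: str) -> dict:
    """Build an Anthropic-style messages payload pairing an image
    with a text prompt (media type image/png assumed)."""
    with open(image_path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": "qwen3.5-plus",
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": "image/png",
                            "data": data}},
                {"type": "text", "text": prompt},
            ],
        }],
    }
```

The returned dict can be POSTed as JSON to an Anthropic-compatible /v1/messages endpoint with your Coding Plan API key.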

Method 2: Add visual capabilities using a skill or agent

For text-only models (glm-5, MiniMax-M2.5), configure a skill or agent.

Add a skill

  1. Create a skills/image-analyzer folder in the .claude directory:
mkdir -p .claude/skills/image-analyzer
  2. Create SKILL.md:
---
name: image-analyzer
description: Analyzes images for text-only models. Use when you need to extract information from screenshots, charts, diagrams, or any visual content. Pass the image path.
model: qwen3.5-plus
---
qwen3.5-plus has visual understanding capabilities. Use the qwen3.5-plus model directly for image understanding.
  3. Folder structure:
.claude/
└── skills/
  └── image-analyzer/
    └── SKILL.md
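The steps above can be run as one shell snippet; the heredoc body mirrors the SKILL.md shown above:

```shell
# Create the skill folder and write SKILL.md in one go
mkdir -p .claude/skills/image-analyzer
cat > .claude/skills/image-analyzer/SKILL.md <<'EOF'
---
name: image-analyzer
description: Analyzes images for text-only models. Use when you need to extract information from screenshots, charts, diagrams, or any visual content. Pass the image path.
model: qwen3.5-plus
---
qwen3.5-plus has visual understanding capabilities. Use the qwen3.5-plus model directly for image understanding.
EOF
```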

Getting started

  1. Start Claude Code in your project directory. Switch to glm-5 with /model glm-5.
  2. Place an image in your project directory, then ask: Load the image-analyzer skill and describe the information in <your-image>.

FAQ

OpenCode can't read images

Cause: OpenCode doesn't enable visual capabilities by default; modalities must be declared in the configuration.
Solution: In the OpenCode configuration, add a modalities field and set input to ["text", "image"]. Replace sk-sp-xxx with your API key:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "qwen-cloud-coding-plan": {
      "npm": "@ai-sdk/anthropic",
      "name": "Qwen Cloud Coding Plan",
      "options": {
        "baseURL": "https://coding-intl.dashscope.aliyuncs.com/apps/anthropic/v1",
        "apiKey": "sk-sp-xxx"
      },
      "models": {
        "qwen3.5-plus": {
          "name": "Qwen3.5 Plus",
          "modalities": {
            "input": [
              "text",
              "image"
            ],
            "output": [
              "text"
            ]
          },
          "options": {
            "thinking": {
              "type": "enabled",
              "budgetTokens": 1024
            }
          }
        },
        "kimi-k2.5": {
          "name": "Kimi K2.5",
          "modalities": {
            "input": [
              "text",
              "image"
            ],
            "output": [
              "text"
            ]
          },
          "options": {
            "thinking": {
              "type": "enabled",
              "budgetTokens": 1024
            }
          }
        }
      }
    }
  }
}
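If you manage the configuration programmatically, the required modalities block can be patched in before restarting OpenCode. This is a small sketch; the `enable_image_input` helper is my own illustration, not part of OpenCode, and it assumes the provider/models layout shown in the config above:

```python
def enable_image_input(config: dict, model_ids) -> dict:
    """Add image input modalities to the named models in an
    OpenCode-style provider config (modified in place and returned)."""
    for provider in config.get("provider", {}).values():
        for model_id, model in provider.get("models", {}).items():
            if model_id in model_ids:
                modalities = model.setdefault("modalities", {})
                modalities["input"] = ["text", "image"]
                modalities.setdefault("output", ["text"])
    return config
```

Load the JSON config, call the helper for qwen3.5-plus and kimi-k2.5, and write the file back before restarting OpenCode.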
OpenClaw can't read images

Cause: OpenClaw determines visual support from the input field in the model configuration.
Solution:
  1. In ~/.openclaw/openclaw.json, ensure the model definition includes "input": ["text", "image"]:
{
  "models": {
    "mode": "merge",
    "providers": {
      "bailian": {
        "baseUrl": "https://coding-intl.dashscope.aliyuncs.com/v1",
        "apiKey": "sk-sp-xxx",
        "api": "openai-completions",
        "models": [
          {
            "id": "qwen3.5-plus",
            "name": "qwen3.5-plus",
            "reasoning": false,
            "input": ["text", "image"],
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
            "contextWindow": 1000000,
            "maxTokens": 65536
          },
          {
            "id": "kimi-k2.5",
            "name": "kimi-k2.5",
            "reasoning": false,
            "input": ["text", "image"],
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
            "contextWindow": 262144,
            "maxTokens": 32768
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "bailian/qwen3.5-plus"
      },
      "models": {
        "bailian/qwen3.5-plus": {},
        "bailian/kimi-k2.5": {}
      }
    }
  },
  "gateway": {
    "mode": "local"
  }
}
  2. Clear the model cache and restart OpenClaw:
rm ~/.openclaw/agents/main/agent/models.json
openclaw gateway restart
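If the error persists after the restart, it's worth confirming that every model in the edited file actually declares image input. This sketch walks the OpenClaw-style config structure shown above and lists any models missing "image"; the helper name is my own:

```python
import json
from pathlib import Path


def openclaw_models_missing_image(config: dict) -> list:
    """Return IDs of models whose input list doesn't include 'image'."""
    missing = []
    for provider in config.get("models", {}).get("providers", {}).values():
        for model in provider.get("models", []):
            if "image" not in model.get("input", []):
                missing.append(model.get("id", "<unknown>"))
    return missing


# Example usage:
# config = json.loads(Path("~/.openclaw/openclaw.json").expanduser().read_text())
# print(openclaw_models_missing_image(config))  # empty list means all models are set
```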