
Image search

Find images via Responses

The Responses API provides two built-in image search tools: text-to-image search finds images that match a text description, and image-to-image search finds images visually similar to an input image. Both tools return a JSON array of results and a model-generated analysis, and both are available only through the Responses API.

Text-to-image search

Search the internet for images that match a text description, then let the model describe and reason about them. Pass {"type": "web_search_image"} in the tools parameter; the model decides when to search based on the input.

Example

import os
import json
from openai import OpenAI

client = OpenAI(
  api_key=os.getenv("YOUR_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
)

response = client.responses.create(
  model="qwen3.6-plus",
  input="Find a tech-themed background image suitable for a PowerPoint cover",
  tools=[
    {
      "type": "web_search_image"
    }
  ]
)

for item in response.output:
  if item.type == "web_search_image_call":
    print(f"[Tool call] Text-to-image search (status: {item.status})")
    if item.output:
      images = json.loads(item.output)
      print(f"  Found {len(images)} images:")
      for img in images[:5]:
        print(f"  [{img['index']}] {img['title']}")
        print(f"      {img['url']}")
      if len(images) > 5:
        print(f"  ... Total {len(images)} images")
  elif item.type == "message":
    print("\n[Model response]")
    print(response.output_text)

print(f"\n[Token usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")
Sample output:
[Tool call] Text-to-image search (status: completed)
  Found 30 images:
  [1] Best Free Information Technology Background S Google Slides Themes ...
      https://image.slidesdocs.com/responsive-images/slides/0-technology-line-network-information-training-courseware-powerpoint-background_17825ea41f__960_540.jpg
  [2] Data Technology Blue Abstract Business Glow Powerpoint Background ...
      https://image.slidesdocs.com/responsive-images/background/data-technology-blue-abstract-business-glow-powerpoint-background_e667bfafcb__960_540.jpg
  ...

[Model response]
Here are several tech-themed background images perfect for your PowerPoint cover...

[Token usage] Input: 4326, Output: 645, Total: 4971

Response format

The response output array contains two item types:
web_search_image_call: The raw search results as a JSON array. Each object includes index, title, and url.
message: The model's analysis and recommendations based on the search results.
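
Decoding a web_search_image_call item comes down to parsing its output field as a JSON array. A minimal sketch on a hand-made sample payload (the payload below is illustrative, not real search output):

```python
import json

# Illustrative sample of what a web_search_image_call output string contains
sample_output = json.dumps([
    {"index": 1, "title": "Tech background A", "url": "https://example.com/a.jpg"},
    {"index": 2, "title": "Tech background B", "url": "https://example.com/b.jpg"},
])

def parse_image_results(output_str):
    """Decode the tool output into a list of {index, title, url} dicts."""
    return json.loads(output_str)

images = parse_image_results(sample_output)
print(len(images))          # 2
print(images[0]["title"])   # Tech background A
```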

Billing

Text-to-image search incurs two types of charges:
Model call fees: Image search results are added to the prompt, which increases the input token count. Standard model rates apply. See Pricing for details.
Tool calling fees: $8 per 1,000 calls.
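
As a rough illustration of how the two charge types combine, a back-of-the-envelope estimate can be sketched like this. The per-token rate below is a placeholder, not a published price; only the $8 per 1,000 calls tool fee comes from this page:

```python
def estimate_cost(num_calls, extra_input_tokens, input_rate_per_1k):
    """Estimate text-to-image search cost: tool fee plus extra input tokens.

    input_rate_per_1k is a placeholder rate; check Pricing for real rates.
    """
    tool_fee = num_calls * 8 / 1000                      # $8 per 1,000 calls
    token_fee = extra_input_tokens / 1000 * input_rate_per_1k
    return tool_fee + token_fee

# e.g. 10 calls adding 4,000 input tokens each, at a hypothetical $0.002/1K tokens
print(round(estimate_cost(10, 10 * 4000, 0.002), 4))  # 0.16
```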
Image-to-image search

Find visually similar images on the internet from an input image, then let the model analyze the results. Pass {"type": "image_search"} in the tools parameter and provide the image using the input_image content type. Optionally, include an input_text message to provide additional search context.

Example

Replace image_url in the example code with a publicly accessible image URL (the OpenAI SDK does not support local file paths).
import os
import json
from openai import OpenAI

client = OpenAI(
  api_key=os.getenv("YOUR_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
)

input_content = [
  {"type": "input_text", "text": "Find landscape images with a similar style to this one"},
  {"type": "input_image", "image_url": "https://img.alicdn.com/imgextra/i4/O1CN01YbrnSS1qtmsAkw0Ud_!!6000000005554-2-tps-788-450.png"}
]

response = client.responses.create(
  model="qwen3.6-plus",
  input=[{"role": "user", "content": input_content}],
  tools=[{"type": "image_search"}]
)

for item in response.output:
  if item.type == "image_search_call":
    print(f"[Tool call] Image-to-image search (status: {item.status})")
    if item.output:
      images = json.loads(item.output)
      print(f"  Found {len(images)} images:")
      for img in images[:5]:
        print(f"  [{img['index']}] {img['title']}")
        print(f"      {img['url']}")
      if len(images) > 5:
        print(f"  ... Total {len(images)} images")
  elif item.type == "message":
    print("\n[Model response]")
    print(response.output_text)

print(f"\n[Token usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")
Sample output:
[Tool call] Image-to-image search (status: completed)
  Found 2 images:
  [1] QingMing Festival Holiday Notice 2024
      https://www.healthcabin.net/blog/wp-content/uploads/2024/04/QingMing-Festival-Holiday-Notice-2024.jpg
  [2] Serene Asian Landscape Stone Bridge Reflecting in Misty Water
      https://thumbs.dreamstime.com/b/serene-asian-landscape-stone-bridge-reflecting-misty-water-tranquil-illustration-traditional-arch-spanning-lake-style-376972039.jpg

[Model response]
Okay, I have found several landscape images with a similar style for you...

[Token usage] Input: 2753, Output: 181, Total: 2934

Response format

The response output array contains two item types:
image_search_call: The tool call result containing a JSON array of matched images. Each object includes index, title, and url.
message: The model's analysis of the search results, accessible through response.output_text.

Billing

Image-to-image search incurs two types of charges:
Model input tokens: Search results are appended to the prompt, which increases the input token count. Billed at the model's standard rate. See Pricing for details.
Tool call fee: $8 per 1,000 calls.

Supported models

Both image search tools support the same set of models.
Qwen-Plus: qwen3.6-plus, qwen3.5-plus, qwen3.5-plus-2026-02-15
Qwen-Flash: qwen3.5-flash, qwen3.5-flash-2026-02-23
Open source Qwen: qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b

Streaming

For general streaming concepts (SSE protocol, how to enable streaming, billing, and token usage), see Streaming output. This section covers only the streaming behavior specific to image search.
Image search can take several seconds. Enable streaming to receive results incrementally by setting stream=True (Python) or stream: true (Node.js/curl). The response emits events in the following order:
response.output_item.added: Tool call starts. Display a loading indicator.
response.output_item.done: Tool call completes. Parse event.item.output as JSON to get the image list.
response.content_part.added: Model starts responding. Prepare to render streamed text.
response.output_text.delta: Model sends a text chunk. Append event.delta to the output.
response.completed: Full response ready. Read final usage statistics.
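
The event sequence above can be handled with a simple dispatch loop. The mock events below stand in for the SDK's stream objects; their attribute names follow the table, but the SimpleNamespace stand-ins are invented for illustration:

```python
import json
from types import SimpleNamespace

def handle_stream(events):
    """Dispatch streamed image-search events; returns (images, text)."""
    images, chunks = [], []
    for event in events:
        if event.type == "response.output_item.added":
            pass  # tool call started: show a loading indicator here
        elif event.type == "response.output_item.done":
            if getattr(event.item, "output", None):
                images = json.loads(event.item.output)  # the image list
        elif event.type == "response.output_text.delta":
            chunks.append(event.delta)  # accumulate streamed text
    return images, "".join(chunks)

# Mock events mimicking the documented order
mock = [
    SimpleNamespace(type="response.output_item.added", item=None),
    SimpleNamespace(type="response.output_item.done",
                    item=SimpleNamespace(output='[{"index": 1, "title": "t", "url": "u"}]')),
    SimpleNamespace(type="response.output_text.delta", delta="Here are "),
    SimpleNamespace(type="response.output_text.delta", delta="the images."),
]
imgs, text = handle_stream(mock)
print(len(imgs), text)  # 1 Here are the images.
```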

FAQs

What image formats and input methods are supported?

See Image limits for supported formats and size constraints, and File input methods for how to pass images.
The OpenAI SDK does not support local file path input.
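
Since local file paths are rejected, one common workaround is encoding the file as a base64 data URL. Whether the image_search endpoint accepts data URLs for input_image is an assumption here; verify it against File input methods before relying on this:

```python
import base64

def to_data_url(image_bytes, mime="image/png"):
    """Encode raw image bytes as a base64 data URL.

    Assumption: the endpoint accepts data URLs for input_image;
    confirm against the File input methods documentation.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

url = to_data_url(b"\x89PNG...")  # truncated bytes, for illustration only
print(url[:22])  # data:image/png;base64,
```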

How many images can I pass as input?

The total token count for all images and text must stay within the model's maximum input length. The model searches one image at a time but can invoke the tool multiple times in a single response to cover several images.
The model determines the number of images to search.
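
Passing several images is just a matter of adding more input_image entries to the content array; the model may then invoke the tool once per image. A small helper sketch (the URLs are placeholders):

```python
def build_multi_image_input(text, image_urls):
    """Build a Responses API input with one text part and N image parts."""
    content = [{"type": "input_text", "text": text}]
    content += [{"type": "input_image", "image_url": u} for u in image_urls]
    return [{"role": "user", "content": content}]

msgs = build_multi_image_input(
    "Find similar images for each of these",
    ["https://example.com/a.png", "https://example.com/b.png"],  # placeholders
)
print(len(msgs[0]["content"]))  # 3
```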

How many results does a search return?

The model determines the number of results per search. The count is not fixed, but the maximum is 100 images.