OpenAI-compatible reranking

OpenAI-compatible reranking API

POST /reranks

Example request (qwen3-rerank):
curl --request POST \
  --url https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
        "model": "qwen3-rerank",
        "documents": [
                "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
                "Quantum computing is a cutting-edge field of computer science.",
                "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        "query": "What is a rerank model?",
        "top_n": 2,
        "instruct": "Given a web search query, retrieve relevant passages that answer the query."
}'

Example response:

{
  "id": "<string>",
  "results": [
    {
      "document": {
        "text": "<string>"
      },
      "index": 0,
      "relevance_score": 0.9334521178273196
    }
  ],
  "meta": {
    "tokens": {
      "input_tokens": 0,
      "output_tokens": 0
    }
  }
}
Rerank documents by semantic relevance to a query using qwen3-rerank.
Before you call the API, get an API key and set it as an environment variable (the request example reads it from DASHSCOPE_API_KEY). If you use the OpenAI SDK, install it first.
Supported model: qwen3-rerank only.

Endpoint

  • HTTP: POST https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks
  • SDK base_url: https://dashscope-intl.aliyuncs.com/compatible-api/v1
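
For readers who prefer Python to curl, the following is a minimal sketch of the same HTTP call using only the standard library. The helper names build_rerank_request and rerank are hypothetical (they are not part of any SDK), and the 30-second timeout is an arbitrary choice. The URL, headers, and body fields are taken from the curl example on this page.

```python
import json
import os
import urllib.request

# Endpoint as listed above.
RERANK_URL = "https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks"


def build_rerank_request(query, documents, api_key, top_n=None, instruct=None):
    """Build (but do not send) a POST request for the reranks endpoint."""
    payload = {
        "model": "qwen3-rerank",
        "query": query,
        "documents": list(documents),
    }
    # Optional fields are only included when the caller sets them.
    if top_n is not None:
        payload["top_n"] = top_n
    if instruct is not None:
        payload["instruct"] = instruct
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rerank(query, documents, **options):
    """Send the request and return the decoded JSON response body."""
    api_key = os.environ["DASHSCOPE_API_KEY"]  # same variable as the curl example
    request = build_rerank_request(query, documents, api_key, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)

# Usage (requires a valid key and network access):
# result = rerank("What is a rerank model?", ["doc one", "doc two"], top_n=1)
```

Separating request construction from sending keeps the payload easy to inspect or log before any network traffic happens.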

Model overview

Model: qwen3-rerank
  • Max Documents: 500
  • Max Tokens/Doc: 4,000
  • Max Request Tokens: 120,000
  • Languages: 100+ languages
  • Price (per 1M tokens): $0.1
  • Free Quota: 1M tokens (valid for 90 days)
  • Use Cases: Text semantic search, RAG

Parameter definitions:
  • Max Tokens/Doc: Maximum token count per query or document. Content exceeding this limit is truncated, which may affect ranking accuracy.
  • Max Documents: Maximum number of documents per request.
  • Max Request Tokens: Calculated as (Query Tokens × Document Count) + Total Document Tokens. The result must not exceed 120,000.
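
The limits above can be checked on the client before a request is sent. The sketch below applies the Max Request Tokens formula and the other two limits to token counts supplied by the caller. How tokens are counted is not specified on this page, so the counts are treated as an input, and the function names request_token_total and check_limits are hypothetical.

```python
MAX_DOCUMENTS = 500
MAX_TOKENS_PER_TEXT = 4_000    # applies to the query and to each document
MAX_REQUEST_TOKENS = 120_000


def request_token_total(query_tokens, document_tokens):
    """Query Tokens x Document Count + Total Document Tokens."""
    return query_tokens * len(document_tokens) + sum(document_tokens)


def check_limits(query_tokens, document_tokens):
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if len(document_tokens) > MAX_DOCUMENTS:
        problems.append(f"{len(document_tokens)} documents exceeds {MAX_DOCUMENTS}")
    if query_tokens > MAX_TOKENS_PER_TEXT:
        problems.append("query exceeds the per-text limit and would be truncated")
    too_long = [i for i, n in enumerate(document_tokens) if n > MAX_TOKENS_PER_TEXT]
    if too_long:
        problems.append(f"documents at positions {too_long} would be truncated")
    total = request_token_total(query_tokens, document_tokens)
    if total > MAX_REQUEST_TOKENS:
        problems.append(f"request total {total} exceeds {MAX_REQUEST_TOKENS}")
    return problems
```

As a worked example, a 20-token query with 100 documents of 1,000 tokens each gives 20 × 100 + 100,000 = 102,000, which fits. The same query with 120 such documents gives 20 × 120 + 120,000 = 122,400, which is over the limit even though the document count is well under 500.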

Authorizations

Authorization (string, header, required)

Qwen Cloud API Key, sent as a Bearer token: Authorization: Bearer $DASHSCOPE_API_KEY. Create one in the console.

Body

application/json
model (enum<string>, required)

Model name. Must be qwen3-rerank for the text reranking endpoint.

Allowed value: qwen3-rerank

query (string, required)

Query text. Max 4,000 tokens.

Example: What is a reranking model

documents (string[], required)

Documents to rank. An array of strings. Max 500 documents.

Example:
[
  "Reranking models are widely used in search engines and recommendation systems to sort candidates by relevance",
  "Quantum computing is a frontier field of computer science",
  "The development of pre-trained language models has brought new advances to reranking"
]

top_n (integer, optional)

Return only the top N results. If omitted, all documents are returned.

Example: 2
Range: top_n >= 1

instruct (string, optional)

Custom ranking task instruction. English is recommended. If omitted, the default is QA retrieval: "Given a web search query, retrieve relevant passages that answer the query."

Example: Given a web search query, retrieve relevant passages that answer the query.
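
The body constraints listed above can be checked locally before calling the API. This is a rough sketch: validate_rerank_body is a hypothetical helper, it only covers the rules stated on this page (required fields, types, the 500-document cap, and top_n >= 1), and the server may apply further checks that are not documented here.

```python
def validate_rerank_body(body):
    """Return a list of problems found in a request body dict (empty if none)."""
    errors = []
    if body.get("model") != "qwen3-rerank":
        errors.append("model must be 'qwen3-rerank'")

    if not isinstance(body.get("query"), str):
        errors.append("query is required and must be a string")

    documents = body.get("documents")
    if not isinstance(documents, list) or not all(isinstance(d, str) for d in documents):
        errors.append("documents is required and must be a list of strings")
    elif len(documents) > 500:
        errors.append("documents may contain at most 500 entries")

    top_n = body.get("top_n")
    if top_n is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(top_n, bool) or not isinstance(top_n, int) or top_n < 1:
            errors.append("top_n must be an integer >= 1")

    instruct = body.get("instruct")
    if instruct is not None and not isinstance(instruct, str):
        errors.append("instruct must be a string")
    return errors
```

Returning every problem at once, rather than raising on the first one, makes it easier to report all issues with a payload in a single pass.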

Response

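200 - application/json

id (string)

Unique request identifier.

results (object[])

Ranked results, sorted by relevance_score descending. As in the example response, each entry contains document.text, index, and relevance_score.

meta (object)

Token usage statistics.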
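
A response shaped like the example on this page can be turned into a ranked list with a few lines of Python. This sketch assumes that index is the position of the document in the request's documents array, which this page does not state explicitly. The function name ranked_documents is hypothetical.

```python
def ranked_documents(response, documents):
    """Return (original_index, relevance_score, text) tuples, best match first."""
    rows = []
    for item in response.get("results", []):
        index = item["index"]  # assumed: position in the request's documents array
        echoed = (item.get("document") or {}).get("text")
        # Prefer the text echoed by the API; fall back to the caller's own list.
        text = echoed if echoed is not None else documents[index]
        rows.append((index, item["relevance_score"], text))
    # Results are documented as already sorted; sorting again is a cheap safeguard.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

Keeping the original index in each tuple lets the caller join the scores back to any metadata stored alongside the input documents.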