OpenAI compatible reranking

OpenAI-compatible reranking API

POST /reranks

Example request (qwen3-rerank):
curl --request POST \
  --url https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
        "model": "qwen3-rerank",
        "documents": [
                "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
                "Quantum computing is a cutting-edge field of computer science.",
                "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        "query": "What is a rerank model?",
        "top_n": 2,
        "instruct": "Given a web search query, retrieve relevant passages that answer the query."
}'

Example response:
{
  "id": "<string>",
  "object": "list",
  "model": "qwen3-rerank",
  "results": [
    {
      "document": {
        "text": "<string>"
      },
      "index": 0,
      "relevance_score": 0.9334521178273196
    }
  ],
  "usage": {
    "total_tokens": 0
  }
}
Rerank documents by semantic relevance to a query using qwen3-rerank.
The gte-rerank model will be discontinued on May 30, 2026. Switch to qwen3-rerank for continued service.
Before you call the API, get an API key and set it as an environment variable; the curl example above reads it from DASHSCOPE_API_KEY. If you use the OpenAI SDK, install it first.
Supported model: qwen3-rerank only.

Endpoint

  • HTTP: POST https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks
  • SDK base_url: https://dashscope-intl.aliyuncs.com/compatible-api/v1
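
As a rough Python counterpart to the curl example, the sketch below posts a JSON body to the HTTP endpoint using only the standard library. The function names `build_rerank_request` and `rerank` and the 30-second timeout are illustrative choices, not part of the API; only the URL, headers, and body fields come from this page.

```python
import json
import urllib.request

# Endpoint taken from the "Endpoint" section of this page.
RERANK_URL = "https://dashscope-intl.aliyuncs.com/compatible-api/v1/reranks"


def build_rerank_request(api_key, payload):
    """Build (but do not send) a POST request carrying the JSON payload."""
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rerank(api_key, payload, timeout=30):
    """Send the request and return the decoded JSON response body.

    The timeout value is an arbitrary choice for this sketch.
    urllib raises urllib.error.HTTPError on non-2xx responses.
    """
    request = build_rerank_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `rerank(os.environ["DASHSCOPE_API_KEY"], {"model": "qwen3-rerank", "query": "...", "documents": ["...", "..."]})`, assuming the key is exported as in the curl example.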

Model overview

qwen3-rerank
  • Max Documents: 500
  • Max Tokens/Doc: 4,000
  • Max Request Tokens: 120,000
  • Languages: 100+ languages
  • Price (per 1M tokens): $0.1
  • Free Quota: 1M tokens (valid for 90 days)
  • Use Cases: Text semantic search, RAG

Parameter definitions:
  • Max Tokens/Doc: Maximum token count per query or document. Content exceeding this limit is truncated, which may affect ranking accuracy.
  • Max Documents: Maximum number of documents per request.
  • Max Request Tokens: Calculated as Query Tokens × Document Count + Total Document Tokens. The result must not exceed the limit.
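
The request-token formula can be checked before sending a request. The sketch below applies it to token counts you supply; how you count tokens is left open here, since this page does not name a tokenizer, and the helper names are hypothetical. For example, a 20-token query against 100 documents of 1,000 tokens each gives 20 × 100 + 100,000 = 102,000 tokens, which fits under the 120,000 limit.

```python
# Limits copied from the model overview above.
MAX_DOCUMENTS = 500
MAX_TOKENS_PER_TEXT = 4_000
MAX_REQUEST_TOKENS = 120_000


def request_tokens(query_tokens, document_tokens):
    """Query Tokens x Document Count + Total Document Tokens."""
    return query_tokens * len(document_tokens) + sum(document_tokens)


def check_limits(query_tokens, document_tokens):
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if len(document_tokens) > MAX_DOCUMENTS:
        problems.append(f"{len(document_tokens)} documents exceeds {MAX_DOCUMENTS}")
    if query_tokens > MAX_TOKENS_PER_TEXT:
        problems.append("query exceeds the per-text limit and would be truncated")
    oversized = [i for i, n in enumerate(document_tokens) if n > MAX_TOKENS_PER_TEXT]
    if oversized:
        problems.append(f"documents at positions {oversized} would be truncated")
    total = request_tokens(query_tokens, document_tokens)
    if total > MAX_REQUEST_TOKENS:
        problems.append(f"request uses {total} tokens, limit is {MAX_REQUEST_TOKENS}")
    return problems
```

Because the query is counted once per document, a large batch of short documents can exceed the request limit even when each document is far below the per-document limit.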

Authorizations

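Authorization
string, header, required

Qwen Cloud API Key, sent as a Bearer token as in the example request: "Authorization: Bearer $DASHSCOPE_API_KEY". Create one in the console.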

Body

application/json
model
enum<string>, required

Model name. Must be qwen3-rerank for the text reranking endpoint.

Example: qwen3-rerank
query
string, required

Query text. Max 4,000 tokens.

Example: What is a reranking model
documents
string[], required

Documents to rank, as an array of strings. Max 500 documents.

Example:
[
  "Reranking models are widely used in search engines and recommendation systems to sort candidates by relevance",
  "Quantum computing is a frontier field of computer science",
  "The development of pre-trained language models has brought new advances to reranking"
]
top_n
integer, optional

Return only the top N results. If omitted, all documents are returned.

Example: 2
Constraint: top_n >= 1
instruct
string, optional

Custom ranking task instruction. English is recommended. If omitted, the default behavior is QA retrieval: "Given a web search query, retrieve relevant passages that answer the query."

Example: Given a web search query, retrieve relevant passages that answer the query.
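
The body fields above can be assembled with a small helper that leaves out the optional fields when they are not set, so the service applies its own defaults. This is a sketch only: `build_rerank_payload` is a hypothetical name, and the client-side checks simply mirror the constraints stated on this page (string documents, at most 500 of them, top_n of at least 1).

```python
def build_rerank_payload(query, documents, top_n=None, instruct=None,
                         model="qwen3-rerank"):
    """Assemble the JSON body, omitting optional fields that are unset."""
    documents = list(documents)
    if not all(isinstance(doc, str) for doc in documents):
        raise TypeError("documents must be an array of strings")
    if len(documents) > 500:
        raise ValueError("at most 500 documents are allowed per request")

    payload = {"model": model, "query": query, "documents": documents}
    if top_n is not None:
        if top_n < 1:
            raise ValueError("top_n must be >= 1")
        payload["top_n"] = top_n
    if instruct is not None:
        payload["instruct"] = instruct
    return payload
```

Checking locally avoids a round trip for requests that would be rejected anyway; token limits are not checked here because they depend on the model's tokenizer.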

Response

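200 - application/json

id
string

Unique request identifier.

object
string

Object type. Always list.

Example: list

model
string

Model used for reranking.

Example: qwen3-rerank

results
object[]

Ranked results, sorted by relevance_score descending. In the example response, each entry carries document.text, index, and relevance_score.

usage
object

Token usage statistics. In the example response, this object contains total_tokens.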
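
To use the ranked results, a client typically maps each entry back to the text it submitted. The sketch below does this under one assumption that this page does not state explicitly: that `index` is the zero-based position of the document in the request's documents array. The helper name `ranked_documents` is hypothetical.

```python
def ranked_documents(response, documents):
    """Return (relevance_score, index, text) tuples, highest score first.

    Assumes each result's "index" is the zero-based position of the
    document in the request's "documents" array.
    """
    rows = []
    for item in response["results"]:
        position = item["index"]
        rows.append((item["relevance_score"], position, documents[position]))
    # Results are documented as already sorted; sorting again is a
    # defensive step that keeps the output order independent of that.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows
```

Looking the text up by index rather than reading document.text keeps the helper usable even if a response omits the document body.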