Text embeddings

Convert data to vectors

Prerequisites

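Get an API key and set it as an environment variable (the examples on this page read it from DASHSCOPE_API_KEY). If you call the API through an SDK rather than over raw HTTP, install that SDK first.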

Get embeddings

Text embedding

To make an API request, specify the text to embed and the model to use.
OpenAI compatible API (Python):
import os
from openai import OpenAI

input_text = "The quality of the clothes is excellent"

client = OpenAI(
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

completion = client.embeddings.create(
  model="text-embedding-v4",
  input=input_text
)

print(completion.model_dump_json())

Supported models

| Model | Embedding dimensions | Batch size | Max tokens per batch | Supported languages |
| --- | --- | --- | --- | --- |
| text-embedding-v4 (Part of the Qwen3-Embedding series) | 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, 64 | 10 | 8,192 | 100+ major languages, including Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, Russian, and multiple programming languages |
| text-embedding-v3 | 1,024 (default), 768, 512 | 10 | 8,192 | 50+ major languages, including Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian |

Batch size is the maximum number of texts per API call. For example, text-embedding-v4 has a batch size of 10, which lets you include up to 10 texts for vectorization per request, where each text is limited to 8,192 tokens. This limit applies to:
  • String array input: The array can contain a maximum of 10 elements.
  • File input: The text file can contain a maximum of 10 lines of text.
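
Inputs with more texts than the batch size have to be split across several requests. A minimal sketch of that splitting follows; `chunk_texts` is a hypothetical helper (not part of any SDK), and the batch size of 10 is taken from the table above.

```python
from typing import Iterator, List

BATCH_SIZE = 10  # per-request limit listed for text-embedding-v4 and text-embedding-v3


def chunk_texts(texts: List[str], batch_size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# 23 texts are split into three requests.
texts = [f"review {i}" for i in range(23)]
batches = list(chunk_texts(texts))
print([len(batch) for batch in batches])
```

Each batch can then be passed as the `input` of one embeddings request. The helper does not check the per-text token limit.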

Core features

Switch embedding dimensions

text-embedding-v4 and text-embedding-v3 support custom embedding dimensions. Higher dimensions retain richer semantic information but also increase storage and computation costs.
  • General scenarios (Recommended): 1024 dimensions offer the best balance between performance and cost, suitable for most semantic retrieval tasks.
  • High-precision scenarios: For applications that require high precision, choose 1536 or 2048 dimensions. This improves precision but significantly increases storage and computation overhead.
  • Resource-constrained scenarios: In cost-sensitive scenarios, choose 768 or lower dimensions. This significantly reduces resource consumption but results in some loss of semantic information.
OpenAI compatible API (Python):
import os
from openai import OpenAI

client = OpenAI(
  api_key=os.getenv("DASHSCOPE_API_KEY"),
  base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

resp = client.embeddings.create(
  model="text-embedding-v4",
  input=["I like it and will buy from here again"],
  # Set the embedding dimensions to 256
  dimensions=256
)
print(f"Embedding dimensions: {len(resp.data[0].embedding)}")
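
To put rough numbers on the storage side of this trade-off, the sketch below estimates raw vector storage for a few dimension options. It assumes every component is stored as a 4-byte float32 and ignores index overhead; both are assumptions about your vector store, not properties of the API, and `storage_mib` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def storage_mib(num_vectors: int, dimensions: int) -> float:
    """Raw storage in MiB for `num_vectors` embeddings of `dimensions` floats."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32 / (1024 ** 2)


# Compare a few of the dimension options for one million documents.
for dims in (256, 1024, 2048):
    print(f"{dims:>5} dims: {storage_mib(1_000_000, dims):,.1f} MiB")
```

Raw storage grows linearly with the dimension count, so under these assumptions 2,048 dimensions cost twice as much as 1,024.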

Distinguish between query and document text (text_type)

This parameter can currently only be enabled through the DashScope SDK and API.
In search-related tasks, queries and documents play different roles, so embedding each with a role-specific setting gives better results than treating them identically. The text_type parameter is designed for this purpose:
  • text_type: 'query': Use for user-provided query text. The model generates a "title-like" vector that is more directional and optimized for information retrieval.
  • text_type: 'document' (default): Use for document text stored in the database. The model generates a "body-like" vector that contains more comprehensive information and is optimized for being retrieved.
When using short text to match long text, distinguish between query and document. For tasks where all texts have the same role, such as clustering or classification, you do not need to set this parameter.
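
A minimal sketch of role-specific embedding with text_type, using the same `dashscope.TextEmbedding.call` entry point and response layout as the other DashScope examples on this page. The helper names `embedding_kwargs` and `embed` are hypothetical, and the SDK is imported lazily so the argument-building part can be read and run without it.

```python
from typing import Any, Dict, List

VALID_TEXT_TYPES = ("query", "document")


def embedding_kwargs(texts: List[str], text_type: str) -> Dict[str, Any]:
    """Build keyword arguments for TextEmbedding.call with a given text_type."""
    if text_type not in VALID_TEXT_TYPES:
        raise ValueError(f"text_type must be one of {VALID_TEXT_TYPES}")
    return {"model": "text-embedding-v4", "input": texts, "text_type": text_type}


def embed(texts: List[str], text_type: str) -> List[List[float]]:
    """Call the DashScope SDK and return the raw vectors (needs network access)."""
    import dashscope  # requires the dashscope package and DASHSCOPE_API_KEY

    dashscope.base_http_api_url = "https://dashscope-intl.aliyuncs.com/api/v1"
    resp = dashscope.TextEmbedding.call(**embedding_kwargs(texts, text_type))
    return [item["embedding"] for item in resp.output["embeddings"]]


# Stored passages are embedded as documents, the user's search text as a query.
print(embedding_kwargs(["Deep learning is a subfield of machine learning"], "document"))
print(embedding_kwargs(["What is deep learning?"], "query"))
```

With the SDK installed and the key set, `embed(passages, "document")` would be run once at indexing time and `embed([user_input], "query")` at search time.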

Use instructions to improve performance (instruct)

This parameter can currently only be enabled through the DashScope SDK and API.
You can provide a clear English task instruction to guide text-embedding-v4 in optimizing vector quality for specific retrieval scenarios, improving precision. When using this feature, you must set the text_type parameter to query.
import dashscope

# Example: add an instruction to optimize retrieval quality when building the query vector.
# instruct requires text_type="query".
resp = dashscope.TextEmbedding.call(
  model="text-embedding-v4",
  input="Research papers on machine learning",
  text_type="query",
  instruct="Given a research paper query, retrieve relevant research paper"
)

Dense and sparse vectors

The output_type parameter is currently available only through the DashScope SDK and API.
text-embedding-v4 and text-embedding-v3 support three types of vector output to accommodate different retrieval strategies.
| Vector type (output_type) | Core advantages | Main drawbacks | Typical application scenarios |
| --- | --- | --- | --- |
| dense | Deep semantic understanding that identifies synonyms and context for more relevant results. | Higher computational and storage costs. Does not guarantee an exact match for keywords. | Semantic search, AI chat, content recommendation. |
| sparse | High computational efficiency, focusing on an exact match for keywords and enabling fast filtering. | Lacks semantic understanding and cannot process synonyms or context. | Log retrieval, product SKU search, precise information filtering. |
| dense&sparse | Combines semantic and keyword matching for optimal search results. The generation cost is unchanged, and the API call overhead is identical to the single-vector mode. | Large storage requirements. More complex system architecture and retrieval logic. | High-quality, production-grade hybrid search engines. |
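
The extra retrieval logic that dense&sparse calls for can be as simple as a weighted sum of a dense cosine score and a sparse dot-product score. The sketch below is one such combination under stated assumptions, not the service's own ranking method: sparse vectors are assumed to have been converted to `{token_index: weight}` dictionaries, and `hybrid_score` with its `alpha` weight is hypothetical.

```python
import math
from typing import Dict, List

SparseVector = Dict[int, float]  # assumed layout: token index -> weight


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product over the token indices that both sparse vectors share."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dictionary
    return sum(weight * b[idx] for idx, weight in a.items() if idx in b)


def hybrid_score(q_dense: List[float], d_dense: List[float],
                 q_sparse: SparseVector, d_sparse: SparseVector,
                 alpha: float = 0.7) -> float:
    """Blend semantic and keyword relevance; alpha weights the dense part."""
    return alpha * cosine(q_dense, d_dense) + (1 - alpha) * sparse_dot(q_sparse, d_sparse)


# Toy vectors: same dense direction, one shared keyword (index 42).
score = hybrid_score([1.0, 0.0], [2.0, 0.0], {42: 0.5, 7: 0.1}, {42: 0.8})
print(round(score, 3))
```

Sparse dot products are not bounded the way cosine similarity is, so in practice the sparse score would need normalizing, or `alpha` tuning, on your own data.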

Use examples

The following code is for demonstration purposes only. For production, pre-compute and store embeddings in a vector database. This way, you only need to generate the query embedding for retrieval.

Semantic search

Perform precise semantic matching by calculating the similarity between the query embedding and the document embeddings.
import dashscope
import numpy as np
from dashscope import TextEmbedding

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

def cosine_similarity(a, b):
  """Calculate cosine similarity"""
  return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def semantic_search(query, documents, top_k=5):
  """Perform semantic search"""
  # Generate the query vector
  query_resp = TextEmbedding.call(
    model="text-embedding-v4",
    input=query,
    dimension=1024
  )
  query_embedding = query_resp.output['embeddings'][0]['embedding']

  # Generate the document vectors
  doc_resp = TextEmbedding.call(
    model="text-embedding-v4",
    input=documents,
    dimension=1024
  )

  # Calculate similarities
  similarities = []
  for i, doc_emb in enumerate(doc_resp.output['embeddings']):
    similarity = cosine_similarity(query_embedding, doc_emb['embedding'])
    similarities.append((i, similarity))

  # Sort and return the top_k results
  similarities.sort(key=lambda x: x[1], reverse=True)
  return [(documents[i], sim) for i, sim in similarities[:top_k]]

# Example usage
documents = [
  "Artificial intelligence is a branch of computer science",
  "Machine learning is an important method for achieving artificial intelligence",
  "Deep learning is a subfield of machine learning"
]
query = "What is AI?"
results = semantic_search(query, documents, top_k=2)
for doc, sim in results:
  print(f"Similarity: {sim:.3f}, Document: {doc}")
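
The production advice above (embed documents once, then embed only the query at request time) can be sketched with a small in-memory index standing in for a vector database. `VectorIndex` is a hypothetical class, and the toy 3-dimensional vectors stand in for embeddings returned by the API.

```python
import math
from typing import List, Tuple


def normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


class VectorIndex:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._docs: List[str] = []
        self._vectors: List[List[float]] = []

    def add(self, doc: str, embedding: List[float]) -> None:
        """Store a document with its precomputed, normalized embedding."""
        self._docs.append(doc)
        self._vectors.append(normalize(embedding))

    def search(self, query_embedding: List[float], top_k: int = 5) -> List[Tuple[str, float]]:
        """Return up to top_k documents ranked by cosine similarity."""
        q = normalize(query_embedding)
        scored = [
            (doc, sum(a * b for a, b in zip(q, vec)))
            for doc, vec in zip(self._docs, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Indexing happens once, offline; only the query vector is computed per request.
index = VectorIndex()
index.add("doc about AI", [0.9, 0.1, 0.0])
index.add("doc about cooking", [0.0, 0.2, 0.9])
print(index.search([1.0, 0.0, 0.0], top_k=1)[0][0])
```

Normalizing at insert time means each search needs only dot products, which is the same shortcut many vector stores apply for cosine similarity.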

Recommendation system

Embed the items in a user's behavioral history, average those embeddings into a single preference vector, and recommend the candidate items most similar to it.
import dashscope
import numpy as np
from dashscope import TextEmbedding

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

def cosine_similarity(a, b):
  """Calculate cosine similarity"""
  return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def build_recommendation_system(user_history, all_items, top_k=10):
  """Build a recommendation system"""
  # Generate user history vectors
  history_resp = TextEmbedding.call(
    model="text-embedding-v4",
    input=user_history,
    dimension=1024
  )

  # Calculate the user preference vector by averaging
  user_embedding = np.mean([
    emb['embedding'] for emb in history_resp.output['embeddings']
  ], axis=0)

  # Generate all item vectors
  items_resp = TextEmbedding.call(
    model="text-embedding-v4",
    input=all_items,
    dimension=1024
  )

  # Calculate recommendation scores
  recommendations = []
  for i, item_emb in enumerate(items_resp.output['embeddings']):
    score = cosine_similarity(user_embedding, item_emb['embedding'])
    recommendations.append((all_items[i], score))

  # Sort and return the recommendation results
  recommendations.sort(key=lambda x: x[1], reverse=True)
  return recommendations[:top_k]

# Example usage
user_history = ["Science Fiction", "Action", "Suspense"]
all_movies = ["Future World", "Space Adventure", "Ancient War", "Romantic Journey", "Superhero"]
recommendations = build_recommendation_system(user_history, all_movies)
for movie, score in recommendations:
  print(f"Recommendation Score: {score:.3f}, Movie: {movie}")

Text clustering

Group similar texts by analyzing the distances between their embeddings.
# scikit-learn is required: pip install scikit-learn
import dashscope
import numpy as np
from sklearn.cluster import KMeans

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

def cluster_texts(texts, n_clusters=2):
  """Cluster a set of texts"""
  # 1. Get the vectors for all texts
  resp = dashscope.TextEmbedding.call(
    model="text-embedding-v4",
    input=texts,
    dimension=1024
  )
  embeddings = np.array([item['embedding'] for item in resp.output['embeddings']])

  # 2. Use the KMeans algorithm for clustering
  kmeans = KMeans(n_clusters=n_clusters, random_state=0, n_init='auto').fit(embeddings)

  # 3. Organize and return the results
  clusters = {i: [] for i in range(n_clusters)}
  for i, label in enumerate(kmeans.labels_):
    clusters[label].append(texts[i])
  return clusters


# Example usage
documents_to_cluster = [
  "Mobile phone company A releases a new phone",
  "Search engine company B launches a new system",
  "World Cup final: Argentina vs. France",
  "China wins another gold medal at the Olympics",
  "A company releases its latest AI chip",
  "European Cup match report"
]
clusters = cluster_texts(documents_to_cluster, n_clusters=2)
for cluster_id, docs in clusters.items():
  print(f"--- Cluster {cluster_id} ---")
  for doc in docs:
    print(f"- {doc}")

Text classification

Perform zero-shot text classification by calculating the similarity between an input text's embedding and predefined label embeddings. This process classifies text into new categories without requiring pre-labeled examples.
import dashscope
import numpy as np

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

def cosine_similarity(a, b):
  """Calculate cosine similarity"""
  return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def classify_text_zero_shot(text, labels):
  """Perform zero-shot text classification"""
  # 1. Get the vectors for the input text and all labels
  resp = dashscope.TextEmbedding.call(
    model="text-embedding-v4",
    input=[text] + labels,
    dimension=1024
  )
  embeddings = resp.output['embeddings']
  text_embedding = embeddings[0]['embedding']
  label_embeddings = [emb['embedding'] for emb in embeddings[1:]]

  # 2. Calculate the similarity with each label
  scores = [cosine_similarity(text_embedding, label_emb) for label_emb in label_embeddings]

  # 3. Return the label with the highest similarity
  best_match_index = np.argmax(scores)
  return labels[best_match_index], scores[best_match_index]


# Example usage
text_to_classify = "The fabric of this dress is comfortable, and the style is nice too"
possible_labels = ["Digital Products", "Apparel & Accessories", "Food & Beverage", "Home & Living"]

label, score = classify_text_zero_shot(text_to_classify, possible_labels)
print(f"Input text: '{text_to_classify}'")
print(f"Best matching category: '{label}' (Similarity: {score:.3f})")

Anomaly detection

Identify anomalous data by calculating the similarity between a text's embedding and the central embedding of normal samples. Data that significantly deviates from this pattern is considered an anomaly.
The threshold in the example is for demonstration purposes. The ideal value varies based on data content and distribution, so you must calibrate it using your own dataset.
import dashscope
import numpy as np

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'


def cosine_similarity(a, b):
  """Calculate cosine similarity"""
  return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def detect_anomaly(new_comment, normal_comments, threshold=0.6):
  # 1. Vectorize all normal comments and the new comment
  all_texts = normal_comments + [new_comment]
  resp = dashscope.TextEmbedding.call(
    model="text-embedding-v4",
    input=all_texts,
    dimension=1024
  )
  embeddings = [item['embedding'] for item in resp.output['embeddings']]

  # 2. Calculate the center vector (average value) of the normal comments
  normal_embeddings = np.array(embeddings[:-1])
  normal_center_vector = np.mean(normal_embeddings, axis=0)

  # 3. Calculate the similarity between the new comment and the center vector
  new_comment_embedding = np.array(embeddings[-1])
  similarity = cosine_similarity(new_comment_embedding, normal_center_vector)

  # 4. Determine if it is an anomaly
  is_anomaly = similarity < threshold
  return is_anomaly, similarity


# Example usage
normal_user_comments = [
  "Today's meeting was productive",
  "The project is progressing smoothly",
  "The new version will be released next week",
  "User feedback is positive"
]

test_comments = {
  "Normal comment": "The feature works as expected",
  "Anomaly - meaningless garbled text": "asdfghjkl zxcvbnm"
}

print("--- Anomaly Detection Example ---")
for desc, comment in test_comments.items():
  is_anomaly, score = detect_anomaly(comment, normal_user_comments)
  result = "Yes" if is_anomaly else "No"
  print(f"Comment: '{comment}'")
  print(f"Is anomaly: {result} (Similarity to normal samples: {score:.3f})\n")
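
One possible way to calibrate the threshold is to measure how similar each known-normal sample is to the center of the remaining normal samples, then place the cutoff a few standard deviations below the mean of those scores. This is a heuristic sketch only: `calibrate_threshold` and the multiplier `k=2.0` are assumptions for illustration, and the toy 2-dimensional vectors stand in for real embeddings.

```python
import math
import statistics
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Component-wise average of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def calibrate_threshold(normal_embeddings: List[List[float]], k: float = 2.0) -> float:
    """Leave-one-out similarity to the center, then mean - k * stdev."""
    if len(normal_embeddings) < 3:
        raise ValueError("need at least 3 normal samples")
    scores = []
    for i, emb in enumerate(normal_embeddings):
        others = normal_embeddings[:i] + normal_embeddings[i + 1:]
        scores.append(cosine(emb, mean_vector(others)))
    return statistics.mean(scores) - k * statistics.stdev(scores)


normal = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.8, 0.1]]
threshold = calibrate_threshold(normal)
print(f"Calibrated threshold: {threshold:.3f}")
```

The resulting value could be passed as the `threshold` argument of `detect_anomaly`. A larger `k` flags fewer comments as anomalies; checking the choice against a few known anomalies from your own data is still advisable.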

API reference

For multimodal embedding, see Multimodal embeddings.

Error codes

If a call fails, see Error messages.

Rate limits

See Rate limits.

Model performance (MTEB/CMTEB)

  • MTEB: Massive Text Embedding Benchmark, a comprehensive benchmark that assesses the general-purpose performance of text embeddings on tasks such as classification, clustering, and retrieval.
  • CMTEB: Chinese Massive Text Embedding Benchmark, a large-scale benchmark specifically for evaluating Chinese text embeddings.
  • Scores range from 0 to 100. A higher value indicates better performance.
| Model | MTEB | MTEB (Retrieval task) | CMTEB | CMTEB (Retrieval task) |
| --- | --- | --- | --- | --- |
| text-embedding-v3 (512 dimensions) | 62.11 | 54.30 | 66.81 | 71.88 |
| text-embedding-v3 (768 dimensions) | 62.43 | 54.74 | 67.90 | 72.29 |
| text-embedding-v3 (1024 dimensions) | 63.39 | 55.41 | 68.92 | 73.23 |
| text-embedding-v4 (512 dimensions) | 64.73 | 56.34 | 68.79 | 73.33 |
| text-embedding-v4 (1024 dimensions) | 68.36 | 59.30 | 70.14 | 73.98 |
| text-embedding-v4 (2048 dimensions) | 71.58 | 61.97 | 71.99 | 75.01 |
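
Read together with the dimension guidance earlier on this page, these figures show what each doubling of dimensions buys. The sketch below uses only the text-embedding-v4 MTEB scores listed above; the "gain per doubling" framing is an illustration, not an official metric, and `gain` is a hypothetical helper.

```python
# MTEB scores for text-embedding-v4, copied from the figures above.
V4_MTEB = {512: 64.73, 1024: 68.36, 2048: 71.58}


def gain(scores: dict, low: int, high: int) -> float:
    """Score difference when moving from `low` to `high` dimensions."""
    return round(scores[high] - scores[low], 2)


for low, high in ((512, 1024), (1024, 2048)):
    points = gain(V4_MTEB, low, high)
    print(f"{low} -> {high} dims: +{points:.2f} MTEB points for {high // low}x the vector size")
```

Whether a gain of a few benchmark points justifies doubling the vector size depends on your own retrieval quality targets and storage budget.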