Configure connection reuse for DashScope SDK

Reduce timeouts at scale

The DashScope SDK supports connection reuse to reduce resource consumption and improve efficiency.
  • Java SDK: Connection pooling is enabled by default. Configure max connections and timeouts as needed.
  • Python SDK: Pass a custom Session to enable connection reuse. Supports sync and async calls.

Java SDK

Connection pooling is enabled by default. Adjust the following parameters as needed.

Parameters

| Parameter | Description | Default | Unit | Notes |
| --- | --- | --- | --- | --- |
| connectTimeout | Timeout for establishing a connection. | 120 | seconds | Shorter timeouts reduce wait time in low-latency scenarios. |
| readTimeout | Timeout for reading data. | 300 | seconds | |
| writeTimeout | Timeout for writing data. | 60 | seconds | |
| connectionIdleTimeout | Timeout for idle connections. | 300 | seconds | Longer idle timeouts avoid frequent reconnections under high concurrency. |
| connectionPoolSize | Maximum connections in the pool. | 32 | connections | Too few connections cause blocking; too many increase server load. |
| maximumAsyncRequests | Maximum concurrent requests across all hosts. Must be ≤ connectionPoolSize. | 32 | requests | |
| maximumAsyncRequestsPerHost | Maximum concurrent requests per host. Must be ≤ maximumAsyncRequests. | 32 | requests | |
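
The two ordering constraints among the pool parameters (maximumAsyncRequestsPerHost ≤ maximumAsyncRequests ≤ connectionPoolSize) are easy to break when only one value is raised. The following is a minimal, language-neutral sketch of that check, written in Python for brevity. The function name check_pool_settings is hypothetical and not part of either SDK; the same comparisons can be made in Java before calling the builder.

```python
def check_pool_settings(connection_pool_size,
                        maximum_async_requests,
                        maximum_async_requests_per_host):
    """Return a list of violated constraints (empty if the settings are consistent).

    Hypothetical helper, not part of the DashScope SDK. It only encodes the
    two "must be <=" rules stated for the Java connection pool parameters.
    """
    problems = []
    if maximum_async_requests > connection_pool_size:
        problems.append("maximumAsyncRequests exceeds connectionPoolSize")
    if maximum_async_requests_per_host > maximum_async_requests:
        problems.append("maximumAsyncRequestsPerHost exceeds maximumAsyncRequests")
    return problems


if __name__ == "__main__":
    # Consistent: all three values are equal.
    print(check_pool_settings(256, 256, 256))
    # Inconsistent: the request limit was raised without growing the pool.
    print(check_pool_settings(32, 64, 64))
```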

Code example

Before running the code, get an API key and export it as an environment variable. Then install the latest SDK.
Configure connection pool parameters and call a model service:
// Recommended DashScope SDK version >= 2.12.0
import java.time.Duration;
import java.util.Arrays;

import com.alibaba.dashscope.aigc.generation.Generation;
import com.alibaba.dashscope.aigc.generation.GenerationParam;
import com.alibaba.dashscope.aigc.generation.GenerationResult;
import com.alibaba.dashscope.common.Message;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.InputRequiredException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.protocol.ConnectionConfigurations;
import com.alibaba.dashscope.protocol.Protocol;
import com.alibaba.dashscope.utils.Constants;

public class Main {
  public static GenerationResult callWithMessage() throws ApiException, NoApiKeyException, InputRequiredException {
    Generation gen = new Generation(Protocol.HTTP.getValue(), "https://dashscope-intl.aliyuncs.com/api/v1");
    Message systemMsg = Message.builder()
        .role(Role.SYSTEM.getValue())
        .content("You are a helpful assistant.")
        .build();
    Message userMsg = Message.builder()
        .role(Role.USER.getValue())
        .content("Who are you?")
        .build();
    GenerationParam param = GenerationParam.builder()
        // If you have not configured the environment variable, replace with your API key: .apiKey("sk-xxx")
        .apiKey(System.getenv("DASHSCOPE_API_KEY"))
        .model("qwen3.6-plus")
        .messages(Arrays.asList(systemMsg, userMsg))
        .resultFormat(GenerationParam.ResultFormat.MESSAGE)
        .build();

    System.out.println(userMsg.getContent());
    return gen.call(param);
  }
  public static void main(String[] args) {
    // Connection pool configuration
    Constants.connectionConfigurations = ConnectionConfigurations.builder()
        .connectTimeout(Duration.ofSeconds(10))  // Timeout for establishing a connection, default 120s
        .readTimeout(Duration.ofSeconds(300)) // Timeout for reading data, default 300s
        .writeTimeout(Duration.ofSeconds(60)) // Timeout for writing data, default 60s
        .connectionIdleTimeout(Duration.ofSeconds(300)) // Timeout for idle connections, default 300s
        .connectionPoolSize(256) // Maximum connections in the connection pool, default 32
        .maximumAsyncRequests(256)  // Maximum concurrent requests, default 32
        .maximumAsyncRequestsPerHost(256) // Maximum concurrent requests per host, default 32
        .build();

    try {
      GenerationResult result = callWithMessage();
      System.out.println(result.getOutput().getChoices().get(0).getMessage().getContent());
    } catch (ApiException | NoApiKeyException | InputRequiredException e) {
      System.err.println("An error occurred while calling the generation service: " + e.getMessage());
    }
    System.exit(0);
  }
}

Python SDK

The Python SDK supports connection reuse via a custom Session. Two methods are available: HTTP asynchronous and HTTP synchronous.

HTTP asynchronous

Use aiohttp.ClientSession with aiohttp.TCPConnector for async connection reuse.
| Parameter | Description | Default | Notes |
| --- | --- | --- | --- |
| limit | Total connection limit | 100 | Higher values improve concurrency. |
| limit_per_host | Connection limit per host | 0 (unlimited) | Prevents excessive load on a single host. |
| ssl | SSL context configuration | None | SSL certificate validation for HTTPS connections. |
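
Because every request in these examples goes to a single DashScope host, the smaller of limit and limit_per_host is the value that actually bounds concurrency. The sketch below works through that arithmetic, treating 0 as "unlimited" for both parameters as aiohttp.TCPConnector does. The helper name effective_per_host_cap is hypothetical and is not an aiohttp or DashScope API.

```python
def effective_per_host_cap(limit=100, limit_per_host=0):
    """Largest number of simultaneous connections a single host can receive.

    Hypothetical helper that mirrors the TCPConnector convention that 0 means
    "no limit". Returns None when neither parameter imposes a limit.
    """
    caps = [value for value in (limit, limit_per_host) if value > 0]
    return min(caps) if caps else None


if __name__ == "__main__":
    # Defaults: only the total limit applies.
    print(effective_per_host_cap())
    # A per-host limit below the total limit becomes the binding value.
    print(effective_per_host_cap(limit=100, limit_per_host=30))
```

If limit_per_host is set above limit, the total limit still wins, so raising only one of the two values may have no effect.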

Code example

Before running the code, get an API key and export it as an environment variable. Then install the latest SDK.
Configure async connection reuse and call a model service:
import asyncio
import aiohttp
import ssl
import certifi
from dashscope import AioGeneration
import dashscope
import os

async def main():
  dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

  # If you have not configured the environment variable, replace with your API key: dashscope.api_key = "sk-xxx"
  dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")

  # Configure connection parameters
  connector = aiohttp.TCPConnector(
    limit=100,           # Total connection limit
    limit_per_host=30,   # Connection limit per host
    ssl=ssl.create_default_context(cafile=certifi.where()),
  )

  # Create a custom Session and pass it to the call method
  async with aiohttp.ClientSession(connector=connector) as session:
    response = await AioGeneration.call(
      model='qwen3.6-plus',
      prompt='Hello, please introduce yourself',
      session=session,  # Pass the custom Session
    )
    print(response)

asyncio.run(main())
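
Connection reuse pays off when many calls share one Session. A minimal sketch of that pattern follows: it runs a batch of prompts concurrently while capping the number of in-flight calls with a semaphore, so the application does not queue more work than the connector allows. The names call_all, max_in_flight, and fake_call are hypothetical; fake_call stands in for the real request so the sketch runs offline.

```python
import asyncio


async def call_all(call, prompts, max_in_flight=30):
    """Run call(prompt) for every prompt, at most max_in_flight at a time.

    Results are returned in the same order as prompts.
    """
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(prompt):
        # Wait for a free slot before issuing the request.
        async with gate:
            return await call(prompt)

    return await asyncio.gather(*(guarded(p) for p in prompts))


async def fake_call(prompt):
    # Stand-in for a real model call; replace with an SDK call that
    # passes the shared Session.
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


if __name__ == "__main__":
    results = asyncio.run(call_all(fake_call, ["Hello", "Who are you?"], max_in_flight=2))
    print(results)
```

With the SDK, call could be a small wrapper such as lambda p: AioGeneration.call(model='qwen3.6-plus', prompt=p, session=session), created inside the async with block so that every call reuses the same Session. Setting max_in_flight near limit_per_host is one reasonable choice, not a requirement.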

HTTP synchronous

Use requests.Session for sync connection reuse. Requests within the same Session reuse the TCP connection.

Code example

Before running the code, get an API key and export it as an environment variable. Then install the latest SDK.
Configure sync connection reuse and call a model service:
import requests
from dashscope import Generation
import dashscope
import os

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# If you have not configured the environment variable, replace with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")

# Use a with statement to ensure the Session closes correctly
with requests.Session() as session:
  response = Generation.call(
    model='qwen3.6-plus',
    prompt='Hello',
    session=session  # Pass the custom Session
  )
  print(response)
Reuse the same Session across multiple calls:
import requests
from dashscope import Generation
import dashscope
import os

dashscope.base_http_api_url = 'https://dashscope-intl.aliyuncs.com/api/v1'

# If you have not configured the environment variable, replace with your API key: dashscope.api_key = "sk-xxx"
dashscope.api_key = os.getenv("DASHSCOPE_API_KEY")

# Create a Session object
session = requests.Session()

try:
  # Reuse the same Session for multiple calls
  response1 = Generation.call(
    model='qwen3.6-plus',
    prompt='Hello',
    session=session
  )
  print(response1)

  response2 = Generation.call(
    model='qwen3.6-plus',
    prompt='Introduce yourself',
    session=session
  )
  print(response2)
finally:
  # Ensure the Session closes correctly
  session.close()
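
A requests.Session keeps its connections in a per-host pool managed by an HTTPAdapter, whose default size is 10. The sketch below shows one way to enlarge that pool with the standard requests API before passing the Session to the SDK. The helper name make_pooled_session is hypothetical. It is also an assumption that the SDK sends its requests through the adapter mounted on the Session you pass, and that you share the Session across worker threads, which requests does not formally guarantee to be thread-safe.

```python
import requests
from requests.adapters import HTTPAdapter


def make_pooled_session(pool_maxsize=32):
    """Build a Session whose HTTPS adapter keeps up to pool_maxsize connections per host.

    Hypothetical helper; only standard requests calls are used.
    """
    session = requests.Session()
    adapter = HTTPAdapter(
        pool_connections=10,        # number of per-host pools to cache (requests default)
        pool_maxsize=pool_maxsize,  # connections kept alive in each per-host pool
    )
    # Replace the default adapter for every https:// URL.
    session.mount("https://", adapter)
    return session


if __name__ == "__main__":
    with make_pooled_session(pool_maxsize=64) as session:
        # Pass `session=session` to Generation.call(...) here.
        print(type(session.get_adapter("https://dashscope-intl.aliyuncs.com/api/v1")).__name__)
```

For strictly sequential calls the default pool is already enough, since one connection is reused; a larger pool only matters when several requests are in flight at once.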

Best practices

  • Java SDK: Set connectionPoolSize and maximumAsyncRequests based on your concurrent workload. Too few connections cause requests to block; too many increase server load.
  • Python SDK: Use with statements to manage the Session lifecycle and ensure proper resource cleanup.
  • Choose the right method: Use async calls in asyncio-based applications (such as FastAPI services). Use sync calls in blocking, synchronous applications.
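
To turn "based on your concurrent workload" into a number, one common rule of thumb is Little's law: the average number of in-flight requests equals the request rate multiplied by the average latency. The sketch below applies it with a safety factor. The function name estimate_pool_size and the default headroom of 1.5 are hypothetical choices, not SDK recommendations; measure your own rate and latency before relying on the result.

```python
import math


def estimate_pool_size(requests_per_second, avg_latency_seconds, headroom=1.5):
    """Estimate how many pooled connections a workload needs.

    Little's law: in-flight requests = arrival rate x average latency.
    headroom is a hypothetical safety factor for bursts.
    """
    in_flight = requests_per_second * avg_latency_seconds
    return max(1, math.ceil(in_flight * headroom))


if __name__ == "__main__":
    # 20 requests per second at 4 seconds average latency.
    print(estimate_pool_size(20, 4.0))
```

For example, 20 requests per second at 4 seconds average latency means about 80 requests in flight, or 120 connections with 1.5x headroom. That figure is a starting point for connectionPoolSize in Java or for limit and limit_per_host in Python.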