Content moderation, input/output guardrails, and responsible AI practices across all modalities
Qwen Cloud applies automatic content moderation to all API requests. This guide explains how the built-in safety system works, how to handle moderation responses, and best practices for building responsible AI applications.
All API requests pass through an automatic moderation layer that screens both inputs and outputs for harmful, illegal, or inappropriate content. This runs transparently — you do not need to enable or configure it.
When content is blocked, the API returns a
For image generation specifically, you may also see: "The image content does not comply with green network verification."
Do not surface raw error messages to end users. Catch moderation errors and respond gracefully:
For async tasks (image/video generation), moderation errors appear in the task result rather than as an immediate HTTP error. Check the task's
Some speech recognition models support built-in sensitive word filtering that replaces detected sensitive words in transcription output.
For Fun-ASR models, you can customize the filtering behavior via the
When recording audio for voice cloning, the recording content must not include sensitive words related to politics, pornography, or violence. Recordings with such content will fail the cloning process.
For image and video generation, you can enable watermarking to mark content as AI-generated:
The
Platform moderation catches policy violations, but your application should also validate inputs before they reach the API.
For applications that expose the model to end users, design your system prompt to resist prompt injection:
Even with input validation and platform moderation, verify model outputs before presenting them to users.
When using structured output (JSON mode or JSON Schema), validate the parsed output against your expected schema and business rules before acting on it. A syntactically valid JSON response may still contain semantically incorrect or harmful content.
For high-stakes decisions (financial advice, medical information, legal guidance):
Qwen Cloud does not retain your inputs and outputs for model training:
Qwen Cloud provides platform-level moderation as a safety baseline. Building a safe production application is a shared responsibility:
Built-in content moderation
All API requests pass through an automatic moderation layer that screens both inputs and outputs for harmful, illegal, or inappropriate content. This runs transparently — you do not need to enable or configure it.
What gets moderated
| Modality | Input screening | Output screening |
|---|---|---|
| Text generation | Prompts, system messages, conversation history | Generated text |
| Image generation | Text prompts, negative prompts, reference images | Generated images |
| Video generation | Text prompts, reference images/videos | Generated videos |
| Text-to-speech | Input text | Synthesized audio |
| Speech-to-text | Input audio | Transcribed text |
| Vision | Input images/videos | Generated analysis |
Moderation error codes
When content is blocked, the API returns a 400 status code with one of these error codes:
| Error code | Message | Meaning |
|---|---|---|
data_inspection_failed | "Input or output data may contain inappropriate content." | Content blocked by platform moderation policy |
data_inspection_failed | "Input data may contain inappropriate content." | Specifically the input was blocked |
data_inspection_failed | "Output data may contain inappropriate content." | Specifically the output was blocked |
ip_infringement_suspect | "Input data is suspected of being involved in IP infringement." | Input may violate intellectual property rights |
custom_role_blocked | "Input or output data may contain inappropriate content with custom rule." | Blocked by a custom content policy |
faq_rule_blocked | "Input or output data is blocked by faq rule." | Blocked by an FAQ rule intervention |
Handle moderation errors
Do not surface raw error messages to end users. Catch moderation errors and respond gracefully:
- Python
- Node.js
output.code field when task_status is FAILED:
Speech-specific safety features
Sensitive word filtering (ASR)
Some speech recognition models support built-in sensitive word filtering that replaces detected sensitive words in transcription output.
| Model | Sensitive word filtering | Default behavior |
|---|---|---|
| Fun-ASR, Fun-ASR-2025-11-07 | Supported | Filters from Qwen Cloud sensitive word list by default |
| Qwen3-ASR-Flash-FileTranscription | Supported | Always on |
| Qwen3-ASR-Flash | Not supported | — |
| Qwen-ASR (DashScope) | Not supported | — |
special_word_filter parameter:
filter_with_signed— Replace matched words with asterisks (*)filter_with_removal— Remove matched words entirely from the transcriptsystem_reserved_filter— Whentrue(default), applies the built-in Qwen Cloud sensitive word list
Voice cloning content restrictions
When recording audio for voice cloning, the recording content must not include sensitive words related to politics, pornography, or violence. Recordings with such content will fail the cloning process.
AI-generated content watermarking
For image and video generation, you can enable watermarking to mark content as AI-generated:
| Modality | Watermark text | Watermark position |
|---|---|---|
| Image generation | AI-generated | Bottom-right corner |
| Video generation | Generated by Qwen AI | Lower-right corner |
watermark parameter defaults to false. Enable it when transparency or regulatory compliance requires disclosure of AI-generated content.
Input guardrails
Platform moderation catches policy violations, but your application should also validate inputs before they reach the API.
Validate user inputs
- Length limits — Set
max_tokenson output and enforce a reasonable maximum input length. This prevents abuse through extremely long prompts. - Format validation — If your application expects structured input (a product description, a question about your docs), validate the format before sending it to the model.
- Rate limiting — Apply per-user rate limits to prevent a single user from consuming excessive resources or probing for moderation boundaries.
System prompt hardening
For applications that expose the model to end users, design your system prompt to resist prompt injection:
- Place critical instructions at the beginning and end of the system prompt, where they receive the most attention.
- Explicitly instruct the model to ignore attempts to override its role or instructions.
- Use clear delimiters to separate system instructions from user content.
Output guardrails
Even with input validation and platform moderation, verify model outputs before presenting them to users.
Structured output validation
When using structured output (JSON mode or JSON Schema), validate the parsed output against your expected schema and business rules before acting on it. A syntactically valid JSON response may still contain semantically incorrect or harmful content.
Confidence-based escalation
For high-stakes decisions (financial advice, medical information, legal guidance):
- If the model hedges or expresses uncertainty, route to a human reviewer rather than presenting the response directly.
- Use a second model call as a "judge" to verify the first response before showing it to users.
Data handling
Qwen Cloud does not retain your inputs and outputs for model training:
- Inputs and outputs are processed in memory only during the request and are not stored in persistent storage after the response is returned.
- Metadata (token counts, timestamps, request IDs) is logged for billing and rate limiting.
- When using the Responses API with
store=true(default), conversation data is stored for 30 days. Setstore=falseto disable conversation retention.
Shared responsibility
Qwen Cloud provides platform-level moderation as a safety baseline. Building a safe production application is a shared responsibility:
| Qwen Cloud provides | You implement |
|---|---|
| Automatic content moderation on all inputs and outputs | Application-level input validation and format checking |
| Sensitive word filtering for supported ASR models | Output verification and post-processing |
| Platform abuse detection and rate limiting | Per-user authentication and rate limits |
| Data security and encryption in transit and at rest | System prompt hardening against prompt injection |
| AI-generated content watermarking | Appropriate disclosure of AI-generated content |
Next steps
- Data security — Encryption, API key security, and compliance
- Zero data retention — Data handling during inference
- Audit logs — Track API usage for compliance
- Error messages — Full list of error codes including moderation errors
- Accuracy tuning — Improve output quality to reduce moderation-triggered failures