Overview
The Analytics page provides observability for your model deployments, including token consumption, request counts, latency, success rates, and per-model performance metrics.
Analytics
Go to the Analytics page to view usage and analytics for your workspace.
Filters
Time range: Select the time window (such as 24 Hours).
Models: Filter by a specific model or view all models.
Granularity: Choose the aggregation interval (such as 1 Hour).
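To illustrate what the Granularity setting does, the sketch below rounds request timestamps down to the start of their interval and counts the requests in each interval. It is an illustration of interval aggregation in general, not the platform's implementation. The names `bucket_start` and `requests_per_bucket` are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Fixed reference point so that bucket boundaries fall on whole intervals.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts: datetime, granularity: timedelta) -> datetime:
    """Round a timezone-aware timestamp down to the start of its interval."""
    steps = (ts - EPOCH) // granularity  # whole intervals since the epoch
    return EPOCH + steps * granularity


def requests_per_bucket(timestamps, granularity=timedelta(hours=1)):
    """Count requests per aggregation interval (hypothetical helper)."""
    return Counter(bucket_start(ts, granularity) for ts in timestamps)
```

With a granularity of 1 hour, requests at 10:05 and 10:59 land in the 10:00 bucket, while a request at 11:00 starts a new bucket.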
Metrics
The page shows four key metrics:
Metric | Description
Tokens | Total token consumption
Requests | Total number of API requests
Avg Latency | Average response latency
Success rate | Percentage of successful requests
The Tokens Analysis chart below provides a visual breakdown of token consumption over time.
Cost includes all consumption across the entire platform. Refer to billing data for details.
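The four headline metrics can be reproduced from raw request records, which may help when cross-checking the page against your own data. The sketch below is a minimal example under stated assumptions: the `RequestRecord` fields are hypothetical, and treating any 2xx HTTP status as "successful" is an assumption, since the page does not define success.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    # Hypothetical record shape; field names are not taken from the platform.
    total_tokens: int
    latency_ms: float
    status: int  # HTTP status code


def summarize(records):
    """Compute tokens, requests, average latency, and success rate."""
    n = len(records)
    if n == 0:
        return {"tokens": 0, "requests": 0, "avg_latency_ms": 0.0, "success_rate": 0.0}
    # Assumption: a request counts as successful when its status is 2xx.
    ok = sum(1 for r in records if 200 <= r.status < 300)
    return {
        "tokens": sum(r.total_tokens for r in records),
        "requests": n,
        "avg_latency_ms": sum(r.latency_ms for r in records) / n,
        "success_rate": 100.0 * ok / n,  # percentage
    }
```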
Usage units by model type
Type | Subcategory | Unit | Billing basis
Large language model | Text generation, Deep thinking, Vision understanding | Token | Billed by input and output token count
Vision model | Image generation | Image (count) | Billed by successfully generated images
Vision model | Video generation | Seconds | Billed by successfully generated video duration
Speech model | TTS, Realtime TTS, File ASR, Realtime ASR, Audio/video translation | Seconds, characters, or tokens | Varies by model; may bill by audio duration, text characters, or token count
Omni-modal model | Omni-modal, Realtime multimodal | Token | Text billed by tokens; other modalities (audio, image, video) billed by corresponding token count
Per-model metrics
Below the usage charts, a per-model table breaks down throughput (TPM/RPM), call volume, success rate, time to first token, and latency for each model. Use this to identify underperforming models or unexpected error spikes.
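As a rough illustration of the throughput columns, the sketch below computes tokens per minute (TPM) and requests per minute (RPM) for each model as averages over a time window. Whether the table reports averages or peaks is not stated here, so averaging is an assumption, and `per_model_throughput` is a hypothetical helper.

```python
from collections import defaultdict


def per_model_throughput(records, window_minutes):
    """Average TPM and RPM per model over a window.

    records: iterable of (model_name, total_tokens) pairs, one per request.
    """
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    totals = defaultdict(lambda: {"tokens": 0, "requests": 0})
    for model, tokens in records:
        totals[model]["tokens"] += tokens
        totals[model]["requests"] += 1
    return {
        model: {
            "tpm": t["tokens"] / window_minutes,
            "rpm": t["requests"] / window_minutes,
        }
        for model, t in totals.items()
    }
```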
Request logs
Go to the Logs tab to inspect individual API requests. Logs are retained for the last 14 days.
Filters
Time range: Narrow the log window to a specific period.
Model: Filter by a specific model.
Status: Filter by HTTP status code (such as 200, 400, 429).
Request ID: Search for a specific request by its ID.
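The same filters can be applied offline to log entries you have already collected. The sketch below assumes each entry is a dictionary with hypothetical keys `timestamp`, `model`, `status`, and `request_id`; it does not call any platform API.

```python
from datetime import datetime


def filter_logs(logs, *, start=None, end=None, model=None, status=None, request_id=None):
    """Return the entries that match every filter that was supplied."""

    def keep(entry):
        if start is not None and entry["timestamp"] < start:
            return False
        if end is not None and entry["timestamp"] > end:
            return False
        if model is not None and entry["model"] != model:
            return False
        if status is not None and entry["status"] != status:
            return False
        if request_id is not None and entry["request_id"] != request_id:
            return False
        return True

    return [entry for entry in logs if keep(entry)]
```

For example, `filter_logs(logs, model="model-a", status=429)` would keep only the rate-limited requests for one model, where `"model-a"` is a placeholder name.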
Log table
Each row shows:
Column | Description
Request ID | Unique identifier for the request
Timestamp | When the request was made
Model | Model used for the request
Usage | Token breakdown (total, input, output)
TTFT | Time to first token
Latency | Total response time
Status | HTTP status code
Request details
Click Details to open a side panel with the full request breakdown. You can view the data in List or JSON format.
Click Export to download logs as a file for offline analysis.
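Once logs are exported, a short script can summarize them. The export file format is not specified here, so the sketch below assumes a CSV file with hypothetical column names `request_id`, `latency_ms`, and `status`; adjust these to match the actual file.

```python
import csv
import io
from collections import Counter


def status_counts(csv_text: str) -> Counter:
    """Count requests per HTTP status code in exported CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["status"]) for row in reader)


def slowest(csv_text: str, n: int = 5):
    """Return the n rows with the highest latency, slowest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda r: float(r["latency_ms"]), reverse=True)[:n]
```

To use it with a downloaded file, read the file contents into a string first, for example with `pathlib.Path(path).read_text()`.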
Alerts
Set up custom alert rules to monitor API call metrics and receive real-time notifications when anomalies occur. For details, see Monitoring alerts.