Multimodal
siliconflow
Qwen Family
Released: Oct 2025
Updated: Feb 2026

Qwen Qwen3-Omni-30B-A3B-Captioner

Qwen3-Omni-30B-A3B-Captioner is a multimodal model in the Qwen family, listed here with siliconflow as its provider.

Vision
Audio
Tool Use
Reasoning
Citations

Key Specs

Context: 66k tokens
Max Output: 66k tokens
Best Price (Input / Output): $0.10 / $0.40 per 1M tokens

Pricing Comparison

Provider      Input (1M)   Output (1M)   Image (1k)
siliconflow   $0.10        $0.40         -

Prices are per 1 million tokens unless otherwise noted. Image pricing, where applicable, is per 1,000 images.
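
As a rough illustration of how per-million-token pricing turns into a per-request cost, the sketch below applies the siliconflow prices listed above ($0.10 input, $0.40 output per 1M tokens). The function name `estimate_cost` and the example token counts are hypothetical, chosen only for illustration; actual billing may round or meter differently.

```python
from decimal import Decimal

# Prices from the listing above, in USD per 1 million tokens.
INPUT_PRICE_PER_M = Decimal("0.10")
OUTPUT_PRICE_PER_M = Decimal("0.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper, not a provider API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Scale each token count to millions, then multiply by the matching price.
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Example with assumed token counts: 50,000 input and 2,000 output tokens.
print(estimate_cost(50_000, 2_000))
```

Decimal arithmetic is used here instead of floats so that small per-token amounts do not pick up binary rounding noise when summed over many requests.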