Qwen Qwen3-Omni-30B-A3B-Captioner

Qwen Qwen3-Omni-30B-A3B-Captioner is a multimodal model.

Vision

Audio

Tool Use

Reasoning

Citations

Context

66k

Max Output

66k

Best Price (Input / Output)

$0.10 / $0.40 per 1M tokens

Pricing Comparison

Provider	Input (1M)	Output (1M)	Image (1k)
siliconflow	$0.10	$0.40	-

Prices are per 1 million tokens unless otherwise noted. Image pricing, where applicable, is per 1,000 images.
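
As a rough illustration of how the per-million-token prices translate into a per-request cost, here is a minimal Python sketch. It uses the siliconflow prices listed above; the function name `estimate_cost` and the token counts in the example are hypothetical, chosen only for illustration, and actual billing may differ by provider.

```python
# Prices taken from the siliconflow row above (USD per 1 million tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example with assumed token counts: 50,000 input tokens, 2,000 output tokens.
cost = estimate_cost(50_000, 2_000)
print(f"${cost:.4f}")  # prints "$0.0058"
```

In this example the input side accounts for $0.0050 and the output side for $0.0008, so a long prompt dominates the cost even though output tokens are four times more expensive per token.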