Qwen3 VL Instruct

Qwen3 VL Instruct is a multimodal model.

Vision

Audio

Tool Use

Reasoning

Citations

Context: 131k tokens

Max Output: 129k tokens

Best Price (Input / Output): $0.70 / $2.80 per 1M tokens

Pricing Comparison

Provider	Input (per 1M tokens)	Output (per 1M tokens)	Image (per 1k images)
vercel	$0.70	$2.80	-

Prices are per 1 million tokens unless otherwise noted. Image pricing, where applicable, is per 1,000 images.
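
As a rough illustration of how the per-token rates translate into a per-request cost, the sketch below multiplies token counts by the listed $0.70 (input) and $2.80 (output) per 1M tokens. The function name `estimate_cost` and the sample token counts are hypothetical, and image pricing is left out because no value is listed for it.

```python
# Listed rates from the pricing table (USD per 1 million tokens).
INPUT_PRICE_PER_M = 0.70
OUTPUT_PRICE_PER_M = 2.80

TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request.

    Hypothetical helper: it applies the listed per-1M-token rates
    linearly and ignores any image charges.
    """
    input_cost = input_tokens / TOKENS_PER_MILLION * INPUT_PRICE_PER_M
    output_cost = output_tokens / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(100_000, 2_000):.4f}")  # prints "$0.0756"
```

At these rates, output tokens cost four times as much as input tokens, so long replies dominate the bill faster than long prompts do.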