Llama-3.2-11B-Vision-Instruct

Llama-3.2-11B-Vision-Instruct is a multimodal model.

Vision

Audio

Tool Use

Reasoning

Citations

Context: 131k tokens

Max Output: 16k tokens

Best Price (Input / Output): Free / Free per 1M tokens

Pricing Comparison

Provider	Input (per 1M tokens)	Output (per 1M tokens)	Image (per 1k images)
github-models	Free	Free	-
nvidia	Free	Free	-
openrouter	Free	Free	-
cloudflare-ai-gateway	$0.05	$0.68	-
inference	$0.06	$0.06	-
vercel	$0.16	$0.16	-
azure	$0.37	$0.37	-

Prices are in USD per 1 million tokens unless otherwise noted. Image pricing, where listed, is per 1,000 images; a dash means no image price is listed.
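
As a rough illustration of how the per-1M-token prices above translate into a per-request cost, the following Python sketch hard-codes the table's values (treating "Free" as 0.0) and scales them by token counts. The names `PRICES` and `estimate_cost` are hypothetical and not part of any provider's API. Image pricing is ignored because no provider lists one, and actual billing may differ by provider.

```python
# Per-1M-token prices in USD as (input, output), copied from the table above.
# "Free" is represented as 0.0. These names are illustrative assumptions.
PRICES = {
    "github-models": (0.0, 0.0),
    "nvidia": (0.0, 0.0),
    "openrouter": (0.0, 0.0),
    "cloudflare-ai-gateway": (0.05, 0.68),
    "inference": (0.06, 0.06),
    "vercel": (0.16, 0.16),
    "azure": (0.37, 0.37),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[provider]
    # Prices are quoted per 1,000,000 tokens, so scale the total down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example request: 100,000 input tokens and 16,000 output tokens.
    for name in sorted(PRICES, key=lambda p: estimate_cost(p, 100_000, 16_000)):
        print(f"{name}: ${estimate_cost(name, 100_000, 16_000):.5f}")
```

Because input and output are priced separately, the cheapest paid provider can depend on the mix of tokens: an output-heavy workload weighs the output price more, so it is worth estimating with token counts typical of the intended use.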