Qwen3 VL 32B Instruct
Qwen · qwen/qwen3-vl-32b-instruct
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...
open weights · image · text · text+image->text
Context
Max context: 131072
Max output: 32768
Pricing
Input / 1M: 0.10
Output / 1M: 0.42
Blend / 1M: 0.26
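The listed blend price equals the simple average of the input and output prices, (0.10 + 0.42) / 2 = 0.26. As a rough sketch of how these per-million-token figures might translate into a per-request cost, the snippet below uses only the two prices from this listing. The function names `estimate_cost` and `blended_price` are hypothetical, the 50/50 weighting is inferred from the numbers rather than stated by the listing, and the currency is not given here.

```python
# Prices per 1M tokens, copied from the listing (currency not stated there).
INPUT_PER_M = 0.10
OUTPUT_PER_M = 0.42


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request, in the listing's price units."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000


def blended_price(input_share: float = 0.5) -> float:
    """Weighted price per 1M tokens; the 0.5 default is an assumed 1:1 token mix."""
    return input_share * INPUT_PER_M + (1.0 - input_share) * OUTPUT_PER_M


if __name__ == "__main__":
    # Example: a 20,000-token prompt producing a 2,000-token reply.
    print(f"request cost: {estimate_cost(20_000, 2_000):.5f}")
    print(f"blend / 1M:   {blended_price():.2f}")
```

A workload that is mostly prompt tokens would have an effective price closer to the input rate than to the listed blend, so `input_share` is worth adjusting to match the actual traffic.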
Quality
Quality index: —
Provider
Provider: Qwen
Moderated: no