UI-TARS 7B
ByteDance · bytedance/ui-tars-1.5-7b
UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it extends the UI-TARS framework with reinforcement...
open weights · image · text · text+image->text
Context
Max context: 128000
Max output: 2048
Pricing
Input / 1M: 0.10
Output / 1M: 0.20
Blend / 1M: 0.15
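As a rough illustration of how the per-1M prices translate into a per-request cost, here is a minimal Python sketch. The function names `estimate_cost` and `blended_price` are hypothetical, the listing does not state a currency, and the 50/50 input/output weighting is only an assumption that happens to reproduce the listed blend of 0.15; the listing does not say how the blend is actually computed.

```python
# Prices per 1M tokens, copied from the listing (currency not stated there).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.20


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def blended_price(input_share: float = 0.5) -> float:
    """Weighted average price per 1M tokens.

    input_share is the assumed fraction of tokens that are input tokens.
    The 0.5 default is an assumption, not a documented formula.
    """
    return input_share * INPUT_PRICE_PER_M + (1 - input_share) * OUTPUT_PRICE_PER_M


if __name__ == "__main__":
    # Example: a request near the listed limits (128000 input, 2048 output).
    print(f"{estimate_cost(128_000, 2_048):.6f}")
    print(f"{blended_price():.2f}")
```

Under these assumptions, a request with 128000 input tokens and 2048 output tokens comes to roughly 0.013 in the listing's price unit.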
Quality
Quality index: —
Provider
Provider: ByteDance
Moderated: no