llava

🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.

7B 13B 34B

262.4K Pulls · Updated 4 months ago

llava:7b-v1.6-vicuna-q5_1 ... /

template

9fb057c3f08a · 45B

{{ .System }} USER: {{ .Prompt }} ASSSISTANT:
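
The template above uses Go template syntax with two fields, `.System` and `.Prompt`. As a minimal sketch of how such a template expands into a final prompt string, the following renders it with Go's standard `text/template` package. The `promptData` struct, the `renderPrompt` helper, and the sample inputs are hypothetical names introduced for illustration; this is not how the runtime itself necessarily applies the template. The template string is copied verbatim from the listing, including the `ASSSISTANT:` spelling, which is consistent with the 45B size shown.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptData is a hypothetical struct that mirrors the two fields
// referenced by the template (.System and .Prompt).
type promptData struct {
	System string
	Prompt string
}

// llavaTemplate is copied verbatim from the listing above.
const llavaTemplate = "{{ .System }} USER: {{ .Prompt }} ASSSISTANT:"

// renderPrompt fills the template with a system message and a user prompt
// and returns the resulting string.
func renderPrompt(system, prompt string) (string, error) {
	tmpl, err := template.New("llava").Parse(llavaTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, promptData{System: system, Prompt: prompt}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Sample inputs are placeholders, not values from the model page.
	out, err := renderPrompt("You are a helpful assistant.", "Describe the image.")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The model's reply is generated as a continuation of the rendered string, which is why the template ends at the assistant label rather than closing it off.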