1.1B parameter Lllama model finetuned for chatting

312 Pulls Updated 6 months ago

Readme

tinyllama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

This chat model is finetuned on OpenAssistant/oasst_top1_2023-08-25 using chatml.

This model is based on an intermediate snapshot trained on 1T tokens.

Note: models will be updated as and when new snapshots are released.

Get Started with TinyLlama

CLI

ollama run saikatkumardey/tinyllama

API

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "saikatkumardey/tinyllama:latest",
  "prompt":"Why is the sky blue?"
 }'

Memory Requirements

Model Memory
tinyllama 3.4G
tinyllama:Q6_K 3.4G
tinyllama:Q8_0 3.67G

Source