A compact, yet powerful 10.7B large language model designed for single-turn conversation.



Solar is the first open-source 10.7 billion parameter language model. It is compact yet remarkably powerful, and demonstrates state-of-the-art performance among models with fewer than 30B parameters.

This model leverages the Llama 2 architecture and employs the Depth Up-Scaling technique, integrating Mistral 7B weights into upscaled layers.
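As a rough illustration of the Depth Up-Scaling idea, the sketch below duplicates a stack of layers, trims the overlapping ends, and concatenates the two halves into a deeper stack. The function name `depth_up_scale` and the string placeholders are hypothetical, and the specific numbers (a 32-layer base with 8 layers trimmed from each copy, giving 48 layers) are assumptions not stated on this page. Real up-scaling operates on transformer weights and is followed by continued pretraining, which this toy example does not show.

```python
def depth_up_scale(layers, m):
    """Build a deeper stack from two copies of `layers`.

    The first copy drops its last `m` layers, the second copy drops
    its first `m` layers, and the two are concatenated.
    """
    if not 0 <= m < len(layers):
        raise ValueError("m must satisfy 0 <= m < len(layers)")
    top = layers[: len(layers) - m]   # first copy without its final m layers
    bottom = layers[m:]               # second copy without its initial m layers
    return top + bottom


# Placeholder layer names standing in for real transformer blocks (assumed sizes).
base = [f"layer_{i}" for i in range(32)]
scaled = depth_up_scale(base, 8)
print(len(scaled))
```

With a 32-layer base and `m = 8`, each copy contributes 24 layers, so the result has 48.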

On the H6 benchmark, this model outperforms models with up to 30B parameters and even surpasses the larger Mixtral 8X7B model.

References

HuggingFace

Upstage AI