StarCoder 2: Can the Top Open-Source LLM Beat GitHub Copilot?
With over 1.3 million paid subscribers and deployments across more than 50,000 organizations, GitHub Copilot is the world’s most widely used AI developer tool.
Coding LLMs are not only supercharging productivity; they are permanently changing the way digital natives develop software.
In the very near future, low-code/no-code platforms will democratize app creation, workflow automation, and data analysis, enabling the development of tailored AI copilots for various tasks.
This transformative potential underpins the great interest in open-source alternatives, a movement that leads us to an exciting development that I will cover today.
BigCode, in collaboration with NVIDIA, has recently unveiled StarCoder2, a family of open LLMs designed specifically for coding. I think it is the best open-source coding LLM currently available in terms of size and performance.
StarCoder2 comes in three sizes, with 3B, 7B, and 15B parameters, supporting an impressive array of programming languages while setting new accuracy benchmarks.
Each variant is trained on The Stack v2, the most expansive open code dataset currently available for LLM pretraining.
Let’s have a look at some of the highlights:
- Three parameter sizes: 3B (trained by ServiceNow), 7B (trained by Hugging Face), and 15B (trained by NVIDIA using NVIDIA NeMo)