Qwen2.5-Coder, Cosmos Tokenizer, OpenCoder, and New SentenceTransformers: Great Times for Open Source
6 min read · Nov 13, 2024
I want to highlight some standout open-source advancements that have really caught my eye:
- Qwen2.5-Coder Series: An open-source code LLM that’s giving GPT-4 a run for its money.
- Cosmos Tokenizer: An advanced suite of neural tokenizers for efficient image and video compression.
- OpenCoder: A fully open-source code LLM trained on an astonishing 2.5 trillion tokens.
- Massive CPU Speedup in SentenceTransformers: A 4x speed boost on CPU inference using OpenVINO’s int8 static quantization.
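That last item is the easiest to try right away. Here is a minimal sketch of what the user-facing side looks like, assuming SentenceTransformers v3.2+ with the OpenVINO extra installed; the model name is just an example, and the quantized file path in the commented alternative is an assumption that depends on how the model was exported.

```python
from sentence_transformers import SentenceTransformer

# Minimal sketch: load an embedding model on the OpenVINO backend (CPU-friendly).
# Requires: pip install "sentence-transformers[openvino]"  (v3.2+)
model = SentenceTransformer("all-MiniLM-L6-v2", backend="openvino")

# If you have exported an int8 statically quantized OpenVINO model, you could
# point at it instead (the file name below is an assumption; it depends on your export):
# model = SentenceTransformer(
#     "all-MiniLM-L6-v2",
#     backend="openvino",
#     model_kwargs={"file_name": "openvino/openvino_model_qint8_quantized.xml"},
# )

sentences = ["Open source is moving fast.", "int8 quantization speeds up CPU inference."]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 384) for this model
```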
Let’s dive in!
Qwen2.5-Coder Series: Open-Sourcing a SOTA Code LLM Rivaling GPT-4
Alibaba Cloud announced the open-source release of the Qwen2.5-Coder series, billed as "Powerful, Diverse, Practical" and dedicated to advancing open code large language models (LLMs).
The flagship model, Qwen2.5-Coder-32B-Instruct, establishes itself as the state-of-the-art (SOTA) open-source code model, matching the coding capabilities of GPT-4 while also retaining strong general-purpose and mathematical reasoning abilities.
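If you want to poke at the flagship model yourself, here is a minimal sketch using the Hugging Face transformers chat-template API. It assumes access to the Qwen/Qwen2.5-Coder-32B-Instruct checkpoint and enough GPU memory for 32B-parameter weights (the smaller Coder variants work the same way); the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-32B-Instruct"  # smaller Coder variants follow the same pattern

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B weights need substantial GPU memory
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks if a string is a palindrome."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```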