LinkedIn’s Liger: The GPU Kernel Suite that Andrej Karpathy, Jeremy Howard, and Thomas Wolf Use for Efficient LLM Training
LinkedIn just open-sourced a collection of Triton-based kernels that are specifically designed for Large Language Model (LLM) training.
Honestly, it’s kind of a big deal.
I'll explain shortly why companies bother with custom kernels. I didn't see this one coming in 2024, and it's going to save us a lot of time!
By adding a single line of code, you can boost throughput by over 20% and cut memory usage by 60%.
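That "single line" is the model-loading call. A minimal sketch of what the integration looks like, assuming the `liger-kernel` package is installed (class name per the Liger Kernel README; the model checkpoint is just an illustrative example):

```python
# Drop-in replacement for transformers' AutoModelForCausalLM: before returning
# the model, it patches supported layers (RMSNorm, RoPE, SwiGLU, cross-entropy,
# ...) with Liger's fused Triton kernels.
from liger_kernel.transformers import AutoLigerKernelForCausalLM

# Example checkpoint; any supported architecture works the same way.
model = AutoLigerKernelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
```

The rest of your training loop stays untouched, which is exactly why the "one line of code" claim holds: the kernels are swapped in at load time, not wired through your code.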
Benchmark setup: precision = bf16, optimizer = AdamW, gradient checkpointing = True, distributed strategy = FSDP1 on 8 A100s. As you can see, Hugging Face models run out of memory at a 4K context length, whereas Hugging Face + Liger Kernel scales up to 16K.
This will effectively allow for extended context lengths, larger batch sizes, and support for extensive vocabularies.
Let’s dive deeper!
What’s the Buzz About Liger Kernels?
Liger Kernel, officially the LinkedIn GPU Efficient Runtime, is a set of custom Triton kernels tuned specifically for LLM training.
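To make "Triton kernel" concrete: Triton lets you write GPU kernels directly in Python, which the compiler lowers to efficient device code. Below is a deliberately tiny, toy example (elementwise addition, nothing like Liger's fused ops, and it needs a CUDA-capable GPU to run), just to show the shape of the machinery Liger builds on:

```python
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)
```

Liger's kernels follow the same pattern but fuse whole LLM layers (e.g. RMSNorm or SwiGLU) into single launches, which is where the memory and throughput wins come from.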