Bielik Anatomy 🧠 Ep.1 - Bielik LM in Triton - Can I Actually Pull This Off?
29 Jan 2026

First episode of the Bielik Anatomy series: implementing the Polish language model Bielik 1.5 (1.6B parameters) from scratch with GPU kernels written in OpenAI Triton.
This episode covers the Bielik 1.5 Instruct architecture, Grouped Query Attention (GQA) versus Multi-Head Attention, the SwiGLU activation and RMSNorm, and an introduction to GPU programming in Triton. It also lays out the roadmap for the full 8-episode series, including Flash Attention, RoPE, and custom kernels.
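To give a flavor of the Triton material, here is a minimal sketch of an RMSNorm kernel of the kind the series builds toward. This is an illustrative assumption, not code from the episode: the kernel name, shapes, and Python wrapper are hypothetical, and each program instance normalizes one row of the input.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def rmsnorm_kernel(x_ptr, w_ptr, y_ptr, n_cols, eps, BLOCK_SIZE: tl.constexpr):
    # Hypothetical sketch: one program instance handles one row.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    # Load the row, computing the statistics in float32 for stability.
    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.0).to(tl.float32)
    # RMSNorm: y = x / sqrt(mean(x^2) + eps) * weight
    rms = tl.sqrt(tl.sum(x * x, axis=0) / n_cols + eps)
    w = tl.load(w_ptr + cols, mask=mask, other=0.0).to(tl.float32)
    tl.store(y_ptr + row * n_cols + cols, x / rms * w, mask=mask)


def rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # x: (n_rows, n_cols) on the GPU; launch one kernel instance per row.
    y = torch.empty_like(x)
    n_rows, n_cols = x.shape
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    rmsnorm_kernel[(n_rows,)](x, weight, y, n_cols, eps, BLOCK_SIZE=BLOCK_SIZE)
    return y
```

Compared with a stock PyTorch implementation, the whole row is normalized in a single kernel launch with no intermediate tensors, which is the kind of fusion the series uses Triton for.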
Resources: