AI / NLP
Latest Update
Shrinking DistilBERT: Pruning, KD & Quantization
A deep dive into a complete three-stage compression pipeline of pruning, knowledge distillation, and quantization, reducing model size by 2.1× while preserving accuracy.
Draft
Attention Is All You Need
Exploring the foundational paper that introduced the Transformer architecture and its impact on NLP.