Thy's Roam

LLM

May 19, 2026

Architecture

  • Transformer
  • MoE
  • Linear Attention
  • FFN

Components

  • Tokenizer
  • Norm
  • RoPE
  • KV Cache

Model Family

  • BERT
  • GPT
  • DeepSeek

Infra

  • Quantization
  • PyTorch
  • FlashAttention

Training

  • Epoch vs Batch
  • Emergence
  • Optimizer

Misc

  • Agent

Backlinks

  • Emergence
  • Machine Learning

Created by Thysrael © 2026