loom.

A GPT built from scratch in pure NumPy — every gradient derived by hand — trained on Shakespeare and running live in your browser. Generate text, then inspect what each attention head was looking at while it wrote. Part of a from-scratch LLM stack with mosaic (tokenizer) and nabla (autograd) · source on GitHub.

Generate

Prompt

Temperature: 0.8

Top-k: 40

Length

loading Pyodide runtime (~10 MB, one time)…

Output

Attention — what the model looked at

Layer

Head

Row i shows where the model attended while predicting the token after position i (last 48 characters of the output). Brighter = more attention. The empty upper triangle is causality: the model cannot see the future. Different heads learn different habits — some watch the previous character, some scan for the start of the line or the speaker's name.