Week 2

LLM Deep Dive

Architecture, training, and alignment of large language models

11 arXiv papers covering transformers, scaling laws, instruction tuning, RLHF, and the major open and closed model families.

All PDF links point to raw.githubusercontent.com, so clicking one downloads the file directly. Source links go to the canonical version on arXiv, the journal, or the publisher.

2.1 · LLM Architecture Deep Dive

Attention Is All You Need
Vaswani, A., et al. (2017) — NeurIPS 2017

2.2 · Training Large Language Models

Scaling Laws for Neural Language Models
Kaplan, J., et al. (2020)
Language Models are Few-Shot Learners (GPT-3)
Brown, T., et al. (2020) — NeurIPS 2020
Training Compute-Optimal Large Language Models (Chinchilla)
Hoffmann, J., et al. (2022) — NeurIPS 2022 (worked example after this list)
GPT-4 Technical Report
OpenAI (2023)
Llama 2: Open Foundation and Fine-Tuned Chat Models
Touvron, H., et al. (2023)
The Llama 3 Herd of Models
Dubey, A., et al. (2024)

2.3 · Fine-Tuning, RLHF and Alignment

LoRA: Low-Rank Adaptation of Large Language Models
Hu, E. J., et al. (2021) — ICLR 2022 (sketch after this list)
Training language models to follow instructions with human feedback (InstructGPT)
Ouyang, L., et al. (2022) — NeurIPS 2022
Constitutional AI: Harmlessness from AI Feedback
Bai, Y., et al. (2022)
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafailov, R., et al. (2023) — NeurIPS 2023

2.4 · How AI Image Generation Works

Explanatory content only — no primary papers in this sub-lesson.