Week 2

LLM Deep Dive

Architecture, training, and alignment of large language models

11 arXiv papers covering transformers, scaling laws, instruction tuning, RLHF, and the major open and closed model families.

All PDF links point to raw.githubusercontent.com, so clicking one downloads the file directly. Source links go to the canonical version on arXiv, the journal, or the publisher.

2.1 · LLM Architecture Deep Dive

Attention Is All You Need
Vaswani, A., et al. (2017) — NeurIPS 2017

2.2 · Training Large Language Models

Scaling Laws for Neural Language Models
Kaplan, J., et al. (2020)
Language Models are Few-Shot Learners (GPT-3)
Brown, T., et al. (2020) — NeurIPS 2020
Training Compute-Optimal Large Language Models (Chinchilla)
Hoffmann, J., et al. (2022) — NeurIPS 2022 (worked example after this list)
GPT-4 Technical Report
OpenAI (2023)
Llama 2: Open Foundation and Fine-Tuned Chat Models
Touvron, H., et al. (2023)
The Llama 3 Herd of Models
Dubey, A., et al. (2024)

2.3 · Fine-Tuning, RLHF and Alignment

LoRA: Low-Rank Adaptation of Large Language Models
Hu, E. J., et al. (2021) — ICLR 2022 (sketch after this list)
Training language models to follow instructions with human feedback (InstructGPT)
Ouyang, L., et al. (2022) — NeurIPS 2022
Constitutional AI: Harmlessness from AI Feedback
Bai, Y., et al. (2022)
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafailov, R., et al. (2023) — NeurIPS 2023

2.4 · How AI Image Generation Works

Explanatory content only — no primary papers in this sub-lesson.