Week 7

AI for Data, Code & Computation

Code generation, data analysis, visualization, verification, and agentic workflows

Six papers, downloaded from across the literature on LLMs as data analysts, the data-leakage crisis, visualization, and the safety profile of agentic AI for science.

All PDFs link to raw.githubusercontent.com; clicking will download the file directly. Source links go to the canonical version on arXiv, the journal, or the publisher.

7.1 · Natural Language to Code

Is GPT-4 a Good Data Analyst?
Cheng, L., Li, X., & Bing, L. (2023)
Data Interpreter: An LLM Agent for Data Science
Hong, S., Lin, Y., Liu, B., et al. (2024)

7.2 · AI-Assisted Data Analysis in Practice

Leakage and the Reproducibility Crisis in ML-based Science
Kapoor, S., & Narayanan, A. (2023) — Patterns 4(9): 100804

7.3 · Visualization with AI

LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models
Dibia, V. (2023) — ACL System Demos

7.4 · Verification of AI-Generated Code

Re-uses Kapoor & Narayanan (above) as the central reading.

7.5 · Building Your Data Analysis Workflow

References Mineault (2026) Claude Code for Scientists — a Substack post, linked in the lesson.

7.6 · Agentic Data Analysis

Agentic AI for Scientific Discovery: A Survey of Progress, Challenges, and Future Directions
Gridach, M., Nanavati, J., Abidine, K. Z. E., et al. (2025)
Risks of AI scientists: prioritizing safeguarding over autonomy
Tang, X., Jin, Q., Zhu, K., et al. (2025) — Nature Communications

7.7 · Hands-On Activities and Assessment

Assessment design.

Other Week 7 references are practitioner resources rather than papers.

Linked but not redistributed

Nature (2026). AI scientists are changing research — institutions must respond. DOI:10.1038/d41586-026-00934-w (cited in 7.6)
Nature editorial — free to read on Nature.com but no redistributable PDF endpoint.