Six papers downloaded, spanning the literature on LLMs as data analysts, the data-leakage crisis, visualization, and the safety profile of agentic AI for science.
All PDFs link to raw.githubusercontent.com; clicking will download the file directly. Source links go to the canonical version on arXiv, the journal, or the publisher.
7.1 · Natural Language to Code
Is GPT-4 a Good Data Analyst?
Data Interpreter: An LLM Agent for Data Science
7.2 · AI-Assisted Data Analysis in Practice
Leakage and the Reproducibility Crisis in ML-based Science
7.3 · Visualization with AI
LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models
7.4 · Verification of AI-Generated Code
Re-uses Kapoor & Narayanan (above) as the central reading.
7.5 · Building Your Data Analysis Workflow
References Mineault (2026) Claude Code for Scientists — a Substack post, linked in the lesson.
7.6 · Agentic Data Analysis
Agentic AI for Scientific Discovery: A Survey of Progress, Challenges, and Future Directions
Risks of AI scientists: prioritizing safeguarding over autonomy
7.7 · Hands-On Activities and Assessment
Assessment design.
Other Week 7 references are practitioner resources rather than papers:
- Mineault, P. (2026). Claude Code for Scientists. neuroai.science
- Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science (2e). r4ds.hadley.nz
- Wilke, C. (2019). Fundamentals of Data Visualization. clauswilke.com/dataviz
Linked but not redistributed
Nature (2026). AI scientists are changing research — institutions must respond. DOI:10.1038/d41586-026-00934-w (Section 7.6)
Nature editorial — free to read on Nature.com but no redistributable PDF endpoint.