Introduction to Data Science: From Curiosity to Code
36 chapters
~54 hours total
266 sections
Frontmatter
4 chapters
Part I: Welcome to Data Science
6 chapters
- Chapter 1: What Is Data Science? (And What It Isn't) — A Map of the Field
- Chapter 2: Setting Up Your Toolkit: Python, Jupyter, and Your First Notebook
- Chapter 3: Python Fundamentals I — Variables, Data Types, and Expressions
- Chapter 4: Python Fundamentals II: Control Flow, Functions, and Thinking Like a Programmer
- Chapter 5: Working with Data Structures: Dictionaries, Files, and Thinking in Data
- Chapter 6: Your First Data Analysis — Loading, Exploring, and Asking Questions of Real Data
Part II: Data Wrangling
7 chapters
- Chapter 7: Introduction to pandas — DataFrames, Series, and the Grammar of Data Manipulation
- Chapter 8: Cleaning Messy Data: Missing Values, Duplicates, Type Errors, and the 80% of the Job
- Chapter 9: Reshaping and Transforming Data — Merge, Join, Pivot, Melt, and GroupBy
- Chapter 10: Working with Text Data — String Methods, Regular Expressions, and Extracting Meaning
- Chapter 11: Working with Dates, Times, and Time Series Data
- Chapter 12: Getting Data from Files — CSVs, Excel, JSON, and Databases
- Chapter 13: Getting Data from the Web — APIs, Web Scraping, and Building Your Own Datasets
Part III: Data Visualization
5 chapters
- Chapter 14: The Grammar of Graphics — Why Visualization Matters and How to Think About Charts
- Chapter 15: matplotlib Foundations — Building Charts from the Ground Up
- Chapter 16: Statistical Visualization with seaborn
- Chapter 17: Interactive Visualization — plotly, Dashboard Thinking
- Chapter 18: Visualization Design — Principles, Accessibility, Ethics, and Common Mistakes
Part IV: Statistical Thinking
6 chapters
- Chapter 19: Descriptive Statistics — Center, Spread, Shape, and the Stories Numbers Tell
- Chapter 20: Probability Thinking — Uncertainty, Randomness, and Why Your Intuition Lies
- Chapter 21: Distributions and the Normal Curve — The Shape That Shows Up Everywhere
- Chapter 22: Sampling, Estimation, and Confidence Intervals — How to Learn About Millions from a Handful
- Chapter 23: Hypothesis Testing — Making Decisions with Data (and What P-Values Actually Mean)
- Chapter 24: Correlation, Causation, and the Danger of Confusing the Two
Part V: First Models
6 chapters
- Chapter 25: What Is a Model? Prediction, Explanation, and the Bias-Variance Tradeoff
- Chapter 26: Linear Regression — Your First Predictive Model
- Chapter 27: Logistic Regression and Classification — Predicting Categories
- Chapter 28: Decision Trees and Random Forests — Models You Can Explain to Your Boss
- Chapter 29: Evaluating Models — Accuracy, Precision, Recall, and Why "Good" Depends on the Question
- Chapter 30: The Machine Learning Workflow — Pipelines, Validation, and Putting It All Together
Part VI: The Data Science Profession
6 chapters
- Chapter 31: Communicating Results: Reports, Presentations, and the Art of the Data Story
- Chapter 32: Ethics in Data Science: Bias, Privacy, Consent, and Responsible Practice
- Chapter 33: Reproducibility and Collaboration: Git, Environments, and Working with Teams
- Chapter 34: Building Your Portfolio: Projects That Get You Hired
- Chapter 35: Capstone Project: A Complete Data Science Investigation
- Chapter 36: What's Next: Career Paths, Continuous Learning, and the Road to Intermediate Data Science
Appendices
9 chapters