Introduction to Data Science: From Curiosity to Code
36 chapters
~54 hours total
266 pages
1
Frontmatter
4 chapters
2
Part I: Welcome to Data Science
6 chapters
- Chapter 1: What Is Data Science? (And What It Isn't) — A Map of the Field
- Chapter 2: Setting Up Your Toolkit: Python, Jupyter, and Your First Notebook
- Chapter 3: Python Fundamentals I — Variables, Data Types, and Expressions
- Chapter 4: Python Fundamentals II: Control Flow, Functions, and Thinking Like a Programmer
- Chapter 5: Working with Data Structures: Dictionaries, Files, and Thinking in Data
- Chapter 6: Your First Data Analysis — Loading, Exploring, and Asking Questions of Real Data
3
Part II: Data Wrangling
7 chapters
- Chapter 7: Introduction to pandas — DataFrames, Series, and the Grammar of Data Manipulation
- Chapter 8: Cleaning Messy Data: Missing Values, Duplicates, Type Errors, and the 80% of the Job
- Chapter 9: Reshaping and Transforming Data — Merge, Join, Pivot, Melt, and GroupBy
- Chapter 10: Working with Text Data — String Methods, Regular Expressions, and Extracting Meaning
- Chapter 11: Working with Dates, Times, and Time Series Data
- Chapter 12: Getting Data from Files — CSVs, Excel, JSON, and Databases
- Chapter 13: Getting Data from the Web — APIs, Web Scraping, and Building Your Own Datasets
4
Part III: Data Visualization
5 chapters
- Chapter 14: The Grammar of Graphics — Why Visualization Matters and How to Think About Charts
- Chapter 15: matplotlib Foundations — Building Charts from the Ground Up
- Chapter 16: Statistical Visualization with seaborn
- Chapter 17: Interactive Visualization — plotly, Dashboard Thinking
- Chapter 18: Visualization Design — Principles, Accessibility, Ethics, and Common Mistakes
5
Part IV: Statistical Thinking
6 chapters
- Chapter 19: Descriptive Statistics — Center, Spread, Shape, and the Stories Numbers Tell
- Chapter 20: Probability Thinking — Uncertainty, Randomness, and Why Your Intuition Lies
- Chapter 21: Distributions and the Normal Curve — The Shape That Shows Up Everywhere
- Chapter 22: Sampling, Estimation, and Confidence Intervals — How to Learn About Millions from a Handful
- Chapter 23: Hypothesis Testing — Making Decisions with Data (and What P-Values Actually Mean)
- Chapter 24: Correlation, Causation, and the Danger of Confusing the Two
6
Part V: First Models
6 chapters
- Chapter 25: What Is a Model? Prediction, Explanation, and the Bias-Variance Tradeoff
- Chapter 26: Linear Regression — Your First Predictive Model
- Chapter 27: Logistic Regression and Classification — Predicting Categories
- Chapter 28: Decision Trees and Random Forests — Models You Can Explain to Your Boss
- Chapter 29: Evaluating Models — Accuracy, Precision, Recall, and Why "Good" Depends on the Question
- Chapter 30: The Machine Learning Workflow — Pipelines, Validation, and Putting It All Together
7
Part VI: The Data Science Profession
6 chapters
- Chapter 31: Communicating Results: Reports, Presentations, and the Art of the Data Story
- Chapter 32: Ethics in Data Science: Bias, Privacy, Consent, and Responsible Practice
- Chapter 33: Reproducibility and Collaboration: Git, Environments, and Working with Teams
- Chapter 34: Building Your Portfolio: Projects That Get You Hired
- Chapter 35: Capstone Project: A Complete Data Science Investigation
- Chapter 36: What's Next: Career Paths, Continuous Learning, and the Road to Intermediate Data Science
8
Appendices
9 chapters