Case Study 2: The Portfolio Review: What Hiring Managers Actually Look For


Tier 3 — Illustrative/Composite Example: The hiring managers and portfolio evaluations described in this case study are composite characters based on widely documented hiring practices in data science. Interview processes, evaluation criteria, and feedback are synthesized from industry reports, hiring manager blog posts, recruiting webinars, and career guides published in the data science community. No specific real person or company is represented. All names, companies, and scenarios are constructed for pedagogical purposes.


The Setup

Meet three hiring managers who review data science portfolios regularly. Each works in a different context, values different things, and has seen hundreds of portfolios — most of which, they'll tell you candidly, are forgettable. Let's sit behind their desks for a day and see what catches their eye, what makes them cringe, and what separates the candidates they call back from the ones they don't.


Reviewer 1: Sofia — Data Science Manager at a Healthcare Company

Her Context

Sofia manages a team of six data scientists at a mid-sized healthcare analytics company. Her team builds models that help hospitals predict patient readmissions, identify high-risk populations, and optimize resource allocation. She's hiring for a junior data scientist position and has received 280 applications.

"The first thing I do," Sofia explains, "is sort applications into three piles: yes, maybe, and no. The 'yes' pile usually has about 15 people. To get into that pile, you need to show me two things: you can do the work, and you can communicate the results. I need both. If I see great code but no writing, or great writing but no code, you're in the 'maybe' pile."

What She Looks For

A clear question at the start of every project. "I can't tell you how many portfolios I've seen where someone loads a dataset and starts plotting things with no context. I don't know what they're investigating. I don't know why they chose this data. I don't know what they expected to find. It's like opening a book to a random page — I'm lost immediately."

Sofia's test: "If I read the first three cells of your notebook and I don't know what question you're trying to answer, I close the tab."

Evidence of data cleaning decisions. "In healthcare, data is always messy. Medical records have missing fields, inconsistent coding, duplicate entries. If your portfolio project doesn't show that you dealt with messy data — and more importantly, that you thought about how to deal with it — I'm skeptical that you're ready for real-world work."

She's particularly impressed when candidates document why they made a specific cleaning choice: "I once saw a notebook where the candidate explained that they chose median imputation over mean imputation for a skewed variable because the mean was being pulled by outliers. That one sentence told me more about their statistical understanding than any certification would."
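That one-sentence justification is easy to demonstrate. A minimal sketch, using invented numbers, of why the median is the safer imputation value for a right-skewed variable:

```python
# Illustrative only: a skewed variable (say, length of stay in days)
# with one extreme outlier. The mean gets pulled up; the median doesn't.
from statistics import mean, median

length_of_stay = [2, 3, 3, 4, 4, 5, 6, 45]  # 45 is the outlier

print("mean:", mean(length_of_stay))      # pulled upward by the outlier
print("median:", median(length_of_stay))  # robust to the outlier
```

Imputing missing values with the mean here would overstate the typical patient's stay; the median reflects the bulk of the distribution. That is the statistical understanding Sofia is describing, expressed in one cleaning decision.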

Honest limitations. "The candidates who acknowledge what their analysis can't tell you are almost always stronger than the ones who present everything as definitive. In healthcare, being wrong has consequences. I need people who understand uncertainty."

A Portfolio That Impressed Her

Sofia recalls a candidate who had built a project analyzing emergency department wait times using publicly available hospital quality data from the Centers for Medicare and Medicaid Services (CMS). "What made it special wasn't the data or the methods — it was the thinking. The candidate noticed that wait times varied more within states than between states, which contradicted the common narrative that wait times are a state-level policy issue. They hypothesized that local factors — hospital ownership type, urban vs. rural setting, and emergency department capacity — were driving the variation. They tested this with a multi-level regression model and found that hospital ownership (for-profit vs. nonprofit vs. government) explained a surprising amount of the variance."

What sealed the deal: "The candidate's conclusion acknowledged that hospital ownership might be a proxy for other factors — like the socioeconomic profile of the patient population — and suggested specific additional analyses to disentangle these effects. That showed me someone who thinks like a scientist, not just a modeler."

A Portfolio That Missed the Mark

"I reviewed a portfolio from someone with impressive credentials — master's degree, multiple certifications, three years of coursework. Their GitHub had eight projects. But every single one was a Kaggle competition entry. Every one followed the same pattern: load data, minimal cleaning (because Kaggle data is usually pre-cleaned), train five different models, report accuracy, done. No writing. No questions. No interpretation. No domain context."

"This person clearly knew how to code. But I had no idea whether they could think. I couldn't tell if they understood why they were choosing random forest versus logistic regression, or whether they could explain their results to a physician who would use them to make decisions about patients. They went into the 'no' pile."


Reviewer 2: Tomás — Head of Analytics at an E-Commerce Startup

His Context

Tomás leads a three-person analytics team at a fast-growing e-commerce startup. He's hiring for a data analyst position — someone who can pull insights from data, build dashboards, and communicate findings to the founding team. The role is more SQL and visualization than machine learning.

"At a startup, I don't need someone who can build a neural network. I need someone who can answer the question 'why did revenue drop 15% last Tuesday?' in two hours and present the answer on a slide. Speed, clarity, and business intuition matter more than technical depth."

What He Looks For

Business-relevant questions. "I want to see that you can think about data from a business perspective. If your project is about predicting something, I want to know who would use this prediction and what decision it informs. If you built a churn model, great — but did you translate it into 'here are the three things we should do to retain these customers'? If not, it's an academic exercise."

Visualization quality. "For a data analyst role, I look at your charts more carefully than your code. Are your charts readable? Do they have proper titles and labels? Would I put this chart in a presentation to our CEO? If the answer is no — if the axes are unlabeled, the colors are default matplotlib, the chart type is wrong for the data — that's a problem."

Tomás's specific pet peeve: "Pie charts for more than four categories. If I see a pie chart with twelve slices, I know the candidate hasn't thought about visualization design."
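One standard fix for the twelve-slice pie is to collapse the long tail into an "Other" bucket before charting. A hedged sketch with made-up category counts:

```python
# Illustrative only: keep the top N categories and lump the rest into
# "Other" so the chart stays readable (ideally as a bar chart, not a pie).
from collections import Counter

sales = Counter({
    "Electronics": 40, "Clothing": 25, "Home": 12, "Toys": 8,
    "Books": 5, "Garden": 4, "Sports": 3, "Beauty": 2, "Auto": 1,
})

top_n = 3
top = sales.most_common(top_n)
other = sum(sales.values()) - sum(count for _, count in top)
chart_data = dict(top + [("Other", other)])

print(chart_data)  # four categories instead of nine
```

Four labeled bars communicate the same information as twelve slices, and a CEO can read them in two seconds. That pre-charting aggregation step is exactly the kind of design thinking Tomás is testing for.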

Writing that a non-technical person could follow. "Our CEO is brilliant but not technical. Every insight from my team has to be communicated in plain language. If your portfolio project can only be understood by someone with a statistics degree, it's not useful to me. Show me you can write for a general audience."

A Portfolio That Impressed Him

"One candidate had a project where they analyzed their own coffee shop spending over two years. They had exported their bank transactions, categorized them, and answered questions like: How much did I spend on coffee per month? Did my spending increase over time? Was I spending more at certain locations? Did the introduction of a loyalty card actually save me money?"

"It was a simple project — no machine learning, no advanced statistics. But it was perfectly executed. The notebook was clean, the charts were beautiful, and the writing was genuinely funny. The candidate calculated their annual coffee spending and compared it to the cost of a home espresso machine, concluding (with confidence intervals!) that the machine would pay for itself in 7.3 months. I laughed, and then I called them in for an interview."
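The payback calculation, confidence interval included, is simple to reproduce in spirit. A sketch with invented spending figures (the candidate's 7.3-month result came from their own data):

```python
# Hypothetical numbers: twelve months of coffee spending and the price of
# an espresso machine. A simple bootstrap gives the payback estimate an
# honest uncertainty range instead of a single point.
import random
import statistics

random.seed(0)
monthly_spend = [62, 75, 58, 90, 71, 66, 84, 69, 73, 61, 88, 70]  # invented
machine_cost = 500.0

def payback_months(sample):
    return machine_cost / statistics.mean(sample)

point = payback_months(monthly_spend)

# Bootstrap: resample the months with replacement, recompute payback,
# and take the middle 95% of the resulting estimates.
boots = sorted(
    payback_months(random.choices(monthly_spend, k=len(monthly_spend)))
    for _ in range(2000)
)
lo, hi = boots[49], boots[1949]

print(f"payback ≈ {point:.1f} months (95% CI {lo:.1f} to {hi:.1f})")
```

Nothing here is advanced, but reporting the interval alongside the point estimate is what turned a joke about coffee into evidence of statistical judgment.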

What made it work: "The project showed personality, analytical intuition, and communication skill. It proved the candidate could take a messy, real-world dataset (bank transactions are surprisingly messy), ask a practical question, and deliver a clear answer. That's exactly what I need every day."

A Portfolio That Missed the Mark

"A candidate had three projects, all of which involved training deep learning models. The code was impressive — custom architectures, GPU optimization, sophisticated preprocessing pipelines. But the projects had no business context. There was no question being answered, no stakeholder being served, no decision being informed. The candidate was clearly talented, but not for this role. They were optimizing for the wrong audience."

"It's not that deep learning is bad — it's that you need to match your portfolio to the role you're applying for. If you're applying for a data analyst position at a startup, show me you can make a clean bar chart and write a clear summary, not that you can implement a transformer from scratch."


Reviewer 3: Priya — Senior Data Scientist at a Large Financial Services Firm

Her Context

Priya leads hiring for the data science team at a large bank. Her team handles risk modeling, fraud detection, and customer segmentation. She's hiring for a junior data scientist position and receives 400+ applications per opening.

"At our scale, everyone who applies has the basic skills. Python, pandas, scikit-learn — that's the minimum. What differentiates candidates is the quality of their thinking, the rigor of their analysis, and whether they understand the ethical dimensions of working with data."

What She Looks For

Statistical rigor. "I look for candidates who understand the difference between correlation and causation, who know what p-values actually mean (and what they don't), who can identify confounders, and who are skeptical of their own results. Finance is a domain where overconfident models lose real money. I need people who are careful."

Her test: "I look at how candidates handle model evaluation. If someone reports training accuracy and doesn't mention test accuracy, overfitting, or cross-validation, that's a red flag. If they report an ROC-AUC of 0.99 and don't express any suspicion that something might be wrong (data leakage, overfitting, trivially separable classes), I worry about their judgment."
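Priya's train-versus-holdout check is cheap to run on any project. A sketch on synthetic data, assuming scikit-learn is available:

```python
# Illustrative only: an unconstrained decision tree memorizes its training
# data, so training accuracy alone says nothing. Cross-validation exposes
# the gap that Priya treats as a red flag.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
model = DecisionTreeClassifier(random_state=0)

train_acc = model.fit(X, y).score(X, y)             # evaluated on training data
cv_acc = cross_val_score(model, X, y, cv=5).mean()  # honest out-of-sample estimate

print(f"train accuracy: {train_acc:.2f}")
print(f"cross-val accuracy: {cv_acc:.2f}")
```

A portfolio project that reports both numbers, and comments on the gap between them, answers Priya's test before she asks it.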

Ethical awareness. "In financial services, our models affect people's access to credit, insurance, and banking services. A model that systematically disadvantages certain demographic groups isn't just wrong — it's potentially illegal. I look for candidates who at least think about fairness, even if they don't implement advanced fairness metrics. A sentence in a portfolio project like 'I checked whether the model's error rate differed across demographic groups and found that...' shows me someone who will be responsible with our data."
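The per-group error-rate check Priya describes takes only a few lines. A sketch with hypothetical audit records (the groups and outcomes are invented):

```python
# Illustrative only: (group, actual, predicted) triples from a hypothetical
# model audit. Comparing error rates across groups is the minimal fairness
# check Priya wants to see in a portfolio.
from collections import defaultdict

records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 0), ("B", 1, 0), ("B", 0, 0), ("B", 1, 1),
]

errors = defaultdict(lambda: [0, 0])  # group -> [wrong, total]
for group, actual, predicted in records:
    errors[group][0] += int(actual != predicted)
    errors[group][1] += 1

for group, (wrong, total) in sorted(errors.items()):
    print(f"group {group}: error rate {wrong / total:.2f}")
```

In this made-up data, group B's error rate is double group A's. Reporting that disparity, and discussing why it might arise, is the one-sentence signal of responsibility she mentions.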

Reproducibility. "Can I actually run your analysis? Is there a requirements.txt? Are the data sources documented? Is the code clean enough that someone else could understand it? In a corporate setting, your code isn't just for you — other people will need to maintain it, audit it, and build on it. Portfolio projects that demonstrate good engineering habits — clear documentation, version control, reproducible environments — signal that the candidate is ready for professional work."
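You can even script the basics Priya lists as a pre-publish check. A hypothetical sketch (the file names below are common conventions, not requirements, and the example repo path is invented):

```python
# Hypothetical pre-publish checklist: does the repository contain the
# artifacts a reviewer needs to reproduce and understand the work?
from pathlib import Path

def reproducibility_report(repo: Path) -> dict:
    """Report which reproducibility basics are present in a project repo."""
    return {
        "README.md": (repo / "README.md").is_file(),
        "requirements.txt": (repo / "requirements.txt").is_file(),
        ".gitignore": (repo / ".gitignore").is_file(),
    }

# Usage sketch (path is illustrative):
# print(reproducibility_report(Path("my-portfolio-project")))
```

Anything that comes back False is worth fixing before you share the link: a pinned `requirements.txt` (for example, from `pip freeze`) and a real README are the difference between "can I run this?" and "I ran this."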

A Portfolio That Impressed Her

"A candidate built a project analyzing whether automated loan approval algorithms produce different outcomes for different demographic groups, using the publicly available HMDA (Home Mortgage Disclosure Act) data. They didn't just train a model — they trained a model, then systematically analyzed its error patterns across race, gender, and income groups. They found that their model's false negative rate (denying a loan that would have been repaid) was higher for minority applicants, discussed why this might happen (historical data reflecting historical discrimination), and proposed mitigation strategies."

"This candidate understood something crucial: the model isn't the product. The decision is the product. And decisions have consequences for real people. That ethical awareness, combined with strong technical execution, made this person our top choice."

A Portfolio That Missed the Mark

"A candidate had a portfolio full of projects with impressive accuracy numbers. But when I looked closely, I noticed several issues. In one project, they used future data as a feature to predict past events — a classic data leakage error that inflated their accuracy. In another, they used accuracy as the sole metric for a heavily imbalanced classification problem, where simply predicting the majority class would have achieved 95%. In a third, they merged two datasets on a field that had different definitions in each source, producing nonsensical results."
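The imbalanced-accuracy trap in the second project is easy to catch with a baseline check: always compare your model's accuracy against "predict the majority class." A minimal sketch with invented labels:

```python
# Illustrative only: on a 95/5 class split, a "model" that always predicts
# the majority class scores 95% accuracy while learning nothing. Any real
# model must clear this baseline before its accuracy means anything.
from collections import Counter

y_true = [0] * 95 + [1] * 5  # heavily imbalanced labels
majority = Counter(y_true).most_common(1)[0][0]

baseline_acc = sum(label == majority for label in y_true) / len(y_true)
print(f"majority-class baseline accuracy: {baseline_acc:.2f}")
```

If your model's accuracy is near the baseline, accuracy is the wrong metric for the problem; precision, recall, or ROC-AUC on the minority class will tell you what is actually happening.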

"These are mistakes that many beginners make, and they're forgivable in a learning context. But in a portfolio — a curated collection of your best work — they suggest a lack of critical thinking about the analysis itself. The fix is simple: before publishing a portfolio project, have someone else review it. A fresh pair of eyes catches errors that you've become blind to."


Synthesis: The Universal Principles

Despite working in different industries with different needs, Sofia, Tomás, and Priya agree on several universal principles:

What Every Hiring Manager Wants to See

  1. A question, not just a dataset. Every project should start with a specific, interesting question. "I explored this dataset" is not a project; "I investigated whether X causes Y" is.

  2. Written thinking, not just code. The narrative around your analysis matters as much as the analysis itself. Write clearly. Explain your decisions. Interpret your results.

  3. Honest limitations. Acknowledge what your analysis can and cannot tell you. This isn't weakness — it's intellectual maturity.

  4. Appropriate methods. Match your techniques to the problem. A simple method applied well is more impressive than a complex method applied poorly.

  5. Clean presentation. Proper chart labels, clear variable names, organized notebooks. Presentation isn't superficial — it's a signal of professional standards.

What Makes Every Hiring Manager Close the Tab

  1. Empty or missing READMEs. If you can't be bothered to explain your project, why would they bother to read it?

  2. Tutorial reproductions. Kaggle/tutorial projects with no original thinking, no original questions, and no written analysis.

  3. Code-only notebooks. Walls of code with no Markdown, no explanation, no context.

  4. Overstatement. "My model proves that..." or "This analysis definitively shows..." — claims that go beyond what the data supports.

  5. Abandoned projects. Half-finished work, empty directories, repositories with README files that say "TODO."


Your Action Items

Based on what these hiring managers value, evaluate your own portfolio against their criteria:

  1. Open your portfolio notebook. Read the first three cells. Would Sofia know what question you're investigating? If not, write a better introduction.

  2. Look at your visualizations. Would Tomás put any of them in a CEO presentation? If not, improve the labels, titles, colors, and chart type choices.

  3. Review your model evaluation. Would Priya find any red flags? Check for data leakage, appropriate metrics, and honest interpretation.

  4. Read your conclusions. Are they honest about limitations? Would all three reviewers trust your judgment?

  5. Check your README. Does it explain the project clearly enough that a stranger could understand what you did and why? If not, rewrite it.


Discussion Questions

  1. Sofia, Tomás, and Priya prioritize different things because they work in different contexts. If you could target your portfolio at one of these three reviewers, which would it be, and how would you adjust your projects accordingly?

  2. Tomás was impressed by a coffee-spending analysis — a simple project with no machine learning. Does this surprise you? What does it suggest about what "impressive" means in data science?

  3. Priya flagged a portfolio with data leakage and inappropriate metrics — common beginner mistakes. How could the candidate have caught these errors before publishing? What review practices would you implement for your own work?

  4. All three reviewers valued written communication alongside technical skill. Given that many data science courses emphasize coding over writing, how will you develop your technical writing skills?

  5. Consider the portfolio that impressed Sofia — the emergency department wait times analysis. What made it a strong healthcare portfolio piece specifically? How would you tailor a project to a specific industry?