Chapter 30 Further Reading: HR Analytics and People Data
The resources below are verified and real. Each annotation explains what you will find there and why it is relevant at this point in your learning.
Books
"The HR Analytics Manifesto" by David Green and Jonathan Ferrar, Kogan Page (2021)
A practitioner-focused guide to building an HR analytics capability in an organization. Green and Ferrar cover the full spectrum from data quality through advanced analytics, with emphasis on connecting people data to business outcomes. Particularly relevant for Chapter 30 readers: the chapters on data ethics, privacy, and stakeholder communication. One of the more readable texts on HR analytics for non-statisticians.
"Work Rules!: Insights from Inside Google That Will Transform How You Live and Lead" by Laszlo Bock, Twelve (2015)
Bock was Google's Head of People Operations during its rapid growth phase. The book covers Google's data-driven approach to HR decisions: structured interviewing, performance calibration, manager effectiveness, and compensation philosophy. Highly relevant to the analytics concepts in this chapter, presented through real decisions made at one of the world's best-known technology employers. Not a technical book: it focuses on what the data enabled rather than how the analytics were built.
"People Analytics in the Era of Big Data" by Jean Paul Isson and Jesse Harriott, Wiley (2016)
More technical than the other titles here. Covers the full analytics pipeline from data collection through predictive modeling for HR outcomes. The chapters on retention prediction and workforce planning are directly relevant to Chapter 30. Requires some comfort with statistics; pandas-familiar readers will find the concepts straightforward.
"Invisible Women: Data Bias in a World Designed for Men" by Caroline Criado Perez, Abrams Press (2019)
Not an HR analytics textbook, but essential background reading for anyone doing pay equity or diversity analytics. Criado Perez documents how default assumptions in data collection create gaps that make women and other under-represented groups invisible in datasets. The implications for HR analytics are significant: the data you analyze reflects historical decisions, and treating historical patterns as objective benchmarks can perpetuate inequity. Chapter 30's caution about pay equity analysis — always analyze at the job level, never company-wide — reflects the concerns this book articulates.
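To see why the job-level caution matters, here is a minimal sketch using a toy dataset. Everything in it is hypothetical: the column names (job_title, gender, salary) and the figures are invented for illustration and are not taken from the chapter's dataset. Pay is identical within each job, yet a company-wide comparison shows a large gap because the gender mix differs between jobs.

```python
import pandas as pd

# Hypothetical toy data: pay is identical within each job title,
# but the gender mix differs between the two jobs.
df = pd.DataFrame({
    "job_title": ["Analyst"] * 4 + ["Director"] * 4,
    "gender": ["F", "F", "F", "M", "F", "M", "M", "M"],
    "salary": [60000] * 4 + [120000] * 4,
})

# Company-wide comparison: mixes job composition with pay differences.
company_wide = df.groupby("gender")["salary"].median()

# Job-level comparison: median salary per job title, split by gender.
by_job = df.pivot_table(
    index="job_title", columns="gender", values="salary", aggfunc="median"
)
by_job["gap_pct"] = (by_job["M"] - by_job["F"]) / by_job["M"] * 100

print(company_wide)
print(by_job)
```

The company-wide medians differ, while gap_pct is zero for both job titles. Neither view is a complete pay equity analysis. The company-wide number still raises a real question about who holds which jobs, but it is a different question from whether people doing the same job are paid the same.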
Official Data Sources and Benchmarks
Society for Human Resource Management (SHRM) — Human Capital Benchmarking Report https://www.shrm.org/hr-today/trends-and-forecasting/research-and-surveys
SHRM publishes an annual benchmarking report with turnover rates, time-to-fill, cost-per-hire, and absence rates by industry, company size, and geography. This is one of the most widely cited sources for HR benchmarks. The full report is available to SHRM members; summaries are often available free. Always note the publication year when citing these benchmarks — they change meaningfully year-over-year.
Bureau of Labor Statistics — Job Openings and Labor Turnover Survey (JOLTS) https://www.bls.gov/jlt/
The BLS publishes monthly JOLTS data: job openings, hires, and separations by industry sector. This is the official U.S. government source for national turnover rates. Free, publicly available, and updated monthly. Use this when you need to contextualize your organization's turnover within the broader labor market.
Bureau of Labor Statistics — National Compensation Survey https://www.bls.gov/ncs/
The NCS publishes compensation data by occupation, industry, and region, and compensation consultants use it as a baseline for pay benchmarking. Free and publicly available. A separate BLS program, Occupational Employment and Wage Statistics (OEWS, https://www.bls.gov/oes/), provides median and percentile wages by occupation code down to the metropolitan area level, which makes it useful for regional salary benchmarking.
CIPD (Chartered Institute of Personnel and Development) — People Management Resources https://www.cipd.org/uk/knowledge/
CIPD is the UK's professional body for HR and people development. Their research publications cover absence rates, turnover benchmarks, and people analytics practices. Particularly useful for readers outside the US. Many of their guides and surveys are freely downloadable.
Online Tutorials and Documentation
pandas Documentation — Pivot Tables https://pandas.pydata.org/docs/user_guide/reshaping.html
The official pandas documentation for pivot_table() and related reshaping functions. The HR pivot tables in this chapter all use pivot_table() with an explicitly named values column plus the aggfunc and fill_value arguments. The "Reshaping and pivot tables" section of the docs covers all the options you will need.
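As a quick reminder of how those arguments fit together, here is a minimal sketch. The DataFrame and its column names (department, level, salary) are invented for illustration and are not the chapter's dataset.

```python
import pandas as pd

# Hypothetical employee records, for illustration only.
employees = pd.DataFrame({
    "department": ["Sales", "Sales", "Engineering", "Engineering", "HR"],
    "level": ["Junior", "Senior", "Junior", "Junior", "Senior"],
    "salary": [50000, 80000, 70000, 74000, 65000],
})

# Headcount by department and level. Combinations with no employees
# would be NaN by default; fill_value=0 replaces them with 0.
headcount = employees.pivot_table(
    index="department",
    columns="level",
    values="salary",
    aggfunc="count",
    fill_value=0,
)

print(headcount)
```

fill_value=0 is a sensible choice for counts, where an empty cell really does mean zero people. For averages such as mean salary it is usually better to leave empty cells as NaN, because a zero would read as a real salary figure.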
Real Python — "Pandas pivot_table Explained" https://realpython.com/pandas-pivot-table-python/
A tutorial-style walkthrough of pivot_table() with business examples. More readable than the official docs. Covers margins=True (automatic totals rows/columns) and multi-level indexing, both of which are useful for HR reporting beyond what is covered in this chapter.
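If you want to try margins=True before reading the tutorial, the sketch below shows the idea on a small invented table of exits. The column names (department, exit_type, employee_id) and the label "Total" are assumptions chosen for this example.

```python
import pandas as pd

# Hypothetical exit records, for illustration only.
exits = pd.DataFrame({
    "department": ["Sales", "Sales", "Sales", "Engineering", "Engineering"],
    "exit_type": ["Voluntary", "Voluntary", "Involuntary", "Voluntary", "Involuntary"],
    "employee_id": [101, 102, 103, 104, 105],
})

# margins=True appends a totals row and a totals column;
# margins_name controls the label (the pandas default is "All").
report = exits.pivot_table(
    index="department",
    columns="exit_type",
    values="employee_id",
    aggfunc="count",
    fill_value=0,
    margins=True,
    margins_name="Total",
)

print(report)
```

The totals are computed by applying aggfunc to the underlying rows, so with aggfunc="mean" the margin is the overall mean, not the mean of the cell means.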
matplotlib Documentation — Error Bars https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.errorbar.html
The salary IQR charts in this chapter use xerr on horizontal bar charts to show the interquartile range. The matplotlib error bar documentation explains the formatting options. The capsize parameter (controls the width of the end caps on error bars) is particularly useful for making IQR bars readable.
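The following is a minimal sketch of that pattern, not the chapter's exact chart code. The salary figures and column names are invented. It assumes the bar length is the median and the error bar runs from the 25th to the 75th percentile, which requires an asymmetric xerr with shape (2, N): the first row is the distance below the median, the second the distance above it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical salaries, for illustration only.
salaries = pd.DataFrame({
    "department": ["Sales"] * 5 + ["Engineering"] * 5,
    "salary": [40000, 50000, 60000, 70000, 80000,
               70000, 80000, 90000, 100000, 110000],
})

# Quartiles per department: one row per department, one column per quantile.
stats = salaries.groupby("department")["salary"].quantile([0.25, 0.5, 0.75]).unstack()
stats.columns = ["q1", "median", "q3"]

# Asymmetric error bars: row 0 = median - Q1, row 1 = Q3 - median.
xerr = np.vstack([
    (stats["median"] - stats["q1"]).to_numpy(),
    (stats["q3"] - stats["median"]).to_numpy(),
])

fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(stats.index.tolist(), stats["median"].to_numpy(), xerr=xerr, capsize=6)
ax.set_xlabel("Salary (bar = median, whiskers = IQR)")
fig.tight_layout()
```

Call fig.savefig("salary_iqr.png") or plt.show() to view the result. Labelling the axis with what the whiskers represent matters here, because readers often assume error bars show a standard deviation or a confidence interval.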
Privacy and Data Ethics Resources
GDPR — European Data Protection Board Guidance on HR Data https://edpb.europa.eu/our-work-tools/our-documents
If your organization is established in the EU or EEA, or processes personal data of employees located there, GDPR generally applies to that processing. The EDPB publishes official guidance on HR data topics including automated decision-making, employee monitoring, and data retention.
SHRM — Employee Privacy Resources https://www.shrm.org/resourcesandtools/tools-and-samples/toolkits
SHRM provides HR-focused guidance on data privacy, including what can be shared within the organization and what cannot. The chapter's minimum group size principle derives partly from privacy-by-design recommendations in this space.
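One way to apply a minimum group size rule in pandas is sketched below. The threshold of 5 is a hypothetical value chosen for this example, not a figure prescribed by SHRM or by any regulation, and the data and column names are invented. Use whatever threshold your organization's policy specifies.

```python
import pandas as pd

MIN_GROUP_SIZE = 5  # hypothetical threshold; set this from your own policy

# Hypothetical employee records, for illustration only.
employees = pd.DataFrame({
    "department": ["Sales"] * 6 + ["Legal"] * 2,
    "salary": [50000, 52000, 55000, 58000, 60000, 61000, 90000, 95000],
})

summary = employees.groupby("department")["salary"].agg(
    headcount="count",
    median_salary="median",
)

# Blank out the statistic for any group below the threshold so that
# individual salaries cannot be inferred from a tiny group.
too_small = summary["headcount"] < MIN_GROUP_SIZE
summary["median_salary"] = summary["median_salary"].mask(too_small)

print(summary)
```

Suppressed cells appear as NaN, which you can relabel (for example as "suppressed") when formatting the report. Depending on your policy, you may also need to hide the exact headcount of small groups or fold them into an "Other" category.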
Python Libraries Introduced in This Chapter
pandas (version 2.x) — pip install pandas
Core data manipulation. All groupby, pivot, and aggregation operations.
matplotlib (version 3.x) — pip install matplotlib
Static chart generation including histograms, bar charts, and error bar plots.
numpy (version 1.26+) — pip install numpy
Random number generation for the synthetic dataset, and statistical operations.
seaborn (version 0.13+) — pip install seaborn
Used in exercises for heatmaps. Not required for the base chapter content.
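If you want to regenerate or extend a synthetic HR dataset with numpy, the sketch below shows one reproducible approach using numpy's Generator API. It is an assumption about how such a dataset could be built, not the chapter's actual generator: the column names, department mix, and distribution parameters are all invented.

```python
import numpy as np
import pandas as pd

def make_synthetic_employees(n: int = 200, seed: int = 42) -> pd.DataFrame:
    """Build a small synthetic employee table (all parameters are illustrative)."""
    rng = np.random.default_rng(seed)  # seeded generator gives reproducible output
    return pd.DataFrame({
        "employee_id": np.arange(1, n + 1),
        "department": rng.choice(
            ["Sales", "Engineering", "HR"], size=n, p=[0.5, 0.35, 0.15]
        ),
        # Salaries drawn from a normal distribution, rounded to the nearest 100.
        "salary": rng.normal(loc=70000, scale=12000, size=n).round(-2),
        # Tenure drawn from an exponential distribution: many short, few long.
        "tenure_years": rng.exponential(scale=4.0, size=n).round(1),
    })

synthetic = make_synthetic_employees()
print(synthetic.head())
```

Fixing the seed means every reader gets the same rows, which makes exercise answers comparable. Because no real employee is represented, a dataset like this can be shared freely.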
Continuing in This Book
- Chapter 27 — RFM and customer segmentation (see how the segmentation tools from HR could apply to customer analysis)
- Chapter 29 — Financial analytics: connecting headcount costs to P&L analysis
- Chapter 33 — Intro to machine learning: building an attrition prediction model (extends Exercise 5.4)
- Chapter 36 — Data privacy in Python: anonymization, encryption, and secure data handling patterns
A Closing Note on Benchmarks
Every benchmark in this chapter — whether about turnover rates, supervisor ratios, or absence rates — is drawn from published research by identified organizations (SHRM, BLS, CIPD). When you use benchmarks in your own HR analytics work:
- Always cite the source and the publication year
- Acknowledge that benchmarks represent averages across many different organizations, and your context may differ substantially
- Be explicit about what the benchmark covers (e.g., "US voluntary turnover for manufacturing, SHRM 2023") rather than presenting it as a universal standard
- Use benchmarks as a starting point for conversation, not as targets to hit or miss
The honest framing is: "Our voluntary turnover rate is 22%. SHRM data suggests the national average for our industry is approximately 15%. That gap is worth understanding — but there may be legitimate reasons for the difference, and our first step should be understanding the specifics of our situation before concluding anything."