Learning Objectives
- Trace the historical development of worker measurement from Taylorism through contemporary digital monitoring
- Analyze performance management systems (MBO, stack ranking, OKRs) as surveillance architectures
- Evaluate the KPI as a surveillance artifact and understand Goodhart's Law
- Examine call center monitoring, badge data, and communication surveillance as nested surveillance layers
- Assess the research evidence on whether monitoring actually improves performance
- Recognize how performance review conversations function as surveillance rituals
- Connect Jordan Ellis's warehouse experience to broader structural analysis
In This Chapter
- Opening Scenario: The Dashboard That Follows You
- 26.1 The Stopwatch and the Soul: Taylorism as Origin Story
- 26.2 Performance Management Systems: From MBO to Stack Ranking to OKRs
- 26.3 The KPI as Surveillance Artifact
- 26.4 Call Center Monitoring: "This Call May Be Recorded"
- 26.5 Badge Data and the Mapped Employee
- 26.6 Email and Communication Surveillance
- 26.7 Productivity Scores: Activity Percentages, Idle Time, Keystrokes
- 26.8 The Performance Review Conversation: Feedback as Surveillance Ritual
- 26.9 Jordan at the Dashboard: A Structural Analysis
- 26.10 Goodhart's Law and Measurement Perversity
- 26.11 Does Monitoring Improve Performance? The Evidence
- 26.12 Historical Continuity: The Long Thread from Taylor to Viva Insights
- 26.13 Workers' Rights and Self-Protection: A Practical Guide
- 26.14 Conclusion: The Transparent Worker
- Key Terms
- Discussion Questions
Chapter 26: Performance Reviews and the Measured Employee
Opening Scenario: The Dashboard That Follows You
Jordan Ellis arrives at the Meridian Logistics warehouse at 6:00 a.m. They badge in — a timestamp logged. They walk to their assigned station — their path tracked by overhead cameras. They pick up a scanning device — the device ID linked to their employee number. For the next ten hours, every action Jordan takes feeds a performance dashboard that their floor supervisor can view in real time: picking rate (items per hour), idle time (minutes not scanning), path efficiency (distance walked versus optimal route), and error rate (mis-scans and wrong-bin placements).
Jordan doesn't see this dashboard. They experience its effects when a supervisor walks over at 9:15 a.m. and says, "You've been running about 12% below target this morning — what's going on?" Jordan has no way to view the data that prompted this conversation, no ability to contest the algorithm's methodology, and no knowledge of how their performance score will affect their next scheduling assignment or their eligibility for permanent hire.
This is not exceptional. This is the ordinary condition of the measured employee — a condition that has been over a century in the making.
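Dashboards like the one Jordan cannot see are, at bottom, simple arithmetic over scanner timestamps. A minimal sketch in Python (the function name, the five-minute idle cutoff, and the sample data are all illustrative assumptions, not Meridian's actual system):

```python
from datetime import datetime, timedelta

def dashboard_metrics(scan_times, shift_start, shift_end, idle_gap_min=5):
    """Derive two of the dashboard's metrics from raw scan timestamps.

    scan_times: time-ordered datetimes, one per item scanned.
    Any gap between scans longer than idle_gap_min minutes counts as
    "idle" -- the same crude inference real systems make, with no way
    to distinguish a jammed bin or a restroom line from slacking.
    """
    hours = (shift_end - shift_start).total_seconds() / 3600
    picking_rate = len(scan_times) / hours  # items per hour

    idle = timedelta(0)
    prev = shift_start
    for t in list(scan_times) + [shift_end]:
        if t - prev > timedelta(minutes=idle_gap_min):
            idle += t - prev
        prev = t
    return picking_rate, idle

start = datetime(2024, 3, 1, 6, 0)
end = datetime(2024, 3, 1, 8, 0)
# Ten scans three minutes apart, a 30-minute gap, then ten more.
scans = [start + timedelta(minutes=3 * i) for i in range(1, 11)]
scans += [scans[-1] + timedelta(minutes=30 + 3 * i) for i in range(10)]
rate, idle = dashboard_metrics(scans, start, end)
```

With this sample data the worker scores 10 items per hour and 63 minutes of "idle" time; nothing in the inputs records why the scanner was silent.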
26.1 The Stopwatch and the Soul: Taylorism as Origin Story
To understand why Jordan's every movement generates data, we must begin at the end of the nineteenth century, in the steel mills and machine shops of industrial Pennsylvania, where a mechanical engineer named Frederick Winslow Taylor became convinced that workers were systematically soldiering — deliberately working below their capacity — and that science could fix this.
Taylor's solution, published in 1911 as The Principles of Scientific Management, was elegantly simple in concept and totalizing in implication: break every job into its smallest component tasks, time each task with a stopwatch, determine the "one best way" to perform it, and hold workers to that standard. The worker's body became a machine to be calibrated. Time-and-motion studies — conducted by Taylor and his colleagues, famously including Frank and Lillian Gilbreth — transformed human movement into data points, and data points into production quotas.
Taylor was explicit about the epistemological asymmetry his system created. Knowledge of work — how to do it, how fast it could be done, what the optimal method was — would reside entirely with management, not with workers. "In the past," Taylor wrote, "the man has been first; in the future the system must be first." The worker would be selected, trained, and monitored to execute management's specifications. Their own expertise and judgment were, quite literally, managed out of the equation.
This was surveillance as epistemological conquest. By measuring workers, management gained knowledge that workers themselves did not possess about their own labor. The foreman with the stopwatch knew more about your picking rate than you did. And that asymmetry of knowledge was, fundamentally, an asymmetry of power.
💡 Intuition Check
Taylor believed his system was good for workers — that scientific management would produce higher wages and more harmonious labor relations. His critics argued it was a tool of exploitation dressed in the language of science. Both claims contain truth. How can the same system simultaneously improve wages and intensify exploitation? What does this paradox suggest about the relationship between measurement and power?
The Workers Push Back
Taylor's ideas were not received passively. In 1911, the same year Principles was published, the U.S. House of Representatives launched an investigation into Taylorism after workers at the Watertown Arsenal went on strike in response to time-and-motion studies. Congress eventually passed legislation prohibiting the use of Taylor's methods in government facilities — legislation that remained in effect until the 1940s.
The resistance was not merely practical (workers didn't want to work harder) but principled: workers argued that Taylorism robbed them of their craft knowledge, their dignity, and their autonomy. The act of being timed, studied, and reduced to a rate was experienced as dehumanizing — not despite but because of its clinical neutrality.
This tension — between the management argument that measurement is neutral and objective, and the worker argument that measurement is an exercise of power — runs through every subsequent iteration of workplace performance management, from management by objectives in the 1950s to productivity scores in 2020s remote work platforms. The form changes. The asymmetry persists.
26.2 Performance Management Systems: From MBO to Stack Ranking to OKRs
The decades after Taylor produced successive waves of management philosophy, each with its own surveillance architecture. Understanding these systems as surveillance architectures — as structures that determine who watches, what is measured, and what consequences flow from the data — reveals the continuity beneath their apparent differences.
Management by Objectives (MBO): The Negotiated Target
Peter Drucker introduced Management by Objectives (MBO) in his 1954 The Practice of Management, offering a seemingly more humanistic alternative to Taylorism. Rather than having management impose quotas top-down, MBO proposed that managers and employees jointly set goals at the start of a review period, then evaluate progress against those goals at period's end.
MBO spread rapidly through American corporations in the 1960s and 1970s. Its appeal was that it appeared to give workers agency: you participated in setting your own goals. But the participation was structured within strong constraints. Goals had to be "aligned" with organizational objectives. The manager held ultimate approval authority. And the evaluation conversation — where you accounted for what you had achieved — was fundamentally a surveillance ritual in collaborative clothing.
MBO also introduced a crucial surveillance innovation: the written performance record. Where Taylor's stopwatch produced ephemeral data that lived in a foreman's notebook, MBO generated documented goal-setting agreements and formal appraisal records that became part of the employee's permanent personnel file. The measured employee now had a dossier.
Stack Ranking: Surveillance as Tournament
Jack Welch, CEO of General Electric from 1981 to 2001, brought performance management its most explicitly Darwinian form: the forced distribution system, colloquially known as "stack ranking" or "rank and yank." The premise was stark: in any group of employees, the top 20% (the "A players") should be identified and rewarded; the vital middle 70% (the "B players") should be retained and coached; and the bottom 10% (the "C players") should be terminated — every year, without exception.
The surveillance implications of stack ranking were profound. Every employee had to be evaluated against every other employee in their unit. Managers had to produce a ranked list and defend their rankings to senior management. Employees knew that no matter how objectively good their performance was, if nine colleagues performed better, they could be fired. The system transformed colleagues into competitors in a continuous tournament — a dynamic that, as Microsoft's internal dysfunction during the Ballmer years illustrated, could systematically destroy collaborative behavior and innovation.
Former Microsoft employees described the stack ranking era as creating an environment where "people would get rid of good people to rank themselves higher." The rational individual strategy in a forced distribution system is to sabotage colleagues' performance records rather than to collaborate with them. The metric, designed to identify and reward excellence, created incentives that produced the opposite of excellence.
⚠️ Common Pitfall: Confusing the Metric with the Reality
Stack ranking illustrates what statisticians and economists call Goodhart's Law (covered in detail in section 26.10): when a measure becomes a target, it ceases to be a good measure. The fundamental error is treating the measurement instrument as equivalent to the underlying construct it was designed to capture. Performance rankings are not the same as performance quality. When the ranking becomes the thing that matters, the quality becomes secondary.
OKRs: Objectives and Key Results
The current dominant performance management framework in the technology sector is Objectives and Key Results (OKRs), developed at Intel by Andy Grove and popularized by venture capitalist John Doerr, who introduced them at Google in 1999 and later chronicled them in his 2018 book Measure What Matters. OKRs propose that organizations, teams, and individuals set ambitious "objectives" (qualitative, aspirational goals) and measurable "key results" (quantitative indicators of progress toward the objective) on a quarterly cycle.
OKRs are genuinely different from earlier frameworks in some respects: they encourage ambitious "stretch goals" that employees are expected not to achieve fully; they are typically public within organizations, creating horizontal accountability; and they formally separate OKR achievement from compensation decisions in many implementations.
But they are also a surveillance architecture. Every employee's objectives and progress are visible to managers and (in many organizations) to all colleagues. The quarterly review cycle creates recurring surveillance moments. And the emphasis on measurability creates the same Goodhartian incentives: what gets measured is what gets done, and what cannot easily be measured — creativity, mentorship, institutional knowledge, relationship building — may be systematically undervalued.
🔗 Connection to Chapter 5
Foucault's analysis of the disciplinary gaze identified the examination as a key mechanism of power. The performance review is the examination in its organizational form — a ritualized moment of assessment that produces the "calculating gaze" Foucault described. What makes it particularly powerful is that examinations are internalized: the examined subject comes to monitor themselves against the standards of the examiner. Jordan checking their own picking rate before the supervisor can check it is this internalization in action.
26.3 The KPI as Surveillance Artifact
Key Performance Indicators — KPIs — are the atomic unit of contemporary workplace surveillance. They are data points selected to represent performance, and by now they are so embedded in organizational life that it can be difficult to perceive them as the surveillance artifacts they are.
Consider what a KPI actually is: a decision, made by management, about which aspects of a worker's performance are worth measuring and therefore worth managing. That selection process — what gets measured and what doesn't — is not neutral. It reflects management's theory of what work is, what outcomes matter, and whose perspective on performance is authoritative.
The Numerator Problem: What Gets Counted
In most professional contexts, the most important aspects of work are the hardest to measure. A nurse's ability to reassure a frightened patient, a teacher's capacity to notice when a student is struggling with something other than the curriculum, a software engineer's contribution to team culture and knowledge sharing — these produce real value, but they are difficult to quantify and therefore tend to fall outside KPI frameworks.
What gets counted instead is what can be counted: calls handled per hour, lines of code committed per week, tickets closed per sprint. The surveillance apparatus gravitates toward the quantifiable, and over time, the quantifiable becomes what "performance" means. Unmeasured behaviors and contributions are not merely untracked — they become invisible, and eventually devalued.
This is the dynamic known as the "measurement trap," captured in a line often attributed to Peter Drucker (somewhat ironically, given his role in developing MBO) but more precisely management writer Simon Caulkin's gloss on V. F. Ridgway's 1956 critique of performance measurement: "What gets measured gets managed — even when it's pointless to measure and manage it, and even if it harms the purpose of the organization."
KPIs and Social Sorting
KPIs don't merely measure performance — they sort workers. The sorting function of performance metrics connects to what surveillance scholars call "social sorting" (a key term established in Chapter 1): the use of surveillance data to classify people into categories that attract different treatments.
In logistics warehouses like Jordan's, the sorting happens continuously. Workers whose picking rates exceed target thresholds are eligible for schedule preferences and performance bonuses. Workers who fall below thresholds receive coaching, then written warnings, then termination. The data produces the categories; the categories produce the outcomes. Workers who are systematically disadvantaged by the way metrics are designed — perhaps because they work on routes that require more walking, or because they have chronic health conditions that affect pace — may experience this sorting as discrimination, even if no discriminatory intent exists.
📊 Real-World Application: The Amazon Rate System
Reporting by The New York Times, The Atlantic, and investigative journalists has documented Amazon's use of "rate" — the items-per-hour picking metric — as the primary tool for workforce management in its fulfillment centers. Workers are assigned a rate target, which changes based on facility averages and Amazon's operational needs. Workers who miss rate for multiple shifts receive progressive discipline, up to and including termination.
What the reporting also revealed: Amazon's algorithm tracked "time off task" (TOT) — periods when a worker's scanner was not recording activity — and automatically generated warnings when TOT exceeded thresholds. Workers reported receiving TOT warnings for extended bathroom breaks, for helping injured colleagues, for waiting in lines that management had created by understaffing. The algorithm could not distinguish between a worker who was slacking and a worker who was waiting for the understaffed restroom. The metric was the judge.
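The TOT mechanism the reporting describes reduces to threshold arithmetic over inter-scan gaps. A hedged sketch (the function name and the 15-minute threshold are illustrative, not Amazon's actual values) of why the algorithm "could not distinguish":

```python
from datetime import datetime, timedelta

def time_off_task_flags(scan_times, shift_start, shift_end,
                        threshold=timedelta(minutes=15)):
    """Flag every inter-scan gap that exceeds the TOT threshold.

    The only input is timestamps: a restroom line, helping an injured
    colleague, and genuine idling all produce identical gaps, so the
    system necessarily treats them identically.
    """
    flags = []
    prev = shift_start
    for t in list(scan_times) + [shift_end]:
        if t - prev > threshold:
            flags.append((prev, t))  # the "unexplained" interval
        prev = t
    return flags

start = datetime(2024, 3, 1, 6, 0)
# Scans at a five-minute pace, then 18 minutes of silence (the restroom line).
scans = [start + timedelta(minutes=m) for m in (5, 10, 15, 33, 38)]
warnings = time_off_task_flags(scans, start, start + timedelta(minutes=45))
```

One gap, one flag, regardless of cause.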
26.4 Call Center Monitoring: "This Call May Be Recorded"
Perhaps nowhere is workplace surveillance more transparent — or more normalized — than in the call center. "This call may be recorded for quality assurance and training purposes" has become so ubiquitous that it functions as a linguistic ritual, a rote announcement that customers and workers alike have stopped hearing. What it describes, however, is one of the most intensive surveillance environments in contemporary employment.
The Architecture of the Monitored Call
Call center surveillance operates in multiple simultaneous layers:
Layer 1: The call recording. Every customer interaction is recorded in full — audio, and increasingly video for chat/co-browse interactions. These recordings are stored, indexed, and can be retrieved for review.
Layer 2: Real-time monitoring. Supervisors can listen to live calls using "silent monitoring" — a technology that lets a manager hear a conversation without either party knowing. In most U.S. jurisdictions, this is legal without notification to the worker or the customer. Some systems allow the supervisor to "whisper" — to speak into the agent's earpiece while the call is ongoing without the customer hearing.
Layer 3: Automated quality scoring. Increasingly, call center monitoring is not done by human supervisors but by automated speech analysis systems. Software transcribes calls in real time, flags prohibited language (profanity, competitor mentions, certain scripts), measures "talk time," "hold time," "silence time," and "resolution rate," and automatically generates quality scores. Human supervisors review a sample; the algorithm reviews everything.
Layer 4: Screen recording. Many call center platforms simultaneously capture the worker's screen activity during calls — which applications are open, what the worker types into the CRM, how quickly they navigate between screens.
Layer 5: Biometric elements. Some platforms now incorporate keystroke dynamics and mouse movement patterns — effectively biometric signatures — to verify that the person at the workstation is the assigned employee.
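Layer 3's automated scoring is, at its core, pattern matching plus ratio arithmetic over the transcript and call timings. A toy sketch, with an invented phrase list and arbitrary weights chosen purely for illustration:

```python
def score_call(transcript, talk_seconds, silence_seconds,
               prohibited=("no refunds", "competitor")):
    """Toy automated quality score for one call.

    Flags prohibited phrases and penalizes silence time -- two of the
    signals speech-analytics platforms commonly measure. The weights
    (20 points per violation, up to 50 for silence) are arbitrary.
    """
    text = transcript.lower()
    violations = [p for p in prohibited if p in text]
    silence_ratio = silence_seconds / (talk_seconds + silence_seconds)
    score = 100 - 20 * len(violations) - round(50 * silence_ratio)
    return max(score, 0), violations

score, hits = score_call(
    "Unfortunately there are no refunds on that plan.",
    talk_seconds=90, silence_seconds=10)
```

A human supervisor reviews a sample of calls; logic like this runs on every one.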
Karen Levy and the Long Arm of the Log
Sociologist Karen Levy, in her research on truck driver monitoring (extended in her book Data Driven: Truckers, Technology, and the New Workplace Surveillance), developed an analysis that applies directly to call center monitoring: the most intimate aspects of work — the moments between formal tasks, the personal judgments, the small deviations from protocol — are precisely what the most intensive monitoring systems try to capture.
For call center workers, this means that the pause before answering a call, the sigh between calls, the way a worker handles a difficult caller, the moment of venting after a particularly hard exchange — all of these become potential data points. The surveillance does not merely record the formal work; it attempts to colonize the informal margins of work where workers typically maintain some autonomy and humanity.
🌍 Global Perspective: The UK Call Center Industry
The United Kingdom has some of the most extensively studied call center monitoring regimes in the world, partly because the UK has both a large call center sector and active trade unions that have documented monitoring practices. Research by Nik Chmiel and colleagues found that UK call center workers reported significantly elevated stress levels tied directly to monitoring intensity — but not in a linear relationship. Workers who perceived monitoring as fair and used for developmental purposes (coaching) reported lower stress than workers who perceived monitoring as purely punitive. The perceived purpose of the surveillance mattered as much as its intensity. This finding has implications for how organizations design monitoring systems — and for whether the design choices reflect genuine commitment to worker development or merely rhetorical cover for control.
26.5 Badge Data and the Mapped Employee
The employee ID badge is one of the most mundane objects of modern organizational life. It is also a surveillance device with capabilities that most workers do not fully appreciate.
What the Badge Collects
At minimum, a badge-based access control system records: time of entry, door/gate entered, time of exit, door/gate exited. In larger facilities and campuses, it records movement between zones — entry into building A, then building C, then the cafeteria, then building A again. In sophisticated implementations, the badge tracks presence in specific meeting rooms (room readers), time at specific workstations (proximity sensors), and movement patterns throughout the facility.
This data, in aggregate, produces a map of an employee's workday — not what they did, but where they were and when. The map has organizational uses: security management, emergency response, space utilization planning. It also has surveillance uses that workers may not anticipate.
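Building that map is a single pass over the badge log. A minimal sketch (the zone names and times are invented) of turning swipe records into per-zone dwell times:

```python
from datetime import datetime, timedelta
from collections import defaultdict

def dwell_times(badge_events):
    """badge_events: time-ordered (timestamp, zone) pairs from door
    readers. Each dwell is the interval until the next swipe -- the
    aggregate map that space-utilization (and, via function creep,
    workforce-analytics) tools build."""
    totals = defaultdict(timedelta)
    for (t0, zone), (t1, _next) in zip(badge_events, badge_events[1:]):
        totals[zone] += t1 - t0
    return dict(totals)

day = [
    (datetime(2024, 3, 1, 8, 0), "building A"),
    (datetime(2024, 3, 1, 10, 30), "building C"),
    (datetime(2024, 3, 1, 12, 0), "cafeteria"),
    (datetime(2024, 3, 1, 12, 45), "building A"),
    (datetime(2024, 3, 1, 17, 0), "exit"),
]
workday_map = dwell_times(day)
```

Note what the output contains: not what the employee did, only where they were and for how long.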
The Meeting Room Problem
Several large technology companies — including Google, whose campus infrastructure is extensively documented in reporting — use badge data not merely for security but for workforce analytics. Meeting room occupancy data can be used to evaluate whether managers are holding the team meetings they claim to hold. Badge entry records at after-hours times can be compared against productivity data to test hypotheses about which workers are "putting in extra effort."
In 2019, reporting on Amazon's internal analytics practices revealed that the company analyzed badge data to track which employees were attending certain types of meetings — including meetings that union organizers might use as signals of organizing activity. This use of badge data — collected under a security/access rationale but applied to labor relations intelligence — is a textbook example of "function creep": the expansion of a surveillance system beyond its stated purpose.
📝 Note: Function Creep in Badge Systems
Function creep (a key term established in Chapter 2) is particularly insidious in badge systems because the surveillance infrastructure is installed for an entirely legitimate and visible purpose (security access control), and workers understand they are being tracked for that purpose. The expansion to workforce analytics, labor relations monitoring, or performance management happens invisibly, using data the worker already assumed was collected for other reasons. The worker's consent — such as it was — applied to the stated purpose, not the actual use.
Microsoft Workplace Analytics and the Productivity Score
In 2020, Microsoft launched "Productivity Score" as part of Microsoft 365, a feature that assigned individual workers scores based on their use of Microsoft tools: how often they participated in Teams meetings, responded to emails, collaborated on documents. The feature was designed so that managers could see individual employee scores with the employee's name attached.
The response from privacy advocates was immediate and severe. Wolfie Christl, a technology researcher, published an analysis calling the feature "a workplace surveillance tool masquerading as a productivity enhancer." Within weeks, Microsoft revised the feature, removing individually identifiable scores so that managers saw only aggregate, anonymized data; the individual-facing analytics were subsequently folded into the Viva Insights brand.
The episode is instructive not because Microsoft was doing something unusual, but because it was doing something usual in an unusually visible way. Every enterprise software platform collects behavioral data about users. The debate over Productivity Score made legible what was already happening invisibly in Google Workspace, Slack, and dozens of other enterprise platforms: the work itself generates surveillance data, continuously and passively.
26.6 Email and Communication Surveillance
"Don't put anything in email you wouldn't want to see on the front page of a newspaper" is one of those workplace aphorisms that turns out to be literally true — not because emails are likely to be subpoenaed (though they are, in litigation), but because email is routinely monitored by employers, often in ways that workers do not fully appreciate.
What Employers Can Monitor
In the United States, employers have broad legal authority to monitor employee communications conducted on employer-owned devices and employer-provided systems. This includes email, instant messaging (Slack, Teams, Google Chat), video calls, and any other communication passing through employer systems. In the majority of U.S. states, this monitoring can be conducted without informing employees beyond a generic policy statement in the employee handbook — a document that most employees sign without reading and rarely revisit.
The legal framework is governed primarily by the Electronic Communications Privacy Act (ECPA) of 1986, which predates the modern workplace by decades and which has been interpreted to permit employer monitoring of company systems. Courts have consistently held that workers have no expectation of privacy on company systems — a position that becomes increasingly consequential as work communication migrates entirely to employer-controlled platforms.
Microsoft Viva Insights: The Quantified Communication
Microsoft Viva Insights (the evolved form of the Productivity Score system) analyzes patterns in Microsoft 365 communication data to generate "insights" about collaboration, focus time, and wellbeing. Managers receive aggregate team data; individual employees receive personal dashboards.
What Viva Insights actually measures includes: number of emails sent and received, frequency of meeting attendance, after-hours communication patterns, response time to emails, number of people collaborated with, and time spent in focus (uninterrupted working time). The system can also analyze whether employees are building "diverse network connections" across the organization.
From a surveillance studies perspective, Viva Insights is significant because it makes the quantification of social relationships and communication patterns routine and passive. Workers don't fill out forms about their collaboration habits — the system infers them from behavioral residue. And it presents this quantification as "insights" and "wellbeing" support, framing surveillance as self-improvement.
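The passivity is the point: a metric like after-hours communication falls out of message metadata alone. A sketch assuming a simple 9-to-6 working window (the cutoffs are this sketch's assumption, not Microsoft's definition):

```python
from datetime import datetime

def after_hours_share(sent_times, work_start=9, work_end=18):
    """Fraction of sent messages stamped outside working hours.

    Only timestamps are consulted -- no message content, no form
    filled out by the worker, just behavioral residue from ordinary
    tool use.
    """
    if not sent_times:
        return 0.0
    outside = sum(1 for t in sent_times
                  if t.hour < work_start or t.hour >= work_end)
    return outside / len(sent_times)

week = [datetime(2024, 3, 1, h, 0) for h in (9, 11, 14, 21)]
share = after_hours_share(week)  # one of four messages sent at 9 p.m.
```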
🎓 Advanced: The Affective Turn in Workplace Surveillance
Shoshana Zuboff, in The Age of Surveillance Capitalism (2019), identifies "behavioral modification" as the culminating logic of surveillance systems — not merely predicting behavior, but shaping it. Contemporary workplace communication surveillance operates on this logic: by showing workers their own collaboration metrics (or by having managers act on those metrics), the system creates feedback loops that modify the behavior being measured. Workers who see that their "collaboration score" is low may deliberately restructure their communication patterns — not because this reflects better work, but because it reflects better metrics. The surveillance apparatus has reached into the worker's intentional behavior and reshaped it.
26.7 Productivity Scores: Activity Percentages, Idle Time, Keystrokes
The remote work explosion of 2020-2022 (addressed more fully in Chapter 27) intensified a form of workplace monitoring that had existed in limited forms for years: the automated productivity score. Understanding how these scores work — technically and epistemologically — is essential for evaluating their legitimacy.
How Activity Scores Are Calculated
Most automated productivity monitoring platforms calculate activity scores using some combination of:
- Mouse movement: Is the cursor moving? How often? What is the velocity and pattern?
- Keystroke detection: Are keys being pressed? (Usually not which keys, but whether keystrokes are occurring)
- Application focus: Which application window is in the foreground? Is it a "productive" application (e.g., a work-relevant tool) or a "non-productive" one (e.g., social media)?
- Screen activity: Is the screen changing, suggesting active work?
These inputs are typically combined into an "activity percentage" or "active time" metric — the proportion of tracked time during which the worker appears to be actively using their computer.
The epistemological problem is severe. An activity percentage of 85% does not mean the worker was productive 85% of the time. It means that during 85% of tracked time, the monitoring software detected mouse movement, keystrokes, or application activity. A worker who is:
- Reading a physical document
- Thinking deeply about a problem
- On a phone call
- In a legitimate work conversation not captured by the monitoring software
...will register as "inactive" and receive a lower productivity score than a worker who is actively moving their mouse while watching videos.
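The arithmetic behind an activity percentage makes the proxy problem concrete. A sketch over per-minute activity samples (the sampling scheme is illustrative; commercial tools vary):

```python
def activity_percentage(samples):
    """samples: one boolean per minute -- did the agent detect mouse,
    keyboard, or foreground-app input during that minute? The score is
    the fraction of "active" minutes; it cannot tell deep thought from
    absence, or mouse-jiggling from work."""
    return 100 * sum(samples) / len(samples)

reading_hour = [False] * 60   # an hour reading a printed document
jiggling_hour = [True] * 60   # an hour nudging the mouse over videos
```

The reader scores 0 and the jiggler 100, exactly the inversion described above.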
📊 Real-World Application: The Idle Time Controversy
In 2022, the New York Times profiled Erin Driscoll, a financial analyst at a firm using automated monitoring software, who discovered that her employer had flagged her for low activity scores — specifically, for periods of "idle time" that exceeded company thresholds. What her employer's system did not know: Driscoll was nearly blind, and conducted most of her work by listening to screen-reader audio rather than interacting with the screen in ways the monitoring software could detect. Her actual productivity was high; her measured productivity was low. The metric had failed not because it was measuring the wrong thing in theory, but because it was measuring a proxy (screen interaction) that was a poor indicator of performance for this particular worker.
The story illustrates a broader problem: automated productivity monitoring systems encode assumptions about what productive work looks like. Those assumptions reflect a particular kind of work, performed by a particular kind of body, in a particular kind of environment. Workers who deviate from these assumptions — whether due to disability, care responsibilities, different work styles, or different job functions — may be systematically mismeasured.
26.8 The Performance Review Conversation: Feedback as Surveillance Ritual
Twice a year (or annually, or quarterly) in most organizations, the measured employee sits across from their manager for the performance review. This conversation is positioned in management discourse as a "development opportunity" — a chance for feedback, goal-setting, and career planning. From a surveillance studies perspective, it is something more complex: a surveillance ritual that formalizes the monitoring relationship, produces the performance record, and defines the subject of surveillance in particular ways.
The Ritual Structure
Performance review conversations follow a remarkably stable ritual structure across organizations: the opening (framing the conversation as developmental and mutual), the review of past performance (the manager presents evaluation data), the gap analysis (areas for improvement are identified), the goal-setting (future targets are established), and the closing (the document is signed, the record created).
Each element of this structure has a surveillance function. The opening framing ("this is a conversation, not an evaluation") manages the power differential so that it does not feel so large as to produce resistance. The review of past performance makes the monitoring data legible — translates it into narrative. The gap analysis converts the data into actionable categories of deficiency. The goal-setting establishes the new surveillance targets. The closing signature transforms the ephemeral conversation into a permanent record.
The signature is particularly significant. By signing the performance review document, the employee acknowledges the record — even if they disagree with it. In many organizations, the signature line reads "I acknowledge receipt of this review" rather than "I agree with this review," but the social pressure to sign without protest is significant, and the signature is often later cited in disciplinary or termination proceedings as evidence that the employee knew their performance was problematic.
Emotional Labor in the Review Room
Sociologist Arlie Hochschild's concept of "emotional labor" — the management of feeling to create a publicly observable facial and bodily display — applies with particular force to performance review conversations. Both parties are managing their emotions: the manager presenting negative feedback in a "constructive" register, the employee receiving criticism while displaying composure and receptiveness.
The emotional labor demand falls asymmetrically on the employee. The manager is performing their professional function; the employee is being evaluated. For workers from marginalized groups — people of color, women, first-generation professionals — the emotional labor burden is compounded by the need to navigate stereotype threat, code-switching, and the additional scrutiny that marginalized workers often face.
💡 Intuition Check
Think about a performance evaluation you have experienced (academic, workplace, athletic). Who held the evaluation data? Who interpreted it? Who determined what it meant? What would it mean to experience that conversation differently if you could see the same data the evaluator saw, had participated in designing the evaluation criteria, and could challenge the methodology? How different would the power balance feel?
26.9 Jordan at the Dashboard: A Structural Analysis
Return to Jordan, at 9:15 a.m., being approached by their supervisor with performance data Jordan cannot see. Let's analyze this moment through the theoretical frameworks developed throughout this chapter.
Visibility Asymmetry
Jordan's situation exemplifies visibility asymmetry at its starkest: the supervisor holds data about Jordan's behavior that Jordan does not have access to. Jordan cannot verify whether the "12% below target" figure is accurate. They cannot assess whether the benchmark is appropriate for the conditions of their specific assignment (different routes, different product mix, different bin locations all affect picking rate in ways the algorithm may not control for). They cannot contest the methodology.
The asymmetry is not accidental. It is structural — built into the design of the monitoring system. The dashboard is designed to give supervisors a view of worker performance; it is not designed to give workers a view of the methodology being used to evaluate them. This is visibility asymmetry as a design choice, not a technical limitation.
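The benchmark problem above can be made concrete with a hypothetical sketch. Nothing here is Meridian's actual system; the route names and baseline figures are invented to show why a raw items-per-hour comparison can misrepresent a worker whose assignment is simply harder:

```python
# Hypothetical per-route baselines: the items/hour a typical worker
# achieves on that route. These numbers are invented for illustration.
ROUTE_BASELINE = {"dense_aisles": 120, "bulk_items": 70}

def raw_rate(items_picked, hours_worked):
    """Unadjusted picking rate: what a naive dashboard reports."""
    return items_picked / hours_worked

def adjusted_rate(items_picked, hours_worked, route):
    """Rate relative to what the route itself allows (1.0 = typical)."""
    return raw_rate(items_picked, hours_worked) / ROUTE_BASELINE[route]

# Two workers of similar skill on different assignments:
worker_a = adjusted_rate(880, 8, "dense_aisles")  # raw 110/hr, ~0.92 of baseline
worker_b = adjusted_rate(544, 8, "bulk_items")    # raw  68/hr, ~0.97 of baseline

# Against a flat 120/hr target, worker B looks far "below target" even
# though, route-adjusted, B is performing closer to typical than A.
assert raw_rate(544, 8) < raw_rate(880, 8)
assert worker_b > worker_a
```

Whether the real algorithm performs any such normalization is exactly what Jordan cannot verify; the sketch shows only that the question is consequential.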
Consent as Fiction
Jordan was informed, in the employee onboarding process, that their work performance would be monitored and that a performance dashboard would be used for evaluation purposes. They signed an acknowledgment form. In formal legal terms, they consented.
But the consent was what surveillance scholars call "structured" or "manufactured" consent: accepting the monitoring system was a condition of employment. Jordan, a first-generation college student working to pay for school, did not have the luxury of declining the job because they objected to the monitoring architecture. The "consent" they gave was the choice between employment with surveillance and unemployment — a choice that does not meet any meaningful philosophical definition of voluntary agreement.
Goodhart's Law in the Warehouse
Jordan has learned to "work to the scanner" — to prioritize movements that register as picks even when a different sequence would be more ergonomically efficient or would result in fewer errors. They have observed colleagues do the same. The picking rate metric has reshaped the workflow not to maximize efficiency but to maximize the metric — exactly the dynamic Goodhart's Law predicts.
The irony is that this metric-gaming may actually reduce the outcomes Meridian Logistics claims to care about: more errors, more physical strain, higher injury rates, higher turnover. The measurement is undermining the thing it was supposed to measure. But because the measurement continues to report numbers, management may not notice.
26.10 Goodhart's Law and Measurement Perversity
Goodhart's Law, named for the British economist Charles Goodhart, who formulated it in a 1975 paper on monetary policy, states: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." The colloquial version, attributed to anthropologist Marilyn Strathern, is more memorable: "When a measure becomes a target, it ceases to be a good measure."
Goodhart's Law is one of the most important and most frequently violated principles in organizational management. It describes what happens to metrics under the pressure of accountability: they become the object of optimization rather than the instrument of measurement, and in becoming the object, they lose their diagnostic value.
Classic Cases of Measurement Perversity
The cobra problem: The British colonial government in India, concerned about cobra populations, offered bounties for dead cobras. Enterprising locals began breeding cobras to collect the bounty. When the government discovered this and ended the program, the cobra farmers released their now-worthless snakes — increasing the cobra population. The metric (dead cobras) was a reasonable proxy for the underlying goal (fewer cobras) until it became a target, at which point it actively undermined the goal.
Wells Fargo's account quotas: Wells Fargo established aggressive sales targets for retail banking employees — accounts opened per day — and compensated accordingly. Employees responded by opening accounts customers hadn't requested, forging signatures, and creating fictitious accounts. The metric was a plausible proxy for business success until it became a mandatory target, at which point it became a vehicle for mass fraud. Wells Fargo ultimately paid $3 billion in fines and settlements.
The NHS waiting time targets: The UK's National Health Service established four-hour emergency department waiting time targets. Hospitals responded by treating patients who would be admitted within four hours first, regardless of clinical need — and by keeping some patients in ambulances in hospital parking lots until a bed was available (so the four-hour clock wouldn't start). The metric measured waiting time; the gaming produced queue-jumping that wasn't clinically appropriate.
Goodhart's Law in Performance Review Systems
In organizational performance management, Goodhart's Law appears in the phenomenon of "teaching to the test" — workers and managers who optimize for the measured outputs regardless of whether the optimization reflects actual organizational value.
Stack ranking produced Goodhart's Law effects systematically: rather than identifying the best performers, it incentivized the sabotage of colleagues' performance records. Call center monitoring produces Goodhart's Law effects when workers learn to hit the metrics (short handle time, high first-call resolution rates) through strategies that actually reduce customer satisfaction (rapid call conclusions before issues are fully resolved, marking cases "resolved" prematurely). OKR systems produce Goodhart's Law effects when employees set intentionally easy key results to guarantee 100% achievement.
The profound challenge for performance management systems is that there is no metric that is immune to Goodhart's Law effects once the metric is consequential enough. Any measure under sufficient pressure will be gamed. This does not mean measurement is worthless — it means the relationship between measurement and management requires humility about what metrics can and cannot tell us.
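The dynamic can be sketched in a toy model. Everything here is an invented assumption (in particular the 1.5 "gaming premium," meaning effort spent gaming the metric inflates the dashboard number more cheaply than real work does); the point is only the structure of the incentive:

```python
def true_output(real_effort):
    """Actual value produced: depends only on genuine work."""
    return real_effort

def measured_metric(real_effort, gaming_effort):
    """What the dashboard sees. Gaming inflates the number more cheaply
    than real work does (the 1.5 premium is an invented assumption)."""
    return real_effort + 1.5 * gaming_effort

def best_allocation(total_effort, objective):
    """Grid-search the effort split that maximizes the given objective."""
    best = None
    for gaming in range(total_effort + 1):
        real = total_effort - gaming
        score = objective(real, gaming)
        if best is None or score > best[0]:
            best = (score, real, gaming)
    return best

# A worker rewarded on the metric shifts all effort into gaming it...
_, real_m, gaming_m = best_allocation(10, measured_metric)
# ...while a worker rewarded on true output does no gaming at all.
_, real_t, gaming_t = best_allocation(10, lambda r, g: true_output(r))

assert (real_m, gaming_m) == (0, 10)
assert true_output(real_m) < true_output(real_t)  # the metric "won"; the work lost
```

Once the metric is consequential, the optimum under the metric and the optimum under the real goal come apart; that divergence is Goodhart's Law in miniature.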
✅ Best Practice: Rotating and Triangulating Metrics
Organizations that are sophisticated about Goodhart's Law effects use several strategies: rotating key metrics so that employees cannot settle into optimization routines; triangulating between multiple metrics so that gaming one requires deteriorating another; including qualitative assessments alongside quantitative measures; and building in "negative" metrics that capture the costs of gaming (error rates, complaint rates, quality scores). None of these strategies eliminates Goodhart's Law effects, but they make gaming more costly and visible.
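A minimal sketch of the triangulation idea, with invented weights and figures: pairing a volume metric with a "negative" quality metric means that gaming speed shows up as a worse composite score, not a better one:

```python
def composite_score(calls_per_hour, error_rate,
                    w_speed=1.0, w_quality=20.0):
    """Triangulated evaluation: a volume metric paired with a 'negative'
    metric so that gaming one degrades the other. Weights are illustrative."""
    return w_speed * calls_per_hour - w_quality * error_rate

# Hypothetical workers: one games the volume metric at quality's expense.
steady = composite_score(calls_per_hour=12, error_rate=0.02)  # ~11.6
gamer  = composite_score(calls_per_hour=18, error_rate=0.40)  # ~10.0

assert gamer < steady  # the faster worker scores lower once quality counts
```

As the text notes, this raises the cost of gaming rather than eliminating it; a sufficiently motivated worker will next learn which errors the error-rate metric fails to count.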
26.11 Does Monitoring Improve Performance? The Evidence
The ubiquity of workplace monitoring rests on an implicit empirical claim: that monitoring improves performance. It is worth examining this claim directly, because the research evidence is considerably more mixed than the monitoring industry acknowledges.
The Hawthorne Effect: Watching Changes Behavior
The foundational study in this area is the Hawthorne experiments, conducted at Western Electric's Hawthorne Works near Chicago (in Cicero, Illinois) between 1924 and 1932. Researchers set out to study the effect of lighting conditions on worker productivity and appeared to find that productivity increased whether lighting was raised or lowered — the act of being studied itself seemed to improve performance.
The Hawthorne Effect, as it became known, has since been substantially qualified: reanalyses of the original data show the results were less consistent than originally reported. But the core insight — that awareness of observation changes behavior — is well supported by subsequent research. Monitored workers often work harder, faster, or more carefully, at least in the short term.
The critical question is whether this behavioral change is sustainable and whether it reflects genuinely better performance or merely performance for the camera.
Research on Monitoring and Long-Term Performance
Studies of monitoring and long-term performance tell a more complicated story:
Surveillance reduces intrinsic motivation. Research in the self-determination theory tradition (Deci, Ryan, and colleagues) consistently finds that external monitoring reduces workers' intrinsic motivation — their internal engagement with and enjoyment of work. Workers who are intrinsically motivated perform better, more creatively, and more sustainably than workers who are externally motivated by monitoring and reward. To the extent that monitoring shifts motivation from intrinsic to extrinsic, it may undermine long-term performance even as it boosts short-term measured output.
Surveillance increases stress and reduces job satisfaction. Studies of call center workers, data entry workers, and remote employees find consistent associations between monitoring intensity and elevated stress, reduced job satisfaction, and higher turnover intentions. Turnover is itself a performance cost: the expense of replacement, training, and lost productivity often exceeds the gains from increased short-term output.
Surveillance may reduce quality in favor of quantity. Where monitoring tracks volume metrics (calls per hour, items picked per hour, tasks completed per day), it creates incentives to prioritize speed over quality. Multiple studies of call center and logistics workers find that intensive volume monitoring correlates with higher error rates and lower quality outcomes.
The trust literature. Research on organizational trust by Roderick Kramer, Denise Rousseau, and others finds that intensive surveillance signals mistrust — and that the signal of mistrust damages the trust relationship itself. Workers who perceive that management does not trust them are less likely to go beyond their formal job requirements (what organizational behavior researchers call "organizational citizenship behavior"), are less likely to share information that might reflect poorly on themselves, and are more likely to engage in protective gaming of metrics.
The Aiha Nguyen Framework
Technology researcher Aiha Nguyen, in her 2021 report "The Constant Boss: Work Under Digital Surveillance" for the Data & Society Research Institute, synthesizes this literature to argue that contemporary workplace monitoring systems are not primarily designed to improve worker performance. They are designed to improve management's control over worker performance — a different, though related, goal. The distinction matters because many control-oriented monitoring systems actively undermine performance goals while achieving control goals.
Nguyen identifies what she calls the "visibility/accountability mismatch": workplace surveillance makes workers more visible to managers without making management decisions more accountable to workers. The productivity gain, if any, flows from increased worker compliance; the costs — in wellbeing, intrinsic motivation, trust, and creativity — fall on workers who have no voice in the surveillance system's design.
📊 Real-World Application: The Stanford Research on Monitoring and Innovation
A 2019 study by Stanford researchers found that employees who worked under intensive monitoring were significantly less likely to propose innovative solutions to workplace problems, even when they were specifically invited to do so. The researchers concluded that monitoring created a "psychological constriction" — an implicit awareness that anything outside the measured behaviors might be scrutinized and penalized. The chilling effect (a key term established in Chapter 2) on innovation may be one of the most significant long-term costs of intensive workplace surveillance, particularly in knowledge-work environments where innovation is the primary source of value.
26.12 Historical Continuity: The Long Thread from Taylor to Viva Insights
Step back from the details of OKRs and badge data and productivity scores and see the shape of the whole: from Taylor's stopwatch in 1911 to Microsoft Viva Insights in 2021 is a continuous lineage of thought, a coherent project that has evolved its technologies and vocabulary while maintaining its structural features.
The project is: make worker behavior visible to management in quantified form; use that quantification to set expectations and evaluate compliance; use the evaluation to sort, reward, and discipline workers; use the sorting to align individual behavior with organizational objectives.
The vocabulary changes — scientific management becomes MBO becomes stack ranking becomes OKRs — and the technology changes — the stopwatch becomes the time card becomes the badge reader becomes the machine learning algorithm — but the power relation is constant: management watches; workers are watched. Management defines what counts as performance; workers perform for the definition. Management controls the data; workers are the data.
What has changed is the granularity of the watch, the scale of the data collection, and the invisibility of the surveillance. Taylor's stopwatch was visible — the worker could see it, could talk to the time-and-motion man, could object. Jordan's performance dashboard is invisible to Jordan — they can feel its effects (in the supervisor's approach at 9:15 a.m.) but cannot see the data, contest the methodology, or fully understand what is being measured and why.
The surveillance has gotten more powerful and less legible at the same time. That combination — more power, less legibility — is precisely what surveillance scholars mean when they speak of visibility asymmetry reaching its extreme.
26.13 Workers' Rights and Self-Protection: A Practical Guide
Understanding the theory of performance surveillance is important; knowing how to protect yourself within these systems is essential. This section provides practical guidance for workers navigating performance monitoring environments.
Know What Is Being Measured
Most employers are required to disclose, at minimum in general terms, the criteria by which your performance will be evaluated. Review your employee handbook, your offer letter, and any performance plan you have signed. Specific monitoring technology — badge readers, communication monitoring, productivity software — may be disclosed in separate technology use agreements. Read these documents carefully and keep copies.
If you are uncertain what your employer is measuring, you can ask your manager or HR department: "What metrics are used to evaluate my performance, and how is each one calculated?" This is a reasonable professional question, and the answer (or the refusal to answer) is itself informative.
Document Your Own Performance
Do not rely entirely on your employer's measurement systems. Keep your own records: projects completed, problems solved, positive feedback received, metrics you can independently verify. If you are in a role with quantitative performance targets, track your own numbers so you can identify discrepancies between your records and your employer's data.
Save commendations, positive emails, and records of work above and beyond your formal job description. These records are particularly valuable if you are subject to a performance improvement plan (PIP) or disciplinary action, or if you later face discrimination claims.
Understand Your Legal Rights
In the United States, key legal protections include:
- The National Labor Relations Act (NLRA): Protects your right to discuss wages, hours, and working conditions with coworkers. Employers cannot legally prohibit you from discussing pay or complaining about working conditions, even if they try to do so through handbook policies.
- The Americans with Disabilities Act (ADA): If a monitoring system mismeasures your performance because of a disability, you may be entitled to reasonable accommodation in how performance is measured.
- State privacy laws: Several states (California, Connecticut, Delaware, New York) have stronger workplace privacy protections than federal law. Research the laws in your state.
- Collective bargaining agreements: If your workplace is unionized, your collective bargaining agreement likely includes specific provisions about performance monitoring, disciplinary processes, and your right to representation.
The Union Advantage
Unionized workers have substantially stronger protections in performance management than non-unionized workers. Collective bargaining agreements typically require that performance standards be negotiated with the union, that disciplinary actions follow a progressive process, and that workers have the right to union representation in any investigatory interview that might lead to discipline (the "Weingarten right," from NLRB v. J. Weingarten, Inc., 420 U.S. 251 (1975)).
If you believe your performance monitoring system is being applied unfairly, discriminatorily, or in ways that violate your union contract, contact your union representative.
✅ Best Practice: The Weingarten Request
Under the NLRA, if you are a unionized employee and are called to a meeting that you reasonably believe might result in discipline, you have the right to request union representation before the meeting proceeds. The "Weingarten right" must be invoked — you must specifically ask for union representation. Your employer is not required to inform you of this right. Know this right and exercise it.
26.14 Conclusion: The Transparent Worker
The measured employee of the twenty-first century inhabits a paradox: they are more visible to management than workers have ever been, and more invisible as persons than the surveillance might suggest. The granularity of contemporary monitoring — picking rates and idle times and communication patterns and meeting room occupancy — produces an extraordinarily detailed behavioral profile. But the profile is not the person. It is a representation of the person designed by, for, and in the interests of the organization that pays for the monitoring system.
Jordan Ellis, tracked through every pick and pause, is known to the Meridian Logistics algorithm as a set of rates and times and error counts. What the algorithm does not know — cannot know — is that Jordan is a first-generation college student supporting themselves through school, that they worked a double shift last weekend to cover a sick colleague, that they are training two newer workers on the side while maintaining their own rate, that they are interviewing for an internship that would change their life trajectory.
The performance dashboard captures the measurable. It misses the person. And it is designed, structurally, to mistake one for the other — because the measurement is the point.
As we move forward through Part 6, we will see this basic architecture extended into new environments: the home office (Chapter 27), the algorithmic manager (Chapter 28), the hiring process (Chapter 29), and the fraught terrain of dissent and whistleblowing (Chapter 30). The technologies change; the asymmetry persists.
Key Terms
Management by Objectives (MBO): A performance management approach in which managers and employees jointly set goals and evaluate progress against them, creating a documented performance record.
Stack ranking: A forced distribution performance management system requiring that a fixed percentage of employees be ranked at each level, resulting in annual termination of the lowest-ranked performers.
OKRs (Objectives and Key Results): A goal-setting framework using qualitative objectives and quantitative key results, widely used in technology companies, with public goal visibility within organizations.
Key Performance Indicator (KPI): A quantitative metric selected to represent performance on some dimension; the atomic unit of performance surveillance systems.
Goodhart's Law: The principle that "when a measure becomes a target, it ceases to be a good measure," describing the systematic distortion that occurs when performance metrics are used for accountability purposes.
Silent monitoring: The practice of supervisors listening to live employee calls without the employee or customer being notified.
Activity score/productivity percentage: Automated metrics calculated from keyboard, mouse, and application activity data that purport to represent a worker's productivity during computer-based work.
Time Off Task (TOT): Amazon's metric tracking periods when a warehouse worker's scanner is not recording activity; used to generate automated discipline.
Weingarten right: The right of unionized workers to request union representation before an investigatory interview that might lead to discipline, established by the U.S. Supreme Court in NLRB v. J. Weingarten, Inc. (1975).
Discussion Questions
- Frederick Taylor believed his system was good for workers because it produced higher wages. Was he wrong? Can a system be simultaneously good for workers in some respects and exploitative in others? How should we evaluate systems that have mixed effects?
- Goodhart's Law suggests that any metric under accountability pressure will be gamed. If this is true, what does it imply for the future of performance management? Is the solution better metrics, fewer metrics, or something else entirely?
- Jordan's "consent" to workplace monitoring was a condition of employment. In what sense is this consent? Does it matter that Jordan needed the job? How should we think about consent in contexts of economic necessity?
- The research evidence suggests that intensive monitoring may reduce intrinsic motivation and long-term performance quality. If employers know this, why do they continue to invest in monitoring systems? What does the persistence of monitoring in the face of mixed evidence tell us about what monitoring is actually for?
- Performance reviews are described in this chapter as "surveillance rituals." What ritual functions do they serve? Who benefits from the ritual structure? Are there alternative forms of performance conversation that could preserve accountability while reducing surveillance intensity?
Chapter 26 connects backward to Chapter 4's analysis of Taylor and the origins of scientific management, and to Chapter 5's Foucauldian framework of disciplinary examination. It connects forward to Chapter 27's examination of remote work surveillance, where the monitoring systems described here are transported into domestic space, and to Chapter 28's analysis of algorithmic management, where human supervisors are replaced by automated systems.