Case Study 2: Five-Year Plans and KPIs -- Central Planning at Two Scales

How to use this case study: Read the two narratives, then work through the analysis questions. The first narrative examines a national-scale legibility trap (Soviet agricultural planning); the second examines an organizational-scale legibility trap (corporate metric fixation). The analysis questions ask you to identify the structural identity between systems that differ enormously in scale, ideology, and context but are governed by the same pattern.

Part I: The Virgin Lands Campaign -- Legibility at Continental Scale

The Problem

By the early 1950s, Soviet agriculture was in chronic crisis. Collectivized farms consistently underproduced. Food shortages were endemic. The causes were deeply structural: collective farming destroyed the metis of individual farmers, replacing generations of local soil knowledge with centrally dictated planting schedules; procurement quotas incentivized gross output rather than sustainable yield; and the terror of the Stalinist era had eliminated the most capable and independent-minded agricultural managers.

A rational response would have been to address these structural problems -- to restore farmer autonomy, reform the incentive system, and allow local knowledge to guide local decisions. But this would have required admitting that collectivization itself was the problem, which was ideologically impossible.

The Simplification

Nikita Khrushchev, who became First Secretary of the Communist Party in 1953 after Stalin's death, chose a different approach: he would solve the food problem by bringing vast new areas of land under cultivation. The Virgin Lands Campaign, launched in 1954, aimed to plow approximately 33 million hectares of previously uncultivated steppe in Kazakhstan and western Siberia. The plan was breathtaking in its ambition and perfectly legible: more hectares plowed equals more grain produced equals problem solved.

The metric was simple: hectares of virgin land plowed and planted. The target was clear. The resources were mobilized. Hundreds of thousands of volunteers and conscripts were sent east to plow the steppe. New state farms were established. Machinery was deployed. The campaign was covered extensively in Soviet media as a triumph of socialist planning.

First-Generation Success

The first harvests were extraordinary. In 1956, the Virgin Lands produced a record grain harvest. Khrushchev was vindicated. The metric -- hectares planted, tons harvested -- confirmed the wisdom of the campaign. The dashboards were green.

The success was real but deceptive. The newly plowed virgin soil was rich with centuries of accumulated organic matter, built up under the steppe's natural grassland ecosystem. The first crops drew on this accumulated capital, just as the first rotation of spruce monoculture drew on the ecological capital of the old mixed forest. The yields were spectacular because they were consuming a resource that would not be replenished.

Second-Generation Failure

Within five years, the consequences of the simplification became apparent. The virgin steppe had not been cultivated for a reason: its soils were thin, its rainfall was sparse and unpredictable, and the grassland ecosystem that had sustained it depended on root structures and soil organisms that plowing destroyed.

Without the grass cover, the light, dry soil was exposed to the fierce winds of the Kazakh steppe. Dust storms began. In some areas, topsoil loss was measured in centimeters per year. Yields dropped precipitously. By the early 1960s, some of the virgin lands were producing less per hectare than the old, established farmland in Ukraine and central Russia. Some areas had been rendered entirely barren.

The ecological destruction was compounded by the Goodhart distortions that permeated the system. Farm managers, measured by hectares plowed, plowed marginal land that should never have been touched. Measured by tons harvested, they strip-mined the soil to hit short-term targets, ignoring long-term sustainability. Measured by quotas delivered to state procurement agencies, they delivered grain that was improperly dried and stored, resulting in enormous spoilage.

The Trap

The rational response would have been to abandon the campaign -- to let the marginal lands return to grassland, to focus resources on improving yields on already-cultivated land, and to address the structural problems of Soviet agriculture. But the trap held:

Ideological commitment. Khrushchev had staked his political legitimacy on the Virgin Lands. Acknowledging failure would have undermined his position.

Institutional constituencies. Hundreds of state farms, dozens of party agencies, and thousands of bureaucratic positions had been created for the campaign. Abandoning it would have eliminated their reason for existence.

Sunk costs. Enormous resources -- machinery, infrastructure, housing -- had been invested in the new territories. Walking away meant writing off the investment.

Destruction of alternatives. The grassland ecosystem, once plowed, could not simply be restored. The soil structure had been destroyed. Recovery would take decades, if it was possible at all.

The campaign was not formally abandoned. It was quietly scaled back over the following decades as yields continued to decline and the ecological damage became undeniable. The steppe's greatest asset -- its grass-bound topsoil, rich in accumulated organic matter -- was gone. The metric had been met. The reality had been destroyed.

The Deeper Failure

The Virgin Lands Campaign illustrates a pattern that appears throughout the history of Soviet planning: the substitution of extensive growth (more inputs: more land, more labor, more steel) for intensive growth (better productivity from existing resources). Extensive growth is legible -- you can count the hectares plowed, the tons of steel produced, the number of workers deployed. Intensive growth is illegible -- it depends on tacit knowledge, local adaptation, quality improvement, and innovation, none of which can be easily planned from the center.

The planned economy's legibility regime made it structurally incapable of intensive growth. Every attempt to improve productivity through planning produced Goodhart distortions that undermined the improvement. The only path to meeting targets was to throw more resources at the problem -- more land, more labor, more capital -- consuming the country's wealth to sustain the illusion of progress. When the resources ran out, the system collapsed.

Part II: GreenTech Solutions -- Legibility at Corporate Scale

The Company

GreenTech Solutions (a composite case based on widely documented patterns in technology companies) was a mid-sized software company with 2,000 employees. Founded in the early 2000s, it had grown through the expertise and judgment of its engineering teams, who had deep knowledge of their customers' problems and considerable autonomy in how they solved them. The company was profitable, its products were respected, and its employee retention was among the best in the industry.

The company's management system was, by Silicon Valley standards, old-fashioned. Engineering managers evaluated their teams through a combination of code reviews, customer feedback, peer assessments, and their own judgment -- developed over years of working alongside the engineers. There were no formal productivity metrics. There was no dashboard. The system was profoundly illegible.

The New CEO

In 2015, a new CEO was hired from a larger, more metrics-driven company. She looked at GreenTech and saw what administrators always see in illegible systems: disorder, inconsistency, and an inability to answer basic questions. How productive is each team? Which engineers are high performers? Which projects are on track? How does GreenTech's engineering productivity compare to industry benchmarks?

The engineering managers had answers to these questions, but their answers were qualitative, nuanced, and difficult to aggregate. "Team A is doing excellent work on a very hard problem" is not an answer that fits in a spreadsheet. "Engineer X's code is beautiful and maintainable, but she produces fewer lines per week than the average because she spends time helping junior engineers level up" is not an answer that can be benchmarked.

The CEO commissioned a comprehensive KPI system. Engineering productivity would be measured by: lines of code committed per developer per week, story points completed per sprint, bugs resolved per engineer per month, and time-to-deployment for new features. Teams would be ranked. Individual performance reviews would be tied to the metrics. Bonuses would be tied to the metrics. The dashboard would make everything visible.

First-Generation Success

The metrics launched in Q1 of 2016. By Q3, the dashboards showed improvement across the board. Lines of code were up 30 percent. Story points completed per sprint were up 25 percent. Bug resolution rates were up 40 percent. Time-to-deployment was down 20 percent. The CEO presented the results to the board, attributing them to the new data-driven management culture.

The Goodhart Machine

Behind the green dashboard, the engineers were doing what rational agents always do when measured: they were optimizing the metrics.

Lines of code. Engineers wrote verbose code where concise code would have been better. They avoided refactoring -- the practice of simplifying and cleaning up existing code -- because refactoring reduces total lines of code. The codebase was growing in size while declining in quality.

Story points. Teams learned to inflate their story-point estimates, making each task appear larger, so that completing the same work yielded more points. Teams that had previously estimated a feature at 5 story points now estimated it at 13. Completed story points soared while actual throughput was flat or declining.

Bug resolution. Engineers cherry-picked easy bugs for quick resolution, leaving the complex, systemic bugs that actually mattered. The bug count declined, but the remaining bugs were the most dangerous ones. Some engineers introduced bugs and then "resolved" them the following week to inflate their numbers.

Time-to-deployment. Teams shipped code faster by skipping code review, reducing testing, and cutting quality assurance steps. Features deployed on time but arrived with defects that required subsequent patches.
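The story-point distortion described above can be made concrete with a little arithmetic. The sketch below is illustrative only: the re-estimate from 5 to 13 points comes from the narrative, but the sprint size of eight features, the Sprint class, and the divergence tolerance are hypothetical values chosen for the example, not figures from the GreenTech case.

```python
from dataclasses import dataclass


@dataclass
class Sprint:
    """One sprint for one team. All numbers are hypothetical."""
    features_shipped: int    # independent count of delivered work
    points_per_feature: int  # the team's own estimate for a typical feature

    @property
    def reported_points(self) -> int:
        # The dashboard sees only this product, not its two factors.
        return self.features_shipped * self.points_per_feature


def pct_change(before: float, after: float) -> float:
    """Percentage change from before to after."""
    return 100.0 * (after - before) / before


def diverges(metric_change: float, reality_change: float,
             tolerance: float = 10.0) -> bool:
    """Flag a metric that improves much faster than the work it stands for.

    The 10-point tolerance is an arbitrary assumption for illustration.
    """
    return (metric_change - reality_change) > tolerance


# Same team, same delivered work; only the estimates have been inflated.
before = Sprint(features_shipped=8, points_per_feature=5)
after = Sprint(features_shipped=8, points_per_feature=13)

points_change = pct_change(before.reported_points, after.reported_points)
throughput_change = pct_change(before.features_shipped, after.features_shipped)

print(f"Reported points: {before.reported_points} -> {after.reported_points} "
      f"({points_change:+.0f}%)")
print(f"Features shipped: {before.features_shipped} -> {after.features_shipped} "
      f"({throughput_change:+.0f}%)")
print("Metric diverging from delivered work:",
      diverges(points_change, throughput_change))
```

The point of the sketch is that the metric is a product of one quantity the team delivers and one quantity the team sets for itself. Any check that compares the reported number against an independently counted quantity exposes the inflation, but only for as long as that second quantity is not itself turned into a target.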

The Metis Drain

The most damaging consequence was invisible to the dashboard: the departure of GreenTech's most experienced engineers.

The senior engineers -- the ones who had spent a decade learning the codebase, who understood the customers' real problems, who could see when a "feature" was actually creating technical debt that would cost the company millions to repay -- were precisely the ones who performed worst on the new metrics. They wrote fewer lines of code because they wrote better code. They completed fewer story points because they worked on the hardest problems. They resolved fewer bugs because they were preventing bugs through careful design.

Under the new system, these engineers received mediocre reviews and reduced bonuses. Several left within the first year. They were replaced by more junior engineers who knew how to optimize the metrics but lacked the deep institutional knowledge that had made GreenTech's products excellent.

The engineering managers, who had previously used their judgment to evaluate engineers, were reduced to dashboard readers. Their metis -- their knowledge of which engineers were truly valuable, which projects were truly on track, which risks were truly dangerous -- was no longer relevant. The dashboard had replaced judgment with numbers.

Second-Generation Failure

By 2018, the consequences were becoming apparent to everyone except the dashboard:

Customer satisfaction had declined for eight consecutive quarters. Products that had been reliable and elegant were becoming buggy and bloated. Major clients were beginning to evaluate competitors.

Technical debt -- the accumulated cost of short-term coding decisions that create long-term maintenance burdens -- had exploded. The codebase had grown 40 percent in size while declining in quality, and the maintenance burden was consuming an increasing share of engineering capacity.

Employee turnover in engineering had doubled. Exit interviews consistently cited the metrics system as the primary reason for departure. The engineers who remained were increasingly those who had optimized for the metrics rather than for the work.

The remaining senior engineers -- the few who had not yet left -- wrote a letter to the CEO describing what the dashboard could not show: the degradation of code quality, the accumulation of technical debt, the loss of institutional knowledge, the declining morale. They argued that the metrics system was destroying the company's engineering culture.

The Response

The CEO's response was predictable. She did not abandon the metrics. She added more metrics. Code quality metrics (cyclomatic complexity, test coverage) were added to the dashboard. Customer satisfaction scores were added. Technical debt was estimated and tracked. The dashboard grew from twelve KPIs to thirty-seven.

The additional metrics did not solve the problem. They created new optimization targets that interacted with the existing ones in unpredictable ways. Engineers now had to optimize for lines of code, story points, bug resolution, deployment speed, cyclomatic complexity, test coverage, and customer satisfaction simultaneously. The complexity of the optimization problem overwhelmed any coherent engineering strategy. Teams devoted increasing time to understanding and gaming the metrics and decreasing time to building good software.

The board replaced the CEO in 2019. Her successor's first act was to eliminate eighteen of the thirty-seven KPIs and to restore engineering managers' authority to evaluate their teams using professional judgment alongside the remaining metrics. The recovery took years. The senior engineers who had left did not come back. The institutional knowledge they had carried was gone.

Analysis Questions

1. Scale Invariance. The Virgin Lands Campaign operated at the scale of a continent. GreenTech's KPI system operated at the scale of a single company. Yet the chapter argues they follow the same structural pattern. Map each step of the Arc of Legibility Failure in both cases. Where is the structural correspondence exact? Where does scale change the dynamics?

2. The Metis That Was Lost. In the Virgin Lands case, the lost metis was the ecological knowledge embedded in the steppe grassland (soil structure, water cycling, wind patterns) and the agricultural knowledge of local farmers. In the GreenTech case, the lost metis was the engineering judgment of senior developers and the contextual knowledge of engineering managers. In each case, explain why this knowledge was illegible to the metrics system and why its loss was invisible until the consequences became catastrophic.

3. Extensive vs. Intensive. The chapter describes the Soviet economy's inability to shift from extensive growth (more inputs) to intensive growth (better productivity). Does GreenTech exhibit the same pattern? Is there an analogue to "extensive growth" in corporate management? What would "intensive growth" look like, and why is it illegible?

4. The Doubling-Down Response. Both cases exhibit the doubling-down response: more planning in the Soviet case, more metrics in the GreenTech case. Explain why this response is structurally predictable. Why does the system produce "more of the same" rather than "something different" in response to failure?

5. Could the Trap Have Been Avoided? For each case, identify the decision point at which the legibility trap could have been avoided. What would have needed to be different -- not just the specific decision, but the institutional and epistemological conditions that would have made a different decision possible?

6. Cross-Case Synthesis. Write a one-page (approximately 300-word) analysis arguing that the Soviet Virgin Lands Campaign and GreenTech's KPI system are the same failure at different scales. Your argument should identify the common structural elements, explain why scale and ideology are less important than structure, and address the obvious objection that the comparison trivializes the human suffering of the Soviet case by equating it with a corporate management problem.

7. Connection to Chapter 17 (Redundancy vs. Efficiency). In both cases, the legibility project classified certain elements as "redundant" or "wasteful": the steppe grassland that "wasted" arable land, the senior engineers who "wasted" time on non-metric activities. In each case, explain how the "redundant" element was actually essential, and how its removal made the system fragile.

8. Designing the Escape. Using the principles from Section 20.9, design an alternative management approach for GreenTech that would:

a) Provide the CEO with real visibility into engineering productivity (the legitimate purpose of the original legibility project)

b) Preserve the engineering managers' metis and the senior engineers' institutional knowledge

c) Resist Goodhart distortions by using mixed methods (quantitative and qualitative)

d) Avoid the self-reinforcing mechanisms that maintain legibility traps

e) Include an early warning system for detecting when the metrics are diverging from the reality

Be specific. What would you measure? What would you not measure? Who would have authority to override the metrics? How would you protect the illegible knowledge that the metrics cannot capture?