
Learning Objectives

  • Distinguish game AI from academic AI and articulate why the goals differ.
  • Implement a finite state machine for an enemy in Godot using GDScript.
  • Explain behavior trees, utility AI, and GOAP, and choose between them for a given design.
  • Design perception systems (sight, sound, memory) that the player can read.
  • Apply steering and navigation to produce believable movement through a level.
  • Design allies and companions without creating the "follower problem."
  • Telegraph AI decisions so that intelligent behavior reads as intelligent.
  • Tune difficulty through behavior, not only numeric scaling.
  • Recognize when cheating AI is a legitimate design tool.

Chapter 27: AI Design — Making Enemies and Allies Feel Alive

There is a specific kind of heartbreak that happens the first time you build an enemy and it does not feel like an enemy. You've spent three days on it. You wrote the state machine, drew the sprite, imported the animations, hooked up the hitboxes from Chapter 26, and pressed Play. The enemy stands there. You walk up to it. It sees you. It runs at you in a straight line. You sidestep, it misses, it turns around, it runs at you again in a straight line. Kill it. Move on. You feel nothing. The room feels nothing. You made a piñata, not an enemy.

This is the gap between "it works" and "it reads as alive." Closing that gap is what this chapter is about. It is not a machine learning chapter. We are not training neural networks on footage of Doom Eternal. We are doing something more interesting and older than that: we are building the illusion of intelligence with tools that were mature in 1998 and that still drive the vast majority of shipped games. Finite state machines. Behavior trees. A little steering. A perception system the player can understand. And most importantly, the telegraph — the bark, the animation, the posture change that turns a hidden calculation into something the player sees happening.

Game AI is not about being smart. It is about reading as smart. If a player cannot perceive the intelligence, it might as well not exist. Bungie's Damian Isla, the AI lead on Halo 2, called this "AI honesty" — the principle that the player sees only what the AI does, and so what the AI does must be legible. A grunt that flanks you is worth nothing if you cannot tell it is flanking. A grunt that panics and runs for cover, announcing it with a frightened yelp and a visible dive, is worth a dozen silent geniuses. The design of a great enemy is the design of the moment in which the player says out loud, "Oh, you bastard."

This chapter gives you the toolkit to create those moments. We start with what game AI is (and is not). We build the workhorse — the finite state machine — and implement one in Godot for our progressive project. We climb the complexity ladder to behavior trees, utility AI, and GOAP, and we look at where each makes sense. We cover perception, the part new designers always underestimate, and the navigation layer that makes AI move through a level without getting stuck in doorways. We look at group AI, at allies (which are harder than enemies), and at the telegraphing craft that separates AI that feels alive from AI that merely functions. Then we talk about difficulty — which is a behavior problem more than a numbers problem — and we end by giving you permission to cheat.

By the end of the chapter, the enemies in your game will stop being piñatas.

What Is Game AI?

Start with what it is not. Academic AI — the AI that wins at Go, that drives cars, that generates images — is trying to make decisions that are correct. It models a world, it evaluates utility, it plans, it learns, and its success is measured against external criteria: did it win the game, did it avoid the crash, did it predict the word. This AI is judged by outcomes.

Game AI is judged by experience. An AI that plays StarCraft optimally is not a good game AI — it is a boss fight that is impossible for the player to enjoy. An AI that computes the true optimal path to the player in an action game is not a good enemy — it walks through the map in a straight line and arrives. The goals are inverted. The game AI does not want to win. It wants to produce a specific feeling — fear, surprise, satisfaction when the player outthinks it — and it is measured by whether the player had that feeling.

This sounds soft. It is not. It is a harder engineering problem, because the success criteria are indirect. You cannot unit-test "did the player feel clever." You playtest, you observe, you read the face of the person holding the controller, and you adjust.

Three principles follow from this and should be tattooed on the inside of your skull before we go further.

Principle one: readability beats cleverness. If the player cannot perceive what the AI is doing, the AI might as well be a dice roll. A dumb enemy that visibly reacts is always better than a smart enemy that does not. Every behavior you add must produce a change the player can see or hear or infer from outcomes. If the enemy is flanking, the player should be able to tell. If the enemy is suppressing, the player should be able to tell. If the enemy is scared, the player should absolutely be able to tell. This is Isla's "honesty" principle. It is the single most important rule in this chapter.

Principle two: stupid is good, if it is believably stupid. Real enemies in real games are usually dumb. A grunt in Halo has maybe a dozen states. A zombie in Left 4 Dead has maybe six. What makes them feel alive is not complexity — it is the right behaviors executed with the right timing and the right feedback. A small number of well-tuned behaviors will beat a large number of half-tuned ones every time. Design your enemy's repertoire, not their intelligence.

Principle three: the player is the star. Game AI exists to make the player feel something. It is not there to be admired in its own right. An AI designer who falls in love with their own cleverness — who builds the twelve-state planner because it is interesting — will build an AI that the player cannot read and does not appreciate. You are the bassist. The player is the lead singer. Your job is to make them sound good.

💡 Intuition: If a playtester cannot, after a ten-minute session, describe your enemy's behavior in one sentence — "it chases you, then retreats when hurt," "it tries to flank you," "it freezes you and calls for help" — your enemy has failed. Legibility is the metric.

Finite State Machines: The Workhorse

A finite state machine is the oldest and most common structure in game AI. It is also — and this is the part graduate students hate — almost always enough. Ninety percent of shipped enemies in ninety percent of games you have ever played are finite state machines. Half-Life (1998). Diablo II. World of Warcraft mobs. Every mook in every Far Cry. The Goombas in Super Mario Bros. are a one-state FSM with the state "walk forward until you hit a wall, then turn around."

An FSM is a set of states the agent can be in, plus the transitions between them. At any moment, the agent is in exactly one state. The state defines what the agent does (walk, attack, flee) and what triggers it to change to another state.

A typical enemy FSM has something like this set:

  • Idle — stand still, animate breathing, look around, no threats detected.
  • Patrol — walk between waypoints, scan for threats.
  • Alert — heard or saw something, not yet committed, move to investigate.
  • Chase — target confirmed, pursue.
  • Attack — in range of target, execute attacks.
  • Retreat / Flee — low health or outmatched, fall back.
  • Stunned — hit by a stun effect, recover.
  • Dead — terminal state.

And transitions like:

  • Idle → Alert: heard a sound, saw movement.
  • Alert → Chase: confirmed target.
  • Alert → Idle: investigated, found nothing, returned.
  • Chase → Attack: reached attack range.
  • Attack → Retreat: health dropped below threshold.
  • Any → Stunned: hit by stun.
  • Stunned → previous state: stun timer expired.
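
The states and transitions listed above can be sketched as a flat FSM in a few dozen lines. This is a minimal illustration in Python rather than the chapter's Godot project; the class name, the sensor flags passed to `update`, and the 25% flee threshold are all assumptions made for the example. Patrol and Dead are left out to keep it short.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    ALERT = auto()
    CHASE = auto()
    ATTACK = auto()
    RETREAT = auto()
    STUNNED = auto()


class EnemyFSM:
    """Flat FSM sketch; the update() flags are hypothetical sensor inputs."""

    def __init__(self, flee_health=0.25):
        self.state = State.IDLE
        self.previous = State.IDLE
        self.flee_health = flee_health  # assumed threshold, tune per enemy

    def stun(self):
        # "Any -> Stunned": remember the interrupted state so we can resume it.
        if self.state is not State.STUNNED:
            self.previous = self.state
            self.state = State.STUNNED

    def update(self, heard_noise=False, target_confirmed=False,
               in_attack_range=False, health=1.0, stun_over=False,
               search_done=False):
        s = self.state
        if s is State.STUNNED:
            if stun_over:
                self.state = self.previous
        elif s is State.IDLE and heard_noise:
            self.state = State.ALERT
        elif s is State.ALERT:
            if target_confirmed:
                self.state = State.CHASE
            elif search_done:
                self.state = State.IDLE
        elif s is State.CHASE and in_attack_range:
            self.state = State.ATTACK
        elif s is State.ATTACK and health < self.flee_health:
            self.state = State.RETREAT
        return self.state
```

Printing `fsm.state` every frame is the whole debugging story, which is exactly the appeal. The same shape ports to GDScript as an enum plus a `match` statement.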

The HECU marines in Half-Life 1 (1998) ran on an FSM with roughly this shape, plus group coordination primitives. They would see Gordon, bark "Get him!" or "Grenade!", take cover, shoot, throw grenades, and retreat when suppressed. Players in 1998 thought they were miraculously intelligent. They were an FSM with good telegraphs.

Why FSMs Dominate

FSMs are loved by working AI programmers for three reasons.

They are debuggable. At any moment you can print one line — the current state — and know exactly what the enemy is doing. When a bug appears ("the enemy just stood there when I walked up"), you put a breakpoint on the state change, run the game, and watch the machine. Every other AI architecture is harder to debug than this.

They are predictable for designers. A designer looking at the state diagram on a whiteboard can tell you exactly what the enemy will do in any situation. That predictability is the foundation of tuning. You can say "in the Attack state, the enemy fires every 0.8 seconds," and that statement is true.

They are cheap. Updating an FSM is one function call per tick — an if/else chain or a switch. You can run ten thousand of them in a frame and not care. Other architectures (particularly behavior trees and utility AI) burn more CPU per agent, which matters when your level has three hundred enemies.

The State Explosion Problem

FSMs have one real failure mode, and it is the reason designers eventually reach for other tools. As behaviors get more complex, the number of states grows and, worse, the number of possible transitions grows roughly as the square of the number of states: n states allow up to n(n − 1) directed transitions, so 8 states allow 56 and 15 states allow 210. Every new state needs to consider transitions from every existing state. Add a "grenade dodge" state, and you now have to decide: can you transition to grenade dodge from Idle? From Patrol? From Attack? From Retreat? From Stunned? The transition graph becomes a nest of special cases.

This is called state explosion. It is why FSMs hit a wall at about ten to fifteen states. Beyond that, you spend all your time maintaining the transition logic instead of tuning behavior.

The usual fix is to hierarchize. An HFSM (hierarchical FSM) nests state machines inside states. The "Combat" state contains its own sub-FSM with Attack, Reload, Take-Cover, Dodge. The "Non-Combat" state contains Idle, Patrol, Investigate. Transitions within a sub-FSM don't need to worry about the outer states. This works well and is what most "FSM" implementations in shipped games actually are.
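
A little arithmetic shows why nesting helps. The sketch below counts the worst-case number of directed transitions for a flat machine and for a hierarchical one, assuming (for illustration only) that sub-machines never transition directly into each other's inner states. Real machines use far fewer transitions than the worst case; the point is the ratio.

```python
def max_transitions(n_states):
    # Worst case: every state may need a rule for every other state.
    return n_states * (n_states - 1)


def hfsm_max_transitions(group_sizes):
    # Transitions inside each sub-machine, plus transitions between the
    # groups themselves (each group treated as a single outer state).
    inner = sum(max_transitions(n) for n in group_sizes)
    outer = max_transitions(len(group_sizes))
    return inner + outer


flat = max_transitions(12)                # 12 * 11 = 132
nested = hfsm_max_transitions([4, 4, 4])  # 3 * (4 * 3) + 3 * 2 = 42
```

Twelve states in one flat machine expose up to 132 transitions to reason about; the same twelve states split into three groups of four expose at most 42.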

But at a certain point, even HFSMs become painful, and you reach for a behavior tree.

🛠️ Practitioner Tip: Build your first enemies as flat FSMs with five to seven states. Do not optimize for flexibility you don't need. Most enemies in most games never exceed this complexity. If you hit the wall, hierarchize. If you hit the wall again, switch to behavior trees. Do not start with behavior trees to feel sophisticated.

Behavior Trees

In 2005, Damian Isla, then the AI lead at Bungie, gave a GDC talk titled "Handling Complexity in the Halo 2 AI." It became one of the most influential talks in the history of game AI. Isla explained that the Halo 2 team had moved away from the FSM-per-enemy approach of Halo 1 to a new architecture: the behavior tree.

A behavior tree is a tree of nodes representing behaviors. The tree is "ticked" every frame (or every few frames) starting from the root. Each node, when ticked, returns one of three values: Success, Failure, or Running. The tree's structure — which nodes are children of which — determines how these return values compose.

The key node types:

  • Sequence — runs children in order until one returns Failure or all return Success. Think of it as AND. "Move to cover AND wait for clip AND reload AND resume fire."
  • Selector (or Fallback) — runs children in order until one returns Success or all return Failure. Think of it as OR. "Try to flank, OR try to rush, OR take cover."
  • Decorator — modifies the result of a single child. "Repeat child three times." "Invert the result." "Only run child if health < 30%."
  • Leaf — the actual action. MoveTo(waypoint), Shoot(target), PlayAnimation("reload").

Under this architecture, a typical combat tree might be:

Selector (root)
├── Sequence (flee if damaged)
│   ├── Check (health < 20%)
│   └── MoveTo (nearest retreat point)
├── Sequence (engage if in range)
│   ├── Check (can see target)
│   ├── Check (in weapon range)
│   └── Shoot (target)
└── Sequence (patrol if nothing to do)
    ├── PickNext (patrol point)
    └── MoveTo (patrol point)

Tick the root. The Selector tries the first sequence (flee). If health is fine, the check fails, the sequence fails, the selector moves on. It tries engage. If you cannot see the target, the sequence fails, it moves on. It falls back to patrol.
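
The walk-through above can be reproduced with a very small behavior tree runtime. This is a sketch in Python, not the Halo 2 implementation: the node classes, the `check` and `act` helpers, and the dictionary used as a blackboard are all assumptions for the example. The patrol branch is collapsed into a single leaf, and the tree is re-evaluated from the root every tick with no memory of the running child.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


class Leaf:
    def __init__(self, fn):
        self.fn = fn  # fn(ctx) -> Status

    def tick(self, ctx):
        return self.fn(ctx)


class Sequence:
    def __init__(self, *children):
        self.children = children

    def tick(self, ctx):
        for child in self.children:
            status = child.tick(ctx)
            if status is not Status.SUCCESS:
                return status  # FAILURE or RUNNING stops the sequence
        return Status.SUCCESS


class Selector:
    def __init__(self, *children):
        self.children = children

    def tick(self, ctx):
        for child in self.children:
            status = child.tick(ctx)
            if status is not Status.FAILURE:
                return status  # SUCCESS or RUNNING stops the selector
        return Status.FAILURE


def check(pred):
    return Leaf(lambda ctx: Status.SUCCESS if pred(ctx) else Status.FAILURE)


def act(name):
    # Hypothetical action leaf: records its name and reports RUNNING.
    def run(ctx):
        ctx["action"] = name
        return Status.RUNNING
    return Leaf(run)


combat_tree = Selector(
    Sequence(check(lambda c: c["health"] < 0.2), act("move_to_retreat_point")),
    Sequence(check(lambda c: c["sees_target"]),
             check(lambda c: c["in_range"]),
             act("shoot")),
    Sequence(act("move_to_patrol_point")),
)
```

Ticking the tree with full health and no visible target falls through to patrol; setting `sees_target` and `in_range` selects the shoot branch; dropping health below 20% makes the flee branch win because it sits first in the selector.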

The magic: this structure is composable. You can write sub-trees ("Engage target") and reuse them across enemies by dropping them into different trees. Designers can modify behavior by editing the tree structure in a visual editor, without touching code.

When BTs Beat FSMs

Behavior trees win when:

  • You have many enemies that share some behaviors but differ in others. A BT lets you reuse subtrees; an FSM makes you copy states.
  • Your behaviors are naturally hierarchical. "Engage" is really "pick a tactic and execute it," which is selector-over-sequences.
  • You have designers who should be able to change behavior without programming. Visual BT editors are much easier to hand to a designer than code.

When FSMs Beat BTs

FSMs win when:

  • Behavior is small (< 10 states). The BT overhead is not worth it.
  • You need moment-to-moment control over specific reactions. "On hit, immediately interrupt current behavior and play flinch animation" is easier to hard-wire in an FSM.
  • Tick cost matters and you have thousands of agents. BT traversal, even cached, is more expensive than FSM switch statements.

The Downsides

The downsides of BTs are real and underdiscussed.

Tick cost. Ticking a tree every frame, even with early termination, is more expensive than flipping an FSM switch. The first studios to ship BT-based AI on consoles discovered this the hard way. Shipping BT implementations always include some form of tick-skipping or event-driven invalidation — you don't actually re-traverse the whole tree every frame, you traverse when a relevant fact changes.

Harder to introspect. Standing at a breakpoint and asking "what is this enemy doing?" is harder with a BT than with an FSM. The answer is not "it is in state X," but "it was in the middle of a sequence whose third child is currently in the running state." Debuggers for BT-based AI exist, but they are more complex. Studios that ship BTs build good tooling or suffer.

Design temptation. Designers, given a visual BT editor, tend to build trees that are too complicated. "Engage" grows from three children to thirty. The composability is a double-edged sword. A discipline of pruning is required.

🎮 Case Study: The Halo 2 BT. Isla's 2005 talk describes how the Halo 2 grunts, jackals, and elites each had distinct BTs built from a shared library of leaf actions and sub-behaviors. The "Elite" tree had a "call for backup" subtree that was also used (differently) in the "Grunt" tree, where it meant "scream and flee to nearest ally." One action — "CallForBackup" — meant different things based on where it sat in the tree. Reuse without coupling.

Utility AI

A third architecture, popular in simulation games and increasingly in combat AI, is utility AI. Instead of a rigid state machine or a hierarchical tree, utility AI scores every possible action every tick and picks the one with the highest score.

The architecture looks like:

for each candidate action:
    compute a score from current world state
pick action with highest score
execute that action

The score function — called the utility function — is where the design lives. "How much do I want to reload right now?" is a score. It depends on how full the magazine is, how close the enemy is, whether there is cover, and so on. "How much do I want to flank?" is another score.

The advantage: emergence. With no rigid structure, the AI naturally prioritizes based on context. When the magazine is low and the player is close, the "take cover" score beats the "reload in place" score; when the magazine is low and the player is far, reload-in-place wins. You did not write that logic — it emerged from the shape of the utility functions.
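
Here is a minimal sketch of that reload-versus-cover example. The three score functions, their linear shapes, and the 20-unit "safe distance" are invented for illustration; a real game would use tuned response curves.

```python
def clamp01(x):
    return max(0.0, min(1.0, x))


def score_reload_in_place(ammo_frac, threat_dist, safe_dist=20.0):
    # Wants to reload when the magazine is low and the threat is far away.
    return (1.0 - ammo_frac) * clamp01(threat_dist / safe_dist)


def score_take_cover(ammo_frac, threat_dist, safe_dist=20.0):
    # Wants cover when the magazine is low and the threat is close.
    return (1.0 - ammo_frac) * (1.0 - clamp01(threat_dist / safe_dist))


def score_shoot(ammo_frac, threat_dist, safe_dist=20.0):
    # Wants to shoot whenever there is ammo to spend.
    return ammo_frac


ACTIONS = {
    "reload_in_place": score_reload_in_place,
    "take_cover": score_take_cover,
    "shoot": score_shoot,
}


def choose_action(ammo_frac, threat_dist):
    scores = {name: fn(ammo_frac, threat_dist) for name, fn in ACTIONS.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

With 10% ammo and the player 3 units away, `take_cover` wins; with the same ammo and the player 18 units away, `reload_in_place` wins. Nothing in the code states that rule directly. Returning the full score table alongside the winner is a cheap way to make the "why did it do that?" question answerable.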

The canonical example in game history is The Sims. Every Sim scores every object in the environment ("how much would sitting in that chair satisfy me right now?") and picks the winner. The entire game's behavior — sleep, eat, socialize, shower — is utility selection over thousands of objects.

In combat, F.E.A.R.'s AI (often described as GOAP, which we'll get to) used utility scoring at the action level. Shadow of Mordor's orcs used utility to prioritize goals in their procedural nemesis system. Supermassive's horror games use utility AI for wandering NPCs.

Pros

  • Emergent prioritization. The AI does the right thing in novel situations you didn't anticipate.
  • Easy to add new actions. Just write a new candidate with a new utility function. No need to restructure anything.
  • Natural handling of competing goals. Two goals with similar scores create hesitation or switching — which often reads as thinking.

Cons

  • Hard to debug. "Why did the enemy do that?" becomes "because action A had utility 0.73 and action B had utility 0.71." You trace utility curves, which is more painful than tracing FSM states.
  • Tuning hell. Utility curves are continuous. Get the shape wrong and the AI commits to the wrong thing. Tuning a utility AI to feel good is weeks of work.
  • Indecisive feel. The AI might flip between close-scored actions from one tick to the next, which reads as dithering rather than thinking. Hysteresis (a bonus that favors continuing the current action) helps, but adds complexity.
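
Hysteresis is simple to sketch. The function below gives the action already in progress a small bonus before comparing scores; the function name and the 0.1 bonus are assumptions for the example, and the right value is something you find by playtesting.

```python
def pick_with_hysteresis(scores, current=None, stickiness=0.1):
    """Pick the top-scoring action, giving the current one a small bonus.

    `stickiness` is an assumed tuning value, not a standard constant.
    """
    adjusted = dict(scores)
    if current in adjusted:
        adjusted[current] += stickiness
    return max(adjusted, key=adjusted.get)
```

With scores of 0.71 for flank and 0.73 for rush, an enemy that is already flanking keeps flanking; rush has to lead by more than the bonus before the enemy changes its mind.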

Utility AI shines when behavior should be context-adaptive. It struggles when behavior should be scripted and predictable.

GOAP: Goal-Oriented Action Planning

In 2006, the year after Isla's Halo 2 talk, Jeff Orkin presented a GDC paper describing the AI in F.E.A.R., a Monolith Productions shooter released in 2005 that people are still quoting twenty years later. The paper was called "Three States and a Plan: The AI of F.E.A.R."

The core idea: instead of writing a state machine or a behavior tree that specifies what the enemy does in which situation, give the enemy a set of possible actions (each with preconditions and effects), a set of goals (like "eliminate threat"), and let a planner — an honest-to-god planning algorithm — compute, at runtime, the sequence of actions that achieves the goal.

An action might be:

  • TakeCover — Preconditions: cover exists nearby. Effects: cover = true.
  • Flank — Preconditions: target position known, flanking route exists. Effects: position = flank.
  • Shoot — Preconditions: weapon loaded, target in line of sight. Effects: target damaged.
  • Reload — Preconditions: has ammo. Effects: weapon loaded.

A goal: "target eliminated." Given the current world state, the planner searches over action sequences until it finds one whose cumulative effects satisfy the goal. That sequence becomes the plan, and the enemy executes it.
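
To make the planning step concrete, here is a toy planner over the four actions listed above. It is a sketch under several assumptions: world state is a set of true facts, effects only add facts, the goal is simplified to "target_damaged", Flank is assumed to grant line of sight, and the search is plain breadth-first. Production planners usually attach costs to actions and use a cost-aware search instead.

```python
from collections import deque

# name: (preconditions, effects); both are sets of facts.
ACTIONS = {
    "take_cover": ({"cover_nearby"}, {"in_cover"}),
    "flank": ({"target_position_known", "flank_route_exists"},
              {"at_flank", "line_of_sight"}),  # line_of_sight is an assumed effect
    "shoot": ({"weapon_loaded", "line_of_sight"}, {"target_damaged"}),
    "reload": ({"has_ammo"}, {"weapon_loaded"}),
}


def plan(start_facts, goal_facts, actions=ACTIONS):
    """Breadth-first search for the shortest action sequence reaching the goal."""
    start = frozenset(start_facts)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        facts, steps = queue.popleft()
        if goal_facts <= facts:
            return steps
        for name, (pre, eff) in actions.items():
            if pre <= facts:  # all preconditions hold
                nxt = frozenset(facts | eff)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None  # no plan exists
```

Give the planner an enemy with ammo, a known target position, and a flanking route, and it returns flank, reload, shoot. Nobody wrote that sequence down; it falls out of the preconditions and effects.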

Why F.E.A.R. Feels So Smart

The effect, in F.E.A.R., was spooky. Players would fire at a Replica soldier, it would duck behind a desk, and a moment later they would hear, from behind them, the click of a boot. It had flanked. Not because someone wrote a "flank" state — because the planner, given the goal "eliminate threat" and the world state "direct line of fire unavailable," found the sequence "move to flanking route → move to flanking position → fire."

Combined with F.E.A.R.'s aggressive barks ("I'm flanking left!" "Suppressing!") and crisp animations (the soldiers kicked tables over for cover), the AI read as terrifying. Players described it as "the smartest AI I've ever fought" for years afterward. The Replica soldiers didn't do anything an FSM couldn't do — they picked from a small set of actions. But the planner picked those actions in response to the situation, which meant the enemy never did the same thing twice in a row, and the enemies never all did the same thing, and the player's mental model shifted from "I am fighting enemies" to "they are fighting me."

Why Nobody Copied It

If GOAP is so great, why aren't all shooters GOAP-based? Several reasons.

It's expensive to build. Writing and tuning the action set, the precondition/effect descriptions, the heuristic for the planner — all of this is much more work than writing a behavior tree or an FSM. Orkin's team spent years on it. Most teams don't have that time.

It's hard to design for specific moments. The power of GOAP is emergence. The cost of emergence is: you can't script. "In this room, I want the Replica to kick the table then flank" is hard to enforce when the planner might decide another sequence is better. Scripting specific moments in a GOAP world requires special-case hooks that fight against the architecture.

BTs got good enough. By the late 2000s, behavior trees with enough conditional logic could approximate most of what GOAP offered, at a fraction of the implementation cost. Studios standardized on BTs.

The barks did most of the work. One of F.E.A.R.'s secrets was its barks. "I'm flanking!" "Suppress him!" "He's behind the crate!" These lines made the AI read as coordinated even when it was, sometimes, just each soldier doing its own thing. A lot of the "F.E.A.R. AI is so smart" reputation comes from readability more than planning. Subsequent shooters learned to invest in barks without investing in GOAP.

Today, GOAP is rare in shipping titles. It lives on in academic AI, in a few indie experiments, and in the hearts of people who played F.E.A.R. in 2005 and have been chasing the feeling ever since.

🎮 Case Study: See case-study-01.md for a deeper walkthrough of F.E.A.R.'s GOAP architecture and the specific design choices that produced the flanking feel.

Perception: Sight, Sound, Memory

None of the architectures above matter if your AI cannot tell what is going on. Perception is the sensory layer — the system that answers, for each agent, "what do you know about the world right now?" New designers underinvest in this layer. Experienced designers know it is where the feel of the enemy lives.

Sight

The default model is a vision cone: a 2D or 3D wedge in front of the enemy, with an angle (the field of view) and a range (how far it can see). Anything inside the cone is "potentially visible." From there, you do a line-of-sight check — a raycast from the enemy's head to the target's center of mass — to see if the view is blocked by geometry. If the raycast hits no obstacle, the target is seen.
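
A 2D version of that two-step test, cone first and raycast second, might look like the sketch below. The `is_blocked` callback is a hypothetical stand-in for whatever raycast your engine provides, and positions are plain (x, y) tuples.

```python
import math


def in_vision_cone(eye, facing, target, fov_deg, view_range):
    """2D cone test. `facing` must be a unit vector; points are (x, y) tuples."""
    dx, dy = target[0] - eye[0], target[1] - eye[1]
    dist = math.hypot(dx, dy)
    if dist > view_range:
        return False
    if dist == 0.0:
        return True
    # Cosine of the angle between the facing direction and the target direction.
    cos_angle = (dx * facing[0] + dy * facing[1]) / dist
    return cos_angle >= math.cos(math.radians(fov_deg / 2.0))


def can_see(eye, facing, target, fov_deg, view_range, is_blocked):
    # `is_blocked(a, b)` is a hypothetical stand-in for the engine's raycast.
    return (in_vision_cone(eye, facing, target, fov_deg, view_range)
            and not is_blocked(eye, target))
```

Running the cheap cone test before the raycast matters when you have many agents: most targets are rejected by range or angle and never cost a physics query.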

Refinements that buy a lot of feel:

  • Peripheral vision is short-range. Add a second, wider, shorter cone. Things in peripheral vision are seen only at close range.
  • Alertness modifies cone size. An alert guard sees in a wider, longer cone than a bored one. This is how Dishonored distinguishes patrol from alert.
  • Crouching or cover reduces visibility. If the target is crouched, check the raycast against their head position, which might be behind cover. Metal Gear Solid V's guards formalized this into a set of visibility states (hidden, partially hidden, visible).
  • Movement attracts attention. Moving targets are easier to see. A target standing still in a dark corner can be missed.

Hearing

Hearing is modeled as a radius around the enemy, plus events that the sound-maker emits. The player runs — that emits a sound event at their position with a certain loudness. The enemy hears any sound event whose position is within the enemy's hearing radius, modulated by the sound's loudness.

Good hearing systems distinguish between sound types. A footstep sound is a mild alert. A gunshot is a major alert. A scream is a specific event that might trigger a specific response (run toward it, run away from it). Alien: Isolation famously propagates sound through the ship — the alien "hears" you from across a level if you make enough noise, and its perception model explicitly avoids giving the player a free pass for being far away.
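
The radius-plus-loudness model fits in a few lines. In this sketch the loudness multipliers are invented numbers, and sound travels in a straight line with no occlusion, which is a simplification compared with games that propagate sound through level geometry.

```python
import math

# Assumed loudness multipliers per sound type; tune these per game.
LOUDNESS = {"footstep": 0.5, "gunshot": 3.0, "scream": 2.0}


def hears(listener_pos, hearing_radius, sound_pos, sound_type):
    """A sound is heard if it falls inside the loudness-scaled radius."""
    effective_radius = hearing_radius * LOUDNESS[sound_type]
    return math.dist(listener_pos, sound_pos) <= effective_radius
```

A guard with a hearing radius of 10 misses a footstep 8 units away but hears a gunshot from the same spot. The sound type can then decide which response to trigger.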

Memory and Last-Known Position

Here is the piece that separates AI from roomba. When the enemy loses sight of the player — the player runs behind a wall, ducks into cover — the enemy does not forget. It remembers where it last saw the player and goes there. This is the last-known position (LKP).

The Metal Gear Solid alert cycle is built around LKP. When a guard spots you, they trigger Alert. If they lose you, they move to Evasion, which has them searching the LKP area. If they find no one, they downgrade to Caution (heightened patrol) for a while, then eventually back to Normal. This cycle is what makes MGS feel like a stealth game instead of a puzzle game. You cannot just walk past a guard and be forgotten.
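
A cool-down cycle of this kind can be sketched as a small memory object. The phase names below are generic labels rather than any particular game's terminology, and the two durations are assumed tuning values.

```python
class GuardMemory:
    """Tracks last-known position (LKP) and steps alertness down over time."""

    PHASES = ["alert", "search", "heightened_patrol", "normal"]

    def __init__(self, search_time=10.0, wary_time=20.0):
        self.phase = "normal"
        self.lkp = None
        self.timer = 0.0
        # Assumed durations in seconds for the two cool-down phases.
        self.durations = {"search": search_time, "heightened_patrol": wary_time}

    def update(self, dt, visible_target_pos=None):
        if visible_target_pos is not None:
            self.phase = "alert"
            self.lkp = visible_target_pos  # refresh memory while in sight
            return self.phase
        if self.phase == "alert":
            self._enter("search")  # lost sight: go look around the LKP
        elif self.phase in self.durations:
            self.timer -= dt
            if self.timer <= 0.0:
                nxt = self.PHASES[self.PHASES.index(self.phase) + 1]
                self._enter(nxt)
        return self.phase

    def _enter(self, phase):
        self.phase = phase
        self.timer = self.durations.get(phase, 0.0)
```

The important detail is that `lkp` survives the loss of sight. The search behavior reads it to decide where to walk, and the player sees a guard heading for the spot where they were last seen.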

Perception Cues for the Player

Here is the single most important rule of perception design: the player must be able to see what the AI sees.

If the guard is about to spot you, the player needs to know that. Not from the outcome (getting shot) but from the process (a visible sign of rising suspicion). The canonical solution is the Metal Gear Solid alert marker: a question mark over a guard who is suspicious, an exclamation mark over one in full alert. Splinter Cell used a light meter, showing the player how visible they were in their current lighting. Assassin's Creed shows a filling awareness bar above each enemy.

Without these cues, stealth games feel unfair. The player gets spotted, has no idea why, and assumes the AI cheated. With the cues, the player understands the situation, accepts detection as their fault, and learns. The cue is how the AI proves it is being honest.
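
A filling awareness meter is one of the easiest cues to build. The sketch below assumes a single 0-to-1 value per guard that fills while the player is visible and drains otherwise; the rates and the 0.3 "suspicious" threshold are made-up tuning values.

```python
class AwarenessMeter:
    """Fills while the player is visible, drains otherwise; drives the on-screen cue."""

    def __init__(self, fill_rate=0.5, drain_rate=0.25, suspicious_at=0.3):
        self.value = 0.0
        self.fill_rate = fill_rate      # per second, assumed tuning value
        self.drain_rate = drain_rate    # per second, assumed tuning value
        self.suspicious_at = suspicious_at

    def update(self, dt, player_visible):
        rate = self.fill_rate if player_visible else -self.drain_rate
        self.value = max(0.0, min(1.0, self.value + rate * dt))
        return self.cue()

    def cue(self):
        if self.value >= 1.0:
            return "alert"
        if self.value >= self.suspicious_at:
            return "suspicious"
        return "none"
```

The UI reads `value` to draw the bar and `cue()` to pick the icon and bark. Because detection takes time to fill, the player gets a window to react, and a detection that does happen feels earned.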

💡 Intuition: Every internal state in your AI that affects the player should have an external sign. If the enemy is suspicious, they should look suspicious — head turn, pause, audible "hm?" If the enemy is searching, they should search visibly — sweep flashlight, call "Hello?", open doors. The internal state is for you, the designer. The external sign is for the player.

Steering and Navigation

Navigation is the "how do I get there" problem. Steering is the "how do I move without looking like a robot" problem. They are different layers.

Pathfinding

For global navigation — "I am here, the player is there, plot a route" — games use A* (A-star) over some representation of the world. The representation is usually a navmesh: a polygonal mesh covering the walkable areas of the level, pre-computed offline or baked by the engine.

In Godot 4.x, the navmesh is baked by NavigationRegion2D (2D) or NavigationRegion3D (3D) nodes, which consume collision geometry and produce the mesh. Agents use NavigationAgent2D / NavigationAgent3D to request paths to targets, and the engine handles the A* and path smoothing.

A* gives you a list of points from A to B. That is the skeleton of movement. By itself, it looks like a Roomba on rails — the agent marches along the line, snapping heading at each corner.
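
If you have never seen A* written out, here is a compact version. It is a sketch on a 4-connected grid rather than a navmesh, purely to keep the example short; in an engine such as Godot you would request the path from the navigation system instead of writing this yourself.

```python
import heapq


def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[y][x] == 1 is a wall. Returns a list of (x, y)."""
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan distance

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while open_heap:
        _, g, cur = heapq.heappop(open_heap)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        if g > cost[cur]:
            continue  # stale heap entry, a cheaper route was found later
        x, y = cur
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= ny < len(grid) and 0 <= nx < len(grid[0])):
                continue
            if grid[ny][nx] == 1:
                continue
            new_g = g + 1
            if new_g < cost.get((nx, ny), float("inf")):
                cost[(nx, ny)] = new_g
                came_from[(nx, ny)] = cur
                heapq.heappush(open_heap, (new_g + h((nx, ny)), new_g, (nx, ny)))
    return None  # goal unreachable
```

The return value is exactly the "list of points" described above: correct, shortest, and completely lifeless until a steering layer follows it.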

Local Steering

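On top of pathfinding sits the steering layer, which handles the last few meters and the frame-to-frame smoothness of motion. Craig Reynolds introduced boids in 1987, a flocking model built from separation, alignment, and cohesion, and later generalized the idea into a catalogue of steering behaviors in his 1999 paper "Steering Behaviors for Autonomous Characters." Combined, these produce flocking, crowd movement, and natural-looking individual motion: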

  • Seek — move toward a target.
  • Flee — move away from a target.
  • Arrive — seek, but decelerate as you get close.
  • Separation — push away from nearby agents.
  • Alignment — match velocity with nearby agents.
  • Cohesion — move toward the average position of nearby agents.
  • Obstacle avoidance — turn away from obstacles.

Most games combine a few of these. For a single enemy chasing the player: Arrive (to reach the player without overshooting) + Separation (to avoid clumping with other enemies) + Obstacle avoidance (to slide around pillars).
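
The chase combination can be sketched as two small vector functions and a weighted sum. Everything here is illustrative: positions are (x, y) tuples, the weights and radii are assumed defaults, and obstacle avoidance is left out because it depends on your engine's raycasts.

```python
import math


def _scale_to(vx, vy, length):
    mag = math.hypot(vx, vy)
    if mag == 0.0:
        return (0.0, 0.0)
    return (vx / mag * length, vy / mag * length)


def arrive(pos, target, max_speed, slow_radius):
    """Desired velocity toward target, slowing down inside slow_radius."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    speed = max_speed * min(1.0, dist / slow_radius)
    return _scale_to(dx, dy, speed)


def separation(pos, neighbors, radius):
    """Push away from neighbors inside radius, harder when they are closer."""
    px, py = 0.0, 0.0
    for n in neighbors:
        dx, dy = pos[0] - n[0], pos[1] - n[1]
        dist = math.hypot(dx, dy)
        if 0.0 < dist < radius:
            strength = (radius - dist) / radius
            px += dx / dist * strength
            py += dy / dist * strength
    return (px, py)


def chase_velocity(pos, target, neighbors, max_speed=4.0,
                   slow_radius=3.0, sep_radius=1.5, sep_weight=2.0):
    ax, ay = arrive(pos, target, max_speed, slow_radius)
    sx, sy = separation(pos, neighbors, sep_radius)
    vx, vy = ax + sx * sep_weight, ay + sy * sep_weight
    if math.hypot(vx, vy) > max_speed:
        vx, vy = _scale_to(vx, vy, max_speed)  # clamp to top speed
    return (vx, vy)
```

Feed `target` the next point on the A* path rather than the player's position and you get an agent that follows the route while drifting away from its neighbors.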

The Doorway Problem

Every AI programmer has a story about the doorway. You hand a build to a playtester. They walk into a room. Four enemies try to chase them out. Two of them reach the door at the same time. They jam. They wiggle. They oscillate in place. The player dies laughing.

The doorway problem is the general problem of multiple agents contending for a narrow passage. Solutions include:

  • Reservation — only one agent may use a door segment at a time; others wait.
  • Priority — the leader goes first, others fall in line.
  • Dynamic avoidance — agents see each other as soft obstacles and route around, but this often produces the wiggle.
  • Cheating — temporarily disable collision between agents in crowded spaces, so they pass through each other. Players rarely notice.

Shipping games use some mix of these. The point is: out-of-the-box navigation does not handle crowds. Plan for it.
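
Of the options above, reservation is the simplest to sketch. The class below is a hypothetical helper, one instance per doorway, and it assumes agents call `request` before entering and `release` once they are through.

```python
class DoorReservation:
    """Only one agent may hold the doorway at a time; others queue in order."""

    def __init__(self):
        self.holder = None
        self.waiting = []

    def request(self, agent_id):
        """Return True if the agent may walk through now, False if it must wait."""
        if self.holder is None or self.holder == agent_id:
            self.holder = agent_id
            return True
        if agent_id not in self.waiting:
            self.waiting.append(agent_id)
        return False

    def release(self, agent_id):
        """Call once the agent has cleared the doorway; hands off to the next in line."""
        if self.holder == agent_id:
            self.holder = self.waiting.pop(0) if self.waiting else None
```

An agent that gets `False` should do something visible while it waits, such as stepping aside or aiming through the gap, so the queue reads as discipline rather than as a stall.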

Group AI and Squad Behavior

One enemy is a problem. Five enemies is a different problem. When enemies share a space, the question is no longer just "what does this enemy do" but "what does this group do."

The naive approach — each enemy runs its own AI independently — produces the doorway problem and worse. All five enemies chase the player at once. All five arrive at the same time, stack on top of each other, and swing simultaneously. This is bad combat. It reads as mindless.

Good group AI imposes coordination.

Role Assignment

When a group spots the player, assign roles. One becomes the engager — runs in, attacks aggressively. Two become flankers — move around. One becomes the support — stays back, provides suppressing fire or ranged attacks. One becomes the caller — sounds the alarm, calls for backup.
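
One simple way to hand out those roles is by distance to the player. The rule used here (closest engages, next two flank, farthest calls for help, everyone else supports) is an assumption for the sketch, not how any particular shipped game does it.

```python
import math


def assign_roles(enemy_positions, player_pos):
    """Closest enemy engages, next two flank, farthest calls, the rest support.

    `enemy_positions` maps a hypothetical enemy id to an (x, y) tuple.
    """
    by_distance = sorted(enemy_positions,
                         key=lambda e: math.dist(enemy_positions[e], player_pos))
    roles = {}
    for rank, enemy in enumerate(by_distance):
        if rank == 0:
            roles[enemy] = "engager"
        elif rank <= 2:
            roles[enemy] = "flanker"
        else:
            roles[enemy] = "support"
    if len(by_distance) >= 5:
        roles[by_distance[-1]] = "caller"  # farthest is safest to raise the alarm
    return roles
```

Re-run the assignment when an enemy dies or when the player moves a long way, not every frame, or roles will flicker and the group will look confused.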

The Halo AI groups worked approximately like this. Elites often took lead roles, grunts filled support, jackals provided ranged suppression. When the elite died, the grunts panicked — their designated leader was gone, and the group's role assignment collapsed. This produced the famous "grunts fleeing in terror" moment that players remember twenty years later.

The Designated Leader Trick

A classic cheat: within each group, one enemy is the "leader." Only the leader runs the expensive group-level decision logic ("should we advance, hold, retreat?"). The others follow the leader's decisions with some personal variation. This scales: with ten enemies split into three groups, you run three leader AIs, not ten.

When the leader dies, the group picks a new leader (or panics, if you want Halo-style drama).

Ring Combat: The One-at-a-Time Rule

For melee combat, most games enforce what designers call the attacking ring. At most one or two enemies may attack the player simultaneously; the rest orbit, close, and wait their turn. This is why fighting twelve enemies in Arkham Asylum is possible — most of them are circling, not attacking.
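
The ring is usually implemented as a token system: an enemy must hold a token to attack and gives it back when the attack ends. A minimal sketch, with the class name and the default of one attacker as assumptions:

```python
class AttackRing:
    """Hands out a limited number of attack tokens; everyone else circles."""

    def __init__(self, max_attackers=1):
        self.max_attackers = max_attackers
        self.attackers = set()

    def try_attack(self, enemy_id):
        if enemy_id in self.attackers:
            return True
        if len(self.attackers) < self.max_attackers:
            self.attackers.add(enemy_id)
            return True
        return False  # caller should orbit the player and retry later

    def finish_attack(self, enemy_id):
        # Return the token when the attack ends or the enemy dies.
        self.attackers.discard(enemy_id)
```

Raising `max_attackers` is a clean difficulty knob: the enemies do not get stronger, the player simply gets less breathing room.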

The Assassin's Creed series, the Batman Arkham series, Shadow of Mordor, Middle-earth: Shadow of War — all enforce ring combat. Players experience swarming but never face simultaneous twelve-way attacks. Without the ring, melee combat against a group is unplayable. With it, it feels like an action-movie set piece.

⚠️ Pitfall: New designers building melee combat forget to implement a ring and then wonder why their game is impossible. The first thing to build for group melee is "only one enemy may be in the attack ring at a time." Everything else is refinement.

Allies, Companions, and Pets

Allies are harder than enemies. Repeat that to yourself until it sticks.

The reason: enemies can die. If an enemy does something dumb, the player kills it, and the problem evaporates. If an ally does something dumb, it lingers — standing in the doorway, blocking your exit, shooting at nothing, failing to heal you when you need it. The player has to look at it. They can't make it go away.

Worse, a bad ally destroys the fantasy. A companion who is supposed to be competent — a fellow soldier, a romantic partner, a trained animal — exposes every seam when they misfire. They fall off ledges. They run in front of your gun. They get stuck on doorframes. Every designer who has built companions knows the pain.

The BioShock Infinite Elizabeth Approach

Irrational's solution in BioShock Infinite (2013) was to make Elizabeth, the game's companion, into something specifically engineered to never get in the way. She has no hitbox the player can collide with. She does not take damage. She does not block doorways (the game teleports her past if the player approaches). She cannot die. She cannot even miss. What she does is provide the player with things at exactly the right moment — tossing ammo when you're running low, tossing health when you're hurt, tossing salts (mana) when you're depleted, opening tears (portals) when combat needs variety.

Every design decision around Elizabeth was in service of the principle: she must never be a problem for the player. She is not meant to be a second agent in the game world. She is meant to feel like a second agent — through her animations, her barks, her looking-at-what-you're-looking-at animation pipeline — while being, mechanically, a device that gives the player resources.

This approach is rare because it is expensive and limiting. Elizabeth cannot take initiative; she cannot outplay the player; she cannot surprise them. She is functionally a friendly UI. But by being a friendly UI with the full animation and voice acting of a character, she reads as a companion without ever failing as one.

See case-study-02.md for a deeper look at Elizabeth's design.

The Ashley Problem

The opposite end is Ashley in Resident Evil 4 (2005). Ashley is an escort — Leon has to keep her alive while fighting through a village of enemies. She follows Leon. She has her own hitbox. She can be grabbed by enemies and carried off. The player has to protect her.

Ashley became a notorious pain point. She got stuck on corners. She stood in the doorway Leon was trying to cross. She took hits from enemies Leon didn't see. Escort missions in general (Ashley, Natalya in GoldenEye 007 back in 1997, every escort in every stealth game) became an internet meme for frustration.

The fix, in the Resident Evil 4 remake (2023), was specific: simplify Ashley's behavior, make her less grabbable, make her follow more smoothly, make her cower when asked. Capcom cut the worst friction and kept the tension. The escort still mattered, but the ally stopped being the problem.

General Rules for Companions

  • They should never block the player. If the ally is in the way, teleport them or phase their collision.
  • They should never take the player's shot. If the ally is between the player and a target, move them or make them temporarily invulnerable to friendly fire.
  • They should provide rather than need. An ally who gives the player resources (ammo, heals, buffs) is loved. An ally who demands resources (protect me!) is tolerated at best.
  • They should be honest about their capabilities. If the ally is going to pathfind poorly, make them magical — let them teleport around corners when the player isn't looking. Do not force the player to watch them fail.
  • Their barks should acknowledge the player's situation. "Nice shot!" when the player lands a headshot. "You okay?" when the player is at low health. The ally feels alive when they notice.
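The first and fourth rules can be sketched together. This example assumes a hypothetical `CompanionFollow` script with a `VisibleOnScreenNotifier2D` child. The distances and the teleport offset are placeholder values, and a real version would also check that the landing spot is walkable.

```gdscript
# CompanionFollow.gd
# Hypothetical companion: never collides with the player, and quietly
# teleports next to them when it falls far behind while off-screen.
class_name CompanionFollow
extends CharacterBody2D

@export var player: Node2D
@export var catch_up_distance: float = 400.0
@export var teleport_offset: Vector2 = Vector2(-48.0, 0.0)

# Assumes a VisibleOnScreenNotifier2D child sized to the companion's sprite.
@onready var screen_notifier: VisibleOnScreenNotifier2D = $VisibleOnScreenNotifier2D

func _ready() -> void:
    # Phase through the player instead of body-blocking doorways.
    var body := player as PhysicsBody2D
    if body:
        add_collision_exception_with(body)
        body.add_collision_exception_with(self)

func _physics_process(_delta: float) -> void:
    if player == null:
        return
    var too_far := global_position.distance_to(player.global_position) > catch_up_distance
    # Only cheat when the player cannot watch it happen.
    if too_far and not screen_notifier.is_on_screen():
        global_position = player.global_position + teleport_offset
```

Normal following (for example through a NavigationAgent2D) is left out to keep the sketch short. The point is the two escape hatches: no collision with the player, and an off-screen catch-up.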

Telegraphing AI Decisions

We have arrived at the part of the chapter that will make the biggest difference to your game. All the architecture in the world — FSMs, BTs, GOAP, utility — means nothing if the player cannot read the AI. The solution is telegraphing.

Barks

A bark is a short voice line spoken by the AI that announces state or intent. "Grenade!" when they throw one. "He's on the catwalk!" when they spot the player. "I'm hit!" when they take damage. "Reloading!" when they reload. "Flanking left!" when they move to flank.

Barks do three things. First, they inform the player. Second, they create the illusion of coordination (enemies hearing each other's barks feels like communication even if it isn't). Third, they personify the enemy — the voice makes it a someone instead of a something.

F.E.A.R.'s barks are a textbook case. Half-Life 2's Combine radio chatter. Left 4 Dead's infected noises. Destiny's Fallen gibberish (which is barks in an invented language — still readable by tone). Every shooter you have ever thought had smart AI had good barks.

Budget accordingly. A full bark set for a single enemy type is hundreds of lines: every state transition, every event, every combination. For indie games, you compromise — you use fewer lines, layer them carefully, avoid repetition. For AAA, you use tens of thousands of lines and still notice when an enemy says the same thing twice.
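With a small line budget, the two things that matter most are a cooldown per event and never playing the same line twice in a row. A minimal sketch, assuming a hypothetical `BarkPicker` that returns line ids for an audio system to play:

```gdscript
# BarkPicker.gd
# Hypothetical bark selector with a per-event cooldown and no immediate repeats.
class_name BarkPicker
extends RefCounted

var cooldown_sec: float
var _lines: Dictionary = {}      # event name -> Array of line ids
var _last_line: Dictionary = {}  # event name -> line id played last
var _last_time: Dictionary = {}  # event name -> time of last bark, in seconds

func _init(p_cooldown_sec: float = 4.0) -> void:
    cooldown_sec = p_cooldown_sec

func add_lines(event: String, lines: Array) -> void:
    _lines[event] = lines

# Returns a line id to play, or "" if the event is on cooldown or has no lines.
# `now` is the current time in seconds, e.g. Time.get_ticks_msec() / 1000.0.
func pick(event: String, now: float) -> String:
    var lines: Array = _lines.get(event, [])
    if lines.is_empty():
        return ""
    if _last_time.has(event) and now - _last_time[event] < cooldown_sec:
        return ""
    var choices := lines.duplicate()
    if choices.size() > 1:
        choices.erase(_last_line.get(event, ""))
    var line: String = choices.pick_random()
    _last_line[event] = line
    _last_time[event] = now
    return line
```

Passing the time in as a parameter keeps the picker independent of the scene tree, which also makes it easy to exercise outside a running game.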

Animation Telegraphs

Animation tells the player what the AI is about to do, half a second before it does it. A boss winds up for a heavy attack with a visible telegraph — a pose, a flash, a growl. The player reads the telegraph, dodges. The hit lands or doesn't based on the player's read.

Without the telegraph, the attack is unfair. Even if the attack is technically dodgeable, the player experiences it as instant, as cheating, as bad. With the telegraph, the attack becomes a conversation: the AI commits, the player responds, the outcome depends on the read.

Dark Souls (which we covered in Chapter 26) is built entirely on this principle. Every enemy attack has a readable windup. The player's mastery is the mastery of reading the windups. The AI does not need to be smart — it needs to be legible.

Posture and Body Language

Before an enemy attacks, their stance changes — weight forward, weapon raised, breathing hard. Before they flee, their stance changes — weight back, eyes wide, visible panic. The state of the AI lives in the body.

This is where animation and AI design marry. Your AI programmer and your animator should be in constant communication. The animator needs to know what states exist so they can design animations for each. The programmer needs to know what animations exist so the state machine can drive them.

🛠️ Practitioner Tip: Build a debug overlay that shows every enemy's current state as a floating text label in the game view. You will catch more bugs in an hour of playtesting with this overlay than in a week without it. Ship an AI without this tool and you will spend forever chasing phantom issues.
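One possible shape for that overlay, assuming the `EnemyFSM` script built later in this chapter (with its `State` enum and `state_changed` signal). The `StateDebugLabel` script name and the `show_in_release` flag are made up for this sketch.

```gdscript
# StateDebugLabel.gd
# Hypothetical debug helper: add as a Label child of an EnemyFSM node.
extends Label

@export var show_in_release: bool = false

func _ready() -> void:
    # Hidden automatically outside debug builds unless explicitly enabled.
    visible = OS.is_debug_build() or show_in_release
    var enemy := get_parent() as EnemyFSM
    if enemy == null:
        push_warning("StateDebugLabel expects an EnemyFSM parent.")
        return
    enemy.state_changed.connect(_on_state_changed)
    text = EnemyFSM.State.keys()[enemy.current_state]

func _on_state_changed(_from_state: EnemyFSM.State, to_state: EnemyFSM.State) -> void:
    # Enum values index straight into the list of enum names.
    text = EnemyFSM.State.keys()[to_state]
```

Because the label reacts to the signal rather than polling, it costs nothing while the state is stable, and it shows the same transitions your bark and UI hooks will receive.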

Difficulty and AI

Difficulty scaling is usually handled badly. The bad way: on higher difficulty, enemies have more HP and do more damage. That's it. This is lazy and produces a worse experience at higher difficulty, not a better one. The fight takes longer, and the player has less margin for error. The game doesn't get smarter; it gets tedious.

Good difficulty scaling changes behavior.

  • Aggression. On higher difficulty, enemies close distance faster, attack more often, retreat less.
  • Group tactics. On higher difficulty, enemies flank more, suppress more, use cover more. The same FSM runs different probability weights on transitions.
  • Perception. On higher difficulty, vision cones are wider, hearing radii larger, last-known-position memories longer.
  • Variety. Higher difficulty introduces new enemy types. Easier difficulty uses fewer.
  • Mechanics revealed. Easier difficulty hides mechanics (enemies never grapple you, never throw grenades). Harder difficulty turns them on.
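The dimensions above can be gathered into one data object per difficulty setting. The sketch below assumes a hypothetical `DifficultyProfile` resource. Every field name and number is a placeholder to be replaced through playtesting.

```gdscript
# DifficultyProfile.gd
# Hypothetical resource: one instance per difficulty setting, read by the
# FSM and perception code instead of multiplying HP and damage.
class_name DifficultyProfile
extends Resource

@export_range(0.0, 1.0) var flank_chance: float = 0.2    # Group tactics
@export_range(0.0, 1.0) var retreat_chance: float = 0.5  # Aggression (lower is bolder)
@export var vision_range_scale: float = 1.0              # Perception
@export var memory_seconds: float = 3.0                  # Last-known-position memory
@export var can_throw_grenades: bool = false             # Mechanics revealed

static func easy() -> DifficultyProfile:
    var p := DifficultyProfile.new()
    p.flank_chance = 0.05
    p.retreat_chance = 0.7
    p.vision_range_scale = 0.8
    p.memory_seconds = 2.0
    return p

static func hard() -> DifficultyProfile:
    var p := DifficultyProfile.new()
    p.flank_chance = 0.5
    p.retreat_chance = 0.2
    p.vision_range_scale = 1.3
    p.memory_seconds = 6.0
    p.can_throw_grenades = true
    return p

# Roll against a weight wherever an FSM transition is optional.
func roll(chance: float) -> bool:
    return randf() < chance
```

An enemy would then ask `profile.roll(profile.flank_chance)` at the moment it chooses between charging and flanking. The state machine stays the same across difficulties while the behavior the player sees changes.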

DOOM (2016) uses behavior scaling: its Nightmare setting (which Ultra-Nightmare reuses, adding permadeath) does not just amp numbers, it makes the demons aggressive to a degree that never appears on lower settings. Resident Evil 4 unlocks enemy types (crossbow Ganados, parasite heads) at higher difficulty. XCOM: Enemy Unknown gives aliens access to more abilities on harder campaigns.

Chapter 33 will discuss the ethics of adaptive difficulty — the systems that silently adjust behind the player's back. For this chapter, the takeaway is: difficulty is a design dimension of AI behavior, not just a stat multiplier. Design for it from the start.

Cheating AI (And Why It's Fine)

Designers often agonize over AI that "cheats." Should the driving game AI see the full track? Should the strategy game AI know the full map? Should the boss know exactly where the player is? The usual answer, after much hand-wringing, is: yes, and it's fine, and everyone does it.

Civilization difficulty levels openly cheat. At Deity difficulty, the AI starts with more units, researches faster, and ignores rules the player must follow. This is public information — the game tells you it's cheating. Players accept it because the game is honest about the tradeoff.

Mario Kart's rubber-band AI cheats. The enemies in the back get speed boosts; the enemies in the front hit mysterious slowdowns. This is a well-documented part of the series. Players accept it because it produces the tight finishes the game is built around.

Dark Souls enemies have fewer i-frames than the player, smaller hitboxes than their models suggest, and sometimes worse tracking than a real agent would have. These are gifts to the player to make combat feel fair. The AI is cheating in the player's favor.

The rule is not "don't cheat." The rule is cheat in the direction of experience. If cheating makes the game feel better, cheat. If it makes the game feel worse, don't. The player does not audit your code. They audit how the game feels. An AI that pretends not to see you when it does — because the designer wants you to catch your breath — is a good AI.

What you should never do is cheat against the player in a way that feels unfair. The AI has perfect aim and headshots you through cover. The boss reads your inputs and counters every move. These cheats exist in some games (FPS aimbot AI, fighting game "reading AI") and they feel terrible. They break the trust. The player detects dishonesty even if they can't name it.

⚖️ Honest Cheating: Tell your playtesters the AI cheats. Ask them if they noticed. If they didn't notice, and they enjoyed the game, the cheat is good. If they noticed and felt wronged, the cheat is bad. The test is effect, not principle.

Implementing an Enemy FSM in Godot

Enough theory. Let's build it. The progressive project this chapter implements is an enemy with five core states (Idle, Patrol, Alert, Chase, Attack) plus Retreat and Dead, along with perception and navmesh movement. This is EnemyFSM.gd, the script that will be introduced into the continuity tracker.

State Enum and Setup

# EnemyFSM.gd
# Attached to the enemy's root node (CharacterBody2D).
# Requires child nodes: PerceptionSystem, NavigationAgent2D, AnimationPlayer.

class_name EnemyFSM
extends CharacterBody2D

enum State { IDLE, PATROL, ALERT, CHASE, ATTACK, RETREAT, DEAD }

@export var patrol_points: Array[Vector2] = []
@export var patrol_wait_time: float = 1.5
@export var chase_speed: float = 120.0
@export var patrol_speed: float = 50.0
@export var attack_range: float = 32.0
@export var alert_duration: float = 3.0
@export var vision_range: float = 220.0
@export var vision_angle: float = deg_to_rad(60.0)  # Half-angle
@export var retreat_health_threshold: float = 0.25

@onready var nav_agent: NavigationAgent2D = $NavigationAgent2D
@onready var anim: AnimationPlayer = $AnimationPlayer
@onready var perception: PerceptionSystem = $PerceptionSystem

signal state_changed(from_state: State, to_state: State)

var current_state: State = State.IDLE
var state_time: float = 0.0
var target: Node2D = null
var last_known_position: Vector2 = Vector2.ZERO
var current_patrol_index: int = 0
var health: float = 100.0
var max_health: float = 100.0

Point of technique: the state_changed signal lets other systems — animation, audio, UI — react to transitions without polling. This is how you hook up bark lines ("I see you!") to state entry.

The Update Loop

func _physics_process(delta: float) -> void:
    state_time += delta

    # Perception runs every tick regardless of state.
    target = perception.get_visible_target()
    if target:
        last_known_position = target.global_position

    # Delegate to per-state handler.
    match current_state:
        State.IDLE:      _update_idle(delta)
        State.PATROL:    _update_patrol(delta)
        State.ALERT:     _update_alert(delta)
        State.CHASE:     _update_chase(delta)
        State.ATTACK:    _update_attack(delta)
        State.RETREAT:   _update_retreat(delta)
        State.DEAD:      velocity = Vector2.ZERO  # Stop a corpse from sliding on its last velocity.

    # Global transitions — these can fire from any state.
    if health <= 0 and current_state != State.DEAD:
        _change_state(State.DEAD)
    elif health / max_health < retreat_health_threshold and current_state in [State.CHASE, State.ATTACK]:
        _change_state(State.RETREAT)

    move_and_slide()

Two design notes. First, perception runs every tick, no matter what state we're in — it's the agent's senses, not a behavior. Second, we separate state-specific transitions (handled inside each _update_*) from global transitions (checked at the end). Low-HP and death can fire from anywhere. State-local transitions (like "reached patrol waypoint") belong inside the relevant state handler.

State Handlers

func _update_idle(_delta: float) -> void:
    velocity = Vector2.ZERO
    anim.play("idle")
    if target:
        _change_state(State.ALERT)
    elif state_time > 2.0 and not patrol_points.is_empty():
        _change_state(State.PATROL)

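func _update_patrol(_delta: float) -> void:
    # ALERT can time out into PATROL even when no route is configured.
    if patrol_points.is_empty():
        _change_state(State.IDLE)
        return

    var goal = patrol_points[current_patrol_index]
    nav_agent.target_position = goal

    if nav_agent.is_navigation_finished():
        # Waiting at the waypoint; state_time counts time since arrival.
        anim.play("idle")
        velocity = Vector2.ZERO
        if state_time > patrol_wait_time:
            current_patrol_index = (current_patrol_index + 1) % patrol_points.size()
            state_time = 0.0
    else:
        anim.play("walk")
        # Hold the wait clock at zero while travelling so the pause starts on arrival.
        state_time = 0.0
        var next_pos = nav_agent.get_next_path_position()
        velocity = (next_pos - global_position).normalized() * patrol_speed

    if target:
        _change_state(State.ALERT)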

func _update_alert(_delta: float) -> void:
    velocity = Vector2.ZERO
    anim.play("alert")  # Animation: looking around, head turn.
    if target:
        # Confirmed target — engage.
        _change_state(State.CHASE)
    elif state_time > alert_duration:
        # Lost it — return to patrol.
        _change_state(State.PATROL)

func _update_chase(_delta: float) -> void:
    anim.play("run")
    nav_agent.target_position = last_known_position
    if not nav_agent.is_navigation_finished():
        var next_pos = nav_agent.get_next_path_position()
        velocity = (next_pos - global_position).normalized() * chase_speed
    else:
        velocity = Vector2.ZERO

    if target and global_position.distance_to(target.global_position) < attack_range:
        _change_state(State.ATTACK)
    elif not target and (nav_agent.is_navigation_finished() or global_position.distance_to(last_known_position) < 8.0):
        # Reached LKP (or the path ended short of it) and target is gone.
        # Search briefly then return.
        _change_state(State.ALERT)

func _update_attack(_delta: float) -> void:
    velocity = Vector2.ZERO
    anim.play("attack")
    # Combat integration (from Chapter 26): CombatSystem fires when animation hits.
    if not target or global_position.distance_to(target.global_position) > attack_range * 1.5:
        _change_state(State.CHASE)

func _update_retreat(_delta: float) -> void:
    anim.play("run")
    # Run away from last known player position.
    var flee_direction = (global_position - last_known_position).normalized()
    velocity = flee_direction * chase_speed
    if state_time > 3.0:
        # Caught breath — re-evaluate.
        _change_state(State.ALERT if target else State.IDLE)

Read this carefully. This is what a shipping FSM looks like — not academic pseudocode. Notice how each state does a specific thing, checks specific transition conditions, and calls _change_state explicitly. There is no implicit flow. A bug will be locatable to a single function.

The Transition Helper

func _change_state(new_state: State) -> void:
    if new_state == current_state:
        return
    var old_state = current_state
    current_state = new_state
    state_time = 0.0
    state_changed.emit(old_state, new_state)
    # Hook: audio/bark system listens to state_changed and plays barks.
    # Hook: UI listens for ALERT → CHASE to show exclamation mark.

func take_damage(amount: float) -> void:
    health -= amount
    # Getting hit grabs attention: an enemy with no visible target snaps to ALERT.
    if target == null and current_state not in [State.RETREAT, State.DEAD]:
        _change_state(State.ALERT)

The take_damage function is an example of an event-driven transition — damage is not polled in the per-tick state handler; it's an external event that forces a transition.

Perception System

# PerceptionSystem.gd
# Attached as a child node of the enemy. Emits perceived-target info.

class_name PerceptionSystem
extends Node2D

@export var vision_range: float = 220.0
@export var vision_half_angle: float = deg_to_rad(60.0)
@export var hearing_radius: float = 160.0
@export var target_group: String = "player"

@onready var owner_body: Node2D = get_parent()
var current_target: Node2D = null

func get_visible_target() -> Node2D:
    var candidates = get_tree().get_nodes_in_group(target_group)
    for candidate in candidates:
        if _can_see(candidate):
            return candidate
    return null

func _can_see(body: Node2D) -> bool:
    var to_target = body.global_position - owner_body.global_position
    if to_target.length() > vision_range:
        return false
    # Forward is the owner's +X axis, so the owner must rotate to face where it looks.
    var forward = Vector2.RIGHT.rotated(owner_body.global_rotation)
    # angle_to() is signed (-PI..PI); compare its magnitude so both edges of the cone clip.
    if absf(forward.angle_to(to_target)) > vision_half_angle:
        return false
    # Line-of-sight check: cast a ray and see what it hits first.
    var space = owner_body.get_world_2d().direct_space_state
    var query = PhysicsRayQueryParameters2D.create(
        owner_body.global_position, body.global_position
    )
    # Assumes the owner is a CollisionObject2D (e.g. CharacterBody2D), which provides get_rid().
    query.exclude = [owner_body.get_rid()]
    var result = space.intersect_ray(query)
    return result.is_empty() or result["collider"] == body

func hear_sound(source_pos: Vector2, loudness: float) -> void:
    # Called by SoundEvent broadcaster (player footsteps, gunshots, etc.)
    var distance = owner_body.global_position.distance_to(source_pos)
    if distance < hearing_radius * loudness:
        # Feed last-known-position into the owner FSM.
        if owner_body.has_method("heard_sound"):
            owner_body.heard_sound(source_pos)

Hearing is event-driven, not polled. The player's footstep emits a sound event (a signal through the audio manager, or a direct call), and every enemy within earshot gets hear_sound called. The enemy's FSM can then, in heard_sound, decide to flip to Alert and set the LKP.
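The EnemyFSM listing above does not define heard_sound, so here is one way it could look. It is written as a hypothetical `HearingEnemy` subclass to keep the example self-contained. In practice you might add the method to EnemyFSM.gd directly.

```gdscript
# HearingEnemy.gd
# Hypothetical subclass showing one way to receive PerceptionSystem.hear_sound().
class_name HearingEnemy
extends EnemyFSM

func heard_sound(source_pos: Vector2) -> void:
    # A noise never overrides a target the enemy can already see.
    if target != null:
        return
    if current_state in [State.RETREAT, State.DEAD]:
        return
    last_known_position = source_pos
    # Unaware enemies perk up. Swap ALERT for CHASE here if you want the
    # enemy to walk over and investigate; CHASE already drops back to ALERT
    # once it reaches the last known position without finding anyone.
    if current_state in [State.IDLE, State.PATROL]:
        _change_state(State.ALERT)
```

On the broadcasting side, one simple option is to put every PerceptionSystem node in a group (say, "perception", an assumed name) and have the player call `get_tree().call_group("perception", "hear_sound", global_position, loudness)` on each footstep or gunshot.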

# NavmeshFollow.gd
# A helper for agents that want navigation without writing an FSM.
# Use standalone for wandering NPCs; the EnemyFSM integrates equivalent logic internally.

class_name NavmeshFollow
extends CharacterBody2D

@export var move_speed: float = 80.0
@export var arrival_distance: float = 8.0

@onready var nav_agent: NavigationAgent2D = $NavigationAgent2D

var following: bool = false

func move_to(target_position: Vector2) -> void:
    nav_agent.target_position = target_position
    following = true

func _physics_process(_delta: float) -> void:
    if not following:
        return
    if nav_agent.is_navigation_finished():
        velocity = Vector2.ZERO
        following = false
        return
    var next = nav_agent.get_next_path_position()
    var direction = (next - global_position).normalized()
    velocity = direction * move_speed
    move_and_slide()

A note for setup: NavigationAgent2D in Godot 4.x needs a NavigationRegion2D with a baked navigation polygon in the scene. If the agent isn't finding paths, first check that the navmesh baked correctly. Level designers almost always forget to re-bake after changing geometry.

Progressive Project Update (Chapter 27)

For the project, deliver:

  1. One enemy type running the FSM above: the five core states (Idle, Patrol, Alert, Chase, Attack) plus Retreat. Use your existing enemy sprite from Chapter 17's level or your combat chapter (26) iterations.
  2. A patrol path of 3-5 points in Level 1, placed in Godot via a child Path2D or an exported Array.
  3. A vision cone (via PerceptionSystem), with a visible debug gizmo (a _draw override; mark the script @tool if you want it drawn in the editor) so you can tune the cone size in the level editor.
  4. Navmesh-driven pathing via NavigationRegion2D baked over your Level 1 geometry. Ensure walls block paths.
  5. One bark line (or text popup, if you don't have voice acting) on transition to Alert ("Huh?"), Chase ("Intruder!"), and Retreat ("Fall back!"). Route these through the Chapter 30 AudioManager once that chapter's code lands, or a placeholder for now.
  6. A debug overlay that prints the enemy's current state above its head. Keep this in a debug toggle for development.
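For deliverable 3, a vision-cone gizmo could look roughly like this. `VisionConeGizmo` is a hypothetical script meant to sit on a Node2D child of the enemy. Its range and angle are separate exports here and must be kept in sync with PerceptionSystem by hand, which is a simplification.

```gdscript
# VisionConeGizmo.gd
# Hypothetical debug gizmo: outlines the vision cone in the editor and in debug builds.
@tool
extends Node2D

@export var vision_range: float = 220.0:
    set(value):
        vision_range = value
        queue_redraw()

@export_range(0.0, 180.0, 1.0) var half_angle_deg: float = 60.0:
    set(value):
        half_angle_deg = value
        queue_redraw()

@export var color: Color = Color(1.0, 1.0, 0.0, 0.6)

func _draw() -> void:
    # Always draw in the editor; in a running game, only in debug builds.
    if not Engine.is_editor_hint() and not OS.is_debug_build():
        return
    var half := deg_to_rad(half_angle_deg)
    # Local +X is "forward", matching Vector2.RIGHT.rotated(rotation).
    draw_line(Vector2.ZERO, Vector2.RIGHT.rotated(-half) * vision_range, color, 1.0)
    draw_line(Vector2.ZERO, Vector2.RIGHT.rotated(half) * vision_range, color, 1.0)
    draw_arc(Vector2.ZERO, vision_range, -half, half, 24, color, 1.0)
```

The property setters call queue_redraw() so that dragging the values in the inspector updates the outline immediately.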

Playtest your single enemy. Walk around the level. Run toward it. Duck behind cover. Does it lose sight of you and return to patrol? Does it chase to the last known position? Does it give up if it can't find you? Does it retreat when you hurt it? Iterate until the answer to each is yes and the timing feels right.

Then, only then, duplicate the enemy and place three more in the level. You will discover the doorway problem. You will discover the clump problem. You will discover that "one enemy feels great" does not imply "four enemies feel great." This is expected. Chapter 28 touches on multi-agent coordination in the networking context; Chapter 32 will teach you to balance.

Common Pitfalls

Silent AI. The enemy has nine states, a flanking behavior, a squad coordination system, and a planner. The player sees none of it. Fix: barks, animation telegraphs, visible state cues. Do not ship AI without a legibility pass. A dumb AI with good telegraphs beats a genius AI with none every time.

Too smart. The enemy always flanks. Always takes cover. Always shoots first. Always wins the firefight. Players hate it. They cannot tell if they're outplayed or cheated. Fix: add mistakes. Make the AI miss the first shot. Make it hesitate before committing. Make it pick suboptimal cover sometimes. Believable intelligence includes believable fallibility.

No recovery from edge cases. The enemy chases the player off a cliff. Gets stuck in a doorway. Wanders into lava. Pathfinds into a wall it can't escape. Fix: set timeouts on every state. If a state has been active for more than N seconds with no progress, force transition to a safe state. Detect stuck-on-obstacle conditions and teleport or force-move. Assume everything that can go wrong will, and add exit hatches.
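A minimal sketch of such an exit hatch, assuming a hypothetical `StuckWatchdog` helper that a moving state (Patrol, Chase) feeds with its position every tick. The thresholds are placeholders.

```gdscript
# StuckWatchdog.gd
# Hypothetical helper: reports when an agent that is trying to move has made
# no real progress for a while, so the FSM can bail out to a safe state.
class_name StuckWatchdog
extends RefCounted

var min_progress: float  # Distance the agent must cover...
var window_sec: float    # ...within this many seconds.

var _anchor: Vector2 = Vector2.ZERO
var _elapsed: float = 0.0
var _started: bool = false

func _init(p_min_progress: float = 16.0, p_window_sec: float = 2.0) -> void:
    min_progress = p_min_progress
    window_sec = p_window_sec

# Call on every state change so old timing does not leak into the new state.
func reset() -> void:
    _started = false
    _elapsed = 0.0

# Call once per physics tick while the agent intends to move.
# Returns true when the agent looks stuck.
func update(position: Vector2, delta: float) -> bool:
    if not _started or position.distance_to(_anchor) >= min_progress:
        _anchor = position
        _elapsed = 0.0
        _started = true
        return false
    _elapsed += delta
    return _elapsed >= window_sec
```

When update() returns true, the calling state can transition to Alert or Idle, or teleport the agent back to its last good waypoint. Which recovery is right depends on the game.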

Navmesh holes. The level artist adds a new section. The navmesh isn't rebaked. Enemies cannot pathfind into the new area. Fix: a pre-commit check or build step that re-bakes navigation whenever level geometry changes. Plus visual inspection — the navmesh overlay in the Godot editor shows you the walkable areas; scan it before every milestone build.

All the enemies share one brain. You wrote the AI once and applied it to every enemy type. They all do the same thing. The game is monotonous. Fix: differentiate by role. Your melee enemy charges; your ranged enemy kites; your caster stays back and zones. Different behaviors for different types, using the same architecture. One FSM class, many configured instances.

Summary

Game AI is not machine learning. It is the engineering of the illusion of intelligence for the purpose of producing a feeling in the player. The core tools — finite state machines, behavior trees, utility AI, GOAP — differ in structure but share a goal: make the agent produce readable, believable behavior without requiring the player to understand what is happening under the hood.

FSMs are the workhorse. Most shipped AI is an FSM (sometimes hierarchical). BTs win when behaviors are composable and designer-editable. Utility AI wins when context-adaptivity matters more than predictability. GOAP is rare but legendary, and its legacy lives in the barks and telegraphs that make AI read as smart regardless of architecture.

Perception — sight, sound, memory, last-known-position — is where experienced designers invest, and where beginners underinvest. The player must be able to see what the AI sees, literally through visible cues like alert markers.

Steering and navigation solve the "how do I move" problem, with A* pathfinding over a navmesh and local steering for crowd behavior. The doorway problem is eternal.

Group AI requires coordination — role assignment, designated leaders, ring combat. Allies are harder than enemies because they cannot die to absolve their sins; design them as resource providers (like Elizabeth) rather than fellow combatants (like Ashley).

Telegraphs — barks, animations, posture — are the difference between AI that functions and AI that feels alive. Build them in from the start.

Difficulty is behavior, not just stats. Cheating AI is fine if it cheats toward experience.

Your progressive project gains a real enemy this chapter. Build it, playtest it, iterate. In Chapter 28, we'll look at how these AI systems survive the transition to networked multiplayer — which is a very different problem, because in multiplayer, half the "enemies" are other people. In Chapter 30, we'll return to AI through the audio lens, building the bark system that makes all of this legible. In Chapter 33, we'll revisit difficulty and adaptation as ethics questions — because the line between "the AI accommodates the player" and "the AI manipulates the player" is thinner than it looks.

Now go build an enemy that feels alive.