Learning Objectives

  • Distinguish action, strategy, and hybrid combat paradigms and identify which fits a given design intent.
  • Design a combat verb set small enough to teach and deep enough to sustain hours of play.
  • Apply frame data (startup, active, recovery) and hitbox/hurtbox reasoning to any real-time combat system.
  • Use game feel techniques (hit pause, screen shake, particles, audio layering) to make hits read as impacts.
  • Build encounters whose challenge comes from enemy composition and terrain rather than inflated HP.
  • Design a multi-phase boss fight with telegraphs, arena, and rhythm that teaches while it tests.
  • Implement a functional attack-state machine, hitbox/hurtbox system, and boss phase controller in Godot 4.x.

Chapter 26: Combat Design — Action, Strategy, and the Art of the Fight

Combat is the system that consumes more of your studio's budget than any other. It is the system your players will spend the largest share of their time inside. It is the system they will quote back to you years later, sometimes frame-accurately, when they describe what made your game feel good or feel bad. And it is the system that, above all others, cannot be hidden behind narrative or visuals or marketing. If your combat is bad, the player will know inside fifteen seconds. They will not be able to explain why. They will simply stop playing.

I have watched teams iterate on a single attack — one swing of one sword — for nine months. I have watched combat leads run a stopwatch against every enemy encounter, shaving frames off recovery animations until the rhythm clicked. I have watched a designer scrap an entire boss three weeks before ship because the second phase did not teach what the first phase rehearsed. The ratio of design hours to player seconds in combat is wild. A fight the player experiences for ninety seconds may have been worked and reworked by four people over four months. That ratio is why combat is one of the hardest design problems in games, and it is also why the chapters before this one matter so much — because combat is where everything you learned about feedback (Chapter 8), about flow and challenge (Chapters 11 and 13), and about progression (Chapter 25) gets stress-tested under pressure.

This chapter is a practitioner's map of the combat-design space. We will look at the three broad combat paradigms — action, strategy, and the hybrids that dominate modern design — and how the choice among them constrains everything else. We will build a vocabulary borrowed from the fighting-game community: frame data, hitboxes, hurtboxes, cancels. We will translate these concepts to any real-time combat, 2D or 3D, indie or AAA. We will study the anatomy of a great encounter and a great boss. We will write a working combat system in Godot — one that you will extend in Chapter 27 when you bolt AI onto it and again in Chapter 32 when you balance it. By the end, you should be able to look at any fight in any game and diagnose what is working, what is not, and what you would change.

One more warning before we begin. Combat design is the domain where player-facing designers and technical engineers collaborate most closely, and where the collaboration most often breaks down. The designer says "this attack should feel heavy." The engineer asks "heavy how? Longer startup? More screen shake? A lower-pitched hit sound?" The designer says "I don't know — heavier." Across the industry, teams that ship great combat have learned to translate "feel" into measurable frame counts, pixel distances, and audio envelopes. Teams that ship bad combat are still arguing about feel. This chapter will give you the vocabulary to be on the right side of that split.

Why Combat Design Is Hard

Every system in your game has a design-to-play ratio, and combat has the worst one. A movement system might take two weeks to tune and serve the player for the entire game. A dialogue system might take two months to build and carry every conversation the game needs. A combat system will take two years and be redone three times.

Here is why. Combat is the system that most demands to be felt, not understood. The player does not care about your clever state machine; they care about the tiny haptic pulse in the controller when their sword connects. Combat is the system that runs hottest — it executes every frame, with dozens of transient objects spawning and despawning per second, and every one of those events is a chance for a bug the player will notice. Combat is the system most coupled to every other system: animation, audio, visual effects, camera, UI, physics, AI, and level design all meet inside it. A combat fix may require edits across all eight departments.

And combat is the system with the narrowest margin for error. In a platformer, a jump arc that is ten percent wrong is noticeable but survivable. In combat, a ten-percent wrong attack animation — ten percent too long, ten percent too weak, ten percent late on its active frames — is unplayable. The human nervous system is trained, by millions of years of predator-prey evolution, to detect bad timing. Your players' bodies know when combat is wrong before their brains do.

💡 Intuition: Every combat system is a feedback loop between the player's hands and the screen. The loop closes in something like 150 milliseconds, round trip. If any stage of that loop — input read, state update, animation, rendering, audio, visual effect — is late or inconsistent, the loop breaks and the combat feels bad. "Feels bad" is a technical diagnosis, not an aesthetic one. It means the loop is broken.

What follows from this is a working principle. Combat is not designed; it is tuned. You will not sit down and architect a combat system that works on day one. You will build a rough version. You will play it. You will hate it. You will adjust one value — one frame count, one pixel of hitbox extent, one decibel on the hit sound — and play again. You will do this five hundred times. Somewhere around the four hundredth iteration, the thing will snap into place and feel right. Then you will hand it to a playtester, they will report it feels bad, and you will discover the next problem. Combat is the part of game design where the craft is most visible in the number of passes, not the cleverness of the idea.

Action vs. Strategy vs. Hybrid

Before you touch a hitbox, you need to decide what kind of combat you are designing. The decision is a triangle: action, strategy, hybrid.

Action combat resolves moment to moment through execution. The player sees a threat, the player presses a button at the right time, the threat is handled or it is not. Devil May Cry is the purest expression: your skill is measured in how stylishly you chain moves, and the moves are almost entirely a test of reflex and muscle-memory. The design task for action combat is to make the mechanics rich enough to sustain mastery while being legible enough to start enjoying on minute one. Devil May Cry's answer is a stratified system — a newcomer can enjoy basic combos while an expert performs jump-cancels that extend mid-air strings indefinitely. Bayonetta extends this with Witch Time, a parry-like dodge that rewards perfect timing with slow motion.

Strategy combat resolves through planning and positioning. Time pauses, or becomes turn-based, or so slow that calculation becomes possible. XCOM is the archetype: a turn ends, you see the battlefield, you consider the angles and cover and action economy, you commit a sequence of moves, and the consequences play out. The design task here is not making a reflex feel good — it is making a decision space that rewards thinking about it. Into the Breach takes this to its logical end: perfect information. You see exactly what the enemies will do next turn. The entire challenge is finding the move sequence that prevents damage. The game becomes a chess puzzle, and from the designer's perspective, every level is a puzzle handcrafted to have a solution.

Hybrid combat sits between. Dark Souls is the most-cited example: real-time enough to be action, slow enough and deliberate enough to be tactical. Every encounter asks you to read the enemy, choose a tool, commit to the approach, and manage the stamina economy that caps your actions. Bloodborne pushes the action dial. Sekiro pushes the strategy dial by making every fight about reading and deflection rather than dodging. Monster Hunter is action combat built on a strategy-combat skeleton: the actual swings and dodges are real-time, but you are also reading wind direction, monster fatigue, weapon matchups, and which limb to break for crafting materials.

🎮 Case Study: The Dark Souls middle path. When FromSoftware built Demon's Souls in 2009, the lead designers had a conviction: combat should feel weighty, like hitting something real, and it should feel fair, like every death was the player's fault. The resulting design sits exactly where hybrid works best. Attacks have long startup and long recovery, so committing to a swing is a real decision. The stamina bar means you cannot spam. The lock-on focuses the camera and turns ranged geometry into a two-enemy duel. Roll i-frames are brief — thirteen frames in Dark Souls on a well-timed roll — so dodging is a test of reading, not mashing. The player learns, over hundreds of deaths, that every encounter has a rhythm. That rhythm is the game.

Which of these should your combat be? It depends on the player fantasy (Chapter 1) you wrote on day one. If your fantasy is "be a stylish demon hunter," you are action. If your fantasy is "command a small squad against impossible odds," you are strategy. If your fantasy is "die and learn and become equal to a terrible place," you are hybrid — and you have signed up for the hardest combat problem to solve well, because both halves have to work.

The trap is middle-of-the-road hybrid that satisfies no one. Too slow for the action audience, too twitchy for the strategy audience. When you see a game criticized for "feeling clunky," it often means the designers wanted action speed but shipped action commitment, or they wanted tactical depth but built for reflex. Pick a lane, commit, and borrow from the other lane only to sharpen your lane.

The Combat Verb List

Open a blank page. Write down every verb the player can perform in combat. That list is your combat. It is also your tutorial scope, your UI scope, your animation budget, and your balancing matrix.

For most modern action games, the list looks something like this:

  • Attack (light)
  • Attack (heavy)
  • Dodge / roll
  • Block
  • Parry
  • Special / magic / ranged
  • Grab / throw
  • Jump / use vertical space

Eight verbs. Each one requires an input binding, an animation set, a set of cancel rules into and out of other verbs, a matching set of enemy responses, a sound design pass, and a tutorial beat. Eight verbs is already a lot. If you go to ten, you have blown up your testable matrix; if you go to twelve, you probably have a prototype no one fully understands.
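
To see why the matrix blows up, count the ordered pairs of verbs that each need a cancel rule. This is a rough sketch in Python rather than engine code, and it assumes every verb can in principle transition into every other verb; `cancel_pairs` is a hypothetical helper name.

```python
def cancel_pairs(verb_count: int) -> int:
    """Ordered (from, to) pairs of distinct verbs that each need a cancel rule."""
    return verb_count * (verb_count - 1)


# A few list sizes, to show how quickly the tuning surface grows.
for n in (4, 8, 10, 12):
    print(f"{n} verbs -> {cancel_pairs(n)} transitions to define and test")
```

Going from 8 verbs to 12 adds four verbs but 76 transitions (56 to 132), and that is before enemy responses are multiplied in.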

The counter-intuitive truth: fewer verbs produce deeper combat, not shallower. Constraint is the engine of creativity (our recurring theme since Chapter 7). Hyper Light Drifter has three verbs: sword, gun, dash. That is it. No block, no parry, no jump, no heavy attack, no magic. And yet the combat is rich enough that players still post frame-perfect boss-fight clips eight years after release. The depth comes from timing relationships among the three verbs. Dash is an i-frame dodge. Gun shots get refilled by sword hits, so you weave melee and ranged. Every enemy is designed to pressure those three verbs, so the combat vocabulary feels deep even though it is small.

Hollow Knight has a similar economy. Attack, dash, jump, spell. Four verbs. The depth emerges from how those four verbs combine across 25 hours of enemies.

Sekiro, perhaps surprisingly, has a very short list. Attack, deflect, dodge, jump, grappling hook, prosthetic tool. The revolution of Sekiro is not in the number of verbs — it is in how central deflect becomes, how much of the combat is routed through one verb rehearsed in every fight. We will come back to this in the case study.

🛠️ Practitioner Tip: Before you implement any combat, write your verb list on a whiteboard and defend every entry. For each verb, answer: "What does this verb let the player do that no other verb does?" If the answer is "a small variation on another verb," cut the verb. You will save months of tuning. The game will be better.

Hitboxes, Hurtboxes, and Frame Data

The fighting-game community has, over thirty years, developed the most precise vocabulary in game design for talking about attacks. Every discipline borrows from it now, and you should too, because the concepts apply to any real-time combat system regardless of genre or perspective.

Hitboxes and Hurtboxes

A hitbox is a volume (in 2D, a 2D shape; in 3D, a 3D volume) attached to an attacker that, if it overlaps a hurtbox during the active frames of an attack, registers a hit. A hurtbox is a volume attached to a character that, if overlapped by an active hitbox, causes that character to take damage.

Hitboxes and hurtboxes are emphatically not the same as visual models. The character you see on screen is a rendered model with arms and legs and a sword. The hitboxes and hurtboxes are simpler, often abstracted shapes — capsules, boxes, circles — that approximate where the damage actually happens. A classic rookie mistake is to use the rendered mesh as the hitbox. Do not do this. Hitboxes need to be designer-controlled because the feel of an attack depends on the size and placement of the box.

Example: in most fighting games, a character's hurtbox shrinks during a crouch. The visible model is also crouching, but the hurtbox is smaller than the model — attacks that would visually clip the knee or shoulder miss because the hurtbox has tucked inside. This is intentional. Crouching is a defensive move, and the combat system rewards the crouch with reduced hittable area. The visuals follow the mechanics, not the other way around.
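
The overlap rule and the crouch example can be sketched with axis-aligned rectangles. This is an illustrative Python sketch, not any engine's API: the `Box` class, `registers_hit`, and every coordinate below are assumptions made up for the example, with y increasing downward as in most 2D screen space.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle; (x, y) is the top-left corner, in pixels."""
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Box") -> bool:
        return (self.x < other.x + other.w and other.x < self.x + self.w
                and self.y < other.y + other.h and other.y < self.y + self.h)


def registers_hit(hitbox: Box, hurtbox: Box, hitbox_active: bool) -> bool:
    """A hit counts only if the boxes overlap while the hitbox is active."""
    return hitbox_active and hitbox.overlaps(hurtbox)


# Hypothetical numbers: a high swing against a standing vs. crouching target.
swing = Box(x=40, y=0, w=30, h=20)
standing = Box(x=60, y=0, w=20, h=60)
crouching = Box(x=60, y=30, w=20, h=30)  # hurtbox tucked lower and smaller

print(registers_hit(swing, standing, hitbox_active=True))    # True
print(registers_hit(swing, crouching, hitbox_active=True))   # False
print(registers_hit(swing, standing, hitbox_active=False))   # False
```

The boxes are plain data a designer can resize without touching the art, which is the point of keeping them separate from the mesh.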

Example: in Dark Souls, the lock-on targeting reticle sits on an invisible anchor that is often a small sphere on the enemy's center of mass — not on their rendered silhouette. This means you can lock on to an enemy behind a thin wall and sometimes hit them with a spear through that wall. This is either a bug or a feature depending on your perspective; FromSoftware has left it in for fifteen years because it makes certain shots satisfying.

Frame Data

Every attack has three primary phases, measured in frames (at 60 FPS, a frame is 1/60 of a second or ~16.67 ms):

Startup frames. The time from input press to the moment the hitbox becomes active. This is the "wind-up." During startup, the attack is committed — the player can no longer cancel it freely — but the hitbox is not yet dangerous.

Active frames. The time the hitbox is "live" and can register hits on hurtboxes. This is the strike itself, the only part of the attack that can deal damage.

Recovery frames. The time from the end of active frames until the character can act again. This is the "punish window" — an enemy observing the recovery can attack back during this vulnerable interval.

In Street Fighter II, Ryu's standing medium punch has 5 startup frames, 3 active frames, and 13 recovery frames. Total animation: 21 frames, or about 350 milliseconds at 60 FPS. That might sound like nothing, but fighting-game players have trained themselves to process these numbers as intuitively as a chess player processes rank and file. They know that Ryu's medium punch is "plus 5 on hit" (after the attack lands, Ryu recovers 5 frames before the opponent can act) and "minus 3 on block" (after the attack is blocked, Ryu recovers 3 frames after the opponent can act — meaning a fast enough opponent counter-attack will land).

You do not need to expose your frame data to the player. You do need to think in it. Every attack in your game has a startup, active, and recovery. The feel of your combat is the pattern of those numbers.
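
One way to start thinking in frame data is to store the three numbers per attack and ask which phase a given frame falls in. The sketch below is a minimal Python illustration; `FrameData`, `phase_at`, and `frames_to_ms` are hypothetical names, and the example values are the medium-punch numbers quoted above.

```python
from dataclasses import dataclass

FPS = 60


@dataclass(frozen=True)
class FrameData:
    startup: int
    active: int
    recovery: int

    @property
    def total(self) -> int:
        return self.startup + self.active + self.recovery

    def phase_at(self, frame: int) -> str:
        """Phase for a 0-based frame index counted from the input press."""
        if frame < 0 or frame >= self.total:
            return "idle"
        if frame < self.startup:
            return "startup"
        if frame < self.startup + self.active:
            return "active"
        return "recovery"


def frames_to_ms(frames: int, fps: int = FPS) -> float:
    return frames * 1000.0 / fps


medium_punch = FrameData(startup=5, active=3, recovery=13)
print(medium_punch.total, frames_to_ms(medium_punch.total))  # 21 350.0
print(medium_punch.phase_at(5))                               # active
```

Keeping the three counts as data, separate from the animation, is what lets you adjust one frame and replay without re-exporting anything.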

Common frame-data wisdom:

  • Light attacks should have short startup (4-8 frames) and short recovery. They are the tempo of combat. You throw them out, they whiff, no punishment.
  • Heavy attacks should have long startup (15-30 frames) and long recovery. They commit the player. A player who throws a heavy when they should have thrown a light eats the punishment.
  • Dodge rolls have a window of invincibility frames (i-frames) — usually 10-16 frames — during which the hurtbox is disabled. Before and after the i-frame window, the rolling character is fully hittable. This creates the "just barely dodged" feeling that Dark Souls is famous for.
  • Parries have a tiny active window (often 2-6 frames) during which the parry is possible. Miss by a frame and you are committed to the parry recovery, which is long. This makes parry a high-risk/high-reward tool.
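
The i-frame window in a dodge roll is simply a span of frames during which the hurtbox is switched off. A minimal Python sketch follows. The roll length, window start, and window size are made-up tuning values, not numbers from any shipped game.

```python
def hurtbox_enabled(roll_frame: int, iframe_start: int, iframe_count: int) -> bool:
    """True when the rolling character can be hit on this 0-based roll frame."""
    return not (iframe_start <= roll_frame < iframe_start + iframe_count)


# Hypothetical roll: 30 frames long, invincible for 12 frames starting on frame 3.
ROLL_LENGTH, IFRAME_START, IFRAME_COUNT = 30, 3, 12

# "." = hittable, "I" = invincible
timeline = "".join(
    "." if hurtbox_enabled(f, IFRAME_START, IFRAME_COUNT) else "I"
    for f in range(ROLL_LENGTH)
)
print(timeline)
```

Printing the timeline makes the vulnerable head and tail of the roll visible at a glance, which helps when you are deciding whether an enemy's active frames should outlast the window.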

💡 Intuition: Frame counts are your design dial. When a player says "this attack feels weak," they usually mean the startup is too long (so commitment is frustrating) or the recovery is too long (so whiffing feels terrible) or the active frames don't align with the visual impact moment. The fix is almost never "more damage." The fix is almost always timing.

Cancels

A cancel is when one action interrupts and replaces another mid-animation. A light-attack-cancel-to-dodge means the player can exit a light attack's recovery into a dodge, skipping the rest of the recovery. A jump-cancel means they can jump out of an attack.

Cancels are a huge part of combat depth. The Devil May Cry combo system is, under the hood, a vast table of "from state X, which inputs can cancel into which state Y, and with what priority?" A game with generous cancels feels responsive and combo-driven. A game with stingy cancels feels weighty and committed. Neither is wrong; they target different player fantasies.
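
A cancel table of this kind can be as plain as a dictionary lookup. The sketch below is a simplified Python illustration with invented state names and rules. It leaves out priorities and per-frame cancel windows, which a real table would need.

```python
# (current_state, phase) -> set of states the player may cancel into.
# All entries are hypothetical tuning choices.
CANCEL_TABLE = {
    ("light_attack", "recovery"): {"dodge", "light_attack"},
    ("heavy_attack", "recovery"): set(),           # fully committed
    ("dodge", "recovery"): {"light_attack"},
}


def can_cancel(state: str, phase: str, requested: str) -> bool:
    """Unlisted (state, phase) pairs allow no cancels at all."""
    return requested in CANCEL_TABLE.get((state, phase), set())


print(can_cancel("light_attack", "recovery", "dodge"))   # True
print(can_cancel("heavy_attack", "recovery", "dodge"))   # False
print(can_cancel("light_attack", "startup", "dodge"))    # False
```

Defaulting to "no cancel" means generosity is something you add entry by entry, so the table itself documents how committed your combat is.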

Dark Souls has almost no cancels on attack recovery. Once you swing, you finish the animation. This is intentional: it forces the player to commit to attacks, which in turn means the timing of each attack matters enormously. A single-cancel rule — you can cancel the recovery of a light attack into a roll, but only after a specific frame — would make the game significantly easier and significantly less tense. When FromSoftware tuned Elden Ring, they added some cancels (specifically for jump attacks) but kept most recovery uncancelable, preserving the series' weighty commitment.

Game Feel in Combat

Game feel is the subject of Chapter 8, so this section assumes that foundation and specializes it for combat. Combat is where game feel matters more than anywhere else in your game, because combat is where the player's hands are most engaged and where "the loop feels off" will trip alarms in their nervous system first.

The combat-feel toolkit:

Hit pause (hit stop). When an attack lands, both the attacker and the target freeze for a handful of frames — often 4 to 10 — before continuing. The world holds its breath at the moment of impact. This is the single most important feel technique in combat design. Without it, hits look like the attacker is swinging through air. With it, hits feel like they connect. Super Smash Bros. helped popularize modern hit pause conventions; Devil May Cry would be nearly unplayable without it. If you only implement one game-feel technique in your combat, implement hit pause.
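
Mechanically, hit pause can be a countdown that stops the combatants from advancing while it runs. Here is a minimal Python sketch under that assumption. `HitStop` is a hypothetical class, and a real game would freeze only the attacker and target rather than one global counter.

```python
class HitStop:
    """Counts down freeze frames; while frozen, combatants skip their updates."""

    def __init__(self) -> None:
        self.frames_left = 0

    def trigger(self, frames: int) -> None:
        # Keep the longer freeze if two hits land close together.
        self.frames_left = max(self.frames_left, frames)

    def tick(self) -> bool:
        """Call once per frame. Returns True if gameplay should advance."""
        if self.frames_left > 0:
            self.frames_left -= 1
            return False
        return True


hit_stop = HitStop()
animation_frame = 0
for frame in range(12):
    if frame == 3:
        hit_stop.trigger(6)  # hypothetical: hit lands on frame 3, 6-frame pause
    if hit_stop.tick():
        animation_frame += 1

print(animation_frame)  # 6: twelve real frames, six of them frozen
```

Because the pause length is a single integer, it is easy to give heavy attacks a longer freeze than light ones and tune each by feel.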

Screen shake. A tiny camera perturbation on impact. Short — 4 to 8 frames — low amplitude, decaying quickly. Used sparingly on heavy hits. Overused, it becomes nausea.
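
"Low amplitude, decaying quickly" can be expressed as a random offset whose maximum shrinks every frame. The Python sketch below assumes a linear decay and pixel units. Both are illustrative choices, and `shake_offset` is a hypothetical helper.

```python
import random


def shake_offset(frame: int, duration: int, amplitude: float,
                 rng: random.Random) -> tuple[float, float]:
    """Camera offset in pixels for a 0-based frame since impact."""
    if frame < 0 or frame >= duration:
        return (0.0, 0.0)
    strength = amplitude * (1.0 - frame / duration)  # linear decay toward zero
    return (rng.uniform(-strength, strength), rng.uniform(-strength, strength))


rng = random.Random(26)  # seeded so a playtest capture is repeatable
for frame in range(8):
    dx, dy = shake_offset(frame, duration=6, amplitude=4.0, rng=rng)
    print(frame, round(dx, 2), round(dy, 2))
```

With a 6-frame duration the last two printed frames are exactly zero, so the camera is guaranteed to settle back on its rest position.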

Particle bursts. Sparks, dust, blood, magical energy erupt from the point of impact. Usually a short burst (0.2-0.4 seconds of particles) at the hitbox/hurtbox contact point. Sometimes a heavier secondary burst on "perfect" hits (critical hits, parries, finishers).

Damage numbers. Optional; genre-dependent. Borderlands and MMOs splash damage numbers everywhere. Dark Souls does not. Neither choice is wrong. Adding numbers pushes the game toward "build optimization" and "DPS thinking"; removing them pushes toward "did that hit feel bigger than this one?"

Audio punch. The sound design of the hit. A weak hit is a thin crack. A heavy hit is a low-frequency thud layered with a higher-frequency impact. Good combat audio often uses two or three simultaneous sounds per hit — the weapon's swing-sound, the impact itself, and a short "pain cry" from the target — layered so no single sample carries the weight. Listen to Monster Hunter melee impacts; they are the textbook.

Impact animation. The target reacts visibly — a recoil, a stagger, a blood burst, a pose break. Without this, hits look like they're landing on a statue. Many games use a "hit react" animation that is mandatory when health drops from certain attacks, creating the feel that your hit "mattered" to the victim.

🎮 Case Study: Doom Eternal's glory kill. id Software's combat designers on Doom Eternal created a feedback loop I find genuinely ingenious. When you damage a demon enough, it "staggers" — it glows and slows. At that moment, you can execute a "glory kill," a short scripted finishing animation that drops ammo and health. The loop is: you shoot, you stagger, you glory-kill, you get resources, you shoot more. Every step of the loop has lavish feel — the stagger flashes, the glory-kill locks the camera, the finisher crunches, the loot bursts out, the ambient music spikes. It is combat as psychostimulant. The critical design insight is that the loop is short (3-6 seconds per iteration) and always ends with a resource reward that enables the next iteration. The hit-pause on a glory-kill's landing blow is longer than on a normal hit — 200ms or so — because id understands that the player deserves a beat of punctuation on each kill.

Every technique above can be tuned. Start with exaggerated values — more shake, more pause, more particles — and dial down until the effect is subtle but present. "A little more than you think" is usually wrong for combat feel; "a little less than feels obvious" is usually right. The magic happens just below conscious notice.

Reading the Enemy

Combat is a conversation. The enemy makes an offer (an attack). The player has a small window to read the offer and respond. The response can be dodge, block, parry, counter, retreat, or commit-and-tank. If the player reads correctly, they survive; if they misread, they pay. The quality of this conversation depends on how legible the enemy's attacks are.

Legibility is the designer's gift to the player. An attack is legible if, in the moment before it becomes dangerous, the enemy broadcasts its intent through:

  • An animation tell — a wind-up pose held for some number of frames before the attack commits.
  • A visual effect — a glow, a particle pulse, a color shift signaling "attack incoming."
  • An audio cue — a grunt, a weapon whistle, a chord of music.

The "I died but it was fair" feeling in Dark Souls comes from legibility. Every enemy attack has a readable tell. The tell is often generous — a Black Knight's overhead smash has a noticeable shoulder raise for 20+ frames before it commits. A Havel the Rock heavy swing has enough startup that you can see the attack coming from across the arena. The first time you die to a Black Knight, it feels like the game killed you. The tenth time, you are dying because you misread a tell you can see perfectly well. This shift — from "the game killed me" to "I died to something I could have prevented" — is the design loop Dark Souls is famous for, and it is built entirely out of readable telegraphs.

A tell-to-active ratio of 2:1 or greater is a good starting point. If your attack has 10 active frames, give it 20+ frames of tell. Faster than that and the player cannot react. (Human reaction time on an unexpected visual cue is ~250ms, or 15 frames at 60 FPS. On an expected cue — the player knows an attack is coming — it drops to ~180ms, or 11 frames. An attack with 10 frames of tell is genuinely unreactable by a first-time player.)
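
The arithmetic in that parenthetical is worth having as a helper you can run against every attack in your spreadsheet. A minimal Python sketch follows; `ms_to_frames` and `is_reactable` are hypothetical names, and the check deliberately ignores the extra frames the player's own dodge or block needs to come out.

```python
import math

FPS = 60


def ms_to_frames(ms: float, fps: int = FPS) -> int:
    """Round up: a reaction that completes mid-frame only registers on the next frame."""
    return math.ceil(ms * fps / 1000.0)


def is_reactable(tell_frames: int, reaction_ms: float, fps: int = FPS) -> bool:
    """True if the tell lasts at least as long as the assumed reaction time."""
    return tell_frames >= ms_to_frames(reaction_ms, fps)


print(ms_to_frames(250), ms_to_frames(180))   # 15 11
print(is_reactable(10, reaction_ms=250))      # False
print(is_reactable(20, reaction_ms=250))      # True
```

Treat a `True` here as a lower bound, not a guarantee: the player still has to recognize which attack it is and get their response out in time.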

There are exceptions. Low-damage mook attacks can have minimal tells because they are filler — the player doesn't need to dodge every one, and sometimes they are meant to be tanked. Big boss attacks must always have big tells, because failing to dodge them means death, and death without a readable cause is the cardinal sin of combat design.

⚠️ Pitfall: The "gotcha" attack. The attack with no tell. The enemy swings the instant the player enters a range, or transitions mid-animation into a second attack the player could not anticipate, or teleports. These feel like the game is cheating. They generate forum outrage. They also sometimes appear in a boss's second or third phase as a deliberate escalation — in which case the player is expected to have learned the boss enough to pattern-recognize rather than cleanly react. Know which you're designing.

Encounter Design

An encounter is a single fight scenario: the enemies present, their placements, the arena terrain, the triggers. Encounters are the molecules of combat; a game's combat experience is the sum of its encounters, and boring encounters kill good combat systems just as surely as broken systems kill good encounters.

Solo vs. Group Encounters

A single enemy is a duel. The combat system is stress-tested against one opponent at a time. Dark Souls boss fights, Sekiro, most fighting games, any chapter in a character-action game's arena: all duels. Duels reward reading, rhythm, and execution.

Multiple enemies is a crowd-control problem. The player now has to manage attention across threats. A one-second commitment to attack enemy A leaves you vulnerable to enemy B. This is why so many combat systems have an area attack, a crowd-clear special, or a weapon type optimized for sweeping motion. Without a crowd tool, fighting three enemies at once turns into nothing but dodge-dodge-dodge until you can isolate one — frustrating for the player, and for the design team that promised "epic battles" in the pitch deck.

Mixed enemy types is the advanced move. A melee enemy up close, a ranged enemy at mid-distance, a flying enemy overhead. The player must now prioritize. Each of these enemies, alone, is trivial; together, they pose a layered threat model that taxes every verb in the combat vocabulary. Halo's "30 seconds of fun" loop, which Bungie's designers talked about publicly for years, is built from exactly this: an encounter contains Grunts (weak, numerous), Jackals (shielded, ranged), Elites (strong, melee or ranged). Each enemy type is solved with a different tool. The player's delight comes from the fact that they are always switching tools and always moving. The encounter is 30 seconds of decisions, not 30 seconds of the same decision repeated.

The Rock-Paper-Scissors Trap

A common instinct is to design combat around explicit type matchups: fire beats ice, ice beats water, water beats fire. Each enemy has a weakness. The player's job is to identify the weakness and apply the counter.

This can work. It is the spine of Pokémon, most JRPGs, and many monster-hunting games. But it can also fail badly. When the matchup is too on-the-nose — "this enemy is immune to your current weapon, please go back to the menu, change your weapon, and return" — the combat becomes a sorting problem rather than a combat experience. The player stops fighting and starts menu-swapping.

The fix is usually one of three:

  1. Make the matchup a multiplier, not an absolute. All weapons work against all enemies, but some work better. The player is never hard-stopped, just incentivized to switch.
  2. Make switching itself a fun verb. If your weapon-switch is fast and feels good — Devil May Cry, Bayonetta — then encouraging the player to switch mid-combat is combat, not a menu dance.
  3. Limit the matchup to specific high-salience enemies. Bosses have matchups; trash mobs don't. The player isn't punished for not matching against a fodder enemy, but is rewarded for matching against a boss.
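
The first fix, matchup as multiplier, might look like the following Python sketch. The element chart, the 1.5 and 0.5 values, and the `damage` function are all invented for illustration. The one structural point is the floor of 1, which keeps any weapon from being hard-stopped.

```python
# Hypothetical element chart; any pair not listed is neutral (1.0).
MATCHUP = {
    ("fire", "ice"): 1.5,
    ("ice", "water"): 1.5,
    ("water", "fire"): 1.5,
    ("ice", "fire"): 0.5,
    ("water", "ice"): 0.5,
    ("fire", "water"): 0.5,
}


def damage(base: int, attack_element: str, target_element: str) -> int:
    """Scale base damage by the matchup, but never below 1."""
    multiplier = MATCHUP.get((attack_element, target_element), 1.0)
    return max(1, round(base * multiplier))


print(damage(20, "fire", "ice"))     # 30
print(damage(20, "ice", "fire"))     # 10
print(damage(20, "fire", "fire"))    # 20
```

The wrong weapon here is slower, not useless, so the player can choose between switching and pushing through.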

Terrain Interaction

A flat featureless arena is the weakest combat terrain. Combat is enriched when the arena itself has affordances: elevation changes, destructible cover, chokepoints, environmental hazards, pits.

Breath of the Wild is, I think, the high-water mark of combat terrain. Enemies can be pushed off cliffs, set on fire if in grass, electrocuted if in water, frozen if rained on, shattered if frozen-then-hit. The arena is the game. The player's weapon matters less than the environmental system the player can weaponize.

Dark Souls uses terrain more subtly. Encounters are designed around specific ambush points, narrow bridges, elevated archers. You are almost never fighting on a flat featureless plane. The terrain is part of the puzzle.

Do not build flat featureless arenas. Even a small rise, a single pillar, an edge you can fall off — all of these turn a one-dimensional fight into a two-dimensional one.

Boss Fights

A boss fight is a combat encounter designed as a special event: a single large enemy with unique mechanics, typically multi-phase, typically placed at the end of a level or narrative arc. Boss fights are the showcase fights of your game. They are where your design team shows off, and they are also where bad combat shows its ugliest. A broken mook is an annoyance; a broken boss is a refund.

Anatomy of a Boss Fight

A well-designed boss fight usually contains the following beats:

  1. Intro cinematic. A short (5-20 seconds) cutscene or in-engine sequence that introduces the boss, frames the stakes, and establishes visual identity. Should be skippable on retries after a death.
  2. Phase 1. The boss at "full" health with a first movelist. The player is learning the boss's patterns.
  3. Phase transition. A trigger (usually a health threshold) causes the boss to change. New attacks, new vulnerabilities, new visuals, new music cue.
  4. Phase 2. The boss at "second-stage" health with added or replaced moves. The player must integrate the new vocabulary without dropping what they learned in phase 1.
  5. (Optional) Phase 3. A final stand. Often the most aggressive, with the least margin for error.
  6. Kill moment. The final hit, often punctuated with a lingering camera, slow-motion, or a short kill-cinematic.
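
The phase-transition beat, triggered by a health threshold, can be sketched as a small controller. This is an engine-agnostic Python illustration, not the Godot implementation the chapter's objectives refer to. `BossPhaseController` and the threshold values are assumptions.

```python
class BossPhaseController:
    """Advances the phase when health drops to or below each threshold."""

    def __init__(self, max_health: int, thresholds: list[float]) -> None:
        # thresholds are health fractions, highest first, e.g. [0.6, 0.25]
        self.max_health = max_health
        self.health = max_health
        self.thresholds = thresholds
        self.phase = 1

    def take_damage(self, amount: int) -> list[int]:
        """Apply damage; return the phases entered by this hit, in order."""
        self.health = max(0, self.health - amount)
        entered = []
        while (self.phase <= len(self.thresholds)
               and self.health <= self.max_health * self.thresholds[self.phase - 1]):
            self.phase += 1
            entered.append(self.phase)
        return entered


boss = BossPhaseController(max_health=1000, thresholds=[0.6, 0.25])
print(boss.take_damage(300))  # []   (700 HP, still phase 1)
print(boss.take_damage(150))  # [2]  (550 HP crosses the 60% line)
print(boss.take_damage(400))  # [3]  (150 HP crosses the 25% line)
```

Returning the list of phases entered lets the caller play each transition cue in order, even when one huge hit skips past two thresholds at once.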

Not every fight needs every beat, but the structural skeleton above is the baseline. Let's walk through a few famous examples.

Hollow Knight's Nightmare King Grimm

Nightmare King Grimm is the optional super-boss version of Troupe Master Grimm, the boss of the Grimm Troupe content pack. Team Cherry uses a three-phase escalation:

  • Phase 1 is Troupe Master Grimm's moveset accelerated and tightened. The player, having fought Troupe Master Grimm, already knows these patterns; they simply have less time to react.
  • Phase 2 introduces a new attack — a flame barrage that floods a larger portion of the screen — requiring the player to use the dash-up i-frames in a way they didn't need before.
  • Phase 3 is pure pattern recognition: the boss chains phase 1 and phase 2 moves with zero downtime.

Each phase's health bar is shorter than the last, so the rhythm is a short phase 1 (learning the baseline), a mid-length phase 2 (integrating the new tool), and a fast phase 3 (execution test on everything together). No new attack is introduced in phase 3 — the new attack has already been taught. This is crucial: phase 3 of a boss should almost never introduce a mechanic the player hasn't seen. The player's job in phase 3 is to execute, not to learn.

Sekiro's Owl (Father)

Owl Father is one of the hardest fights in Sekiro, a game that is already brutal. Its design is a master class in constraint.

Phase 1 opens slow: Owl observes the player, spaces deliberately, throws small feints before committing to a real attack. The player has time to read and deflect.

Phase 2 drops this pretense. Owl becomes relentless — his attacks chain into each other with minimal downtime. But crucially, every one of those attacks was shown, individually, in phase 1. Phase 2 is not new vocabulary; it is tightened pacing on old vocabulary.

What makes the fight a masterpiece is that the boss's behavior teaches the player the correct response to itself. Against Owl Father, the player who tries to outspace or outrun the boss dies. The player who commits to deflection — the game's central verb — survives. The boss is a deflection exam. FromSoftware designed him to make the mechanic they wanted players to master the only mechanic that reliably works. The boss teaches the game.

Cuphead

Cuphead is the opposite philosophy. Cuphead's bosses are pattern memorization puzzles. Each boss has 3-4 phases, but the phases are telegraphed on a hard schedule: hit the boss enough times, they transform, the new moveset begins. Every attack is learnable, deterministic, and survivable if you know what's coming. The fight is won in the living room during your fifth attempt, not on the execution of attempt one.

The design takeaway: the Cuphead approach is valid and beloved, but it requires your animations and visual direction to be so good that repeated deaths are still fun. If the player is going to die twelve times to the same boss, every one of those deaths has to be a pleasure to look at. Cuphead's 1930s animation pipeline exists to serve this design goal.

Boss Arena Design

The arena is half the boss. A bad arena kills a good boss.

Arena must-haves:

  • Enough space to maneuver. The player needs room to dodge without clipping geometry. If your boss has a 10-meter attack range and your arena is 12 meters wide, you have built a coffin.
  • No invisible walls near the fight center. Players will learn to assume the arena is bounded. They should never trip on a collider they couldn't see.
  • Visual identity. The arena should look like it belongs to this boss. Majula's exterior arenas, each themed to its guardian, are the textbook. The Champion of Ash's arena in Dark Souls III is just rubble — appropriate for the fight, and it frames the boss visually with clean backlight.
  • No punishing geometry. Pillars that block line of sight to the boss; bumps that cause the player to stumble mid-dodge; sloped ground that changes jump heights. All bad. All common mistakes in AA boss arenas.

Boss Tells and Rhythm

A boss that attacks without rhythm is exhausting; a boss with too-predictable rhythm is boring. The sweet spot is a musical rhythm — variation around a theme.

The best bosses in modern design work like jazz solos. There's a common pulse (the time between tell and commit, say 20 frames), but the boss improvises around it — mixing short and long attacks, sometimes pausing for a second to bait the player into attacking, sometimes chaining two attacks with minimal gap. The player learns the pulse; the variation keeps them awake.
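
The pulse-plus-variation idea can be sketched as a small scheduler. This is only an illustration under assumptions: BossRhythm, its field names, and every tuning value are hypothetical, and a shipping boss would more likely drive these gaps from authored patterns than from dice rolls. The tell-to-commit pulse stays fixed; only the rest between attacks varies.

```gdscript
# BossRhythm.gd
# Hypothetical helper: class name, fields, and tuning values are assumptions.
class_name BossRhythm
extends RefCounted

var pulse_frames: int = 20      # tell-to-commit time; constant, this is the pulse
var base_gap_frames: int = 30   # usual rest between attacks
var jitter_frames: int = 8      # variation around the usual rest
var chain_gap_frames: int = 4   # "minimal gap" when chaining two attacks
var bait_frames: int = 60       # a one-second pause at 60 FPS
var chain_chance: float = 0.2
var bait_chance: float = 0.15
var rng := RandomNumberGenerator.new()

# Returns how many frames the boss rests before starting its next tell.
func next_gap_frames() -> int:
    var roll := rng.randf()
    if roll < bait_chance:
        return bait_frames        # long pause that invites the player to swing
    if roll < bait_chance + chain_chance:
        return chain_gap_frames   # chain straight into the next attack
    return base_gap_frames + rng.randi_range(-jitter_frames, jitter_frames)
```

Setting rng.seed to a fixed value makes a sequence reproducible, which helps when comparing two tunings of the same fight.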

A test I use: record 60 seconds of your boss fight. Close your eyes and listen to the audio only. Can you hear a rhythm? A good boss fight has a rhythm the way good music has a rhythm — audible as a pattern even without the visuals.

Difficulty in Combat

Combat difficulty is the most abused design dial in the industry. The lazy version: if the game is too easy, make enemies deal more damage or have more HP. If it's too hard, do the opposite. This is pure number tuning, and it is almost always the wrong tool.

The smell test for a difficulty dial: Does the harder difficulty change what the player has to do, or does it just change how long they have to do it? If HP doubles, the player simply swings longer. The strategy, the execution, the reads — unchanged. The fight is not harder; it is longer. This is a difficulty illusion.
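
The arithmetic behind the illusion is short enough to write down. A minimal sketch, assuming a steady damage rate; DifficultyMath and the example numbers are hypothetical.

```gdscript
# DifficultyMath.gd
# Hypothetical helper: the class name and example values are assumptions.
class_name DifficultyMath
extends RefCounted

# Seconds needed to deplete `hp` at a steady `dps`.
static func time_to_kill(hp: float, dps: float) -> float:
    return hp / dps
```

With a player dealing 50 damage per second, a 1000 HP enemy takes time_to_kill(1000.0, 50.0) seconds and a 2000 HP enemy takes exactly twice that. Nothing else in the function's inputs changed, which is the point: the player's decisions are the same, only the duration grew.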

The Bloodborne BL4 community (players who complete the game at Blood Level 4, with the starting stats and no level-ups) is evidence. They beat the game's hardest content using only skill. They prove that most of Bloodborne's difficulty is not in the numbers; it is in the mechanics. When you remove the number advantage, the game is still beatable, because the mechanics remain beatable. This is a good sign for combat design. It means the designers tuned the mechanics, not the spreadsheet.

Mechanics-changing difficulty looks like this:

  • Harder difficulty adds new enemy attacks that were absent on normal.
  • Harder difficulty gives enemies new AI behaviors — they stop attacking one at a time and start coordinating.
  • Harder difficulty removes affordances — fewer healing items, fewer save points, shorter parry windows.
  • Harder difficulty introduces new mechanics — a stamina system that wasn't present on easy, a break mechanic the player must manage.
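
One way to keep these changes honest is to describe each difficulty as a set of mechanical switches rather than multipliers. The sketch below is an assumption-heavy illustration: DifficultyProfile, its keys, and its values are all hypothetical and not part of the chapter's project files.

```gdscript
# DifficultyProfile.gd
# Hypothetical data layout: every key and value here is an assumption.
class_name DifficultyProfile
extends RefCounted

const PROFILES := {
    "normal": {
        "parry_window_frames": 10,
        "healing_items": 5,
        "enemies_coordinate": false,
        "extra_attacks": [],
    },
    "hard": {
        "parry_window_frames": 6,      # removed affordance: tighter parry
        "healing_items": 3,            # removed affordance: fewer heals
        "enemies_coordinate": true,    # new AI behavior
        "extra_attacks": ["aoe_slam"], # new attack absent on normal
    },
}

# Returns the attack names an enemy may use at the given difficulty.
# Unknown difficulty names fall back to "normal".
static func attacks_for(base_attacks: Array, difficulty: String) -> Array:
    var profile: Dictionary = PROFILES.get(difficulty, PROFILES["normal"])
    var extra: Array = profile["extra_attacks"]
    return base_attacks + extra
```

Note that no entry touches HP or damage. If a profile ever needs those fields, that is a hint the difficulty is sliding back toward number tuning.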

Celeste's Assist Mode (Chapter 11) is the opposite approach: let the player dial back mechanics until the game is completable. This is not a difficulty setting; it is accessibility. It removes the mechanical obstacles while preserving the game's shape. Both approaches are valid; both are more honest than pure HP-tuning.

🛠️ Practitioner Tip: If your game has difficulty settings, playtest each one separately. Do not assume that "normal" playtest coverage applies to "hard." Most bugs that make a fight unfair appear only at the highest difficulty, where margins are thinnest. Also: if your designers cannot beat Hard mode without resorting to cheese, your players won't either. Your designers know the game better than anyone who will ever buy it; treat what they can clear as the ceiling for player skill, not the floor.

Combat Camera

The camera is combat's most-ignored system — until it breaks, at which point it is the only thing the player complains about.

Fixed Camera

The camera stays in a scripted position or follows simple rules. Fighting games use a fixed side-on camera. Classic Resident Evil uses pre-placed camera angles per room. Fixed cameras are predictable and easy to learn but restrict what the combat can show.

Lock-On

The camera locks onto a selected enemy, keeping both the player and the enemy in frame. Ocarina of Time's Z-targeting established the pattern; Dark Souls inherited it; most third-person action games since have shipped some version.

Lock-on advantages: readability. The player knows which enemy they are engaging. The camera angles the player's movement vector so "left" and "right" mean "strafe around this enemy," which is what you usually want.

Lock-on disadvantages: the camera occasionally ends up in terrible places. If the enemy is much taller than the player, the camera tilts up and the ground disappears. If an enemy moves behind a wall, the camera lingers awkwardly on the wall. If you lock onto a small enemy close to you, the camera zooms in and loses peripheral awareness. The "Dark Souls camera" has been a subject of player complaint for fifteen years despite FromSoft iterating on it; some problems are inherent to the technique.
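
The "left and right mean strafe" behavior comes from re-basing the stick input on the line between player and target. A minimal 2D sketch, assuming Godot's screen coordinates (y points down) and a stick whose up direction is negative y; LockOnMath is a hypothetical helper.

```gdscript
# LockOnMath.gd
# Hypothetical helper: the class and function names are assumptions.
class_name LockOnMath
extends RefCounted

# Converts stick input into a world-space move direction while locked on.
# input.x strafes around the target; pushing the stick up (negative input.y)
# closes distance, pulling it down backs away.
static func locked_move_dir(player_pos: Vector2, target_pos: Vector2, input: Vector2) -> Vector2:
    var to_target := target_pos - player_pos
    if to_target.is_zero_approx():
        return input   # standing on the target: fall back to raw input
    var forward := to_target.normalized()
    var right := Vector2(-forward.y, forward.x)   # forward rotated 90 degrees
    return (right * input.x + forward * -input.y).limit_length(1.0)
```

Because the basis is rebuilt every tick, holding the stick sideways traces a circle around the target instead of a straight line, which is the orbiting motion players expect from a lock-on.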

Free Camera

No lock. The camera follows the player, and the player aims it manually. This is the approach Devil May Cry and Bayonetta are known for: the camera is not tethered to a single target; it stays wide, framing multiple enemies, and the player positions and targets by hand.

Free camera supports combat with multiple enemies better. It also demands more from the player (simultaneous movement + camera input) and is harder to make feel good. When it works, it is the most expressive. When it fails, it is disorienting.

The Right Choice

If your combat is duel-focused, use lock-on. If your combat is crowd-focused, use free camera. If you are ambitious, support both and let the player toggle, and budget at least six months for camera work alone; it is that hard.

GDScript Implementation

We will now build the combat-system skeleton for the progressive project. The architecture is: an attack-state machine per combatant, a hitbox/hurtbox system using Godot 4 Area2D nodes, and a boss fight manager that drives phase transitions. Every component is intentionally minimal; we extend these in Chapter 27 (AI-driven enemy behavior) and Chapter 32 (balancing pass).

CombatSystem.gd — The Attack State Machine

# CombatSystem.gd
# Attached to any combatant node (player or enemy). Manages attack
# state transitions, hitbox activation, and cancel rules.
extends Node

enum State { IDLE, WINDUP, ACTIVE, RECOVERY, HITSTUN }

signal attack_started(attack_name: String)
signal attack_hit(target: Node)
signal state_changed(old_state: int, new_state: int)

@export var hitbox: Area2D              # assigned in scene
@export var hurtbox: Area2D              # assigned in scene
@export var attack_library: Dictionary = {}  # name -> AttackData

var state: int = State.IDLE
var current_attack: Dictionary = {}
var state_timer: float = 0.0

func _ready() -> void:
    hitbox.monitoring = false
    hitbox.area_entered.connect(_on_hitbox_overlapped)

func start_attack(attack_name: String) -> bool:
    if state != State.IDLE:
        return false
    if not attack_library.has(attack_name):
        push_error("Unknown attack: %s" % attack_name)
        return false
    current_attack = attack_library[attack_name]
    _change_state(State.WINDUP)
    state_timer = current_attack.startup_frames / 60.0
    emit_signal("attack_started", attack_name)
    return true

# Ticked on the fixed physics step so frame counts map to a stable tick rate.
func _physics_process(delta: float) -> void:
    if state == State.IDLE:
        return
    state_timer -= delta
    if state_timer > 0.0:
        return
    match state:
        State.WINDUP:
            _change_state(State.ACTIVE)
            state_timer = current_attack.active_frames / 60.0
            hitbox.monitoring = true
        State.ACTIVE:
            hitbox.monitoring = false
            _change_state(State.RECOVERY)
            state_timer = current_attack.recovery_frames / 60.0
        State.RECOVERY:
            _change_state(State.IDLE)
        State.HITSTUN:
            _change_state(State.IDLE)

func _change_state(new_state: int) -> void:
    var old := state
    state = new_state
    emit_signal("state_changed", old, new_state)

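func _on_hitbox_overlapped(area: Area2D) -> void:
    if state != State.ACTIVE:
        return
    # Respect i-frames: skip hurtboxes that are currently flagged invincible.
    if area.get("invincible") == true:
        return
    var target_combat := area.get_parent().get_node_or_null("CombatSystem")
    if target_combat == null or target_combat == self:
        return
    target_combat.receive_hit(current_attack)
    emit_signal("attack_hit", area.get_parent())

func receive_hit(attack: Dictionary) -> void:
    # Being hit interrupts our own attack, so its hitbox must be switched off.
    # Deferred because this runs inside a physics overlap callback.
    hitbox.set_deferred("monitoring", false)
    _change_state(State.HITSTUN)
    state_timer = attack.get("hitstun_frames", 12) / 60.0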

A few things to notice. The state machine has exactly five states; that is enough to model a complete attack cycle and still fit on a whiteboard. startup_frames, active_frames, and recovery_frames are stored per attack in the attack library, so different attacks can have different timing — this is where your frame-data tuning lives. The hitbox is enabled only during the ACTIVE state; this is what makes active-frame timing actually meaningful at runtime.
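
To make the attack library concrete, here is one way its entries might be built. This is a sketch under assumptions: AttackLibraryExample, the attack names, and the "heavy" frame counts are hypothetical. Only the three dictionary keys come from CombatSystem.gd.

```gdscript
# AttackLibraryExample.gd
# Hypothetical helper: the class name and attack names are assumptions. The
# dictionary keys are the ones CombatSystem.gd reads.
class_name AttackLibraryExample
extends RefCounted

static func make_attack(startup: int, active: int, recovery: int) -> Dictionary:
    return {
        "startup_frames": startup,
        "active_frames": active,
        "recovery_frames": recovery,
    }

# Total frames the attacker is committed, from attack start to regaining control.
static func total_frames(attack: Dictionary) -> int:
    var startup: int = attack["startup_frames"]
    var active: int = attack["active_frames"]
    var recovery: int = attack["recovery_frames"]
    return startup + active + recovery

static func player_library() -> Dictionary:
    return {
        "light": make_attack(6, 4, 12),
        "heavy": make_attack(18, 6, 24),
    }
```

Assigning the result of player_library() to a combatant's attack_library and then calling start_attack("light") runs the full windup, active, recovery cycle. total_frames is a quick way to compare how long each attack commits the attacker.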

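We convert frames to seconds by dividing by 60, which assumes the frame data was authored against a 60-FPS timeline. Because the timer counts down in seconds, the conversion holds at any render rate. If you author frame data against a different rate, or prefer to count physics ticks directly, read the tick rate from Engine.get_physics_ticks_per_second() rather than hard-coding 60.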
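
A minimal sketch of that normalization, assuming frame data authored at 60 FPS. FrameTime and its function names are hypothetical; rounding up in frames_to_ticks is a design assumption so that a window never shrinks when the tick rate is lower than the authoring rate.

```gdscript
# FrameTime.gd
# Hypothetical helper: names are assumptions, not part of the chapter's files.
class_name FrameTime
extends RefCounted

const DESIGN_FPS := 60.0   # the rate the frame data was authored against

# Converts authored frames into seconds.
static func frames_to_seconds(frames: int, fps: float = DESIGN_FPS) -> float:
    return frames / fps

# Converts authored frames into physics ticks at the project's tick rate.
# Rounds up so a window is never shorter than designed.
static func frames_to_ticks(frames: int, ticks_per_second: int) -> int:
    return ceili(frames * ticks_per_second / DESIGN_FPS)
```

At runtime the second argument would typically be Engine.get_physics_ticks_per_second(). A 6-frame startup becomes 3 ticks at 30 ticks per second, and a 5-frame window also becomes 3 ticks because 2.5 rounds up.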

Hitbox.gd and Hurtbox.gd — The Collision Layer

# Hitbox.gd: attached to an Area2D child of the attacker.
class_name Hitbox   # required so other scripts can refer to the Hitbox type
extends Area2D

@export var damage: int = 10
@export var knockback: Vector2 = Vector2(150, -50)
@export var hit_lag_frames: int = 6   # hit pause on contact

func _ready() -> void:
    collision_layer = 0
    collision_mask = 0b0010    # hurtboxes are on layer 2
    monitoring = false         # enabled by CombatSystem during ACTIVE
    area_entered.connect(_on_area_entered)

func _on_area_entered(area: Area2D) -> void:
    # Only the hitbox scans for overlaps, so it reports the hit to the hurtbox.
    if area.has_method("take_hit"):
        area.take_hit(self)

# Hurtbox.gd: attached to an Area2D child of the defender.
extends Area2D

signal hit_received(damage: int, knockback: Vector2, source: Node)

@export var invincible: bool = false

func _ready() -> void:
    collision_layer = 0b0010
    collision_mask = 0         # hurtboxes are detected; they do not scan

# Called by an overlapping Hitbox. With an empty mask the hurtbox never
# receives area_entered itself, so the hitbox reports the contact.
func take_hit(hb: Hitbox) -> void:
    if invincible:
        return
    emit_signal("hit_received", hb.damage, hb.knockback, hb.get_parent())

The split of hitbox-on-attacker and hurtbox-on-defender is the standard fighting-game architecture. Notice the collision layer configuration: hitboxes mask layer 2 (they look for hurtboxes), and hurtboxes sit on layer 2 (they are detected by hitboxes). Hitboxes do not detect other hitboxes, and hurtboxes do not detect other hurtboxes. This prevents attack-vs-attack collisions except when you explicitly want them.

The invincible flag on Hurtbox is what implements i-frames. During a dodge roll, your combat system sets hurtbox.invincible = true for the roll's invincibility window, then clears it when the window ends.
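
A minimal sketch of that i-frame window, assuming the roll is counted in physics ticks at 60 per second. DodgeRoll, its exported frame counts, and start_roll are hypothetical. The flag is written with set() so the script does not depend on the hurtbox's static type.

```gdscript
# DodgeRoll.gd
# Hypothetical component: names and frame counts are assumptions.
class_name DodgeRoll
extends Node

@export var hurtbox: Area2D            # the defender's Hurtbox node
@export var roll_frames: int = 30      # full roll length
@export var iframe_first: int = 8      # first invincible frame
@export var iframe_last: int = 19      # last invincible frame (12 frames total)

var _frame: int = -1   # -1 means "not rolling"

func start_roll() -> void:
    if _frame == -1:
        _frame = 0

func _physics_process(_delta: float) -> void:
    if _frame == -1:
        return
    hurtbox.set("invincible", is_invincible_frame(_frame, iframe_first, iframe_last))
    _frame += 1
    if _frame >= roll_frames:
        _frame = -1
        hurtbox.set("invincible", false)   # always clear the flag when the roll ends

# True while `frame` lies inside the inclusive i-frame window.
static func is_invincible_frame(frame: int, first: int, last: int) -> bool:
    return frame >= first and frame <= last
```

Placing the window in the middle of the roll leaves vulnerable frames at the start and end, so a dodge has to be timed rather than spammed.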

BossFight.gd — The Phase Controller

# BossFight.gd
# Manages a multi-phase boss encounter: intro, phase 1, transition, phase 2,
# death. Sits one level above CombatSystem and drives the boss's high-level
# state.
extends Node

signal phase_started(phase_index: int)
signal boss_defeated()

@export var boss: CharacterBody2D                     # the boss itself
@export var boss_combat: Node                         # its CombatSystem
@export var max_hp: int = 1000
@export var phase_thresholds: Array[float] = [0.5]   # one threshold = two phases
@export var intro_duration: float = 4.0
@export var transition_duration: float = 2.5

var current_hp: int
var current_phase: int = -1
var is_active: bool = false

func _ready() -> void:
    current_hp = max_hp
    boss_combat.connect("attack_hit", _on_boss_attack_hit)

func begin_fight() -> void:
    is_active = false
    _play_intro()

func _play_intro() -> void:
    # Lock player input, play camera shake/zoom, trigger music swell.
    # Replace with your actual intro logic. phase_started is emitted by
    # _enter_phase once the intro ends, so it fires exactly once per phase.
    await get_tree().create_timer(intro_duration).timeout
    _enter_phase(0)

func _enter_phase(index: int) -> void:
    current_phase = index
    is_active = true
    emit_signal("phase_started", index)
    # Switch the boss's attack library to the phase's moveset.
    boss_combat.attack_library = _movelist_for_phase(index)

func _movelist_for_phase(index: int) -> Dictionary:
    # A real implementation loads phase data from a resource.
    # The demo shows escalation via startup-frame tightening.
    match index:
        0:
            return {
                "sweep":       {"startup_frames": 24, "active_frames": 6, "recovery_frames": 18},
                "lunge":       {"startup_frames": 18, "active_frames": 4, "recovery_frames": 22},
            }
        1:
            return {
                "sweep":       {"startup_frames": 18, "active_frames": 6, "recovery_frames": 14},
                "lunge":       {"startup_frames": 14, "active_frames": 4, "recovery_frames": 18},
                "aoe_slam":    {"startup_frames": 30, "active_frames": 10, "recovery_frames": 40},
            }
        _:
            return {}

func apply_damage(amount: int) -> void:
    if not is_active:
        return
    current_hp = maxi(0, current_hp - amount)
    # Check for death first so a killing blow never starts a phase transition.
    if current_hp == 0:
        _defeat()
        return
    _check_phase_transition()

func _check_phase_transition() -> void:
    var hp_ratio := float(current_hp) / float(max_hp)
    var target_phase := 0
    for threshold in phase_thresholds:
        if hp_ratio <= threshold:
            target_phase += 1
    # Phases only advance; healing must not send the boss back a phase.
    if target_phase > current_phase:
        _play_transition(target_phase)

func _play_transition(to_phase: int) -> void:
    is_active = false
    # Trigger animation, camera push, music swell here.
    var timer := get_tree().create_timer(transition_duration)
    await timer.timeout
    _enter_phase(to_phase)

func _defeat() -> void:
    is_active = false
    emit_signal("boss_defeated")

func _on_boss_attack_hit(target: Node) -> void:
    # hook for UI feedback, camera shake, analytics
    pass

The design choice worth calling out: phase transitions are driven by HP thresholds, and the moveset is swapped wholesale on transition. This means designing a boss is, in large part, a process of writing _movelist_for_phase and tuning the frame data entries per phase. In the demo, we escalate by tightening startup frames, meaning phase 2 is literally the same attacks with less tell time. Adding a new attack in phase 2 (aoe_slam) completes the escalation.

Notice that we emit a phase_started signal. Your UI, music manager, and lighting director should all subscribe to that signal. The phase transition is a whole-stage moment — everything changes — not just a boss-stats change.

🛠️ Practitioner Tip: Do not hard-code phase logic inside the boss script. Drive phases through data: a resource or JSON file describing each phase's moveset, visual state, and music. This lets designers iterate on boss tuning without needing a programmer to touch code. In my experience, combat systems that decouple data from logic iterate roughly 3-5x faster than those that don't.
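
One possible shape for that data is a custom Resource per phase. This is a sketch, not the chapter's implementation: BossPhaseData and all of its fields are hypothetical, and the lookup assumes the phases are listed in fight order with descending thresholds.

```gdscript
# BossPhaseData.gd
# Hypothetical resource: the class name and every field are assumptions.
class_name BossPhaseData
extends Resource

@export var hp_threshold: float = 1.0   # phase begins at or below this HP ratio
@export var moveset: Dictionary = {}    # attack name -> frame data dictionary
@export var music_cue: String = ""
@export var tint: Color = Color.WHITE   # stand-in for the phase's visual state

# Returns the index of the deepest phase whose threshold has been reached.
# Assumes `phases` is ordered first phase to last, thresholds descending.
static func phase_for_ratio(phases: Array, hp_ratio: float) -> int:
    var index := 0
    for i in phases.size():
        if hp_ratio <= phases[i].hp_threshold:
            index = i
    return index
```

Saved as .tres files, these resources can be edited in the Godot inspector, so a designer can retune a threshold or swap a moveset without opening a script.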

Progressive Project Update

For Chapter 26, extend your 2D action-adventure prototype with a functional combat system:

  1. Player attack. Implement a single light attack with three frame phases (windup / active / recovery). Use the CombatSystem.gd script above, which enables the Hitbox node only during active frames. Tune the frame counts in playtests: aim for 6 startup / 4 active / 12 recovery as a starting point, and iterate.
  2. Hit feedback. Hook your JuiceEffects.gd from Chapter 8 into the attack_hit signal. On hit: hit-pause for 6 frames, screen shake (small), particle burst at contact, low-layer audio hit.
  3. Player dodge. A roll with 12-14 frames of i-frames in the middle of the animation. Set hurtbox.invincible for the i-frame window and clear it when the window ends. This is your "Dark Souls moment": tune until a just-barely-dodged attack feels perfect.
  4. One enemy type. A melee enemy with one attack, one tell, and a predictable attack interval. Tell should be 20+ frames.
  5. One boss with two phases. Use BossFight.gd above. Phase 1: two attacks, both generously telegraphed. Phase 2 (triggered at 50% HP): tighter tells, one new attack. Add a 3-second transition with a visual change.
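
For step 2, a hit pause can be sketched with Engine.time_scale. This is a stand-in under assumptions: the contents of JuiceEffects.gd are not shown here, so HitPause and its method names are hypothetical, and the near-freeze scale of 0.05 is an arbitrary starting value.

```gdscript
# HitPause.gd
# Hypothetical stand-in for a hit-pause call; names and values are assumptions.
class_name HitPause
extends Node

const DESIGN_FPS := 60.0
var _pausing := false

# Real-time length of a pause expressed in 60-FPS frames.
static func pause_seconds(frames: int) -> float:
    return frames / DESIGN_FPS

# Slows the whole game to a near-freeze for `frames`, then restores it.
func hit_pause(frames: int = 6) -> void:
    if _pausing:
        return   # do not stack pauses when several hits land together
    _pausing = true
    var previous_scale := Engine.time_scale
    Engine.time_scale = 0.05
    # The fourth argument (ignore_time_scale) keeps this timer in real time;
    # otherwise the slowed clock would stretch the pause itself.
    await get_tree().create_timer(pause_seconds(frames), true, false, true).timeout
    Engine.time_scale = previous_scale
    _pausing = false
```

Added as a child of the player and connected to attack_hit, a call to hit_pause(6) gives the 6-frame pause from step 2. Scaling global time also freezes the enemy, which is usually what sells the impact.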

By the end of Chapter 26, your prototype should have the skeleton of a real combat system. Do not yet worry about AI — enemies can attack on a timer for now. We give them brains in Chapter 27.
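
A timer-driven enemy can be as small as the sketch below. It assumes the enemy owns a CombatSystem whose attack_library contains an entry named "swipe". TimerEnemy, the attack name, and the interval are hypothetical.

```gdscript
# TimerEnemy.gd
# Hypothetical placeholder "AI": names and values are assumptions.
class_name TimerEnemy
extends Node

@export var combat: Node                   # this enemy's CombatSystem
@export var attack_name: String = "swipe"
@export var attack_interval: float = 2.0   # seconds between attack attempts

var _cooldown: float = 0.0

func _ready() -> void:
    _cooldown = attack_interval

func _physics_process(delta: float) -> void:
    _cooldown -= delta
    if _cooldown > 0.0:
        return
    # start_attack returns false while the enemy is mid-attack or in hitstun;
    # in that case we simply try again on the next tick.
    if combat.start_attack(attack_name):
        _cooldown = attack_interval

# Checks an attack entry against the "tell should be 20+ frames" guideline.
static func tell_is_fair(attack: Dictionary, min_tell_frames: int = 20) -> bool:
    var startup: int = attack["startup_frames"]
    return startup >= min_tell_frames
```

Because the startup frames double as the tell, tell_is_fair is a cheap check to run over an attack library before a playtest.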

Common Pitfalls

Pitfall 1: Input lag. The most common combat-feel killer. If the player presses attack and the attack starts 100ms later, the combat is dead. Most engines hold input until the next frame boundary, adding up to about 16ms at 60 FPS. If you then delay attack start by waiting for an animation blend, you add another 50ms. Total: 66ms, which feels laggy on a controller. Aim for input-to-attack of under 50ms, ideally under 33ms. Profile this. In Godot, Input.is_action_just_pressed() is itself a polled check, so where you call it matters: call it in _physics_process rather than _process so presses are sampled on a fixed tick, which keeps timing deterministic, or catch the event in _unhandled_input if you need the press as soon as it arrives.
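
"Profile this" can start with a probe that timestamps the press and the attack start. A minimal sketch, assuming an input action named "attack" exists in the project's Input Map and that the combatant emits attack_started as in CombatSystem.gd. InputLatencyProbe is hypothetical. It measures only the in-game portion of the delay, not controller or display latency.

```gdscript
# InputLatencyProbe.gd
# Hypothetical debugging aid: names and the "attack" action are assumptions.
class_name InputLatencyProbe
extends Node

@export var combat: Node                 # the player's CombatSystem
@export var action: StringName = &"attack"

var _press_usec: int = -1   # -1 means "no press waiting to be matched"

func _ready() -> void:
    combat.connect("attack_started", _on_attack_started)

func _unhandled_input(event: InputEvent) -> void:
    if event.is_action_pressed(action):
        _press_usec = Time.get_ticks_usec()

func _on_attack_started(_attack_name: String) -> void:
    if _press_usec < 0:
        return
    var ms := elapsed_ms(_press_usec, Time.get_ticks_usec())
    print("input-to-attack: %.1f ms" % ms)
    _press_usec = -1

# Converts a pair of microsecond timestamps into elapsed milliseconds.
static func elapsed_ms(start_usec: int, end_usec: int) -> float:
    return (end_usec - start_usec) / 1000.0
```

If the printed value regularly exceeds one frame, look for work queued between the press and start_attack, such as an animation blend that must finish first.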

Pitfall 2: No telegraphs, or telegraphs only on the boss. Every enemy attack needs a tell. Even trash mobs. Without a tell, the player learns that they cannot dodge reliably, and the game becomes a stat-check instead of a skill-check. Early in development, make your tells exaggerated — too long, too obvious. Cut them down in tuning. You will never cut them below "fair"; you will often find yourself wishing they were still longer.

Pitfall 3: "It gets harder" = more HP. The number-one symptom of unfinished difficulty design. If Hard mode differs from Normal only in enemy HP and damage, you have not designed a difficulty; you have designed a time-tax. Mechanics-changing difficulty requires more work. It is also the difference between a game that respects its hard-mode players and a game that punishes them.

Pitfall 4: A boss with no rhythm. Bosses that attack on a fixed pattern become rote; bosses that attack randomly feel unfair. The fix is rhythm: a consistent pulse with tasteful variation. When you playtest a boss, close your eyes for 60 seconds and listen to the audio. Can you hear a beat? If not, your boss is arrhythmic; tune the spacing between attacks.

Pitfall 5: The camera fights the player. A lock-on camera that constantly corrects for a larger enemy, a free camera that swings unpredictably through geometry, a fixed camera that occludes the action. Camera feels invisible when it works and maddening when it doesn't. Budget real time for camera tuning; do not let it be the last thing you polish.

Summary

Combat is the system that punishes shortcuts. The design-to-play ratio is brutal. Every frame matters. The tools — frame data, hitboxes, hurtboxes, telegraphs, hit pause — are borrowed from the fighting-game community, and they apply to any real-time combat system you will ever build.

Start with a short verb list and defend every entry. Tune attacks using frame data, not damage numbers. Build legibility: every attack announces itself before it hurts. Design encounters around terrain and enemy composition, not HP inflation. Give your boss fights a clear structure: intro, phase 1, transition, phase 2, kill. Budget six months for camera work if you support both lock-on and free camera.

The GDScript above is the skeleton. You will extend it twice: once in Chapter 27, when enemy AI gets brains that drive the combat system intelligently; once in Chapter 32, when you run the balancing pass and discover that half your tuning was wrong. Audio will be layered in by Chapter 30. All four chapters work together — combat is never finished in one pass, and these are the passes you will make.

The final test of a combat system is this: does the player, after ninety minutes of play, want to pick up the controller and fight again? If yes, the combat is working. If no, the combat is not working, and no amount of narrative or art will compensate. Combat lives or dies in the feel, and the feel is built frame by frame by people who take the craft seriously. Be that person.