Prerequisites
- Chapter 1
- Chapter 3
Learning Objectives
- Explain the OWASP Top 10 as a risk taxonomy and describe a program-level defense for each category.
- Apply input-validation and output-encoding patterns, and state why validation belongs on the server.
- Integrate SAST, DAST, and SCA into a secure software development lifecycle and choose the right tool for a question.
- Reason about secrets, dependency risk, and the software supply chain using Log4Shell as the worked example.
- Threat-model a single feature with STRIDE and turn the findings into testable security requirements.
In This Chapter
- Overview
- Learning Paths
- 12.1 Developers as defenders: the secure software development lifecycle
- 12.2 The OWASP Top 10: a guided tour of where applications fail
- 12.3 Input validation and output encoding: the two patterns that prevent the most bugs
- 12.4 Secrets, dependencies, and the software supply chain: the code you didn't write
- 12.5 SAST, DAST, and SCA in practice: automating the search for weakness
- 12.6 Threat modeling a feature: turning "what could go wrong?" into requirements
- Project Checkpoint
- Summary
- Spaced Review
- What's Next
Chapter 12: Application Security: OWASP Top 10, Secure Coding, and Why Developers Are the First Line of Defense
"Given enough eyeballs, all bugs are shallow." — Eric S. Raymond (Linus's Law) — a hope that application security spends most of its time disproving
Overview
On the night of December 9, 2021, a developer published a short proof-of-concept for a flaw in a logging library most engineers had never thought about. The library was Apache Log4j — a piece of plumbing buried so deep in the Java world that companies running it often did not know they were running it. The flaw, later catalogued as CVE-2021-44228 and nicknamed Log4Shell, let an attacker who could get a single crafted string into a log message run code of their choosing on the server. Not steal a file. Not crash a process. Run code. And getting a string into a log is trivial — a username, a search box, a User-Agent header, a chat message. Within hours, the entire internet was being scanned. Within days, every security team on earth was asking the same two questions, and discovering they could not answer either one: Do we use Log4j? Where?
Meridian Regional Bank's security team — Dana Okafor's group, whom you have followed since Chapter 1 — spent that week the way nearly every organization did: not patching a vulnerability they understood, but hunting for a vulnerability they could not see. Log4j was not something Meridian had chosen to install. It arrived as a dependency of a dependency of a vendor's loan-origination component, three layers down from anything a Meridian developer had ever typed. The vulnerable code was running in production in systems no one had associated with "Java logging." That is the part that should make a defender uncomfortable, and it is the central lesson of this chapter: most of the code running in your organization was not written by your organization, and you are responsible for all of it.
This is a chapter about that responsibility. Part II of this book defended the network and Chapter 11 hardened the operating systems beneath your software. But the application is where the network's careful segmentation and the host's careful hardening get spent, because the application, by design, accepts input from the outside world and acts on it. A firewall will happily pass a perfectly-formed HTTP request carrying a malicious payload; that is its job. The defense has to live in the code. And the code is written by developers, which is why this chapter's subtitle is not a slogan: developers are the first line of defense, the only people positioned to stop most application attacks before they exist. Our job, the security team's job, is not to write the application for them. It is to give them the patterns, the tooling, the requirements, and the feedback loops that make the secure way the easy way.
In this chapter, you will learn to:
- Read the OWASP Top 10 as what it actually is — a prioritized list of the categories of application risk that hurt organizations most — and describe a program-level defense for each.
- Apply the two patterns that prevent a startling fraction of all application vulnerabilities: input validation (decide what you accept) and output encoding (make data safe for wherever it is going).
- Reason about secrets, dependency risk, and the software supply chain, using Log4Shell as a worked case in why "your" code includes code you have never read.
- Place SAST, DAST, and SCA correctly in a secure software development lifecycle (SSDLC), and know which tool answers which question.
- Threat-model a single feature and convert the threats into security requirements a developer can build and a tester can check.
Learning Paths
Application security touches every role, but it lands differently depending on where you sit.
🏗️ Security Engineer: This is your chapter. Read all of it, but §§12.3–12.6 (secure coding, supply chain, SAST/DAST/SCA, threat modeling) are the core of an AppSec engineer's daily work. The Project Checkpoint's appsec.py is a tool you will extend in Chapter 13 and wire into a pipeline in Chapter 31.
📜 Certification Prep: The OWASP Top 10 (§12.2), the secure-coding patterns (§12.3), and the SAST/DAST/SCA distinction (§12.5) are heavily tested on Security+ and CISSP (Software Development Security domain). The key-takeaways.md maps each to its domain.
🛡️ SOC Analyst: You will not write the code, but you will detect attacks against it and triage its vulnerabilities. Focus on §12.2 (so you recognize the categories in alerts) and §12.4 (supply-chain risk, which becomes Log4Shell triage in Chapter 23). The web-specific detection work lands in Chapter 13.
📋 GRC: Your interest is the program: the SSDLC policy (§12.1, the Project Checkpoint), the security-requirements practice (§12.6), and the supply-chain governance that Chapter 29 builds on. Skim the code; read the process.
12.1 Developers as defenders: the secure software development lifecycle
Let us begin where the harm begins. A vulnerability in a network device is something a vendor shipped and you deployed; a vulnerability in your application is, far more often, something you built — a line of code that trusted input it should have doubted, a query assembled by gluing strings together, a permission check that was never written. The uncomfortable, empowering truth is that the overwhelming majority of application vulnerabilities are introduced by ordinary developers doing ordinary work, under deadline, without malice and usually without ever knowing they did it. That is not a moral failing. It is a systems problem, and systems problems have systems solutions.
The systems solution is the secure software development lifecycle (SSDLC): the practice of building security activities into every phase of how software is designed, written, tested, deployed, and maintained — rather than bolting a security review onto the end. The phrase to retire is "we'll have security look at it before launch." By launch, the architecture is set, the code is written, the deadline is tomorrow, and "security" can do little but write a report nobody has time to act on. The phrase to adopt is shift left: move security activity earlier (leftward on a timeline that runs design → code → test → deploy), where defects are cheaper to fix and where a single good decision can prevent an entire class of bug.
Why does earlier mean cheaper? Because the cost of fixing a defect rises steeply with how far it travels. A flawed design caught at the whiteboard costs a conversation. The same flaw caught in code review costs a rewrite of one function. Caught in testing, it costs a rewrite plus re-testing. Caught in production — by an attacker — it costs an incident, a disclosure, a regulator's attention, and the trust of every customer whose data went out the door. The numbers vary by study and you should be skeptical of any precise multiplier, but the direction is not in dispute and matches every practitioner's experience: the later you find it, the more it costs, and the curve is steep.
Here is the SSDLC as a pipeline with security gates — the mental model for the whole chapter. Each phase has its own security activity, and each activity is a gate the work passes through (Figure 12.1).
┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│  DESIGN  │──▶│   CODE   │──▶│  BUILD   │──▶│   TEST   │──▶│  DEPLOY  │──▶ operate
└────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘
     │              │              │              │              │
┌────▼─────┐   ┌────▼─────┐   ┌────▼─────┐   ┌────▼─────┐   ┌────▼─────┐
│ THREAT   │   │ SECURE   │   │ SCA      │   │ DAST     │   │ secrets  │
│ MODEL    │   │ CODING + │   │ (deps) + │   │ (running │   │ scan +   │
│ (§12.6)  │   │ SAST     │   │ secrets  │   │ app)     │   │ pen test │
│ security │   │ (§12.5)  │   │ scan     │   │ (§12.5)  │   │ + monitor│
│ reqs     │   │          │   │ (§12.4-5)│   │          │   │ (Ch.13)  │
└──────────┘   └──────────┘   └──────────┘   └──────────┘   └──────────┘
     ▲                                                           │
     └──────────── findings feed back into the next DESIGN ◀─────┘
     "shift left": the earlier the gate, the cheaper the fix.
Figure 12.1 — The SSDLC with security gates. Security is not a phase at the end; it is an activity inside every phase, and each phase's findings feed the next cycle's design. This chapter populates each gate; Chapter 31 automates them in a CI/CD pipeline.
🚪 Threshold Concept: Security is a property of the process that produces software, not an inspection you perform on finished software. A team that threat-models its designs, codes against known patterns, and scans automatically at every stage will ship more secure software with less friction than a team that builds freely and then submits to a dreaded pre-launch audit — because the first team never builds most of the vulnerabilities in the first place. Internalize this and "where does security fit?" answers itself: everywhere, a little, continuously.
This reframes the developer's role. A developer who validates input, parameterizes a query, and encodes output is not "doing security's job" — they are doing their own job correctly, the way a structural engineer who sizes a beam to code is not doing the inspector's job. The security team's job is to make that correctness the path of least resistance: provide the validated-input helper, the safe query library, the linter rule that flags the dangerous pattern, the threat-model template. We will spend this chapter building exactly those scaffolds. The recurring theme from Chapter 1 — the human is the weakest link and the strongest asset — has a precise application-security form: an untrained developer is your largest attack surface, and a trained, tooled developer is your largest army of defenders. The same person, equipped differently.
🔗 Connection: This chapter builds directly on two earlier foundations. From Chapter 1 we keep the risk vocabulary — every OWASP category below is a vulnerability class, and your job is to reason about the risk (likelihood × impact) it carries in your context, not to treat all of them as equally urgent. From Chapter 3 we keep the principles: least privilege governs what an application and its database account may do, and defense in depth is why we still validate input even though we also encode output even though we also run a WAF (Chapter 13) — no single layer is trusted to hold alone.
🔄 Check Your Understanding:
1. What does "shift left" mean, and why does finding a defect at the design stage cost so much less than finding it in production?
2. Why is "have security review it before launch" a weak model, even when the security reviewers are excellent?
Answers
1. "Shift left" means moving security activities earlier in the development timeline (design and code, not just pre-launch). Earlier is cheaper because a design flaw caught at the whiteboard costs a conversation, while the same flaw in production costs an incident, disclosure, regulatory exposure, and lost trust, and the cost curve is steep.
2. By launch the architecture is fixed, the code is written, and the deadline is imminent, so reviewers can only document problems, not prevent them; security has to be built into each phase, where decisions are still cheap to change.
12.2 The OWASP Top 10: a guided tour of where applications fail
If you defend applications, you will hear "OWASP Top 10" constantly, and it is worth being precise about what it is and is not. The OWASP Top 10 is a periodically updated awareness document, published by the Open Worldwide Application Security Project, that ranks the categories of web-application security risk that are most prevalent and most impactful, drawn from real-world data and practitioner survey. It is the closest thing application security has to a shared starting map. Three cautions before the tour:
- It is a list of categories, not individual bugs. "Injection" is a category that contains many specific weaknesses. Mapping from a category to a specific, catalogued weakness is the job of the CWE — the Common Weakness Enumeration, a community catalog (maintained by MITRE) of software and hardware weakness types, each with a CWE number (for example, CWE-89 is "SQL Injection"). The OWASP Top 10 tells you which categories to worry about most; CWE tells you exactly which weakness a finding is. You will see both in scanner output for the rest of your career.
- It is an awareness and prioritization tool, not a compliance checklist or a complete standard. For a thorough, testable standard, OWASP also publishes the Application Security Verification Standard (ASVS) — a far longer list of concrete requirements at three assurance levels — which is what you reach for when you need to verify an application rather than merely be aware of risk.
- It is web-centric, and so are most of its categories. That is convenient for us, because it lets this chapter introduce the categories at the program and SDLC level — what each one is, how it gets in, and how a secure-development program prevents it — while Chapter 13 dissects the specific web attacks (SQL injection, cross-site scripting, request forgery, and their kin) in the depth they deserve. Think of this section as the map and Chapter 13 as the close-up of the most dangerous neighborhoods.
Here is the tour, using the category names and numbering of the 2021 edition. For each category we name what it is, the mechanism (why it happens), how it is abused, and the program-level defense: the practice that prevents the whole class, not a single instance.
A01 — Broken Access Control. What: the application fails to enforce what a user is allowed to do, so users reach data or actions that should be denied. Mechanism: authorization checks are missing, incomplete, or enforced only in the user interface (hiding a button is not a control). The classic form is insecure direct object reference — the app trusts an identifier from the request (/account?id=1004) and returns that object without checking the requester owns it. Abused: a user changes 1004 to 1005 and reads someone else's loan; a non-admin calls an admin API that the UI merely hid. Defense: enforce authorization server-side, on every request, by default-deny (Chapter 17 builds the access-control models). Center the check on the authenticated identity, never on a value the client supplied. Test it: for every object endpoint, ask "what if the ID belongs to someone else?"
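The server-side ownership check described for A01 can be sketched in a few lines. This is a minimal illustration, not a framework recipe: the `LOANS` table, the `AccessDenied` exception, and `get_loan` are hypothetical names, and the authenticated user is assumed to come from the server's own session state rather than from the request.

```python
# Hypothetical in-memory store: loan ID -> record. A real system would query a database.
LOANS = {
    1004: {"owner": "alice", "balance": "12000.00"},
    1005: {"owner": "bob", "balance": "88000.00"},
}


class AccessDenied(Exception):
    """Raised when the authenticated user may not see the requested object."""


def get_loan(authenticated_user: str, loan_id: int) -> dict:
    """Return a loan only if the authenticated user owns it (default-deny)."""
    record = LOANS.get(loan_id)
    # Use one error for "does not exist" and "not yours" so the response
    # does not reveal which loan IDs are valid.
    if record is None or record["owner"] != authenticated_user:
        raise AccessDenied("loan not found or not accessible")
    return record
```

Changing 1004 to 1005 in the request now fails for alice, because the decision is keyed on who is asking, not on which ID was supplied.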
A02 — Cryptographic Failures. What: sensitive data is exposed because cryptography is missing, weak, or misused. Mechanism: data sent in cleartext, stored unencrypted, protected by a broken algorithm, or "protected" by a homegrown scheme. Abused: an attacker on the network reads data in transit; a stolen database yields plaintext passwords or card numbers. Defense: this is the applied half of Chapters 4 and 5 — encrypt in transit (TLS) and at rest, hash passwords with a slow algorithm built for it, and never invent your own cryptography. Classify data first so you know what must be protected; you cannot encrypt what you have not identified as sensitive.
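As one illustration of "hash passwords with a slow algorithm built for it", the sketch below uses PBKDF2-HMAC-SHA256 from Python's standard library. The iteration count and the function names are assumptions chosen for the example; a production system should follow current guidance for its chosen algorithm and would normally rely on a vetted password-hashing library.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumption: an illustrative work factor; tune it against current guidance and your hardware.
PBKDF2_ITERATIONS = 600_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Only the salt and the derived key are stored. A stolen database then yields values that are expensive to guess against, not plaintext passwords.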
A03 — Injection. What: untrusted input is interpreted as a command by an interpreter — SQL, OS shell, LDAP, an expression language. Mechanism: the application builds a command by concatenating data into a string of code, so the interpreter cannot tell the developer's intent from the attacker's data. Abused: an input of ' OR '1'='1 turns a login query into one that always succeeds; a crafted value runs a shell command. Defense: separate code from data so the interpreter is never asked to parse attacker input as instructions — parameterized queries for SQL, safe APIs for shells, and input validation as a second layer. We preview the pattern in §12.3; Chapter 13 dissects injection in full. Log4Shell was, at root, an injection-class flaw: Log4j interpreted a special syntax inside ordinary log data and was tricked into fetching and running remote code — data became instructions.
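To make "separate code from data" concrete, here is a small sketch using Python's built-in sqlite3 module. The `accounts` table and its rows are invented for the demonstration. The same hostile string is first concatenated into the SQL text, then passed as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (owner TEXT, balance TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("dana", "1200.00"), ("lee", "560.00")],
)

hostile = "' OR '1'='1"

# VULNERABLE: attacker data becomes part of the SQL text, so the WHERE clause
# turns into:  owner = '' OR '1'='1'  which is true for every row.
unsafe_sql = "SELECT count(*) FROM accounts WHERE owner = '" + hostile + "'"
unsafe_matches = conn.execute(unsafe_sql).fetchone()[0]

# SAFER: the placeholder keeps the value out of the SQL text; the driver
# compares it as a literal owner name, which matches nothing.
safe_matches = conn.execute(
    "SELECT count(*) FROM accounts WHERE owner = ?", (hostile,)
).fetchone()[0]

print(unsafe_matches, safe_matches)  # prints: 2 0
```

The query logic is identical in both cases. The only difference is whether the interpreter is ever asked to parse attacker input as SQL.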
A04 — Insecure Design. What: the application is insecure because of a flaw in its design, not a bug in its code. Mechanism: a missing or wrong security control was never specified — there is no rate limit on a password-reset, no limit on transferable funds, no separation between roles. You cannot patch your way out of a missing requirement. Abused: the feature works exactly as designed, and the design is exploitable. Defense: this category exists to make a point — threat modeling (§12.6) and security requirements are the only controls that catch design flaws, because no scanner can find a control that was never meant to exist. This is why we threat-model before we code.
A05 — Security Misconfiguration. What: the application or its stack is configured insecurely — default credentials left in place, verbose error messages that leak internals, unnecessary features enabled, missing hardening. Mechanism: defaults favor convenience, and "it works" is mistaken for "it is configured." Abused: an attacker logs in with a default admin password, reads a stack trace that reveals the database structure, or finds a debug endpoint left enabled. Defense: a hardened, repeatable baseline (the Chapter 11 discipline, applied to the application tier), least-functionality (turn off what you do not use), and configuration as code so the secure setup is the only setup.
A06 — Vulnerable and Outdated Components. What: the application depends on libraries, frameworks, or runtimes with known vulnerabilities. Mechanism: dependencies are added and forgotten; nobody tracks what is in the build or watches for new advisories against it. Abused: this is the Log4Shell category. An attacker exploits a published vulnerability in a component you are running but did not know you had. Defense: software composition analysis (SCA) to inventory and monitor dependencies (§12.4–12.5), a patch process that can move fast (Chapter 23), and eventually a software bill of materials so you can answer "where do we use X?" in minutes, not days. We dwell on this one in §12.4 because it is the chapter's anchor.
A07 — Identification and Authentication Failures. What: the application fails to confirm identity correctly — weak passwords allowed, no protection against automated guessing, broken session handling. Mechanism: authentication is built ad hoc instead of using vetted mechanisms. Abused: credential-guessing at scale, session hijacking, predictable password resets. Defense: use established authentication frameworks, enforce strong-credential and lockout policy, and manage sessions correctly — the substance of Chapter 16 (authentication) and Chapter 13 (session and auth flaws in the web context).
A08 — Software and Data Integrity Failures. What: the application trusts code or data whose integrity it has not verified — an unsigned update, an unverified plugin, a deserialized object from an untrusted source, a compromised build step. Mechanism: "if it arrived, it must be legitimate" — no signature, no verification. Abused: an attacker substitutes a malicious update or library, or smuggles code through the build. Defense: verify integrity — signed artifacts, checked dependencies, a trustworthy build pipeline. This is the application-tier seed of the supply-chain story that Chapter 29 (provenance, SBOM) and Chapter 31 (pipeline integrity) develop fully.
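The simplest form of "verify integrity" is comparing a downloaded artifact against a digest published through a channel you already trust. The sketch below assumes such a digest is available and uses a hypothetical `verify_artifact` helper. It is a floor, not a substitute for the signed artifacts and pipeline controls named above.

```python
import hashlib
import hmac


def verify_artifact(data: bytes, expected_sha256_hex: str) -> bool:
    """True only if the artifact's SHA-256 digest matches the trusted, published value."""
    actual = hashlib.sha256(data).hexdigest()
    # Constant-time comparison; hex digests are plain ASCII strings.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

A digest check detects a swapped or corrupted file, but it proves nothing if the attacker can also change the published digest. Signatures exist to close that gap.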
A09 — Security Logging and Monitoring Failures. What: the application does not produce the evidence needed to detect, investigate, or respond to an attack. Mechanism: security-relevant events (logins, access-control failures, high-value actions) are not logged, or logs are not monitored. Abused: an attacker operates undetected because nothing recorded their failed authorization checks or anomalous actions. Defense: log the security-relevant events with enough context, ship them where they can be watched, and — recalling Chapter 1's theme that logs are the ground truth — design the application to help the defenders who will one day investigate it. This feeds the SIEM you build in Chapter 21.
A10 — Server-Side Request Forgery (SSRF). What: the application can be tricked into making requests to destinations the attacker chooses, including internal systems it should never reach. Mechanism: the app fetches a URL supplied or influenced by the user without restricting where it may go. Abused: an attacker points the server at an internal metadata service or an unexposed admin interface, using the trusted server as a proxy. Defense: validate and restrict outbound destinations (allowlists, blocked internal ranges), and apply least privilege to what the server may reach. Chapter 13 dissects SSRF in the web context.
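A destination check for outbound fetches might look like the following sketch. `ALLOWED_HOSTS` and both function names are hypothetical, and the policy (HTTPS only, named hosts from an allowlist, internal IP literals refused) is an assumption for illustration, not a complete SSRF defense.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist of partner hosts this server may call.
ALLOWED_HOSTS = {"api.partner.example"}


def is_internal_address(host: str) -> bool:
    """True if host is an IP literal in a loopback, private, or link-local range."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal
    return ip.is_loopback or ip.is_private or ip.is_link_local


def is_allowed_destination(url: str) -> bool:
    """Default-deny: only HTTPS URLs whose host is explicitly allowlisted pass."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL
    if parts.scheme != "https" or host is None:
        return False
    if is_internal_address(host):
        return False
    return host in ALLOWED_HOSTS
```

A name-based check like this does not stop an allowlisted name from resolving to an internal address or redirecting to one. A fuller control also inspects the resolved address and restricts redirects.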
🛡️ Defender's Lens: Notice a pattern across the ten. Several are about trusting input you should not (injection, SSRF, access control on client-supplied IDs); several are about failing to verify integrity or identity (components, software-and-data integrity, authentication); and a few are about the program around the code (insecure design, misconfiguration, logging). From the blue-team seat, this is reassuring: you do not need ten unrelated defenses. You need a handful of disciplines (validate input, separate code from data, verify integrity, enforce authorization server-side, configure to a baseline, and log the right events) applied everywhere. The Top 10 is ten symptoms of a smaller number of habits.
⚠️ Common Pitfall: Treating the OWASP Top 10 as a test you pass rather than risks you manage. A team that "checked the Top 10" once before launch has done roughly what a team that scanned for 1,400 vulnerabilities and called it security did in Chapter 1 — a snapshot, not a program. The categories shift over time (older lists separated XSS and named injection differently; newer ones elevated insecure design and SSRF as the data demanded), precisely because the document tracks reality rather than freezing it. Use it as a recurring map, not a one-time gate.
🔄 Check Your Understanding:
1. What is the difference between the OWASP Top 10 and a CWE number? Which one would appear on a single scanner finding for a specific bug?
2. Which OWASP category is the Log4Shell home, and what single program-level practice most directly addresses that category?
Answers
1. The OWASP Top 10 is a ranked list of categories of risk for awareness and prioritization; a CWE is a specific weakness type with an ID (e.g., CWE-89 SQL Injection) from MITRE's catalog. A single scanner finding for a specific bug cites a CWE; the Top 10 groups many CWEs into a category.
2. Log4Shell lives in A06, Vulnerable and Outdated Components; the most direct practice is software composition analysis (SCA): inventorying and monitoring dependencies so you know what you run and when it becomes vulnerable. (It also has an injection character, A03, since data was interpreted as instructions.)
12.3 Input validation and output encoding: the two patterns that prevent the most bugs
If you could teach a development team only two secure-coding habits, these would be the two, because between them they prevent a startling share of the injection and scripting bugs that dominate every Top 10. They are easy to confuse and they are not interchangeable. The slogan to carry: validate input on the way in; encode output on the way out.
Input validation is the practice of checking that incoming data conforms to what the application actually expects before the application acts on it — and rejecting (or, carefully, sanitizing) what does not. The governing principle is positive validation, also called allowlisting: define what is allowed and reject everything else. This is far stronger than denylisting — trying to enumerate what is forbidden — because you cannot list every dangerous input an attacker might invent, but you usually can list what a legitimate value looks like. A U.S. ZIP code is five digits (or five-plus-four). A loan-account ID is a 10-character string matching a known pattern. A transfer amount is a positive number within an allowed range. If you specify the allowed shape and reject the rest, you have closed the door on inputs you never imagined, which is exactly the inputs an attacker will try.
Two rules make input validation actually safe, and beginners violate both:
- Validate on the server, always. Client-side validation (in the browser, in the mobile app) is a usability feature — it gives the user fast feedback. It is not a security control, because the client is under the attacker's control: anyone can bypass the browser and send raw requests directly to your server. Every check that matters for security must be re-done server-side, where the attacker cannot reach the code. A validation that runs only in JavaScript is decoration.
- Validate as close to the expected type as possible. Parse a date into a date, a number into a number, an ID into its known format, and reject anything that does not parse. "Is this a string that does not contain bad characters?" is a weaker, more error-prone check than "is this a valid integer between 1 and 5?"
Here is the pattern at the category level. The point is not the specific language; it is the shape — declare the allowed form, reject everything else, and do it server-side.
# VULNERABLE: trusts input shape; "validation" is denylist + client-side only
def set_transfer_amount_bad(raw_amount):
    # (a) denylist: try to strip "bad" things; always incomplete
    cleaned = raw_amount.replace(";", "").replace("--", "")
    # (b) assumes the browser already checked it's a number; it didn't
    return float(cleaned)  # crashes or misbehaves on hostile input

# Expected output (illustrative): float("1000 OR 1=1") -> ValueError at runtime,
# or worse, a value that flows unchecked into a query downstream.

# FIXED: positive validation, server-side, parsed to the expected type and range
from decimal import Decimal, InvalidOperation

MAX_TRANSFER = Decimal("25000.00")  # a business rule, enforced in code

def set_transfer_amount_good(raw_amount):
    """Accept only a well-formed, in-range monetary amount; reject everything else."""
    try:
        amount = Decimal(str(raw_amount))  # parse to the expected TYPE
    except InvalidOperation:
        raise ValueError("amount is not a valid number") from None
    if not amount.is_finite():  # Decimal also parses "NaN" and "Infinity"; reject them
        raise ValueError("amount is not a valid number")
    if amount <= 0 or amount > MAX_TRANSFER:  # enforce the allowed RANGE
        raise ValueError("amount out of allowed range")
    return amount

# Expected output (hand-traced):
# set_transfer_amount_good("1000.00") -> Decimal('1000.00')
# set_transfer_amount_good("0") -> ValueError: amount out of allowed range
# set_transfer_amount_good("1000 OR 1=1") -> ValueError: amount is not a valid number
# set_transfer_amount_good("NaN") -> ValueError: amount is not a valid number
# set_transfer_amount_good("99999999") -> ValueError: amount out of allowed range
The fixed version never tries to clean hostile input; it parses to the type it wants and rejects what does not fit. That inversion — from "remove the bad" to "require the good" — is the whole idea.
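The same "require the good" shape applies to the other examples mentioned earlier, such as a ZIP code or a date. The sketch below is illustrative only: `parse_zip` and `parse_iso_date` are hypothetical helpers, and the accepted formats (five digits or five-plus-four, and YYYY-MM-DD) are assumptions about what the application expects.

```python
import re
from datetime import date

# [0-9] rather than \d, which would also accept non-ASCII digits in Python.
ZIP_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_zip(raw: str) -> str:
    """Accept only a 5-digit or 5+4 ZIP code; reject everything else."""
    # fullmatch requires the WHOLE string to fit the allowed shape.
    if not isinstance(raw, str) or ZIP_PATTERN.fullmatch(raw) is None:
        raise ValueError("ZIP code must be 5 digits or 5+4")
    return raw


def parse_iso_date(raw: str) -> date:
    """Parse YYYY-MM-DD into a real date; impossible dates are rejected by the parser."""
    if not isinstance(raw, str) or DATE_PATTERN.fullmatch(raw) is None:
        raise ValueError("date must be YYYY-MM-DD")
    try:
        return date.fromisoformat(raw)
    except ValueError:
        raise ValueError("date is not a real calendar date") from None
```

Neither function contains a list of forbidden characters. A value such as "30301; DROP TABLE" fails simply because it is not five digits, and "2021-02-30" fails because it does not parse into a date.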
Output encoding (also called output escaping) is the complementary pattern, and it solves a problem validation cannot. Input validation asks "is this acceptable input?" Output encoding asks "is this data safe in the context where I am about to put it?" The same string can be perfectly valid as a name and dangerous when dropped, unescaped, into a web page, a SQL string, a shell command, or an HTML attribute. The defense is to transform the data so that the interpreter at the destination treats it as inert data, not as syntax: encode for the context you are writing into. A < becomes &lt; when written into HTML so a browser renders it as a literal less-than sign instead of the start of a tag; a value bound as a parameter to a SQL statement is handled by the database driver as data, never parsed as SQL.
The non-negotiable rule is context-specific encoding: the right encoding depends entirely on the destination. HTML body, HTML attribute, JavaScript, URL, SQL, and shell each have different rules, and encoding for the wrong one provides no protection (HTML-encoding a value that you then drop into a JavaScript context does nothing useful). This is precisely why we lean on frameworks and safe APIs rather than hand-rolled escaping: a mature templating engine encodes for the correct context automatically, and a parameterized-query API moves the code/data boundary into the driver where attacker input cannot become SQL.
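To see why the destination decides the encoding, the sketch below passes one sample value through three standard-library encoders, each built for a different context. The sample string is invented. A templating engine or safe API would normally make these calls for you.

```python
import html
import shlex
from urllib.parse import quote

value = "<b>O'Brien & Sons</b>"

# HTML body or quoted attribute: markup characters become entities.
for_html = html.escape(value)
# URL component: unsafe characters are percent-encoded (safe="" also encodes "/").
for_url = quote(value, safe="")
# POSIX shell argument: the whole value is wrapped so the shell sees one literal word.
for_shell = shlex.quote(value)

print(for_html)   # &lt;b&gt;O&#x27;Brien &amp; Sons&lt;/b&gt;
print(for_url)    # %3Cb%3EO%27Brien%20%26%20Sons%3C%2Fb%3E
print(for_shell)  # '<b>O'"'"'Brien & Sons</b>'
```

The three results are different strings, and none is safe in another's context. That is the argument for letting a context-aware framework choose the encoder.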
USER INPUT ──▶ [ INPUT VALIDATION ] ──▶ application logic ──▶ [ OUTPUT ENCODING ] ──▶ destination
"is it the    reject what isn't the    (store, compute,      make it inert FOR       (HTML page,
 shape I      allowed shape; do it      decide)              THIS destination's       SQL driver,
 expect?"     SERVER-SIDE                                    interpreter              shell, ...)
                                               │
Defense in depth: BOTH gates run. Validation alone can miss context; encoding alone can let
malformed data into your logic. Neither replaces the other.
Figure 12.2 — Input validation and output encoding are two distinct gates at two distinct moments. Validation governs what enters your logic; encoding governs how data leaves it into a particular interpreter. A robust application runs both, because each catches what the other misses.
⚠️ Common Pitfall: Believing input validation alone stops injection and scripting. It helps, but it is not sufficient, because data that was valid input (a perfectly legitimate name like O'Brien, or a comment containing <) can still be dangerous output when placed unescaped into a SQL string or an HTML page. The reverse pitfall is just as common: relying entirely on encoding and letting malformed data corrupt your business logic. They are defense in depth for data: you run both, at their respective moments. Chapter 13 shows exactly how injection and cross-site scripting slip through when one of these gates is missing.
🧩 Try It in the Lab: In your own sandbox, write a tiny function that validates a username against a strict allowlist (say, 3–20 characters of [a-zA-Z0-9_] only) and rejects everything else, then hand-trace it against five inputs: a normal name, an empty string, a name with a space, a 30-character name, and a name containing <script>. Which inputs does positive validation reject that a denylist of "remove <script>" would let through? You will quickly see why allowlisting is the stronger default.
🔄 Check Your Understanding:
1. Why is client-side input validation not a security control, and what must you do instead?
2. A name like O'Brien passes input validation as a legitimate name. Explain why output encoding (or parameterization) is still required before that name is written into a SQL statement or an HTML page.
Answers
1. The client (browser/mobile app) is under the attacker's control, so any check that runs only there can be bypassed by sending raw requests directly to the server; security-relevant validation must be re-done server-side where the attacker cannot alter the code.
2. Because O'Brien is valid input but its apostrophe is dangerous in a SQL string context (it can break out of a quoted literal) and its characters can matter in HTML; output encoding/parameterization makes the data inert for the destination interpreter, which input validation does not do. They defend different moments.
12.4 Secrets, dependencies, and the software supply chain: the code you didn't write
We now arrive at the chapter's anchor, and at the category that taught the whole industry a humbling lesson. Two related problems live here: secrets that get embedded where they should not, and dependencies that bring risk you did not choose. Both flow into the larger concern of the software supply chain — everything that goes into your software that you did not write yourself, and the question of whether you can trust it.
Secrets in code. A secret is any credential that grants access — a password, an API key, a database connection string, a private key, a token. The recurring failure is hard-coding secrets into source code, where they end up committed to version control, copied into build logs, and shared with everyone who ever clones the repository. The mechanism is convenience: it is faster to paste the key into the file than to wire up a proper secrets store. The abuse is brutal in its simplicity — an attacker who reads the source (a leaked repo, a misconfigured public repository, an insider, a compromised laptop) reads the keys, and a key in a Git history is permanently exposed even if you delete the line later, because version control remembers. The program-level defense is to keep secrets out of code entirely: inject them at runtime from a secrets manager or environment, scan commits for secrets before they merge, and rotate any secret that touches a repository. This is a preview; Chapter 20 builds the full secrets-management discipline (vaults, dynamic secrets, rotation, machine identity), and your appsec.py toolkit gains a secret-scanning function there. For now, hold the rule: a secret in source control is a secret that has leaked.
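As a minimal sketch of "inject them at runtime", the helper below reads a secret from the process environment and fails closed when it is missing. The variable name `MERIDIAN_DB_PASSWORD`, the `load_secret` function, and the `MissingSecretError` class are hypothetical; a real deployment would typically have a secrets manager populate the environment or be queried directly.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected at runtime."""


def load_secret(name, environ=None):
    """Fetch a secret from the environment; refuse to start without it."""
    env = os.environ if environ is None else environ
    value = env.get(name, "")
    if not value:
        # Fail closed: no hard-coded fallback, no default password.
        raise MissingSecretError(f"required secret {name} is not set")
    return value


# Demo: an explicit mapping stands in for the real environment so the
# example is self-contained. The value is a placeholder, not a credential.
demo_env = {"MERIDIAN_DB_PASSWORD": "example-not-a-real-secret"}
db_password = load_secret("MERIDIAN_DB_PASSWORD", demo_env)
print("secret loaded:", bool(db_password))
```

The design choice worth noticing is the absence of a default value: a fallback written into the source would be exactly the hard-coded secret the paragraph warns against.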
Dependency risk. Modern software is assembled, not written. A typical application is a thin shell of your own code resting on a deep stack of third-party libraries, each of which depends on further libraries, recursively. This is enormously productive — you should not reimplement a JSON parser or a logging framework — but it means dependency risk: the risk that a component you depend on (directly or transitively) contains a vulnerability, is malicious, or is abandoned. The two words that matter are direct (a library you chose and listed) and transitive (a library your library chose, which you may not even know you have). Transitive dependencies are where the danger hides, because nobody reviewed them and nobody is watching them.
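The direct/transitive distinction is easy to see on a small graph. The sketch below assumes a made-up dependency graph (`DEPENDS_ON` and every package name other than `log4j-core` are hypothetical) and walks it breadth-first to list what an application really pulls in, and by which path.

```python
from collections import deque

# Hypothetical graph: package -> the packages it depends on directly.
DEPENDS_ON = {
    "loan-app": ["web-framework", "pdf-export"],
    "web-framework": ["http-client"],
    "pdf-export": ["report-engine"],
    "report-engine": ["log4j-core"],
    "http-client": [],
    "log4j-core": [],
}


def resolve(root, graph):
    """Return {package: path from root} for every reachable dependency."""
    paths = {}
    queue = deque([(root, [root])])
    while queue:
        pkg, path = queue.popleft()
        for dep in graph.get(pkg, []):
            if dep not in paths and dep != root:
                paths[dep] = path + [dep]
                queue.append((dep, paths[dep]))
    return paths


paths = resolve("loan-app", DEPENDS_ON)
direct = set(DEPENDS_ON["loan-app"])
transitive = sorted(set(paths) - direct)

print("direct:", sorted(direct))
print("transitive:", transitive)
print("log4j-core arrives via:", " -> ".join(paths["log4j-core"]))
```

The developer listed two libraries; the application runs five. The logging library appears only in the transitive set, which is why a review of the declared dependency list alone would never have surfaced it.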
Now Log4Shell, in full, because it makes every abstraction above concrete.
📟 War Story: Log4Shell at Meridian. Constructed in its Meridian specifics; the Log4j vulnerability (CVE-2021-44228, CVSS 10.0 Critical) and its industry impact are real. When the advisory broke, Sam Whitfield's first instinct was the right one and also, at first, useless: "patch it." But Meridian could not patch what it could not find. Log4j was not in any Meridian application's dependency list that a developer had written. It arrived transitively — three layers deep — inside a component of the vendor-supplied loan-origination platform, and again inside a logging sidecar in an internal microservice, and again inside a build tool nobody thought of as "an application" at all. The vulnerability let an attacker who could place a crafted string anywhere that got logged — a User-Agent, a form field, a username on a failed login — make the server reach out and execute remote code. The exposure was therefore everywhere the bank logged untrusted input, which is to say everywhere.
The week that followed was not a patching exercise; it was a discovery exercise. Priya Nair's team hunted for the string patterns in logs and proxy data (detection while a fix was prepared). Sam's team raced to inventory: which of Meridian's hundreds of services, build tools, and vendor products contained a vulnerable Log4j, at what version, exploitable how? They had no software bill of materials, so the inventory was archaeology — grepping dependency trees, querying the vendor, scanning artifacts. The questions that decided whether Meridian was breached were not "how does the exploit work?" but "do we use it, and where?" — and those were the questions the bank was least equipped to answer.
Sit with that, because it is the lesson the whole chapter is built to deliver. Log4Shell was not a failure of any developer's code. It was a failure of knowing what you run. The defenses are unglamorous and they are exactly what a secure-development program provides:
- Inventory your dependencies, transitively. You cannot manage risk in components you cannot list. This is the core job of software composition analysis (§12.5). Eventually it produces a software bill of materials (SBOM) — a formal, machine-readable inventory of every component in a piece of software — which is introduced in Chapter 23 and fully developed in Chapter 29. The reason an SBOM matters is precisely the Log4Shell question: with one, "where do we use Log4j?" is a database query; without one, it is a week of archaeology.
- Monitor those dependencies against advisories. Knowing you use a component is only half the job; you must learn the day it becomes vulnerable. SCA tooling watches your inventory against feeds of new vulnerabilities and tells you when a component you run is now dangerous.
- Be able to patch fast. When the advisory lands, the clock is the attacker's. A program that can identify, test, and deploy a dependency update in hours rather than weeks survives; one that cannot, gambles. Chapter 23 turns this into a prioritized, SLA-driven vulnerability-management process — and uses Log4Shell again as its worked example of triage under pressure.
- Prefer fewer, healthier dependencies. Every dependency is a trust decision and a future liability. Favor well-maintained components, remove unused ones, and treat "add a library" as the security decision it is.
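To make "a database query" concrete, here is a minimal sketch of an inventory stored in SQLite and the one-line question asked of it. The `sbom` table layout, the service names, and the versions are all invented for illustration; a real SBOM would follow a standard format and carry far more detail.

```python
import sqlite3

# Hypothetical inventory rows: (service, component, version).
ROWS = [
    ("loan-origination", "log4j-core", "2.14.1"),
    ("loan-origination", "jackson-databind", "2.13.0"),
    ("log-sidecar", "log4j-core", "2.11.2"),
    ("build-tooling", "log4j-core", "2.17.1"),
    ("customer-portal", "flask", "2.0.1"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sbom (service TEXT, component TEXT, version TEXT)")
conn.executemany("INSERT INTO sbom VALUES (?, ?, ?)", ROWS)


def where_used(conn, component):
    """Answer 'do we use it, and where?' with a single parameterized query."""
    cursor = conn.execute(
        "SELECT service, version FROM sbom WHERE component = ? ORDER BY service",
        (component,),
    )
    return cursor.fetchall()


hits = where_used(conn, "log4j-core")
for service, version in hits:
    print(f"{service}: log4j-core {version}")
```

The query itself is trivial. The hard, unglamorous work is keeping the table complete and current, which is what the inventory and monitoring bullets above describe.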
🔗 Connection: Log4Shell threads through this entire book because it touches so many disciplines. You meet it here as a secure-development and dependency problem; in Chapter 23 it returns as a vulnerability-management and prioritization problem (how do you triage a CVSS 10.0 across thousands of assets?); in Chapter 29 it returns as a supply-chain and SBOM problem (how do you contractually require and operationally use a bill of materials?); and in Chapter 31 the lesson generalizes to pipeline integrity. The same incident is a different lesson from each seat — which is itself the point: application security is not a silo.
🛡️ Defender's Lens: During a Log4Shell-class event, the defender's advantage is preparation done in advance. The organization that had an SBOM answered "where?" in an afternoon and spent the rest of the week patching. The organization that did not spent the week searching, and patched whatever it could find before attackers got there first. Neither organization could change the vulnerability; they could only differ in how well they knew their own estate. This is Chapter 1's theme — you cannot protect what you do not know you have — written in the language of dependencies. Build the inventory before you need it.
🔄 Check Your Understanding: 1. Why was Log4Shell so hard for organizations to respond to, even though a patch existed quickly? 2. Explain the difference between a direct and a transitive dependency, and why transitive dependencies are the more dangerous of the two.
Answers
1. Because the hard problem was not fixing the vulnerability but finding it: Log4j was a deeply transitive dependency present in countless products and services that organizations did not know they ran, so most lacked an inventory (SBOM) to answer "do we use it, and where?" — turning patching into a slow discovery exercise.
2. A direct dependency is a library you explicitly chose and listed; a transitive dependency is one pulled in by your dependencies, often without your awareness. Transitive ones are more dangerous because no one on your team reviewed or is monitoring them, yet they run with the same access as everything else.
12.5 SAST, DAST, and SCA in practice: automating the search for weakness
A secure-development program cannot rely on humans to find every flaw by reading code; the volume is too high and the bugs are too easy to miss. So we automate, with three families of tooling that answer three different questions. New engineers blur them together; precision here is what lets you choose the right tool and interpret its output sanely.
SAST — Static Application Security Testing. SAST analyzes your source code (or compiled bytecode) without running it — "static" means at rest. It parses the code, models how data flows through it, and flags dangerous patterns: input that reaches a SQL string without parameterization (a taint path from a source to a dangerous sink), use of a known-weak function, a hard-coded secret, a missing authorization check it can recognize. Because it reads the code, SAST can run early (in the developer's editor, on every commit) and can point to the exact file and line — its great strength. Its great weakness is false positives: without running the program, it cannot always tell whether a flagged path is truly reachable or truly exploitable, so it over-warns, and a SAST tool that cries wolf gets ignored. SAST is the gate at the code phase of Figure 12.1. (The taint_demo function your appsec.py toolkit gains in Chapter 13 is an illustrative, defensive sketch of the source-to-sink idea SAST automates.)
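A toy static check shows both the strength and the weakness described above. The sketch below uses Python's standard `ast` module to flag `.execute(...)` calls whose first argument is an f-string or a string concatenation. It is a pattern match, not real taint analysis, and it is not the `taint_demo` function that Chapter 13 adds; the function name `flag_dynamic_sql` and the sample source are assumptions for illustration.

```python
import ast

# Sample source to analyze. It is parsed, never executed.
SAMPLE = '''
def find_user(cur, name):
    cur.execute(f"SELECT * FROM users WHERE name = '{name}'")

def find_user_safe(cur, name):
    cur.execute("SELECT * FROM users WHERE name = ?", (name,))

def find_user_concat(cur, name):
    cur.execute("SELECT * FROM users WHERE name = '" + name + "'")
'''


def flag_dynamic_sql(source):
    """Return line numbers where .execute() receives a built-up string."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and func.attr == "execute"):
            continue
        if not node.args:
            continue
        first = node.args[0]
        # f-strings parse as JoinedStr; "a" + b parses as BinOp.
        if isinstance(first, (ast.JoinedStr, ast.BinOp)):
            findings.append(node.lineno)
    return sorted(findings)


print(flag_dynamic_sql(SAMPLE))
```

The check points to exact lines without running anything, and it leaves the parameterized call alone. It would also flag `cur.execute("SELECT 1" + ";")`, which concatenates two constants and is harmless: a small, honest example of the false positives that come from reading code without knowing what is reachable or attacker-controlled.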
DAST — Dynamic Application Security Testing. DAST tests a running application from the outside, as an attacker would — "dynamic" means in motion. It sends crafted requests at the live app and observes the responses, looking for behaviors that reveal a vulnerability. Because it exercises the real, running system, a DAST finding is more likely to be real and exploitable (fewer false positives in that sense), and it catches issues that only appear at runtime or in configuration — things SAST, which never runs the app, cannot see. Its weaknesses mirror SAST's strengths: it needs a deployed, running application (so it runs later, at the test phase), it cannot point to a line of source, and it only finds what it manages to reach and trigger, so it can miss flaws in code paths it never exercised (false negatives). SAST and DAST are complementary by construction: one reads the blueprint, the other rattles the doors.
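The black-box idea can be sketched without a network. Below, two hypothetical page handlers stand in for a running application, and a probe sends a harmless marker and checks whether it comes back unencoded. A real DAST tool sends HTTP requests to a deployed build that you are authorized to test; calling the handlers in-process is a simplification to keep the example self-contained, and every name here is invented.

```python
import html


def vulnerable_search_page(query):
    """Hypothetical handler that reflects input without encoding it."""
    return f"<h1>Results for {query}</h1>"


def safe_search_page(query):
    """The same handler with HTML output encoding applied."""
    return f"<h1>Results for {html.escape(query)}</h1>"


def probe_reflection(handler, marker="<dast-probe-7f3a>"):
    """Black-box check: does a harmless marker come back unencoded?

    The probe sees only the response, never the handler's source.
    """
    response = handler(marker)
    return marker in response


print("vulnerable handler flagged:", probe_reflection(vulnerable_search_page))
print("safe handler flagged:", probe_reflection(safe_search_page))
```

The probe can say that the behavior is exploitable, but not which line caused it, and it learns nothing about handlers it never calls. Both limits match the weaknesses described above.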
SCA — Software Composition Analysis. SCA answers the question §12.4 made urgent: what third-party components are in this software, and are any of them known to be vulnerable? It inspects your dependency manifests and build artifacts, builds the full transitive dependency tree, and checks each component and version against vulnerability databases — and, increasingly, flags risky licenses and unmaintained projects. SCA is what would have answered "do we use Log4j, and where?" in an afternoon. It runs at the build phase (and continuously, since new advisories land against components you already shipped). SAST and DAST examine the code you wrote; SCA examines the code you imported — and since most of your software is imported, SCA is not optional.
Here is the comparison a practitioner keeps in their head:
| | SAST | DAST | SCA |
|---|---|---|---|
| Examines | Your source/bytecode, at rest | Your running application, from outside | Your third-party dependencies |
| "White/black box" | White box (sees the code) | Black box (sees only behavior) | Inventory + known-vuln matching |
| Finds | Insecure code patterns, taint paths, hard-coded secrets | Runtime/config vulns, exploitable behaviors | Known-vulnerable & outdated components |
| Runs (SDLC phase) | Early — editor, commit (code) | Later — needs a deployed app (test) | Build + continuously |
| Points to a line? | Yes (file & line) | No (a request/response) | Yes (a component & version) |
| Main weakness | False positives (can't confirm exploitability) | False negatives (only finds what it reaches) | Only as good as its vuln database; can't judge your own code |
| Answers Log4Shell's question? | No | Maybe (if it triggers it) | Yes — "do we use it, where?" |
The mature answer is not "which one?" but "all three, at the phase each fits," feeding their findings into a single prioritized view — which is exactly the CI/CD pipeline integration of Chapter 31. None of the three replaces threat modeling (§12.6) or human code review, because tools find the bugs they were built to recognize and miss the design flaws and business-logic abuses that require understanding intent. Automation is leverage, not a substitute for thinking.
⚠️ Common Pitfall: Drowning developers in unfiltered tool output. A SAST scan that returns 4,000 findings, most of them false positives or low-risk, trains developers to ignore all of it — including the three findings that mattered. The fix is the same risk-based prioritization from Chapter 1, applied to tool output: tune the tools, suppress known-false patterns, rank by exploitability and asset criticality, and gate the build on the high-confidence, high-severity subset (Chapter 31's ci_gate), not on raw volume. A noisy scanner that everyone ignores is worse than no scanner, because it manufactures false assurance.
🛡️ Defender's Lens: Think of the three tools as covering three blind spots. SAST covers "we wrote a dangerous pattern." DAST covers "the running system behaves exploitably." SCA covers "we imported a known-vulnerable component." A vulnerability that escapes all three usually lives in the fourth space — insecure design and business logic — which is why threat modeling is not a tool you can buy. When you assess a team's AppSec maturity, check not whether they own scanners but whether they have closed all four spaces: the three automatable ones and the human one.
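Gating on the high-confidence, high-severity subset rather than on raw volume might look like the sketch below. It is not Chapter 31's `ci_gate`; the `Finding` fields, the thresholds, and the sample findings are assumptions chosen for illustration, and real tools report severity and confidence in their own formats.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Finding:
    tool: str
    rule: str
    severity: str         # a key of SEVERITY_RANK
    confidence: float     # 0.0 to 1.0, as reported or tuned by the team
    asset_critical: bool  # does it touch a crown-jewel asset?


def blocking_findings(findings, min_severity="high", min_confidence=0.8):
    """Return only the findings worth failing a build for, most urgent first."""
    threshold = SEVERITY_RANK[min_severity]
    selected = [
        f for f in findings
        if SEVERITY_RANK[f.severity] >= threshold and f.confidence >= min_confidence
    ]
    return sorted(
        selected,
        key=lambda f: (f.asset_critical, SEVERITY_RANK[f.severity], f.confidence),
        reverse=True,
    )


raw = [
    Finding("sast", "sql-string-concat", "high", 0.9, True),
    Finding("sast", "weak-hash-in-test", "low", 0.9, False),
    Finding("sca", "CVE-2021-44228", "critical", 1.0, True),
    Finding("sast", "possible-path-traversal", "high", 0.3, False),
]

blocking = blocking_findings(raw)
for f in blocking:
    print(f"BLOCK {f.tool}: {f.rule} ({f.severity}, confidence {f.confidence})")
```

Four findings go in and two block the build. The low-severity and low-confidence findings are not discarded; they stay in the backlog, where they no longer train developers to ignore the gate.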
🔄 Check Your Understanding: 1. You want to find whether any of your imported libraries has a newly published critical vulnerability. Which of SAST/DAST/SCA answers this, and why not the other two? 2. A SAST tool flags a line; a DAST scan of the same application does not flag the corresponding behavior. Give two different, legitimate reasons this can happen.
Answers
1. SCA — it inventories your (transitive) dependencies and matches them against vulnerability databases. SAST examines your own code, not third-party components' known vulnerabilities; DAST tests runtime behavior and would only catch it if it happened to trigger the specific flaw, which it usually will not for an arbitrary library vulnerability.
2. (a) The SAST finding is a false positive — the flagged path is not actually reachable or exploitable at runtime, so DAST correctly sees nothing. (b) The flaw is real but DAST is a false negative here — it never reached or triggered that code path (wrong input, unauthenticated, behind a feature flag), so it missed a genuine issue SAST caught.
12.6 Threat modeling a feature: turning "what could go wrong?" into requirements
Tools find the bugs you can recognize mechanically. Threat modeling finds the ones you have to think about — the design flaws, the missing controls, the business-logic abuses that no scanner will ever flag because they are not deviations from the code; they are the code, working as designed. This section gives you a practical, lightweight method you can run on a single feature in an hour, and shows how its output becomes the security requirements that drive everything else.
Threat modeling (application) is the structured activity of analyzing a design to identify what could go wrong from a security standpoint, before it is built, so the defenses can be designed in rather than retrofitted. (The basic idea of a threat model was introduced for the whole organization in Chapter 2; here we apply it at the granularity of a feature.) The honest version of the practice answers four questions, popularized by Adam Shostack's Four Question Framework and now widely used as the framing of any threat model:
- What are we building? Draw the feature — its components, its data, and especially its trust boundaries (the lines where data crosses from less-trusted to more-trusted, such as the browser-to-server boundary). Most threats live at trust boundaries.
- What can go wrong? Enumerate threats systematically. The standard mnemonic is STRIDE, which names six threat types so you do not miss a category: Spoofing (pretending to be someone), Tampering (altering data), Repudiation (denying an action with no evidence to prove otherwise), Information disclosure (leaking data), Denial of service (degrading availability), and Elevation of privilege (gaining rights you should not have).
- What are we going to do about it? For each credible threat, decide on a control — mitigate it, eliminate the feature, transfer the risk, or consciously accept it (the treatment vocabulary of Chapter 1, formalized in Chapter 27).
- Did we do a good job? Validate the model and its mitigations, and revisit when the design changes.
Let us run it on a real Meridian feature so it stops being abstract.
📟 War Story: threat-modeling the loan-document upload. Constructed (Tier 3). Meridian is adding a feature to its loan-origination app: an applicant uploads supporting documents (pay stubs, tax forms), and a loan officer later views them. Sam Whitfield runs a one-hour threat-modeling session with the two developers before a line is written.
Step 1 — what are we building? They sketch it: a browser uploads a file to an application server, which stores it and records metadata in the database; later, a loan officer's browser requests the file and the server returns it. The key trust boundary is obvious once drawn — the line between the applicant's browser (fully untrusted; anyone on the internet) and the server. A second, subtler boundary sits between the loan officer's session and the stored documents.
APPLICANT'S BROWSER ║ trust MERIDIAN APP SERVER DATABASE
(untrusted internet) ║ boundary
│ upload file ──────────╫────────────▶ [ receive ] ──▶ [ store file ] ──▶ metadata row
│ ║ │
LOAN OFFICER'S BROWSER ║ [ serve file on request ] ◀── file store
│ view file ◀───────────╫────────────────────┘
║ most threats cross THIS line ↑
Figure 12.3 — The loan-document upload feature with its primary trust boundary. The threats worth finding are the ones that cross the boundary between the untrusted internet and the trusted server.
Step 2 — what can go wrong? (STRIDE) Walking the letters keeps them honest:
- Spoofing — could an unauthenticated user upload, or impersonate an applicant? (The upload endpoint must require the applicant's authenticated session.)
- Tampering — could the file or its metadata be altered in transit or at rest? (TLS in transit; integrity controls at rest.)
- Repudiation — if a fraudulent document is later disputed, can the bank prove who uploaded what, when? (Log the upload with identity and timestamp — directly serving A09.)
- Information disclosure — here is the one that matters. When the loan officer requests a file, does the server check that this officer is allowed to see that applicant's documents, or does it trust the document ID in the URL? An insecure direct object reference (A01) here means changing the ID in the request exposes another applicant's tax forms. Also: are uploaded files stored where the web server might serve them directly, bypassing the check entirely?
- Denial of service — could a huge or malicious file (a "zip bomb," thousands of uploads) exhaust storage or processing? (Enforce size limits, type limits, rate limits.)
- Elevation of privilege — could a malicious file become code? An uploaded file that the server later executes, or that carries an exploit for the document parser, turns "upload" into "run." (Validate type, store outside any executable path, scan content.)
Step 3 — what do we do about it? The session turns each credible threat into a control, and Sam writes them down not as advice but as security requirements — testable statements the feature must satisfy.
A security requirement is a specific, verifiable statement of what the software must do (or must never do) to be secure — the bridge between "we found a threat" and "a developer built a defense and a tester confirmed it." Vague advice ("validate input," "be secure") cannot be built or tested; a requirement can. Here is the session's output, the form that actually changes what ships:
| # | Security requirement (from a STRIDE threat) | STRIDE / OWASP | Verifiable by |
|---|---|---|---|
| SR-1 | The upload endpoint SHALL reject requests without a valid authenticated applicant session. | S / A01, A07 | DAST, code review |
| SR-2 | On every document retrieval, the server SHALL verify the requesting user is authorized for that specific document, keyed to the authenticated identity — never to a client-supplied ID alone. | I / A01 | Code review, abuse test |
| SR-3 | Uploaded files SHALL be stored outside any web-servable or executable path; the server SHALL never execute uploaded content. | E / A05, A08 | Architecture review |
| SR-4 | The server SHALL accept only an allowlist of file types and enforce a maximum size; oversized or disallowed uploads SHALL be rejected. | D / A03-adjacent | DAST, unit test |
| SR-5 | Every upload and retrieval SHALL be logged with user identity, document ID, and timestamp. | R / A09 | Log review |
🚪 Threshold Concept: A threat model's deliverable is not a diagram — it is a list of requirements. The diagram and the STRIDE walk are how you discover the requirements; the requirements are how the discovery changes the software. A threat model that produces beautiful diagrams and no testable requirements has produced nothing. Once you see threat modeling as a requirements-generation activity, it stops feeling like a security ritual and starts being what it is: the cheapest place in the entire SDLC to prevent a vulnerability — at the whiteboard, before the code exists.
Notice what just happened. Without writing any code, the team eliminated an insecure design (A04) — the original sketch had no authorization check on retrieval at all, which would have shipped as a textbook IDOR — and produced five requirements that SAST, DAST, and code review can each verify. This is the SSDLC closing its loop: threat modeling (design) generates requirements; secure coding and SAST (code) implement and check them; DAST (test) confirms them on the running system; logging (operate) feeds the SOC. Every gate in Figure 12.1 now has something concrete to do.
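Two of the requirements translate almost directly into code and tests. The sketch below is one possible shape for SR-2 and SR-4, under stated assumptions: the `Document` record, the in-memory `DOCUMENTS` store, the allowlisted types, and the 10 MiB cap are all hypothetical, applicants and officers share one ID space for brevity, and a real implementation would inspect file content rather than trust a client-declared type.

```python
from dataclasses import dataclass

ALLOWED_TYPES = {"application/pdf", "image/png", "image/jpeg"}  # assumed allowlist
MAX_BYTES = 10 * 1024 * 1024                                    # assumed 10 MiB cap


@dataclass(frozen=True)
class Document:
    doc_id: int
    applicant_id: int
    assigned_officer_id: int


# Hypothetical document store, keyed by document ID.
DOCUMENTS = {
    101: Document(101, applicant_id=1, assigned_officer_id=50),
    102: Document(102, applicant_id=2, assigned_officer_id=51),
}


class Forbidden(Exception):
    """Raised when the authenticated user may not access the document."""


def check_upload(content_type, size_bytes):
    """SR-4: accept only allowlisted types within the size limit."""
    if content_type not in ALLOWED_TYPES:
        raise ValueError("file type not allowed")
    if not 0 < size_bytes <= MAX_BYTES:
        raise ValueError("file size out of bounds")


def get_document(session_user_id, doc_id):
    """SR-2: authorize against the authenticated identity on every retrieval.

    session_user_id must come from the server-side session, never the request.
    """
    doc = DOCUMENTS.get(doc_id)
    allowed = doc is not None and session_user_id in (
        doc.applicant_id,
        doc.assigned_officer_id,
    )
    if not allowed:
        # Same error for "missing" and "not yours": do not reveal which IDs exist.
        raise Forbidden("not authorized for this document")
    return doc


print(get_document(50, 101))
```

The abuse test for SR-2 is the IDOR itself: officer 50 asks for document 102 by changing the ID, and the request must fail. Writing the requirement as a test is what makes it verifiable rather than advisory.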
⚠️ Common Pitfall: Saving threat modeling for "important" features and skipping it for "simple" ones. The loan-document upload looked simple — "just let them attach a file." Simple features routinely hide the worst flaws precisely because no one thought they warranted scrutiny; the IDOR in SR-2 would have been trivially exploitable and catastrophic, on a feature a team might have built in an afternoon without a second thought. Threat-model anything that crosses a trust boundary or touches sensitive data, regardless of how small it seems.
🔄 Check Your Understanding: 1. What do the letters of STRIDE stand for, and what is the purpose of walking through them rather than just "brainstorming threats"? 2. Why is the deliverable of a threat model a set of security requirements rather than a diagram, and what makes a security requirement different from advice like "validate input"?
Answers
1. Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege. Walking the six categories systematically ensures you do not overlook a whole class of threat (free-form brainstorming tends to find the threats you already expect and miss the ones you don't).
2. Because the diagram only helps you discover threats; the software changes only when threats become testable requirements that a developer builds and a tester verifies. A security requirement is specific and verifiable ("the server SHALL verify the requester is authorized for that specific document"), whereas "validate input" cannot be built or confirmed as written.
Project Checkpoint
Two deliverables advance for Meridian this chapter: a section of the security program, and a new bluekit module.
Program increment — the Secure SDLC policy. Dana asked Elena Vasquez and Sam Whitfield to turn this chapter's practices into a one-page Secure Software Development Lifecycle policy — the artifact a CISO can hand an auditor and a development manager alike. It is deliberately short and mandatory, mapping each SDLC phase to a required security activity (the gates of Figure 12.1):
- Design: every feature crossing a trust boundary or touching customer/cardholder data SHALL be threat-modeled (STRIDE) and its threats recorded as security requirements before implementation.
- Code: developers SHALL follow the secure-coding standard (server-side input validation by allowlist; context-correct output encoding; parameterized queries; no hard-coded secrets), and SAST SHALL run on every commit.
- Build: SCA SHALL inventory all direct and transitive dependencies and fail the build on a known-critical vulnerable component; a secrets scan SHALL run on the repository.
- Test: DAST SHALL run against a deployed build before release; security requirements SHALL be verified.
- Deploy/Operate: security-relevant events SHALL be logged to the SIEM (Chapter 21); dependency advisories SHALL be monitored continuously and triaged per the vulnerability-management policy (Chapter 23).
This policy is the spine that Chapter 13 (web-app controls), Chapter 23 (vuln management), Chapter 29 (supply chain), and Chapter 31 (the CI/CD pipeline that automates these gates) all attach to. Meridian's developers are now, formally, the first line of defense — with a process that makes it achievable.
bluekit increment — appsec.py with scan_dependencies(reqs). Log4Shell's lesson was "know what you run." So the toolkit's first AppSec function is a tiny dependency checker: given a list of pinned dependencies and a (toy) advisory database, it flags components running a known-vulnerable version — the SCA idea in miniature. As always, the code is illustrative and hand-traced; nothing is executed during authoring.
# bluekit/appsec.py — Chapter 12 increment
"""Application-security helpers. Chapter 12 adds scan_dependencies();
Chapter 13 will add an illustrative taint_demo(). Code is never executed here —
every block shows its hand-traced expected output."""

# A toy advisory feed: {package: (max_vulnerable_version_inclusive, advisory_id)}.
# In reality an SCA tool queries live databases (NVD, GitHub Advisories) over the
# FULL transitive dependency tree. This shows the idea, not a production scanner.
KNOWN_VULNERABLE = {
    "log4j-core": ("2.14.1", "CVE-2021-44228"),  # Log4Shell: fixed in 2.15.0+
    "openssl": ("1.0.2u", "CVE-2016-2107"),
}


def _ver_tuple(v):
    """Turn '2.14.1' into (2, 14, 1) for comparison; non-numeric parts -> 0."""
    return tuple(int(p) if p.isdigit() else 0 for p in v.split("."))


def scan_dependencies(reqs):
    """reqs: list of 'package==version' strings (a pinned requirements list).
    Return findings for any package at or below a known-vulnerable version."""
    findings = []
    for line in reqs:
        if "==" not in line:
            continue  # skip unpinned lines (a finding of its own!)
        pkg, ver = (p.strip() for p in line.split("==", 1))
        vuln = KNOWN_VULNERABLE.get(pkg.lower())
        if vuln and _ver_tuple(ver) <= _ver_tuple(vuln[0]):
            findings.append((pkg, ver, vuln[1]))  # (package, version, advisory)
    return findings


if __name__ == "__main__":
    requirements = [
        "log4j-core==2.14.1",  # vulnerable (Log4Shell)
        "log4j-core==2.17.1",  # patched -> not flagged
        "flask==2.0.1",        # not in our toy feed
        "openssl==1.0.2k",     # vulnerable (older than 1.0.2u)
    ]
    for pkg, ver, advisory in scan_dependencies(requirements):
        print(f"VULNERABLE  {pkg}=={ver}  ({advisory})")
    # Expected output (hand-traced):
    # VULNERABLE  log4j-core==2.14.1  (CVE-2021-44228)
    # VULNERABLE  openssl==1.0.2k  (CVE-2016-2107)
Trace it by hand to be sure: log4j-core==2.14.1 → (2,14,1) <= (2,14,1) is true → flagged with the Log4Shell advisory. log4j-core==2.17.1 → (2,17,1) <= (2,14,1) is false → not flagged. flask==2.0.1 → not in the feed → skipped. openssl==1.0.2k → (1,0,0) <= (1,0,0): the final components 2k and 2u are not pure digits, so _ver_tuple maps each of them to 0 and <= holds → flagged. (That is a limitation of the toy: any component carrying a letter suffix collapses to 0, so it cannot tell 1.0.2k from 1.0.2u; a real SCA tool uses a version parser that understands each ecosystem's scheme.) Two findings, exactly as printed. This short function is, in essence, what an SCA tool does — only against the full transitive tree and a live advisory feed, which is the difference between a teaching toy and the thing that would have answered "do we use Log4j?" in an afternoon. Chapter 23 prioritizes these findings; Chapter 29 formalizes the inventory as an SBOM; Chapter 31 wires this check into the build so a vulnerable dependency fails the pipeline.
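Because _ver_tuple maps any non-numeric component to 0, it cannot order letter-suffixed releases such as 1.0.2k and 1.0.2u. One way to tighten the toy is a suffix-aware sort key, sketched below. The `version_key` helper is hypothetical and assumes a simple "digits then optional letters" scheme; it is not a general version parser, and real ecosystems have their own ordering rules (for Python packages, the `packaging` library's `Version` class is the usual choice).

```python
import re

# One dotted component: digits, then an optional run of lowercase letters.
_PART = re.compile(r"(\d+)([a-z]*)")


def version_key(v):
    """Turn '1.0.2k' into ((1, ''), (0, ''), (2, 'k')) for ordered comparison."""
    key = []
    for part in v.lower().split("."):
        m = _PART.fullmatch(part)
        if m is None:
            # Refuse to guess rather than silently treating the part as 0.
            raise ValueError(f"unsupported version component: {part!r}")
        key.append((int(m.group(1)), m.group(2)))
    return tuple(key)


print(version_key("1.0.2k") < version_key("1.0.2u"))
print(version_key("1.0.2u") <= version_key("1.0.2k"))
print(version_key("2.17.1") <= version_key("2.14.1"))
```

Raising on an unrecognized component is a deliberate choice for a security check: an unparseable version should surface as a finding to investigate, not be compared as if it were zero.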
Summary
This chapter put security into the way software is built, with developers as the first line of defense.
- The SSDLC builds security into every phase (design → code → build → test → deploy), not as a pre-launch audit. Shift left: the earlier a defect is caught, the cheaper it is to fix — a design flaw at the whiteboard costs a conversation; the same flaw in production costs an incident. Security is a property of the process that produces software.
- The OWASP Top 10 is a ranked list of categories of application risk (access control, cryptographic failures, injection, insecure design, misconfiguration, vulnerable components, auth failures, integrity failures, logging failures, SSRF) — an awareness and prioritization map, not a pass/fail test. A CWE identifies a specific weakness type; ASVS is the testable standard. Most categories reduce to a few disciplines: validate input, separate code from data, verify integrity, enforce authorization server-side, configure to a baseline, log the right events.
- Input validation (positive/allowlist, server-side, parsed to the expected type) decides what you accept; output encoding (context-specific) makes data inert for its destination interpreter. They are defense in depth for data — run both; neither replaces the other.
- Secrets belong out of source control (a secret in a repo has leaked); dependency risk is the risk in components you import, especially transitive ones. Log4Shell (CVE-2021-44228) is the canonical case: the hard problem was not patching but finding — answering "do we use it, and where?" — which is why you inventory dependencies (toward an SBOM), monitor them against advisories, and can patch fast.
- SAST reads your code at rest (early, points to a line, false-positive-prone); DAST tests the running app from outside (later, fewer false positives, false-negative-prone); SCA inventories third-party components against known vulnerabilities (answers the Log4Shell question). Use all three, plus human review and threat modeling — tools miss design and logic flaws.
- Threat modeling a feature with STRIDE (Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege) finds the design and logic flaws scanners cannot, and its deliverable is a set of security requirements — specific, verifiable statements that close the SSDLC loop from design to test.
- Meridian additions: a one-page Secure SDLC policy; bluekit/appsec.py with scan_dependencies(reqs) (SCA in miniature, hand-traced against Log4Shell).
Spaced Review
Retrieval practice across this chapter and earlier ones — answer before scrolling up.
- (This chapter) Name the three automated AppSec tool families and the one question each answers best. Which one would have told Meridian "we use Log4j, here," and why not the other two?
- (Chapter 3) This chapter relied on two principles from the security-principles chapter: one governs what an application's database account is allowed to do, and one explains why we run input validation and output encoding and (later) a WAF. Name both principles and connect each to its application-security use here.
- (Chapter 1) A SAST scan returns 4,000 findings. Using the risk idea from Chapter 1, explain why "fix all 4,000" is the wrong instruction and what you would do instead — and state, in Chapter 1's terms, what a vulnerability count fails to capture.
Answers
1. SAST (insecure *code patterns* you wrote, at rest), DAST (exploitable *runtime behavior* of the running app), SCA (known-vulnerable *third-party components* you imported). SCA would have answered "we use Log4j, here," because it inventories transitive dependencies against vulnerability databases; SAST examines only your own code, and DAST would catch it only if it happened to trigger the specific flaw.
2. **Least privilege** — the application's database account (and the app itself) should have only the permissions it needs, so a successful injection or compromise can do less; and **defense in depth** — no single data-safety control is trusted alone, so we validate input *and* encode output *and* add a WAF, each designed assuming the others may fail.
3. A raw count treats every finding as equal, but risk is likelihood × impact (Chapter 1): a flagged path that is unreachable or low-impact does not deserve the same attention as an exploitable flaw on a crown-jewel asset. Instead, tune the tool, suppress false positives, and rank by exploitability and asset criticality, fixing the high-confidence, high-severity subset first. A vulnerability count fails to capture *likelihood* and *impact* — the very factors that turn a list of weaknesses into a defensible plan.
What's Next
You now have the program — a lifecycle, a set of patterns, a way to find weakness automatically, and a way to think threats through before you build. What you do not yet have is the close-up on the specific web attacks that the OWASP categories keep gesturing at. Chapter 13 supplies it: it dissects SQL injection and command injection (and the parameterization that stops them), cross-site scripting and the content-security policy that contains it, cross-site request forgery and server-side request forgery, and the session and authentication flaws that haunt every login form — then deploys the web application firewall as a layer of defense in depth and shows you how exploitation looks in the logs your SOC will watch. The patterns from this chapter — validate input, encode output, separate code from data, threat-model the feature — are the foundation; Chapter 13 is where you see them defeat the attacks "that never get old," and where your appsec.py toolkit gains the illustrative taint_demo that makes the source-to-sink idea concrete.