> "The web was designed to be open, interconnected, and accessible. Every one of those qualities is also a potential attack vector." --- Jeff Williams, Co-Founder of OWASP
Learning Objectives
- Understand modern web application architecture and its attack surfaces
- Master the OWASP Top 10 (2021) vulnerability categories
- Analyze HTTP requests and responses at a protocol level
- Configure and operate Burp Suite for web application testing
- Perform web application reconnaissance and site mapping
- Implement input validation and output encoding defenses
In This Chapter
- 18.1 Modern Web Application Architecture
- 18.2 The OWASP Top 10 (2021)
- 18.3 HTTP in Depth
- 18.4 Burp Suite Setup and Configuration
- 18.5 Web Application Reconnaissance and Mapping
- 18.6 Input Validation and Output Encoding
- 18.7 Web Application Firewalls (WAF)
- 18.8 Web Application Testing Methodology
- 18.9 MedSecure Portal: Web Application Security Considerations
- 18.10 Summary
- Chapter 18 References
Chapter 18: Web Application Security Fundamentals
Web applications are the dominant interface between organizations and their users, handling everything from e-commerce transactions to healthcare records. They are also, overwhelmingly, the most targeted attack surface in modern cybersecurity. According to Verizon's Data Breach Investigations Report, web application attacks account for approximately 26% of all breaches---more than any other attack pattern. Understanding how web applications work, where they break, and how to test them ethically is not merely a specialization within penetration testing; it is the foundation upon which most modern security assessments are built.
This chapter establishes that foundation. We will dissect web application architecture to understand the components an attacker can reach. We will study the OWASP Top 10 as a structured taxonomy of the most critical web risks. We will go deep into HTTP---the protocol that carries every web interaction---because you cannot test what you do not understand at the wire level. We will set up Burp Suite, the industry-standard intercepting proxy, and learn to use it as an extension of our hands. Finally, we will explore web application reconnaissance: the methodical process of mapping an application's structure, technology stack, and entry points before a single exploit is attempted.
By the end of this chapter, you will see web applications not as polished user interfaces, but as layered systems of trust boundaries, input channels, and state management---each layer presenting opportunities for authorized security testing.
18.1 Modern Web Application Architecture
Before you can test a web application, you must understand what you are testing. Modern web applications are far more complex than the static HTML pages of the early web. They are distributed systems with multiple tiers, each with its own attack surface.
18.1.1 The Three-Tier Architecture
The canonical web application architecture consists of three logical tiers:
Presentation Tier (Client Side): This is what runs in the user's browser. In our running example, ShopStack uses a React frontend---a single-page application (SPA) that renders the user interface, manages client-side state, and communicates with the backend via API calls. The presentation tier includes HTML, CSS, JavaScript, and increasingly, WebAssembly. From a security perspective, everything in this tier is under the attacker's control. The browser is hostile territory.
Application Tier (Server Side): ShopStack's Node.js API server handles business logic: authenticating users, processing orders, calculating prices, enforcing access controls. This tier receives requests from the presentation tier, validates them (or fails to), and interacts with the data tier. It is where most application-level vulnerabilities manifest---injection flaws, broken authentication, broken access control.
Data Tier (Backend): ShopStack uses PostgreSQL for relational data (users, orders, products) and might use Redis for session caching, Elasticsearch for product search, and S3 for file storage. The data tier should never be directly accessible from the internet, but misconfigurations and injection attacks can bridge that gap.
18.1.2 The Expanded Modern Stack
Real-world applications extend well beyond three tiers. Consider ShopStack's full deployment on AWS:
[Browser/Mobile App]
|
[CloudFront CDN]
|
[AWS WAF]
|
[Application Load Balancer]
|
[ECS/Fargate Containers]
/ | \
[Node.js] [Node.js] [Node.js] (Application instances)
\ | /
[Internal ALB]
|
[PostgreSQL RDS] [Redis ElastiCache] [S3 Buckets]
Each component introduces potential vulnerabilities:
- CDN (CloudFront): Cache poisoning, origin misconfiguration
- WAF (AWS WAF): Bypass techniques, rule gaps
- Load Balancer: HTTP request smuggling, header injection
- Containers: Escape vulnerabilities, misconfigured orchestration
- Application Servers: The full OWASP Top 10
- Databases: Injection, misconfigurations, excessive privileges
- Object Storage: Public bucket exposure, insecure direct object references
18.1.3 API-Driven Architecture
Modern applications increasingly separate their frontend and backend through APIs. ShopStack exposes a RESTful API:
GET /api/v2/products # List products
GET /api/v2/products/:id # Get specific product
POST /api/v2/orders # Create order
GET /api/v2/users/me # Get current user profile
PUT /api/v2/users/me # Update profile
DELETE /api/v2/admin/users/:id # Admin: delete user
API-driven architectures shift the attack surface. Instead of testing HTML forms, you are testing JSON endpoints. Instead of session cookies alone, you may encounter JWT tokens, API keys, or OAuth flows. The MedSecure patient portal uses a similar API architecture, but with additional FHIR-compliant healthcare endpoints that add complexity.
Key Concept
Every API endpoint is a potential entry point. During testing, you must enumerate all endpoints---not just those the frontend uses, but also deprecated versions (v1), admin endpoints, and debug routes that may have been left exposed.
18.1.4 Session and State Management
HTTP is stateless, but web applications are not. They maintain state through several mechanisms:
- Cookies: Server-set values sent with every request. Session cookies (e.g., connect.sid in Express) link requests to server-side session data.
- JWT (JSON Web Tokens): Self-contained tokens that carry user identity and claims. ShopStack uses JWTs stored in localStorage for API authentication.
- URL Parameters: Query strings or path parameters that carry state (dangerous if containing sensitive data).
- Hidden Form Fields: Values embedded in HTML forms that the server expects to receive back.
- Local/Session Storage: Browser-side storage accessible via JavaScript.
Each state management mechanism has distinct security implications. Cookies can be stolen via XSS if not marked HttpOnly. JWTs can be forged if the signing key is weak or the algorithm is manipulable. URL parameters leak through referrer headers and browser history.
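A JWT's header and payload are only base64url-encoded, not encrypted, so anyone holding the token can read its claims. The following is a minimal sketch of that point in Node.js. The `decodeJwt` helper and the sample claims are hypothetical, and the `base64url` Buffer encoding assumes a reasonably recent Node.js release.

```javascript
// Decode (NOT verify) a JWT to show that its claims are readable by anyone.
// decodeJwt is a hypothetical helper for illustration only.
function decodeJwt(token) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('Expected header.payload.signature');
  }
  const decodePart = (part) =>
    JSON.parse(Buffer.from(part, 'base64url').toString('utf8'));
  return { header: decodePart(parts[0]), payload: decodePart(parts[1]) };
}

// Build a sample token locally; the signature is a placeholder, not a real MAC.
const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const sampleToken = [
  encode({ alg: 'HS256', typ: 'JWT' }),
  encode({ sub: 42, role: 'customer' }),
  'placeholder-signature',
].join('.');

console.log(decodeJwt(sampleToken));
```

Decoding is not verification: a server must still check the signature before trusting any claim, which is why weak signing keys or a manipulable algorithm field are so damaging.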
18.1.5 Trust Boundaries
A trust boundary exists wherever data crosses from one trust level to another. In ShopStack:
- Browser to Server: The most critical boundary. All user input crosses here.
- Server to Database: SQL queries constructed from user input cross here.
- Server to External APIs: Payment processor calls, email service integrations.
- Between Microservices: Internal service-to-service calls that may lack authentication.
- Server to File System: File uploads, log writes, template rendering.
Blue Team Perspective: Defense-in-depth means validating data at every trust boundary, not just at the perimeter. ShopStack should validate input at the API gateway, again in the application logic, and again through database constraints. If any single layer fails, the others should catch the attack.
18.2 The OWASP Top 10 (2021)
The Open Web Application Security Project (OWASP) Top 10 is the most widely recognized document for web application security awareness. Updated periodically (most recently in 2021), it represents a broad consensus on the most critical web application security risks. Understanding the Top 10 is essential not just for testing, but for communicating findings to developers and management.
18.2.1 A01:2021 --- Broken Access Control
Moved from #5 to #1. This is now the most common critical web vulnerability. Broken access control occurs when users can act outside their intended permissions.
Examples in ShopStack:
- A regular user accessing /api/v2/admin/users by simply changing the URL
- Modifying the user_id parameter in a request to view another user's orders
- Accessing the order management API without being an authenticated merchant
Testing Approach:
- Test every endpoint with different privilege levels
- Attempt Insecure Direct Object Reference (IDOR) by manipulating identifiers
- Check for missing function-level access controls on admin endpoints
- Verify that CORS policies are properly restrictive
Key Statistic: In the data behind the 2021 list, 94% of applications were tested for some form of broken access control, with an average incidence rate of 3.81%.
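On the defensive side, IDOR testing is looking for a missing object-level check like the one sketched below. This is an illustration only: the `canViewOrder` function and the `userId` and `role` field names are assumptions, not ShopStack's actual code.

```javascript
// Hypothetical object-level authorization check: a user may read an order
// only if they own it or hold the admin role. Field names are assumptions.
function canViewOrder(user, order) {
  if (!user || !order) return false; // deny by default
  if (user.role === 'admin') return true;
  return order.userId === user.id;
}

const alice = { id: 42, role: 'customer' };
const orderOfBob = { id: 1001, userId: 43 };
console.log(canViewOrder(alice, orderOfBob)); // false
```

When a tester changes an order identifier and still receives data, a check of this kind is either absent or applied to the wrong identity.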
18.2.2 A02:2021 --- Cryptographic Failures
Previously "Sensitive Data Exposure." This category focuses on failures related to cryptography that lead to exposure of sensitive data.
Examples in ShopStack:
- Transmitting credit card data over HTTP instead of HTTPS
- Using MD5 or SHA-1 for password hashing instead of bcrypt/Argon2
- Storing API keys in plaintext in the database
- Using weak or default TLS configurations
Testing Approach:
- Verify TLS configuration using tools like testssl.sh
- Check for sensitive data in URLs, logs, or error messages
- Examine password storage mechanisms
- Test for sensitive data transmitted in cleartext
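To make the password storage point concrete, here is a sketch of salted, deliberately slow password hashing. It uses scrypt rather than the bcrypt or Argon2 named above, only because scrypt ships with Node.js's built-in `crypto` module. The `hashPassword` and `verifyPassword` helpers and the `salt:hash` storage format are assumptions for illustration.

```javascript
const crypto = require('crypto');

// Hash a password with a random per-user salt using scrypt (Node.js built-in).
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(password, salt, 64);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return crypto.timingSafeEqual(actual, expected);
}

const stored = hashPassword('correct horse battery staple');
console.log(verifyPassword('correct horse battery staple', stored)); // true
console.log(verifyPassword('P@ssw0rd', stored));                     // false
```

A database dump that contains unsalted 32- or 40-character hex strings (MD5 or SHA-1) instead of values like these is a finding in its own right.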
18.2.3 A03:2021 --- Injection
Dropped from #1 to #3 as frameworks have improved default protections, but still critically dangerous. Injection occurs when untrusted data is sent to an interpreter as part of a command or query.
Types Covered in Chapter 19:
- SQL Injection
- NoSQL Injection
- Command Injection
- LDAP Injection
- Template Injection
ShopStack Example:
# Vulnerable product search
GET /api/v2/products?search=laptop' OR '1'='1
18.2.4 A04:2021 --- Insecure Design
New category in 2021. This represents a fundamental shift in thinking---some vulnerabilities exist because the application was designed insecurely, not because it was implemented incorrectly.
Examples:
- A password recovery flow that reveals whether an email is registered
- A shopping cart that trusts client-side price calculations
- No rate limiting on authentication endpoints
- Insufficient anti-automation on critical business flows
Why This Matters: You cannot fix insecure design with a patch. It requires rethinking the architecture.
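The shopping cart example can be sketched as follows: a design that never trusts a client-supplied price has nothing to tamper with. The `catalog` map, the SKU names, and the `computeOrderTotal` function are hypothetical.

```javascript
// Hypothetical server-side catalog; in practice prices come from the database.
const catalog = new Map([
  ['sku-laptop', 99900], // prices in cents to avoid floating-point drift
  ['sku-mouse', 2500],
]);

// Compute the order total from server-side prices only.
// Any price field the client sends is deliberately ignored.
function computeOrderTotal(items) {
  let total = 0;
  for (const { sku, quantity } of items) {
    const unitPrice = catalog.get(sku);
    if (unitPrice === undefined) throw new Error(`Unknown SKU: ${sku}`);
    if (!Number.isInteger(quantity) || quantity < 1) {
      throw new Error(`Invalid quantity for ${sku}`);
    }
    total += unitPrice * quantity;
  }
  return total;
}

// The client claims the laptop costs 1 cent; the server does not care.
const tampered = [{ sku: 'sku-laptop', quantity: 1, price: 1 }];
console.log(computeOrderTotal(tampered)); // 99900
```

The quantity check matters as much as the price lookup: negative or fractional quantities are a classic way to manipulate totals.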
18.2.5 A05:2021 --- Security Misconfiguration
Expanded to include XML External Entities (XXE). This covers improper configuration at any level of the application stack.
ShopStack Examples:
- Default credentials on the PostgreSQL admin interface
- Directory listing enabled on the static file server
- Stack traces exposed in production error responses
- Unnecessary HTTP methods enabled (PUT, DELETE on static resources)
- S3 buckets with public read access
18.2.6 A06:2021 --- Vulnerable and Outdated Components
This category covers the use of components with known vulnerabilities: libraries, frameworks, or other software modules. ShopStack's package.json might include dozens of npm dependencies, each a potential vulnerability.
Testing Approach:
- Run npm audit or yarn audit for Node.js dependencies
- Use OWASP Dependency-Check or Snyk
- Check CVE databases for known vulnerabilities in identified versions
- Test for outdated jQuery, Angular, or React versions with known XSS vectors
18.2.7 A07:2021 --- Identification and Authentication Failures
Previously "Broken Authentication." Covers weaknesses in authentication mechanisms.
ShopStack Testing Points:
- Credential stuffing resistance (rate limiting, CAPTCHA)
- Password policy enforcement
- Session fixation after login
- JWT implementation flaws (none algorithm, weak secrets)
- Multi-factor authentication bypass
18.2.8 A08:2021 --- Software and Data Integrity Failures
New category. It covers code and infrastructure that rely on software updates, critical data, and CI/CD pipelines without verifying their integrity.
Examples:
- Auto-update mechanisms without signature verification
- Deserialization of untrusted data
- CI/CD pipeline compromise (supply chain attacks)
- Insecure use of CDN-hosted JavaScript without Subresource Integrity (SRI)
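For the SRI item, the value of an `integrity` attribute is a base64-encoded digest of the exact file contents, prefixed with the hash name. A minimal sketch of computing one, where the `sriHash` helper and the sample script content are assumptions:

```javascript
const crypto = require('crypto');

// Compute a Subresource Integrity value for a script's exact bytes.
function sriHash(content) {
  const digest = crypto.createHash('sha384').update(content).digest('base64');
  return `sha384-${digest}`;
}

// Placeholder content standing in for a CDN-hosted library.
const script = 'console.log("hello from the CDN");';
const integrity = sriHash(script);
console.log(`integrity="${integrity}" crossorigin="anonymous"`);
```

If the CDN later serves even one changed byte, the browser's digest no longer matches and the script is refused. During testing, script tags that load third-party code without an `integrity` attribute are worth noting.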
18.2.9 A09:2021 --- Security Logging and Monitoring Failures
Previously "Insufficient Logging & Monitoring." Without proper logging, attacks go undetected.
What to Test:
- Are login failures logged?
- Are access control failures logged?
- Do logs include sufficient context (IP, timestamp, user, action)?
- Are logs protected from tampering?
- Is there alerting on suspicious patterns?
18.2.10 A10:2021 --- Server-Side Request Forgery (SSRF)
New to the Top 10 in 2021. SSRF occurs when a web application fetches a remote resource without validating the user-supplied URL.
ShopStack Example:
# Image preview feature
POST /api/v2/products/preview
{"image_url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/"}
This request could make the server fetch AWS metadata, exposing IAM credentials. SSRF has been involved in some of the most significant cloud breaches, including the 2019 Capital One breach.
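One common mitigation is to fetch only from an explicit allowlist of hosts. The sketch below assumes a hypothetical image host and helper name, and it is only a first layer: a real deployment would also need to handle redirects and DNS resolution, which this check does not cover.

```javascript
// Hosts the preview feature is allowed to fetch from (hypothetical allowlist).
const ALLOWED_IMAGE_HOSTS = new Set(['images.shopstack.example.com']);

// Return true only for https URLs whose host is explicitly allowlisted.
function isAllowedImageUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'https:') return false;
  if (url.username || url.password) return false; // reject user:pass@host tricks
  return ALLOWED_IMAGE_HOSTS.has(url.hostname);
}

console.log(isAllowedImageUrl('http://169.254.169.254/latest/meta-data/iam/security-credentials/')); // false
console.log(isAllowedImageUrl('https://images.shopstack.example.com/p/42.png')); // true
```

When testing, payloads that probe the gaps in checks like this (embedded credentials, alternate IP notations, open redirects on an allowlisted host) are the natural next step.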
Blue Team Perspective: OWASP Top 10 coverage should be your minimum standard, not your goal. Use it as a communication framework with development teams, but your actual testing should go deeper. Consider OWASP's Application Security Verification Standard (ASVS) for comprehensive coverage.
18.3 HTTP in Depth
HTTP (Hypertext Transfer Protocol) is the language of the web. Every web application test begins and ends with HTTP. To be an effective web application tester, you must be able to read, write, and manipulate HTTP at the raw level.
18.3.1 HTTP Request Structure
An HTTP request consists of four parts:
POST /api/v2/auth/login HTTP/1.1 <-- Request Line
Host: shopstack.example.com <-- Headers begin
Content-Type: application/json
Content-Length: 42
Cookie: session=abc123; tracking=xyz789
Authorization: Bearer eyJhbGciOiJIUzI1NiJ9...
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
Accept: application/json
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
<-- Blank line (CRLF)
{"username":"admin","password":"P@ssw0rd"} <-- Body
The Request Line contains the method, the path (including query string), and the HTTP version. Each component is testable:
- Method: Can you change POST to PUT? Does the server behave differently?
- Path: Are there path traversal possibilities? Hidden endpoints?
- Version: Does HTTP/1.0 vs 1.1 change behavior (useful for request smuggling)?
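To see how mechanical this structure is, here is a simplified parser sketch that splits a raw request into the parts described above. The `parseHttpRequest` function is hypothetical and deliberately ignores edge cases such as repeated headers and chunked bodies.

```javascript
// Split a raw HTTP/1.1 request into request line, headers, and body.
// A simplified sketch: it ignores repeated headers and chunked bodies.
function parseHttpRequest(raw) {
  const separator = raw.indexOf('\r\n\r\n'); // blank line ends the headers
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? '' : raw.slice(separator + 4);
  const [requestLine, ...headerLines] = head.split('\r\n');
  const [method, target, version] = requestLine.split(' ');
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue;
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { method, target, version, headers, body };
}

const raw = [
  'POST /api/v2/auth/login HTTP/1.1',
  'Host: shopstack.example.com',
  'Content-Type: application/json',
  '',
  '{"username":"admin","password":"P@ssw0rd"}',
].join('\r\n');

const parsed = parseHttpRequest(raw);
console.log(parsed.method, parsed.target, parsed.version);
console.log(parsed.headers);
console.log(parsed.body);
```

Real servers and proxies each implement their own version of this parsing, and disagreements between them about where one request ends are what make request smuggling possible.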
18.3.2 HTTP Methods
| Method | Purpose | Security Relevance |
|---|---|---|
| GET | Retrieve resource | Should never modify state; parameters in URL (logged, cached) |
| POST | Submit data | Body not cached; used for state changes |
| PUT | Replace resource | Often used in REST APIs; test for unauthorized updates |
| PATCH | Partial update | May bypass validation that checks full PUT requests |
| DELETE | Remove resource | Test for unauthorized deletion |
| HEAD | GET without body | Useful for recon; may reveal headers without triggering WAF body inspection |
| OPTIONS | Show allowed methods | Reveals CORS configuration and supported methods |
| TRACE | Echo request back | Can enable Cross-Site Tracing (XST) attacks |
| CONNECT | Establish tunnel | Proxy-related; can enable SSRF |
Testing Tip: Always send an OPTIONS request to every endpoint during reconnaissance. The Allow header in the response reveals which methods the server accepts:
HTTP/1.1 200 OK
Allow: GET, POST, PUT, DELETE, OPTIONS
If DELETE is allowed on a resource endpoint, test whether access control is enforced for that method.
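As a small illustration, the Allow header can be turned into a checklist of methods that deserve an access-control test. The `methodsToTest` helper and the choice of which methods count as state-changing are assumptions for this sketch.

```javascript
// Methods that change state and therefore deserve an access-control test.
const STATE_CHANGING = new Set(['POST', 'PUT', 'PATCH', 'DELETE']);

// Parse an Allow header value and return the state-changing methods it lists.
function methodsToTest(allowHeader) {
  return allowHeader
    .split(',')
    .map((m) => m.trim().toUpperCase())
    .filter((m) => STATE_CHANGING.has(m));
}

console.log(methodsToTest('GET, POST, PUT, DELETE, OPTIONS'));
// [ 'POST', 'PUT', 'DELETE' ]
```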
18.3.3 HTTP Response Structure
HTTP/1.1 200 OK <-- Status Line
Date: Fri, 27 Feb 2026 10:30:00 GMT <-- Headers
Content-Type: application/json; charset=utf-8
Set-Cookie: session=def456; HttpOnly; Secure; SameSite=Strict; Path=/
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Content-Security-Policy: default-src 'self'
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Request-Id: 7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d
Server: nginx/1.24.0
Content-Length: 157
{"success":true,"token":"eyJhbGciOiJIUzI1NiJ9...","user":{"id":42,"role":"customer"}}
18.3.4 Status Codes as Intelligence
Status codes reveal application behavior:
| Code | Meaning | Security Insight |
|---|---|---|
| 200 | OK | Request succeeded; baseline response |
| 201 | Created | Resource created; useful for confirming injection |
| 301/302 | Redirect | May lead to open redirect vulnerabilities |
| 400 | Bad Request | Input validation triggered; modify payload |
| 401 | Unauthorized | Authentication required; test bypass |
| 403 | Forbidden | Access denied; test for bypass (different method, headers) |
| 404 | Not Found | Resource doesn't exist; use for directory brute-forcing |
| 405 | Method Not Allowed | This method blocked; try others |
| 429 | Too Many Requests | Rate limiting active; note threshold |
| 500 | Internal Server Error | Server-side error; may indicate injection success |
| 502/503 | Bad Gateway/Unavailable | Backend failure; may indicate DoS potential |
Critical Testing Pattern: The difference between a 403 and a 404 response for non-existent resources can reveal information. If /api/v2/admin/users returns 403 (forbidden) but /api/v2/admin/nonexistent returns 404 (not found), you have confirmed that the users endpoint exists and is protected. A secure application returns the same response for both.
18.3.5 Security-Critical Headers
Request Headers to Manipulate:
- Host: Virtual host routing; test for host header injection
- X-Forwarded-For: IP spoofing behind load balancers
- Referer: CSRF protection bypass if checking referer
- Content-Type: Change from application/json to application/xml to test XXE
- Cookie: Session manipulation, parameter tampering
Response Headers to Verify (Blue Team Checklist):
| Header | Secure Value | Purpose |
|---|---|---|
| Strict-Transport-Security | max-age=31536000; includeSubDomains | Force HTTPS |
| X-Content-Type-Options | nosniff | Prevent MIME sniffing |
| X-Frame-Options | DENY or SAMEORIGIN | Prevent clickjacking |
| Content-Security-Policy | Restrictive policy | Prevent XSS, data exfil |
| X-XSS-Protection | 0 (modern approach) | Deprecated; CSP replaces it |
| Referrer-Policy | strict-origin-when-cross-origin | Control referer leakage |
| Permissions-Policy | Restrict features | Control browser APIs |
| Cache-Control | no-store (for sensitive pages) | Prevent caching of sensitive data |
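A checklist like this is easy to automate. The sketch below audits a plain object of response headers against a handful of the checks above. The `auditSecurityHeaders` function and its loose pass/fail rules are assumptions, and a real policy would be stricter.

```javascript
// Expected headers and a predicate for an acceptable value.
// The checks are intentionally loose; tune them to your own policy.
const HEADER_CHECKS = {
  'strict-transport-security': (v) => /max-age=\d+/i.test(v),
  'x-content-type-options': (v) => v.toLowerCase() === 'nosniff',
  'x-frame-options': (v) => ['DENY', 'SAMEORIGIN'].includes(v.toUpperCase()),
  'content-security-policy': (v) => v.length > 0 && !v.includes("'unsafe-inline'"),
  'referrer-policy': (v) => v.length > 0,
};

// Return a list of findings for missing or weak headers.
function auditSecurityHeaders(headers) {
  const normalized = {};
  for (const [name, value] of Object.entries(headers)) {
    normalized[name.toLowerCase()] = String(value).trim();
  }
  const findings = [];
  for (const [name, isOk] of Object.entries(HEADER_CHECKS)) {
    if (!(name in normalized)) findings.push(`${name}: missing`);
    else if (!isOk(normalized[name])) findings.push(`${name}: weak value`);
  }
  return findings;
}

console.log(auditSecurityHeaders({
  'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
  'Content-Security-Policy': "default-src 'self' 'unsafe-inline'",
  'X-Content-Type-Options': 'nosniff',
}));
```

For the sample input, the function reports a missing X-Frame-Options header, a weak Content-Security-Policy, and a missing Referrer-Policy.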
18.3.6 Cookies Deep Dive
Cookies are the primary session management mechanism and deserve special attention:
Set-Cookie: session=eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjo0Mn0.abc123;
Domain=shopstack.example.com;
Path=/;
Expires=Fri, 06 Mar 2026 10:30:00 GMT;
HttpOnly;
Secure;
SameSite=Strict
Cookie Attributes and Their Security Impact:
| Attribute | Effect | Risk If Missing |
|---|---|---|
| HttpOnly | Not accessible via JavaScript | XSS can steal cookie |
| Secure | Only sent over HTTPS | Cookie exposed on HTTP |
| SameSite=Strict | Not sent on cross-site requests | CSRF attacks possible |
| SameSite=Lax | Sent cross-site on top-level navigations only | Some CSRF vectors remain |
| Domain | Scope of cookie sharing | Overly broad = sibling subdomain access |
| Path | URL path restriction | Rarely effective as security control |
| Expires/Max-Age | Cookie lifetime | Persistent sessions if too long |
Lab Exercise Preview: In your home lab, use Browser Developer Tools (F12 > Application > Cookies) on DVWA to examine cookie attributes. Note which security flags are missing. Then use Burp Suite to intercept and modify cookie values.
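The same attribute check can be scripted against captured Set-Cookie values. In this sketch the `missingCookieFlags` helper is hypothetical, and it looks only for the three protective attributes discussed above.

```javascript
// Report which protective attributes a Set-Cookie header value lacks.
function missingCookieFlags(setCookie) {
  const attributes = setCookie
    .split(';')
    .slice(1) // the first segment is name=value
    .map((a) => a.trim().toLowerCase());
  const missing = [];
  if (!attributes.includes('httponly')) missing.push('HttpOnly');
  if (!attributes.includes('secure')) missing.push('Secure');
  if (!attributes.some((a) => a.startsWith('samesite='))) missing.push('SameSite');
  return missing;
}

console.log(missingCookieFlags('PHPSESSID=abc123; path=/'));
// [ 'HttpOnly', 'Secure', 'SameSite' ]
console.log(missingCookieFlags('session=def456; HttpOnly; Secure; SameSite=Strict; Path=/'));
// []
```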
18.3.7 HTTP/2 and HTTP/3 Considerations
Modern web applications increasingly use HTTP/2 (binary framing, multiplexing, header compression) and HTTP/3 (QUIC-based, UDP transport). While the security fundamentals remain similar, these protocols introduce new attack surfaces:
- HTTP/2 request smuggling: Binary framing can create desynchronization between frontend and backend servers
- HPACK compression attacks: Header compression can leak information (similar to CRIME/BREACH)
- HTTP/3 UDP-based attacks: New amplification and reflection possibilities
Burp Suite handles HTTP/2 transparently, but you should be aware of protocol differences when analyzing raw traffic.
18.4 Burp Suite Setup and Configuration
Burp Suite, developed by PortSwigger, is the single most important tool for web application penetration testing. It functions as an intercepting proxy that sits between your browser and the target, allowing you to inspect, modify, and replay every HTTP request and response. If you learn only one web testing tool, make it Burp Suite.
18.4.1 Architecture and Components
Burp Suite is a platform with multiple integrated tools:
| Component | Purpose | Usage Frequency |
|---|---|---|
| Proxy | Intercept and modify HTTP traffic | Every test |
| Scanner | Automated vulnerability scanning (Pro only) | Most tests |
| Intruder | Automated request manipulation | Frequent |
| Repeater | Manual request modification and replay | Every test |
| Sequencer | Token randomness analysis | Occasional |
| Decoder | Encoding/decoding utility | Frequent |
| Comparer | Diff two responses | Occasional |
| Logger | Full HTTP history | Background |
| Extender | Plugin management | Setup |
18.4.2 Initial Setup
Step 1: Install Burp Suite Community Edition
Download from portswigger.net. Install with default settings. Launch and create a temporary project (Community) or named project (Pro).
Step 2: Configure Browser Proxy
Option A --- FoxyProxy Browser Extension (Recommended):
1. Install FoxyProxy in Firefox or Chrome
2. Add a proxy: Host 127.0.0.1, Port 8080, Type HTTP
3. Name it "Burp Suite" and enable when testing
Option B --- System Proxy:
1. Set system proxy to 127.0.0.1:8080
2. Remember to disable after testing
Step 3: Install Burp CA Certificate
With proxy active, navigate to http://burpsuite. Download the CA certificate. Import it into your browser's certificate store as a trusted root CA. This allows Burp to intercept HTTPS traffic. Only install this certificate in your testing browser.
Step 4: Configure Scope
In Burp, go to Target > Scope. Add your target:
Protocol: HTTPS
Host: shopstack.example.com
Port: 443
File: ^/api/.*
Enable "Use advanced scope control" for precise targeting. Under Proxy > Options, select "And URL is in target scope" for both interception and history. This prevents capturing traffic from unrelated sites.
18.4.3 Essential Burp Workflows
Workflow 1: Passive Reconnaissance
1. Turn intercept OFF (Proxy > Intercept > "Intercept is off")
2. Browse the application normally
3. Review HTTP history (Proxy > HTTP history)
4. Examine the Site Map (Target > Site map)
5. Note API endpoints, parameters, authentication mechanisms
Workflow 2: Request Manipulation
1. Turn intercept ON
2. Perform action in browser (e.g., submit login form)
3. Modify the intercepted request in Burp
4. Forward the modified request
5. Observe the response
Workflow 3: Repeater Testing
1. Right-click any request in HTTP history
2. "Send to Repeater"
3. Modify the request manually
4. Click "Send"
5. Analyze the response
6. Iterate with different payloads
Workflow 4: Intruder Attacks
1. Send request to Intruder
2. Mark payload positions (e.g., the password field)
3. Select attack type (Sniper, Battering Ram, Pitchfork, Cluster Bomb)
4. Configure payload list
5. Start attack
6. Sort results by status code, length, or response time to identify anomalies
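The final Intruder step, spotting the response that differs from the rest, can be sketched in a few lines. The result objects below are a hypothetical export format, not Burp's own, and `findAnomalies` simply treats the most common status and length pair as the baseline.

```javascript
// Flag results whose status or body length differs from the most common pair.
function findAnomalies(results) {
  if (results.length === 0) return [];
  const counts = new Map();
  for (const r of results) {
    const key = `${r.status}:${r.length}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // The most frequent status/length pair is treated as the baseline.
  const [baseline] = [...counts.entries()].sort((a, b) => b[1] - a[1])[0];
  return results.filter((r) => `${r.status}:${r.length}` !== baseline);
}

// Hypothetical results from a password-guessing run.
const results = [
  { payload: 'letmein', status: 401, length: 58 },
  { payload: 'password1', status: 401, length: 58 },
  { payload: 'P@ssw0rd', status: 200, length: 157 },
  { payload: 'qwerty', status: 401, length: 58 },
];
console.log(findAnomalies(results));
```

Here the only anomaly is the `P@ssw0rd` attempt, whose 200 status and longer body stand out from the failed logins.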
18.4.4 Essential Extensions
Install these via Extender > BApp Store:
- Autorize: Automatically tests access control by replaying requests with different session tokens
- Logger++: Enhanced logging with filtering capabilities
- JSON Beautifier: Pretty-print JSON in all Burp tabs
- Param Miner: Discover hidden parameters and headers
- Hackvertor: Advanced encoding and decoding
- Active Scan++: Enhanced scanning capabilities
Blue Team Perspective: Understanding Burp Suite is equally important for defenders. Security teams should use Burp to validate their WAF rules, test their security headers, and verify that access controls work correctly. Many organizations run Burp scans as part of their CI/CD pipeline using Burp's REST API.
18.5 Web Application Reconnaissance and Mapping
Reconnaissance is the most critical phase of web application testing. A thorough recon phase often finds more vulnerabilities than brute-force automated scanning alone. The goal is to build a complete map of the application: its pages, its API endpoints, its parameters, its technology stack, and its trust boundaries.
18.5.1 Technology Stack Identification
Before testing, determine what you are testing:
Passive Identification:
| Source | Information | Example |
|---|---|---|
| HTTP Headers | Server software, framework | Server: nginx/1.24.0, X-Powered-By: Express |
| Cookies | Framework/language | JSESSIONID = Java, PHPSESSID = PHP, connect.sid = Express |
| HTML Source | Framework markers | React root div, Angular ng- attributes, Vue.js v- directives |
| JavaScript Files | Framework/version | react.production.min.js, angular.min.js |
| Error Pages | Stack/version | Detailed stack traces in development mode |
| Favicon Hash | Technology | Known favicon hashes for default installs |
| robots.txt | Hidden paths | Disallowed directories reveal application structure |
| sitemap.xml | Full URL list | Intended for search engines; useful for testers |
Tool: Wappalyzer. A browser extension that automatically identifies technologies. For ShopStack, it would reveal: React, Node.js, Express, nginx, AWS CloudFront, and potentially PostgreSQL if error messages leak.
Command-Line Identification:
# Check HTTP headers
curl -sI https://shopstack.example.com | grep -i "server\|x-powered\|set-cookie"
# Retrieve robots.txt
curl -s https://shopstack.example.com/robots.txt
# Check common files
for file in robots.txt sitemap.xml .well-known/security.txt crossdomain.xml; do
echo "--- $file ---"
curl -s -o /dev/null -w "%{http_code}" "https://shopstack.example.com/$file"
echo
done
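The passive indicators from the table above can also be collected programmatically. This sketch maps a few well-known default cookie names and headers to technology hints. The `fingerprint` function is hypothetical, and the hints are only as reliable as the defaults they rely on, since any of them can be changed or removed by the operator.

```javascript
// Map well-known session cookie names to the stack that sets them by default.
const COOKIE_HINTS = new Map([
  ['JSESSIONID', 'Java'],
  ['PHPSESSID', 'PHP'],
  ['connect.sid', 'Express'],
]);

// Collect technology hints from response headers (plain object, any casing).
function fingerprint(headers) {
  const hints = new Set();
  for (const [name, value] of Object.entries(headers)) {
    const key = name.toLowerCase();
    if (key === 'server' || key === 'x-powered-by') hints.add(String(value));
    if (key === 'set-cookie') {
      const cookieName = String(value).split('=')[0].trim();
      if (COOKIE_HINTS.has(cookieName)) hints.add(COOKIE_HINTS.get(cookieName));
    }
  }
  return [...hints];
}

console.log(fingerprint({
  Server: 'nginx/1.24.0',
  'X-Powered-By': 'Express',
  'Set-Cookie': 'connect.sid=s%3Aabc; Path=/; HttpOnly',
}));
// [ 'nginx/1.24.0', 'Express' ]
```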
18.5.2 Directory and File Discovery
Automated discovery finds resources not linked from the visible application:
Gobuster:
# Directory brute-force
gobuster dir -u https://shopstack.example.com -w /usr/share/wordlists/dirb/common.txt \
-t 50 -o gobuster-results.txt --no-error
# With file extensions
gobuster dir -u https://shopstack.example.com \
-w /usr/share/seclists/Discovery/Web-Content/raft-medium-files.txt \
-x php,asp,aspx,jsp,html,js,json,xml,txt,bak,old,conf \
-t 50 -o gobuster-files.txt
# API endpoint discovery
gobuster dir -u https://shopstack.example.com/api/ \
-w /usr/share/seclists/Discovery/Web-Content/api/api-endpoints.txt \
-t 30
Valuable Files to Find:
| File/Path | Significance |
|---|---|
| .git/ | Source code disclosure via git repository |
| .env | Environment variables with secrets |
| backup.sql, dump.sql | Database dumps |
| web.config, .htaccess | Server configuration |
| /api/swagger.json | API documentation |
| /api/v1/ | Old API versions with fewer protections |
| /debug/, /test/ | Debug interfaces |
| phpinfo.php | PHP configuration disclosure |
| /server-status | Apache status page |
| /actuator/ | Spring Boot management endpoints |
18.5.3 Spidering and Crawling
Automated crawling follows links to build a site map:
Burp Suite Spider:
1. Right-click the target in Site Map
2. "Crawl" (Burp Scanner) or "Spider this host" (older versions)
3. Configure crawl depth and scope
4. Review discovered content in the Site Map tree
Custom Crawling with Python:
For more control, write a custom crawler (see code example example-01-web-crawler.py). Custom crawlers can handle JavaScript rendering, respect rate limits, and extract specific data patterns.
Important Considerations:
- Always set crawl scope to prevent testing unauthorized targets
- Respect robots.txt during authorized tests (but also review it for intelligence)
- Be aware that crawlers can trigger destructive actions (DELETE endpoints, logout links)
- Use authenticated crawling to discover protected content
- Note: JavaScript-heavy SPAs (like ShopStack's React frontend) require a browser-based crawler or tools that execute JavaScript (like Burp's built-in browser)
18.5.4 API Endpoint Enumeration
For API-driven applications, systematic endpoint discovery is crucial:
From JavaScript Source:
# Extract API calls from JavaScript bundles
curl -s https://shopstack.example.com/static/js/main.js | \
grep -oP '["'"'"']/api/[^"'"'"'\s]+["'"'"']' | sort -u
From Swagger/OpenAPI: If the application exposes API documentation, it often lives at one of these common paths:
# Common API documentation paths
curl -s https://shopstack.example.com/api/swagger.json
curl -s https://shopstack.example.com/api/docs
curl -s https://shopstack.example.com/api/v2/openapi.yaml
curl -s https://shopstack.example.com/api-docs
From Traffic Analysis: Browse the application extensively with Burp Proxy running. Every API call the frontend makes will appear in HTTP history. Sort by URL to identify endpoint patterns.
18.5.5 Parameter Discovery
Parameters are the primary input channels for attacks:
Visible Parameters:
- URL query parameters: ?search=laptop&category=electronics&page=2
- POST body parameters: {"username":"admin","password":"secret"}
- Path parameters: /api/v2/products/42 (where 42 is a parameter)
- Cookie values: session=abc123
Hidden Parameters: Use Burp's Param Miner extension or manual testing:
# Arjun - parameter discovery tool
arjun -u https://shopstack.example.com/api/v2/products --stable
Common hidden parameters to test: debug, test, admin, verbose, callback, _method, format, template, id, role.
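Manual testing of these names amounts to adding one candidate at a time and comparing each response to an unmodified baseline. A minimal sketch of generating the probe URLs, where the `buildParamProbes` helper and the probe value `1` are assumptions:

```javascript
// Build one probe URL per candidate parameter so responses can be compared
// against the unmodified baseline request.
function buildParamProbes(baseUrl, names, value = '1') {
  return names.map((name) => {
    const url = new URL(baseUrl);
    url.searchParams.set(name, value);
    return url.toString();
  });
}

const probes = buildParamProbes(
  'https://shopstack.example.com/api/v2/products?search=laptop',
  ['debug', 'admin', '_method'],
);
console.log(probes);
```

A change in status code, response length, or returned fields for any single probe suggests the server is reading that parameter.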
18.5.6 Authentication and Session Analysis
Map the authentication flow completely:
- Registration: What validation exists? Can you create admin accounts?
- Login: What credentials are required? How does the token/session work?
- Session Management: How are sessions maintained? What is the session lifetime?
- Password Reset: What is the reset flow? Are tokens time-limited and single-use?
- Logout: Is the session properly invalidated server-side?
- MFA: If present, can it be bypassed?
For ShopStack, the authentication flow is:
POST /api/v2/auth/register -> Create account
POST /api/v2/auth/login -> Receive JWT
GET /api/v2/users/me -> Authorization: Bearer <JWT>
POST /api/v2/auth/refresh -> Refresh expired JWT
POST /api/v2/auth/logout -> Invalidate refresh token
POST /api/v2/auth/reset -> Password reset request
POST /api/v2/auth/reset/:token -> Password reset completion
Each of these endpoints requires thorough testing.
18.5.7 Creating a Test Plan from Reconnaissance
After reconnaissance, organize findings into a structured test plan:
Target: ShopStack (shopstack.example.com)
Technology: React + Node.js/Express + PostgreSQL on AWS
Authentication:
- JWT-based (HS256)
- Refresh token rotation
- Password reset via email
API Endpoints Discovered: 47
- Public: 12 (product listing, search, categories)
- Authenticated: 28 (orders, profile, cart, wishlist)
- Admin: 7 (user management, product management, reporting)
Input Parameters Identified: 156
- Search/filter: 23
- User input forms: 18
- File upload: 3
- API body parameters: 112
Security Headers: Partial
- HSTS: Present
- CSP: Present but permissive (unsafe-inline)
- X-Frame-Options: Missing
- Cookie flags: HttpOnly present, SameSite=Lax
Priority Test Areas:
1. Access control on admin endpoints (A01)
2. JWT implementation (A07)
3. Search functionality - injection (A03)
4. File upload - arbitrary file upload (A04)
5. CSP bypass for XSS (A03)
18.6 Input Validation and Output Encoding
The root cause of most web application vulnerabilities is the same: the application treats user-supplied data as trusted. Input validation and output encoding are the two fundamental defenses against this entire class of attacks.
18.6.1 The Core Problem
Consider ShopStack's product search:
// VULNERABLE: User input directly concatenated into SQL
app.get('/api/v2/products', (req, res) => {
const query = `SELECT * FROM products WHERE name LIKE '%${req.query.search}%'`;
db.query(query).then(results => res.json(results));
});
The developer assumed req.query.search would contain a product name. An attacker sends:
GET /api/v2/products?search=' UNION SELECT username,password,null,null,null FROM users--
The server constructs:
SELECT * FROM products WHERE name LIKE '%' UNION SELECT username,password,null,null,null FROM users--%'
If the column count and types line up with the products table, the database returns every username and password alongside the product rows.
18.6.2 Input Validation Strategies
Allowlisting (Preferred): Define exactly what valid input looks like and reject everything else:
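// Validate search input (a repeated ?search= parameter arrives as an array, a missing one as undefined)
const searchPattern = /^[a-zA-Z0-9\s\-]{1,100}$/;
const search = req.query.search;
if (typeof search !== 'string' || !searchPattern.test(search)) {
return res.status(400).json({ error: 'Invalid search query' });
}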
Denylisting (Fragile): Block known-bad patterns. This approach is inherently incomplete because attackers will find patterns you have not blocked:
// DON'T rely on this alone
const blocked = /['";-]|UNION|SELECT|DROP|INSERT/i;
if (blocked.test(req.query.search)) {
return res.status(400).json({ error: 'Invalid input' });
}
// Bypass: SEL/**/ECT, Unicode encoding, keywords not on the list (OR, AND), etc.
// (sElEcT is caught here only because of the /i flag)
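To make the fragility concrete, the same denylist can be run against a handful of inputs. This is only a demonstration with made-up sample strings; it shows the filter failing in both directions, rejecting harmless searches while letting suspicious ones through.

```javascript
// The denylist from the text, exercised against a few illustrative inputs.
const blocked = /['";-]|UNION|SELECT|DROP|INSERT/i;

const samples = [
  "O'Brien jacket", // legitimate, but contains a quote
  'wi-fi adapter',  // legitimate, but contains a hyphen
  'x OR 1=1',       // suspicious, yet nothing on the list matches
  'SEL/**/ECT 1',   // passes the filter; impact depends on how later layers treat comments
];

const results = samples.map((input) => ({ input, rejected: blocked.test(input) }));
console.table(results);
```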
Type Validation: Enforce data types strictly:
// Parse and validate numeric input
const productId = parseInt(req.params.id, 10);
if (isNaN(productId) || productId < 1) {
return res.status(400).json({ error: 'Invalid product ID' });
}
Length Validation: Enforce reasonable length limits:
if (req.body.username.length > 50 || req.body.username.length < 3) {
return res.status(400).json({ error: 'Username must be 3-50 characters' });
}
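The strategies above tend to work best when combined into small, reusable validators. The sketch below is one possible arrangement, not ShopStack's actual code; the names validateSearch and validateProductId and the return shape are assumptions. It also tightens the numeric check so that a value like "12abc" is rejected instead of being parsed as 12.

```javascript
// Hypothetical validators combining allowlist, type, and length checks.
const SEARCH_PATTERN = /^[a-zA-Z0-9\s\-]{1,100}$/;

function validateSearch(value) {
  // Reject arrays/objects produced by repeated or bracketed query parameters.
  if (typeof value !== 'string') return { ok: false, error: 'Search must be a string' };
  if (!SEARCH_PATTERN.test(value)) return { ok: false, error: 'Invalid search query' };
  return { ok: true, value };
}

function validateProductId(value) {
  // Require digits only (and a bounded length) before converting to a number.
  if (typeof value !== 'string' || !/^[0-9]{1,9}$/.test(value)) {
    return { ok: false, error: 'Invalid product ID' };
  }
  const id = Number(value);
  return id >= 1 ? { ok: true, value: id } : { ok: false, error: 'Invalid product ID' };
}

console.log(validateSearch('blue t-shirt'));
console.log(validateSearch(['a', 'b']));
console.log(validateProductId('12abc'));
console.log(validateProductId('42'));
```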
18.6.3 Parameterized Queries
The definitive defense against SQL injection is parameterized queries (also called prepared statements). Instead of building SQL strings with user data, you separate the query structure from the data:
// SECURE: Parameterized query
app.get('/api/v2/products', (req, res) => {
const query = 'SELECT * FROM products WHERE name LIKE $1';
const searchParam = `%${req.query.search}%`;
db.query(query, [searchParam]).then(results => res.json(results));
});
The database driver ensures that searchParam is always treated as data, never as SQL code, regardless of its content.
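One detail the parameterized version leaves open: % and _ inside the bound value are still LIKE wildcards, so a search for "%" matches every row. That is not injection, but it can matter for performance and result accuracy. A minimal sketch of escaping them follows, assuming PostgreSQL (where backslash is the default LIKE escape character); escapeLike and buildSearchParam are hypothetical helper names.

```javascript
// Hypothetical helper: escape LIKE metacharacters in user input.
// Assumes PostgreSQL, where backslash is the default LIKE escape character.
function escapeLike(input) {
  return input.replace(/[\\%_]/g, (ch) => '\\' + ch);
}

// Build the bound parameter; the SQL text itself never contains user data.
function buildSearchParam(search) {
  return `%${escapeLike(search)}%`;
}

const sql = 'SELECT * FROM products WHERE name LIKE $1';
const param = buildSearchParam('100%_cotton');
console.log(sql, [param]);
```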
18.6.4 Output Encoding
Output encoding prevents injected data from being interpreted as code when rendered. The encoding must match the output context:
HTML Context:
// Encode for HTML body
function htmlEncode(str) {
return str.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#39;');
}
// <p>Search results for: &lt;script&gt;alert(1)&lt;/script&gt;</p>
JavaScript Context:
// Encode for JavaScript string
function jsEncode(str) {
return str.replace(/\\/g, '\\\\')
.replace(/'/g, "\\'")
.replace(/"/g, '\\"')
.replace(/\n/g, '\\n')
.replace(/\r/g, '\\r');
}
URL Context:
// Encode for URL parameter
const safeParam = encodeURIComponent(userInput);
// Turns <script>alert(1)</script> into %3Cscript%3Ealert(1)%3C%2Fscript%3E
CSS Context:
// CSS encoding for dynamic values
function cssEncode(str) {
return str.replace(/[^a-zA-Z0-9]/g, function(char) {
return '\\' + char.charCodeAt(0).toString(16) + ' ';
});
}
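Matching the encoding to the context also means recognizing contexts where encoding alone is not enough. A user-supplied URL placed in an href attribute is one such case: a javascript: URL survives HTML encoding intact. The sketch below shows one way to handle it with the standard URL parser; the helper name safeHref and the fallback value '#' are assumptions for illustration.

```javascript
// Hypothetical helper: allow only http(s) links before placing a
// user-supplied URL into an href attribute.
function safeHref(input) {
  let url;
  try {
    url = new URL(input); // WHATWG URL parser, global in Node and browsers
  } catch {
    return '#'; // not an absolute URL; fall back to a harmless link
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return '#'; // blocks javascript:, data:, vbscript:, and similar schemes
  }
  return url.href; // normalized form produced by the parser
}

console.log(safeHref('https://shopstack.example.com/p?q=a b'));
console.log(safeHref('javascript:alert(1)'));
console.log(safeHref('not a url'));
```

The returned value should still be HTML-encoded before it is written into the attribute, since the scheme check and the encoding address different problems.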
18.6.5 Content Security Policy (CSP)
CSP is a powerful second line of defense against XSS and data exfiltration: it limits the damage when an encoding mistake slips through. It tells the browser which sources of content are legitimate:
Content-Security-Policy:
default-src 'self';
script-src 'self' 'nonce-abc123';
style-src 'self' 'unsafe-inline';
img-src 'self' data: https://cdn.shopstack.com;
connect-src 'self' https://api.shopstack.com;
font-src 'self' https://fonts.gstatic.com;
frame-src 'none';
base-uri 'self';
form-action 'self';
report-uri /api/csp-report;
CSP Directives for Defense:
| Directive | Purpose |
|---|---|
| default-src | Fallback for all resource types |
| script-src | JavaScript sources; use nonces or hashes, avoid unsafe-inline |
| style-src | CSS sources |
| img-src | Image sources |
| connect-src | AJAX/WebSocket/EventSource targets |
| frame-src | Sources the page may embed in iframes (frame-ancestors is the directive that prevents clickjacking) |
| base-uri | Restricts <base> tag; prevents base tag injection |
| form-action | Restricts form submission targets |
| report-uri / report-to | Where to send violation reports |
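A nonce only helps if it is unpredictable and different for every response; a fixed value such as the abc123 placeholder in the sample policy would be trivially reusable by an attacker. The following sketch shows one way to generate a nonce and assemble a policy string in Node. The helper names generateNonce and buildCsp are hypothetical, and the directive set is a shortened version of the sample policy.

```javascript
const crypto = require('crypto');

// Generate a fresh, unpredictable nonce for each response.
function generateNonce() {
  return crypto.randomBytes(16).toString('base64');
}

// Hypothetical helper: assemble a policy string from a directive map.
function buildCsp(nonce) {
  const directives = {
    'default-src': ["'self'"],
    'script-src': ["'self'", `'nonce-${nonce}'`],
    'frame-src': ["'none'"],
    'base-uri': ["'self'"],
    'form-action': ["'self'"],
  };
  return Object.entries(directives)
    .map(([name, values]) => `${name} ${values.join(' ')}`)
    .join('; ');
}

const nonce = generateNonce();
console.log(buildCsp(nonce));
// The same value must then appear on each allowed tag: <script nonce="...">
```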
18.6.6 Framework-Level Protections
Modern frameworks provide built-in protections:
React (ShopStack Frontend):
- JSX automatically escapes values rendered in templates
- dangerouslySetInnerHTML is explicitly named to discourage use
- Component-based architecture limits global DOM manipulation
Express.js (ShopStack API):
- Helmet middleware adds security headers
- express-validator provides input validation
- express-rate-limit prevents brute force
- csurf provides CSRF protection
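// ShopStack security middleware stack
const express = require('express');
const helmet = require('helmet');
const rateLimit = require('express-rate-limit');
const { body, validationResult } = require('express-validator');
const app = express();
app.use(express.json()); // parse JSON bodies so the validators can read req.body
app.use(helmet()); // set security-related response headers
app.use(rateLimit({ windowMs: 15 * 60 * 1000, max: 100 })); // 100 requests per 15 minutes per client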
app.post('/api/v2/auth/login',
body('username').isEmail().normalizeEmail(),
body('password').isLength({ min: 8 }),
(req, res) => {
const errors = validationResult(req);
if (!errors.isEmpty()) {
return res.status(400).json({ errors: errors.array() });
}
// ... authentication logic
}
);
Blue Team Perspective: Defense in depth for web applications means layering protections: WAF at the edge, input validation in the application, parameterized queries at the database, output encoding in the response, and CSP in the browser. No single layer is sufficient. When performing authorized testing, you are looking for gaps where one or more layers are missing.
18.7 Web Application Firewalls (WAF)
Web Application Firewalls sit in front of web applications and filter malicious requests. Understanding WAFs is important for both testers and defenders.
18.7.1 How WAFs Work
WAFs inspect HTTP traffic using:
- Signature-Based Detection: Known attack patterns (regex matching)
- Anomaly-Based Detection: Deviations from normal traffic patterns
- Machine Learning: Behavioral analysis of request patterns
- Rate Limiting: Throttling excessive requests from single sources
Common WAFs:
- AWS WAF: Integrated with CloudFront and ALB
- Cloudflare WAF: CDN-integrated
- ModSecurity: Open-source, runs with Apache/nginx
- Imperva/Incapsula: Enterprise-grade
- F5 Advanced WAF: Hardware/virtual appliance
18.7.2 WAF Detection
Before testing, determine if a WAF is present:
# Wafw00f - WAF detection tool
wafw00f https://shopstack.example.com
# Manual detection - send obvious attack and check response
curl -s "https://shopstack.example.com/search?q=<script>alert(1)</script>" -o /dev/null -w "%{http_code}"
# 403 with custom page = likely WAF
# Check for WAF headers
curl -sI "https://shopstack.example.com" | grep -i "cf-ray\|x-sucuri\|x-cdn\|server.*cloudflare"
18.7.3 WAF Bypass Techniques (For Authorized Testing)
During authorized penetration tests, you may need to test whether the WAF can be bypassed:
- Case variation: SeLeCt instead of SELECT
- Encoding: URL encoding, double encoding, Unicode
- Comment insertion: SEL/**/ECT
- Alternative syntax: HAVING 1=1 instead of OR 1=1
- HTTP parameter pollution: Duplicate parameters with different values
- Content-Type confusion: Sending JSON payload as form data
- Request smuggling: Exploiting differences between WAF and backend parsing
Important Note: WAF bypass testing must be explicitly authorized in your rules of engagement. Never bypass a WAF on a production system without written permission.
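Encoding-based bypasses work when the WAF and the backend normalize input differently. The toy model below illustrates this for double URL encoding: a filter that decodes once misses a payload that only becomes recognizable after a second decode. It is a deliberately simplified sketch with two made-up signatures, not a description of how any particular WAF product behaves, and it assumes a backend that decodes a second time.

```javascript
// Toy signature check, for illustration only; real rule sets are far larger.
const SIGNATURES = [/<script\b/i, /\bunion\s+select\b/i];

function matchesSignature(value) {
  return SIGNATURES.some((sig) => sig.test(value));
}

// Naive inspection: decode once, then match.
function inspectOnce(raw) {
  return matchesSignature(decodeURIComponent(raw));
}

// Stricter inspection: keep decoding until the value stops changing
// (bounded, so a crafted input cannot force unlimited passes).
function inspectNormalized(raw, maxPasses = 5) {
  let current = raw;
  for (let i = 0; i < maxPasses; i++) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch {
      break; // malformed escape sequence; inspect what we have so far
    }
    if (decoded === current) break;
    current = decoded;
  }
  return matchesSignature(current);
}

const single = '%3Cscript%3Ealert(1)%3C%2Fscript%3E';
const double = '%253Cscript%253Ealert(1)%253C%252Fscript%253E';
console.log(inspectOnce(single), inspectOnce(double), inspectNormalized(double));
```

From the defender's side, the same model explains why consistent normalization before inspection matters as much as the signatures themselves.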
18.8 Web Application Testing Methodology
Bringing everything together, here is a structured methodology for web application testing:
Phase 1: Scope and Authorization
- Confirm target URLs and IP ranges
- Review rules of engagement
- Understand testing windows and restrictions
- Set up dedicated testing environment and tools
Phase 2: Reconnaissance (This Chapter)
- Identify technology stack
- Map application structure
- Enumerate endpoints and parameters
- Analyze authentication mechanisms
- Document trust boundaries
Phase 3: Vulnerability Discovery (Chapters 19-24)
- Test each OWASP Top 10 category
- Focus on input handling (injection, XSS)
- Test access controls systematically
- Check authentication and session management
- Review security configuration
Phase 4: Exploitation and Validation
- Confirm vulnerabilities are exploitable
- Document proof of concept
- Assess business impact
- Chain vulnerabilities where possible
Phase 5: Reporting
- Document all findings with evidence
- Provide risk ratings (CVSS)
- Include remediation recommendations
- Offer retesting after fixes
Applying the Methodology to ShopStack
For ShopStack, your test plan might prioritize:
- Access Control Testing: Can a regular user access admin API endpoints? Can user A view user B's orders?
- JWT Analysis: How are tokens generated? Can the signing key be brute-forced? Is the none algorithm accepted?
- Injection Points: Search functionality, product reviews, user profile updates, file upload names
- Client-Side Attacks: XSS in search results, CSRF on state-changing operations, clickjacking on the checkout page
- Business Logic: Can prices be manipulated? Can order quantities be negative? Can discount codes be reused?
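The business logic questions in this list usually come down to whether the server recomputes values or trusts what the client sends. As a point of reference when testing, here is a sketch of what a resistant implementation might look like; the catalog contents, the priceOrder name, and the quantity limit of 100 are all assumptions for illustration.

```javascript
// Hypothetical server-side catalog; a real application would read this from its database.
const CATALOG = new Map([
  ['sku-100', { priceCents: 1999 }],
  ['sku-200', { priceCents: 4500 }],
]);

// Price an order from trusted data, ignoring any price the client sent.
function priceOrder(items) {
  let totalCents = 0;
  for (const item of items) {
    const product = CATALOG.get(item.sku);
    if (!product) throw new Error(`Unknown SKU: ${item.sku}`);
    // Quantities must be positive whole numbers within a sane bound.
    if (!Number.isInteger(item.quantity) || item.quantity < 1 || item.quantity > 100) {
      throw new Error(`Invalid quantity for ${item.sku}`);
    }
    totalCents += product.priceCents * item.quantity;
  }
  return totalCents;
}

// The client-supplied priceCents of 1 is ignored.
console.log(priceOrder([{ sku: 'sku-100', quantity: 2, priceCents: 1 }]));
```

If tampering with a price or sending a negative quantity changes the order total during a test, the application is likely missing checks of this kind.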
Lab Exercise: Set up DVWA (Damn Vulnerable Web Application) in your home lab. Configure Burp Suite to proxy traffic to DVWA. Complete the following exercises at "Low" security level: (1) Browse DVWA with Burp intercepting and examine the Site Map, (2) Identify all cookies and their attributes, (3) Find hidden directories using Gobuster, (4) Document the technology stack using only HTTP headers and source code.
18.9 MedSecure Portal: Web Application Security Considerations
The MedSecure patient portal presents unique web application security challenges due to the sensitivity of healthcare data and regulatory requirements (HIPAA in the US, GDPR for EU patients).
Architecture Differences from ShopStack:
- Additional authentication layer (patient identity verification)
- Strict session timeouts (auto-logout after 15 minutes of inactivity)
- Audit logging of all data access (who viewed which patient record, when)
- Role-based access control with multiple roles (patient, nurse, doctor, admin)
- Integration with external systems (lab results, pharmacy, insurance)
Testing Priorities:
1. Access Control: Can Patient A view Patient B's records? Can a nurse access admin functions?
2. Session Security: Are sessions properly expired? Can sessions be hijacked?
3. API Security: Are FHIR endpoints properly authenticated? Can bulk data be exported?
4. Data Exposure: Do error messages leak patient data? Are logs sanitized?
5. Integration Security: Are connections to external systems encrypted and authenticated?
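The access control questions above ("Can Patient A view Patient B's records?") are easier to test against a clear model of what should be allowed. The sketch below is one hypothetical policy, since MedSecure's real rules are not specified here; the permission strings, the careTeam field, and the canReadRecord name are all invented for illustration. The key property is that access is denied unless a rule explicitly grants it.

```javascript
// Hypothetical role policy for illustration only.
const ROLE_PERMISSIONS = new Map([
  ['patient', ['record:read:own']],
  ['nurse', ['record:read:assigned']],
  ['doctor', ['record:read:assigned', 'record:write:assigned']],
  ['admin', ['user:manage']],
]);

function canReadRecord(user, record) {
  const perms = ROLE_PERMISSIONS.get(user.role) || [];
  // Patients may read only their own record.
  if (perms.includes('record:read:own') && record.patientId === user.id) return true;
  // Clinical staff may read records of patients on their care team.
  if (perms.includes('record:read:assigned') && record.careTeam.includes(user.id)) return true;
  return false; // deny by default
}

const record = { patientId: 'p1', careTeam: ['n7', 'd3'] };
console.log(canReadRecord({ id: 'p1', role: 'patient' }, record));
console.log(canReadRecord({ id: 'p2', role: 'patient' }, record));
console.log(canReadRecord({ id: 'n7', role: 'nurse' }, record));
console.log(canReadRecord({ id: 'a1', role: 'admin' }, record));
```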
The consequences of web application vulnerabilities in healthcare are severe: HIPAA civil penalties can reach roughly $1.5 million per year for each category of violation (the cap is adjusted for inflation), and breached patient data can never be "unbreached."
18.10 Summary
Web application security fundamentals encompass a broad landscape of knowledge:
- Architecture: Modern web applications are complex distributed systems with multiple tiers, each presenting attack surfaces. Understanding the architecture is prerequisite to effective testing.
- OWASP Top 10: The standard taxonomy of web risks, with Broken Access Control now the most prevalent critical vulnerability. Use it as a communication framework and testing baseline.
- HTTP Protocol: The language of the web. Master it at the raw level---request structure, methods, headers, status codes, cookies---because every attack and defense is expressed in HTTP.
- Burp Suite: The essential intercepting proxy. Configure it properly, learn its workflows, and extend it with BApp Store plugins.
- Reconnaissance: Thorough recon finds more vulnerabilities than automated scanning. Map the technology stack, discover endpoints, enumerate parameters, and analyze authentication flows before attacking.
- Input Validation and Output Encoding: The fundamental defenses against injection and XSS. Allowlisting, parameterized queries, context-aware output encoding, and CSP form the defensive layers.
With this foundation, you are prepared to dive into specific attack categories. Chapter 19 begins with injection attacks---the third most critical OWASP category and historically the most devastating. Chapter 20 explores cross-site scripting and client-side attacks. Both chapters build directly on the HTTP knowledge, Burp Suite skills, and architectural understanding established here.
The key insight of web application security is this: the browser is a universal client that will faithfully execute whatever the server tells it to. The server processes whatever the client sends it. When neither side validates the other's input, vulnerabilities emerge. Your job as an ethical hacker is to find those gaps before malicious actors do.
Chapter 18 References
- OWASP Foundation. "OWASP Top 10:2021." https://owasp.org/Top10/
- PortSwigger. "Web Security Academy." https://portswigger.net/web-security
- Stuttard, D. and Pinto, M. The Web Application Hacker's Handbook, 2nd Edition. Wiley, 2011.
- OWASP Foundation. "Application Security Verification Standard (ASVS) 4.0." https://owasp.org/www-project-application-security-verification-standard/
- Verizon. "2024 Data Breach Investigations Report." https://www.verizon.com/business/resources/reports/dbir/
- Fielding, R. et al. "RFC 9110: HTTP Semantics." IETF, 2022.
- OWASP Foundation. "Testing Guide v4.2." https://owasp.org/www-project-web-security-testing-guide/
- Mozilla Developer Network. "Content Security Policy." https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP