
What is HTML Injection?
HTML Injection is a type of web security vulnerability that occurs when a web application includes untrusted data in the HTML output without proper validation or escaping. This allows an attacker to inject malicious or arbitrary HTML code into a webpage viewed by other users.
Types of HTML Injection
1. Stored HTML Injection (Persistent)
- Definition: The malicious HTML code is permanently stored on the server (e.g., in a database) and is served to users whenever they access the affected page.
- Example Scenario:
  - A user submits a comment containing HTML such as `<img src="x" onerror="alert('Hacked!')">`.
  - The website stores this comment and displays it to every visitor without sanitization.
- Impact: Affects multiple users; ideal for phishing, fake forms, or triggering client-side scripts (if not blocked).
2. Reflected HTML Injection (Non-Persistent)
- Definition: The malicious HTML is embedded in a URL or user input and immediately reflected back in the response.
- Example Scenario:
  - A search query like `?q=<b>hacked</b>` appears unsanitized in the results page.
  - The HTML is not stored but reflected back instantly.
- Impact: Temporary, usually affects users who are tricked into clicking a malicious link.
(Bonus) Client-Side HTML Injection
- Definition: Injection occurs in the browser when JavaScript writes unsafe input into the DOM (e.g., via `innerHTML`).
- Example Scenario:
  document.getElementById("output").innerHTML = decodeURIComponent(location.hash.slice(1));
  - If the URL ends with `#<h1>Injected</h1>`, the fragment is rendered as HTML. The decoding step matters because current browsers generally percent-encode `<` and `>` in `location.hash`, so the raw value alone would show up as encoded text.
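To make the client-side case concrete, here is a minimal sketch in plain JavaScript (it runs in Node, no DOM required) of what a script receives from the URL fragment. The helper name `fragmentPayload` and the `example.test` address are assumptions made for illustration only. The closing comments show the two DOM sinks side by side: the same string is harmless or dangerous depending on where it is written.

```javascript
// Hypothetical helper: recover the text an attacker placed after '#'.
function fragmentPayload(href) {
  const hash = new URL(href).hash;          // includes the leading '#'
  return decodeURIComponent(hash.slice(1)); // undo percent-encoding of < and >
}

const payload = fragmentPayload("https://example.test/page#<h1>Injected</h1>");
console.log(payload); // <h1>Injected</h1>

// In a browser, the outcome depends on which DOM sink receives the value:
//   output.innerHTML   = payload;  // parsed as markup, the heading is rendered
//   output.textContent = payload;  // treated as text, the tags are shown literally
```

Note that `decodeURIComponent` throws on malformed percent sequences, so production code would normally wrap it in a `try`/`catch`.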
Summary Table
Type | Stored? | Triggered by | Affects Who? |
---|---|---|---|
Stored | Yes | Viewing infected page | All users |
Reflected | No | Clicking a crafted URL | Individual victims |
Client-Side | No | Script writing untrusted input into the DOM | Users whose browser processes that input (e.g., via a crafted link) |
How HTML Injection Works
HTML Injection happens when user-supplied input is inserted into a web page’s HTML output without proper validation or escaping. This allows attackers to inject malicious or unexpected HTML code into the page, potentially manipulating the structure or behavior of the site.
Basic Workflow
1. User submits input (e.g., via a form, URL parameter, comment field).
2. Application improperly handles the input: it adds it directly to the HTML response.
3. Injected HTML is rendered by the browser, potentially altering the page or tricking users.
Example
Suppose a web page displays user feedback like this:
<div>
Your message: [user_input]
</div>
Legitimate input:
Hello world!
Becomes:
<div>
Your message: Hello world!
</div>
Malicious input:
<b>Hello!</b><script>alert('Hacked!')</script>
Becomes:
<div>
Your message: <b>Hello!</b><script>alert('Hacked!')</script>
</div>
- The browser renders the `<b>` tag (bold text).
- It may execute the `<script>` tag, leading to XSS if scripts aren't blocked.
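The example above can be reproduced with a few lines of JavaScript. This is only a sketch of the vulnerable pattern: `renderFeedback` is a hypothetical stand-in for whatever server-side template builds the page, and it deliberately concatenates the input without escaping.

```javascript
// Hypothetical template function that inserts user input unescaped (vulnerable).
function renderFeedback(userInput) {
  return `<div>\nYour message: ${userInput}\n</div>`;
}

const benign = renderFeedback("Hello world!");
const hostile = renderFeedback("<b>Hello!</b><script>alert('Hacked!')</script>");

console.log(benign);
console.log(hostile);

// The hostile output contains real tags that a browser will parse as markup.
console.log(hostile.includes("<script>")); // true
```

The function cannot tell the two inputs apart; the difference only appears once a browser parses the returned string.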
🧱 Common Injection Points
- URL parameters (`?q=<b>search</b>`)
- Form fields (comments, names, reviews)
- HTTP headers (like the Referer)
- JavaScript-based DOM updates (`innerHTML`, `document.write()`)
⚠️ What Attackers Can Do
- Modify page layout or content
- Fake login forms (phishing)
- Inject malicious links
- Execute JavaScript (if not filtered, escalating to XSS)
- Trick users into actions by altering the interface
🔐 Why It Happens
- Lack of input validation
- Missing or incorrect output encoding
- Unsafe use of DOM APIs such as `.innerHTML` and `.insertAdjacentHTML()`
🛡️ Prevention Tips
Mitigation Strategy | Description |
---|---|
HTML escaping | Convert `&`, `<`, `>`, `"`, and `'` into entity equivalents (e.g., `<` becomes `&lt;`) |
Input validation | Reject or sanitize unwanted HTML tags or characters |
Use safe DOM methods | Use .textContent or .innerText instead of .innerHTML |
Content Security Policy (CSP) | Restrict inline scripts and dangerous content |
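As a rough illustration of the first and last rows of the table, the sketch below escapes the five HTML-significant characters before interpolation and defines an example CSP header value. The names `escapeHtml`, `renderFeedbackSafely`, and `CSP_HEADER` are illustrative, and the policy string is an assumption that would need tuning for a real application. The escaper covers element content and quoted attribute values only; other contexts (URLs, inline scripts, CSS) need their own encoding.

```javascript
// Entity map for HTML text and quoted-attribute contexts.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace each significant character with its entity.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Hypothetical template that escapes input before inserting it.
function renderFeedbackSafely(userInput) {
  return `<div>Your message: ${escapeHtml(userInput)}</div>`;
}

// Example Content-Security-Policy value (an assumption, not a recommendation).
const CSP_HEADER = "default-src 'self'; script-src 'self'; object-src 'none'";

console.log(renderFeedbackSafely("<b>Hello!</b><script>alert('Hacked!')</script>"));
```

With escaping in place, the browser displays the payload as literal text instead of parsing it. The CSP header acts as a second layer if an escaping step is missed somewhere.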
How to find HTML Injection?
Finding HTML injection involves testing whether user input is reflected in the HTML output of a web page without proper encoding. Here’s a practical step-by-step guide:
✅ 1. Manual Testing
🧪 Step-by-Step:
1. Identify input points:
   - Search boxes
   - Comment forms
   - Contact or feedback forms
   - URL parameters (e.g., `?q=value`)
   - Headers (like Referer or User-Agent)
2. Inject basic HTML code into inputs:
   <b>test</b>
   <h1>hello</h1>
   <img src=x onerror=alert('injected')>
3. Submit the form or load the URL and inspect the response.
   - Right-click > View Page Source
   - Check if your HTML is rendered, not escaped (e.g., bold text is visible).
4. Use browser dev tools (F12) to observe where your input appears in the DOM.
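The inspection step can be partly scripted. The sketch below is one possible approach, not a complete scanner: `classifyReflection` checks whether a response body contains a marker payload verbatim, in entity-escaped form, or not at all, and `probe` shows how it might be combined with the global `fetch` (Node 18+ or a browser). Both function names, the marker value, and the target URL in the comment are assumptions.

```javascript
// Classify how a payload shows up in a response body.
// Assumes the payload contains no quotes, so only &, < and > need checking.
function classifyReflection(body, payload) {
  if (body.includes(payload)) return "raw";           // unescaped: likely injectable
  const escaped = payload
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  if (body.includes(escaped)) return "escaped";       // reflected but encoded
  return "absent";                                    // not reflected on this page
}

// Hypothetical probe: send the payload in a query parameter and classify the reply.
async function probe(baseUrl, param, payload) {
  const url = new URL(baseUrl);
  url.searchParams.set(param, payload);
  const response = await fetch(url);
  return classifyReflection(await response.text(), payload);
}

// Offline demonstration with canned response bodies.
const marker = "<b>hi-7f3a</b>";
console.log(classifyReflection("<p>Results for <b>hi-7f3a</b></p>", marker));
console.log(classifyReflection("<p>Results for &lt;b&gt;hi-7f3a&lt;/b&gt;</p>", marker));
// Example call (not executed here): probe("https://example.test/search", "q", marker)
```

A unique marker such as `hi-7f3a` makes the reflection easy to locate in large pages. A "raw" result still needs a manual look, since the surrounding context (attribute, comment, script block) determines what is actually exploitable.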
🧰 2. Using a Web Proxy (like Burp Suite or OWASP ZAP)
- Intercept requests and insert test payloads into any form fields or parameters.
- Look for:
- Rendered HTML in response
- Changes to page layout
- Alert boxes or broken structure
- Use Burp’s Repeater to test variations systematically.
⚙️ 3. Automated Tools
These tools can help identify possible HTML injection points:
Tool | Purpose |
---|---|
Burp Suite Scanner | Automatic detection of HTML injection and XSS |
OWASP ZAP | Passive/active scan for injections |
Nikto | Scans web servers for known vulnerable files and misconfigurations |
Wapiti, Arachni, etc. | Web app vulnerability scanning |
⚠️ Automated tools often miss logic-based vulnerabilities or custom HTML contexts. Always combine with manual testing.
🔍 4. Payload Examples for Testing
Payload | Expected Behavior |
---|---|
<b>bold</b> | Text appears bold |
<h1>Test</h1> | Large heading shows up |
<img src=x onerror=alert(1)> | JavaScript alert (if XSS works) |
<marquee>scroll</marquee> | Scrolling text (in some browsers) |
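When testing URL parameters, the payloads in the table must be URL-encoded before they are sent. A small sketch of generating one test URL per payload with the standard `URL` API follows; `buildTestUrls`, the `example.test` host, and the `q` parameter are placeholders chosen for illustration.

```javascript
// Payloads taken from the table above.
const PAYLOADS = [
  "<b>bold</b>",
  "<h1>Test</h1>",
  "<img src=x onerror=alert(1)>",
  "<marquee>scroll</marquee>",
];

// Build one URL per payload; searchParams.set handles the percent-encoding.
function buildTestUrls(baseUrl, param) {
  return PAYLOADS.map((payload) => {
    const url = new URL(baseUrl);
    url.searchParams.set(param, payload);
    return url.toString();
  });
}

const testUrls = buildTestUrls("https://example.test/search", "q");
for (const url of testUrls) {
  console.log(url);
}
```

Each generated URL can be opened in a browser or replayed through a proxy, then compared against the expected behavior column.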
🚩 5. Signs of Vulnerability
- Input is rendered as HTML instead of displayed as plain text.
- Layout is broken or modified by your input.
- JavaScript from your input is executed (escalates to XSS).
- Special characters such as `<`, `>`, and `"` appear unescaped in the page source rather than as `&lt;`, `&gt;`, and `&quot;`.
🛑 Note:
HTML Injection is not always exploitable as XSS, but it still poses risks like:
- Phishing
- Fake UI elements
- Clickjacking preparation
- Misleading content