HTML Entity Escaping

Escape HTML special characters so untrusted text is displayed as text and cannot inject markup (XSS)

Explanation

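HTML entity escaping replaces the characters that have special meaning in HTML (<, >, &, " and ') with their entity equivalents (&lt;, &gt;, &amp;, &quot; and &#039;). The browser then renders them as literal text instead of parsing them as markup, which prevents injected input from being interpreted as code.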

Examples

Escape HTML
Input
<script>alert("XSS")</script>
Output
&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;

Unescape HTML
Input
&lt;p&gt;Hello&lt;/p&gt;
Output
<p>Hello</p>

Code Examples

JavaScript
// Basic HTML escape
function escapeHtml(text) {
  const map = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#039;'
  };
  return text.replace(/[&<>"']/g, m => map[m]);
}

// Unescape HTML entities
function unescapeHtml(text) {
  const map = {
    '&amp;': '&',
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&#039;': "'"
  };
  return text.replace(/&(?:amp|lt|gt|quot|#039);/g, m => map[m]);
}

// Using browser DOM (escapes &, < and > only; quotes are left as-is,
// so do not place the result inside an attribute value)
function escapeHtmlDom(text) {
  const div = document.createElement('div');
  div.textContent = text;
  return div.innerHTML;
}

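function unescapeHtmlDom(html) {
  // Parse in a separate, inert document. Assigning untrusted input to
  // div.innerHTML can fire handlers such as <img onerror> even on a detached element.
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.body.textContent;
}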

// Numeric escape (same five characters, emitted as decimal character references)
function escapeHtmlExtended(text) {
  return text.replace(/[&<>"']/g, char => {
    return '&#' + char.charCodeAt(0) + ';';
  });
}

// Named entities
const htmlEntities = {
  nbsp: '\u00A0', // non-breaking space (U+00A0), not a regular space
  copy: '©',
  reg: '®',
  trade: '™',
  euro: '€',
  pound: '£',
  yen: '¥'
};

// Usage
const userInput = '<img src=x onerror="alert(1)">';
const safe = escapeHtml(userInput);
// Result: &lt;img src=x onerror=&quot;alert(1)&quot;&gt;

// Safe to insert into HTML
document.getElementById('output').textContent = userInput; // Safe (no HTML parsing)
document.getElementById('output').innerHTML = safe; // Also safe (escaped)

Python
import html

# Escape HTML
text = '<script>alert("XSS")</script>'
escaped = html.escape(text)
print(escaped)  # &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;

# Unescape HTML
unescaped = html.unescape(escaped)
print(unescaped)  # <script>alert("XSS")</script>

# Custom escape function
def escape_html(text):
    return (text
        .replace('&', '&amp;')
        .replace('<', '&lt;')
        .replace('>', '&gt;')
        .replace('"', '&quot;')
        .replace("'", '&#039;'))

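# For templates, enable Jinja2 autoescaping explicitly
# (a bare Template does not autoescape by default)
from jinja2 import Template
template = Template('<p>{{ user_input }}</p>', autoescape=True)
safe_html = template.render(user_input='<script>alert(1)</script>')
# safe_html: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>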
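
Named and numeric references decode to the same character, which can be checked with the standard library. The following is a minimal sketch; the helper names to_numeric_entity and to_named_entity are hypothetical and exist only for illustration.

```python
import html
from html.entities import codepoint2name


def to_numeric_entity(char: str) -> str:
    """Return the decimal numeric reference for one character."""
    return f"&#{ord(char)};"


def to_named_entity(char: str) -> str:
    """Return a named reference if HTML 4 defines one, else fall back to numeric."""
    name = codepoint2name.get(ord(char))
    return f"&{name};" if name else to_numeric_entity(char)


print(to_named_entity("©"))    # &copy;
print(to_numeric_entity("©"))  # &#169;

# Named, decimal and hexadecimal forms all decode to the same character.
print(html.unescape("&copy;") == html.unescape("&#169;") == html.unescape("&#xA9;"))  # True
```

Numeric references work for any code point, while named ones are easier to read but exist only for a fixed set of characters.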

💡 Tips

  • Always escape user input before displaying in HTML
  • Use framework auto-escaping (React, Vue, Angular)
  • Prefer textContent over innerHTML: textContent never parses its value as HTML
  • Escape for context (HTML, attribute, JavaScript)
  • Use DOMPurify for rich text/HTML
  • Named entities (&copy;) and numeric references (&#169;) render the same character; numeric references work for any code point
  • Essential for XSS prevention
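
The "escape for context" tip can be sketched as follows. This is an illustrative example under stated assumptions, not a complete defense: js_literal is a hypothetical helper, and the /search?q= URL is a placeholder. Each output context gets its own encoding.

```python
import html
import json
from urllib.parse import quote

user_input = '"><script>alert(1)</script>'

# HTML body text: entity-escape the input.
body = f"<p>{html.escape(user_input)}</p>"

# Quoted attribute value: quotes must be escaped as well
# (html.escape does this when quote=True, which is its default).
attr = f'<input value="{html.escape(user_input, quote=True)}">'

# URL query component: percent-encode first, then entity-escape for the attribute.
link = f'<a href="/search?q={html.escape(quote(user_input, safe=""))}">search</a>'


def js_literal(value: str) -> str:
    """JSON-encode a string and neutralise <, > and & so that
    a "</script>" inside the data cannot close the script block."""
    return (json.dumps(value)
            .replace("<", "\\u003c")
            .replace(">", "\\u003e")
            .replace("&", "\\u0026"))


# Inline script context: HTML entities are not decoded here, so use a JS string literal.
script = f"<script>const q = {js_literal(user_input)};</script>"

for fragment in (body, attr, link, script):
    print(fragment)
```

HTML entity escaping alone is only appropriate for the first two contexts; the URL and script contexts need their own encoding on top of it or instead of it.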

⚠️ Common Pitfalls

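  • Inserting unescaped user input into HTML allows XSS attacks
  • Double escaping displays literal entities (&amp;lt; is shown as &lt; instead of <)
  • innerHTML parses its value as HTML, so unescaped or partially escaped input can still execute
  • Context matters (HTML vs JS vs CSS): HTML entity escaping does not protect script or style contexts
  • Some frameworks auto-escape; escaping manually as well causes double escaping
  • An incomplete character set (for example, leaving quotes unescaped) leaves vulnerabilities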
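
Two of these pitfalls are easy to reproduce with Python's standard html module. The sketch below uses made-up sample strings; it shows double escaping and what happens when quotes are left out of the escaped character set.

```python
import html

raw = 'Tom & "Jerry" <b>'

once = html.escape(raw)
twice = html.escape(once)  # the & of every entity gets escaped again

print(once)   # Tom &amp; &quot;Jerry&quot; &lt;b&gt;
print(twice)  # Tom &amp;amp; &amp;quot;Jerry&amp;quot; &amp;lt;b&amp;gt;

# A browser decodes entities once, so the double-escaped text
# is displayed as the single-escaped string, entities and all.
print(html.unescape(twice) == once)  # True

# Incomplete character set: with quote=False the quotes survive,
# so the payload breaks out of the attribute value.
payload = '" onmouseover="alert(1)'
partial = html.escape(payload, quote=False)
full = html.escape(payload)

print(f'<input value="{partial}">')  # <input value="" onmouseover="alert(1)">
print(f'<input value="{full}">')     # <input value="&quot; onmouseover=&quot;alert(1)">
```

Escaping exactly once, at the point of output, with the full character set for the target context avoids both problems.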