🔍Regex Recipes

HTML Tag Stripper (WARNING)

A basic pattern for removing HTML tags. WARNING: Regex cannot reliably parse HTML; use a proper HTML parser instead.

Pattern

<[^>]+>

Explanation

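Matches a < followed by one or more characters that are not >, then a closing >. IMPORTANT: The pattern knows nothing about HTML structure, so it fails on malformed HTML, comments, CDATA sections, and script content. Use an HTML parser instead.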
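To see exactly what the pattern consumes, the matches can be listed instead of removed. This is a minimal sketch; the names TAG_PATTERN and listTagMatches are illustrative and not part of the original recipe.

```javascript
// Illustrative helper (names are assumptions, not from the recipe):
// returns every substring the pattern treats as a "tag".
const TAG_PATTERN = /<[^>]+>/g;

function listTagMatches(input) {
  // matchAll requires the g flag and yields one match object per hit
  return Array.from(input.matchAll(TAG_PATTERN), (m) => m[0]);
}

console.log(listTagMatches('<div class="test">Text</div>'));
// [ '<div class="test">', '</div>' ]

// Plain text with comparison operators is matched as if it were a tag
console.log(listTagMatches('1 < 2 and 3 > 2'));
// [ '< 2 and 3 >' ]
```

The second call shows why the pattern is not HTML-aware: any < that is eventually followed by a > is consumed, whether or not it starts a real tag.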

Examples

Simple tag
Input
<p>Hello</p>
Output
Strips to: Hello
With attributes
Input
<div class="test">Text</div>
Output
Strips to: Text
Fails on script
Input
<script>if(a<b){}</script>
Output
⚠️ Strips to: if(a (the < in a<b is read as the start of a tag, so <b){}</script> is removed as well)
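The three examples above can be reproduced with a short script. This is a sketch only; stripTagsNaive is an illustrative name for the recipe's pattern wrapped in a function.

```javascript
// The recipe's pattern wrapped in a function (name is illustrative).
function stripTagsNaive(html) {
  return html.replace(/<[^>]+>/g, '');
}

const examples = [
  '<p>Hello</p>',
  '<div class="test">Text</div>',
  '<script>if(a<b){}</script>',
];

for (const input of examples) {
  // JSON.stringify makes the exact output visible, including quotes
  console.log(JSON.stringify(input), '->', JSON.stringify(stripTagsNaive(input)));
}
// "<p>Hello</p>" -> "Hello"
// "<div class=\"test\">Text</div>" -> "Text"
// "<script>if(a<b){}</script>" -> "if(a"
```

In the last case the script body is neither preserved nor removed cleanly: the text after the < in a<b disappears along with the closing tag.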

Code Examples

JavaScript
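// ❌ DON'T USE REGEX FOR HTML
const badApproach = html.replace(/<[^>]+>/g, '');

// ✅ USE A PROPER HTML PARSER (browser)
function stripHTMLSafely(html) {
  // DOMParser builds an inert document: scripts do not run and
  // event-handler attributes such as onerror do not fire.
  // Assigning untrusted input to innerHTML of a live element can
  // trigger such handlers, even while the element is detached.
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.body.textContent || '';
}

// Node.js has no built-in DOM; use a library such as:
// - jsdom
// - htmlparser2
// - cheerio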

💡 Tips

  • In the browser, use DOMParser; in Node.js, use a library (jsdom, cheerio)
  • For sanitization, use DOMPurify
  • Regex is acceptable only for trusted, well-formed markup that you control
  • Test with malformed and malicious input

⚠️ Common Pitfalls

  • CRITICAL: Regex cannot properly parse HTML
  • Fails on: comments, CDATA, doctype, script content
  • Security risk: can miss malicious content (an unclosed tag with no > is never matched)
  • Breaks on < and > in attributes or content
  • Use a proper HTML parser or sanitizer
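Several of the pitfalls listed above can be checked directly. The inputs below are hypothetical test strings chosen for illustration, and stripTagsNaive is an illustrative wrapper around the recipe's pattern.

```javascript
// The recipe's pattern wrapped in a function (name is illustrative).
function stripTagsNaive(html) {
  return html.replace(/<[^>]+>/g, '');
}

// A > inside an attribute value ends the match early,
// so part of the attribute leaks into the output.
console.log(stripTagsNaive('<a title="a>b">link</a>'));
// b">link

// A > inside a comment does the same, leaking the rest of the comment.
console.log(JSON.stringify(stripTagsNaive('<!-- a > b -->text')));
// " b -->text"

// An unclosed tag contains no > at all, so nothing is removed.
console.log(stripTagsNaive('<img src=x onerror=alert(1)'));
// <img src=x onerror=alert(1)
```

The last case is the security-relevant one: the output still contains the opening of a tag with an event handler, which could become live markup if it is later concatenated with text that contains a >.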