🔍 Regex Recipes
HTML Tag Stripper (WARNING)
A basic pattern for removing HTML tags. WARNING: regex cannot reliably parse HTML; use a proper HTML parser instead.
Pattern
<[^>]+>
Explanation
Matches a <, followed by one or more characters other than >, up to the first > that follows. IMPORTANT: this fails on malformed HTML, comments, CDATA sections, and script content. Use an HTML parser instead.
Examples
Simple tag
Input
<p>Hello</p>
Output
Strips to: Hello
With attributes
Input
<div class="test">Text</div>
Output
Strips to: Text
Fails on script
Input
<script>if(a<b){}</script>
Output
⚠️ Breaks on < in script: strips to if(a, leaking part of the script body
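The three examples above can be reproduced with a few lines of JavaScript. This is a minimal sketch for experimenting with the pattern, not a recommended implementation; the function name stripTagsNaive is a hypothetical helper introduced here.

```javascript
// Naive tag stripper built on the pattern from this recipe.
// stripTagsNaive is a hypothetical name used only for illustration.
function stripTagsNaive(html) {
  return html.replace(/<[^>]+>/g, '');
}

const simple = stripTagsNaive('<p>Hello</p>');
const withAttributes = stripTagsNaive('<div class="test">Text</div>');
const withScript = stripTagsNaive('<script>if(a<b){}</script>');

console.log(simple);         // Hello
console.log(withAttributes); // Text
console.log(withScript);     // if(a
```

In the script case the second match starts at the < in a<b and runs to the > of the closing tag, so the leading if(a is left behind as visible text.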
Code Examples
JavaScript
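// ❌ Don't use regex to strip HTML
const badApproach = html.replace(/<[^>]+>/g, '');

// ✅ Use a proper HTML parser (browser)
function stripHTMLSafely(html) {
  // DOMParser builds a separate, inert document, so scripts in the
  // input are not executed while it is parsed. Assigning untrusted
  // input to innerHTML on a regular element can still fire inline
  // handlers such as <img onerror=...>.
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.body.textContent || '';
}

// For Node.js, use libraries like:
// - jsdom
// - htmlparser2
// - cheerio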
💡 Tips
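- Use DOMParser in the browser, or a library (jsdom, cheerio) in Node.js
- For sanitization, use DOMPurify
- Regex is acceptable only for trusted HTML that is known to be simple (no comments, scripts, or > inside attribute values)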
- Test with malformed and malicious input
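One way to act on the last tip is to keep a small table of awkward inputs and run any stripping function against it. The sketch below is illustrative only: naiveStrip, failingCases, and the case table are hypothetical names, and each expected value is an assumption about the text a browser would display for that input.

```javascript
// The regex approach from this recipe, used here as the function under test.
function naiveStrip(html) {
  return html.replace(/<[^>]+>/g, '');
}

// Hypothetical case table. `expected` is an assumption: the text a
// browser would display for each input.
const cases = [
  { name: 'simple tag', input: '<p>Hello</p>', expected: 'Hello' },
  { name: 'comment containing >', input: '<!-- a > b -->text', expected: 'text' },
  { name: 'script body containing <', input: '<script>if(a<b){}</script>', expected: '' },
];

// Returns the names of the cases where stripFn's output differs from `expected`.
function failingCases(stripFn, table) {
  return table
    .filter((c) => stripFn(c.input) !== c.expected)
    .map((c) => c.name);
}

const failures = failingCases(naiveStrip, cases);
console.log(failures);
// [ 'comment containing >', 'script body containing <' ]
```

The same table can be pointed at a parser-based function to check that it handles the cases the regex gets wrong.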
⚠️ Common Pitfalls
- CRITICAL: Regex cannot properly parse HTML
- Fails on: comments, CDATA, doctype, script content
- Security risk: can miss malicious content
- Breaks on > inside attribute values and on unescaped < or > in text content
- Use a proper HTML parser or sanitizer
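Several of the pitfalls above are easy to observe directly. The following is a small sketch, with hypothetical variable names, that runs the recipe's pattern over three inputs chosen to trigger them.

```javascript
// The pattern from this recipe.
const TAG_PATTERN = /<[^>]+>/g;

// A > inside an attribute value ends the match early.
const attrResult = '<a title="a > b">link</a>'.replace(TAG_PATTERN, '');

// Unescaped < and > in text are matched as if they were one tag,
// so the content between them is lost.
const textResult = '<p>a < b and c > d</p>'.replace(TAG_PATTERN, '');

// A tag with no closing > never matches and passes through untouched.
const unclosedInput = '<img src=x onerror=alert(1)';
const unclosedResult = unclosedInput.replace(TAG_PATTERN, '');

console.log(JSON.stringify(attrResult));     // " b\">link"
console.log(JSON.stringify(textResult));     // "a  d"
console.log(JSON.stringify(unclosedResult)); // "<img src=x onerror=alert(1)"
```

The last case illustrates the security concern: if that output is later embedded in a page where subsequent markup supplies a >, a browser may treat it as a live tag.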