Regex Recipes

Detect Non-ASCII Characters

Find characters outside the ASCII range (0-127), for example when sanitizing text.

Pattern

[^\x00-\x7F]

Explanation

Matches any single character whose code is above 127 (0x7F), that is, anything outside the ASCII range. Useful for detecting accented letters, non-Latin scripts, and emoji.
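As a rough sketch of what the character class tests, the snippet below compares a numeric code point check with the pattern itself. The helper names `isNonAscii` and `matchesPattern` are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: true when the first code point of `ch` is above 0x7F.
function isNonAscii(ch) {
  return ch.codePointAt(0) > 0x7f;
}

// The same check expressed with the recipe's pattern.
// No g flag here, so the regex carries no state between calls.
function matchesPattern(ch) {
  return /[^\x00-\x7F]/.test(ch);
}

console.log(isNonAscii('A'), matchesPattern('A'));           // false false
console.log(isNonAscii('\u00e9'), matchesPattern('\u00e9')); // true true
```

Both checks agree for single characters; the regex form is simply the more convenient one for scanning whole strings.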

Examples

ASCII only
Input
Hello World
Output
✗ No match
Has accents
Input
Café
Output
✓ Match: é
Has emoji
Input
Hello 👋
Output
✓ Match: 👋
Cyrillic
Input
Привет
Output
✓ Match: all chars

Code Examples

JavaScript
// The u flag matches whole code points, so an emoji is one match
// rather than two separate surrogate halves.
const nonAsciiRegex = /[^\x00-\x7F]/gu;

// Check if string contains non-ASCII
// Uses a non-global regex: test() on a g-flagged regex remembers
// lastIndex between calls and can wrongly return false later on.
function hasNonASCII(str) {
  return /[^\x00-\x7F]/.test(str);
}

// Remove non-ASCII characters
function removeNonASCII(str) {
  return str.replace(nonAsciiRegex, '');
}

// Replace with placeholder
function sanitize(str) {
  return str.replace(nonAsciiRegex, '?');
}

// Find all non-ASCII characters
function findNonASCII(str) {
  return str.match(nonAsciiRegex) || [];
}

// Example
const text = 'Hello Café 👋';
console.log(hasNonASCII(text)); // true
console.log(removeNonASCII(text)); // 'Hello Caf '
console.log(findNonASCII(text)); // ['é', '👋']

💡 Tips

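  • Decide first whether you need to support international text
  • Use the pattern for validation rather than for blindly removing characters
  • ASCII range: 0-127 (0x00-0x7F)
  • Extended ASCII: 128-255 (the meaning of these values depends on the code page, e.g. Latin-1)
  • For slug generation, transliterate instead of removing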
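The slug tip above can be sketched as follows. This is a minimal illustration, assuming the input is mostly Latin text with accents; `slugify` is a hypothetical helper, not a library function. It relies on Unicode NFD normalization to split accented letters into a base letter plus combining marks, so characters with no such decomposition (for example "ß" or Cyrillic letters) are still dropped and would need a real transliteration table.

```javascript
// Hypothetical helper: build an ASCII slug by transliterating accented
// Latin letters instead of deleting them outright.
function slugify(str) {
  return str
    .normalize('NFD')                 // split "é" into "e" + combining accent
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .replace(/[^\x00-\x7F]/gu, '')    // remove whatever non-ASCII remains
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // turn other runs into single hyphens
    .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens
}

console.log(slugify('Caf\u00e9 Cr\u00e8me Br\u00fbl\u00e9e')); // 'cafe-creme-brulee'
```

Plain removal would have produced 'caf-crme-brle' for the same input, which is why transliteration is usually the better choice for slugs.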

⚠️ Common Pitfalls

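  • Removing non-ASCII characters can corrupt international text (for example, "Café" becomes "Caf")
  • ASCII covers only the first 128 Unicode code points; UTF-8 encodes every other character as a multi-byte sequence
  • Characters in the 128-255 range (often called extended ASCII) are matched too
  • Emoji are non-ASCII; in JavaScript most of them span two UTF-16 code units, so add the u flag to match each one as a single character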
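To illustrate the emoji pitfall, the sketch below runs the pattern with and without the `u` flag on a string containing the waving hand emoji (U+1F44B). The variable names are illustrative only.

```javascript
const wave = '\u{1F44B}'; // waving hand emoji, stored as two UTF-16 code units
const input = 'Hi ' + wave;

// Without the u flag, each surrogate half matches separately.
const byCodeUnit = input.match(/[^\x00-\x7F]/g);

// With the u flag, the emoji matches once as a whole code point.
const byCodePoint = input.match(/[^\x00-\x7F]/gu);

console.log(byCodeUnit.length);       // 2
console.log(byCodePoint.length);      // 1
console.log(byCodePoint[0] === wave); // true
```

Even with the `u` flag, emoji built from several code points (for example those with skin-tone modifiers) still produce one match per code point.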