Regex Essentials: Validating HTML id Attributes

When we first started building our test recorder, we needed a way to validate the id attributes used on the page. We would sometimes capture an id attribute in a recording, only to find that it failed when we used it in a test because it didn’t meet the specification. For instance, some websites use an id that starts with a digit, like this:

<div id="5-answer"><!-- … --></div>

That is technically invalid, at least in the HTML4 specification:

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens (“-”), underscores (“_”), colons (“:”), and periods (“.”).

The HTML5 specification is more lax. Beyond being non-empty, its only syntactic requirement is that “the value must not contain any space characters.” However, many websites are still technically using HTML4, so ids sometimes still need to be validated against the stricter rule. Luckily, you can test for the HTML4 requirements with a simple regular expression. Here’s the function in JavaScript:

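function isCssIdValid(id) {
  // One leading letter, then any number of word characters
  // (letters, digits, underscores), hyphens, colons, or periods.
  const re = /^[A-Za-z][\w\-:.]*$/;
  return re.test(id);
}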

First, we make sure that the id starts with a letter. Then we ensure that every remaining character is a letter, a digit, an underscore, a hyphen, a colon, or a period. Note that \w is equivalent to [A-Za-z0-9_].
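To see how the two specifications differ in practice, here is a minimal sketch that runs a handful of sample ids through both rules. The names isHtml4IdValid and isHtml5IdValid and the sample values are hypothetical and chosen only for illustration. The HTML4 check restates the pattern described above, and the HTML5 check assumes "space characters" means space, tab, line feed, form feed, and carriage return.

```javascript
// HTML4 rule: a letter first, then letters, digits, underscores,
// hyphens, colons, or periods.
function isHtml4IdValid(id) {
  return /^[A-Za-z][\w\-:.]*$/.test(id);
}

// Hypothetical HTML5-style check (an assumption, not from the article):
// at least one character, and none of the "space characters"
// (space, tab, line feed, form feed, carriage return).
function isHtml5IdValid(id) {
  return /^[^ \t\n\f\r]+$/.test(id);
}

// Illustrative sample ids, including the "5-answer" case from above.
const samples = ['answer-5', '5-answer', 'nav:item.1', 'my id', ''];

const results = samples.map((id) => ({
  id,
  html4: isHtml4IdValid(id),
  html5: isHtml5IdValid(id),
}));

console.table(results);
```

In this sketch, "5-answer" fails the HTML4 rule but passes the HTML5 one, while "my id" and the empty string fail both.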

That’s it! 
