[selectors] is #42 a valid ID selector? #202

Closed
dstoc opened this issue Jun 20, 2016 · 3 comments

@dstoc

dstoc commented Jun 20, 2016

https://drafts.csswg.org/selectors/#id-selectors says:

An ID selector consists of a “number sign” (U+0023, #) immediately followed by the ID value, which must be a CSS identifier.

But in the grammar, <id-selector> is defined as <hash-token>, which would appear to allow a sequence beginning with digits.

@tabatkins
Member

It's just something I accidentally dropped when converting the grammar. Fixed now (with a prose requirement below the main grammar).

@ericrannaud

ericrannaud commented Jun 22, 2016

Shouldn't the CSS and HTML specs agree on the form of an ID?

To quote from the current draft (https://github.com/w3c/html/blob/master/sections/dom.include):

There are no other restrictions on what form an ID can take; in particular, IDs can consist of
just digits, start with a digit, start with an underscore, consist of just punctuation, etc.

I understand that one spec talks about the form of ID attribute values and the other spec defines the ID selector syntax, and these are two different things.

And I understand that you can escape the ID 42 to get a valid ID selector #\x34\x32 (Is that correct? Neither Chrome nor Firefox accept this).

EDIT: the correct escape syntax is "#\34\32" and this does indeed work in Chrome and Firefox.

However, developers routinely use simple string operations to build selectors from attribute values. To be safe, assuming the above is true, they would have to always try and escape ID attribute values if they start with a digit. But no one ever does that, or is aware they need to.
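
As a rough illustration of what "always escape" would involve, here is a minimal sketch of identifier escaping, loosely modeled on the CSSOM "serialize an identifier" steps. Browsers expose CSS.escape() for this purpose; the hand-rolled version below exists only so the logic is visible and runs outside a browser. The names escapeIdent and idSelector are hypothetical, and the sketch is a simplification, not a reference implementation.

```javascript
// Simplified sketch of CSS identifier escaping (hypothetical helper names).
function escapeIdent(value) {
  const s = String(value);
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    const ch = s.charAt(i);
    const isDigit = unit >= 0x30 && unit <= 0x39;
    if (unit === 0) {
      // NULL becomes the replacement character.
      out += "\uFFFD";
    } else if (
      (unit >= 0x01 && unit <= 0x1f) || unit === 0x7f ||
      (i === 0 && isDigit) ||
      (i === 1 && isDigit && s.charCodeAt(0) === 0x2d)
    ) {
      // Control characters and leading digits are written as a hex
      // code point followed by a space, e.g. "4" -> "\34 ".
      out += "\\" + unit.toString(16) + " ";
    } else if (i === 0 && s.length === 1 && unit === 0x2d) {
      // A lone "-" is not an identifier on its own.
      out += "\\" + ch;
    } else if (
      unit >= 0x80 || unit === 0x2d || unit === 0x5f ||
      isDigit || /[A-Za-z]/.test(ch)
    ) {
      // Characters that are already legal in an identifier.
      out += ch;
    } else {
      // Everything else (".", "#", ":", spaces, ...) gets a backslash.
      out += "\\" + ch;
    }
  }
  return out;
}

// Build an ID selector from an arbitrary id attribute value.
function idSelector(id) {
  return "#" + escapeIdent(id);
}

console.log(idSelector("42"));  // #\34 2
console.log(idSelector("a.b")); // #a\.b
console.log(idSelector("B"));   // #B
```

With a helper like this, the string concatenation "#" + element.id would become idSelector(element.id), which is exactly the extra step the paragraph above says developers rarely take.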

jQuery and Sprint actually check the form of the selector, and have a partial workaround. If the selector matches /^#[\w-]+$/, then they use document.getElementById(selector.substring(1)) rather than document.querySelectorAll(selector). This works for:

$("#42")

But this doesn't work for:

$("#42.test")

because the pattern above does not match it. Admittedly, ".test" is redundant if IDs are unique in the document, but the point remains that building selectors with string operations is much more complicated if one has to care whether a given value is an ID or not.
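
The check described above can be sketched as follows. This is an illustration of the pattern test only, not jQuery's or Sprint's actual source; chooseLookup is a hypothetical name, and it returns a description of the lookup instead of touching a real document so that it runs anywhere.

```javascript
// Pattern quoted in the comment above: "#" followed only by word
// characters or hyphens.
const SIMPLE_ID = /^#[\w-]+$/;

// Hypothetical helper: decide which native lookup a library would use.
function chooseLookup(selector) {
  if (SIMPLE_ID.test(selector)) {
    // Fast path: strip the "#" and look the ID up directly, which
    // sidesteps selector parsing entirely.
    return { method: "getElementById", arg: selector.substring(1) };
  }
  // Anything else goes to the selector engine unchanged.
  return { method: "querySelectorAll", arg: selector };
}

console.log(chooseLookup("#42").method);      // getElementById
console.log(chooseLookup("#42.test").method); // querySelectorAll
```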

With the following document, with non-unique IDs:

<div id=42>A</div>
<div id=B>
  <div id=42>B</div>
</div>

jQuery is actually able to successfully execute $("#B").find("#42"), but only by doing a lot of slow manual work. The call to $b.find("#42") follows this sequence:

  1. First, try result = document.getElementById("42") and check whether result is contained within $b; here it is not.
  2. Then, try to call querySelectorAll("#42") on the Element in $b to restrict the search to its descendants; that fails because the selector is not valid.
  3. Finally, manually walk the descendants of the Element in $b using a custom matcher built by parsing the selector (note that jQuery's CSS selector parser accepts #42...).
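
The three steps above can be sketched roughly as below. This is a hypothetical reconstruction from the description, not jQuery's code: findByIdWithin is an invented name, doc and root are passed in so the function works on any DOM-like objects, and step 3 is reduced to a plain id comparison where jQuery would use a matcher built from the parsed selector.

```javascript
// Hypothetical sketch of the fallback sequence described above.
function findByIdWithin(doc, root, id) {
  // Step 1: global lookup, accepted only if the hit lies inside root.
  const hit = doc.getElementById(id);
  if (hit && hit !== root && root.contains(hit)) {
    return [hit];
  }

  // Step 2: scoped native query. For an id such as "42" the selector
  // "#42" is invalid, so the call throws and we fall through.
  try {
    return Array.from(root.querySelectorAll("#" + id));
  } catch (err) {
    // Invalid selector: continue with the manual walk.
  }

  // Step 3: depth-first walk over the descendants of root.
  const matches = [];
  const pending = Array.from(root.children);
  while (pending.length > 0) {
    const el = pending.shift();
    if (el.id === id) {
      matches.push(el);
    }
    // Visit this element's children before its later siblings.
    pending.unshift(...Array.from(el.children));
  }
  return matches;
}
```

In a browser, this could be called as findByIdWithin(document, document.getElementById("B"), "42") against the sample document above.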

Sprint (a jQuery alternative) is not that sophisticated and fails.

If Element had a getElementById() method, then it wouldn't be so bad. Right now (maybe because of Firefox and Chrome bugs), there is no way to resolve $b.find("#42") using fast, native browser methods.

Are there fundamental objections to relaxing the ID selector syntax so that "#" + element.id be always a valid ID selector when working with an HTML5 document?

@tabatkins
Member

Are there fundamental objections to relaxing the ID selector syntax so that "#" + element.id be always a valid ID selector when working with an HTML5 document?

Yes, "#" + element.id is far wider than the CSS Syntax spec's notion of a <hash-token>, which has been well established for two decades or so. Extending this to accommodate basically any character whatsoever would be a significant change to CSS parsing, and would be pretty weird in the context of what CSS normally understands as a token. (For example, an ID can contain the # character, or other characters used in Selector parsing like . or :.) Basically it would require a hash-token to be "every character between a # and the next space", which is super not-compatible.

You can still select any ID you want in CSS; you just have to escape it to match the grammar. In particular, if you want to match an element with id="42", you can write a selector like #\34 2 (note that you have to escape the backslash itself if writing this in a JS string literal, so document.querySelector("#\\34 2")). Is this convenient? No, of course not. But if you want convenience, stick to the generous syntax that CSS is friendly to.
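
To make the escaping in a JS string concrete, here is a small sketch. It only inspects the strings, so it runs outside a browser; the variable names are illustrative. Without the extra backslash, the JS parser would interpret (or reject) the backslash sequence itself before any selector engine saw it.

```javascript
// The selector text CSS needs is:  #\34 2
// In a normal string literal the backslash must itself be escaped.
const selector = "#\\34 2";

// String.raw keeps backslashes literally, so the selector can be
// written exactly as it would appear in a stylesheet.
const sameSelector = String.raw`#\34 2`;

console.log(selector);                  // #\34 2
console.log(selector === sameSelector); // true
console.log(selector.length);           // 6

// In a browser, either string could then be passed along:
//   document.querySelector(selector)
```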

@tabatkins tabatkins added the selectors-4 Current Work label Jun 28, 2016
birtles added a commit to birtles/csswg-drafts that referenced this issue Dec 4, 2017
Revise simple iteration progress when an interval is clipped