Description
In WICG/construct-stylesheets#61 (comment), Boris points out that the Syntax spec specifies that the input to the tokenizer is a stream of code points, and wonders if I actually mean scalar values there. (That is, all code points except surrogates.)
I think at the time I wrote this, USVString didn't yet exist, and the distinction between the two wasn't really present in specs. But if I were writing it today, I'm pretty sure I'd use scalar values.
In particular, note that you can't produce a surrogate code point from an escape, which suggests that I assumed no surrogates would show up in the stream.
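To illustrate the escape behavior, here is a rough sketch of how the value of a hex escape gets resolved, based on my reading of the "consume an escaped code point" step in the Syntax spec. The function name `escapedCodePoint` and its string-of-hex-digits input are hypothetical, chosen only for this example; a real tokenizer reads the digits from the input stream instead.

```javascript
// Hypothetical helper: resolve the hex digits of a CSS escape (e.g. the
// "d800" in "\d800") to the code point the escape represents.
// Assumption: zero, surrogates, and values above U+10FFFF all become U+FFFD.
function escapedCodePoint(hexDigits) {
  if (!/^[0-9a-fA-F]{1,6}$/.test(hexDigits)) {
    throw new RangeError("expected 1 to 6 hex digits");
  }
  const value = parseInt(hexDigits, 16);
  const isSurrogate = value >= 0xD800 && value <= 0xDFFF;
  if (value === 0 || isSurrogate || value > 0x10FFFF) {
    return 0xFFFD; // REPLACEMENT CHARACTER
  }
  return value;
}
```

So an escape like `\d800` can only ever yield U+FFFD, never a surrogate; any surrogate in the stream has to come from the raw input.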
So, I think I should switch the spec over to referring to scalar values, and have a conversion step for going from code points to scalar values (probably just converting surrogates to U+FFFD? I'll look at impls and see what's up).
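A minimal sketch of what such a conversion step could look like, assuming the "replace each surrogate with U+FFFD" option is the one chosen. The name `replaceLoneSurrogates` is hypothetical and not from any spec. It relies on JavaScript string iteration, which yields a properly paired surrogate as one code point and a lone surrogate on its own.

```javascript
// Hypothetical conversion step: turn a string that may contain lone
// surrogates into a sequence of scalar values by substituting U+FFFD.
function replaceLoneSurrogates(input) {
  let out = "";
  // for...of walks the string by code point, so a valid surrogate pair
  // arrives as a single two-unit chunk with a code point above U+FFFF.
  for (const ch of input) {
    const cp = ch.codePointAt(0);
    const isSurrogate = cp >= 0xD800 && cp <= 0xDFFF;
    out += isSurrogate ? "\uFFFD" : ch;
  }
  return out;
}
```

For example, `replaceLoneSurrogates(".fo\ud800o")` gives `".fo\ufffdo"`, while a string containing a valid pair such as `"a\u{1F600}b"` comes back unchanged.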
<!DOCTYPE html>
<style></style>
<script>
document.querySelector("style").textContent = ".fo\ud800o { color: blue; }";
// w() is assumed to be a logging helper (as in the Live DOM Viewer); use console.log elsewhere.
w([...document.styleSheets[0].cssRules[0].selectorText].map(x=>x.codePointAt(0).toString(16)));
</script>
Looks like Chrome retains the character as U+D800, while Firefox censors it to U+FFFD. Perhaps this is just related to which definition each uses for CSSOMString?