Commit a96d682

[css-syntax] Clarify how to decode from bytes
1 parent 02624fb commit a96d682

2 files changed

Lines changed: 36 additions & 32 deletions

css-syntax/Overview.html

Lines changed: 18 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -429,35 +429,40 @@ <h3 class="heading settled heading" data-level=3.2 id=input-byte-stream><span cl
429429
which the user agent must use to decode the bytes into <a data-link-type=dfn href=#code-point title="code points">code points</a>.
430430

431431
<p> To decode the stream of bytes into a stream of <a data-link-type=dfn href=#code-point title="code points">code points</a>,
432-
UAs must follow these steps.
432+
UAs must use the <a href=http://encoding.spec.whatwg.org/#decode>decode</a> algorithm
433+
defined in <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>,
434+
with the fallback encoding determined as follows.
433435

434-
<p> The algorithms to <dfn data-dfn-type=dfn data-noexport="" id=get-an-encoding><a href=http://encoding.spec.whatwg.org/#concept-encoding-get>get an encoding</a><a class=self-link href=#get-an-encoding></a></dfn>
435-
and <dfn data-dfn-type=dfn data-noexport="" id=decode><a href=http://encoding.spec.whatwg.org/#decode>decode</a><a class=self-link href=#decode></a></dfn>
436-
are defined in <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>.
436+
<p class=note> Note: The <a href=http://encoding.spec.whatwg.org/#decode>decode</a> algorithm
437+
gives precedence to a byte order mark (BOM),
438+
and only uses the fallback when none is found.
437439

438-
<p> First, <dfn data-dfn-type=dfn data-noexport="" id=determine-the-fallback-encoding>determine the fallback encoding<a class=self-link href=#determine-the-fallback-encoding></a></dfn>:
440+
<p> To <dfn data-dfn-type=dfn data-noexport="" id=determine-the-fallback-encoding>determine the fallback encoding<a class=self-link href=#determine-the-fallback-encoding></a></dfn>:
439441

440442
<ol>
441443
<li>
442444
If HTTP or equivalent protocol defines an encoding (e.g. via the charset parameter of the Content-Type header),
443-
<a data-link-type=dfn href=#get-an-encoding title="get an encoding">get an encoding</a> for the specified value.
445+
<a href=http://encoding.spec.whatwg.org/#concept-encoding-get>get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
446+
for the specified value.
444447
If that does not return failure,
445448
use the return value as the fallback encoding.
446449

447450
<li>
448451
Otherwise, check the byte stream. If the first several bytes match the hex sequence
449452

450453
<pre>40 63 68 61 72 73 65 74 20 22 (not 22)* 22 3B</pre>
451-
<p> then <a data-link-type=dfn href=#get-an-encoding title="get an encoding">get an encoding</a> for the sequence of <code>(not 22)*</code> bytes,
454+
<p> then <a href=http://encoding.spec.whatwg.org/#concept-encoding-get>get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
455+
for the sequence of <code>(not 22)*</code> bytes,
452456
decoded per <code>windows-1252</code>.
453457

454-
<p class=note> Note: Anything ASCII-compatible will do, so using <code>windows-1252</code> is fine.
458+
<p class=note> Note: Anything ASCII-compatible will do since valid labels are all ASCII,
459+
so using <code>windows-1252</code> is fine.
455460

456461

457462
<p class=note> Note: The byte sequence above,
458463
when decoded as ASCII,
459464
is the string "<code>@charset "…";</code>",
460-
where the "…" is the sequence of bytes corresponding to the encoding’s name.
465+
where the "…" is the sequence of bytes corresponding to the encoding’s label.
461466

462467
<p> If the return value was <code>utf-16</code> or <code>utf-16be</code>,
463468
use <code>utf-8</code> as the fallback encoding;
@@ -474,10 +479,6 @@ <h3 class="heading settled heading" data-level=3.2 id=input-byte-stream><span cl
474479
Otherwise, use <code>utf-8</code> as the fallback encoding.
475480
</ol>
476481

477-
<p> Then, <a data-link-type=dfn href=#decode title=decode>decode</a> the byte stream using the fallback encoding.
478-
479-
<p class=note> Note: the <a data-link-type=dfn href=#decode title=decode>decode</a> algorithm lets the byte order mark (BOM) take precedence,
480-
hence the usage of the term "fallback" above.
481482

482483
<h3 class="heading settled heading" data-level=3.3 id=environment-encoding><span class=secno>3.3 </span><span class=content>
483484
Environment encoding</span><a class=self-link href=#environment-encoding></a></h3>
@@ -502,7 +503,8 @@ <h4 class="heading settled heading" data-level=3.3.2 id=environment-encoding-htm
502503

503504
<p> <ul>
504505
<li>
505-
<a data-link-type=dfn href=#get-an-encoding title="get an encoding">Get an encoding</a> for the value of the <code>charset</code> attribute of the element, if any.
506+
<a href=http://encoding.spec.whatwg.org/#concept-encoding-get>Get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
507+
for the value of the <code>charset</code> attribute of the element, if any.
506508
If that does not return failure,
507509
use the return value as the environment encoding.
508510

@@ -525,7 +527,8 @@ <h4 class="heading settled heading" data-level=3.3.3 id=environment-encoding-xml
525527

526528
<p> <ul>
527529
<li>
528-
<a data-link-type=dfn href=#get-an-encoding title="get an encoding">Get an encoding</a> for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
530+
<a href=http://encoding.spec.whatwg.org/#concept-encoding-get>Get an encoding</a> <a data-biblio-type=normative data-link-type=biblio href=#encoding title=encoding>[ENCODING]</a>
531+
for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
529532
If that does not return failure,
530533
use the return value as the environment encoding.
531534

@@ -5038,7 +5041,6 @@ <h2 class="no-num no-ref heading settled heading" id=index><span class=content>
50385041
<li>&lt;dashndashdigit-ident&gt;, <a href=#typedef-dashndashdigit-ident title="section 6.2">6.2</a>
50395042
<li>declaration, <a href=#declaration title="section 5">5</a>
50405043
<li>&lt;declaration-list&gt;, <a href=#typedef-declaration-list title="section 7.1">7.1</a>
5041-
<li>decode, <a href=#decode title="section 3.2">3.2</a>
50425044
<li>&lt;delim-token&gt;, <a href=#typedef-delim-token title="section 4">4</a>
50435045
<li>determine the fallback encoding, <a href=#determine-the-fallback-encoding title="section 3.2">3.2</a>
50445046
<li>digit, <a href=#digit title="section 4.2">4.2</a>
@@ -5052,7 +5054,6 @@ <h2 class="no-num no-ref heading settled heading" id=index><span class=content>
50525054
<li>escaping, <a href=#escaping0 title="section 2.1">2.1</a>
50535055
<li>function, <a href=#function title="section 5">5</a>
50545056
<li>&lt;function-token&gt;, <a href=#typedef-function-token title="section 4">4</a>
5055-
<li>get an encoding, <a href=#get-an-encoding title="section 3.2">3.2</a>
50565057
<li>&lt;hash-token&gt;, <a href=#typedef-hash-token title="section 4">4</a>
50575058
<li>hex digit, <a href=#hex-digit title="section 4.2">4.2</a>
50585059
<li>identifier, <a href=#identifier title="section 4.2">4.2</a>
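
The new note in section 3.2 says the decode algorithm gives precedence to a byte order mark and only uses the fallback when none is found. A minimal Python sketch of that behaviour follows. The function names `sniff_bom` and `decode_with_fallback` are hypothetical, and Python's codecs are used as stand-ins for the encodings defined in [ENCODING], so this is an approximation rather than the normative algorithm.

```python
from typing import Optional, Tuple


def sniff_bom(data: bytes) -> Tuple[Optional[str], int]:
    """Return (codec name, BOM length) if data starts with a BOM, else (None, 0)."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8", 3
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be", 2
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le", 2
    return None, 0


def decode_with_fallback(data: bytes, fallback: str) -> str:
    """Decode data, letting a BOM override the caller-supplied fallback codec."""
    encoding, bom_length = sniff_bom(data)
    if encoding is None:
        # No BOM found: only now is the fallback encoding consulted.
        encoding = fallback
    # Assumption: malformed input becomes U+FFFD instead of raising an error.
    return data[bom_length:].decode(encoding, errors="replace")
```

With this sketch, a stylesheet that begins with the UTF-8 BOM is decoded as UTF-8 even when the caller passes `windows-1252` as the fallback.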

css-syntax/Overview.src.html

Lines changed: 18 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -265,18 +265,21 @@ <h3 id="input-byte-stream">
265265
which the user agent must use to decode the bytes into <a>code points</a>.
266266

267267
To decode the stream of bytes into a stream of <a>code points</a>,
268-
UAs must follow these steps.
268+
UAs must use the <a href="http://encoding.spec.whatwg.org/#decode">decode</a> algorithm
269+
defined in [[!ENCODING]],
270+
with the fallback encoding determined as follows.
269271

270-
The algorithms to <dfn><a href="http://encoding.spec.whatwg.org/#concept-encoding-get">get an encoding</a></dfn>
271-
and <dfn><a href="http://encoding.spec.whatwg.org/#decode">decode</a></dfn>
272-
are defined in [[!ENCODING]].
272+
Note: The <a href="http://encoding.spec.whatwg.org/#decode">decode</a> algorithm
273+
gives precedence to a byte order mark (BOM),
274+
and only uses the fallback when none is found.
273275

274-
First, <dfn>determine the fallback encoding</dfn>:
276+
To <dfn>determine the fallback encoding</dfn>:
275277

276278
<ol>
277279
<li>
278280
If HTTP or equivalent protocol defines an encoding (e.g. via the charset parameter of the Content-Type header),
279-
<a>get an encoding</a> for the specified value.
281+
<a href="http://encoding.spec.whatwg.org/#concept-encoding-get">get an encoding</a> [[!ENCODING]]
282+
for the specified value.
280283
If that does not return failure,
281284
use the return value as the fallback encoding.
282285

@@ -285,16 +288,18 @@ <h3 id="input-byte-stream">
285288

286289
<pre>40 63 68 61 72 73 65 74 20 22 (not 22)* 22 3B</pre>
287290

288-
then <a>get an encoding</a> for the sequence of <code>(not 22)*</code> bytes,
291+
then <a href="http://encoding.spec.whatwg.org/#concept-encoding-get">get an encoding</a> [[!ENCODING]]
292+
for the sequence of <code>(not 22)*</code> bytes,
289293
decoded per <code>windows-1252</code>.
290294

291-
Note: Anything ASCII-compatible will do, so using <code>windows-1252</code> is fine.
295+
Note: Anything ASCII-compatible will do since valid labels are all ASCII,
296+
so using <code>windows-1252</code> is fine.
292297

293298

294299
Note: The byte sequence above,
295300
when decoded as ASCII,
296301
is the string "<code>@charset "…";</code>",
297-
where the "…" is the sequence of bytes corresponding to the encoding's name.
302+
where the "…" is the sequence of bytes corresponding to the encoding's label.
298303

299304
If the return value was <code>utf-16</code> or <code>utf-16be</code>,
300305
use <code>utf-8</code> as the fallback encoding;
@@ -311,10 +316,6 @@ <h3 id="input-byte-stream">
311316
Otherwise, use <code>utf-8</code> as the fallback encoding.
312317
</ol>
313318

314-
Then, <a>decode</a> the byte stream using the fallback encoding.
315-
316-
Note: the <a>decode</a> algorithm lets the byte order mark (BOM) take precedence,
317-
hence the usage of the term "fallback" above.
318319

319320
<h3 id="environment-encoding">
320321
Environment encoding</h3>
@@ -339,7 +340,8 @@ <h4 id="environment-encoding-html">
339340

340341
<ul>
341342
<li>
342-
<a>Get an encoding</a> for the value of the <code>charset</code> attribute of the element, if any.
343+
<a href="http://encoding.spec.whatwg.org/#concept-encoding-get">Get an encoding</a> [[!ENCODING]]
344+
for the value of the <code>charset</code> attribute of the element, if any.
343345
If that does not return failure,
344346
use the return value as the environment encoding.
345347

@@ -365,7 +367,8 @@ <h4 id="environment-encoding-xml">
365367

366368
<ul>
367369
<li>
368-
<a>Get an encoding</a> for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
370+
<a href="http://encoding.spec.whatwg.org/#concept-encoding-get">Get an encoding</a> [[!ENCODING]]
371+
for the value of the <code>charset</code> <a href=http://www.w3.org/TR/xml-stylesheet/#dt-pseudo-attribute>pseudo-attribute</a> of the processing instruction, if any.
369372
If that does not return failure,
370373
use the return value as the environment encoding.
371374
