Commit 6722010

committed
[css3-syntax] Update the charset rules to latest proposal from Anne.
1 parent 36fdf72 commit 6722010

2 files changed

Lines changed: 88 additions & 38 deletions

css3-syntax/Overview.html

Lines changed: 48 additions & 23 deletions
@@ -11,10 +11,10 @@
 
 <meta content="CSS Syntax Module Level 3 (CSS3 Syntax)" name=dcterms.title>
 <meta content=text name=dcterms.type>
-<meta content=2012-10-23 name=dcterms.issued>
+<meta content=2012-10-31 name=dcterms.issued>
 <meta content="http://dev.w3.org/csswg/css3-syntax/" name=dcterms.creator>
 <meta content=W3C name=dcterms.publisher>
-<meta content="http://www.w3.org/TR/2012/ED-css3-syntax-20121023/"
+<meta content="http://www.w3.org/TR/2012/ED-css3-syntax-20121031/"
 name=dcterms.identifier>
 <link href="#contents" rel=contents>
 <link href="#index" rel=index>
@@ -47,7 +47,7 @@
 
 <h1>CSS Syntax Module Level 3</h1>
 
-<h2 class="no-num no-toc" id=longstatus-date>Editor's Draft 23 October
+<h2 class="no-num no-toc" id=longstatus-date>Editor's Draft 31 October
 2012</h2>
 
 <dl>
@@ -536,39 +536,59 @@ <h3 id=the-input-byte-stream><span class=secno>3.2. </span> The input byte
 <p> To decode the stream of bytes into a stream of characters, UAs must
 follow these steps.
 
+<p> The algorithms to <a
+href="http://encoding.spec.whatwg.org/#concept-encoding-get"><dfn
+id=get-an-encoding>get an encoding</dfn></a> and <a
+href="http://encoding.spec.whatwg.org/#decode"><dfn
+id=decode>decode</dfn></a> are defined in the <a
+href="http://encoding.spec.whatwg.org/">Encoding Standard</a>.
+
 <ol>
-<li> Let <var>label</var> be null.
+<li> Let <var>encoding</var> be utf-8.
 
 <li> If HTTP or equivalent protocol defines an encoding (e.g. via the
-charset parameter of the Content-Type header), set <var>label</var> to
-the value so defined.
+charset parameter of the Content-Type header), <a
+href="#get-an-encoding"><i>get an encoding</i></a> for the specified
+value. If that does not return failure, set <var>encoding</var> to the
+return value and jump to the last step of this algorithm.
 
-<li> If <var>label</var> is null, check the byte stream. If the first
-several bytes match the hex sequence
+<li> Check the byte stream. If the first several bytes match the hex
+sequence
 <pre>40 63 68 61 72 73 65 74 20 22 (XX)* 22 3B</pre>
-then set <var>label</var> to the sequence of XX bytes, decoded per
-<code>windows-1252</code>.
+then <a href="#get-an-encoding"><i>get an encoding</i></a> for the
+sequence of XX bytes, decoded per <code>windows-1252</code>, and let
+<var>temp</var> be the return value.
 <p class=note> Note: Anything ASCII-compatible will do, so using
 <code>windows-1252</code> is fine.
 
-<li> If <var>label</var> is null, set <var>label</var> to the value of
-charset attribute on the <code>&lt;link></code> element that caused the
-style sheet to be included, if any.
+<p> If <var>temp</var> is <code>utf-16</code> or <code>utf-16be</code>,
+set <var>temp</var> to <code>utf-8</code>. If <var>temp</var> is not
+failure, set <var>encoding</var> to it and jump to the last step.
+
+<p class=note> This mimics HTML <code>&lt;meta></code> behavior.
 
-<li> If <var>label</var> is null, there is a referring style sheet or
-document, and that referring sheet or document's encoding is not
-<code>utf-16</code> or <code>utf-16be</code>, set <var>label</var> to
-that encoding.
+<li> <a href="#get-an-encoding"><i>Get an encoding</i></a> for the value
+of the <code>charset</code> attribute on the <code>&lt;link></code>
+element or <code>&lt;?xml-stylesheet?></code> processing instruction that
+caused the style sheet to be included, if any. If that does not return
+failure, set <var>encoding</var> to the return value and jump to the last
+step.
 
-<li> Let <var>encoding</var> be the result of <a
-href="http://encoding.spec.whatwg.org/#concept-encoding-get">getting an
-encoding from <var>label</var></a>. If <var>encoding</var> is
-<code>failure</code>, set it to <code>utf-8</code>.
+<li> Set <var>encoding</var> to the encoding of the referring style sheet
+or document, if any.
 
-<li> <a href="http://encoding.spec.whatwg.org/#decode">Decode the byte
-stream</a> using fallback encoding <var>encoding</var>.
+<li> <a href="#decode"><i>Decode</i></a> the byte stream using fallback
+encoding <var>encoding</var>.
+<p class=note> Note: the <a href="#decode"><i>decode</i></a> algorithm
+lets the byte order mark (BOM) take precedence, hence the usage of the
+term "fallback" above.
 </ol>
 
+<p class=issue> Anne says that steps 4/5 should be an input to this
+algorithm from the specs that define importing stylesheet, to make the
+algorithm as a whole cleaner. Perhaps abstract it into the concept of an
+"environment charset" or something?
+
 <h4 id=preprocessing-the-input-stream><span class=secno>3.2.1. </span>
 Preprocessing the input stream</h4>
 
@@ -3808,6 +3828,8 @@ <h2 class=no-num id=index> Index</h2>
 <li>Declaration-value mode, <a href="#declaration-value-mode0"
 title="Declaration-value mode"><strong>3.6.8.</strong></a>
 
+<li>decode, <a href="#decode" title=decode><strong>3.2.</strong></a>
+
 <li>digit, <a href="#digit" title=digit><strong>3.4.3.</strong></a>
 
 <li>Dimension state, <a href="#dimension-state0"
@@ -3828,6 +3850,9 @@ <h2 class=no-num id=index> Index</h2>
 <li>Finish parsing, <a href="#finish-parsing0"
 title="Finish parsing"><strong>3.6.22.</strong></a>
 
+<li>get an encoding, <a href="#get-an-encoding"
+title="get an encoding"><strong>3.2.</strong></a>
+
 <li>hashless color quirk list, <a href="#hashless-color-quirk-list"
 title="hashless color quirk list"><strong>3.6.15.</strong></a>

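Reviewer sketch (not part of the commit): the hex sequence in step 3 of the revised algorithm spells `@charset "`, then the label bytes, then `";`. The Python below is one possible reading of that check, not the spec's own code. The name `sniff_charset_label` is hypothetical, the pattern assumes the XX bytes never include 0x22 (the diff does not say), and `errors="replace"` covers the few bytes that Python's `windows-1252` codec leaves undefined.

```python
import re
from typing import Optional

# Fixed bytes taken from the <pre> line in the diff.
PREFIX = bytes.fromhex("40 63 68 61 72 73 65 74 20 22")  # b'@charset "'
SUFFIX = bytes.fromhex("22 3B")                          # b'";'

# Assumption: the (XX)* run stops at the first 0x22 byte.
_CHARSET_RULE = re.compile(re.escape(PREFIX) + rb'([^"]*)' + re.escape(SUFFIX))


def sniff_charset_label(data: bytes) -> Optional[str]:
    """Return the @charset label at the very start of the stream, or None."""
    match = _CHARSET_RULE.match(data)  # match() anchors at byte 0
    if match is None:
        return None
    # The diff says the XX bytes are decoded per windows-1252.
    return match.group(1).decode("windows-1252", errors="replace")
```

The returned label would still have to go through "get an encoding" before it is used, and the pattern has to match at the first byte, so leading whitespace defeats it.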
css3-syntax/Overview.src.html

Lines changed: 40 additions & 15 deletions
@@ -253,42 +253,67 @@ <h3>
 To decode the stream of bytes into a stream of characters,
 UAs must follow these steps.
 
+<p>
+The algorithms to <a href="http://encoding.spec.whatwg.org/#concept-encoding-get"><dfn>get an encoding</dfn></a>
+and <a href="http://encoding.spec.whatwg.org/#decode"><dfn>decode</dfn></a>
+are defined in the <a href="http://encoding.spec.whatwg.org/">Encoding Standard</a>.
+
 <ol>
 <li>
-Let <var>label</var> be null.
+Let <var>encoding</var> be utf-8.
 
 <li>
 If HTTP or equivalent protocol defines an encoding (e.g. via the charset parameter of the Content-Type header),
-set <var>label</var> to the value so defined.
+<i>get an encoding</i> for the specified value.
+If that does not return failure,
+set <var>encoding</var> to the return value
+and jump to the last step of this algorithm.
 
 <li>
-If <var>label</var> is null, check the byte stream. If the first several bytes match the hex sequence
+Check the byte stream. If the first several bytes match the hex sequence
 
 <pre>40 63 68 61 72 73 65 74 20 22 (XX)* 22 3B</pre>
 
-then set <var>label</var> to the sequence of XX bytes, decoded per <code>windows-1252</code>.
+then <i>get an encoding</i> for the sequence of XX bytes,
+decoded per <code>windows-1252</code>,
+and let <var>temp</var> be the return value.
 
 <p class='note'>
 Note: Anything ASCII-compatible will do, so using <code>windows-1252</code> is fine.
-
-<li>
-If <var>label</var> is null,
-set <var>label</var> to the value of charset attribute on the <code>&lt;link></code> element that caused the style sheet to be included, if any.
+
+<p>
+If <var>temp</var> is <code>utf-16</code> or <code>utf-16be</code>,
+set <var>temp</var> to <code>utf-8</code>.
+If <var>temp</var> is not failure,
+set <var>encoding</var> to it
+and jump to the last step.
+
+<p class='note'>
+This mimics HTML <code>&lt;meta></code> behavior.
 
 <li>
-If <var>label</var> is null,
-there is a referring style sheet or document,
-and that referring sheet or document's encoding is not <code>utf-16</code> or <code>utf-16be</code>,
-set <var>label</var> to that encoding.
+<i>Get an encoding</i> for the value of the <code>charset</code> attribute on the <code>&lt;link></code> element or <code>&lt;?xml-stylesheet?></code> processing instruction that caused the style sheet to be included, if any.
+If that does not return failure,
+set <var>encoding</var> to the return value
+and jump to the last step.
 
 <li>
-Let <var>encoding</var> be the result of <a href="http://encoding.spec.whatwg.org/#concept-encoding-get">getting an encoding from <var>label</var></a>.
-If <var>encoding</var> is <code>failure</code>, set it to <code>utf-8</code>.
+Set <var>encoding</var> to the encoding of the referring style sheet or document,
+if any.
 
 <li>
-<a href="http://encoding.spec.whatwg.org/#decode">Decode the byte stream</a> using fallback encoding <var>encoding</var>.
+<i>Decode</i> the byte stream using fallback encoding <var>encoding</var>.
+
+<p class='note'>
+Note: the <i>decode</i> algorithm lets the byte order mark (BOM) take precedence,
+hence the usage of the term "fallback" above.
 </ol>
 
+<p class='issue'>
+Anne says that steps 4/5 should be an input to this algorithm from the specs that define importing stylesheet,
+to make the algorithm as a whole cleaner.
+Perhaps abstract it into the concept of an "environment charset" or something?
+
 
 <h4>
 Preprocessing the input stream</h4>

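Reviewer sketch (not part of the commit): the six steps of the revised algorithm can be read as a cascade that returns at the first source yielding a usable encoding. The Python below is an approximation under stated assumptions. `get_an_encoding` stands in for the Encoding Standard algorithm by using Python's `codecs.lookup`, which has a different label table and codec names. All function and parameter names are hypothetical. The `@charset` label is assumed to have been extracted from the byte stream already.

```python
import codecs
from typing import Optional


def get_an_encoding(label: Optional[str]) -> Optional[str]:
    """Stand-in for 'get an encoding'; None plays the role of failure.

    Assumption: codecs.lookup() approximates the Encoding Standard's
    label table. It is not the same table.
    """
    label = label.strip() if label else ""
    if not label:
        return None
    try:
        return codecs.lookup(label).name
    except (LookupError, ValueError):
        return None


def determine_fallback_encoding(http_charset: Optional[str] = None,
                                charset_rule_label: Optional[str] = None,
                                link_charset: Optional[str] = None,
                                referrer_encoding: Optional[str] = None) -> str:
    # Step 2: HTTP (or equivalent) charset wins if it resolves.
    found = get_an_encoding(http_charset)
    if found is not None:
        return found
    # Step 3: @charset label; utf-16 and utf-16be are treated as utf-8.
    # Python spells utf-16be as "utf-16-be".
    temp = get_an_encoding(charset_rule_label)
    if temp in ("utf-16", "utf-16-be"):
        temp = "utf-8"
    if temp is not None:
        return temp
    # Step 4: charset attribute on <link> or <?xml-stylesheet?>.
    found = get_an_encoding(link_charset)
    if found is not None:
        return found
    # Step 5, else the step 1 default of utf-8.
    return referrer_encoding or "utf-8"


def decode(data: bytes, fallback: str) -> str:
    """Step 6: a BOM takes precedence over the fallback encoding."""
    for bom, name in ((codecs.BOM_UTF8, "utf-8"),
                      (codecs.BOM_UTF16_BE, "utf-16-be"),
                      (codecs.BOM_UTF16_LE, "utf-16-le")):
        if data.startswith(bom):
            return data[len(bom):].decode(name, errors="replace")
    return data.decode(fallback, errors="replace")
```

Early returns take the place of "jump to the last step", so the result only ever feeds the final decode as a fallback. Compared with the removed text, failure at one step now falls through to the next source instead of collapsing straight to utf-8.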