11<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
22<html lang="en">
3- <!-- $Id: syndata.src,v 2.193 2013-05-02 13:02:40 bbos Exp $ -->
3+ <!-- $Id: syndata.src,v 2.194 2013-05-02 14:01:28 bbos Exp $ -->
44<head>
55<title>Syntax and basic data types</title>
66<!--script src="http://www.w3c-test.org/css/harness/annotate.js#CSS21_DEV" type="text/javascript" defer></script-->
@@ -1409,23 +1409,15 @@ encoding::default|default::character encoding">character
14091409encoding</span> (from highest priority to lowest):
14101410</p>
14111411<ol>
1412- <li><span class="index-inst">BOM</span>
14131412<li>An HTTP "charset" parameter in a "Content-Type" field
14141413(or similar parameters in other protocols)</li>
1415- <li><span
1414+ <li><span class="index-inst">BOM</span> and/or <span
14161415class="index-inst">@charset</span> (see below)</li>
14171416<li><code>&lt;link charset=""&gt;</code> or other metadata from the linking mechanism (if any)</li>
14181417<li>charset of referring style sheet or document (if any)</li>
14191418<li>Assume UTF-8</li>
14201419</ol>
14211420
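The revised priority list can be sketched as a simple first-match lookup. This is only an illustration under stated assumptions: the function name `choose_encoding` and its four parameters are hypothetical, each parameter holds the label already obtained from that source (or `None`), and the separate case where a BOM overrides a Unicode "charset" parameter is deliberately left out.

```python
from typing import Optional


def choose_encoding(
    http_charset: Optional[str],
    bom_or_at_charset: Optional[str],
    link_charset: Optional[str],
    referrer_charset: Optional[str],
) -> str:
    """Return the first available label in priority order, else UTF-8.

    Hypothetical helper: the arguments mirror steps 1-4 of the list,
    and the final fallback mirrors step 5 ("Assume UTF-8").
    """
    candidates = (http_charset, bom_or_at_charset, link_charset, referrer_charset)
    for value in candidates:
        if value:  # skip None and empty strings
            return value
    return "UTF-8"
```

For instance, `choose_encoding(None, "UTF-16", "windows-1252", None)` picks the BOM/@charset result because no HTTP parameter is present.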
1422- <p class=note>Note that it is not possible to use a 1-byte character
1423- encoding and start the CSS file with the characters 255 and 254 in
1424- either order, because the two characters will be interpreted as a
1425- BOM. E.g., "ÿ" and "þ" in ISO-8859-1, "˙" and "ţ"
1426- in ISO-8859-2, etc. Authors should start such files with something
1427- else, e.g., a space.
1428-
14291421<p>Authors using an <span class="index-inst">@charset</span> rule must
14301422place the rule at the very beginning of the style sheet, preceded by
14311423no characters. (If a byte order mark is appropriate for the encoding
@@ -1452,6 +1444,27 @@ registry.
14521444<p>User agents must support at least the <span
14531445class="index-inst">UTF-8</span> encoding.
14541446</p>
1447+
1448+ <p>If rule 1 above (an HTTP "charset" parameter or similar) yields a
1449+ character encoding and it is one of UTF-8, UTF-16, UTF-16BE, UTF-16LE,
1450+ UTF-32, UTF-32BE or UTF-32LE, then a BOM, if any, at the start of the
1451+ file overrides that character encoding, as follows:
1452+
1453+ <table>
1454+ <thead>
1455+ <tr><th>First bytes (hexadecimal) <th>Resulting encoding
1456+ <tbody>
1457+ <tr><td>00 00 FE FF <td>UTF-32, big-endian
1458+ <tr><td>FF FE 00 00 <td>UTF-32, little-endian
1459+ <tr><td>FE FF <td>UTF-16, big-endian
1460+ <tr><td>FF FE <td>UTF-16, little-endian
1461+ <tr><td>EF BB BF <td>UTF-8
1462+ </table>
1463+
1464+ <p class=note>Note that, if rule 1 yields UTF-16BE, UTF-16LE, UTF-32BE
1465+ or UTF-32LE, a BOM at the start of the file is an error. (Unicode
1466+ forbids a BOM in such files.)
1467+
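The BOM table and the override condition added in this hunk might be implemented roughly as follows. This is a minimal sketch, not the spec's algorithm: `sniff_bom`, `apply_bom_override`, and the short labels such as `"UTF-32BE"` (standing in for "UTF-32, big-endian") are assumptions made for illustration. The signatures are checked in the table's order, which matters because `FF FE 00 00` begins with the UTF-16 little-endian signature `FF FE`.

```python
from typing import Optional

# Same order as the table: the 4-byte UTF-32 signatures come first so that
# FF FE 00 00 is not mistaken for the 2-byte UTF-16LE signature FF FE.
BOM_TABLE = [
    (b"\x00\x00\xfe\xff", "UTF-32BE"),
    (b"\xff\xfe\x00\x00", "UTF-32LE"),
    (b"\xfe\xff", "UTF-16BE"),
    (b"\xff\xfe", "UTF-16LE"),
    (b"\xef\xbb\xbf", "UTF-8"),
]

# Labels for which a BOM overrides the protocol-level "charset" parameter.
UNICODE_LABELS = {
    "utf-8", "utf-16", "utf-16be", "utf-16le",
    "utf-32", "utf-32be", "utf-32le",
}


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for signature, encoding in BOM_TABLE:
        if data.startswith(signature):
            return encoding
    return None


def apply_bom_override(http_charset: Optional[str], data: bytes) -> Optional[str]:
    """Let a BOM win only when the protocol label is one of the Unicode encodings."""
    if http_charset is not None and http_charset.lower() in UNICODE_LABELS:
        bom_encoding = sniff_bom(data)
        if bom_encoding is not None:
            return bom_encoding
    return http_charset
```

With these definitions, a file served as `UTF-8` that starts with `FE FF` is treated as UTF-16 big-endian, while the same bytes served as `ISO-8859-1` keep the protocol label.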
14551468<p>User agents must ignore any @charset rule not at the beginning of the
14561469style sheet. When user agents detect the character encoding using the
14571470BOM and/or the @charset rule, they should follow these rules: