11<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
22 "http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd">
33<html lang="en">
4- <!-- $Id: syndata.src,v 2.93 2003-12-08 16:23:21 bbos Exp $ -->
4+ <!-- $Id: syndata.src,v 2.94 2004-02-09 17:24:09 bbos Exp $ -->
55<head>
66<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
77<title>Syntax and basic data types</title>
88<meta name="editor" lang="tr" content="Tantek Çelik">
9- <!-- Changed by: Tantek Celik, 2003-08-24 -->
9+ <!-- Changed by: Tantek Celik, 2004-02-05 -->
1010<style type="text/css">
1111 span.colorsquare { float:left; width:5em; height:3em; text-align:center; padding:1.2em 0 .8em }
1212 span.colorname { font-weight:bold }
@@ -130,14 +130,14 @@ in the grammar (to keep it readable), but any number of these tokens
130130may appear anywhere between other tokens.
131131</p>
132132<p>The token S in the grammar above stands for <a
133- name="whitespace">whitespace</a>. Only the characters "space" (Unicode
134- code 32), "tab" (9), "line feed" (10), "carriage return" (13), and
135- "form feed" (12) can occur in whitespace. Other space-like characters,
136- such as "em-space" (8195) and "ideographic space" (12288), are never
137- part of whitespace.
133+ name="whitespace">whitespace</a>. Only the characters "space" (<!--Unicode
134+ code 32-->U+0020), "tab" (U+0009), "line feed" (<!--10-->U+000A), "carriage return" (<!--13-->U+000D), and
135+ "form feed" (<!--12-->U+000C) can occur in whitespace. Other space-like characters,
136+ such as "em-space" (<!--8195-->U+2003) and "ideographic space" (<!--12288-->U+3000), are never part of whitespace.
138137</p>
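A minimal Python sketch of the whitespace rule stated in the new text. The names `CSS_WHITESPACE` and `is_css_whitespace` are illustrative assumptions, not part of the specification. It also shows why a generic Unicode test such as `str.isspace` is too permissive for this purpose.

```python
# The five characters that may occur in CSS 2.1 whitespace, per the paragraph above.
CSS_WHITESPACE = frozenset({
    "\u0020",  # space
    "\u0009",  # tab
    "\u000A",  # line feed
    "\u000D",  # carriage return
    "\u000C",  # form feed
})


def is_css_whitespace(ch: str) -> bool:
    """Return True only for the five characters allowed in CSS whitespace."""
    return ch in CSS_WHITESPACE


if __name__ == "__main__":
    for name, ch in [("space", " "), ("em-space", "\u2003"), ("ideographic space", "\u3000")]:
        # str.isspace() accepts all three; the CSS rule accepts only the first.
        print(name, is_css_whitespace(ch), ch.isspace())
```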
139138<p>The meaning of input that cannot be tokenized or parsed is
140139undefined in CSS 2.1.
140+ </p>
141141
142142<h3><a name="keywords">Keywords</a></h3>
143143
@@ -230,9 +230,12 @@ href="#parsing-errors">rules for handling parsing errors</a>. However, because t
230230 name="value-def-identifier"><dfn>identifiers</dfn></a></span>
231231 (including element names, classes, and IDs in <a
232232 href="selector.html">selectors</a>) can contain only the
233- characters [A-Za-z0-9] and ISO 10646 characters 161 and higher,
234- plus the hyphen (-) and the underscore (_); they cannot start with a
235- digit. They can also contain escaped characters and any ISO 10646
233+ characters [A-Za-z0-9] and ISO 10646 characters <!--161--> U+00A1 and higher,
234+ plus the hyphen (-) and the underscore (_); they cannot start with a digit.
235+ Only properties, values, units, pseudo-classes,
236+ pseudo-elements, and at-rules may start with a hyphen (-); other
237+ identifiers (e.g. element names, classes, or IDs) may not.
238+ Identifiers can also contain escaped characters and any ISO 10646
236239 character as a numeric code (see next item).
237240 <span class="example">For instance, the identifier "B&W?" may
238241 be written as "B\&W\?" or "B\26 W\3F".</span>
@@ -259,15 +262,16 @@ href="#parsing-errors">rules for handling parsing errors</a>. However, because t
259262 <p>Third, backslash escapes allow authors to refer to characters
260263 they can't easily put in a document. In this case, the backslash
261264 is followed by at most six hexadecimal digits (0..9A..F), which
262- stand for the ISO 10646 ([[ISO10646]]) character with
263- that number. If a character in the range [0-9a-zA-Z] follows the hexadecimal number,
265+ stand for the ISO 10646 ([[ISO10646]])
266+ character with that number, which must not be zero.
267+ If a character in the range [0-9a-fA-F] follows the hexadecimal number,
264268 the end of the number needs to be made clear. There are two ways
265269 to do that:
266270 </p>
267271 <ol>
268272 <li>with a space (or other whitespace character): "\26 B" ("&B").
269273 In this case, user agents should treat a "CR/LF" pair
270- (13/10) as a single whitespace character.</li>
274+ (<!-- 13/10-->U+000D/U+000A) as a single whitespace character.</li>
271275 <li>by providing exactly 6 hexadecimal digits: "\000026B" ("&B")</li>
272276 </ol>
273277
@@ -643,6 +647,7 @@ h1 { color: blue }
643647
644648<p>How to handle unparseable and untokenizable stylesheets is
645649undefined in CSS2.1
650+ </p>
646651
647652<h2><a name="values">Values</a></h2>
648653
@@ -791,7 +796,7 @@ on a low-res one"></p>
791796</div>
792797
793798<p>Child elements do not inherit the relative values specified for
794- their parent; they (generally) inherit the <a
799+ their parent; they inherit the <a
795800href="cascade.html#computed-value">computed values</a>.</p>
796801
797802<div class="example"><p>
@@ -830,7 +835,7 @@ h4 { font-size: 1pc } /* picas */
830835</pre>
831836</div>
832837
833- <p> In cases where the computed length cannot be supported, user
838+ <p>In cases where the used length cannot be supported, user
834839agents must approximate it in the <a
835840href="cascade.html#actual-value">actual value.</a>
836841</p>
@@ -1124,9 +1129,11 @@ inside double quotes, unless escaped (e.g., as '\"' or as
11241129</div>
11251130
11261131<p>A string cannot directly contain a <span class="index-inst"
1127- title="newline">newline</span>. To include a newline in a string, use
1128- the escape "\A" (hexadecimal A is the line feed character in Unicode,
1129- but represents the generic notion of "newline" in CSS). See the <span
1132+ title="newline">newline</span>.
1133+ To include a newline in a string, use an escape representing the line feed
1134+ character in Unicode (U+000A), such as "\A" or "\00000a".
1135+ This character represents the generic notion of "newline" in CSS.
1136+ See the <span
11301137class="propinst-content">'content'</span> property for an example.
11311138</p>
11321139<p>It is possible to break strings over several lines, for esthetic
@@ -1165,7 +1172,7 @@ display declaration.
11651172</p>
11661173
11671174
1168- <h2>CSS document representation</h2>
1175+ <h2>CSS style sheet representation</h2>
11691176
11701177<p>A CSS style sheet is a sequence of characters from the Universal
11711178Character Set (see [[ISO10646]]). For transmission and
@@ -1184,7 +1191,7 @@ character encoding of the whole document.
11841191<p>When a style sheet resides in a separate file, user agents must
11851192observe the following <span class="index-inst" title="character
11861193encoding::user agent's determination of">priorities</span> when
1187- determining a document's <span class="index-inst" title="character
1194+ determining a style sheet's <span class="index-inst" title="character
11881195encoding::default|default::character encoding">character
11891196encoding</span> (from highest priority to lowest):
11901197</p>
@@ -1195,75 +1202,65 @@ at-rule.</li>
11951202<li>Mechanisms of the language of the
11961203referencing document (e.g., in HTML, the "charset"
11971204attribute of the LINK element).</li>
1198- <li>UA-dependent mechanisms <ins> (e.g., guessing based on the <a
1199- href="#BOM">BOM</a>)</ins></li>
1205+ <li>UA-dependent mechanisms (e.g., guessing based on the <a
1206+ href="#BOM">BOM</a>)</li>
12001207</ol>
12011208
1202- <del>
1203-
1204- <p>At most one @charset rule may appear in an external style sheet
1205- — it must <em>not</em> appear in an embedded style sheet —
1206- and it must appear at the very start of the document, not preceded by
1207- any characters.
1208-
1209- </del>
1210- <ins>
1211-
12121209<p>At most one @charset rule may appear in an external style sheet and
1213- it must appear at the very start of the document, not preceded by any
1210+ it must appear at the very start of the style sheet, not preceded by any
12141211characters, except possibly a <a href="#BOM">"BOM" (see below)</a>.
12151212Any other @charset rules must be ignored by the UA.
1216-
1217- </ins>
1213+ </p>
12181214
12191215<p>After "@charset", authors specify the name of a character encoding.
12201216The name must be a charset name as described in the IANA registry (See
12211217[[IANA]]. Also, see [[-CHARSETS]] for a complete list of charsets).
12221218For example:
1219+ </p>
12231220
12241221<pre class="example">@charset "ISO-8859-1";</pre>
12251222
12261223<p>This specification does not mandate which character encodings
12271224a user agent must support.
12281225</p>
12291226
1230- <ins cite="http://www.damowmow.com/temp/csswg/css21/issues"
1231- title="issue 44">
1232-
12331227<p id="BOM">If an external style sheet has U+FEFF ("zero width
12341228non-breaking space") as the first character (i.e., even before any
12351229@charset rule), this character is interpreted as a so-called "Byte
12361230Order Mark" (BOM), as follows:
1231+ </p>
1232+
12371233<ul>
12381234<li>If the style sheet is encoded as "UTF-16" [[RFC2781]] or "UTF-32"
1239- [[UNICODE]], the BOM determines the byte order ("big-endian" or
1235+ [[UNICODE]], the BOM determines the byte order (e.g. "big-endian" or
12401236"little-endian") as explained in the cited RFC.
1241-
1237+ </li>
12421238<li>If the style sheet is encoded as anything else, the U+FEFF
12431239character is ignored.
1240+ </li>
12441241</ul>
12451242
12461243<p>An external style sheet <em>should</em> start with a BOM if it is
12471244encoded as "UTF-16" or "UTF-32" and <em>should not</em> have a BOM in
12481245any other encodings.
1246+ </p>
12491247
12501248<p class="note">Note that the BOM can only be ignored if it agrees
12511249with the encoding. E.g., if a style sheet encoded as "UTF-8" starts
12521250with 0xEF 0xBB 0xBF those three bytes are ignored, since they
12531251correctly encode the character U+FEFF in UTF-8. But if a style sheet
12541252encoded as "ISO-8859-1" starts with the two bytes 0xFE 0xFF (the BOM
12551253for big-endian UTF-16), the two bytes are simply interpreted as the
1256- two characters "�" and "�".
1257-
1258- </ins>
1254+ two characters "þ" and "ÿ".
1255+ </p>
12591256
12601257<p class="note">Note that reliance on the @charset construct
12611258theoretically poses a
12621259problem since there is no <em>a priori</em> information on how it is
12631260encoded. In practice, however, the encodings in wide use on the
12641261Internet are either based on ASCII, UTF-16, UCS-4, or (rarely) on
12651262EBCDIC. This means that in general, the initial byte values of a
1266- document enable a user agent to detect the encoding family reliably,
1263+ style sheet enable a user agent to detect the encoding family reliably,
12671264which provides enough information to decode the @charset rule, which
12681265in turn determines the exact character encoding.
12691266</p>
@@ -1281,9 +1278,9 @@ character references in HTML or XML documents (see [[HTML40]],
12811278chapters 5 and 25).
12821279</p>
12831280<p>The character escape mechanism should be used when only a few
1284- characters must be represented this way. If most of a document
1281+ characters must be represented this way. If most of a style sheet
12851282requires escaping, authors should encode it with a more appropriate
1286- encoding (e.g., if the document contains a lot of Greek characters,
1283+ encoding (e.g., if the style sheet contains a lot of Greek characters,
12871284authors might use "ISO-8859-7" or "UTF-8").
12881285</p>
12891286<p>Intermediate processors using a different character encoding may
@@ -1296,7 +1293,7 @@ of an ASCII character.
12961293correctly map to Unicode all characters in any character encodings
12971294that they recognize (or they must behave as if they did).
12981295</p>
1299- <p>For example, a document transmitted as ISO-8859-1
1296+ <p>For example, a style sheet transmitted as ISO-8859-1
13001297(Latin-1) cannot contain Greek letters directly:
13011298"κουρος" (Greek: "kouros") has to be
13021299written as "\3BA\3BF\3C5\3C1\3BF\3C2".
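As a rough illustration of the example above, this Python sketch produces numeric escapes for characters that ISO-8859-1 cannot represent. The function name `escape_for_latin1` is hypothetical. The sketch leaves every Latin-1 character untouched, so it assumes the input holds no backslashes or quotes that would need escaping of their own.

```python
# Characters that would be misread after a hex escape: hex digits would be
# taken as part of the number, and one whitespace character is consumed as
# its terminator.
_HEX_OR_SPACE = frozenset("0123456789abcdefABCDEF \t\r\n\f")


def escape_for_latin1(text: str) -> str:
    """Escape every character above U+00FF as a backslash plus hex digits."""
    out = []
    for i, ch in enumerate(text):
        if ord(ch) <= 0xFF:
            out.append(ch)  # representable in ISO-8859-1, keep as-is
            continue
        out.append(f"\\{ord(ch):X}")
        following = text[i + 1:i + 2]
        # Add a terminating space only when the next literal character would
        # otherwise be absorbed into, or swallowed by, the escape.
        if following and following in _HEX_OR_SPACE:
            out.append(" ")
    return "".join(out)


if __name__ == "__main__":
    print(escape_for_latin1("κουρος"))
```

For "κουρος" no terminators are needed, because each escape is followed either by another backslash or by the end of the string.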