Commit 90cade1

[css2] First draft of section on encodings and @charset
--HG-- extra : convert_revision : svn%3A73dc7c4b-06e6-40f3-b4f7-9ed1dbc14bfc/trunk%40841
1 parent d902221 commit 90cade1

1 file changed

Lines changed: 85 additions & 32 deletions

css2/syndata.src

@@ -1,6 +1,6 @@
11
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
22
<html lang="en">
3-
<!-- $Id: syndata.src,v 2.0 1998-02-02 18:48:13 bbos Exp $ -->
3+
<!-- $Id: syndata.src,v 2.1 1998-02-10 00:16:10 ijacobs Exp $ -->
44
<HEAD>
55
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
66
<TITLE>CSS2 syntax and basic data types</TITLE>
@@ -922,50 +922,103 @@ o very long title"] {border: double}
922922
A[TITLE="a not so very long title"] {border: double}
923923
</pre>
924924

925+
<H2>CSS document representation</H2>
925926

926-
<H2><a name="css-in-html">CSS embedded in HTML</a></H2>
927+
<P>A CSS style sheet is a sequence of characters from the Universal
928+
Character Set, defined in <a rel="biblioentry"
929+
href="./refs.html#ref-ISO10646">[ISO10646]</a> (see <a
930+
rel="biblioentry" href="./refs.html#ref-HTML40"
931+
class="informref">[HTML40]</a>, Chapter 5, for a discussion of
932+
character sets and character encodings). For transmission and
933+
storage, these characters must be <span class="index-def"
934+
title="character encoding">encoded</span> by a character encoding that
935+
supports the ASCII character set and in which the ASCII characters
936+
are encoded as themselves (e.g., ISO 8859-1 or Shift_JIS).
927937

928-
<P> CSS style sheets may be embedded in HTML documents, and to be able
929-
to hide style sheets from older UAs, it is convenient put the style
930-
sheets inside HTML comments. Please consult <a rel="biblioentry"
931-
href="./refs.html#ref-HTML40" class="informref">[HTML40]</a> for more
932-
information.
938+
<!-- Bert doesn't agree. What about EBCDIC? -IJ -->
933939

934-
<P>When CSS is embedded in HTML, it shares the <tt>charset</tt>
935-
parameter used to transmit the enclosing HTML document. As with HTML,
936-
the value of the <tt>charset</tt> parameter is used to convert from
937-
the transfer encoding to the document character set, which is
938-
specified by <a rel="biblioentry"
939-
href="./refs.html#ref-ISO10646">[ISO10646]</a>.
940+
<P>When a style sheet is embedded in another document, the style sheet
941+
shares the character encoding of the whole document (which is
942+
determined by the <a href="conform.html#doclanguage">document
943+
language</a>). In HTML, for example, the character encoding may be
944+
specified by HTTP headers or the META element, as in:
940945

941-
<H2><a name="css-by-itself">CSS as a stand-alone file</a></H2>
942-
<!-- Add reference to rfc2045? -->
946+
<pre>
947+
&lt;META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"&gt;
948+
</pre>
949+
950+
951+
<P>When a style sheet resides in a separate file, user agents must
952+
observe the following <span class="index-inst" title="character
953+
encoding::user agent's determination of">priorities</span> when
954+
determining a document's <span class="index-inst" title="character
955+
encoding::default|default::character encoding">character
956+
encoding</span> (from highest priority to lowest):</p>
957+
958+
<ol>
959+
<li>An HTTP "charset" parameter in a "Content-Type" field.
960+
<li>Mechanisms of the language of the
961+
referencing document (e.g., in HTML, the "charset"
962+
attribute of the LINK element).
963+
<li>The <span class="index-def" title="@charset">@charset</span>
964+
at-rule.
965+
</ol>
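
The priority list above can be sketched as a small lookup. This is only an illustration of the ordering in the draft, not code from the specification; the function name `resolve_stylesheet_encoding` and its parameter names are hypothetical, and returning `None` when no source names an encoding is an assumption, since the draft does not define a default.

```python
from typing import Optional


def resolve_stylesheet_encoding(
    http_charset: Optional[str] = None,
    link_charset: Optional[str] = None,
    at_charset: Optional[str] = None,
) -> Optional[str]:
    """Pick the encoding of an external style sheet.

    Sources are checked from highest to lowest priority, following the
    ordered list in the draft:
      1. the HTTP "charset" parameter of the "Content-Type" field,
      2. the referencing document's mechanism (e.g. LINK's "charset"),
      3. the @charset at-rule inside the style sheet.
    """
    for candidate in (http_charset, link_charset, at_charset):
        # Treat missing and empty values alike (an assumption).
        if candidate and candidate.strip():
            return candidate.strip()
    # Assumption: the draft names no default, so report "unknown".
    return None
```

For example, `resolve_stylesheet_encoding(None, "ISO-8859-7", "UTF-8")` picks the LINK value, because no HTTP parameter is present and the LINK attribute outranks `@charset`.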
966+
967+
<!-- This same "general-to-specific" ordering in HTML 4.0 has raised
968+
some eyebrows, but recently received support from the HTML WG
969+
in the html-editors mailing list -IJ -->
970+
971+
<P>When present, only one @charset rule may appear in an external
972+
style sheet -- it must <em>not</em> appear in an embedded style sheet
973+
-- and it must be the first data in the document. After "@charset",
974+
authors specify the name of a character encoding. The name must be a
975+
charset name as described in the <a href="refs.html#ref-IANA"
976+
class="normref">[IANA]</a> registry (see <a rel="biblioentry"
977+
href="./refs.html#ref-CHARSETS" class="informref">[CHARSETS]</a>
978+
for a complete list). For example:</p>
979+
980+
<div class="example"><P>
981+
@charset "ISO-8859-1";
982+
</div>
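
Because the draft requires an ASCII-compatible encoding and requires `@charset` to be the first data in the document, a reader could look for the rule in the raw bytes before decoding anything else. The sketch below illustrates that idea under stated assumptions: the helper name `sniff_at_charset` is hypothetical, and the exact whitespace and the set of characters allowed in the name are guesses, since the draft only shows the form `@charset "ISO-8859-1";`.

```python
import re
from typing import Optional

# Assumed shape of the rule: @charset, whitespace, a double-quoted name,
# optional whitespace, then a semicolon. The draft shows only one example.
_AT_CHARSET = re.compile(rb'@charset\s+"([A-Za-z0-9._:+\-]+)"\s*;')


def sniff_at_charset(data: bytes) -> Optional[str]:
    """Return the name given by a leading @charset rule, if there is one.

    re.match anchors at the start of the data, so a rule that is not the
    very first data in the style sheet is ignored.
    """
    match = _AT_CHARSET.match(data)
    if match is None:
        return None
    return match.group(1).decode("ascii")
```

This sketch does not check the name against the IANA registry; a real user agent would still need to decide whether it supports the named encoding.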
943983

944-
<p>CSS style sheets may exist in files by themselves, being linked
945-
from the document. In this case, the CSS files are served with the
946-
media type <tt>text/css</tt>. As with all text media types, a
947-
<tt>charset</tt> parameter may be added which is used to convert from
948-
the transfer encoding to <a rel="biblioentry"
949-
href="./refs.html#ref-ISO10646">[ISO10646]</a>.
984+
<P>This specification does not mandate which character encodings
985+
a user agent must support.
950986

951-
<h2><a name="char-escapes">Character escapes in CSS</a></h2>
987+
<!-- More examples of good encodings to use? -IJ -->
952988

953-
<p>CSS may need to use characters that are outside the encoding used
954-
to transmit the document. For example, the "class" attribute of HTML
955-
allows more characters in a class name than the set allowed for
956-
selectors above. In CSS2, such characters can be <a
957-
href="#escaped-characters">escaped</a> or written as <a
958-
rel="biblioentry" href="./refs.html#ref-ISO10646">[ISO10646]</a>
959-
numbers.
989+
<!-- Encodings not to use? (cf. HTML 4.0) -IJ -->
990+
991+
<h3>Encoding characters not represented in a character encoding</h3>
992+
993+
<P>A style sheet may have to refer to characters that cannot be
994+
represented in the current character encoding. These characters must
995+
be written as <a href="#escaped-characters">escaped</a> references to
996+
<a rel="biblioentry" href="./refs.html#ref-ISO10646">[ISO10646]</a>
997+
characters.
998+
999+
<P><a href="conform.html#conformance">Conforming user agents</a> must
1000+
correctly map to Unicode all characters in any character encodings
1001+
that they recognize (or they must behave as if they did).
9601002

9611003
<P>For instance, "B&amp;W?" may be written as "B\&amp;W\?" or
9621004
"B\26W\3F". For example, a document transmitted as ISO-8859-1
9631005
(Latin-1) cannot contain Greek letters directly:
9641006
"&#954;&#959;&#965;&#961;&#959;&#962;" (Greek: "kouros") has to be
9651007
written as "\3BA\3BF\3C5\3C1\3BF\3C2". These escapes are thus the CSS
966-
equivalent of numeric character references in HTML or XML documents.
967-
968-
1008+
equivalent of numeric character references in HTML or XML documents
1009+
(see <a rel="biblioentry" href="./refs.html#ref-HTML40">[HTML40]</a>,
1010+
Chapters 5 and 25).
1011+
1012+
<div class="note"><P>
1013+
<em><strong>Note.</strong>
1014+
The character escape mechanism should be used when only a few
1015+
characters must be represented this way. If most of a
1016+
document requires escaping, authors should encode it
1017+
with a more appropriate encoding (e.g., if the document
1018+
contains a lot of Greek characters, authors might use ISO 8859-7
1019+
or UTF-8).
1020+
</em>
1021+
</div>
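
The escape examples in this section ("B\26W\3F" and the Greek "kouros" string) can be checked with a short decoder. This is a minimal sketch, not the tokenizer of any user agent: the name `unescape_css` is hypothetical, and reading up to six hexadecimal digits per escape is an assumption, since this part of the draft does not state a limit. It also does not handle whitespace used to terminate a hexadecimal escape.

```python
import re

# A backslash followed either by 1-6 hex digits (assumed limit) or by any
# single character that is then taken literally.
_ESCAPE = re.compile(r"\\(?:([0-9A-Fa-f]{1,6})|(.))")


def unescape_css(text: str) -> str:
    """Replace CSS backslash escapes with the characters they stand for."""

    def _replace(match: "re.Match[str]") -> str:
        hex_digits, literal = match.groups()
        if hex_digits is not None:
            # Numeric escape: the digits are an ISO 10646 code point.
            # chr() raises ValueError for values above 0x10FFFF.
            return chr(int(hex_digits, 16))
        # Character escape such as \& or \?: keep the character itself.
        return literal

    return _ESCAPE.sub(_replace, text)
```

With this sketch, both `unescape_css(r"B\&W\?")` and `unescape_css(r"B\26W\3F")` give `B&W?`. The second form works because "W" is not a hexadecimal digit, so the escape ends after "26".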
9691022

9701023
</BODY>
9711024
</html>
