CSS3 module: Syntax

W3C Working Draft [DATE: 13 August 2003]

This version:: http://www.w3.org/TR/[YEAR]/WD-css3-syntax-[ISODATE]
Latest version:: http://www.w3.org/TR/css3-syntax
Previous version:: http://www.w3.org/TR/2003/WD-css3-syntax-20030813
Editor:: L. David Baron, <dbaron@dbaron.org>
Additional Contributors:: Original CSS2 Authors; Bert Bos (W3C), <bert@w3.org>; Peter Linss (Netscape)

Abstract

This CSS3 module describes the basic structure of CSS style sheets, some of the details of the syntax, and the rules for parsing CSS style sheets. It also describes (in some cases, informatively) how stylesheets can be linked to documents and how those links can be media-dependent. Additional details of the syntax of some parts of CSS described in other modules will be described in those modules. The selectors module has a grammar for selectors. Modules that define properties give the grammar for the values of those properties, in a format described in this document.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/. The latest revision can also be found by following the "Latest Version" link above.

This document is a draft of one of the modules of CSS Level 3 (CSS3). Some parts of the document are derived from the CSS Level 1 and CSS Level 2 recommendations, and those parts are thus relatively stable. However, it is otherwise an early draft, and considerable revision is expected in later drafts, especially in formalization of error handling behavior, the conformance requirements for partial implementations (given the modularization of CSS3), and integration with other CSS3 modules.

This document is a working draft of the CSS working group which is part of the style activity (see summary).

The working group would like to receive feedback: discussion takes place on the (archived) public mailing list www-style@w3.org (see instructions). W3C Members can also send comments directly to the CSS working group.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress. Its publication does not imply endorsement by the W3C membership or the CSS & FP Working Group (members only).

Patent disclosures relevant to CSS may be found on the Working Group's public patent disclosure page.

This document may be available in translations in the future. The English version of this specification is the only normative version.

Dependencies on other modules

This CSS3 module depends on the following other CSS3 modules:

Selectors [[!SELECT]]
CSS3 module: Values & Units [[!CSS3VAL]]
CSS3 module: Cascading & Inheritance [[!CSS3CASCADE]]

It has non-normative (informative) references to the following other CSS3 modules:

CSS3 module: Paged media [[CSS3PAGE]]
CSS3 module: Speech [[CSS3SPEECH]]
Media queries [[MEDIAQ]]
Syntax of CSS rules in HTML's "style" attribute [[CSSSTYLEATTR]]

Introduction

This specification describes the basic syntax of CSS3 and the syntax conventions used in the property definitions spread through the CSS3 modules. The syntax of CSS3 has some error-handling requirements for forward-compatibility, but much of the error-handling behavior depends on the user agent.

CSS style sheet representation

A CSS style sheet is a sequence of characters from the Universal Character Set (see [[!ISO10646]]). For transmission and storage, these characters must be encoded by a character encoding that supports the set of characters available in US-ASCII (e.g., ISO-8859-x or Shift_JIS). A byte order mark (BOM), as described in section 2.7 of [[!UNICODE310]], that begins the sequence of characters should not be considered, for purposes of applying the grammar below, as a part of the style sheet. For a good introduction to character sets and character encodings, please consult the HTML 4.0 specification ([[!HTML40]], chapter 5). See also the XML 1.0 specification ([[XML10]], sections 2.2 and 4.3.3).

When a style sheet is embedded in another document, such as in the STYLE element or "style" attribute of HTML, the style sheet shares the character encoding of the whole document.

When a style sheet resides in a separate file, user agents must observe the following priorities when determining a style sheet's character encoding (from highest priority to lowest):

1. A character encoding specified by a higher level protocol (e.g., the "charset" parameter to the MIME type specified in an HTTP "Content-Type" field). (The HTTP protocol ([[!HTTP11]], section 3.7.1) mentions ISO-8859-1 as a default character encoding when the "charset" parameter is absent from the "Content-Type" header field. In practice, this recommendation has proved useless because some servers don't allow a "charset" parameter to be sent, and others may not be configured to send the parameter. Therefore, user agents must not assume any default value for the "charset" parameter, but must instead look for the @charset rule.)
2. The @charset at-rule.
3. Assume that the style sheet is UTF-8.

Since the third point differs from CSS1 and CSS2, authors should not rely on user agents to assume that style sheets without encoding information are UTF-8 encoded. Authors should specify the encoding using one of the first two methods.
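
The priority order above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `choose_encoding` is hypothetical, the `http_charset` argument stands for whatever the higher level protocol supplied (or `None`), and the @charset rule is assumed to be readable as ASCII-compatible bytes.

```python
import codecs
import re

# Matches an @charset rule at the very start of an ASCII-compatible byte stream.
# Assumption: the rule is written exactly as in this draft's example.
_CHARSET_RULE = re.compile(rb'@charset "([\x21\x23-\x7E]+)";')


def choose_encoding(http_charset, data):
    """Return the encoding name for an external style sheet (hypothetical helper)."""
    # 1. A charset given by a higher level protocol wins.
    if http_charset:
        return http_charset
    # A leading byte order mark is not part of the style sheet.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # 2. Otherwise use the @charset rule, if it is the first thing in the sheet.
    match = _CHARSET_RULE.match(data)
    if match:
        return match.group(1).decode("ascii")
    # 3. Otherwise assume UTF-8.
    return "utf-8"
```

For instance, `choose_encoding(None, b'@charset "ISO-8859-1";\nh1 {}')` yields `"ISO-8859-1"`, while the same bytes served with an explicit HTTP charset yield that HTTP charset instead.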

At most one @charset rule may appear in an external style sheet — it must not appear in an embedded style sheet — and it must appear at the very start of the style sheet, not preceded by any characters (except for the optional Byte Order Mark described above). After "@charset", authors specify the name of a character encoding. The name must be a charset name as described in the IANA registry (See [[!RFC2978]]. Also, see [[CHARSETS]] for a complete list of charsets). For example:

@charset "ISO-8859-1";

This specification does not mandate which character encodings a user agent must support. [Should we require a certain minimal set, such as UTF-8 and UCS2?]

Note that reliance on the @charset construct theoretically poses a problem since there is no a priori information on how it is encoded. In practice, however, the encodings in wide use on the Internet are either based on ASCII, UTF-16, UCS-4, or (rarely) on EBCDIC. This means that in general, the initial byte values of a style sheet enable a user agent to detect the encoding family reliably, which provides enough information to decode the @charset rule, which in turn determines the exact character encoding.
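
As a rough sketch of the detection described above, a user agent might compare the first bytes of the style sheet with the byte pattern that `@charset "` would have in each encoding family. The family labels, the `sniff_family` name, and the choice of `cp037` as a representative EBCDIC code page are assumptions made for this illustration; this draft does not define the algorithm.

```python
import codecs

# (family label, codec able to read the @charset rule, optional BOM bytes)
# All entries are illustrative; cp037 stands in for the EBCDIC family.
FAMILIES = [
    ("UCS-4 big-endian", "utf-32-be", codecs.BOM_UTF32_BE),
    ("UCS-4 little-endian", "utf-32-le", codecs.BOM_UTF32_LE),
    ("UTF-16 big-endian", "utf-16-be", codecs.BOM_UTF16_BE),
    ("UTF-16 little-endian", "utf-16-le", codecs.BOM_UTF16_LE),
    ("ASCII-based", "ascii", codecs.BOM_UTF8),
    ("EBCDIC", "cp037", None),
]


def sniff_family(data):
    """Guess the encoding family from the initial bytes, or return None."""
    prefix = '@charset "'
    for family, codec, bom in FAMILIES:
        encoded = prefix.encode(codec)
        if data.startswith(encoded):
            return family
        if bom is not None and data.startswith(bom + encoded):
            return family
    return None
```

Once the family is known, the @charset rule itself can be decoded with the matching codec to obtain the exact encoding name.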

[Should this specification describe how to handle encoding errors? Can a user agent ignore the @charset rule if it's wrong? What if the user agent does not support the encoding used? Should this specification describe how to handle a @charset rule that specifies a character encoding that is incompatible with the family of encodings used to decode the @charset rule (and BOM) itself?]

Referring to characters not represented in a character encoding

A style sheet may have to refer to characters that cannot be represented in the current character encoding. These characters must be written as escaped references to ISO 10646 characters. These escapes serve the same purpose as numeric character references in HTML or XML documents (see [[!HTML40]], chapters 5 and 25).

The character escape mechanism should be used when only a few characters must be represented this way. If most of a style sheet requires escaping, authors should encode it with a more appropriate encoding (e.g., if the style sheet contains a lot of Greek characters, authors might use "ISO-8859-7" or "UTF-8").

Intermediate processors using a different character encoding may translate these escaped sequences into byte sequences of that encoding. Intermediate processors must not, on the other hand, alter escape sequences that cancel the special meaning of an ASCII character.

Conforming user agents must correctly map to Unicode all characters in any character encodings that they recognize (or they must behave as if they did).

For example, a style sheet transmitted as ISO-8859-1 (Latin-1) cannot contain Greek letters directly: "κουρος" (Greek: "kouros") has to be written as "\3BA\3BF\3C5\3C1\3BF\3C2".
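
A minimal sketch of how an authoring tool might produce such escapes is given below. The helper name `escape_unencodable` is hypothetical. The sketch inserts a terminating space whenever an ASCII letter, digit, or whitespace character follows an escape, and it does not treat backslashes already present in the input specially.

```python
def _encodable(ch, encoding):
    """True if the single character can be represented in the target encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def escape_unencodable(text, encoding):
    """Replace characters the encoding cannot represent with CSS hex escapes."""
    out = []
    after_escape = False
    for ch in text:
        if _encodable(ch, encoding):
            # Terminate the preceding escape with one space when the next
            # character could otherwise be read as part of it (or swallowed).
            if after_escape and ch.isascii() and (ch.isalnum() or ch in " \t\r\n\f"):
                out.append(" ")
            out.append(ch)
            after_escape = False
        else:
            out.append("\\%X" % ord(ch))
            after_escape = True
    return "".join(out)
```

With this sketch, the Greek word above escaped for ISO-8859-1 comes out as `\3BA\3BF\3C5\3C1\3BF\3C2`, and "Dürst" escaped for US-ASCII comes out as `D\FC rst`.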

Note. In HTML 4.0, numeric character references are interpreted in "style" attribute values but not in the content of the STYLE element. Because of this asymmetry, we recommend that authors use the CSS character escape mechanism rather than numeric character references for both the "style" attribute and the STYLE element. For example, we recommend:

<span style="voice-family: D\FC rst">...</span>

rather than:

<span style="voice-family: D&#252;rst">...</span>

The text/css content type

CSS style sheets that exist in separate files are sent over the Internet as a sequence of bytes accompanied by encoding information. The structure of the transmission, termed a message entity, is defined by MIME and HTTP 1.1 (see [[!RFC2045]] and [[!HTTP11]]). A message entity with a content type of "text/css" represents an independent CSS style sheet. The "text/css" content type has been registered by RFC 2318 ([[!RFC2318]]).

General syntax of CSS

This section describes a grammar (and forward-compatible parsing rules) common to any version of CSS (including CSS3). Future versions of CSS will adhere to this core syntax, although they may add additional syntactic constraints.

See the section on characters and case for information on case-sensitivity.

These descriptions are normative.

Characters and case

The following rules always hold:

All CSS style sheets are case-insensitive, except for parts that are not under the control of CSS. For example, the case-sensitivity of values of the HTML attributes "id" and "class", of font names, and of URIs lies outside the scope of this specification. Note in particular that element names are case-insensitive in HTML, but case-sensitive in XML.
In CSS3, identifiers (including element names, classes, and IDs in selectors (see [[!SELECT]] [or is this still true])) can contain only the characters [A-Za-z0-9] and ISO 10646 characters 161 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit or a hyphen followed by a digit. They can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F". (See [[!UNICODE310]] and [[!ISO10646]].)
In CSS3, a backslash (\) character indicates three types of character escapes.

First, inside a string (see [[!CSS3VAL]]), a backslash followed by a newline is ignored (i.e., the string is deemed not to contain either the backslash or the newline).

Second, it cancels the meaning of special CSS characters. Any character (except a hexadecimal digit) can be escaped with a backslash to remove its special meaning. For example, "\"" is a string consisting of one double quote. Style sheet preprocessors must not remove these backslashes from a style sheet since that would change the style sheet's meaning.

Third, backslash escapes allow authors to refer to characters they can't easily put in a style sheet. In this case, the backslash is followed by at most six hexadecimal digits (0..9A..F), which stand for the ISO 10646 ([[!ISO10646]]) character with that number. If a digit or letter follows the hexadecimal number, the end of the number needs to be made clear. There are two ways to do that:
1. with a space (or other whitespace character): "\26 B" ("&B"). In this case, user agents should treat a "CR/LF" pair (13/10) as a single whitespace character.
2. by providing exactly 6 hexadecimal digits: "\000026B" ("&B")
In fact, these two methods may be combined. Only one whitespace character is ignored after a hexadecimal escape. Note that this means that a "real" space after the escape sequence must itself either be escaped or doubled.
Backslash escapes are always considered to be part of an identifier or a string (i.e., "\7B" is not punctuation, even though "{" is, and "\32" is allowed at the start of a class name, even though "2" is not).
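
The three kinds of backslash escape can be illustrated with a small decoder. This is a sketch, not a normative algorithm: the name `css_unescape` is hypothetical, the backslash-newline rule is applied everywhere rather than only inside strings, and a hexadecimal escape above 10FFFF makes Python's `chr` raise `ValueError`.

```python
import re

# One alternative per kind of escape:
#   1. up to six hex digits, optionally followed by one whitespace character
#      (a CR/LF pair counts as one);
#   2. a backslash followed by a newline, which is ignored;
#   3. a backslash followed by any other character, which stands for itself.
_ESCAPE = re.compile(
    r"\\(?:([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?|(\r\n|[\n\r\f])|([\s\S]))"
)


def css_unescape(text):
    """Decode CSS backslash escapes in an identifier or string body."""
    def replace(match):
        if match.group(1) is not None:
            return chr(int(match.group(1), 16))
        if match.group(2) is not None:
            return ""
        return match.group(3)

    return _ESCAPE.sub(replace, text)
```

Both spellings from the text decode to the same identifier: `css_unescape(r"B\&W\?")` and `css_unescape(r"B\26 W\3F")` each give `B&W?`.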

Tokenization

[This needs to be integrated with the selectors module. How should that be done?]

All levels of CSS — level 1, level 2, level 3, and any future levels — use the same core syntax. This allows UAs to parse (though not completely understand) style sheets written in levels of CSS that didn't exist at the time the UAs were created. Designers can use this feature to create style sheets that work with older user agents, while also exercising the possibilities of the latest levels of CSS.

At the lexical level, CSS style sheets consist of a sequence of tokens. Hexadecimal codes (e.g., #x20) refer to ISO 10646 ([[!ISO10646]]). In case of multiple matches, the longest match determines the token.

The following productions are parts of tokens:

[We need something to allow signs on integers. Do we need to go as far as css3-selectors?]

ident	::=	`'-'? nmstart nmchar*`
name	::=	`nmchar+`
nmstart	::=	`[a-zA-Z] \| '_' \| nonascii \| escape`
nonascii	::=	`[#x80-#xD7FF#xE000-#xFFFD#x10000-#x10FFFF]`
unicode	::=	`'\' [0-9a-fA-F]{1,6} wc?`
escape	::=	`unicode \| '\' [#x20-#x7E#x80-#xD7FF#xE000-#xFFFD#x10000-#x10FFFF]`
nmchar	::=	`[a-zA-Z0-9] \| '-' \| '_' \| nonascii \| escape`
num	::=	`[0-9]+ \| [0-9]* '.' [0-9]+`
string	::=	`'"' (stringchar \| "'")* '"' \| "'" (stringchar \| '"')* "'"`
stringchar	::=	`urlchar \| #x20 \| '\' nl`
urlchar	::=	`[#x9#x21#x23-#x26#x27-#x7E] \| nonascii \| escape`
nl	::=	`#xA \| #xD #xA \| #xD \| #xC`
w	::=	`wc*`
wc	::=	`#x9 \| #xA \| #xC \| #xD \| #x20`

The following productions are the complete list of tokens in CSS3:

IDENT	::=	`ident`
ATKEYWORD	::=	`'@' ident`
STRING	::=	`string`
HASH	::=	`'#' name`
NUMBER	::=	`num`
PERCENTAGE	::=	`num '%'`
DIMENSION	::=	`num ident`
URI	::=	`"url(" w (string \| urlchar* ) w ")"`
UNICODE-RANGE	::=	`"U+" [0-9A-F?]{1,6} ('-' [0-9A-F]{1,6})?`
CDO	::=	`"<!--"`
CDC	::=	`"-->"`
S	::=	`wc+`
COMMENT	::=	`"/*" [^*]* '*'+ ([^/] [^*]* '*'+)* "/"`
FUNCTION	::=	`ident '('`
INCLUDES	::=	`"~="`
DASHMATCH	::=	`"\|="`
PREFIXMATCH	::=	`"^="`
SUFFIXMATCH	::=	`"$="`
SUBSTRINGMATCH	::=	`"*="`
CHAR	::=	any other character not matched by the above rules, except for `"` or `'`
BOM	::=	`#xFEFF`
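
The longest-match rule can be illustrated with a small Python tokenizer covering a subset of the tokens above. Everything here is a sketch under assumptions: the regular expressions only approximate the productions (for example, `nonascii` is taken to be any character above #x7F), STRING, URI, COMMENT, and several other tokens are omitted, and the name `tokenize` is hypothetical.

```python
import re

# Approximations of the productions above (assumptions, not normative).
ESCAPE = r"\\(?:[0-9a-fA-F]{1,6}[ \t\r\n\f]?|[^\x00-\x1F\x7F])"
NONASCII = r"[^\x00-\x7F]"
NMSTART = rf"(?:[a-zA-Z_]|{NONASCII}|{ESCAPE})"
NMCHAR = rf"(?:[a-zA-Z0-9_-]|{NONASCII}|{ESCAPE})"
IDENT = rf"-?{NMSTART}{NMCHAR}*"
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"

TOKEN_PATTERNS = [
    (name, re.compile(pattern))
    for name, pattern in [
        ("IDENT", IDENT),
        ("ATKEYWORD", "@" + IDENT),
        ("HASH", "#" + NMCHAR + "+"),
        ("NUMBER", NUM),
        ("PERCENTAGE", NUM + "%"),
        ("DIMENSION", NUM + IDENT),
        ("FUNCTION", IDENT + r"\("),
        ("S", r"[ \t\r\n\f]+"),
    ]
]


def tokenize(text):
    """Split text into (token name, token text) pairs using longest match."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_text = None, ""
        for name, pattern in TOKEN_PATTERNS:
            match = pattern.match(text, pos)
            if match and len(match.group()) > len(best_text):
                best_name, best_text = name, match.group()
        if best_name is None:
            if text[pos] in "\"'":
                raise ValueError("strings are outside the scope of this sketch")
            best_name, best_text = "CHAR", text[pos]
        tokens.append((best_name, best_text))
        pos += len(best_text)
    return tokens
```

For the input `1.5em;` both NUMBER (`1.5`) and DIMENSION (`1.5em`) match at the first character, and the longer DIMENSION match wins, followed by a single-character CHAR token for the semicolon.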

Since any single character other than ' or " that cannot be part of a larger token is a single character token, there cannot be errors in tokenization other than the inability to tokenize an unmatched quotation mark. If at some point it is not possible to continue tokenizing an incoming style sheet, the remainder of the style sheet should be ignored and only the largest initial segment of the style sheet that can be tokenized according to the above rules (that is, the entire style sheet except for the part from the unmatched (single or double) quotation mark to the end) should be used to form the sequence of tokens to be parsed according to the grammar.

[This isn't exactly right. Since the string token can't contain newlines that aren't escaped by backslashes, an untokenizable sequence can occur in the middle of a file. Would it be better to change things so that unmatched quotation marks become single-character tokens and all character streams are tokenizable?]

Grammar

Principles of CSS error handling

All levels of CSS, starting from CSS1, have required that user agents ignore certain types of invalid style sheets in well-defined ways. This allows forward-compatibility, since it allows future extensions to CSS within basic grammatical constraints that will be ignored in well-defined ways by user agents implementing earlier versions of CSS.

Handling of CSS that is not valid CSS3 but is valid according to the forward-compatible syntax requires first determining the beginning and end of the part that is invalid and then handling that part in a specified way. The latter is described in the rules for handling parsing errors. The mechanism for the former is described within the grammar.

The handling of style sheets that do not parse according to the forward-compatible core syntax is not defined by this specification. [Should it be?]

Certain productions within the grammar are error handling points. Every error handling point has a backup production that is to be used if it is not possible to parse the stream of tokens based on the primary production. If the error handling production is represented as prod, then the backup production is represented as FAIL(prod).

[The grammar given in Appendix D of CSS2 still needs to be incorporated into this specification. The editor hopes that it can be done by unifying it with the forward-compatible grammar into a single grammar that describes both the rules for forward-compatible parsing and the syntax of what is currently possible in CSS, but that may not be possible. However, hopefully it will be possible to do this by describing the general grammar in terms of the concepts described in the previous paragraph.]

Excluding the transformation of a production into its backup production, this grammar is LL(1). [We should explain briefly what this means, except that it's probably not true. It's probably just LALR(1).]

The portion of a CSS style sheet that is to be used is the largest initial stream of the tokens resulting from the tokenization process that can be parsed according to the grammar presented in this chapter. (For example, if a brace closing a declaration block [link-ify this] is not present, the declaration block must be ignored since the closing brace is required to satisfy this grammar.) [This might lead to highly unexpected behavior when there's an extra closing brace (etc.). Do we really want this?]

Some of the constraints of CSS are not expressed in the grammar. For example, an @charset rule is not permitted in an embedded style sheet, and a selector that uses a namespace prefix not defined by an @namespace rule is invalid. Unless specified otherwise, violations of these constraints should be handled just as a parsing error would be (by ignoring out to the next backup production).

Style sheets

Below is the core syntax for CSS. Lowercase identifiers represent productions in this grammar, uppercase identifiers represent tokens (see above), and characters in single quotes (') represent CHAR tokens (see above). The sections that follow describe how to use it.

[This might need better integration with the selectors module, although maybe it's ok.]

stylesheet  : [ CDO | CDC | S | statement ]*;
statement   : ruleset | at-rule;
at-rule     : ATKEYWORD S* any* [ block | ';' S* ];
block       : '{' S* [ any | block | ATKEYWORD S* | ';' S* ]* '}' S*;
ruleset     : selector? '{' S* declaration? [ ';' S* declaration? ]* '}' S*;
selector    : any+;
declaration : property ':' S* value;
property    : IDENT S*;
value       : [ any | block | ATKEYWORD S* ]+;
any         : [ IDENT | NUMBER | PERCENTAGE | DIMENSION | STRING
              | CHAR | URI | HASH | UNICODE-RANGE | INCLUDES
              | FUNCTION S* any* ')' | DASHMATCH | '(' S* any* ')'
              | '[' S* any* ']' ] S*;

[The definitions of these productions should be spread below into the prose describing what they mean. Furthermore, they should be combined with the Appendix D grammar from CSS2, perhaps using notation like:

ruleset	::=	...
FAIL(ruleset)	::=	...

]

COMMENT tokens do not occur in the grammar (to keep it readable), but any number of these tokens may appear anywhere between other tokens.

The token S in the grammar above stands for whitespace. Only the characters "space" (Unicode code 32), "tab" (9), "line feed" (10), "carriage return" (13), and "form feed" (12) can occur in whitespace. Other space-like characters, such as "em-space" (8195) and "ideographic space" (12288), are never part of whitespace.

Keywords

Keywords have the form of identifiers. Keywords must not be placed between quotes ("..." or '...'). Thus,

red

is a keyword, but

"red"

is not. (It is a string.) Other illegal examples:

width: "auto";
border: "none";
background: "red";

Statements

A CSS style sheet, for any version of CSS, consists of a list of statements (see the grammar above). There are two kinds of statements: at-rules and rule sets. There may be whitespace around the statements.

At-rules

At-rules start with an at-keyword, an '@' character followed immediately by an identifier (for example, '@import', '@page').

An at-rule consists of everything up to and including the next semicolon (;) or the next block, whichever comes first. A CSS user agent that encounters an unrecognized or misplaced at-rule must ignore the whole of the at-rule and continue parsing after it.

Assume, for example, that a CSS3 parser encounters this style sheet:

@import "subs.css";
h1 { color: blue }
@import "list.css";

The second '@import' is illegal according to CSS3 since '@import' rules must occur before all rules other than '@charset' rules. The CSS3 parser ignores the whole at-rule, effectively reducing the style sheet to:

@import "subs.css";
h1 { color: blue }

In the following example, the second '@import' rule is invalid, since it occurs inside a '@media' block.

@import "subs.css";
@media print {
  @import "print-main.css";
  BODY { font-size: 10pt }
}
h1 {color: blue }
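
The rule that an ignored at-rule extends to the next semicolon or the end of the next block can be sketched at the character level as follows. The helper names `at_rule_end` and `drop_at_rule` are hypothetical, and the sketch assumes that comments have already been removed and that quotation marks are properly matched.

```python
import re

# A quoted string, with backslash escapes, so that ';' or braces inside
# strings are not mistaken for the end of the at-rule.
_STRING = re.compile(r'"(?:[^"\\]|\\.)*"|\'(?:[^\'\\]|\\.)*\'', re.DOTALL)


def at_rule_end(text, start):
    """Return the index just past the at-rule that begins at text[start]."""
    depth = 0
    i = start
    while i < len(text):
        match = _STRING.match(text, i)
        if match:
            i = match.end()
            continue
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                return i + 1  # end of the at-rule's block
        elif ch == ";" and depth == 0:
            return i + 1  # at-rule terminated by a semicolon
        i += 1
    return len(text)


def drop_at_rule(text, start):
    """Return text with the at-rule beginning at start removed."""
    return text[:start] + text[at_rule_end(text, start):]
```

Applied to the first example above, dropping the at-rule that starts at the second '@import' leaves only the first '@import' and the 'h1' rule.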

Blocks

A block starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there may be any characters, except that parentheses (( )), brackets ([ ]) and braces ({ }) must always occur in matching pairs and may be nested. Single (') and double quotes (") must also occur in matching pairs, and characters between them are parsed as a string. See Tokenization above for the definition of a string.

Here is an example of a block. Note that the right brace between the double quotes does not match the opening brace of the block, and that the second single quote is an escaped character, and thus doesn't match the first single quote:

{ causta: "}" + ({7} * '\'') }

Note that the above rule is not valid CSS3, but it is still a block as defined above.
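
A minimal sketch of finding the end of a block, honoring nested pairs, strings, and escapes, is shown below. The function names are hypothetical and comments are not handled.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}


def skip_string(text, i):
    """Given the index of an opening quote, return the index past the string."""
    quote = text[i]
    i += 1
    while i < len(text):
        if text[i] == "\\":
            i += 2  # an escaped character never closes the string
        elif text[i] == quote:
            return i + 1
        else:
            i += 1
    return len(text)


def block_end(text, start):
    """Return the index just past the block opening at text[start], or None."""
    if text[start] != "{":
        raise ValueError("a block starts with a left curly brace")
    stack = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch in "\"'":
            i = skip_string(text, i)
            continue
        if ch == "\\":
            i += 2
            continue
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
            if not stack:
                return i + 1
        i += 1
    return None  # the block is never closed
```

Run on the example block above, the sketch skips the right brace inside the double-quoted string and the escaped single quote, and reports the final right brace as the end of the block.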

Rule sets, declaration blocks, and selectors

A rule set (also called "rule") consists of a selector followed by a declaration block.

A declaration-block (also called a {}-block in the following text) starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there must be a list of zero or more semicolon-separated (;) declarations.

The selector (see the Selectors module [[!SELECT]]) consists of everything up to (but not including) the first left curly brace ({). A selector always goes together with a {}-block. When a user agent can't parse the selector (i.e., it is not valid CSS3), it must ignore the {}-block as well.

CSS3 gives a special meaning to the comma (,) in selectors. However, since it is not known if the comma may acquire other meanings in future versions of CSS, the whole statement should be ignored if there is an error anywhere in the selector, even though the rest of the selector may look reasonable in CSS3.

For example, since the "&" is not a valid token in a CSS3 selector, a CSS3 user agent must ignore the whole second line, and not set the color of H3 to red:

h1, h2 {color: green }
h3, h4 & h5 {color: red }
h6 {color: black }

Here is a more complex example. The first two pairs of curly braces are inside a string, and do not mark the end of the selector. This is a valid CSS3 statement.

p[example="public class foo\
{\
    private int x;\
\
    foo(int x) {\
        this.x = x;\
    }\
\
}"] { color: red }

Note. The \ characters in the above example cause the newlines to be ignored. Newlines can be placed in strings only using the correct numeric character escape. See characters and case above.

Declarations and properties

A declaration is either empty or consists of a property, followed by a colon (:), followed by a value. Around each of these there may be whitespace.

Because of the way selectors work, multiple declarations for the same selector may be organized into semicolon (;) separated groups.

Thus, the following rules:

h1 { font-weight: bold }
h1 { font-size: 2em }
h1 { line-height: 1.2 }
h1 { font-family: Helvetica, Arial, sans-serif }
h1 { font-variant: normal }
h1 { font-style: normal }

are equivalent to:

h1 {
  font-weight: bold;
  font-size: 2em;
  line-height: 1.2;
  font-family: Helvetica, Arial, sans-serif; 
  font-variant: normal;
  font-style: normal
}

A property is an identifier. Any character may occur in the value, but parentheses ("( )"), brackets ("[ ]"), braces ("{ }"), single quotes (') and double quotes (") must come in matching pairs, and semicolons not in strings must be escaped. Parentheses, brackets, and braces may be nested. Inside the quotes, characters are parsed as a string.

The syntax of values is specified separately for each property, but in any case, values are built from identifiers, strings, numbers, lengths, percentages, URIs, colors, angles, times, and frequencies.

A user agent must ignore a declaration with an invalid property name or an invalid value. Every CSS3 property has its own syntactic and semantic restrictions on the values it accepts.

For example, assume a CSS3 parser encounters this style sheet:

h1 { color: red; font-style: 12px }  /* Invalid value: 12px */
p { color: blue;  font-vendor: any;  /* Invalid prop.: font-vendor */
    font-variant: small-caps }
em em { font-style: normal }

The second declaration on the first line has an invalid value '12px'. The second declaration on the second line contains an undefined property 'font-vendor'. The CSS3 parser will ignore these declarations, effectively reducing the style sheet to:

h1 { color: red; }
p { color: blue;  font-variant: small-caps }
em em { font-style: normal }

Comments

Comments begin with the characters "/*" and end with the characters "*/". They may occur anywhere between tokens, and their contents have no influence on the rendering. Comments may not be nested.

CSS also allows the SGML comment delimiters ("<!--" and "-->") in certain places, but they do not delimit CSS comments. They are permitted so that style rules appearing in an HTML source document (in the STYLE element) may be hidden from pre-HTML 3.2 user agents. See the HTML 4.0 specification ([[!HTML40]]) for more information.

Rules for handling parsing errors or unsupported features

[Hopefully (assuming it can be formalized within the rules above) this section will not need so much detail and can be folded into the previous section.]

In some cases, user agents must ignore part of an illegal style sheet. This specification defines ignore to mean that the user agent parses the illegal part according to the grammar above (in order to find its beginning and end), but otherwise acts as if it had not been there.

If a style sheet cannot be parsed according to the grammar above, the user agent must behave the same as it would if the style sheet had the smallest sequence of characters removed from its end that would allow it to be parsed according to the grammar.

To ensure that new properties and new values for existing properties can be added in the future, user agents are required to obey the following rules when they encounter the following scenarios:

Unknown properties. User agents must ignore a declaration with an unknown property. For example, if the style sheet is:
```
h1 { color: red; rotation: 70minutes }
```
the user agent will treat this as if the style sheet had been
```
h1 { color: red }
```
Illegal values. User agents must ignore a declaration with an illegal value. For example:
```
img { float: left }       /* correct CSS3 */
img { float: left here }  /* "here" is not a value of 'float' */
img { background: "red" } /* keywords cannot be quoted in CSS3 */
img { border-width: 3 }   /* a unit must be specified for length values */
```
A CSS3 parser would honor the first rule and ignore the rest, as if the style sheet had been:
```
img { float: left }
img { }
img { }
img { }
```
A user agent conforming to a future CSS specification may accept one or more of the other rules as well.

[A general comment on how to handle negative numbers when disallowed might be useful. It should be a parsing error (and thus ignored). We might want to add additional grammar productions for potentially negative numbers.]

Malformed declarations. User agents must handle unexpected tokens encountered while parsing a declaration by reading until the end of the declaration, while observing the rules for matching pairs of (), [], {}, "", and '', and correctly handling escapes. For example, a malformed declaration may be missing a property, colon (:) or value. The following are all equivalent:


p { color:green }
p { color:green; color }  /* malformed declaration missing ':', value */
p { color:red;   color; color:green }  /* same with expected recovery */
p { color:green; color: } /* malformed declaration missing value */
p { color:red;   color:; color:green } /* same with expected recovery */
p { color:green; color{;color:maroon} } /* unexpected tokens { } */
p { color:red;   color{;color:maroon}; color:green } /* same with recovery */

Invalid at-keywords. User agents must ignore an invalid at-keyword together with everything following it, up to and including the next semicolon (;) or block ({...}), whichever comes first. For example, consider the following:
```
@three-dee {
  @background-lighting {
    azimuth: 30deg;
    elevation: 190deg;
  }
  h1 { color: red }
}
h1 { color: blue }
```
The '@three-dee' at-rule is not part of CSS3. A CSS3 user agent therefore ignores the whole at-rule (up to, and including, the third right curly brace), effectively reducing the style sheet to:
```
h1 { color: blue }
```
Unsupported values. If a UA does not support a particular value, it should ignore that value when parsing style sheets, as if that value were an illegal value. For example:
```
  h3 {
    display: inline;
    display: run-in;
  }
```
A UA that supports the 'run-in' value for the 'display' property will accept the first display declaration and then "write over" that value with the second display declaration. A UA that does not support the 'run-in' value will process the first display declaration and ignore the second display declaration.
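
The declaration-level rules above (unknown properties, illegal or unsupported values, and malformed declarations) can be sketched together in Python. The `SUPPORTED` table is a hypothetical stand-in for a user agent's knowledge of properties and values; it treats each value as a single keyword, which is far simpler than real property grammars. The function names are likewise assumptions, and comments are assumed to have been removed already.

```python
import re

_IDENT = re.compile(r"-?[A-Za-z_][A-Za-z0-9_-]*")
_CLOSERS = {"(": ")", "[": "]", "{": "}"}

# Hypothetical table of the properties and keyword values a UA supports.
SUPPORTED = {
    "color": {"red", "green", "blue", "maroon", "black"},
    "display": {"inline", "block"},
    "float": {"left", "right", "none"},
}


def split_declarations(body):
    """Split a declaration block body at semicolons outside pairs and strings."""
    parts, stack, start, i = [], [], 0, 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            i += 2
            continue
        if ch in "\"'":
            i += 1
            while i < len(body) and body[i] != ch:
                i += 2 if body[i] == "\\" else 1
        elif ch in _CLOSERS:
            stack.append(_CLOSERS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
        elif ch == ";" and not stack:
            parts.append(body[start:i])
            start = i + 1
        i += 1
    parts.append(body[start:])
    return parts


def apply_declarations(body):
    """Return the declarations that survive, later ones overriding earlier ones."""
    result = {}
    for part in split_declarations(body):
        prop, colon, value = part.partition(":")
        prop, value = prop.strip().lower(), value.strip().lower()
        if not colon or not _IDENT.fullmatch(prop):
            continue  # malformed declaration
        if value not in SUPPORTED.get(prop, ()):
            continue  # unknown property, or illegal or unsupported value
        result[prop] = value
    return result
```

Under these assumptions, each of the seven malformed-declaration examples above reduces to a single 'color: green' declaration, and a UA whose table lacks 'run-in' keeps 'display: inline'.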

Partial implementations

CSS3, unlike CSS1 and CSS2, is modular and thus allows for partial implementations. The conformance requirements of some modules may also allow for conformant implementations to implement only part of a module.

Implementations that do not implement a feature of any CSS3 module (whether a property, an at-rule, or a property value) should behave as they would according to the forward-compatible parsing rules had that feature not been part of a known CSS specification.

Vendor-specific extensions

Although proprietary extensions should be avoided in general, there are situations (experiments, implementations of W3C drafts that have not yet reached Candidate Recommendation, intra-nets, debugging, etc.) where it is convenient to add some nonstandard, i.e., proprietary identifiers to a CSS style sheet. It is often hard to predict how long these proprietary extensions will be in use, and hard to keep their names from clashing with new, standard properties or with other proprietary properties. Therefore the CSS Working Group proposes the following simple naming convention. A proprietary name should have a prefix consisting of:

an underscore ("_") or a dash ("-"),
the (possibly abbreviated) name of the company, organization, etc. that created it,
an underscore or a dash.

Some examples (and the companies/organizations that created them):

-moz-box-sizing, -moz-border-radius (The Mozilla Organization)
-wap-accesskey (The WAP Forum)
_xyz-dwiw (hypothetical)

The advantage of the initial dash is that it is not a valid start character for identifiers in CSS, so it is guaranteed never to be used by any current or future level of CSS. CSS-conforming parsers will skip rules that contain identifiers with such a character. [Should the grammar allow '-' as part of identifiers or should it require that vendors who use '-' to begin identifiers extend the grammar in their tokenizer? Currently it does, but perhaps it shouldn't.]

That is also a possible disadvantage: a conforming parser will skip them, so in order to use them, an extended parser is required.

For that reason, the underscore is also proposed. Although it is a valid start character, the CSS Working Group believes it will never define any identifiers that start with that character.
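
As a rough illustration of the naming convention, the following Python sketch extracts the vendor part of a prefixed identifier. The function name `vendor_of` and the set of characters allowed in the vendor name are assumptions made for this example; the convention itself does not restrict them.

```python
import re

# Prefix = ("_" or "-") + vendor name + ("_" or "-"), per the convention above.
# Limiting the vendor name to ASCII letters and digits is an assumption
# of this sketch, not a rule of the convention.
VENDOR_PREFIX = re.compile(r"[-_]([a-z0-9]+)[-_]", re.IGNORECASE)


def vendor_of(identifier):
    """Return the vendor part of a prefixed identifier, or None."""
    match = VENDOR_PREFIX.match(identifier)
    return match.group(1).lower() if match else None
```

With this sketch, `vendor_of("-moz-box-sizing")` yields `"moz"` and `vendor_of("_xyz-dwiw")` yields `"xyz"`, while an unprefixed name such as `"box-sizing"` yields `None`.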

Historical notes

This section is informative, not normative.

Microsoft created a number of proprietary properties for use inside their Microsoft Office product, at a time when there was not yet a consensus in the working group about the naming convention. They chose to prefix properties with "mso-" rather than "-mso-".

At the time of writing, the following prefixes are known to exist:

mso- (Microsoft Corporation)
-moz- (The Mozilla Organization)
-o- (Opera Software)
-atsc- (Advanced Television Systems Committee)
-wap- (The WAP Forum)

Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (see [[!RFC2119]]). However, for readability, these words do not appear in all uppercase letters in this specification.

Definitions

[Some of these need heavy rewriting given modularization. I've avoided substituting CSS3 where it's clearly not going to be appropriate.]

This module and other CSS3 modules rely on the following definitions:

Style sheet: A set of statements that specify presentation of a document.
Style sheets may have three different origins: author, user, and user agent. The interaction of these sources is described in the cascading and inheritance module [[!CSS3CASCADE]].
Valid style sheet: The validity of a style sheet depends on the level of CSS used for the style sheet. All valid CSS1 style sheets are valid CSS 2.1 style sheets, but some changes from CSS1 mean that a few CSS1 style sheets will have slightly different semantics in CSS 2.1. Some features in CSS2 are not part of CSS 2.1, so not all CSS2 style sheets are valid CSS 2.1 style sheets.
A valid CSS 2.1 style sheet must be written according to the grammar of CSS 2.1. Furthermore, it must contain only at-rules, property names, and property values defined in this specification. An illegal (invalid) at-rule, property name, or property value is one that is not valid.
Ignore: This term has two slightly different meanings in this specification. First, a CSS parser must follow certain rules when it discovers unknown or illegal syntax in a style sheet. The parser must then ignore certain parts of the style sheets. The exact rules for which parts must be ignored are given in these sections: Declarations and properties, Rules for handling parsing errors, Unsupported Values, or may be explained in the text where the term "ignore" appears. Second, a user agent may (and, in some cases must) disregard certain properties or values in the style sheet even if the syntax is legal. For example, table-column-group elements cannot have borders around them, so the border properties must be ignored.
Author: An author is a person who writes documents and associated style sheets. An authoring tool generates documents and associated style sheets.
User: A user is a person who interacts with a user agent to view, hear, or otherwise use a document and its associated style sheet. The user may provide a personal style sheet that encodes personal preferences.
User agent (UA): A user agent is any program that interprets a document written in the document language and applies associated style sheets according to the terms of this specification. A user agent may display a document, read it aloud, cause it to be printed, convert it to another format, etc. An HTML user agent is one that supports the HTML 2.x, HTML 3.x, or HTML 4.x specifications. A user agent that supports XHTML [[XHTML10]], but not HTML (as listed in the previous sentence), is not considered an HTML user agent for the purpose of conformance with this specification.

User agent conformance

[This section should contain rules for user style sheet conformance, author style sheet disabling, handling parsing errors, etc., from the CSS2.1 specification.]

Error conditions

In general, this document does not specify error handling behavior for user agents (e.g., how they behave when they cannot find a resource designated by a URI).

However, user agents must observe the rules for handling parsing errors.

Since user agents may vary in how they handle error conditions, authors and users must not rely on specific error recovery behavior.

Style sheet conformance

Authoring tool conformance

Authoring tools may use a modified form of the rules for handling parsing errors: when those rules require that user agents ignore something, authoring tools are not required to ignore it. However, authoring tools should not present such parts of the style sheet to the user in the same way as valid parts of the style sheet. In so far as authoring tools display the application of a style sheet to a document, they are required to meet the user agent conformance rules.

[informative reference to canonicalization proposal?]

Format of property definitions in other modules

Each CSS property definition begins with a summary of key information that resembles the following:

'property-name'

Value:	legal values & syntax
Initial:	initial value
Applies to:	elements this property applies to
Inherited:	whether the property is inherited
Computed Value:	the computed value of the property
Percentages:	how percentage values are interpreted
Media:	which media groups the property applies to

Value

This part specifies the set of valid values for the property. Value types may be designated in several ways:

keyword values (e.g., auto, disc, etc.)
basic data types, which appear between "<" and ">" (e.g., <length>, <percentage>, etc.). In the electronic version of the document, each instance of a basic data type links to its definition.
types that have the same range of values as a property bearing the same name (e.g., <'border-width'>, <'background-attachment'>, etc.). In this case, the type name is the property name (complete with quotes) between "<" and ">" (e.g., <'border-width'>). Such a type does not include the value 'inherit'. In the electronic version of the document, each instance of this type of non-terminal links to the corresponding property definition.
non-terminals that do not share the same name as a property. In this case, the non-terminal name appears between "<" and ">", as in <border-width>. Notice the distinction between <border-width> and <'border-width'>; the latter is defined in terms of the former. The definition of a non-terminal is located near its first appearance in the specification. In the electronic version of the document, each instance of this type of value links to the corresponding value definition.

Other words in these definitions are keywords that must appear literally, without quotes (e.g., red). The slash (/) and the comma (,) must also appear literally.

Values may be arranged as follows:

Several juxtaposed words mean that all of them must occur, in the given order.
A bar (|) separates two or more alternatives: exactly one of them must occur.
A double bar (||) separates two or more options: one or more of them must occur, in any order.
Brackets ([ ]) are for grouping.

Juxtaposition is stronger than the double bar, and the double bar is stronger than the bar. Thus, the following lines are equivalent:

    a b   |   c || d e
  [ a b ] | [ c || [ d e ]]

Every type, keyword, or bracketed group may be followed by one of the following modifiers:

An asterisk (*) indicates that the preceding type, word, or group occurs zero or more times.
A plus (+) indicates that the preceding type, word, or group occurs one or more times.
A question mark (?) indicates that the preceding type, word, or group is optional.
A pair of numbers in curly braces ({A,B}) indicates that the preceding type, word, or group occurs at least A and at most B times.

The following examples illustrate different value types:

Value: N | NW | NE
Value: [ <length> | thick | thin ]{1,4}
Value: [<family-name> , ]* <family-name>
Value: <uri>? <color> [ / <color> ]?
Value: <uri> || <color>
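
To make the notation concrete, here is a minimal Python sketch that checks a value against the second example, `[ <length> | thick | thin ]{1,4}`. The names `is_component` and `matches_value` are hypothetical, and `<length>` is simplified to an unsigned number followed by one of the length units that appear in the lexical scanner of this module.

```python
import re

# Simplified <length>: an unsigned number followed by a length unit.
# Signs and a bare "0" are left out of this sketch.
LENGTH = re.compile(
    r"(?:[0-9]*\.[0-9]+|[0-9]+)(?:em|ex|px|cm|mm|in|pt|pc)",
    re.IGNORECASE,
)


def is_component(token):
    """[ <length> | thick | thin ]: exactly one alternative must occur."""
    return token in ("thick", "thin") or LENGTH.fullmatch(token) is not None


def matches_value(text):
    """[ ... ]{1,4}: the group occurs at least 1 and at most 4 times."""
    tokens = text.split()
    return 1 <= len(tokens) <= 4 and all(is_component(t) for t in tokens)
```

Under these assumptions, `matches_value("1px 2em thick")` holds, while an empty value, a value with five components, or an unlisted keyword such as `medium` is rejected.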

See the values and units [[!CSS3VAL]] module for the formal definitions of the basic values types.

[We need a more formal grammar for parsing of properties. Refer to section on keywords (they can't be quoted).]

`initial` and `inherit` values

In addition to the legal values stated, initial and inherit values are also legal for every property. The meaning of these values is described in the Values & Units [[!CSS3VAL]] and Cascading & Inheritance [[!CSS3CASCADE]] modules.

Initial

This part specifies the property's initial value. If the property is inherited, this is the value that is given to the root element of the document tree. Please consult the Cascading & Inheritance module [[!CSS3CASCADE]] for information about the interaction between style sheet-specified, inherited, and initial values.

Applies to

This part lists the elements to which the property applies. All elements are considered to have all properties, but some properties have no rendering effect on some types of elements. For example, 'white-space' only affects block-level elements.

Inherited

This part indicates whether the value of the property is inherited from an ancestor element by default ("Inherited: yes") or the value of the property is its initial value by default ("Inherited: no"). Please consult the Cascading & Inheritance Module [[!CSS3CASCADE]] for information about the interaction between style sheet-specified, inherited, and initial values.
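
The default described by the "Initial" and "Inherited" lines can be sketched as follows. The function `default_value` is a hypothetical helper for illustration only; the full rules, including values set by style sheets, belong to the Cascading & Inheritance module.

```python
def default_value(inherited, initial, parent_value=None):
    """Value a property takes when no style sheet specifies it.

    inherited: True for "Inherited: yes" properties.
    initial: the property's initial value.
    parent_value: the parent element's value, or None for the root element.
    """
    if inherited and parent_value is not None:
        return parent_value
    # Non-inherited properties, and inherited properties on the root
    # element, fall back to the initial value.
    return initial
```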

Computed value

This part indicates the computed value of the property. The concept of computed value is described in the Cascading & Inheritance Module [[!CSS3CASCADE]]. (It is needed both for inheritance and for the definitions of some DOM properties.)

When the computed value line says "as specified", then for the special 'initial' and 'inherit' values, the computed value is as though the initial value or the inherited value had been specified -- it is not 'initial' or 'inherit' itself. [Check this with the definition of "specified value". It may not be needed.]

[What is the computed value for elements to which the property does not apply? Do some existing inherited properties rely on inheritance through elements to which the property doesn't apply?]

Percentage values

This part indicates how percentages should be interpreted, if they occur in the value of the property. If "N/A" appears here, it means that the property does not accept percentages as values.

Media groups

[Some of this needs to be relaxed to deal with profiles.]

This section is informative, not normative.

This part indicates the media groups for which the property must be implemented by a conforming user agent. Since properties generally apply to several media, the "Applies to media" section of each property definition lists media groups rather than individual media types. User agents must support the property if they support rendering to the media types included in these media groups. Each property applies to all media types in the media groups listed in its definition.

CSS3 defines the following media groups:

continuous or paged.
visual, audio, speech, or tactile.
grid (for character grid devices), or bitmap.
interactive (for devices that allow user interaction), or static (for those that don't).
all (includes all media types)

The following table shows the relationships between media groups and media types:

Relationship between media groups and media types
Media Types	Media Groups
	continuous/paged	visual/audio/speech/tactile	grid/bitmap	interactive/static
braille	continuous	tactile	grid	both
embossed	paged	tactile	grid	static
handheld	both	visual, audio, speech	both	both
print	paged	visual	bitmap	static
projection	paged	visual	bitmap	interactive
screen	continuous	visual, audio	bitmap	both
speech	continuous	speech	N/A	both
tty	continuous	visual	grid	both
tv	both	visual, audio	bitmap	both
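
For readers who want to query the table programmatically, the rows above can be encoded as a small lookup structure. This is a sketch only: the names `MEDIA_TYPES` and `media_types_in` are hypothetical, "both" is expanded to the two values of its column, and "N/A" becomes an empty set.

```python
# One entry per row of the table above; each tuple holds the four group columns.
BOTH_CP = {"continuous", "paged"}
BOTH_GB = {"grid", "bitmap"}
BOTH_IS = {"interactive", "static"}

MEDIA_TYPES = {
    "braille":    ({"continuous"}, {"tactile"}, {"grid"}, BOTH_IS),
    "embossed":   ({"paged"}, {"tactile"}, {"grid"}, {"static"}),
    "handheld":   (BOTH_CP, {"visual", "audio", "speech"}, BOTH_GB, BOTH_IS),
    "print":      ({"paged"}, {"visual"}, {"bitmap"}, {"static"}),
    "projection": ({"paged"}, {"visual"}, {"bitmap"}, {"interactive"}),
    "screen":     ({"continuous"}, {"visual", "audio"}, {"bitmap"}, BOTH_IS),
    "speech":     ({"continuous"}, {"speech"}, set(), BOTH_IS),
    "tty":        ({"continuous"}, {"visual"}, {"grid"}, BOTH_IS),
    "tv":         (BOTH_CP, {"visual", "audio"}, {"bitmap"}, BOTH_IS),
}


def media_types_in(group):
    """Return the sorted media types that belong to a media group."""
    if group == "all":
        return sorted(MEDIA_TYPES)
    return sorted(
        name for name, columns in MEDIA_TYPES.items()
        if any(group in column for column in columns)
    )
```

For example, `media_types_in("tactile")` lists `braille` and `embossed`, the two rows whose second column is tactile.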

Shorthand properties

Some properties are shorthand properties, meaning they allow authors to specify the values of several properties with a single property.

For instance, the 'font' property is a shorthand property for setting 'font-style', 'font-variant', 'font-weight', 'font-size', 'line-height', and 'font-family' all at once.

The syntax of a shorthand property may allow some of the properties that can be specified by that shorthand to be omitted. When such values are omitted from a shorthand form, each "missing" property is assigned its initial value (see the Cascading & Inheritance module [[!CSS3CASCADE]]). The definition of a shorthand property may further say that it resets the definitions of other properties to their initial value.

The multiple declarations of this example:

h1 { 
  font-weight: bold; 
  font-size: 2em;
  line-height: 1.2; 
  font-family: Helvetica, Arial, sans-serif; 
  font-variant: normal;
  font-style: normal;
  font-stretch: normal;
  font-size-adjust: none
}

may be rewritten with a single shorthand property:

h1 { font: bold 2em/1.2 Helvetica, Arial, sans-serif }

In this example, 'font-variant', 'font-stretch', 'font-size-adjust', and 'font-style' take their initial values.
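
The reset behavior of a shorthand can be sketched in Python. The sketch assumes the shorthand value has already been parsed into the longhands it sets; the names `FONT_INITIAL` and `expand_font` are hypothetical. The initial values are those of CSS2, and because the initial 'font-family' depends on the user agent, a placeholder string stands in for it.

```python
# Initial values of the properties set or reset by 'font' in the example above.
FONT_INITIAL = {
    "font-style": "normal",
    "font-variant": "normal",
    "font-weight": "normal",
    "font-size": "medium",
    "line-height": "normal",
    "font-family": "<UA-dependent>",  # placeholder, not a real CSS value
    "font-stretch": "normal",
    "font-size-adjust": "none",
}


def expand_font(specified):
    """Expand an already-parsed 'font' shorthand into longhand values.

    specified: dict of the longhands that the shorthand value actually set.
    Every longhand that was omitted is assigned its initial value.
    """
    unknown = set(specified) - set(FONT_INITIAL)
    if unknown:
        raise ValueError("not set by 'font': " + ", ".join(sorted(unknown)))
    expanded = dict(FONT_INITIAL)
    expanded.update(specified)
    return expanded


# font: bold 2em/1.2 Helvetica, Arial, sans-serif
h1 = expand_font({
    "font-weight": "bold",
    "font-size": "2em",
    "line-height": "1.2",
    "font-family": "Helvetica, Arial, sans-serif",
})
```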

Appendix: Second grammar

[This grammar was the Appendix D grammar of CSS2, augmented by the additions from the @namespace draft. It needs to be incorporated into the above normative text in some way.]

The grammar below defines the syntax of CSS2. It is in some sense, however, a superset of CSS2 as this specification imposes additional semantic constraints not expressed in this grammar. A conforming UA must also adhere to the forward-compatible parsing rules, the property and value notation, and the unit notation. In addition, the document language may impose restrictions, e.g. HTML imposes restrictions on the possible values of the "class" attribute.

Grammar

The grammar below is LL(1) (but note that most UAs should not use it directly, since it doesn't express the forward-compatible parsing conventions, only the CSS3 syntax). The format of the productions is optimized for human consumption and some shorthand notation beyond Yacc (see [[YACC]]) is used:

[It's probably not LL(1), but rather just LALR(1).]

[This needs a lot more revisions to reflect all the additions in CSS3.]

[This doesn't allow nested at-rules, such as @page inside @media. Do we want to allow this?]

*: 0 or more
+: 1 or more
?: 0 or 1
|: separates alternatives
[ ]: grouping

The productions are:

stylesheet
  : [ CHARSET_SYM S* STRING S* ';' ]?
    [S|CDO|CDC]* [ import [S|CDO|CDC]* ]*
    [ namespace [S|CDO|CDC]* ]*
    [ [ ruleset | media | page | font_face ] [S|CDO|CDC]* ]*
  ;
import
  : IMPORT_SYM S*
    [STRING|URI] S* [ medium [ ',' S* medium]* ]? ';' S*
  ;
namespace
  : NAMESPACE_SYM S* [namespace_prefix S*]? [STRING|URI] S* ';' S*
  ;
namespace_prefix
  : IDENT
  ;
media
  : MEDIA_SYM S* medium [ ',' S* medium ]* '{' S* ruleset* '}' S*
  ;
medium
  : IDENT S*
  ;
page
  : PAGE_SYM S* IDENT? pseudo_page? S*
    '{' S* declaration [ ';' S* declaration ]* '}' S*
  ;
pseudo_page
  : ':' IDENT
  ;
font_face
  : FONT_FACE_SYM S*
    '{' S* declaration [ ';' S* declaration ]* '}' S*
  ;
operator
  : '/' S* | ',' S* | /* empty */
  ;
combinator
  : '+' S* | '>' S* | /* empty */
  ;
unary_operator
  : '-' | '+'
  ;
property
  : IDENT S*
  ;
ruleset
  : selector [ ',' S* selector ]*
    '{' S* declaration [ ';' S* declaration ]* '}' S*
  ;
selector
  : simple_selector [ combinator simple_selector ]*
  ;
simple_selector
  : element_name? [ HASH | class | attrib | pseudo ]* S*
  ;
class
  : '.' IDENT
  ;
element_name
  : IDENT | '*'
  ;
attrib
  : '[' S* IDENT S* [ [ '=' | INCLUDES | DASHMATCH ] S*
    [ IDENT | STRING ] S* ]? ']'
  ;
pseudo
  : ':' [ IDENT | FUNCTION S* IDENT S* ')' ]
  ;
declaration
  : property ':' S* expr prio?
  | /* empty */
  ;
prio
  : IMPORTANT_SYM S*
  ;
expr
  : term [ operator term ]*
  ;
term
  : unary_operator?
    [ NUMBER S* | PERCENTAGE S* | LENGTH S* | EMS S* | EXS S* | ANGLE S* |
      TIME S* | FREQ S* | function ]
  | STRING S* | IDENT S* | URI S* | UNICODERANGE S* | hexcolor
  ;
function
  : FUNCTION S* expr ')' S*
  ;
/*
 * There is a constraint on the color that it must
 * have either 3 or 6 hex-digits (i.e., [0-9a-fA-F])
 * after the "#"; e.g., "#000" is OK, but "#abcd" is not.
 */
hexcolor
  : HASH S*
  ;
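
The constraint stated in the comment on the hexcolor production is not expressed by the grammar itself. One way to check it separately is sketched below in Python; the helper name `is_hexcolor` is hypothetical.

```python
import re

# "#" followed by exactly 3 or exactly 6 hex digits, as the comment requires.
HEXCOLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")


def is_hexcolor(token):
    """Return True if a HASH token satisfies the hexcolor constraint."""
    return HEXCOLOR.fullmatch(token) is not None
```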

Lexical scanner

The following is the tokenizer, written in Flex (see [[FLEX]]) notation. The tokenizer is case-insensitive.

The two occurrences of "\377" represent the highest character number that current versions of Flex can deal with (decimal 255). They should be read as "\4177777" (decimal 1114111), which is the highest possible code point in Unicode/ISO-10646, minus the characters that the nonascii production above excludes.

%option case-insensitive

h		[0-9a-f]
nonascii	[\200-\377]
unicode		\\{h}{1,6}[ \t\r\n\f]?
escape		{unicode}|\\[ -~\200-\377]
nmstart		[a-z]|{nonascii}|{escape}
nmchar		[a-z0-9-]|{nonascii}|{escape}
string1		\"([\t !#$%&(-~]|\\{nl}|\'|{nonascii}|{escape})*\"
string2		\'([\t !#$%&(-~]|\\{nl}|\"|{nonascii}|{escape})*\'

ident		[-]?{nmstart}{nmchar}*
name		{nmchar}+
num		[0-9]+|[0-9]*"."[0-9]+
string		{string1}|{string2}
url		([!#$%&*-~]|{nonascii}|{escape})*
w		[ \t\r\n\f]*
nl		\n|\r\n|\r|\f
range		\?{1,6}|{h}(\?{0,5}|{h}(\?{0,4}|{h}(\?{0,3}|{h}(\?{0,2}|{h}(\??|{h})))))

%%

[ \t\r\n\f]+		{return S;}

\/\*[^*]*\*+([^/][^*]*\*+)*\/	/* ignore comments */

"<!--"			{return CDO;}
"-->"			{return CDC;}
"~="			{return INCLUDES;}
"|="			{return DASHMATCH;}

{string}		{return STRING;}

{ident}			{return IDENT;}

"#"{name}		{return HASH;}

"@import"		{return IMPORT_SYM;}
"@page"			{return PAGE_SYM;}
"@media"		{return MEDIA_SYM;}
"@font-face"		{return FONT_FACE_SYM;}
"@charset"		{return CHARSET_SYM;}
"@namespace"		{return NAMESPACE_SYM;}

"!"{w}"important"	{return IMPORTANT_SYM;}

{num}em			{return EMS;}
{num}ex			{return EXS;}
{num}px			{return LENGTH;}
{num}cm			{return LENGTH;}
{num}mm			{return LENGTH;}
{num}in			{return LENGTH;}
{num}pt			{return LENGTH;}
{num}pc			{return LENGTH;}
{num}deg		{return ANGLE;}
{num}rad		{return ANGLE;}
{num}grad		{return ANGLE;}
{num}ms			{return TIME;}
{num}s			{return TIME;}
{num}Hz			{return FREQ;}
{num}kHz		{return FREQ;}
{num}{ident}		{return DIMEN;}
{num}%			{return PERCENTAGE;}
{num}			{return NUMBER;}

"url("{w}{string}{w}")"	{return URI;}
"url("{w}{url}{w}")"	{return URI;}
{ident}"("		{return FUNCTION;}

U\+{range}		{return UNICODERANGE;}
U\+{h}{1,6}-{h}{1,6}	{return UNICODERANGE;}

.			{return *yytext;}
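
As a reading aid, two of the macros above can be transcribed into Python regular expressions. This is a deliberately reduced sketch: it covers only the ASCII part of `ident` and `num`, omits `{nonascii}` and `{escape}`, and the function name `classify` is hypothetical. It returns `None` for anything outside these two macros, including input that the full scanner would tokenize differently.

```python
import re

# ASCII-only subset of the nmstart and nmchar macros.
NMSTART = r"[a-z]"
NMCHAR = r"[a-z0-9-]"

# ident: [-]?{nmstart}{nmchar}*  (the scanner is case-insensitive)
IDENT = re.compile(rf"-?{NMSTART}{NMCHAR}*", re.IGNORECASE)
# num: [0-9]+|[0-9]*"."[0-9]+
NUM = re.compile(r"[0-9]*\.[0-9]+|[0-9]+")


def classify(text):
    """Return "IDENT" or "NUMBER" if the whole text matches, else None."""
    if IDENT.fullmatch(text):
        return "IDENT"
    if NUM.fullmatch(text):
        return "NUMBER"
    return None
```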

Changes from CSS2

This section is informative.

The parts of CSS2 that have become part of this CSS3 draft are sections 1.3.2 (1.4.2 in CSS 2.1), 1.3.3 (1.4.3 in CSS 2.1), 3 (parts), 4.1, 4.2, 4.4, 6.3, 7, and Appendix D. This draft also contains new material on namespaces and on vendor extensions to CSS that began as separate documents.

The text taken from Chapter 1 of CSS2 is now normative rather than informative.
added the section on vendor extensions and added the '-' character to the ident productions so that identifiers can begin with it
modified the rules for handling parsing errors to allow implementations that support only parts of CSS3
described how style sheets that cannot be parsed according to the grammar should be handled (by saying that parsing uses the parseable initial sequence of tokens)
[to be done] described handling of unmatched quotation marks (by saying that tokenization uses the tokenizable initial sequence of characters)
[to be done] formalized the error handling rules by combining the forward-compatible grammar in chapter 4 with the CSS2 grammar in Appendix D.
combined DELIM and other single character tokens into CHAR
explicitly stated that the byte order mark should be ignored for purposes of the grammar
changed the acceptable characters to exclude surrogate blocks and #xFFFE, #xFFFF, as in XML
changed the rules on what to do when a character encoding is not specified

Acknowledgments

Since most of this draft is taken from CSS2, it would not have been possible to write it so easily without the work of the editors and authors of [[CSS1]] and [[CSS2]]. This draft also borrows heavily from earlier drafts on CSS namespace support by Peter Linss [[CSS3NAMESPACE]] and early (unpublished) drafts on vendor extensions to CSS by Bert Bos. Many current and former members of the working group have contributed to this document. Discussions on www-style@w3.org and in other places have also contributed ideas to this specification. Comments from Glenn Adams, Björn Höhrmann, and Etan Wexler have been particularly helpful.
