Status: ED
ED: http://dev.w3.org/csswg/css-syntax
Shortname: css-syntax
Level: 3
Editor: Tab Atkins Jr., Google, http://xanthir.com/contact/
Editor: Simon Sapin, Mozilla, http://exyr.org/about/
Abstract: This module describes, in general terms, the basic structure and syntax of CSS stylesheets. It defines, in detail, the syntax and parsing of CSS - how to turn a stream of bytes into a meaningful stylesheet.
Ignored Properties: color, animation-timing-function, text-decoration
Ignored Terms:, , , ,
style attribute).
It defines algorithms for converting a stream of codepoints
(in other words, text)
into a stream of CSS tokens,
and then further into CSS objects
such as stylesheets, rules, and declarations.
p > a {
color: blue;
text-decoration: underline;
}
In the above rule, "p > a" is the selector,
which, if the source document is HTML,
selects any <a> elements that are children of a <p> element.
"color: blue;" is a declaration specifying that,
for the elements that match the selector,
their 'color' property should have the value ''blue''.
Similarly, their 'text-decoration' property should have the value ''underline''.
@import "my-styles.css";
The ''@import'' at-rule is a simple statement. After its name, it takes a single string or ''url()'' function to indicate the stylesheet that it should import.
@page :left {
margin-left: 4cm;
margin-right: 3cm;
}
The ''@page'' at-rule consists of an optional page selector (the '':left'' pseudoclass),
followed by a block of properties that apply to the page when printed.
In this way, it's very similar to a normal style rule,
except that its properties don't apply to any "element",
but rather the page itself.
@media print {
body { font-size: 10pt }
}
The ''@media'' at-rule begins with a media type
and a list of optional media queries.
Its block contains entire rules,
which are only applied when the ''@media''s conditions are fulfilled.
An identifier with the value "&B" could be written as ''\26 B'' or ''\000026B''.
A "real" space after the escape sequence must be doubled.
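The escaping rule can be sketched in Python. This is a minimal illustration, not the tokenizer's algorithm. It assumes that a hex escape is a backslash followed by one to six hex digits and at most one terminating whitespace character, and that any other escaped character stands for itself. The name `unescape_ident` is hypothetical.

```python
import re

# Either a hex escape (1-6 hex digits plus at most one terminating
# whitespace character) or a backslash followed by any single character.
# Assumption: the whitespace set is simplified to space, tab, and newline.
_ESCAPE = re.compile(r"\\(?:([0-9a-fA-F]{1,6})[ \t\n]?|(.))", re.DOTALL)


def unescape_ident(text):
    """Decode the escapes in an identifier (hypothetical helper)."""

    def replace(match):
        hex_digits, literal = match.groups()
        if hex_digits is None:
            # A non-hex escape such as "\&" stands for the character itself.
            return literal
        code_point = int(hex_digits, 16)
        # Zero, surrogates, and values past U+10FFFF become U+FFFD.
        if code_point == 0 or 0xD800 <= code_point <= 0xDFFF or code_point > 0x10FFFF:
            return "\uFFFD"
        return chr(code_point)

    return _ESCAPE.sub(replace, text)
```

Because the whitespace after a hex escape is consumed as its terminator, `unescape_ident(r"\26  B")` (two spaces) is needed to obtain an ampersand, a real space, and "B".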
40 63 68 61 72 73 65 74 20 22 (not 22)* 22 3B
then get an encoding for the sequence of
(not 22)* bytes,
decoded per windows-1252.
Note: Anything ASCII-compatible will do, so using windows-1252 is fine.
Note: The byte sequence above,
when decoded as ASCII,
is the string "@charset "…";",
where the "…" is the sequence of bytes corresponding to the encoding's name.
If the return value was utf-16 or utf-16be,
use utf-8 as the fallback encoding;
if it was anything else except failure,
use the return value as the fallback encoding.
Note: This mimics HTML <meta> behavior.
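As a rough Python sketch of this step, the byte pattern can be matched directly against the start of the stream. The function names are hypothetical, Python's `cp1252` codec is assumed to be an acceptable stand-in for windows-1252, and the "get an encoding" lookup itself is assumed to happen elsewhere.

```python
import re

# Bytes 40 63 68 61 72 73 65 74 20 22 (not 22)* 22 3B,
# which in ASCII spell @charset "…";
_CHARSET_PREFIX = re.compile(rb'@charset "([^"]*)";')


def sniff_charset_label(stylesheet_bytes):
    """Return the label named at the very start of the stream, or None."""
    match = _CHARSET_PREFIX.match(stylesheet_bytes)
    if match is None:
        return None
    # Decode the label bytes; unmapped bytes are replaced rather than raising.
    return match.group(1).decode("cp1252", errors="replace")


def choose_fallback(resolved_encoding):
    """Apply the rule above to an already-resolved encoding name.

    resolved_encoding is assumed to be the result of "get an encoding",
    with None standing for failure.
    """
    if resolved_encoding in ("utf-16", "utf-16be"):
        return "utf-8"
    # None means this step produced nothing and the next step applies.
    return resolved_encoding
```

The pattern is anchored at byte zero, so a stylesheet with even a single leading space does not match.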
charset attribute on the <link> element or <?xml-stylesheet?> processing instruction that caused the style sheet to be included, if any.
If that does not return failure,
use the return value as the fallback encoding.
utf-8 as the fallback encoding.
Anne says that steps 3/4 should be an input to this algorithm from the specs that define importing stylesheets, to make the algorithm as a whole cleaner. Perhaps abstract it into the concept of an "environment charset" or something?
Should we only take the charset from the referring document if it's same-origin?
U+D800 to U+DFFF are the surrogate code points.
$s \cdot (i + f \cdot 10^{-d}) \cdot 10^{te}$.
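The formula can be evaluated as in the Python sketch below. The symbol meanings are not defined in this excerpt and are assumptions here: s is the sign, i the integer part, f the fractional digits read as an integer, d the number of fractional digits, t the exponent sign, and e the exponent. The function name and the regular expression are illustrative.

```python
import re
from fractions import Fraction

# sign, integer digits, fractional digits, exponent sign, exponent digits
_NUMBER = re.compile(r"([+-]?)(\d*)(?:\.(\d+))?(?:[eE]([+-]?)(\d+))?")


def css_number_value(text):
    """Evaluate s * (i + f * 10^-d) * 10^(t*e) for a numeric string."""
    match = _NUMBER.fullmatch(text)
    if match is None or not (match.group(2) or match.group(3)):
        raise ValueError("not a number: %r" % text)
    sign, integer, fraction, exp_sign, exponent = match.groups()
    s = -1 if sign == "-" else 1
    i = int(integer) if integer else 0
    f = int(fraction) if fraction else 0
    d = len(fraction) if fraction else 0
    t = -1 if exp_sign == "-" else 1
    e = int(exponent) if exponent else 0
    # Exact rational arithmetic avoids rounding until the final conversion.
    value = s * (i + f * Fraction(10) ** -d) * Fraction(10) ** (t * e)
    return float(value)
```

For "12.5e-1" this gives s = 1, i = 12, f = 5, d = 1, t = -1, e = 1, so the value is (12 + 0.5) · 10⁻¹ = 1.25.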
Should we go ahead and generalize the important flag to be a list of bang values? Suggested by Zack Weinburg.
Declarations are further categorized as "properties" or "descriptors", with the former typically appearing in qualified rules and the latter appearing in at-rules. (This categorization does not occur at the Syntax level; instead, it is a product of where the declaration appears, and is defined by the respective specifications defining the given rule.)
CSSStyleSheet#insertRule method,
and similar functions which might exist,
which parse text into a single rule.
style attribute,
which parses text into the contents of a single style rule.
media HTML attribute.
Examples:
2n+0   /* represents all of the even elements in the list */
even   /* same */
4n+1   /* represents the 1st, 5th, 9th, 13th, etc. elements in the list */
Example:
-n+6 /* represents the first 6 elements of the list */
Examples:
0n+5   /* represents the 5th element in the list */
5      /* same */
1 may be omitted from the rule.
Examples:
The following notations are therefore equivalent:
1n+0   /* represents all elements in the list */
n+0    /* same */
n      /* same */
Examples:
2n+0   /* represents every even element in the list */
2n     /* same */
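The examples above can be checked with a small Python sketch. It assumes the usual reading of the notation: a 1-based index matches when it equals A·n + B for some integer n ≥ 0. Both function names are hypothetical.

```python
def matches_an_plus_b(a, b, index):
    """True if the 1-based index equals a*n + b for some integer n >= 0."""
    if a == 0:
        return index == b
    n, remainder = divmod(index - b, a)
    return remainder == 0 and n >= 0


def matching_indices(a, b, list_length):
    """List the 1-based positions in a list of the given length that match."""
    return [i for i in range(1, list_length + 1) if matches_an_plus_b(a, b, i)]
```

For instance, `matching_indices(-1, 6, 10)` yields the first six positions, in line with the `-n+6` example.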
Valid Examples with white space:
3n + 1
+3n - 2
-n+ 6
+6
Invalid Examples with white space:
3 n
+ 2n
+ 2
<an+b> type
<an+b> =
  odd | even |
  <integer> |

  <n-dimension> |
  '+'?† n |
  -n |

  <ndashdigit-dimension> |
  '+'?† <ndashdigit-ident> |
  <dashndashdigit-ident> |

  <n-dimension> <signed-integer> |
  '+'?† n <signed-integer> |
  -n <signed-integer> |

  <ndash-dimension> <signless-integer> |
  '+'?† n- <signless-integer> |
  -n- <signless-integer> |

  <n-dimension> ['+' | '-'] <signless-integer> |
  '+'?† n ['+' | '-'] <signless-integer> |
  -n ['+' | '-'] <signless-integer>
where:
<n-dimension> is a <<
<ndash-dimension> is a <<
<ndashdigit-dimension> is a <<
<ndashdigit-ident> is an <<
<dashndashdigit-ident> is an <<
<integer> is a <<
<signed-integer> is a <<
<signless-integer> is a <<
†: When a plus sign (+) precedes an ident starting with "n", as in the cases marked above, there must be no whitespace between the two tokens, or else the tokens do not match the above grammar.
The clauses of the production are interpreted as follows:
<integer>
<n-dimension>
'+'? n
-n
<ndashdigit-dimension>
'+'? <ndashdigit-ident>
<dashndashdigit-ident>
<n-dimension> <signed-integer>
'+'? n <signed-integer>
-n <signed-integer>
<ndash-dimension> <signless-integer>
'+'? n- <signless-integer>
-n- <signless-integer>
<n-dimension> ['+' | '-'] <signless-integer>
'+'? n ['+' | '-'] <signless-integer>
-n ['+' | '-'] <signless-integer>
<foo> refers to the "foo" grammar term,
assumed to be defined elsewhere.
Substituting the <foo> for its definition results in a semantically identical grammar.
Several types of tokens are written literally, without quotes:
:), <<,), <<;), <<<(>>>, <<<)>>>, <<<{>>>, and <<<}>>>s.
Although it is possible, with escaping,
to construct an <<
For example, qualified rules inside ''@media'' rules [[CSS3-CONDITIONAL]] are style rules,
but qualified rules inside ''@keyframes'' rules are not [[CSS3-ANIMATIONS]].
Thanks for feedback and contributions from
David Baron,
呂康豪 (Kang-Hao Lu),
and Marc O'Morain.
@ or ends with (,
such a token is not an <<'+'.
Similarly, the <<<[>>> and <<<]>>>s must be written in single quotes,
as they're used by the syntax of the grammar itself to group clauses.
<<translateX( <
However, the stylesheet may end with the function unclosed, like:
.foo { transform: translate(50px
The CSS parser parses this as a style rule containing one declaration,
whose value is a function named "translate".
This matches the above grammar,
even though the ending token didn't appear in the token stream,
because by the time the parser is finished,
the presence of the ending token is no longer possible to determine;
all you have is the fact that there's a block and a function.
Defining Block Contents: the <
The CSS parser is agnostic as to the contents of blocks,
such as those that come at the end of some at-rules.
Defining the generic grammar of the blocks in terms of tokens is non-trivial,
but there are dedicated and unambiguous algorithms defined for parsing this.
The <declaration-list> production represents a list of declarations.
It may only be used in grammars as the sole value in a block,
and represents that the contents of the block must be parsed using the consume a list of declarations algorithm.
Similarly, the <rule-list> production represents a list of rules,
and may only be used in grammars as the sole value in a block.
It represents that the contents of the block must be parsed using the consume a list of rules algorithm.
Finally, the <stylesheet> production represents a list of rules.
It is identical to <@font-face { <
This is a complete and sufficient definition of the rule's grammar.
For another example,
''@keyframes'' rules are more complex,
interpreting their prelude as a name and containing keyframes rules in their block.
Their grammar is:
@keyframes <
!important is automatically invalid on any descriptors.
If the rule accepts properties,
the spec for the rule must define whether the properties interact with the cascade,
and with what specificity.
If they don't interact with the cascade,
properties containing !important are automatically invalid;
otherwise using !important is valid and has its usual effect on the cascade origin of the property.
<
Keyframe rules, then,
must further define that they accept as declarations all animatable CSS properties,
plus the 'animation-timing-function' property,
but that they do not interact with the cascade.
@media <
It additionally defines a restriction that the <
CSS stylesheets
To parse a CSS stylesheet,
first parse a stylesheet.
Interpret all of the resulting top-level qualified rules as style rules, defined below.
If any style rule is invalid,
or any at-rule is not recognized or is invalid according to its grammar or context,
it's a parse error.
Discard that rule.
Style rules
A style rule is a qualified rule
that associates a selector list [[!SELECT]]
with a list of property declarations.
They are also called
rule sets in [[!CSS21]].
CSS Cascading and Inheritance [[!CSS3CASCADE]] defines how the declarations inside of style rules participate in the cascade.
The prelude of the qualified rule is parsed as a
selector list.
If this results in an invalid selector list,
the entire style rule is invalid.
The content of the qualified rule’s block is parsed as a
list of declarations.
Unless defined otherwise by another specification or a future level of this specification,
at-rules in that list are invalid
and must be ignored.
Declarations for an unknown CSS property,
or whose value does not match the syntax defined by the property, are invalid
and must be ignored.
The validity of the style rule’s contents has no effect on the validity of the style rule itself.
Unless otherwise specified, property names are ASCII case-insensitive.
Note: The names of Custom Properties [[CSS-VARIABLES]] are case-sensitive.
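A minimal Python sketch of how a style rule's declarations might be filtered under these rules. The `(name, value)` pair representation, the `is_valid` callback, and the function name are all hypothetical. Custom property names are assumed to be recognizable by a leading `--`.

```python
import string

# ASCII-only lowercasing: non-ASCII letters are left untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def filter_declarations(declarations, is_valid):
    """Drop invalid declarations; the style rule itself stays valid.

    declarations: list of (name, value) pairs (hypothetical representation).
    is_valid: callable(name, value) -> bool, False for unknown properties
              or values that do not match the property's syntax.
    """
    kept = []
    for name, value in declarations:
        # Assumption: custom property names start with "--" and keep their case.
        canonical = name if name.startswith("--") else name.translate(_ASCII_LOWER)
        if is_valid(canonical, value):
            kept.append((canonical, value))
    return kept
```

Ignoring a declaration only removes it from the result. Nothing here ever marks the enclosing rule as invalid.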
Qualified rules at the top-level of a CSS stylesheet are style rules.
Qualified rules in other contexts may or may not be style rules,
as defined by the context.
The ''@charset'' Rule
The ''@charset'' rule is a very special at-rule associated with determining the character encoding of the stylesheet.
In general, its grammar is:
<at-charset-rule> = @charset <
Additionally, an ''@charset'' rule is invalid if it is not at the top-level of a stylesheet,
or if it is not the very first rule of a stylesheet.
''@charset'' rules have an encoding,
given by the value of the <@charset "XXX";,
where XXX is a sequence of bytes other than 22 (ASCII for ").
While this resembles an ''@charset'' rule,
it's not actually the same thing.
For example, the necessary sequence of bytes will spell out something entirely different
if the stylesheet is in an encoding that's not ASCII-compatible,
such as UTF-16.
Serialization
This specification does not define how to serialize CSS in general,
leaving that task to the CSSOM and individual feature specifications.
However, there is one important facet that must be specified here regarding comments,
to ensure accurate "round-tripping" of data from text to CSS objects and back.
The tokenizer described in this specification does not produce tokens for comments,
or otherwise preserve them in any way.
Implementations may preserve the contents of comments and their location in the token stream.
If they do, this preserved information must have no effect on the parsing step,
but must be serialized in its position as
"/*"
followed by its contents
followed by "*/".
If the implementation does not preserve comments,
it must insert the text "/**/" between the serialization of adjacent tokens
when the two tokens are of the following pairs:
Note: The preceding pairs of tokens can only be adjacent due to comments in the original text,
so the above rule reinserts the minimum number of comments into the serialized text to ensure an accurate round-trip.
(Roughly. The <<
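The comment-reinsertion rule for implementations that do not preserve comments could be sketched in Python as follows. The list of token pairs is not reproduced in this excerpt, so the sketch takes it as a caller-supplied predicate. The `(type, text)` token representation and all names are hypothetical.

```python
def serialize_tokens(tokens, needs_comment):
    """Join token texts, inserting "/**/" between flagged adjacent pairs.

    tokens: list of (token_type, text) pairs (hypothetical representation).
    needs_comment: callable(previous_type, next_type) -> bool standing in
                   for the specification's list of token pairs.
    """
    parts = []
    previous_type = None
    for token_type, text in tokens:
        if previous_type is not None and needs_comment(previous_type, token_type):
            parts.append("/**/")
        parts.append(text)
        previous_type = token_type
    return "".join(parts)
```

With a predicate that flags two adjacent identifiers (chosen purely for illustration), the tokens `a` and `b` serialize as `a/**/b` rather than merging into the single identifier `ab`.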
Serializing <an+b>
To serialize an <an+b> value,
let s initially be the empty string:
Return s.
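The individual serialization steps are not reproduced in this excerpt, so the Python sketch below is only one plausible reading and should not be taken as the normative algorithm. It assumes a zero A serializes as B alone, a coefficient of 1 or -1 is written without the digit, and a positive B takes an explicit plus sign. The function name is hypothetical.

```python
def serialize_an_plus_b(a, b):
    """Serialize integer coefficients A and B (assumed steps, see lead-in)."""
    if a == 0:
        # Assumption: with no "n" term, only B is written.
        return str(b)
    if a == 1:
        s = "n"
    elif a == -1:
        s = "-n"
    else:
        s = "%dn" % a
    if b > 0:
        s += "+%d" % b
    elif b < 0:
        # The minus sign of a negative B already separates the terms.
        s += str(b)
    return s
```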
Changes from CSS 2.1 and Selectors Level 3
This section is non-normative.
Note: The point of this spec is to match reality;
changes from CSS 2.1 are nearly always because CSS 2.1 specified something that doesn't match actual browser behavior,
or left something unspecified.
If some detail doesn't match browsers,
please let me know
as it's almost certainly unintentional.
Changes in decoding from a byte stream:
Tokenization changes:
Parsing changes:
An+B changes from Selectors Level 3 [[SELECT]]:
Acknowledgments