[Do not merge yet] Batched breaking changes #117

Closed
wants to merge 7 commits into from
5 changes: 4 additions & 1 deletion Cargo.toml
@@ -14,12 +14,15 @@ build = "build.rs"

exclude = ["src/css-parsing-tests"]

[lib]
doctest = false

[dev-dependencies]
rustc-serialize = "0.3"
tempdir = "0.3"
encoding_rs = "0.3.2"

[dependencies]
encoding = "0.2"
heapsize = {version = ">=0.1.1, <0.4.0", optional = true}
matches = "0.1"
serde = {version = ">=0.6.6, <0.9", optional = true}
2 changes: 1 addition & 1 deletion build.rs
@@ -2,7 +2,7 @@
* License, v. 2.0. If a copy of the MPL was not distributed with this
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

#[macro_use] extern crate quote;
extern crate quote;
extern crate syn;

use std::env;
37 changes: 20 additions & 17 deletions src/css-parsing-tests/README.rst
@@ -4,7 +4,7 @@ CSS parsing tests
This repository contains implementation-independent test for CSS parsers,
based on the 2013 draft of the `CSS Syntax Level 3`_ specification.

.. _CSS Syntax Level 3: http://dev.w3.org/csswg/css-syntax-3/
.. _CSS Syntax Level 3: https://drafts.csswg.org/css-syntax-3/

The upstream repository for these tests is at
https://github.com/SimonSapin/css-parsing-tests
@@ -51,51 +51,51 @@ associated with the expected result.

``component_value_list.json``
Tests `Parse a list of component values
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-list-of-component-values>`_.
<https://drafts.csswg.org/css-syntax-3/#parse-a-list-of-component-values>`_.
The Unicode input is represented by a JSON string,
the output as an array of `component values`_ as described below.

``one_component_value.json``
Tests `Parse a component value
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-component-value>`_.
<https://drafts.csswg.org/css-syntax-3/#parse-a-component-value>`_.
The Unicode input is represented by a JSON string,
the output as a `component value`_.

``declaration_list.json``
Tests `Parse a list of declarations
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-list-of-declarations>`_.
<https://drafts.csswg.org/css-syntax-3/#parse-a-list-of-declarations>`_.
The Unicode input is represented by a JSON string,
the output as an array of declarations_ and at-rules_.

``one_declaration.json``
Tests `Parse a declaration
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-declaration>`_.
<https://drafts.csswg.org/css-syntax-3/#parse-a-declaration>`_.
The Unicode input is represented by a JSON string,
the output as a declaration_.

``one_rule.json``
Tests `Parse a rule
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-rule>`_.
<https://drafts.csswg.org/css-syntax-3/#parse-a-rule>`_.
The Unicode input is represented by a JSON string,
the output as a `qualified rule`_ or at-rule_.

``rule_list.json``
Tests `Parse a list of rules
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-list-of-rules>`_.
<https://drafts.csswg.org/css-syntax-3/#parse-a-list-of-rules>`_.
The Unicode input is represented by a JSON string,
the output as a list of `qualified rules`_ or at-rules_.

``stylesheet.json``
Tests `Parse a stylesheet
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-stylesheet>`_.
<https://drafts.csswg.org/css-syntax-3/#parse-a-stylesheet>`_.
The Unicode input is represented by a JSON string,
the output as a list of `qualified rules`_ or at-rules_.

``stylesheet_bytes.json``
Tests `Parse a stylesheet
<http://dev.w3.org/csswg/css-syntax-3/#parse-a-stylesheet>`_
<https://drafts.csswg.org/css-syntax-3/#parse-a-stylesheet>`_
together with `The input byte stream
<http://dev.w3.org/csswg/css-syntax/#input-byte-stream>`_.
<https://drafts.csswg.org/css-syntax-3/#input-byte-stream>`_.
The input is represented as a JSON object containing:

* A required ``css_bytes``, the input byte string,
@@ -132,16 +132,23 @@ associated with the expected result.
This file is generated by the ``make_color3_keywords.py`` Python script.

``An+B.json``
Tests the `An+B <http://dev.w3.org/csswg/css-syntax/#the-anb-type>`_
Tests the `An+B <https://drafts.csswg.org/css-syntax-3/#the-anb-type>`_
syntax defined in CSS Syntax Level 3.
This `differs <http://dev.w3.org/csswg/css-syntax/#changes>`_ from the
This `differs <https://drafts.csswg.org/css-syntax/#changes>`_ from the
`nth grammar rule <http://www.w3.org/TR/css3-selectors/#nth-child-pseudo>`_
in Selectors Level 3 only in that
``-`` charecters and digits can be escaped in some cases.
``-`` characters and digits can be escaped in some cases.
The Unicode input is represented by a JSON string,
the output as null for invalid syntax,
or an array of two integers ``[A, B]``.

``urange.json``
Tests the `urange <https://drafts.csswg.org/css-syntax-3/#urange>`_
syntax defined in CSS Syntax Level 3.
The Unicode input is represented by a JSON string,
the output as null for invalid syntax,
or an array of two integers ``[start, end]``.


Result representation
=====================
@@ -228,10 +235,6 @@ Component values
the value as a number, the type as the string ``"integer"`` or ``"number"``,
and the unit as a string.

<unicode-range>
Array of length 3: the string ``"unicode-range"``,
followed by the *start* and *end* integers as two numbers.

<include-match>
The string ``"~="``.

74 changes: 0 additions & 74 deletions src/css-parsing-tests/component_value_list.json
@@ -325,80 +325,6 @@
["dimension", "12", 12, "integer", "rêd"]
],

"u+1 U+10 U+100 U+1000 U+10000 U+100000 U+1000000", [
["unicode-range", 1, 1], " ",
["unicode-range", 16, 16], " ",
["unicode-range", 256, 256], " ",
["unicode-range", 4096, 4096], " ",
["unicode-range", 65536, 65536], " ",
["unicode-range", 1048576, 1048576], " ",
["unicode-range", 1048576, 1048576], ["number", "0", 0, "integer"]
],

"u+? u+1? U+10? U+100? U+1000? U+10000? U+100000?", [
["unicode-range", 0, 15], " ",
["unicode-range", 16, 31], " ",
["unicode-range", 256, 271], " ",
["unicode-range", 4096, 4111], " ",
["unicode-range", 65536, 65551], " ",
["unicode-range", 1048576, 1048591], " ",
["unicode-range", 1048576, 1048576], "?"
],

"u+?? U+1?? U+10?? U+100?? U+1000?? U+10000??", [
["unicode-range", 0, 255], " ",
["unicode-range", 256, 511], " ",
["unicode-range", 4096, 4351], " ",
["unicode-range", 65536, 65791], " ",
["unicode-range", 1048576, 1048831], " ",
["unicode-range", 1048576, 1048591], "?"
],

"u+??? U+1??? U+10??? U+100??? U+1000???", [
["unicode-range", 0, 4095], " ",
["unicode-range", 4096, 8191], " ",
["unicode-range", 65536, 69631], " ",
["unicode-range", 1048576, 1052671], " ",
["unicode-range", 1048576, 1048831], "?"
],

"u+???? U+1???? U+10???? U+100????", [
["unicode-range", 0, 65535], " ",
["unicode-range", 65536, 131071], " ",
["unicode-range", 1048576, 1114111], " ",
["unicode-range", 1048576, 1052671], "?"
],

"u+????? U+1????? U+10?????", [
["unicode-range", 0, 1048575], " ",
["unicode-range", 1048576, 2097151], " ",
["unicode-range", 1048576, 1114111], "?"
],

"u+?????? U+1??????", [
["unicode-range", 0, 16777215], " ",
["unicode-range", 1048576, 2097151], "?"
],

"u+20-3F U+100000-2 U+1000000-2 U+10-200000", [
["unicode-range", 32, 63], " ",
["unicode-range", 1048576, 2], " ",
["unicode-range", 1048576, 1048576], ["number", "0", 0, "integer"],
["number", "-2", -2, "integer"], " ",
["unicode-range", 16, 2097152]
],

"ù+12 Ü+12 u +12 U+ 12 U+12 - 20 U+1?2 U+1?-50 U+1- 2", [
["ident", "ù"], ["number", "+12", 12, "integer"], " ",
["ident", "Ü"], ["number", "+12", 12, "integer"], " ",
["ident", "u"], " ", ["number", "+12", 12, "integer"], " ",
["ident", "U"], "+", " ", ["number", "12", 12, "integer"], " ",
["unicode-range", 18, 18], " ", "-", " ", ["number", "20", 20, "integer"], " ",
["unicode-range", 16, 31], ["number", "2", 2, "integer"], " ",
["unicode-range", 16, 31], ["number", "-50", -50, "integer"], " ",
["unicode-range", 1, 1], "-", " ", ["number", "2", 2, "integer"]
],

"~=|=^=$=*=||<!------> |/**/| ~/**/=", [
"~=", "|=", "^=", "$=", "*=", "||", "<!--", ["ident", "----"], ">",
" ", "|", "|", " ", "~", "="
81 changes: 81 additions & 0 deletions src/css-parsing-tests/urange.json
@@ -0,0 +1,81 @@
[

"u+1, U+10, U+100, U+1000, U+10000, U+100000, U+1000000", [
[1, 1],
[16, 16],
[256, 256],
[4096, 4096],
[65536, 65536],
[1048576, 1048576],
null
],

"u+?, u+1?, U+10?, U+100?, U+1000?, U+10000?, U+100000?", [
[0, 15],
[16, 31],
[256, 271],
[4096, 4111],
[65536, 65551],
[1048576, 1048591],
null
],

"u+??, U+1??, U+10??, U+100??, U+1000??, U+10000??", [
[0, 255],
[256, 511],
[4096, 4351],
[65536, 65791],
[1048576, 1048831],
null
],

"u+???, U+1???, U+10???, U+100???, U+1000???", [
[0, 4095],
[4096, 8191],
[65536, 69631],
[1048576, 1052671],
null
],

"u+????, U+1????, U+10????, U+100????", [
[0, 65535],
[65536, 131071],
[1048576, 1114111],
null
],

"u+?????, U+1?????, U+10?????", [
[0, 1048575],
null,
null
],

"u+??????, U+1??????", [
null,
null
],


"u+20-3F, u+3F-3F, u+3F-3E, U+0-110000, U+0-10FFFF, U+100000-2, U+1000000-2, U+10-200000", [
[32, 63],
[63, 63],
null,
null,
[0, 1114111],
null,
null,
null
],

"ù+12, Ü+12, u +12, U+ 12, U+12 - 20, U+1?2, U+1?-50, U+1- 2", [
null,
null,
null,
null,
null,
null,
null,
null
]

]
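
Reviewer note on the expected values in `urange.json`: the `[start, end]` pairs and `null` results above can be reproduced with a small string-level sketch like the one below. This is an approximation for checking the fixture by hand, not the cssparser implementation; the specification defines `urange` parsing on tokens rather than on raw strings. The names `parse_urange` and `parse_hex` are hypothetical, and each call is assumed to receive one already-trimmed item from the comma-separated input.

```rust
/// Hypothetical helper: parse a single urange item such as "U+20-3F" or "u+1??".
/// Returns `None` where the fixture expects `null`.
pub fn parse_urange(input: &str) -> Option<(u32, u32)> {
    const MAX_CODE_POINT: u32 = 0x10FFFF;

    // The item must start with "u+" or "U+", with no whitespace in between.
    let rest = input
        .strip_prefix("u+")
        .or_else(|| input.strip_prefix("U+"))?;

    let (start, end) = match rest.split_once('-') {
        // Explicit range: both sides are 1 to 6 hex digits, no wildcards.
        Some((first, second)) => (parse_hex(first)?, parse_hex(second)?),
        // Single value, optionally followed by trailing `?` wildcards.
        None => {
            if rest.is_empty() || rest.len() > 6 {
                return None;
            }
            let digits = rest.trim_end_matches('?');
            let wildcards = (rest.len() - digits.len()) as u32;
            let prefix = if digits.is_empty() { 0 } else { parse_hex(digits)? };
            // Each `?` stands for one hex digit: 0 for the start, F for the end.
            let shift = 4 * wildcards;
            let low = prefix << shift;
            (low, low | ((1u32 << shift) - 1))
        }
    };

    // Reject ranges past the last Unicode code point and inverted ranges.
    if end > MAX_CODE_POINT || start > end {
        None
    } else {
        Some((start, end))
    }
}

/// Hypothetical helper: parse 1 to 6 ASCII hex digits, rejecting anything else.
fn parse_hex(digits: &str) -> Option<u32> {
    let valid = !digits.is_empty()
        && digits.len() <= 6
        && digits.bytes().all(|b| b.is_ascii_hexdigit());
    if valid {
        u32::from_str_radix(digits, 16).ok()
    } else {
        None
    }
}
```

For instance, `parse_urange("U+10?")` gives `Some((256, 271))`, while `parse_urange("u+3F-3E")` and `parse_urange("U+1?-50")` both give `None`, matching the entries above.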
69 changes: 33 additions & 36 deletions src/from_bytes.rs
@@ -2,14 +2,23 @@
* License, v. 2.0. If a copy of the MPL was not distributed with this
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

use std::cmp;
/// Abstraction for avoiding a dependency from cssparser to an encoding library
pub trait EncodingSupport {
/// One character encoding
type Encoding;

use encoding::label::encoding_from_whatwg_label;
use encoding::all::UTF_8;
use encoding::{EncodingRef, DecoderTrap, decode};
/// https://encoding.spec.whatwg.org/#concept-encoding-get
fn from_label(ascii_label: &[u8]) -> Option<Self::Encoding>;

/// Return the UTF-8 encoding
fn utf8() -> Self::Encoding;

/// Determine the character encoding of a CSS stylesheet and decode it.
/// Whether the given encoding is UTF-16BE or UTF-16LE
fn is_utf16_be_or_le(encoding: &Self::Encoding) -> bool;
}


/// Determine the character encoding of a CSS stylesheet.
///
/// This is based on the presence of a BOM (Byte Order Mark), an `@charset` rule, and
/// encoding meta-information.
@@ -20,48 +29,36 @@ use encoding::{EncodingRef, DecoderTrap, decode};
/// * `environment_encoding`: An optional `Encoding` object for the [environment encoding]
/// (https://drafts.csswg.org/css-syntax/#environment-encoding), if any.
///
/// Returns a 2-tuple of a decoded Unicode string and the `Encoding` object that was used.
pub fn decode_stylesheet_bytes(css: &[u8], protocol_encoding_label: Option<&str>,
environment_encoding: Option<EncodingRef>)
-> (String, EncodingRef) {
/// Returns the encoding to use.
pub fn stylesheet_encoding<E>(css: &[u8], protocol_encoding_label: Option<&[u8]>,
environment_encoding: Option<E::Encoding>)
-> E::Encoding
where E: EncodingSupport {
// https://drafts.csswg.org/css-syntax/#the-input-byte-stream
match protocol_encoding_label {
None => (),
Some(label) => match encoding_from_whatwg_label(label) {
Some(label) => match E::from_label(label) {
None => (),
Some(fallback) => return decode_replace(css, fallback)
Some(protocol_encoding) => return protocol_encoding
}
}
if css.starts_with("@charset \"".as_bytes()) {
// 10 is "@charset \"".len()
// 100 is arbitrary so that no encoding label is more than 100-10 bytes.
match css[10..cmp::min(css.len(), 100)].iter().position(|&b| b == b'"') {
let prefix = b"@charset \"";
if css.starts_with(prefix) {
let rest = &css[prefix.len()..];
match rest.iter().position(|&b| b == b'"') {
None => (),
Some(label_length)
=> if css[10 + label_length..].starts_with("\";".as_bytes()) {
let label = &css[10..10 + label_length];
let label = label.iter().map(|&b| b as char).collect::<String>();
match encoding_from_whatwg_label(&*label) {
Some(label_length) => if rest[label_length..].starts_with(b"\";") {
let label = &rest[..label_length];
match E::from_label(label) {
None => (),
Some(fallback) => match fallback.name() {
"utf-16be" | "utf-16le"
=> return decode_replace(css, UTF_8 as EncodingRef),
_ => return decode_replace(css, fallback),
Some(charset_encoding) => if E::is_utf16_be_or_le(&charset_encoding) {
return E::utf8()
} else {
return charset_encoding
}
}
}
}
}
match environment_encoding {
None => (),
Some(fallback) => return decode_replace(css, fallback)
}
return decode_replace(css, UTF_8 as EncodingRef)
}


#[inline]
fn decode_replace(input: &[u8], fallback_encoding: EncodingRef)-> (String, EncodingRef) {
let (result, used_encoding) = decode(input, DecoderTrap::Replace, fallback_encoding);
(result.unwrap(), used_encoding)
environment_encoding.unwrap_or_else(E::utf8)
}
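
Reviewer note: to illustrate how a downstream crate might plug into the new `EncodingSupport` trait, here is a minimal, self-contained sketch. The trait and `stylesheet_encoding` are restated from the diff above (the function body is condensed with `if let` but should behave the same way). `ToyEncoding` and `ToySupport` are hypothetical names invented for this example; a real caller would presumably wrap an encoding library and support the full WHATWG label table, including whitespace stripping, which this sketch skips.

```rust
/// Trait restated from this PR's src/from_bytes.rs.
pub trait EncodingSupport {
    type Encoding;
    fn from_label(ascii_label: &[u8]) -> Option<Self::Encoding>;
    fn utf8() -> Self::Encoding;
    fn is_utf16_be_or_le(encoding: &Self::Encoding) -> bool;
}

/// Hypothetical stand-in for an encoding library's encoding type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyEncoding {
    Utf8,
    Utf16Be,
    Utf16Le,
    Windows1252,
}

/// Hypothetical implementor; it carries no data because the trait
/// only has associated functions.
pub struct ToySupport;

impl EncodingSupport for ToySupport {
    type Encoding = ToyEncoding;

    fn from_label(ascii_label: &[u8]) -> Option<ToyEncoding> {
        // Simplified: a few labels only, ASCII case-insensitive, no trimming.
        let label = std::str::from_utf8(ascii_label).ok()?.to_ascii_lowercase();
        match label.as_str() {
            "utf-8" | "utf8" => Some(ToyEncoding::Utf8),
            "utf-16be" => Some(ToyEncoding::Utf16Be),
            "utf-16le" | "utf-16" => Some(ToyEncoding::Utf16Le),
            "windows-1252" | "latin1" | "iso-8859-1" => Some(ToyEncoding::Windows1252),
            _ => None,
        }
    }

    fn utf8() -> ToyEncoding {
        ToyEncoding::Utf8
    }

    fn is_utf16_be_or_le(encoding: &ToyEncoding) -> bool {
        matches!(encoding, ToyEncoding::Utf16Be | ToyEncoding::Utf16Le)
    }
}

/// Condensed restatement of `stylesheet_encoding` from the diff.
pub fn stylesheet_encoding<E>(
    css: &[u8],
    protocol_encoding_label: Option<&[u8]>,
    environment_encoding: Option<E::Encoding>,
) -> E::Encoding
where
    E: EncodingSupport,
{
    // 1. A recognized protocol-level label wins outright.
    if let Some(encoding) = protocol_encoding_label.and_then(E::from_label) {
        return encoding;
    }
    // 2. Otherwise look for a leading `@charset "...";` rule.
    let prefix = b"@charset \"";
    if css.starts_with(prefix) {
        let rest = &css[prefix.len()..];
        if let Some(label_length) = rest.iter().position(|&b| b == b'"') {
            if rest[label_length..].starts_with(b"\";") {
                if let Some(encoding) = E::from_label(&rest[..label_length]) {
                    // The bytes were readable as ASCII, so a UTF-16 label
                    // cannot be accurate; fall back to UTF-8.
                    return if E::is_utf16_be_or_le(&encoding) {
                        E::utf8()
                    } else {
                        encoding
                    };
                }
            }
        }
    }
    // 3. Fall back to the environment encoding, then to UTF-8.
    environment_encoding.unwrap_or_else(E::utf8)
}
```

Because the trait has no `self` methods, callers select the implementation with a turbofish, for example `stylesheet_encoding::<ToySupport>(css, None, None)`. Decoding the bytes with the returned encoding is then left to the caller, which is what lets cssparser drop its own encoding dependency.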