[Invalid] Tokenizer error #336

@audioXD

Description

Correction

This is the intended output, as described in 4.3.7. Consume an escaped code point. I just read the spec wrong.

EOF

This is a parse error. Return U+FFFD REPLACEMENT CHARACTER (�).

So please ignore the original report below.
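
For anyone who trips over the same thing, here is a minimal sketch of the 4.3.7 algorithm as I now understand it. It is not cssparser's actual code: the function name and the (char, bytes consumed) return shape are made up for illustration, and it assumes the input has already been preprocessed so that the only whitespace left is space, tab and \n.

```rust
/// Sketch of "consume an escaped code point" (CSS Syntax 4.3.7).
/// `rest` is the (assumed preprocessed) input right after the backslash.
/// Returns the decoded code point and the number of bytes consumed.
/// Hypothetical helper, not part of cssparser.
fn consume_escaped_code_point(rest: &str) -> (char, usize) {
    const REPLACEMENT: char = '\u{FFFD}';
    let mut chars = rest.chars().peekable();
    match chars.next() {
        // EOF right after the backslash: parse error, return U+FFFD.
        None => (REPLACEMENT, 0),
        Some(first) if first.is_ascii_hexdigit() => {
            let mut value = first.to_digit(16).unwrap();
            let mut consumed = 1; // hex digits are ASCII, so one byte each
            // Consume up to 5 more hex digits (6 in total).
            while consumed < 6 {
                match chars.peek().and_then(|c| c.to_digit(16)) {
                    Some(digit) => {
                        value = value * 16 + digit;
                        consumed += 1;
                        chars.next();
                    }
                    None => break,
                }
            }
            // A single whitespace code point after the digits is swallowed too.
            if matches!(chars.peek().copied(), Some(' ' | '\t' | '\n')) {
                consumed += 1;
            }
            // Zero, surrogates and values above U+10FFFF map to U+FFFD;
            // char::from_u32 already rejects the last two.
            let decoded = match value {
                0 => REPLACEMENT,
                v => char::from_u32(v).unwrap_or(REPLACEMENT),
            };
            (decoded, consumed)
        }
        // Any other code point stands for itself.
        Some(other) => (other, other.len_utf8()),
    }
}
```

With this sketch, an empty `rest` (the lone-backslash case from this issue) takes the EOF branch and yields U+FFFD, which matches the Ident("�") that cssparser returns.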

Error while Tokenizing/Parsing

Input: \
Output: Ident("�")
Expected: Delim('\\')

Specification

The CSS3 spec, in 4.3. Tokenizer Algorithms, states:

U+005C REVERSE SOLIDUS (\)

If the input stream starts with a valid escape, reconsume the current input code point, consume an ident-like token, and return it.

Otherwise, this is a parse error. Return a <delim-token> with its value set to the current input code point.
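
The "valid escape" test that this rule depends on only looks at two code points. A minimal sketch, assuming my reading of the spec's "check if two code points are a valid escape" is right; the function names are hypothetical, and I treat \n, \r and form feed as newlines because the sketch does not assume preprocessed input:

```rust
/// Newline code points before preprocessing (assumption: \n, \r, form feed).
fn is_newline(c: char) -> bool {
    matches!(c, '\n' | '\r' | '\x0C')
}

/// Sketch of the two-code-point "valid escape" check.
/// `None` stands for EOF. Hypothetical helper, not part of cssparser.
fn starts_with_valid_escape(first: Option<char>, second: Option<char>) -> bool {
    // Must start with a backslash, and the next code point must not be a newline.
    // EOF is not a newline, so a backslash followed by EOF counts as valid.
    first == Some('\\') && !second.map_or(false, is_newline)
}
```

Note that `starts_with_valid_escape(Some('\\'), None)` is true, so a lone backslash at the end of input goes down the ident-like path rather than producing a <delim-token>.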

NOTE: there may be formatting errors, since I copy-pasted this from the doc.

Test

let mut input = cssparser::ParserInput::new("\\");
let mut parser = cssparser::Parser::new(&mut input);
assert_eq!(parser.next_including_whitespace_and_comments(), Ok(&cssparser::Token::Delim('\\')));

Output:

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `Ok(Ident("�"))`,
 right: `Ok(Delim('\\'))`', examples/error.rs:6:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Cause

In ./src/tokenizer.rs:647, the bug is that it only checks whether the next code point is a newline.

b'\\' => {
    if !tokenizer.has_newline_at(1) { consume_ident_like(tokenizer) }
    else { tokenizer.advance(1); Delim('\\') }
},

Recommendation

Instead of tokenizer.has_newline_at(1), add and call a tokenizer.has_escape() method.
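
A rough sketch of what such a method might look like. MiniTokenizer is a hypothetical stand-in I wrote for this issue, not cssparser's real Tokenizer; only the has_newline_at name is taken from the snippet above, and its body here is my guess at the behavior (\n, \r or form feed at the given offset).

```rust
/// Hypothetical stand-in for the tokenizer; not cssparser's real type.
struct MiniTokenizer<'a> {
    input: &'a [u8],
    position: usize,
}

impl<'a> MiniTokenizer<'a> {
    /// Byte at `position + offset`, or None at EOF.
    fn byte_at(&self, offset: usize) -> Option<u8> {
        self.input.get(self.position + offset).copied()
    }

    /// Assumed behavior: true if the byte at `offset` is \n, \r or form feed.
    fn has_newline_at(&self, offset: usize) -> bool {
        matches!(self.byte_at(offset), Some(b'\n' | b'\r' | b'\x0C'))
    }

    /// Proposed check: a backslash that is not followed by a newline.
    fn has_escape(&self) -> bool {
        self.byte_at(0) == Some(b'\\') && !self.has_newline_at(1)
    }
}
```

Inside the b'\\' match arm the first byte is already known to be a backslash, so under these assumptions has_escape() gives the same answer as !has_newline_at(1), including for a lone backslash at EOF. That is consistent with the correction at the top: the existing check already follows the spec.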
