- From: Guillaume via GitHub <sysbot+gh@w3.org>
- Date: Fri, 18 Feb 2022 05:29:59 +0000
- To: public-css-archive@w3.org
Thanks, I'm closing this issue then.
Just to make sure I understood correctly, and to explain the issue better than I did in my initial comment (3 days later I had a hard time figuring it out myself, sorry):
```
Input: `svg|*`
1. Match `svg|*` vs. `<complex-selector> = <compound-selector> [<combinator>? <compound-selector>]*`
2. Match `svg|*` vs. ...sub-productions that finally include `<type-selector>`
3. Match `svg|*` vs. `<type-selector> = <wq-name> | <ns-prefix>? '*'`
4. Match `svg|*` vs. `<wq-name> = <ns-prefix>? <ident-token>`
5. Match `svg|*` vs. `<ns-prefix> = [ <ident-token> | '*' ]? '|'`
- result: `svg|`
- resume 4
6. Match `*` vs. `<ident-token>`
- result: fails
- backtrack to 4 and discard the match for `<ns-prefix>?` (omitted)
7. Match `svg|*` vs. `<ident-token>`
- result: `svg`
- resume 4 (end) then 3, then 2, then 1
8. Match `|*` vs. `<combinator>?`
- result: fails but optional, resume 1
9. Match `|*` vs. ...sub-productions that finally include `<type-selector>`
10. Match `|*` vs. `<type-selector> = <wq-name> | <ns-prefix>? '*'`
11. Match `|*` vs. `<wq-name> = <ns-prefix>? <ident-token>`
12. Match `|*` vs. `<ns-prefix> = [ <ident-token> | '*' ]? '|'`
- result: `|` (omitted namespace prefix)
- resume 11
13. Match `*` vs. `<ident-token>`
- result: fails
- backtrack to 10 and discard the match for `<ns-prefix>?`
14. Match `|*` vs. `<ns-prefix> = [ <ident-token> | '*' ]? '|'`
- result: `|` (omitted namespace prefix)
- resume 10
15. Match `*` vs. `'*'`
- result: `*`
Results:
- `<compound-selector>` matches `svg`
- `<combinator>?` is omitted
- `<compound-selector>` matches `|*`
```
But because matching against a CSS grammar is implicitly defined to obey longest-match, instead of settling for the result of step 7 above, the parser should try the second alternative for `<type-selector>` in `<wq-name> | <ns-prefix>? '*'`, even if a match for `<wq-name>` was found. If `<ns-prefix>? '*'` fails, it goes back and returns its initial match for `<wq-name>`, right?
I feel like this is fundamental and probably missing from the spec, as well as [these](https://github.com/w3c/csswg-drafts/issues/2921#issuecomment-902187106) [comments](https://github.com/w3c/csswg-drafts/issues/2921#issuecomment-902975958):
> parsing is non-greedy; if the first branch that starts to match eventually fails, you just move on to the second branch and try again
> Right, that's a backtracking parser vs a greedy/first-match parser. CSS grammars are intended for use with a backtracking parser.
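To make the difference concrete for `svg|*`, here is a minimal Python sketch of a backtracking matcher for the three productions above. It is only an illustration of my understanding, not the algorithm from any spec: tokens are plain strings, `str.isidentifier` is a rough stand-in for `<ident-token>`, and the helper names (`tok`, `seq`, `alt`, `opt`, `first_match`, `longest_match`) are made up for this example.

```python
from typing import Callable, Iterator, List, Optional

Tokens = List[str]
# A matcher takes the tokens and a start index and yields every end index
# it can reach, in the order a backtracking parser would try them.
Matcher = Callable[[Tokens, int], Iterator[int]]


def tok(pred: Callable[[str], bool]) -> Matcher:
    """Match a single token satisfying pred."""
    def m(tokens: Tokens, i: int) -> Iterator[int]:
        if i < len(tokens) and pred(tokens[i]):
            yield i + 1
    return m


def seq(*parts: Matcher) -> Matcher:
    """Match parts one after another, backtracking into earlier parts."""
    def m(tokens: Tokens, i: int) -> Iterator[int]:
        if not parts:
            yield i
            return
        rest = seq(*parts[1:])
        for j in parts[0](tokens, i):
            yield from rest(tokens, j)
    return m


def alt(*options: Matcher) -> Matcher:
    """Try each alternative in order (the `|` combinator)."""
    def m(tokens: Tokens, i: int) -> Iterator[int]:
        for option in options:
            yield from option(tokens, i)
    return m


def opt(part: Matcher) -> Matcher:
    """Try part first, then the omitted case (the `?` multiplier)."""
    def m(tokens: Tokens, i: int) -> Iterator[int]:
        yield from part(tokens, i)
        yield i
    return m


# Simplified stand-ins for the CSS tokens used in the trace.
ident = tok(lambda t: t.isidentifier())
star = tok(lambda t: t == "*")
pipe = tok(lambda t: t == "|")

# <ns-prefix> = [ <ident-token> | '*' ]? '|'
ns_prefix = seq(opt(alt(ident, star)), pipe)
# <wq-name> = <ns-prefix>? <ident-token>
wq_name = seq(opt(ns_prefix), ident)
# <type-selector> = <wq-name> | <ns-prefix>? '*'
type_selector = alt(wq_name, seq(opt(ns_prefix), star))


def first_match(m: Matcher, tokens: Tokens) -> Optional[int]:
    """Stop at the first alternative that matches."""
    return next(m(tokens, 0), None)


def longest_match(m: Matcher, tokens: Tokens) -> Optional[int]:
    """Explore every alternative and keep the one consuming the most tokens."""
    return max(m(tokens, 0), default=None)


tokens = ["svg", "|", "*"]
first = first_match(type_selector, tokens)      # consumes only `svg`
longest = longest_match(type_selector, tokens)  # consumes `svg|*`
print(tokens[:first], tokens[:longest])
```

With first-match, `<type-selector>` stops at `svg` (the end of step 7) and leaves `|*` for the next compound selector; with longest-match, the second alternative wins and the whole of `svg|*` is consumed.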
--
GitHub Notification of comment by cdoublev
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/7027#issuecomment-1043924220 using your GitHub account
--
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config
Received on Friday, 18 February 2022 05:30:00 UTC