-
-
Notifications
You must be signed in to change notification settings - Fork 4.8k
Improve candidate extraction when candidates contain . characters
#17113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve candidate extraction when candidates contain . characters
#17113
Conversation
0f377c1 to
1d7d188
Compare
Top-level "strings" don't have quotes, so if your paragraph contains a quote then it should just be a quote and not be considered a string. This also means that we have to make sure that `<` and `>` are considered brackets for the HTML case.
In Slim you don't need any type of brackets, but we still want `"px-2.5"` to be treated as a string.
1d7d188 to
8a52320
Compare
8a52320 to
462aea7
Compare
Co-authored-by: Philipp Spiess <hello@philippspiess.com>
Unfortunately this still breaks if a double-quote is used in the text. :/ @RobinMalfait have been chatting about this and we'll going to revert the string-based handling of dots inside utilities in pug and instead make it replace dots to space unless:
This way we can avoid thinking about strings in this languages |
Let's not care about the concept of a string, instead let's only replace `.` when it's not inside of a bracket stack _or_ when it's not surrounded by numbers. That means that `.flex.items-center` will become ` flex items-center` but `px-2.5` remains as `px-2.5`. It _does_ mean that some (top-level) strings will be mutated now, but that shouldn't be an issue for our extractor to run.
a8b7a03 to
ff53406
Compare
| daisy_button "😉 Get Started", css: "btn-primary text-xl", | ||
| right_icon: "arrow-right", target: "_blank", | ||
| class: "px-2.5" | ||
| href: "https://github com/profoundry-us/loco_motion locomotion-components" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit silly, but the . and # characters are replaced because:
- The
.is not surrounded by numbers - The
#is replaced because it's not inside of a bracket stack
While this does break the string, it doesn't matter from Tailwind's perspective and since we're not updating the real code, it's not an issue.
. characters
This was from before the new implementation. Removing so we can enable auto-merge.
philipp-spiess
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach a lot more. Having to worry about what these template languages consider strings and not felt like a rabbit-hole where the solution is to have a full parser for those languages. This fix is more surgical for sure 👍

This PR fixes an issue where some classes weren't properly extracted due to some incorrect assumptions in the pre processors.
Templating languages such as Haml, Slim and Pug allow you to write classes in a shorter way that are not properly contained inside of strings. E.g.:
These candidates are not properly extracted because there are no bounding characters like quotes. To solve this, we pre-process it and replace
.withcharacters. This results in something like:However, this has some challenges on its own. Candidates like
px-2.5cannot be written in this shorthand form, instead they need to be in strings. Now we cannot replace the.because otherwise we would changepx-2.5topx-2 5which is wrong.The next problem is that we need to know when they are in a "string". This has another set of problems because these templating languages allow you to write normal text that will eventually be the contents of the HTML tags.
.text-red-500.text-3xl | This text can't should be red ^ Wait, is this the start of a string now???In this example, if we consider the
'the start of a string, when it's clearly not, how would we know it's for sure not a string?This ended up as a bit of a rabbit hole, but we came up with another approach entirely if we think about the original problem we want to solve which is when do we change
.tocharacters.One of the rules in our current extractor is that a
.has to be between 2 numbers. Which works great in a scenario like:px-2.5. However, if you look at Haml or Slim syntax, this is also allowed:In this scenario, a
.is surrounded by numbers so we shouldn't replace it with a space. But as you can see, we clearly do... so we need another heuristic in this case.Luckily, one of the rules in Tailwind CSS is that a utility cannot start with a number, but a variant can. This means that if we see a scenario like
<digit>.<digit>then we can just check if the value after the.is a valid variant or not.In this case it is a valid variant so we do want to replace the
.with aeven though we do have the<digit>.<digit>format.🥴
Test plan