[css-text-3] Discarding Line Breaks Adjacent to Ambiguous Characters #5017

The discussion in #337 has veered off in a wide variety of directions, but @hax originally filed the issue to raise the question of "ambiguous" characters, i.e. those which are commonly used both within and outside Chinese and Japanese contexts:

> https://drafts.csswg.org/css-text-3/#line-break-transform
> 
> > Otherwise, if the East Asian Width property [UAX11] of both the character before and after the line feed is F, W, or H (not A), and neither side is Hangul, then the segment break is removed.
> 
> Under this rule, common uses of quotation marks in Chinese such as
>
> ```
> 简体中文的
> “引号”
> 两边不应该有空格。
> ```
> 
> will have unexpected spaces, because quotation marks are _A_.
> 
> Ideally, we should consider the language information of the context. If the context is an East Asian language, _A_ should be treated as _W_. Even in an unknown language context, if either side of the line feed is _A_ and the other side is _F_, _W_ or _H_, the segment break should also be removed.
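
To make the reported behavior concrete, here is a minimal Python sketch of the East Asian Width rule quoted above, applied to @hax's example. The helper names (`is_hangul`, `removes_break_eaw`, `join_lines`) are hypothetical, the Hangul check is a rough approximation based on character names rather than the script property, and a preserved segment break is simply rendered as a space.

```python
import unicodedata

def is_hangul(ch):
    # Approximation (assumption): treat any character whose Unicode name
    # mentions HANGUL as Hangul. A real implementation would use the
    # Script property, which Python's unicodedata does not expose.
    return "HANGUL" in unicodedata.name(ch, "")

def removes_break_eaw(before, after):
    # Quoted rule: both neighbors must be F, W, or H (not A),
    # and neither side may be Hangul.
    wide = {"F", "W", "H"}
    return (
        unicodedata.east_asian_width(before) in wide
        and unicodedata.east_asian_width(after) in wide
        and not is_hangul(before)
        and not is_hangul(after)
    )

def join_lines(lines, removes_break):
    # Join non-empty lines; a kept segment break becomes a single space.
    out = lines[0]
    for line in lines[1:]:
        sep = "" if removes_break(out[-1], line[0]) else " "
        out += sep + line
    return out

example = ["简体中文的", "“引号”", "两边不应该有空格。"]
print(unicodedata.east_asian_width("“"))       # A
print(join_lines(example, removes_break_eaw))  # 简体中文的 “引号” 两边不应该有空格。
```

Because both quotation marks are _A_, neither segment break is removed, which yields the unexpected spaces described in the report.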

We decided to switch to a Unicode Block listing instead of relying on the East Asian Width property (in particular due to some backwards-incompatible changes on Unicode's side). The current draft does not have a concept of ambiguous characters: every character is classified as strictly "discard" or "don't discard", and a line break is discarded only when the characters on both sides of it are "discard".
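
As a rough illustration of the two-class model, the sketch below classifies characters by Unicode block and discards a break only when both neighbors are "discard". The block ranges are real Unicode blocks, but the selection in `DISCARD_BLOCKS` is an illustrative assumption and not the list in the draft; `is_discard` and `removes_break` are hypothetical names.

```python
# Illustrative subset only (assumption): NOT the draft's actual block list.
DISCARD_BLOCKS = [
    (0x3000, 0x303F),  # CJK Symbols and Punctuation
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
]

def is_discard(ch):
    # A character is "discard" if its code point falls in a listed block.
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in DISCARD_BLOCKS)

def removes_break(before, after):
    # Two-class rule: both sides of the line break must be "discard".
    return is_discard(before) and is_discard(after)

print(removes_break("的", "两"))  # True
print(removes_break("的", "“"))   # False: U+201C is in General Punctuation
```

Under this model the Chinese double quotes still block discarding, since they live in a block shared with non-CJK text.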

We might want to consider classifying some characters as "ambiguous", particularly symbols and maybe also the few common punctuation marks used in Chinese (double quotes, specifically). These could defer to the character on the other side, and if both are ambiguous, default to "don't discard".
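
One possible reading of this proposal, sketched in Python: an ambiguous character defers to the class of its neighbor, and two ambiguous characters default to "don't discard". The `BreakClass` enum, the block selections, and treating all of General Punctuation as ambiguous are assumptions made for illustration only.

```python
from enum import Enum

class BreakClass(Enum):
    DISCARD = "discard"
    KEEP = "don't discard"
    AMBIGUOUS = "ambiguous"

# Illustrative ranges only (assumption), not taken from the draft.
DISCARD_BLOCKS = [(0x3000, 0x303F), (0x3040, 0x30FF), (0x4E00, 0x9FFF)]
AMBIGUOUS_BLOCKS = [(0x2000, 0x206F)]  # General Punctuation, incl. U+201C/U+201D

def in_blocks(ch, blocks):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in blocks)

def classify(ch):
    if in_blocks(ch, DISCARD_BLOCKS):
        return BreakClass.DISCARD
    if in_blocks(ch, AMBIGUOUS_BLOCKS):
        return BreakClass.AMBIGUOUS
    return BreakClass.KEEP

def removes_break(before, after):
    a, b = classify(before), classify(after)
    if a is BreakClass.AMBIGUOUS and b is BreakClass.AMBIGUOUS:
        return False  # both ambiguous: default to "don't discard"
    if a is BreakClass.AMBIGUOUS:
        return b is BreakClass.DISCARD  # defer to the other side
    if b is BreakClass.AMBIGUOUS:
        return a is BreakClass.DISCARD  # defer to the other side
    return a is BreakClass.DISCARD and b is BreakClass.DISCARD

print(removes_break("的", "“"))  # True: the quote defers to the ideograph
print(removes_break("a", "“"))   # False: the quote defers to the Latin letter
```

This sketch is language-independent; a language-dependent variant would instead resolve the ambiguous class from the content language before applying the two-class rule.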

Do we want to do this? If so, should it be language-dependent or universal?
