[css-text-3] Discarding Line Breaks Adjacent to Ambiguous Characters #5017

Open
@fantasai

The discussion in #337 has veered off in a wide variety of directions, but @hax originally filed the issue to raise the question of "ambiguous" characters, i.e. those commonly used both within and outside Chinese and Japanese contexts:

https://drafts.csswg.org/css-text-3/#line-break-transform

Otherwise, if the East Asian Width property [UAX11] of both the character before and after the line feed is F, W, or H (not A), and neither side is Hangul, then the segment break is removed.

Under this rule, common uses of quotation marks in Chinese, such as

简体中文的
“引号”
两边不应该有空格。

will have unexpected spaces, because quotation marks are A.
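
The effect can be reproduced with a minimal sketch of the quoted rule. It assumes Python's `unicodedata` data matches the East Asian Width values under discussion, approximates the Hangul test by character name, and ignores every other segment break rule; `is_hangul`, `removes_break`, and `join_lines` are hypothetical helper names, not spec terms.

```python
import unicodedata

WIDE = {"F", "W", "H"}  # East Asian Width values named in the quoted rule


def is_hangul(ch):
    # Approximation (assumption): treat any character whose Unicode name
    # mentions HANGUL as Hangul.
    return "HANGUL" in unicodedata.name(ch, "")


def removes_break(before, after):
    # Quoted rule: both sides must be F, W, or H (not A), and neither Hangul.
    return (
        unicodedata.east_asian_width(before) in WIDE
        and unicodedata.east_asian_width(after) in WIDE
        and not is_hangul(before)
        and not is_hangul(after)
    )


def join_lines(lines):
    # Assumes every line is non-empty; a break that is not removed
    # is rendered as a single space.
    out = lines[0]
    for line in lines[1:]:
        sep = "" if removes_break(out[-1], line[0]) else " "
        out += sep + line
    return out


example = ["简体中文的", "“引号”", "两边不应该有空格。"]
print(join_lines(example))
```

Because `“` and `”` are reported as `A`, both breaks around the quoted word fall through to the space case.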

Ideally, we should consider the language information of the context. If the context is an East Asian language, A should be treated as W. Even in an unknown language context, if one side of the line feed is A and the other side is F, W, or H, the segment break should also be removed.
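
A rough sketch of that suggestion follows. It assumes "East Asian language" means a `zh` or `ja` primary language subtag (that choice is an assumption, not something the issue states), and omits the Hangul exclusion for brevity; `effective_width` and `removes_break` are hypothetical names.

```python
import unicodedata

WIDE = {"F", "W", "H"}
EAST_ASIAN_LANGS = {"zh", "ja"}  # assumption: which languages count


def effective_width(ch, lang=None):
    # In a known East Asian language context, treat Ambiguous as Wide.
    eaw = unicodedata.east_asian_width(ch)
    if eaw == "A" and lang is not None:
        if lang.split("-")[0].lower() in EAST_ASIAN_LANGS:
            return "W"
    return eaw


def removes_break(before, after, lang=None):
    b = effective_width(before, lang)
    a = effective_width(after, lang)
    if b in WIDE and a in WIDE:
        return True
    # Regardless of language: A on one side and F, W, or H on the other.
    return (b == "A" and a in WIDE) or (b in WIDE and a == "A")


print(removes_break("的", "“"))             # A next to W, no language known
print(removes_break("“", "”", "zh-Hans"))   # both A, Chinese context
print(removes_break("“", "”"))              # both A, unknown context
```

Under these assumptions, only the case with ambiguous characters on both sides depends on the language tag.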

We decided to switch to a Unicode Block listing instead of relying on the East Asian Width property (in particular due to some backwards-incompatible changes on Unicode's side). The current draft does not have a concept of ambiguous characters: all characters are strong "discard" or "don't discard", with discarding behavior requiring both sides of the line break to be "discard".

We might want to consider classifying some characters as "ambiguous", particularly symbols and maybe also the few common punctuation marks used in Chinese (double quotes, specifically). These could defer to the character on the other side, and if both are ambiguous, default to "don't discard".
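
The three-way classification could look like the sketch below. The code point ranges are an illustrative subset chosen for the example, not the draft's actual block list, and only the two curly double quotes are marked ambiguous; `Cls`, `classify`, and `removes_break` are hypothetical names.

```python
from enum import Enum


class Cls(Enum):
    DISCARD = "discard"
    KEEP = "keep"
    AMBIGUOUS = "ambiguous"


# Illustrative ranges only (assumption), not the draft's block list.
DISCARD_RANGES = [
    (0x3000, 0x303F),  # CJK Symbols and Punctuation
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
]
AMBIGUOUS_CHARS = {"\u201C", "\u201D"}  # curly double quotes


def classify(ch):
    if ch in AMBIGUOUS_CHARS:
        return Cls.AMBIGUOUS
    cp = ord(ch)
    if any(lo <= cp <= hi for lo, hi in DISCARD_RANGES):
        return Cls.DISCARD
    return Cls.KEEP


def removes_break(before, after):
    b, a = classify(before), classify(after)
    # Both ambiguous: default to "don't discard".
    if b is Cls.AMBIGUOUS and a is Cls.AMBIGUOUS:
        return False
    # One ambiguous: defer to the character on the other side.
    if b is Cls.AMBIGUOUS:
        b = a
    if a is Cls.AMBIGUOUS:
        a = b
    # Otherwise the existing rule applies: both sides must be "discard".
    return b is Cls.DISCARD and a is Cls.DISCARD


print(removes_break("的", "“"))  # quote defers to the ideograph
print(removes_break("a", "“"))   # quote defers to the Latin letter
```

This is the universal variant; a language-dependent variant would need `classify` to take the content language as an extra input.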

Do we want to do this? If so, should it be language-dependent or universal?

Labels: css-text-4, i18n-clreq (Chinese language enablement), i18n-jlreq (Japanese language enablement), i18n-tracker (Group bringing to attention of Internationalization, or tracked by i18n but not needing response)
