[css-text-3] line breaks and ideographic space #2500

Closed
frivoal opened this issue Apr 4, 2018 · 10 comments
Labels
Commenter Response Pending css-text-3 Current Work i18n-clreq Chinese language enablement i18n-jlreq Japanese language enablement i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. Tested Memory aid - issue has WPT tests Tracked in DoC

Comments

@frivoal
Collaborator

frivoal commented Apr 4, 2018

(reference: https://bugzilla.mozilla.org/show_bug.cgi?id=1450228)

Browsers differ in how they handle line breaks for IDEOGRAPHIC SPACE (U+3000):

  • Firefox allows break before and after
  • Chrome and Safari forbid a break before, and allow a break after
  • Edge forbids a break before and allows a break after, but also allows hanging overflow

try it here

Quoting @MurakamiShinyu

This is important because IDEOGRAPHIC SPACE is normally used after "!" or "?" in the middle of a paragraph to keep 1em space after such punctuation marks as explained in JLREQ 3.1.6 Positioning of Dividing Punctuation Marks (Question Mark and Exclamation Mark) and Hyphens

That logic would favor Edge's behavior.

UAX-14 says that IDEOGRAPHIC SPACE (U+3000) has class BA (for break after), which would support the behavior of Chrome/Safari/Edge over Firefox's, but:

  • I cannot find any reference in css-text-3 to UAX-14's BA class
  • The line-break property does not mention IDEOGRAPHIC SPACE (U+3000)
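To make the difference between the browser behaviors concrete, here is a toy Python sketch. It is not UAX-14 and not any engine's real algorithm: the function name `break_opportunities` and the `allow_before` flag are hypothetical, and the model only looks at breaks caused by U+3000 itself.

```python
IDEOGRAPHIC_SPACE = "\u3000"

def break_opportunities(text, allow_before):
    """Return indices i where a break may fall between text[i-1] and text[i].

    Toy model: only breaks introduced by U+3000 are considered.
    allow_before=True  ~ the Firefox behavior described above (before and after)
    allow_before=False ~ a BA-style rule (break after only)
    """
    points = set()
    for i, ch in enumerate(text):
        if ch != IDEOGRAPHIC_SPACE:
            continue
        if allow_before and i > 0:
            points.add(i)          # break before the space
        if i + 1 < len(text):
            points.add(i + 1)      # break after the space
    return sorted(points)

sample = "あ！\u3000い"
firefox_like = break_opportunities(sample, allow_before=True)
ba_like = break_opportunities(sample, allow_before=False)
print(firefox_like, ba_like)
```

With `allow_before=True` the space can be pushed to the start of the next line, which separates it from the preceding "！"; the BA-style rule keeps the two together.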

Should we reference the BA class somehow?
Should we directly list IDEOGRAPHIC SPACE (U+3000) in some level of line-break?
Should we say something about allowing/requiring it to hang?

@frivoal frivoal added css-text-3 Current Work i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on. labels Apr 4, 2018
@fantasai
Collaborator

fantasai commented Apr 4, 2018

CSS Text Level 3 does not normatively define the line breaking behavior of every character. Nor does it normatively reference UAX14's tables, because while they are a good start, they do not represent the ideal breaking behavior of every language.

We could add specific rules for ideographic space. Allowing it to hang seems reasonable: if @MurakamiShinyu agrees we should specify that, I'm happy to do it. Note that if it hangs, it effectively is BA because it's effectively zero-width when occurring at the end of the line. ;)

@MurakamiShinyu
Collaborator

Yes, I agree the behavior allowing hanging is preferable.

@frivoal
Collaborator Author

frivoal commented Apr 5, 2018

I wonder what we should do about runs of IDEOGRAPHIC SPACEs: should the whole run hang? Only the last one? Hopefully we can just treat that as an edge case that's not really how people should write web pages, and spec whatever is simplest.

@r12a r12a added i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. i18n-jlreq Japanese language enablement and removed i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on. labels Apr 26, 2018
@r12a r12a added the i18n-clreq Chinese language enablement label Apr 26, 2018
@frivoal
Collaborator Author

frivoal commented Sep 16, 2018

@MurakamiShinyu Do you have a suggestion about whether to hang multiple ideographic spaces? Or maybe we should hang one and wrap the rest?

fantasai added a commit that referenced this issue Sep 16, 2018
@frivoal
Collaborator Author

frivoal commented Sep 16, 2018

f55dc74 hangs the whole sequence of ideographic spaces at the end of the line; let us know if that's fine.
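As a rough illustration of what "hanging the whole sequence" means for line fitting, here is a minimal Python sketch. It assumes every character is exactly 1em wide, and `fitting_width` is a hypothetical helper, not anything from the spec or a browser.

```python
IDEOGRAPHIC_SPACE = "\u3000"

def fitting_width(line, hang_trailing=True):
    """Width in em that counts toward fitting the line box.

    Assumes each character is 1em wide. When hang_trailing is True, the
    entire run of U+3000 at the end of the line is ignored, which models
    the spaces hanging outside the box.
    """
    if hang_trailing:
        line = line.rstrip(IDEOGRAPHIC_SPACE)
    return len(line)

line = "こんにちは！" + IDEOGRAPHIC_SPACE * 2
box = 6  # line box width in em (illustrative value)
fits_with_hang = fitting_width(line) <= box
fits_without_hang = fitting_width(line, hang_trailing=False) <= box
print(fits_with_hang, fits_without_hang)
```

Under this model the six visible characters fill the 6em box exactly and both trailing spaces hang; without hanging, the same line would be 8em and overflow.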

@frivoal
Collaborator Author

frivoal commented Oct 3, 2018

Tests: web-platform-tests/wpt#13338

@macnmm

macnmm commented Oct 3, 2018

I know, not web, but InDesign allows the treatment of U+3000 to wrap like other non-space characters, depending on the paragraph composer chosen: Roman composer treats U+3000 like ideographic characters (they wrap); Japanese composer treats as white space (allowing them to hang). I think there are use cases where strings of U+3000 would be expected to wrap the line, so having a selector may be desirable. For me, I like that there is also the U+2003 for when you want the space to hang, so I prefer treating U+3000 as wrapping and U+2003 as hanging.

@frivoal
Collaborator Author

frivoal commented Oct 3, 2018

@macnmm
Either of the following two comments should become new issues if you want to continue discussing them. I'm responding here for now, but if you think the case isn't closed, I'd appreciate it if you could file them separately.


I don't think we define U+2003 as hanging. Should we, for reasons other than CJK? Currently, we only hang U+0020 and U+3000. Should that be generalized to all sorts of spaces? Many-but-not-all sorts of spaces?
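To get a sense of what "all sorts of spaces" could cover, one option is to list the characters in Unicode's Zs (space separator) general category. Treating Zs as the candidate set is an assumption made for this sketch; the exact list depends on the Unicode version bundled with the Python interpreter.

```python
import sys
import unicodedata

# Every code point whose general category is Zs (space separator).
space_separators = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Zs"
]

for ch in space_separators:
    # Some Zs characters may lack a name in older databases, hence the default.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

The output includes U+0020 SPACE, U+2003 EM SPACE and U+3000 IDEOGRAPHIC SPACE, alongside no-break variants such as U+00A0, which is one reason "many-but-not-all" may be the more realistic option.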


As for having a switch, we effectively have one: white-space: break-spaces disallows spaces from hanging / being removed, and requires them to wrap instead. It does not allow (by itself) wrapping before U+3000, since that still has the UAX14 line breaking class of BA, but if there's a series of U+3000, the subsequent ones will wrap, and if the first one doesn't fit, it will bring the preceding character together with it to the next line.

white-space: break-spaces isn't a switch that only does that, though. It also stops sequences of U+0020 from collapsing, and preserves line breaks from the source, just like white-space: pre-wrap would.
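The contrast between hanging and break-spaces can be sketched with a toy greedy line breaker in Python. Everything here is an assumption made for illustration: each character is 1em wide, a break is allowed between any two characters except directly before the first U+3000 of a run, and `wrap` and its `break_spaces` flag are hypothetical names, not a model of the full white-space property.

```python
SP = "\u3000"  # IDEOGRAPHIC SPACE

def measured(line, break_spaces):
    # With break-spaces, every space takes up room; otherwise the
    # trailing run of U+3000 hangs and does not count toward the width.
    return len(line) if break_spaces else len(line.rstrip(SP))

def wrap(text, width, break_spaces):
    """Greedy wrap of text into lines of at most `width` em (width >= 1)."""
    lines, current = [], ""
    for ch in text:
        if measured(current + ch, break_spaces) <= width:
            current += ch
            continue
        if ch == SP and len(current) > 1 and current[-1] != SP:
            # No break before the first U+3000 of a run: the preceding
            # character moves to the next line together with the space.
            lines.append(current[:-1])
            current = current[-1] + ch
        else:
            lines.append(current)
            current = ch
    if current:
        lines.append(current)
    return lines

text = "あいう！" + SP * 2 + "えお"
hanging = wrap(text, 4, break_spaces=False)
breaking = wrap(text, 4, break_spaces=True)
run = wrap("あいう" + SP * 2 + "え", 4, break_spaces=True)
print(hanging, breaking, run)
```

In the hanging case both spaces stay on the first line past its edge. With `break_spaces=True`, the "！" is carried to the second line with the first space, and in the last example the second space of the run wraps on its own.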

Whether that's acceptable or not, I am not sure; it depends on the use case. Which brings me to: what is the need that drives wrapping U+3000 in the InDesign Roman paragraph composer? Is it because of a particular need, or does it just fall out of not handling CJK characters specially, without being particularly need-driven?

@macnmm

macnmm commented Oct 8, 2018

The need was an edge case when converting old file formats in which the user used U+3000 to indent or place text on embox boundaries on the page, expecting them to wrap like other CJK characters would.
My memory of this is that both composers had this behavior; I was surprised there is a difference when I just checked, but have not dug into why the J composers now hang U+3000, when my memory was that they wrap in both.

@frivoal
Collaborator Author

frivoal commented Oct 8, 2018

That use case doesn't really seem to apply to the web.

foolip pushed a commit to web-platform-tests/wpt that referenced this issue Nov 2, 2018
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Nov 10, 2018
…ace, a=testonly

Automatic update from web-platform-tests[css-text] Trailing ideographic white-space (#13338)

Relates to w3c/csswg-drafts#2500
--

wpt-commits: 85746ce0c5b8e8403c0238af6469d7177ed15c49
wpt-pr: 13338
jyc pushed a commit to jyc/gecko that referenced this issue Nov 11, 2018
@frivoal frivoal added the Tested Memory aid - issue has WPT tests label Apr 25, 2019
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 3, 2019

UltraBlame original commit: 62d9a252e6b979903db49bad58c8dfae98032012
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 3, 2019
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 3, 2019
frivoal added a commit to frivoal/csswg-drafts that referenced this issue Oct 15, 2019
The definition of the various values of the white-space property in
section 3 does not go into much detail about exactly how they work,
and just provides a high level intro to what they do. The details are
provided in section 4, and in particular 4.1 and 4.3 (so called
“phase 1” and “phase 2” of white space processing).

While this phase 2 has been updated (see
w3c#2500
w3c#3879
w3c#4180)
to define that not only space characters, but also other space
separators hang / don't hang / wrap / don't wrap, etc, based on the
value of the white-space property, the high level definition of these
values was not updated to reflect that.

While this is not necessarily a bug, as it is already called out that
details are in section 4, apparent contradictions or omissions can be
confusing.

Therefore, in the part of these definitions that explicitly talked
about white space wrapping/hanging (or not), include other space
separators as well, as defined in 4.3.

5 participants