Commit 9e8f122

[css-text-3] Add note explaining the purpose of space-discarding appendix. #337
1 parent 6eda67a commit 9e8f122

1 file changed

+39
-0
lines changed

css-text-3/Overview.bs

@@ -5499,6 +5499,45 @@ Space-Discarding Unicode Characters</h2>
 Han, Hiragana, Katakana, or Yi script
 shall also be considered part of the [=space-discarding character set=].

+<details class="note">
+<summary>Wherefore this table of “space-discarding characters”?</summary>
+
+The purpose of the [[#line-break-transform|segment break transformation rules]]
+is to “unbreak” text that has been formatted
+with extra white space for source code readability,
+see [[#line-break-transform]].
+
+In most cases, “unbreaking” a line of text requires joining them with a space,
+but some writing systems don't use spaces
+so such texts need to be joined without any space.
+CSS uses the characters before and after to determine
+whether to join lines with or without a space.
+
+For simplicity and for ease of implementation,
+the classification of characters as space-discarding or space-preserving
+is done by Unicode code block.
+Ideally, such a list would be maintained in [[UNICODE]],
+but the Unicode Technical Committee has yet
+to express any intention of taking on this task.
+In the meantime, in the interest of bringing
+more of the text-processing facilities of CSS and HTML
+that are available to Western writing systems
+to Eastern writing systems as well,
+the CSSWG is maintaining this appendix
+and refining the rules in [[#line-break-transform]],
+and hopes that in the future,
+once CSS has demonstrated its viability,
+the Unicode Consortium will recognize the need for an “unbreaking” algorithm
+and take over maintenance of such.
+
+<!-- things that could use an unbreaking algorithm:
+* HTML/CSS
+* Markdown
+* TeX
+* text editors' “unbreak lines” commands
+-->
+</details>
+
 <h2 id="script-tagging" class="no-num">Appendix G.
 Tagging Content by Writing System</h2>

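The joining behavior described in the added note (look at the characters before and after a line break, then join with or without a space) can be sketched roughly as below. This is an illustration only, not the specification's algorithm: the names `SPACE_DISCARDING_BLOCKS`, `is_space_discarding`, and `unbreak` are hypothetical, the block list is a small illustrative subset rather than the appendix's actual table, and the sketch assumes that the space is dropped only when both neighboring characters are space-discarding.

```python
# Illustrative subset of Unicode blocks, NOT the appendix's actual table.
# Classification is by code block, as the note describes.
SPACE_DISCARDING_BLOCKS = [
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xA000, 0xA48F),  # Yi Syllables
]


def is_space_discarding(ch: str) -> bool:
    """Return True if the character falls in one of the listed blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SPACE_DISCARDING_BLOCKS)


def unbreak(text: str) -> str:
    """Join source lines into one line of text.

    Assumption: a break is dropped (no space) only when the characters on
    both sides are space-discarding; otherwise it becomes a single space.
    Simplification: surrounding white space is trimmed and empty lines
    are ignored.
    """
    lines = [line.strip() for line in text.split("\n")]
    lines = [line for line in lines if line]
    if not lines:
        return ""
    result = lines[0]
    for line in lines[1:]:
        if is_space_discarding(result[-1]) and is_space_discarding(line[0]):
            result += line
        else:
            result += " " + line
    return result
```

Under these assumptions, `unbreak("hello\nworld")` yields `"hello world"`, while `unbreak("日本\n語")` yields `"日本語"` with no space inserted.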