Commit e5ea475

Shift split between word-break and line-break to word-breaking vs. punctuation-breaking
1 parent 4fb956a commit e5ea475

2 files changed

Lines changed: 79 additions & 75 deletions

css3-text/Overview.html

Lines changed: 47 additions & 46 deletions
@@ -83,7 +83,7 @@
8383

8484
<h1>CSS Text Level 3</h1>
8585

86-
<h2 class="no-num no-toc" id=longstatus-date>Editor's Draft 18 October
86+
<h2 class="no-num no-toc" id=longstatus-date>Editor's Draft 3 November
8787
2010</h2>
8888

8989
<dl>
@@ -264,12 +264,12 @@ <h2 class="no-num no-toc" id=contents>Table of Contents</h2>
264264
and Word Boundaries</a>
265265
<ul class=toc>
266266
<li><a href="#line-break"><span class=secno>5.1. </span> Line Breaking
267-
Restrictions for CJK Scripts: the &lsquo;<code
267+
Restrictions Strictness: the &lsquo;<code
268268
class=property>line-break</code>&rsquo; property</a>
269269

270-
<li><a href="#word-break"><span class=secno>5.2. </span> Line Breaking
271-
Rules for non-CJK Scripts: the &lsquo;<code
272-
class=property>word-break</code>&rsquo; property</a>
270+
<li><a href="#word-break"><span class=secno>5.2. </span> Word Breaking
271+
Rules: the &lsquo;<code class=property>word-break</code>&rsquo;
272+
property</a>
273273
</ul>
274274

275275
<li><a href="#wrapping"><span class=secno>6. </span> Text Wrapping</a>
@@ -1122,28 +1122,37 @@ <h2 id=line-breaking><span class=secno>5. </span> Line Breaking and Word
11221122
<p>CSS does not fully define where line breaking opportunities occur,
11231123
however some controls are provided to distinguish common variations.
11241124

1125+
<p class=note>Further information on line breaking conventions can be found
1126+
in <a href="#JIS4051" rel=biblioentry>[JIS4051]<!--{{JIS4051}}--></a> for
1127+
Japanese, <a href="#ZHMARK" rel=biblioentry>[ZHMARK]<!--{{ZHMARK}}--></a>
1128+
for Chinese, and [?] for Korean, and in <a href="#UAX14"
1129+
rel=biblioentry>[UAX14]<!--{{!UAX14}}--></a> for all scripts in Unicode.
1130+
<!-- The CSS Working Group notes that although UAX 14 contains a wealth of
1131+
information about line breaking conventions, a literal implementation
1132+
of its algorithm has been found to be inadequate in multiple situations. -->
1133+
1134+
<p class=issue>Any guidance for appropriate references here would be much
1135+
appreciated.
1136+
11251137
<p>Floated and absolutely-positioned elements do not introduce a line
11261138
breaking opportunity. The line breaking behavior of a replaced element is
11271139
equivalent to that of a Latin character.
11281140

1129-
<p class=issue>There is a question of what the default line breaking of
1130-
Korean should be, and whether dictionary-based breaking is needed for
1131-
typical layout (e.g. novels).
1132-
11331141
<p class=issue>It is not clear whether this section handles Southeast Asian
11341142
scripts well. Additionally, some guidance should be provided on how to
11351143
break or not break Southeast Asian in the absence of a dictionary.
11361144

11371145
<h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
1138-
for CJK Scripts: the &lsquo;<a href="#line-break0"><code
1146+
Strictness: the &lsquo;<a href="#line-break0"><code
11391147
class=property>line-break</code></a>&rsquo; property</h3>
11401148

1141-
<p>This property specifies line break opportunities for CJK scripts.
1149+
<p>This property specifies the strictness of line-breaking rules:
1150+
particularly how line-breaking interacts with punctuation.
11421151

11431152
<p>CSS distinguishes between three levels of strictness in the rules for
1144-
implicit line breaking in CJK text. The precise set of rules in effect for
1145-
the strict and loose levels is up to the UA and should follow language
1146-
conventions. However, this specification does recommend that:
1153+
implicit line breaking. The precise set of rules in effect for the strict
1154+
and loose levels is up to the UA and should follow language conventions.
1155+
However, this specification does recommend that:
11471156

11481157
<ul>
11491158
<li>Following breaks be forbidden in strict line breaking and allowed in
@@ -1165,7 +1174,7 @@ <h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
11651174
<li>breaks before iteration marks (U+3005, U+303B, U+309D, U+309E,
11661175
U+30FD, U+30FE)
11671176

1168-
<li>breaks before inseparatable characters (U+2014, U+2025, U+2026,
1177+
<li>breaks between inseparatable characters (U+2014, U+2025, U+2026,
11691178
U+3033, U+3034, U+3035)
11701179
</ul>
11711180
If the language is known to be Chinese or Japanese, then additionally the
@@ -1186,18 +1195,6 @@ <h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
11861195
</ul>
11871196
</ul>
11881197

1189-
<p class=note>Information on line breaking conventions can be found in <a
1190-
href="#JIS4051" rel=biblioentry>[JIS4051]<!--{{JIS4051}}--></a> for
1191-
Japanese, <a href="#ZHMARK" rel=biblioentry>[ZHMARK]<!--{{ZHMARK}}--></a>
1192-
for Chinese, and [?] for Korean, and in <a href="#UAX14"
1193-
rel=biblioentry>[UAX14]<!--{{!UAX14}}--></a> for all scripts in Unicode.
1194-
<!-- The CSS Working Group notes that although UAX 14 contains a wealth of
1195-
information about line breaking conventions, a literal implementation
1196-
of its algorithm has been found to be inadequate in multiple situations. -->
1197-
1198-
<p class=issue>Any guidance for appropriate references here would be much
1199-
appreciated.
1200-
12011198
<table class=propdef>
12021199
<tbody>
12031200
<tr>
@@ -1266,21 +1263,17 @@ <h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
12661263

12671264
<dd>Breaks CJK scripts using a more restrictive set of line-breaking rules
12681265
than &lsquo;<code class=property>normal</code>&rsquo;.
1269-
1270-
<dt><dfn id=keep-all
1271-
title="line-break:keep-all"><code>keep-all</code></dfn>
1272-
1273-
<dd>Sequences of CJK characters can no longer break on implied break
1274-
points. This option should only be used where the presence of word
1275-
separator characters still creates line-breaking opportunities, as in
1276-
Korean.
12771266
</dl>
12781267

1279-
<h3 id=word-break><span class=secno>5.2. </span> Line Breaking Rules for
1280-
non-CJK Scripts: the &lsquo;<a href="#word-break0"><code
1268+
<p class=issue>It was requested to change the name of &lsquo;<code
1269+
class=css>newspaper</code>&rsquo; to something more general, since not all
1270+
users would be newspapers and newspapers might use stricter rules.
1271+
1272+
<h3 id=word-break><span class=secno>5.2. </span> Word Breaking Rules: the
1273+
&lsquo;<a href="#word-break0"><code
12811274
class=property>word-break</code></a>&rsquo; property</h3>
12821275

1283-
<p>This property specifies line break opportunities for non-CJK scripts.
1276+
<p>This property specifies line break opportunities within words.
12841277

12851278
<table class=propdef>
12861279
<tbody>
@@ -1292,7 +1285,7 @@ <h3 id=word-break><span class=secno>5.2. </span> Line Breaking Rules for
12921285
<tr>
12931286
<th>Value:
12941287

1295-
<td>normal | break-all | hyphenate
1288+
<td>normal | keep-all | [ break-all || hyphenate ]
12961289

12971290
<tr>
12981291
<th>Initial:
@@ -1328,15 +1321,23 @@ <h3 id=word-break><span class=secno>5.2. </span> Line Breaking Rules for
13281321
<dl>
13291322
<dt><dfn id=normal1 title="word-break:normal"><code>normal</code></dfn>
13301323

1331-
<dd>Breaks non-CJK scripts according to their own rules.
1324+
<dd>Break words according to their usual rules.
13321325

13331326
<dt><dfn id=break-all
13341327
title="word-break:break-all"><code>break-all</code></dfn>
13351328

1336-
<dd>Lines may break between any two grapheme clusters for non-CJK scripts.
1337-
This option is used mostly in a context where the text is predominantly
1338-
using CJK characters with few non-CJK excerpts and it is desired that the
1339-
text be better distributed on each line.
1329+
<dd>Words may break between any two grapheme clusters within words.
1330+
Hyphenation is not applied. This option is used mostly in a context where
1331+
the text is predominantly using CJK characters with few non-CJK excerpts
1332+
and it is desired that the text be better distributed on each line.
1333+
1334+
<dt><dfn id=keep-all
1335+
title="line-break:keep-all"><code>keep-all</code></dfn>
1336+
1337+
<dd>Sequences of CJK characters can no longer break on implied break
1338+
points. This option should only be used where the presence of word
1339+
separator characters still creates line-breaking opportunities, as in
1340+
Korean.
13401341

13411342
<dt><dfn id=hyphenate
13421343
title="word-break:hyphenate"><code>hyphenate</code></dfn>
@@ -5293,7 +5294,7 @@ <h2 class=no-num id=appendix-b-property-index> Appendix B: Property index</h2>
52935294
<tr valign=baseline>
52945295
<td><a class=property href="#word-break0">word-break</a>
52955296

5296-
<td>normal | break-all | hyphenate
5297+
<td>normal | keep-all | [ break-all || hyphenate ]
52975298

52985299
<td>normal
52995300

@@ -5428,7 +5429,7 @@ <h2 class=no-num id=index>Index</h2>
54285429
title="line-break:auto"><strong>5.1.</strong></a>
54295430

54305431
<li>line-break:keep-all, <a href="#keep-all"
5431-
title="line-break:keep-all"><strong>5.1.</strong></a>
5432+
title="line-break:keep-all"><strong>5.2.</strong></a>
54325433

54335434
<li>line-break:newspaper, <a href="#newspaper"
54345435
title="line-break:newspaper"><strong>5.1.</strong></a>

css3-text/Overview.src.html

Lines changed: 32 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -749,25 +749,35 @@ <h2 id="line-breaking">
749749
<p>CSS does not fully define where line breaking opportunities occur,
750750
however some controls are provided to distinguish common variations.
751751

752+
<p class="note">Further information on line breaking conventions can be
753+
found in
754+
[[JIS4051]] for Japanese,
755+
[[ZHMARK]] for Chinese, and [?] for Korean, and
756+
in [[!UAX14]] for all scripts in Unicode.
757+
<!-- The CSS Working Group notes that although UAX 14 contains a wealth of
758+
information about line breaking conventions, a literal implementation
759+
of its algorithm has been found to be inadequate in multiple situations. --></p>
760+
761+
<p class="issue">Any guidance for appropriate references here would be
762+
much appreciated.</p>
763+
764+
752765
<p>Floated and absolutely-positioned elements do not introduce a line
753766
breaking opportunity. The line breaking behavior of a replaced element
754767
is equivalent to that of a Latin character.</p>
755768

756-
<p class="issue">There is a question of what the default line breaking
757-
of Korean should be, and whether dictionary-based breaking is needed
758-
for typical layout (e.g. novels).</p>
759-
760769
<p class="issue">It is not clear whether this section handles Southeast Asian
761770
scripts well. Additionally, some guidance should be provided on how to
762771
break or not break Southeast Asian in the absence of a dictionary.</p>
763772

764773
<h3 id="line-break">
765-
Line Breaking Restrictions for CJK Scripts: the 'line-break' property</h3>
774+
Line Breaking Restrictions Strictness: the 'line-break' property</h3>
766775

767-
<p>This property specifies line break opportunities for CJK scripts.</p>
776+
<p>This property specifies the strictness of line-breaking rules:
777+
particularly how line-breaking interacts with punctuation.</p>
768778

769779
<p>CSS distinguishes between three levels of strictness in the rules for
770-
implicit line breaking in CJK text. The precise set of rules in effect
780+
implicit line breaking. The precise set of rules in effect
771781
for the strict and loose levels is up to the UA and should follow
772782
language conventions. However, this specification does recommend that:</p>
773783

@@ -784,7 +794,7 @@ <h3 id="line-break">
784794
<li>Following breaks be forbidden in normal and strict line breaking and allowed in newspaper:
785795
<ul>
786796
<li>breaks before iteration marks (U+3005, U+303B, U+309D, U+309E, U+30FD, U+30FE)</li>
787-
<li>breaks before inseparatable characters (U+2014, U+2025, U+2026, U+3033, U+3034, U+3035)</li>
797+
<li>breaks between inseparatable characters (U+2014, U+2025, U+2026, U+3033, U+3034, U+3035)</li>
788798
</ul>
789799
If the language is known to be Chinese or Japanese, then additionally
790800
the following breaks may be allowed in ''newspaper'':
@@ -796,17 +806,6 @@ <h3 id="line-break">
796806
</ul>
797807
</ul>
798808

799-
<p class="note">Information on line breaking conventions can be found in
800-
[[JIS4051]] for Japanese,
801-
[[ZHMARK]] for Chinese, and [?] for Korean, and
802-
in [[!UAX14]] for all scripts in Unicode.
803-
<!-- The CSS Working Group notes that although UAX 14 contains a wealth of
804-
information about line breaking conventions, a literal implementation
805-
of its algorithm has been found to be inadequate in multiple situations. --></p>
806-
807-
<p class="issue">Any guidance for appropriate references here would be
808-
much appreciated.</p>
809-
810809
<table class="propdef">
811810
<tbody>
812811
<tr>
@@ -861,16 +860,16 @@ <h3 id="line-break">
861860
<dt><dfn title="line-break:strict"><code>strict</code></dfn></dt>
862861
<dd>Breaks CJK scripts using a more restrictive set of line-breaking
863862
rules than 'normal'.</dd>
864-
<dt><dfn title="line-break:keep-all"><code>keep-all</code></dfn></dt>
865-
<dd>Sequences of CJK characters can no longer break on implied break points.
866-
This option should only be used where the presence of word separator
867-
characters still creates line-breaking opportunities, as in Korean.</dd>
868863
</dl>
869864

865+
<p class="issue">It was requested to change the name of ''newspaper''
866+
to something more general, since not all users would be newspapers
867+
and newspapers might use stricter rules.</p>
868+
870869
<h3 id="word-break">
871-
Line Breaking Rules for non-CJK Scripts: the 'word-break' property</h3>
870+
Word Breaking Rules: the 'word-break' property</h3>
872871

873-
<p>This property specifies line break opportunities for non-CJK scripts.</p>
872+
<p>This property specifies line break opportunities within words.</p>
874873

875874
<table class="propdef">
876875
<tbody>
@@ -880,7 +879,7 @@ <h3 id="word-break">
880879
</tr>
881880
<tr>
882881
<th>Value:</th>
883-
<td>normal | break-all | hyphenate </td>
882+
<td>normal | keep-all | [ break-all || hyphenate ] </td>
884883
</tr>
885884
<tr>
886885
<th>Initial:</th>
@@ -911,12 +910,16 @@ <h3 id="word-break">
911910

912911
<dl>
913912
<dt><dfn title="word-break:normal"><code>normal</code></dfn></dt>
914-
<dd>Breaks non-CJK scripts according to their own rules.</dd>
913+
<dd>Break words according to their usual rules.</dd>
915914
<dt><dfn title="word-break:break-all"><code>break-all</code></dfn></dt>
916-
<dd>Lines may break between any two grapheme clusters for non-CJK scripts.
917-
This option is used mostly in a context where
915+
<dd>Words may break between any two grapheme clusters within words.
916+
Hyphenation is not applied. This option is used mostly in a context where
918917
the text is predominantly using CJK characters with few non-CJK excerpts
919918
and it is desired that the text be better distributed on each line.</dd>
919+
<dt><dfn title="line-break:keep-all"><code>keep-all</code></dfn></dt>
920+
<dd>Sequences of CJK characters can no longer break on implied break points.
921+
This option should only be used where the presence of word separator
922+
characters still creates line-breaking opportunities, as in Korean.</dd>
920923
<dt><dfn title="word-break:hyphenate"><code>hyphenate</code></dfn></dt>
921924
<dd>Words may be broken at an appropriate hyphenation point. This requires
922925
that the user agent have an hyphenation resource appropriate to the
