@@ -220,6 +220,10 @@ <h2 class="no-num no-toc" id=contents>Table of Contents</h2>
220220
221221 < ul class =toc >
222222 < li > < a href ="#intro "> < span class =secno > 1. </ span > Introduction</ a >
223+ < ul class =toc >
224+ < li > < a href ="#script-groups "> < span class =secno > 1.1. </ span > Script
225+ Groups</ a >
226+ </ ul >
223227
224228 < li > < a href ="#conformance "> < span class =secno > 2. </ span > Conformance</ a >
225229 < ul class =toc >
@@ -264,8 +268,8 @@ <h2 class="no-num no-toc" id=contents>Table of Contents</h2>
264268 and Word Boundaries</ a >
265269 < ul class =toc >
266270 < li > < a href ="#line-break "> < span class =secno > 5.1. </ span > Line Breaking
267- Restrictions Strictness: the ‘< code
268- class = property > line-break </ code > ’ property</ a >
271+ Strictness: the ‘< code class = property > line-break </ code > ’
272+ property</ a >
269273
270274 < li > < a href ="#word-break "> < span class =secno > 5.2. </ span > Word Breaking
271275 Rules: the ‘< code class =property > word-break</ code > ’
@@ -452,6 +456,51 @@ <h2 id=intro><span class=secno>1. </span> Introduction</h2>
452456
453457 < p > [document here]
454458
459+ < h3 id =script-groups > < span class =secno > 1.1. </ span > Script Groups</ h3 >
460+
461+ < p > Typographic behavior varies somewhat by language, but varies drastically
462+ by writing system. For convenience, CSS3 Text defines the following script
463+ groups, which combine typographically-similar scripts together.
464+
465+ < dl >
466+ < dt id =block-scripts > < dfn id =block-scripts0 > block scripts</ dfn >
467+
468+ < dd > CJK (including Hangul and half-width kana) and by extension all "wide"
469+ characters. (See < a href ="#UAX11 "
470+ rel =biblioentry > [UAX11]<!--{{!UAX11}}--> </ a > )
471+
472+ < dt id =clustered-scripts > < dfn id =clustered-scripts0 > clustered
473+ scripts</ dfn >
474+
475+ < dd > South-East Asian scripts that have discrete units but do not use space
476+ between words (such as Thai, Lao, Khmer, Myanmar). This category also
477+ includes the Tibetan script.
478+
479+ < dt id =discrete-scripts > < dfn id =discrete-scripts0 > discrete scripts</ dfn >
480+
481+ < dd > Scripts that use spaces or visible word-separating punctuation between
482+ words and have discrete, unconnected (in print) units within words, such
483+ as Latin, Greek, Ethiopic, Cyrillic, Hebrew.
484+
485+ < dt id =cursive-scripts > < dfn id =cursive-scripts0 > cursive scripts</ dfn >
486+
487+ < dd > Arabic and similar cursive scripts.
488+
489+ < dt id =connected-scripts > < dfn id =connected-scripts0 > connected
490+ scripts</ dfn >
491+
492+ < dd > Devanagari, Ogham, and other scripts that use spaces between words and
493+ baseline connectors within words. By extension this group also includes
494+ Gurmukhi, Tamil and any other Indic scripts whose typographic behavior is
495+ similar to Devanagari.
496+ </ dl >
497+
498+ < p class =issue > Provide an appendix using Unicode script names.
499+
500+ < p class =note > These definitions are used primarily in describing < a
501+ href ="#line-breaking "> line-breaking</ a > and < a
502+ href ="#text-justify "> justification</ a > behavior.
503+
455504 < h2 id =conformance > < span class =secno > 2. </ span > Conformance</ h2 >
456505
457506 < p > Conformance requirements are expressed with a combination of descriptive
@@ -1155,61 +1204,10 @@ <h2 id=line-breaking><span class=secno>5. </span> Line Breaking and Word
11551204 scripts well. Additionally, some guidance should be provided on how to
11561205 break or not break Southeast Asian in the absence of a dictionary.
11571206
1158- < h3 id =line-break > < span class =secno > 5.1. </ span > Line Breaking Restrictions
1159- Strictness: the ‘< a href ="#line-break0 "> < code
1207+ < h3 id =line-break > < span class =secno > 5.1. </ span > Line Breaking Strictness:
1208+ the ‘< a href ="#line-break0 "> < code
11601209 class =property > line-break</ code > </ a > ’ property</ h3 >
11611210
1162- < p > This property specifies the strictness of line-breaking rules:
1163- particularly how line-breaking interacts with punctuation.
1164-
1165- < p > CSS distinguishes between three levels of strictness in the rules for
1166- implicit line breaking. The precise set of rules in effect for the
1167- ‘< code class =css > strict</ code > ’ and ‘< code
1168- class =css > loose</ code > ’ levels is up to the UA and should follow
1169- language conventions. However, this specification does recommend that:
1170-
1171- < ul >
1172- < li > Following breaks be forbidden in ‘< code
1173- class =css > strict</ code > ’ line breaking and allowed in ‘< code
1174- class =css > normal</ code > ’:
1175- < ul >
1176- < li > breaks before Japanese small kana
1177-
1178- < li > breaks before the KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC)
1179- </ ul >
1180- Additionally, if the language is known to be Chinese or Japanese, breaks
1181- before hyphens (U+2010, U+2013, U+301C, U+30A0) may be allowed in
1182- ‘< code class =css > normal</ code > ’.
1183-
1184- < li > Following breaks be forbidden in ‘< code
1185- class =css > normal</ code > ’ and ‘< code
1186- class =css > strict</ code > ’ line breaking and allowed in ‘< code
1187- class =css > loose</ code > ’:
1188- < ul >
1189- < li > breaks before iteration marks (U+3005, U+303B, U+309D, U+309E,
1190- U+30FD, U+30FE)
1191-
1192- < li > breaks between inseparable characters (U+2014, U+2025, U+2026,
1193- U+3033, U+3034, U+3035)
1194- </ ul >
1195- If the language is known to be Chinese or Japanese, then additionally the
1196- following breaks may be allowed in ‘< code
1197- class =css > loose</ code > ’:
1198- < ul >
1199- < li > breaks before middle dots (U+003A, U+003B, U+30FB, U+FF1A, U+FF1B,
1200- U+FF65)
1201-
1202- < li > breaks before dividing punctuation marks (U+0021, U+003F, U+203C,
1203- U+2047, U+2048, U+2049, U+FF01, U+FF1F)
1204-
1205- < li > breaks before postfixes (U+0025, U+00A2, U+00B0, U+2030, U+2032,
1206- U+2033, U+2103, U+FF05, U+FFE0)
1207-
1208- < li > breaks after prefixes (U+0024, U+00A3, U+00A5, U+20AC, U+2116,
1209- U+FF04, U+FFE1, U+FFE5)
1210- </ ul >
1211- </ ul >
1212-
12131211 < table class =propdef >
12141212 < tbody >
12151213 < tr >
@@ -1253,8 +1251,9 @@ <h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
12531251 < td > specified value
12541252 </ table >
12551253
1256- < p > This property specifies what set of line breaking restrictions are in
1257- effect within the element. Values have the following meanings:
1254+ < p > This property specifies the strictness of line-breaking rules applied
1255+ within an element: particularly how line-breaking interacts with
1256+ punctuation. Values have the following meanings:
12581257
12591258 < dl >
12601259 < dt > < dfn id =auto title ="line-break:auto "> < code > auto</ code > </ dfn >
@@ -1277,6 +1276,53 @@ <h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
12771276 < dd > Breaks text using the most stringent set of line-breaking rules.
12781277 </ dl >
12791278
1279+ < p > CSS distinguishes between three levels of strictness in the rules for
1280+ implicit line breaking. The precise set of rules in effect for each level
1281+ is up to the UA and should follow language conventions. However, this
1282+ specification does recommend that:
1283+
1284+ < ul >
1285+ < li > Following breaks be forbidden in ‘< code
1286+ class =css > strict</ code > ’ line breaking and allowed in ‘< code
1287+ class =css > normal</ code > ’:
1288+ < ul >
1289+ < li > breaks before Japanese small kana
1290+
1291+ < li > breaks before the KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC)
1292+ </ ul >
1293+ Additionally, if the language is known to be Chinese or Japanese, breaks
1294+ before hyphens (U+2010, U+2013, U+301C, U+30A0) may be allowed in
1295+ ‘< code class =css > normal</ code > ’.
1296+
1297+ < li > Following breaks be forbidden in ‘< code
1298+ class =css > normal</ code > ’ and ‘< code
1299+ class =css > strict</ code > ’ line breaking and allowed in ‘< code
1300+ class =css > loose</ code > ’:
1301+ < ul >
1302+ < li > breaks before iteration marks (U+3005, U+303B, U+309D, U+309E,
1303+ U+30FD, U+30FE)
1304+
1305+ < li > breaks between inseparable characters (U+2014, U+2025, U+2026,
1306+ U+3033, U+3034, U+3035)
1307+ </ ul >
1308+ If the language is known to be Chinese or Japanese, then additionally the
1309+ following breaks may be allowed in ‘< code
1310+ class =css > loose</ code > ’:
1311+ < ul >
1312+ < li > breaks before middle dots (U+003A, U+003B, U+30FB, U+FF1A, U+FF1B,
1313+ U+FF65)
1314+
1315+ < li > breaks before dividing punctuation marks (U+0021, U+003F, U+203C,
1316+ U+2047, U+2048, U+2049, U+FF01, U+FF1F)
1317+
1318+ < li > breaks before postfixes (U+0025, U+00A2, U+00B0, U+2030, U+2032,
1319+ U+2033, U+2103, U+FF05, U+FFE0)
1320+
1321+ < li > breaks after prefixes (U+0024, U+00A3, U+00A5, U+20AC, U+2116,
1322+ U+FF04, U+FFE1, U+FFE5)
1323+ </ ul >
1324+ </ ul >
1325+
12801326 < h3 id =word-break > < span class =secno > 5.2. </ span > Word Breaking Rules: the
12811327 ‘< a href ="#word-break0 "> < code
12821328 class =property > word-break</ code > </ a > ’ property</ h3 >
@@ -2521,8 +2567,8 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
25212567 class =css > inter-ideograph</ code > ’ for CJK, or ‘< code
25222568 class =css > inter-word</ code > ’ for English. Another possibility is
25232569 to use a justification method that is a universal compromise for all
2524- scripts, e.g. the ‘< code class =css > distribute </ code > ’ method
2525- with discrete scripts dropped to second priority.</ p >
2570+ scripts, e.g. the ‘< code class =css > inter-cluster </ code > ’
2571+ method with block scripts raised to first priority.</ p >
25262572
25272573 < dt > < dfn id =none2 title ="text-justify:none "> < code > none</ code > </ dfn >
25282574
@@ -2563,8 +2609,8 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
25632609 < dt > < a name =kashida-prop > </ a > < a name =text-kashida-space > </ a > < dfn
25642610 id =kashida title ="text-justify:kashida "> < code > kashida</ code > </ dfn >
25652611
2566- < dd > Justification primarily stretches Arabic and related scripts through
2567- the use of kashida or other calligraphic elongation.
2612+ < dd > Justification primarily stretches < a href ="#cursive-scripts "> cursive
2613+ scripts</ a > through the use of kashida or other calligraphic elongation.
25682614 </ dl >
25692615
25702616 < p > When justifying text, the user agent takes the remaining space between
@@ -2580,44 +2626,6 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
25802626 be followed when any justification method other than ‘< code
25812627 class =property > auto</ code > ’ is specified.
25822628
2583- < p > Justification affects different types of writing systems in different
2584- ways. For justification purposes, characters are grouped as follows:
2585-
2586- < dl >
2587- < dt id =block-scripts > block
2588-
2589- < dd > CJK (including Hangul and half-width kana) and by extension all "wide"
2590- characters. (See < a href ="#UAX11 "
2591- rel =biblioentry > [UAX11]<!--{{!UAX11}}--> </ a > )
2592-
2593- < dt id =clustered-scripts > clustered
2594-
2595- < dd > South-East Asian scripts that have discrete units but do not use space
2596- between words (such as Thai, Lao, Khmer, Myanmar). This category also
2597- includes the Tibetan script.
2598-
2599- < dt id =discrete-scripts > discrete
2600-
2601- < dd > Scripts that use spaces or visible word-separating punctuation between
2602- words and have discrete, unconnected (in print) units within words, such
2603- as Latin, Greek, Ethiopic, Cyrillic, Hebrew.
2604-
2605- < dt id =cursive-scripts > cursive
2606-
2607- < dd > Arabic and similar cursive scripts
2608-
2609- < dt id =connected-scripts > connected
2610-
2611- < dd > Devanagari, Ogham, and other scripts that use spaces between words and
2612- baseline connectors within words. By extension this group also includes
2613- Gurmukhi, Tamil and any other Indic scripts whose typographic behavior is
2614- similar to Devanagari.
2615- </ dl >
2616-
2617- < p > The UA may enable or break optional ligatures or use other font features
2618- such as alternate glyphs to help justify the text under any method. This
2619- behavior is not defined by CSS.
2620-
26212629 < p id =expansion-opportunity > CSS defines < dfn
26222630 id =expansion-opportunities > expansion opportunities</ dfn > as points where
26232631 the justification algorithm may alter spacing within the text. These
@@ -2627,7 +2635,7 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
26272635 lower priority expansion opportunities are adjusted. (Expansion and
26282636 compression limits are given by the < a
26292637 href ="#letter-spacing "> letter-spacing</ a > and < a
2630- href ="#word-spacing "> word-spacing</ a > properties.
2638+ href ="#word-spacing "> word-spacing</ a > properties.)
26312639
26322640 < p > How any remaining space is distributed once all expansion opportunities
26332641 reach their limits is up to the UA. If the inline contents of a line
@@ -2647,8 +2655,10 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
26472655 are given in the table below. Space must be distributed evenly among all
26482656 types of expansion opportunities in a given prioritization group, but may
26492657 vary within a line due to changes in the font or letter-spacing and
2650- word-spacing values. The different types of expansion opportunities are
2651- defined as follows:
2658+ word-spacing values. Since justification behavior varies by writing
2659+ system, expansion opportunities are organized by < a
2660+ href ="#script-groups "> script group</ a > . The different types of expansion
2661+ opportunities are defined as follows:
26522662
26532663 < dl >
26542664 < dt > spaces
@@ -2979,6 +2989,10 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
29792989 < p class =note > The ‘< code class =css > auto</ code > ’ column defined
29802990 above is informative.
29812991
2992+ < p > The UA may enable or break optional ligatures or use other font features
2993+ such as alternate glyphs to help justify the text under any method. This
2994+ behavior is not defined by CSS.
2995+
29822996 < div class =example >
29832997 < p > Japanese is one of the languages for which compression is preferred to
29842998 expansion in applying justification.</ p >
@@ -5957,12 +5971,27 @@ <h2 class=no-num id=index>Index</h2>
59575971 < li > ‘< code class =css > below right</ code > ’, < a
59585972 href ="#below-right " title ="''below right'' "> < strong > 11.1.6.</ strong > </ a >
59595973
5974+ < li > block scripts, < a href ="#block-scripts0 " title ="block
5975+ scripts "> < strong > 1.1.</ strong > </ a >
5976+
5977+ < li > clustered scripts, < a href ="#clustered-scripts0 " title ="clustered
5978+ scripts "> < strong > 1.1.</ strong > </ a >
5979+
59605980 < li > collapsible, < a href ="#collapsible "
59615981 title =collapsible > < strong > 4.2.</ strong > </ a >
59625982
59635983 < li > ‘< code class =css > column</ code > ’, < a href ="#column "
59645984 title ="''column'' "> < strong > 6.4.</ strong > </ a >
59655985
5986+ < li > connected scripts, < a href ="#connected-scripts0 " title ="connected
5987+ scripts "> < strong > 1.1.</ strong > </ a >
5988+
5989+ < li > cursive scripts, < a href ="#cursive-scripts0 " title ="cursive
5990+ scripts "> < strong > 1.1.</ strong > </ a >
5991+
5992+ < li > discrete scripts, < a href ="#discrete-scripts0 " title ="discrete
5993+ scripts "> < strong > 1.1.</ strong > </ a >
5994+
59665995 < li > < a href ="#each-line "> < code > each-line</ code > </ a > , < a href ="#each-line "
59675996 title =each-line > < strong > 10.1.</ strong > </ a >
59685997