Commit 4a04edf

Move script group definitions to intro so they're available to other parts of the draft
1 parent 79b079b commit 4a04edf

2 files changed

Lines changed: 183 additions & 142 deletions

css3-text/Overview.html

Lines changed: 131 additions & 102 deletions
@@ -220,6 +220,10 @@ <h2 class="no-num no-toc" id=contents>Table of Contents</h2>
220220

221221
<ul class=toc>
222222
<li><a href="#intro"><span class=secno>1. </span> Introduction</a>
223+
<ul class=toc>
224+
<li><a href="#script-groups"><span class=secno>1.1. </span>Script
225+
Groups</a>
226+
</ul>
223227

224228
<li><a href="#conformance"><span class=secno>2. </span> Conformance</a>
225229
<ul class=toc>
@@ -264,8 +268,8 @@ <h2 class="no-num no-toc" id=contents>Table of Contents</h2>
264268
and Word Boundaries</a>
265269
<ul class=toc>
266270
<li><a href="#line-break"><span class=secno>5.1. </span> Line Breaking
267-
Restrictions Strictness: the &lsquo;<code
268-
class=property>line-break</code>&rsquo; property</a>
271+
Strictness: the &lsquo;<code class=property>line-break</code>&rsquo;
272+
property</a>
269273

270274
<li><a href="#word-break"><span class=secno>5.2. </span> Word Breaking
271275
Rules: the &lsquo;<code class=property>word-break</code>&rsquo;
@@ -452,6 +456,51 @@ <h2 id=intro><span class=secno>1. </span> Introduction</h2>
452456

453457
<p>[document here]
454458

459+
<h3 id=script-groups><span class=secno>1.1. </span>Script Groups</h3>
460+
461+
<p>Typographic behavior varies somewhat by language, but varies drastically
462+
by writing system. For convenience, CSS3 Text defines the following script
463+
groups, which combine typographically-similar scripts together.
464+
465+
<dl>
466+
<dt id=block-scripts><dfn id=block-scripts0>block scripts</dfn>
467+
468+
<dd>CJK (including Hangul and half-width kana) and by extension all "wide"
469+
characters. (See <a href="#UAX11"
470+
rel=biblioentry>[UAX11]<!--{{!UAX11}}--></a>)
471+
472+
<dt id=clustered-scripts><dfn id=clustered-scripts0>clustered
473+
scripts</dfn>
474+
475+
<dd>South-East Asian scripts that have discrete units but do not use space
476+
between words (such as Thai, Lao, Khmer, Myanmar). This category also
477+
includes the Tibetan script.
478+
479+
<dt id=discrete-scripts><dfn id=discrete-scripts0>discrete scripts</dfn>
480+
481+
<dd>Scripts that use spaces or visible word-separating punctuation between
482+
words and have discrete, unconnected (in print) units within words, such
483+
as Latin, Greek, Ethiopic, Cyrillic, Hebrew.
484+
485+
<dt id=cursive-scripts><dfn id=cursive-scripts0>cursive scripts</dfn>
486+
487+
<dd>Arabic and similar cursive scripts.
488+
489+
<dt id=connected-scripts><dfn id=connected-scripts0>connected
490+
scripts</dfn>
491+
492+
<dd>Devanagari, Ogham, and other scripts that use spaces between words and
493+
baseline connectors within words. By extension this group also includes
494+
Gurmukhi, Tamil and any other Indic scripts whose typographic behavior is
495+
similar to Devanagari.
496+
</dl>
497+
498+
<p class=issue>Provide an appendix using Unicode script names.
499+
500+
<p class=note>These definitions are used primarily in describing <a
501+
href="#line-breaking">line-breaking</a> and <a
502+
href="#text-justify">justification</a> behavior.
503+
455504
<h2 id=conformance><span class=secno>2. </span> Conformance</h2>
456505

457506
<p>Conformance requirements are expressed with a combination of descriptive
@@ -1155,61 +1204,10 @@ <h2 id=line-breaking><span class=secno>5. </span> Line Breaking and Word
11551204
scripts well. Additionally, some guidance should be provided on how to
11561205
break or not break Southeast Asian in the absence of a dictionary.
11571206

1158-
<h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
1159-
Strictness: the &lsquo;<a href="#line-break0"><code
1207+
<h3 id=line-break><span class=secno>5.1. </span> Line Breaking Strictness:
1208+
the &lsquo;<a href="#line-break0"><code
11601209
class=property>line-break</code></a>&rsquo; property</h3>
11611210

1162-
<p>This property specifies the strictness of line-breaking rules:
1163-
particularly how line-breaking interacts with punctuation.
1164-
1165-
<p>CSS distinguishes between three levels of strictness in the rules for
1166-
implicit line breaking. The precise set of rules in effect for the
1167-
&lsquo;<code class=css>strict</code>&rsquo; and &lsquo;<code
1168-
class=css>loose</code>&rsquo; levels is up to the UA and should follow
1169-
language conventions. However, this specification does recommend that:
1170-
1171-
<ul>
1172-
<li>Following breaks be forbidden in &lsquo;<code
1173-
class=css>strict</code>&rsquo; line breaking and allowed in &lsquo;<code
1174-
class=css>normal</code>&rsquo;:
1175-
<ul>
1176-
<li>breaks before Japanese small kana
1177-
1178-
<li>breaks before the KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC)
1179-
</ul>
1180-
Additionally, if the language is known to be Chinese or Japanese, breaks
1181-
before hyphens (U+2010, U+2013, U+301C, U+30A0) may be allowed in
1182-
&lsquo;<code class=css>normal</code>&rsquo;.
1183-
1184-
<li>Following breaks be forbidden in &lsquo;<code
1185-
class=css>normal</code>&rsquo; and &lsquo;<code
1186-
class=css>strict</code>&rsquo; line breaking and allowed in &lsquo;<code
1187-
class=css>loose</code>&rsquo;:
1188-
<ul>
1189-
<li>breaks before iteration marks (U+3005, U+303B, U+309D, U+309E,
1190-
U+30FD, U+30FE)
1191-
1192-
<li>breaks between inseparatable characters (U+2014, U+2025, U+2026,
1193-
U+3033, U+3034, U+3035)
1194-
</ul>
1195-
If the language is known to be Chinese or Japanese, then additionally the
1196-
following breaks may be allowed in &lsquo;<code
1197-
class=css>loose</code>&rsquo;:
1198-
<ul>
1199-
<li>breaks before middle dots (U+003A, U+003B, U+30FB, U+FF1A, U+FF1B,
1200-
U+FF65)
1201-
1202-
<li>breaks before dividing punctuation marks (U+0021, U+003F, U+203C,
1203-
U+2047, U+2048, U+2049, U+FF01, U+FF1F)
1204-
1205-
<li>breaks before postfixes (U+0025, U+00A2, U+00B0, U+2030, U+2032,
1206-
U+2033, U+2103, U+FF05, U+FFE0)
1207-
1208-
<li>breaks after prefixes (U+0024, U+00A3, U+00A5, U+20AC, U+2116,
1209-
U+FF04, U+FFE1, U+FFE5)
1210-
</ul>
1211-
</ul>
1212-
12131211
<table class=propdef>
12141212
<tbody>
12151213
<tr>
@@ -1253,8 +1251,9 @@ <h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
12531251
<td>specified value
12541252
</table>
12551253

1256-
<p>This property specifies what set of line breaking restrictions are in
1257-
effect within the element. Values have the following meanings:
1254+
<p>This property specifies the strictness of line-breaking rules applied
1255+
within an element: particularly how line-breaking interacts with
1256+
punctuation. Values have the following meanings:
12581257

12591258
<dl>
12601259
<dt><dfn id=auto title="line-break:auto"><code>auto</code></dfn>
@@ -1277,6 +1276,53 @@ <h3 id=line-break><span class=secno>5.1. </span> Line Breaking Restrictions
12771276
<dd>Breaks text using the most stringent set of line-breaking rules.
12781277
</dl>
12791278

1279+
<p>CSS distinguishes between three levels of strictness in the rules for
1280+
implicit line breaking. The precise set of rules in effect for each level
1281+
is up to the UA and should follow language conventions. However, this
1282+
specification does recommend that:
1283+
1284+
<ul>
1285+
<li>Following breaks be forbidden in &lsquo;<code
1286+
class=css>strict</code>&rsquo; line breaking and allowed in &lsquo;<code
1287+
class=css>normal</code>&rsquo;:
1288+
<ul>
1289+
<li>breaks before Japanese small kana
1290+
1291+
<li>breaks before the KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC)
1292+
</ul>
1293+
Additionally, if the language is known to be Chinese or Japanese, breaks
1294+
before hyphens (U+2010, U+2013, U+301C, U+30A0) may be allowed in
1295+
&lsquo;<code class=css>normal</code>&rsquo;.
1296+
1297+
<li>Following breaks be forbidden in &lsquo;<code
1298+
class=css>normal</code>&rsquo; and &lsquo;<code
1299+
class=css>strict</code>&rsquo; line breaking and allowed in &lsquo;<code
1300+
class=css>loose</code>&rsquo;:
1301+
<ul>
1302+
<li>breaks before iteration marks (U+3005, U+303B, U+309D, U+309E,
1303+
U+30FD, U+30FE)
1304+
1305+
<li>breaks between inseparatable characters (U+2014, U+2025, U+2026,
1306+
U+3033, U+3034, U+3035)
1307+
</ul>
1308+
If the language is known to be Chinese or Japanese, then additionally the
1309+
following breaks may be allowed in &lsquo;<code
1310+
class=css>loose</code>&rsquo;:
1311+
<ul>
1312+
<li>breaks before middle dots (U+003A, U+003B, U+30FB, U+FF1A, U+FF1B,
1313+
U+FF65)
1314+
1315+
<li>breaks before dividing punctuation marks (U+0021, U+003F, U+203C,
1316+
U+2047, U+2048, U+2049, U+FF01, U+FF1F)
1317+
1318+
<li>breaks before postfixes (U+0025, U+00A2, U+00B0, U+2030, U+2032,
1319+
U+2033, U+2103, U+FF05, U+FFE0)
1320+
1321+
<li>breaks after prefixes (U+0024, U+00A3, U+00A5, U+20AC, U+2116,
1322+
U+FF04, U+FFE1, U+FFE5)
1323+
</ul>
1324+
</ul>
1325+
12801326
<h3 id=word-break><span class=secno>5.2. </span> Word Breaking Rules: the
12811327
&lsquo;<a href="#word-break0"><code
12821328
class=property>word-break</code></a>&rsquo; property</h3>
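Note on the relocated line-break recommendations: the first two bullet groups in the hunk above (small kana and U+30FC; iteration marks) can be read as a lookup keyed on strictness level. The following is a minimal Python sketch of that reading only, not how any UA implements it. The function name, the use of Unicode character names to detect small kana, and the treatment of 'loose' as allowing everything 'normal' allows are assumptions made for illustration; the language-dependent cases and the "inseparable characters" rule are left out.

```python
import unicodedata
from typing import Optional

PROLONGED_SOUND_MARK = "\u30fc"  # KATAKANA-HIRAGANA PROLONGED SOUND MARK
ITERATION_MARKS = set("\u3005\u303b\u309d\u309e\u30fd\u30fe")
LEVELS = ("loose", "normal", "strict")


def is_small_kana(ch: str) -> bool:
    # Assumption: small kana are recognised by their Unicode character names.
    name = unicodedata.name(ch, "")
    return name.startswith(("HIRAGANA LETTER SMALL", "KATAKANA LETTER SMALL"))


def recommended_break_before(ch: str, level: str) -> Optional[bool]:
    """Return True/False if the draft recommends allowing/forbidding a break
    before `ch` at this level, or None if these two rules do not cover it."""
    if level not in LEVELS:
        raise ValueError(f"unknown line-break level: {level!r}")
    if is_small_kana(ch) or ch == PROLONGED_SOUND_MARK:
        # Forbidden in 'strict', allowed in 'normal' (and, by assumption, 'loose').
        return level != "strict"
    if ch in ITERATION_MARKS:
        # Forbidden in 'normal' and 'strict', allowed in 'loose'.
        return level == "loose"
    return None  # outside the two rules sketched here
```

For example, recommended_break_before("\u30fc", "strict") gives False, while the same character at "normal" gives True.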
@@ -2521,8 +2567,8 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
25212567
class=css>inter-ideograph</code>&rsquo; for CJK, or &lsquo;<code
25222568
class=css>inter-word</code>&rsquo; for English. Another possibility is
25232569
to use a justification method that is a universal compromise for all
2524-
scripts, e.g. the &lsquo;<code class=css>distribute</code>&rsquo; method
2525-
with discrete scripts dropped to second priority.</p>
2570+
scripts, e.g. the &lsquo;<code class=css>inter-cluster</code>&rsquo;
2571+
method with block scripts raised to first priority.</p>
25262572

25272573
<dt><dfn id=none2 title="text-justify:none"><code>none</code></dfn>
25282574

@@ -2563,8 +2609,8 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
25632609
<dt><a name=kashida-prop></a><a name=text-kashida-space></a> <dfn
25642610
id=kashida title="text-justify:kashida"><code>kashida</code></dfn>
25652611

2566-
<dd>Justification primarily stretches Arabic and related scripts through
2567-
the use of kashida or other calligraphic elongation.
2612+
<dd>Justification primarily stretches <a href="#cursive-scripts">cursive
2613+
scripts</a> through the use of kashida or other calligraphic elongation.
25682614
</dl>
25692615

25702616
<p>When justifying text, the user agent takes the remaining space between
@@ -2580,44 +2626,6 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
25802626
be followed when any justification method other than &lsquo;<code
25812627
class=property>auto</code>&rsquo; is specified.
25822628

2583-
<p>Justification affects different types of writing systems in different
2584-
ways. For justification purposes, characters are grouped as follows:
2585-
2586-
<dl>
2587-
<dt id=block-scripts>block
2588-
2589-
<dd>CJK (including Hangul and half-width kana) and by extension all "wide"
2590-
characters. (See <a href="#UAX11"
2591-
rel=biblioentry>[UAX11]<!--{{!UAX11}}--></a>)
2592-
2593-
<dt id=clustered-scripts>clustered
2594-
2595-
<dd>South-East Asian scripts that have discrete units but do not use space
2596-
between words (such as Thai, Lao, Khmer, Myanmar). This category also
2597-
includes the Tibetan script.
2598-
2599-
<dt id=discrete-scripts>discrete
2600-
2601-
<dd>Scripts that use spaces or visible word-separating punctuation between
2602-
words and have discrete, unconnected (in print) units within words, such
2603-
as Latin, Greek, Ethiopic, Cyrillic, Hebrew.
2604-
2605-
<dt id=cursive-scripts>cursive
2606-
2607-
<dd>Arabic and similar cursive scripts
2608-
2609-
<dt id=connected-scripts>connected
2610-
2611-
<dd>Devanagari, Ogham, and other scripts that use spaces between words and
2612-
baseline connectors within words. By extension this group also includes
2613-
Gurmukhi, Tamil and any other Indic scripts whose typographic behavior is
2614-
similar to Devanagari.
2615-
</dl>
2616-
2617-
<p>The UA may enable or break optional ligatures or use other font features
2618-
such as alternate glyphs to help justify the text under any method. This
2619-
behavior is not defined by CSS.
2620-
26212629
<p id=expansion-opportunity>CSS defines <dfn
26222630
id=expansion-opportunities>expansion opportunities</dfn> as points where
26232631
the justification algorithm may alter spacing within the text. These
@@ -2627,7 +2635,7 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
26272635
lower priority expansion opportunities are adjusted. (Expansion and
26282636
compression limits are given by the <a
26292637
href="#letter-spacing">letter-spacing</a> and <a
2630-
href="#word-spacing">word-spacing</a> properties.
2638+
href="#word-spacing">word-spacing</a> properties.)
26312639

26322640
<p>How any remaining space is distributed once all expansion opportunities
26332641
reach their limits is up to the UA. If the inline contents of a line
@@ -2647,8 +2655,10 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
26472655
are given in the table below. Space must be distributed evenly among all
26482656
types of expansion opportunities in a given prioritization group, but may
26492657
vary within a line due to changes in the font or letter-spacing and
2650-
word-spacing values. The different types of expansion opportunities are
2651-
defined as follows:
2658+
word-spacing values. Since justification behavior varies by writing
2659+
system, expansion opportunities are organized by <a
2660+
href="#script-groups">script group</a>. The different types of expansion
2661+
opportunities are defined as follows:
26522662

26532663
<dl>
26542664
<dt>spaces
@@ -2979,6 +2989,10 @@ <h3 id=text-justify><span class=secno>8.4. </span> Justification Method:
29792989
<p class=note>The &lsquo;<code class=css>auto</code>&rsquo; column defined
29802990
above is informative.
29812991

2992+
<p>The UA may enable or break optional ligatures or use other font features
2993+
such as alternate glyphs to help justify the text under any method. This
2994+
behavior is not defined by CSS.
2995+
29822996
<div class=example>
29832997
<p>Japanese is one of the languages for which compression is preferred to
29842998
expansion in applying justification.</p>
@@ -5957,12 +5971,27 @@ <h2 class=no-num id=index>Index</h2>
59575971
<li>&lsquo;<code class=css>below right</code>&rsquo;, <a
59585972
href="#below-right" title="''below right''"><strong>11.1.6.</strong></a>
59595973

5974+
<li>block scripts, <a href="#block-scripts0" title="block
5975+
scripts"><strong>1.1.</strong></a>
5976+
5977+
<li>clustered scripts, <a href="#clustered-scripts0" title="clustered
5978+
scripts"><strong>1.1.</strong></a>
5979+
59605980
<li>collapsible, <a href="#collapsible"
59615981
title=collapsible><strong>4.2.</strong></a>
59625982

59635983
<li>&lsquo;<code class=css>column</code>&rsquo;, <a href="#column"
59645984
title="''column''"><strong>6.4.</strong></a>
59655985

5986+
<li>connected scripts, <a href="#connected-scripts0" title="connected
5987+
scripts"><strong>1.1.</strong></a>
5988+
5989+
<li>cursive scripts, <a href="#cursive-scripts0" title="cursive
5990+
scripts"><strong>1.1.</strong></a>
5991+
5992+
<li>discrete scripts, <a href="#discrete-scripts0" title="discrete
5993+
scripts"><strong>1.1.</strong></a>
5994+
59665995
<li><a href="#each-line"><code>each-line</code></a>, <a href="#each-line"
59675996
title=each-line><strong>10.1.</strong></a>
59685997
