Commit 63d3cf4

Add word-boundary-detection:normal for SEA languages
1 parent dfa8b4e commit 63d3cf4

1 file changed: +58 -3 lines changed
css-text-4/Overview.bs

Lines changed: 58 additions & 3 deletions
@@ -85,8 +85,8 @@ Detecting Word Boundaries: the 'word-boundary-detection' property</h4>
 
 <pre class="propdef">
 Name: word-boundary-detection
-Value: manual | auto(<<lang>>)
-Initial: manual
+Value: normal | manual | auto(<<lang>>)
+Initial: normal
 Applies to: [=inline boxes=]
 Inherited: yes
 Percentages: N/A
@@ -114,9 +114,64 @@ Detecting Word Boundaries: the 'word-boundary-detection' property</h4>
 <dd>
 The User Agent must not insert [=virtual word boundaries=].
 
-<dt><dfn>auto(<<lang>>)</dfn>
+[=Typographic character units=] with class SA in [[!UAX14]]
+must be treated as if they had class AL
+(i.e. assuming ''word-break: normal''
+and a value of 'line-break' other than ''line-break/anywhere'',
+there is no [=soft wrap opportunity=]
+between pairs of such characters).
+
+<div class=advisement>
+Authors using this value for Southeast Asian languages
+are expected to manually indicate word boundaries,
+for instance using <{wbr}> or U+200B.
+Otherwise, there will be no [=soft wrap opportunity=]
+and the text may overflow.
+</div>
+
+<dt><dfn>normal</dfn>
 <dd>
+The User Agent must not insert [=virtual word boundaries=],
+except within runs of characters belonging to Southeast Asian languages,
+where content analysis must be performed
+to determine where to insert [=virtual word boundaries=].
 
+As with ''word-boundary-detection/manual'',
+[=typographic character units=] with class SA in [[!UAX14]]
+must be treated as if they had class AL;
+however, the User Agent must additionally
+analyse the content of a run of such characters
+and insert [=virtual word boundaries=] where appropriate.
+Within the constraints set by this specification,
+the specific algorithm used is UA-dependent.
+
+As various languages can be written in scripts
+which use the characters with class SA,
+if the [=content language=] is known,
+the User Agent should use this information
+to tailor its analysis.
+
+In order to avoid overflow,
+if the User Agent is unable to perform this analysis
+for any subset of the characters with class SA--
+for example due to lacking a dictionary for certain languages--
+there must be a [=soft wrap opportunity=]
+between pairs of characters in that subset.
+
+Note: This [=soft wrap opportunity=] is not
+a [=virtual word boundary=],
+and is ignored by 'word-boundary-expansion'.
+
+Note: This provision is not triggered merely when
+the UA fails to find a word boundary in a particular text run;
+the text run may well be a single unbreakable word.
+It applies for example
+when a text run is composed of Khmer characters (U+1780 to U+17FF)
+if the User Agent does not know how to determine
+word boundaries in Khmer.
+
+<dt><dfn>auto(<<lang>>)</dfn>
+<dd>
 This value directs the User Agent to perform language-specific content analysis
 to determine where to insert [=virtual word boundaries=].
 
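The advisement added for the manual value can be illustrated with a small author-side sketch. This is a hypothetical example, not part of the commit: the class name manual-breaks and the Thai sample text are assumptions chosen for illustration, and 'word-boundary-detection' is a draft property that a given browser may not implement.

```html
<!-- Hypothetical author markup; the class name and sample text are
     illustrative assumptions, not taken from the commit. -->
<style>
  /* With the manual value, the UA inserts no virtual word boundaries,
     and class SA characters are treated as class AL, so the author
     has to mark the break points. */
  .manual-breaks {
    word-boundary-detection: manual;
  }
</style>

<!-- Break point marked with the wbr element. -->
<p lang="th" class="manual-breaks">ภาษา<wbr>ไทย</p>

<!-- The same break point marked with U+200B (ZERO WIDTH SPACE),
     written here as a character reference. -->
<p lang="th" class="manual-breaks">ภาษา&#x200B;ไทย</p>
```

Under the new initial value normal, neither marker would be required: the User Agent is expected to analyse the run itself, or, if it cannot analyse that script at all, to allow a soft wrap opportunity between each pair of characters.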