 <li>Otherwise, the [=segment break=] is converted to a space (U+0020).

 <wpt>
@@ -2183,18 +2187,25 @@ Order of Operations</h4>
 </wpt>

 </ul>
-<!--
 <p>
 For this purpose,
 Emoji (Unicode property <code>Emoji</code>)
 with an <a>East Asian Width property</a> of
 <code>Wide</code> or <code>Neutral</code>
 are treated as having an <a>East Asian Width property</a> of
 <code>Ambiguous</code>.
--->
-Note: The white space processing rules have already
+
+
+ISSUE(5086): Should space-discarding punctuation have a stronger influence over mismatched before/after contexts?
+
+ISSUE(5017): Should we classify punctuation and/or symbols as a category of space-ambiguous characters? (Currently spaces are discarded only if both sides are space-discarding; ambiguous characters would defer to the other side.)
+
+CUT SEGMENT BREAK TRANSFORM -->
+
+Note: The white space processing rules have already
 removed any [=tabs=] and [=spaces=] around the [=segment break=]
-before these checks take place.
+before this context is evaluated.
+</ol>

 <div class="example">
 The purpose of the segment break transformation rules
@@ -2210,9 +2221,10 @@ Order of Operations</h4>
 Here is an English paragraph
 that is broken into multiple lines
 in the source code so that it can
-more easily read in a text editor.
+be more easily read and edited
+in a text editor.
 </pre>
-<p>Here is an English paragraph that is broken into multiple lines in the source code so that it can be more easily read in a text editor.</p>
+<p>Here is an English paragraph that is broken into multiple lines in the source code so that it can be more easily read and edited in a text editor.</p>
 <figcaption>
 Eliminating a line break in English requires maintaining a [=space=] in its place.
 </figcaption>
@@ -2233,21 +2245,16 @@ Order of Operations</h4>
 </figcaption>
 </figure>

-The segment break transformation rules thus use adjacent context
+The segment break transformation rules can use adjacent context
 to either transform the segment break into a space
 or eliminate it entirely.
 </div>

-<p class="feedback issue">Comments on how well these rules would work in practice would
-be very much appreciated, particularly from people who work with
-Thai and similar scripts.
-Note that browser implementations do not currently follow these rules consistently
-(although IE does in some cases transform the break,
-and Firefox follows the first two bullet points).</p>
-
-ISSUE(5086): Should space-discarding punctuation have a stronger influence over mismatched before/after contexts?
-
-ISSUE(5017): Should we classify punctuation and/or symbols as a category of space-ambiguous characters? (Currently spaces are discarded only if both sides are space-discarding; ambiguous characters would defer to the other side.)
+Note: Historically, HTML and CSS have unconditionally converted [=segment breaks=] to spaces,
+which has prevented content authored in languages such as Chinese
+from being able to break lines within the source.
+Thus UA heuristics need to be conservative about where they discard [=segment breaks=]
+even as they strive to improve support for such languages.