@@ -464,33 +464,45 @@ <h3 id="partial-impl"><span class="secno">2.1.</span> Partial and Experimental I
464464< h2 id ="white-space-processing "> < span class ="secno "> 3.</ span >
465465 White Space Processing</ h2 >
466466
467+ < p > The source text of a document often contains formatting that
468+ is not relevant to the final rendering: for example, breaking
469+ the source into segments (lines) for ease of editing or adding
470+ white space characters such as tabs and spaces to indent the
471+ source code. CSS white space processing allows the author to
472+ control interpretation of such formatting: to preserve or
473+ collapse it away when rendering the document.
474+
475+ < p > Segments in the document source can be separated by a carriage
476+ return (U+000D), a linefeed (U+000A) or both (U+000D U+000A),
477+ or by some other mechanism that identifies the beginning
478+ and end of document segments, such as the SGML RECORD-START
479+ and RECORD-END tokens.
480+ If no segmentation rules are specified for the document language,
481+ each line feed (U+000A), carriage return (U+000D) and CRLF sequence
482+ (U+000D U+000A) in the text is considered a segment break. (This
483+ default rule also applies to generated content.)
484+ In CSS, each such segment break is treated as a single line feed
485+ character (U+000A).
486+
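As a minimal sketch of the default segmentation rule described above (an illustration, not part of the specification text), the following Python treats each CRLF sequence, lone carriage return, and line feed as one segment break and represents it as a single line feed. The names `normalize_segment_breaks` and `WHITE_SPACE_SET` are hypothetical, and the sketch assumes the document language defines no segmentation rules of its own.

```python
import re

# The document white space set as defined in this section:
# space (U+0020), tab (U+0009), and line feed (U+000A).
WHITE_SPACE_SET = {"\u0020", "\u0009", "\u000a"}


def normalize_segment_breaks(text):
    """Treat each CRLF, CR, or LF as one segment break (a single U+000A).

    The CRLF alternative is listed first so that the pair becomes one
    line feed rather than two.
    """
    return re.sub(r"\r\n|\r|\n", "\n", text)
```

Matching the two-character CRLF sequence before the single-character alternatives is the one design point that matters here; replacing CR and LF independently would double every CRLF segment break.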
467487 < p > White space processing in CSS interprets white space characters
468488 for rendering: it has no effect on the underlying document data.
469- In the context of CSS, the document white space set is defined to be
470- any space characters (Unicode value U+0020), tab characters (U+0009),
471- or line break characters (defined by the document format: typically
472- line feed, U+000A). Control characters besides the white space
473- characters and the bidi formatting characters (U+202x) are treated as
474- normal characters and rendered according to the same rules.
475-
476- < p > The document parser must normalize line break character sequences
477- according to its own format rules before CSS processing takes effect.
478- However, in generated content strings the line feed character (U+000A)
479- and only the line feed character is considered a line break sequence.
480- For CSS white space processing all line breaks must be normalized to a
481- single character representation—usually the line feed character
482- (U+000A)—here called a "line break". This way, all recognized
483- line breaks are treated the same and style rules behave consistently
484- across systems.</ p >
489+ In the context of CSS, the document white space set is defined
490+ to be any space characters (Unicode value U+0020), tab characters
491+ (U+0009), and line feeds (U+000A).
485492
486493 < p class ="note "> Note that the document parser may have not only normalized
487- line break characters , but also collapsed other space characters or
494+ segment breaks , but also collapsed other space characters or
488495 otherwise processed white space according to markup rules. Because CSS
489496 processing occurs < em > after</ em > the parsing stage, it is not possible
490497 to restore these characters for styling. Therefore, some of the
491498 behavior specified below can be affected by these limitations and
492499 may be user agent dependent.</ p >
493500
501+ < p > Control characters other than U+0009 (tab), U+000A (line feed),
502+ U+0020 (space), and U+202x (bidi formatting characters) are treated
503+ as characters to render in the same way as any normal character.
504+ < span class ="issue "> Copied from CSS2.1 but this has got to be wrong.</ span >
505+
494506 < h3 id ="white-space-collapsing "> < span class ="secno "> 3.1.</ span >
495507 White Space Collapsing: the 'white-space-collapsing' property</ h3 >
496508
@@ -548,10 +560,10 @@ <h3 id="white-space-collapsing"><span class="secno">3.1.</span>
548560 cases</ a > , no character).</ dd >
549561 < dt > < dfn title ="white-space-collapsing:preserve "> < code > preserve</ code > </ dfn > </ dt >
550562 < dd > This value prevents user agents from collapsing sequences
551- of white space. Line breaks are preserved.</ dd >
563+ of white space. Segment breaks are preserved as forced line breaks .</ dd >
552564 < dt > < dfn title ="white-space-collapsing:preserve-breaks "> < code > preserve-breaks</ code > </ dfn > </ dt >
553565 < dd > This value collapses white space as for 'collapse', but preserves
554- line breaks.</ dd >
566+ segment breaks as forced line breaks.</ dd >
555567 < dt > < dfn title ="white-space-collapsing:discard "> < code > discard</ code > </ dfn > </ dt >
556568 < dd > This value directs user agents to "discard" all white space in the
557569 element.
@@ -566,9 +578,6 @@ <h3 id="white-space-collapsing"><span class="secno">3.1.</span>
566578 < h3 id ="white-space-rules "> < span class ="secno "> 3.2.</ span >
567579 The White Space Processing Rules</ h3 >
568580
569- < p > Any text that is directly contained inside a block (not inside
570- an inline) is treated as being inside an anonymous inline element.</ p >
571-
572581 < p > For each inline (including anonymous inlines), white space
573582 characters are handled as follows, ignoring bidi formatting
574583 characters as if they were not there:</ p >
@@ -579,19 +588,21 @@ <h3 id="white-space-rules"><span class="secno">3.2.</span>
579588 are considered < dfn > collapsible</ dfn > and are processed by
580589 performing the following steps:</ p >
581590 < ol >
582- < li > All non-line-break white space characters immediately following
583- a line break character are removed. (This has the effect of
584- discarding all white space at the start of a line but preserving
585- a trailing space if one exists at the end.)</ li >
591+ < li > All spaces and tabs immediately following a line feed character
592+ are removed. (This has the effect of discarding all white space
593+ at the start of a line but preserving a trailing space if one
594+ exists at the end.)</ li >
586595 < li > If < span class ="property "> 'white-space-collapsing'</ span > is not
587- 'preserve-breaks', line break characters are transformed for
596+ 'preserve-breaks', line feed characters are transformed for
588597 rendering according to the < a href ="#line-break-transform "> line
589598 break transformation rules</ a > .
590599 </ li >
591600 < li > Every tab (U+0009) is converted to a space (U+0020)</ li >
592- < li > Any space (U+0020) following another space (U+0020)—even
593- a space before the inline, if that space is also collapsible—is
594- removed.</ li >
601+ < li > Any space (U+0020) following another collapsible space
602+ (U+0020)—even a space before the inline—is removed.
603+ However, if removing this space would eliminate a line breaking
604+ opportunity in the text, that opportunity is still considered
605+ to exist.</ li >
595606 </ ol >
596607 </ li >
597608 < li > < p > If < span class ="property "> 'white-space-collapsing'</ span > is set to
@@ -612,13 +623,13 @@ <h3 id="white-space-rules"><span class="secno">3.2.</span>
612623 < ol >
613624 < li > A sequence of collapsible spaces (U+0020) at the beginning of a
614625 line is removed.</ li >
615- < li > A tab (U+0009) is rendered as a horizontal shift that lines up
626+ < li > Each tab (U+0009) is rendered as a horizontal shift that lines up
616627 the start edge of the next glyph with the next tab stop.
617628 Tab stops occur at points that are multiples of 8 times the width
618629 of a space (U+0020) rendered in the block's font from the block's
619630 starting content edge.</ li >
620- < li > A sequence of collapsible spaces (U+0020) or ideographic spaces
621- (U+3000) at the end of a line is removed.</ li >
631+ < li > A sequence of collapsible spaces (U+0020) at the end of a line
632+ is removed.</ li >
622633 < li > If spaces (U+0020) or tabs (U+0009) at the end of a line are
623634 non-collapsible but have 'text-wrap' set to 'normal' or 'suppress'
624635 the UA may visually collapse them.
@@ -665,37 +676,43 @@ <h4 id="egbidiwscollapse"><span class="secno">3.2.1.</span>
665676 < h4 id ="line-break-transform "> < span class ="secno "> 3.2.2.</ span >
666677 Line Break Transformation Rules</ h4 >
667678
668- < p > When line breaks are < a href ="#collapse "> collapsible</ a > , they are
679+ < p > When line feeds are < a href ="#collapse "> collapsible</ a > , they are
669680 either transformed into a space (U+0020) or removed depending on the
670681 script context before and after the line break.</ p >
671682
672683 < p > The script context is determined by the Unicode-given script value
673- [UAX24] of the first character that side of the line break . However,
684+ [UAX24] of the first character that side of the line feed . However,
674685 characters such as punctuation that belong to the COMMON and INHERITED
675686 scripts are ignored in this check; the next character is examined
676687 instead. The UA must not examine characters outside the block and may
677688 limit its examination to as few as four characters on each side of the
678- line break . If the check fails to find an acceptable script value
689+ line feed . If the check fails to find an acceptable script value
679690 (i.e. it has hit the check limits), then the script context is neutral.</ p >
680691
692+ < p class ="note "> Note that the white space processing rules have already
693+ removed any tabs and spaces after the line feed before these checks
694+ take place.</ p >
695+
681696 < ul >
682697 < li > If the character immediately before or immediately after the line
683- break is the zero width space character (U+200B), then the line break
698+ feed is the zero width space character (U+200B), then the line feed
684699 is removed.
685- < li > Otherwise, if the script context on one side of the line break is
700+ < li > Otherwise, if the script context on one side of the line feed is
686701 Han, Yi, Hiragana, or Katakana and the context on the other side is
687- Han, Yi, Hiragana, Katakana, or neutral, then the line break is removed.
688- < p class ="issue "> IE removes line breaks not only between two Hiragana but also Hiragana and fullwidth alphabets.
689- Firefox, Opera, and Safari does not remove line breaks even between two Hiragana.
690- So current rule written here will break everyone.</ p >
702+ Han, Yi, Hiragana, Katakana, or neutral, then the line feed is removed.
703+ < p class ="issue "> IE removes line feeds not only between two Hiragana
704+ characters but also between Hiragana and fullwidth Latin letters.
705+ Firefox, Opera, and Safari do not remove line feeds even between
706+ two Hiragana characters.
707+ Should fullwidth Latin be considered in this rule?</ p >
691708 < li > Otherwise, if ''text-autospace'' property is set to add extra spaces
692- for the combination of the character before the line break and after,
709+ for the combination of the character before the line feed and after,
693710 then the line break is removed.
694711 < p class ="issue "> Any feedback on this behavior is appreciated.
695712 Now that we have the text-autospace property, it makes sense to favor it over inserting spaces.
696713 However, we also want to preserve backward compatibility as much as possible.
697- That is the reason why the line break should be removed only if text-autospace inserts spaces instead.</ p > </ li >
698- < li > Otherwise, the line break is converted to a space (U+0020).
714+ That is the reason why the line feed should be removed only if text-autospace inserts spaces instead.</ p > </ li >
715+ < li > Otherwise, the line feed is converted to a space (U+0020).
699716 </ ul >
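The rules in the list above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the specified algorithm: the script lookup approximates the UAX24 script property by matching Unicode character names (Python's standard library does not expose the script property), any non-letter character is treated as if it belonged to the COMMON or INHERITED scripts, the 'text-autospace' rule is not modeled at all, and the names `script_context` and `transform_line_feeds` are hypothetical. The whole input string stands in for the block.

```python
import unicodedata

ZWSP = "\u200b"
# Assumption: character-name prefixes stand in for the Han, Hiragana,
# Katakana, and Yi script values of UAX #24.
CJK_NAME_PREFIXES = ("CJK UNIFIED IDEOGRAPH", "HIRAGANA", "KATAKANA", "YI SYLLABLE")


def script_context(text, start, step, limit=4):
    """Classify one side of a line feed as 'cjk', 'other', or 'neutral'.

    Examines at most `limit` characters, moving by `step` (+1 or -1), and
    skips non-letters as an approximation of COMMON/INHERITED characters.
    """
    i, examined = start, 0
    while 0 <= i < len(text) and examined < limit:
        ch = text[i]
        examined += 1
        if unicodedata.category(ch).startswith("L"):
            name = unicodedata.name(ch, "")
            return "cjk" if name.startswith(CJK_NAME_PREFIXES) else "other"
        i += step
    return "neutral"  # check limit hit, or ran out of text in the block


def transform_line_feeds(text):
    """Remove each collapsible line feed or convert it to a space."""
    out = []
    for i, ch in enumerate(text):
        if ch != "\n":
            out.append(ch)
            continue
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before == ZWSP or after == ZWSP:
            continue  # adjacent zero width space: line feed is removed
        left = script_context(text, i - 1, -1)
        right = script_context(text, i + 1, +1)
        if (left == "cjk" and right in ("cjk", "neutral")) or (
            right == "cjk" and left in ("cjk", "neutral")
        ):
            continue  # CJK context on both sides (or CJK plus neutral)
        out.append(" ")  # otherwise the line feed becomes a space
    return "".join(out)
```

Under these assumptions, a line feed between two Latin words becomes a space, while one between two Han or Hiragana characters disappears. A production implementation would need real script data rather than the name-prefix heuristic used here.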
700717
701718 < p class ="issue "> Comments on how well this would work in practice would
@@ -1075,8 +1092,8 @@ <h3 id="text-wrap"><span class="secno">5.1.</span>
10751092 for the WJ, ZW, and GL line-breaking classes in
10761093 [< a href ="#UAX14 "> UAX14</ a > ] must be honored.
10771094 < dt > < dfn title ="text-wrap:none "> < code > none</ code > </ dfn > </ dt >
1078- < dd > Lines may not break except at forced break points; text that
1079- does not fit within the block box overflows it.</ dd >
1095+ < dd > Lines may not break; text that does not fit within the block container
1096+ overflows it.</ dd >
10801097 < dt > < dfn title ="text-wrap:unrestricted "> < code > unrestricted</ code > </ dfn > </ dt >
10811098 < dd > Lines may break between any two grapheme clusters. Line-breaking
10821099 restrictions have no effect and hyphenation does not take place.
@@ -1090,7 +1107,8 @@ <h3 id="text-wrap"><span class="secno">5.1.</span>
10901107 'normal'.
10911108 </ dl >
10921109
1093- < p > For all values, line-breaking behavior defined for the BK, CR, LF, CM
1110+ < p > Regardless of the 'text-wrap' value, lines always break at forced breaks:
1111+ for all values, line-breaking behavior defined for the BK, CR, LF, CM,
10941112 NL, and SG line breaking classes in [< a href ="#UAX14-norm "> UAX14</ a > ] must
10951113 be honored.</ p >
10961114