Commit af8ab86

ideas/at-text-transform: <code css> for basic syntax highlighting, note on Greek
1 parent be674e7 commit af8ab86

1 file changed

Lines changed: 30 additions & 27 deletions

ideas/at-text-transform.txt

@@ -4,7 +4,7 @@ This is an early draft for a possible generic mechanism to allow authors to defi
44

55
The general form of an @text-transform at-rule is:
66

7-
<code>
7+
<code css>
88
@text-transform <transform-name>
99
{ [ descriptor: value; ]+ }
1010
</code>
@@ -15,7 +15,7 @@ A text transform created using this at-rule may be used simply by using <transfo
1515

1616
===== The transformation descriptor =====
1717

18-
<code>
18+
<code bnf>
1919
Name: transformation
2020
Value: <conversion>#
2121
default: N/A
@@ -26,7 +26,6 @@ default: N/A
2626
<enumeration> = <string>
2727
</code>
2828

29-
3029
This descriptor defines which character will be replaced by which, by listing a series of conversions, to be applied in the same order as they appear in the descriptor.
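
As a rough illustration of the ordering rule above, here is a minimal Python sketch. It is not part of the draft: the helper names `expand` and `apply_conversions` are hypothetical, the `"a-z"` range handling is inferred from the examples in this document, and it assumes each conversion is a full pass over the text whose output feeds the next conversion. It only handles string-to-string conversions of equal length, not references to other transforms.

```python
def expand(char_list):
    """Expand "a-z"-style ranges into an explicit list of characters.

    Assumption: a range is written as <first>-<last>, as in the examples.
    """
    chars = []
    i = 0
    while i < len(char_list):
        if i + 2 < len(char_list) and char_list[i + 1] == "-":
            first, last = ord(char_list[i]), ord(char_list[i + 2])
            chars.extend(chr(c) for c in range(first, last + 1))
            i += 3
        else:
            chars.append(char_list[i])
            i += 1
    return chars


def apply_conversions(text, conversions):
    """Apply (source, target) conversions in the order they are listed."""
    for source, target in conversions:
        src, dst = expand(source), expand(target)
        if len(src) != len(dst):
            # The draft has its own rule for unequal lengths; this sketch
            # does not try to reproduce it.
            raise ValueError("this sketch requires lists of equal length")
        text = text.translate({ord(s): d for s, d in zip(src, dst)})
    return text


print(apply_conversions("abc xyz", [("a-z", "A-Z")]))
```

Under these assumptions, a single conversion `"abc" to "def"` and the three conversions `"a" to "d"`, `"b" to "e"`, `"c" to "f"` give the same result for any input.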
3130

3231
Conversions may refer to existing text transforms, either predefined by CSS or defined by the author. While a transformation using only a single such conversion is not very useful, combining it with other conversions allows authors to extend or define variants of existing transforms. Referring to the text-transform currently being defined is not allowed, and makes the whole descriptor invalid.
@@ -52,19 +51,22 @@ In a <conversion>, If the source <char-list> is longer than the target <char-lis
5251
<note warning>ISSUE 4: It has been suggested that it should be possible to write text-transforms that behave differently on different languages. This can probably be achieved by adding some optional part at the beginning of each <conversion>, although I am not sure what
5352
the syntax should be.</note>
5453

55-
5654
Examples:
5755

58-
<code>
59-
@text-transform latin-only-uppercase { transformation: "a-z" to "A-Z"; }
56+
<code css>
57+
@text-transform latin-only-uppercase
58+
{
59+
transformation: "a-z" to "A-Z";
60+
}
6061
</code>
6162

62-
6363
The following two transforms are identical.
6464

65-
<code>
66-
@text-transform abcdef1 { transformation: "abc" to "def"; }
67-
65+
<code css>
66+
@text-transform abcdef1
67+
{
68+
transformation: "abc" to "def";
69+
}
6870
@text-transform abcdef2
6971
{
7072
transformation: "a" to "d",
@@ -76,7 +78,7 @@ The following two transforms are identical.
7678

7779
===== The character-type descriptor =====
7880

79-
<code>Name: character-type
81+
<code bnf>Name: character-type
8082
Value: extended | legacy | single
8183
Default: extended
8284
</code>
@@ -98,7 +100,7 @@ This definition affects character processing in two different contexts:
98100

99101

100102
===== The scope descriptor =====
101-
<code>Name: scope
103+
<code bnf>Name: scope
102104
Value: all | [initial || medial || final]
103105
Default: all
104106
</code>
@@ -112,7 +114,6 @@ This descriptor makes it possible to restrict which characters in the source tex
112114

113115
<note warning>ISSUE 7: More fancy values could be added here in the future to support things like title case, or to match only the base character, or only the diacritics.</note>
114116

115-
116117
The definition of "word" is UA-dependent; [[http://www.unicode.org/reports/tr29/tr29-17.html|UAX29]] is suggested (but not required) for determining such word boundaries.
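
The effect of restricting a transform to certain positions within a word can be sketched in Python. This is only an illustration under stated assumptions: the function name `apply_scoped` is hypothetical, the regular expression `\w+` is a crude stand-in for UAX29 word segmentation, a one-letter word is treated as both initial and final, and characters outside words are left untouched. None of these choices come from the draft.

```python
import re

# Crude stand-in for UAX29 word boundaries (an assumption of this sketch).
WORD = re.compile(r"\w+")

ALL_POSITIONS = frozenset({"initial", "medial", "final"})


def apply_scoped(text, mapping, scope=ALL_POSITIONS):
    """Apply a character mapping only at the word positions named in scope."""

    def convert(match):
        word = match.group(0)
        last = len(word) - 1
        out = []
        for i, ch in enumerate(word):
            positions = set()
            if i == 0:
                positions.add("initial")
            if i == last:
                positions.add("final")
            if not positions:
                positions.add("medial")
            # Convert only when this position is covered by the scope.
            out.append(mapping.get(ch, ch) if positions & set(scope) else ch)
        return "".join(out)

    return WORD.sub(convert, text)


upper = {c: c.upper() for c in "abcdefghijklmnopqrstuvwxyz"}
print(apply_scoped("hello wide world", upper, {"initial"}))
print(apply_scoped("success", {"s": "ſ"}, {"initial", "medial"}))
```

With a scope of `initial` only, an uppercase mapping behaves like a capitalize transform; with `initial` and `medial`, an `s` to `ſ` mapping leaves a word-final s unchanged.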
117118

118119
The transformation descriptor may be used to refer to existing text-transforms in the definition of a new one. If the text-transforms
@@ -121,9 +122,11 @@ two scopes.
121122

122123
Example:
123124

124-
<code>
125-
@text-transform latin-only-uppercase { transformation: "a-z" to "A-Z"; }
126-
125+
<code css>
126+
@text-transform latin-only-uppercase
127+
{
128+
transformation: "a-z" to "A-Z";
129+
}
127130
@text-transform latin-only-capitalize
128131
{
129132
transformation: latin-only-uppercase;
@@ -141,7 +144,7 @@ The following use cases only apply to a single language. Defining all the possib
141144
==== Full-size kana ====
142145
In Japanese, small kana appearing within ruby are sometimes replaced by the equivalent full-size kana. The following transform defines this conversion:
143146

144-
<code>
147+
<code css>
145148
@text-transform full-size-kana
146149
{
147150
transformation: "ぁぃぅぇぉゕゖっゃゅょゎ" to "あいうえおかけつやゆよわ",
@@ -189,8 +192,7 @@ The uppercasing and lowercasing algorithm defined for the text-transform propert
189192

190193
Someone, for example in a user style sheet, may want to apply an uppercase or lowercase transform to a document where language is insufficiently marked up, but known to the author of the style sheet to be Turkish. In this case, the generic uppercase and lowercase transforms would fail, but the following would work.
191194

192-
193-
<code>
195+
<code css>
194196
@text-transform turkic-uppercase
195197
{
196198
transformation: "i" to "İ", uppercase;
@@ -209,13 +211,12 @@ http://en.wikipedia.org/wiki/Georgian_alphabet
209211

210212
The Georgian language has used three different unicameral alphabets through history: Asomtavruli, Nuskhuri, and Mkhedruli. Recently, some authors have been using Asomtavruli letters in an otherwise Mkhedruli text, in a way that resembles a bicameral alphabet. One may assume that they would find the following transform useful.
211213

212-
<code>
214+
<code css>
213215
@text-transform Mkhedruli-to-Asomtavruli
214216
{
215217
transformation: "ა-ჵ" to "Ⴀ-Ⴥ";
216218
}
217-
</code>
218-
<code>
219+
219220
@text-transform Asomtavruli-to-Mkhedruli
220221
{
221222
transformation: "Ⴀ-Ⴥ" to "ა-ჵ";
@@ -235,7 +236,7 @@ In old (18th century and earlier) European texts, the letter s, when at the midd
235236

236237
Modern readers are often unfamiliar with this letter form, and for readability reasons, one may want to convert from one to the other. The following transform would accomplish this.
237238

238-
<code>
239+
<code css>
239240
@text-transform modernize-s
240241
{
241242
transformation: "ſ" to "s";
@@ -244,7 +245,7 @@ Modern readers are often unfamiliar with this letter form, and for readability r
244245

245246
This does the opposite transform:
246247

247-
<code>
248+
<code css>
248249
@text-transform long-s
249250
{
250251
transformation: "s" to "ſ" ;
@@ -260,7 +261,7 @@ Here are some more example of how the generic mechanism may be used
260261

261262
Most writing systems of the world have at least one common transliteration scheme into the roman script.
262263

263-
<code>
264+
<code css romanization.css>
264265
@text-transform romanization
265266
{/* ISO 9 (Cyrillic) */
266267
transformation: "А а Ӑ ӑ Ӓ ӓ Ә ә Б б В в Г г Ґ ґ Ҕ ҕ Ғ ғ Д д Ђ ђ Ѓ ѓ Е е Ё ё Ӗ ӗ Є є Ҽ ҽ Ҿ ҿ
@@ -280,12 +281,14 @@ Most writing systems of the world have at least one common transliteration schem
280281
N n X x O o Ó ó P p R r S s s T t Y y Ý ý Ÿ ÿ F f Ch ch Ps ps Ō ō Ṓ ṓ";
281282
}
282283
</code>
284+
<note>The Greek example above only works if ISSUE 2 is resolved, because Theta, Chi and Psi are transliterated into digraphs that don’t have a single code point in Unicode.</note>
285+
283286
==== Comic book vikings ====
284287
In the "Asterix and the Great Crossing" comic book, the Viking characters are supposed to speak a foreign language unintelligible to the main characters, but still understandable to the readers. This is represented by writing down their speech normally, except that some letters are replaced by similar-looking letters found in Scandinavian languages.
285288

286289
This effect could be obtained by the following transform:
287290

288-
<code>
291+
<code css>
289292
@text-transform fake-norse
290293
{
291294
transformation: "aoAO" to "åøÅØ";
@@ -295,7 +298,7 @@ This effect could be obtained by the following transform:
295298
==== Leet speak ====
296299
In Internet, hacker, and gamer culture, it is common to replace characters with other characters or character sequences that have a somewhat similar glyphic appearance. Although no single agreed convention exists, and the mappings are sometimes neither injective nor surjective, one could simulate this playful style with a transform like the following:
297300

298-
<code>
301+
<code css>
299302
@text-transform leet-speak
300303
{
301304
transformation: "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
