@@ -1360,14 +1360,29 @@ <h4>
13601360
13611361 < p >
13621362 Create a new unicode-range token
1363- with both its start value and end value
1364- initially set to the empty string.
1363+ with an empty range.
13651364
13661365 < p >
13671366 Consume as many < i > hex digits</ i > as possible, but no more than 6.
1368- Interpret the digits as a hexadecimal number,
1369- and set the unicode-range token's start value
1370- to that number.
1367+ If fewer than 6 < i > hex digits</ i > were consumed,
1368+ consume as many U+003F QUESTION MARK (?) characters as possible,
1369+ but no more than enough to make the total of < i > hex digits</ i > and U+003F QUESTION MARK (?) characters equal to 6.
1370+
1371+ < p >
1372+ If any U+003F QUESTION MARK (?) characters were consumed,
1373+ first interpret the consumed characters as a hexadecimal number,
1374+ with the U+003F QUESTION MARK (?) characters replaced by U+0030 DIGIT ZERO (0) characters.
1375+ This is the < i > start of the range</ i > .
1376+ Then interpret the consumed characters as a hexadecimal number again,
1377+ with the U+003F QUESTION MARK (?) characters replaced by U+0046 LATIN CAPITAL LETTER F (F) characters.
1378+ This is the < i > end of the range</ i > .
1379+ < i > Set the unicode-range token's range</ i > , then emit it.
1380+ Switch to the < i > data state</ i > .
1381+
1382+ < p >
1383+ Otherwise,
1384+ interpret the digits as a hexadecimal number.
1385+ This is the < i > start of the range</ i > .
13711386
13721387 < p >
13731388 Consume the < i > next input character</ i > .
@@ -1377,21 +1392,21 @@ <h4>
13771392 < dd >
13781393 If the < i > next input character</ i > is a < i > hex digit</ i > ,
13791394 consume as many < i > hex digits</ i > as possible, but no more than 6.
1380- Interpret the digits as a hexadecimal number,
1381- and set the unicode-range token's end value to that number .
1382- Emit the unicode-range token.
1395+ Interpret the digits as a hexadecimal number.
1396+ This is the < i > end of the range</ i > .
1397+ < i > Set the unicode-range token's range</ i > , then emit it.
13831398 Switch to the < i > data state</ i > .
13841399
13851400 < p >
13861401 Otherwise,
1387- set the unicode-range token's end value to its start value
1402+ < i > set the unicode-range token's range </ i >
13881403 and emit it.
13891404 Switch to the < i > data state</ i > .
13901405 Reconsume the < i > current input character</ i > .
13911406
13921407 < dt > anything else
13931408 < dd >
1394- Set the unicode-range token's end value to its start value
1409+ < i > Set the unicode-range token's range </ i >
13951410 and emit it.
13961411 Switch to the < i > data state</ i > .
13971412 Reconsume the < i > current input character</ i > .
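
The consumption steps of this state can be sketched in Python as follows. This is a minimal illustration, not part of the spec: the function name `parse_unicode_range` and its return shape are hypothetical, the input is assumed to be the text immediately after `U+`, and the branch that introduces the end value is assumed to be triggered by U+002D HYPHEN-MINUS (-), since that `<dt>` falls outside the lines shown in this diff.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def parse_unicode_range(text):
    """Return (start, end, consumed) for the text following 'U+'.

    `end` is None when no end of the range was given.
    `consumed` is the number of characters used from `text`.
    Assumes `text` begins with a hex digit or a question mark.
    """
    i = 0
    digits = ""
    # Consume as many hex digits as possible, but no more than 6.
    while i < len(text) and len(digits) < 6 and text[i] in HEX_DIGITS:
        digits += text[i]
        i += 1

    # Pad with question marks up to a total of 6 characters.
    marks = 0
    while i < len(text) and len(digits) + marks < 6 and text[i] == "?":
        marks += 1
        i += 1

    if not digits and not marks:
        raise ValueError("expected a hex digit or '?'")

    if marks:
        # '?' -> '0' gives the start, '?' -> 'F' gives the end.
        start = int(digits + "0" * marks, 16)
        end = int(digits + "F" * marks, 16)
        return start, end, i

    start = int(digits, 16)

    # Assumed trigger for the end value: '-' followed by a hex digit.
    if i + 1 < len(text) and text[i] == "-" and text[i + 1] in HEX_DIGITS:
        i += 1
        end_digits = ""
        while i < len(text) and len(end_digits) < 6 and text[i] in HEX_DIGITS:
            end_digits += text[i]
            i += 1
        return start, int(end_digits, 16), i

    # No end value: the following character is left for the data state.
    return start, None, i
```

For example, `parse_unicode_range("4??")` yields a start of 0x400 and an end of 0x4FF, while `parse_unicode_range("26 ")` yields a start of 0x26 with no end defined.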
@@ -1425,3 +1440,41 @@ <h4>
14251440 < dd >
14261441 Return the < i > current input character</ i > .
14271442 </ dl >
1443+
1444+ < h4 >
1445+ < dfn > Set the unicode-range token's range</ dfn > </ h4 >
1446+
1447+ < p >
1448+ This section describes how to set a unicode-range token's range
1449+ so that the range it describes
1450+ is within the supported range of Unicode characters.
1451+
1452+ < p >
1453+ It assumes that the < dfn > start of the range</ dfn > has been defined,
1454+ the < dfn > end of the range</ dfn > might be defined,
1455+ and both are non-negative integers.
1456+
1457+ < p >
1458+ If the < i > start of the range</ i > is greater than
1459+ the maximum allowed codepoint in Unicode (currently U+10FFFF),
1460+ the unicode-range token's range is empty.
1461+
1462+ < p >
1463+ Otherwise, if the < i > end of the range</ i > is defined,
1464+ and it is less than the < i > start of the range</ i > ,
1465+ the unicode-range token's range is empty.
1466+
1467+ < p >
1468+ Otherwise, if the < i > end of the range</ i > is not defined,
1469+ the unicode-range token's range
1470+ is the single character whose codepoint is the < i > start of the range</ i > .
1471+
1472+ < p >
1473+ Otherwise,
1474+ if the < i > end of the range</ i > is greater than
1475+ the current maximum allowed codepoint in Unicode,
1476+ change it to the current maximum allowed codepoint.
1477+ The unicode-range token's range
1478+ is all characters between
1479+ the character whose codepoint is the < i > start of the range</ i >
1480+ and the character whose codepoint is the < i > end of the range</ i > .
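
The range-setting rules above can be sketched in Python as follows. This is an illustrative reading rather than the spec's own algorithm: the names `set_unicode_range` and `MAX_CODEPOINT` are hypothetical, an undefined end of the range is modelled as `None`, the empty range is also modelled as `None`, the rules are assumed to apply in order with the first match winning, and "between" is assumed to be inclusive at both ends.

```python
MAX_CODEPOINT = 0x10FFFF  # current maximum allowed codepoint in Unicode


def set_unicode_range(start, end=None):
    """Return (first, last) as an inclusive codepoint pair, or None if empty.

    `start` and `end` are non-negative integers; `end` may be None
    when the end of the range is not defined.
    """
    # Start beyond the supported codepoints: empty range.
    if start > MAX_CODEPOINT:
        return None
    # End defined but before start: empty range.
    if end is not None and end < start:
        return None
    # No end defined: a single character.
    if end is None:
        return (start, start)
    # Otherwise clamp the end to the maximum allowed codepoint.
    return (start, min(end, MAX_CODEPOINT))
```

Under these assumptions, `set_unicode_range(0x100000, 0xFFFFFF)` is clamped to `(0x100000, 0x10FFFF)`, and `set_unicode_range(0x50, 0x40)` gives the empty range.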