Letters and Numbers Orientation By Codepoint

This page is intended to help analyze Unicode with respect to text orientation. It is not yet comprehensive.

Category Codes:

| Code | UTR50 | MSFT | Meaning |
|------|-------|------|---------|
| U | U | S | Upright; translates between horizontal and vertical |
| R | S | R | Sideways; rotates between horizontal and vertical |
| TU | T | ST | Typeset upright with alternate glyph. Best fallback is just upright. |
| TR | SB | RT | Typeset upright with alternate glyph. Best fallback is just sideways. |

Two modes are presented: Stacking (text-orientation: upright), given in the Stack column of the tables, and Default (TBD), given in the Mixed (or Mix) column.

Letters (L*) and Script-Specific Numbers (N*)

Letters and script-specific (non-Common) numbers are classified by their Script property, taking the Script Extensions property into account. Common numbers are listed separately.

| Code | Name | Stack | Mixed | Memo | UTR | MS |
|------|------|-------|-------|------|-----|----|
| Bopo | Bopomofo | U | U | | | |
| Brai | Braille | U | R | :?: Checking with DAISY, but no reply yet. Most resources say Braille cannot flow vertically. [[http://www.design-thinking.jp/2011/04/blog-post.html This page]] indicates [[http://en.wikipedia.org/wiki/Sanada_Yukimura Yukimura Sanada]] developed vertical Braille as R in the 16th century, but this is probably different from today's Braille. [[http://www6.ocn.ne.jp/~takut/tenjiehon.html This book]] has vertical modern Braille, but the picture does not show clearly whether it is U or R. A definite scan of Mongolian Braille, however, shows that it is R. | | |
| Egyp | Egyptian Hieroglyphs | U | U | Egyptian hieroglyphs are upright when written in columns | | |
| Hira | Hiragana | U | U | | | |
| Kana | Katakana | U | U | Unclear whether halfwidth katakana should be upright or sideways; voice marks are broken if set upright. | | |
| Hani | Han | U | U | | | |
| Hang | Hangul | U | U | | | |
| Lisu | Lisu | U | R | Lisu-script characters are used intermixed with Latin, so their orientations must match | UR | UU |
| Merc | Meroitic Cursive | U | U | Egyptian hieroglyphs are upright when written in columns | UR | UR |
| Mero | Meroitic Hieroglyphs | U | U | Egyptian hieroglyphs are upright when written in columns | UR | UU |
| Mong | Mongolian | V | V | The Unicode code charts show Mongolian with vertical glyphs, while most fonts today carry glyphs rotated 90 degrees counter-clockwise, so the script is U from the Unicode point of view but R from the UA point of view. Call it V. | | |
| Ogam | Ogham | R | R | | | |
| Orkh | Old Turkic | R | R | Old Turkic has a strong tradition of vertical writing. Unclear whether it rotates clockwise or counter-clockwise, but it definitely rotates. | | |
| Phag | Phags Pa | V | V | Same as Mongolian. | | |
| Yiii | Yi | U | U | Old documents show Yi rotated sideways (as a vertical script), but one example of modern Yi (typeset horizontally) uses upright-stacked captions. | | |
| Arab | Arabic | U | R | :?: Still debating how to handle cursive RTL in stacked mode | UR | RR |
| Mand | Mandaic | U | R | :?: Still debating how to handle cursive RTL in stacked mode | UR | RR |
| Miao | Miao | U | R | :?: Needs some research to determine whether U/R or U/U | UR | UU |
| Syrc | Syriac | U | R | :?: Still debating how to handle cursive RTL in stacked mode | UR | RR |
| -- | Canadian_Aboriginal | U | R | :?: UTR#50 has U/U, unclear why | UU | UU |
| | Oriya, Telugu, Kannada, Malayalam, Sinhala, Myanmar, Khmer, Tai_Tham, Javanese, Cham | U | R | :?: Unclear why MSFT chose R/R, seems wrong | UR | RR |
| | Linear_B, Ugaritic, Old_Persian, Avestan | U | R | :?: Unclear why MSFT chose U/U. Cuneiform in particular derives (via rotation) from vertical writing, so U/U seems an illogical choice | UR | UU |
| -- | All others | U | R | Unless the script has a vertical tradition, it is sideways in mixed mode and upright in stacked | | |
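
As a rough illustration of how the table above might be consulted, the following Python sketch maps a script code to its orientation in either mode. The names `SCRIPT_ORIENTATION`, `DEFAULT_ORIENTATION`, and `orientation_for_script` are hypothetical, and the sketch assumes the caller already knows the character's script code (Python's standard library does not expose the Script property). Rows whose values equal the "All others" default (U in stacked, R in mixed) are omitted and fall through to that default.

```python
# Hypothetical lookup transcribed from the Stack and Mixed columns above.
# Only rows that differ from the "All others" default are listed.
SCRIPT_ORIENTATION = {
    "Bopo": ("U", "U"),
    "Egyp": ("U", "U"),
    "Hira": ("U", "U"),
    "Kana": ("U", "U"),
    "Hani": ("U", "U"),
    "Hang": ("U", "U"),
    "Merc": ("U", "U"),
    "Mero": ("U", "U"),
    "Mong": ("V", "V"),
    "Ogam": ("R", "R"),
    "Orkh": ("R", "R"),
    "Phag": ("V", "V"),
    "Yiii": ("U", "U"),
}

# The "All others" row: upright when stacked, sideways when mixed.
DEFAULT_ORIENTATION = ("U", "R")


def orientation_for_script(script_code: str, mode: str = "mixed") -> str:
    """Return the orientation code for a script in 'stack' or 'mixed' mode."""
    stack, mixed = SCRIPT_ORIENTATION.get(script_code, DEFAULT_ORIENTATION)
    if mode == "stack":
        return stack
    if mode == "mixed":
        return mixed
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `orientation_for_script("Latn")` falls back to the default and yields `R`, while `orientation_for_script("Latn", mode="stack")` yields `U`.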

There are some exceptions:

| Code | Description | Char | Stack | Mix | Memo |
|------|-------------|------|-------|-----|------|
| [[http://www.fileformat.info/info/unicode/char/30FC/index.htm U+30FC]] | KATAKANA-HIRAGANA PROLONGED SOUND MARK | | TR | TR | |
| [[http://www.fileformat.info/info/unicode/char/FF70/index.htm U+FF70]] | HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK | | TR | R | :?: Halfwidth? |
| U+FF61-FFDF, U+FFE8-FFEF | All halfwidth letters | | U | R | |
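
One way to read these exceptions is as codepoint-level overrides that are checked before any script-level classification. The sketch below is a minimal illustration under that assumption; `CODEPOINT_EXCEPTIONS`, `HALFWIDTH_RANGES`, and `exception_for` are hypothetical names. Note that U+FF70 lies inside the first halfwidth range, so the single-codepoint entries are checked first.

```python
# Hypothetical override tables transcribed from the exceptions listed above.
# Each value is a (stack, mixed) pair.
CODEPOINT_EXCEPTIONS = {
    0x30FC: ("TR", "TR"),  # KATAKANA-HIRAGANA PROLONGED SOUND MARK
    0xFF70: ("TR", "R"),   # HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
}

# "All halfwidth letters": U+FF61-FFDF and U+FFE8-FFEF (inclusive ranges).
HALFWIDTH_RANGES = [(0xFF61, 0xFFDF), (0xFFE8, 0xFFEF)]
HALFWIDTH_ORIENTATION = ("U", "R")


def exception_for(ch: str):
    """Return the (stack, mixed) override for a character, or None if none applies."""
    cp = ord(ch)
    # Single-codepoint entries win over the broader halfwidth ranges.
    if cp in CODEPOINT_EXCEPTIONS:
        return CODEPOINT_EXCEPTIONS[cp]
    for lo, hi in HALFWIDTH_RANGES:
        if lo <= cp <= hi:
            return HALFWIDTH_ORIENTATION
    return None
```

A caller would fall back to the script-based classification whenever `exception_for` returns `None`.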

Some interesting cases:

Letterlike Symbols Block Letters

See also Symbols from this block and Math symbols from this block

| Code | Description | Char | Stack | Mix | Memo |
|------|-------------|------|-------|-----|------|
| U+2102 | DOUBLE-STRUCK CAPITAL C | | U | R | Part of mathematical double-struck set |
| U+2107 | EULER CONSTANT | | U | R | Match PLANCK CONSTANT |
| U+210A | SCRIPT SMALL G | | U | R | Part of mathematical script set |
| U+210B | SCRIPT CAPITAL H | | U | R | Part of mathematical script set |
| U+210C | BLACK-LETTER CAPITAL H | | U | R | Match other math letters |
| U+210D | DOUBLE-STRUCK CAPITAL H | | U | R | Part of mathematical double-struck set |
| U+210E | PLANCK CONSTANT | | U | R | Part of mathematical italic set |
| U+210F | PLANCK CONSTANT OVER TWO PI | | U | R | Match PLANCK CONSTANT |
| U+2110 | SCRIPT CAPITAL I | | U | R | Part of mathematical script set |
| U+2111 | BLACK-LETTER CAPITAL I | | U | R | Match other math letters |
| U+2112 | SCRIPT CAPITAL L | | U | R | Part of mathematical script set |
| U+2113 | SCRIPT SMALL L | | U | U | EA compatibility unit is upright. Not unified with mathematical script l. |
| U+2115 | DOUBLE-STRUCK CAPITAL N | | U | R | Part of mathematical double-struck set |
| U+2119 | DOUBLE-STRUCK CAPITAL P | | U | R | Part of mathematical double-struck set |
| U+211A | DOUBLE-STRUCK CAPITAL Q | | U | R | Part of mathematical double-struck set |
| U+211B | SCRIPT CAPITAL R | | U | R | Part of mathematical script set |
| U+211C | BLACK-LETTER CAPITAL R | | U | R | Match other math letters |
| U+211D | DOUBLE-STRUCK CAPITAL R | | U | R | Part of mathematical double-struck set |
| U+2124 | DOUBLE-STRUCK CAPITAL Z | | U | R | Part of mathematical double-struck set |
| U+2126 | OHM SIGN | Ω | U | U | EA compatibility unit is upright. :!: NFC-folds to omega |
| U+2128 | BLACK-LETTER CAPITAL Z | | U | R | Match other math letters |
| U+212A | KELVIN SIGN | K | U | U | EA compatibility unit is upright. :!: NFC-folds to K |
| U+212B | ANGSTROM SIGN | Å | U | U | EA compatibility unit is upright. :!: NFC-folds to Aring |
| U+212C | SCRIPT CAPITAL B | | U | R | Part of mathematical script set |
| U+212D | BLACK-LETTER CAPITAL C | | U | R | Match other math letters |
| U+212F | SCRIPT SMALL E | | U | R | Part of mathematical script set |
| U+2130 | SCRIPT CAPITAL E | | U | R | Part of mathematical script set |
| U+2131 | SCRIPT CAPITAL F | | U | R | Part of mathematical script set |
| U+2132 | TURNED CAPITAL F | | U | R | Claudian must match Latin |
| U+2133 | SCRIPT CAPITAL M | | U | R | Part of mathematical script set |
| U+2134 | SCRIPT SMALL O | | U | R | Part of mathematical script set |
| U+2139 | INFORMATION SOURCE | | U | U | Symbolic, not math |
| U+213C | DOUBLE-STRUCK SMALL PI | | U | R | Match double-struck Latin |
| U+213D | DOUBLE-STRUCK SMALL GAMMA | | U | R | Match double-struck Latin |
| U+213E | DOUBLE-STRUCK CAPITAL GAMMA | | U | R | Match double-struck Latin |
| U+213F | DOUBLE-STRUCK CAPITAL PI | | U | R | Match double-struck Latin |
| U+2135 | ALEF SYMBOL | | U | R | Math symbol |
| U+2136 | BET SYMBOL | | U | R | Math symbol |
| U+2137 | GIMEL SYMBOL | | U | R | Math symbol |
| U+2138 | DALET SYMBOL | | U | R | Math symbol |
| U+2145 | DOUBLE-STRUCK ITALIC CAPITAL D | | U | R | Math symbol, match double-struck Latin |
| U+2146 | DOUBLE-STRUCK ITALIC SMALL D | | U | R | Math symbol, match double-struck Latin |
| U+2147 | DOUBLE-STRUCK ITALIC SMALL E | | U | R | Math symbol, match double-struck Latin |
| U+2148 | DOUBLE-STRUCK ITALIC SMALL I | | U | R | Math symbol, match double-struck Latin |
| U+2149 | DOUBLE-STRUCK ITALIC SMALL J | | U | R | Math symbol, match double-struck Latin |
| U+214E | TURNED SMALL F | | U | R | Claudian must match Latin |
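
The ":!: NFC-folds" notes on OHM SIGN, KELVIN SIGN, and ANGSTROM SIGN can be checked with Python's standard `unicodedata` module. The sketch below is only an illustration; the helper name `nfc_fold` is hypothetical. It shows why these three entries are fragile: after NFC normalization the original codepoint is gone, so an upright classification attached to it no longer applies.

```python
import unicodedata


def nfc_fold(ch: str) -> str:
    """Return the NFC-normalized form of a single character."""
    return unicodedata.normalize("NFC", ch)


# The three letterlike symbols flagged with ":!:" in the list above.
for cp in (0x2126, 0x212A, 0x212B):
    original = chr(cp)
    folded = nfc_fold(original)
    print(
        f"U+{cp:04X} {unicodedata.name(original)} -> "
        f"U+{ord(folded):04X} {unicodedata.name(folded)}"
    )
```

By contrast, U+2113 SCRIPT SMALL L has no canonical decomposition, so `nfc_fold("\u2113")` returns the character unchanged.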