[css-ruby] Incorrect language tags in the source file? #3292

Closed
xfq opened this issue Nov 7, 2018 · 7 comments · Fixed by #4942
Labels: Closed Accepted as Editorial; css-ruby-1 (Current Work); i18n-tracker (Group bringing to attention of Internationalization, or tracked by i18n but not needing response); Testing Unnecessary (Memory aid - issue doesn't require tests)

Comments

@xfq
Member

xfq commented Nov 7, 2018

In the glossary of css-ruby, the terms are marked as lang="zh/ko/ja", although they're written in English.

Source file:

<h2 id="glossary">
Glossary</h2>
<dl>
<dt id="g-bopomofo" lang="zh">Bopomofo
<dd>37 characters and 4 tone markings used as phonetics in Chinese,
especially standard Mandarin.
<p class="note">
Note that the user agent is responsible for ensuring the correct relative alignment and positioning of the glyphs,
including bopomofo tone marks, when displaying text,
whether it occurs in ruby annotations or as normal inline text.
Bopomofo Tone marks are spacing characters that occur (in memory) at the end of the ruby text for each base character.
They are usually displayed in a separate column to the right of or above the bopomofo characters,
and the position of the tone mark depends on the number of characters in the syllable.
One tone mark, however, is placed before the bopomofo, not over it.
<!-- See Taiwanese requirements doc for EPUB at http://epub-revision.googlecode.com/files/EGLS_TW_eng.ppt -->
<dt id="g-hanja" lang="ko">Hanja
<dd>Subset of the Korean writing system that utilizes ideographic
characters borrowed or adapted from the Chinese writing system. Also see
<a href="#g-kanji"><span lang="ja">Kanji</span></a>.</dd>
<dt id="g-hiragana" lang="ja">Hiragana
<dd>Japanese syllabic script, or character of that script. Rounded and
cursive in appearance. Subset of the Japanese writing system, used together
with kanji and katakana. In recent times, mostly used to write Japanese
words when kanji are not available or appropriate, and word endings and
particles. Also see <a
href="#g-katakana"><span lang="ja">Katakana</span></a>.</dd>
<dt id="g-ideogram">Ideograph
<dd>A character that is used to represent an idea, word, or word component,
in contrast to a character from an alphabetic or syllabic script. The most
well-known ideographic script is used (with some variation) in East Asia
(China, Japan, Korea,...).</dd>
<dt id="g-kana" lang="ja">Kana
<dd>Collective term for hiragana and katakana.</dd>
<dt id="g-kanji">Kanji
<dd>Japanese term for ideographs; ideographs used in Japanese. Subset of the
Japanese writing system, used together with hiragana and katakana. Also see <a
href="#g-hanja"><span lang="ko">Hanja</span></a>.</dd>
<dt id="g-katakana" lang="ja">Katakana
<dd>Japanese syllabic script, or character of that script. Angular in
appearance. Subset of the Japanese writing system,&nbsp; used together with
kanji and hiragana. In recent times, mainly used to write foreign words. Also see <a
href="#g-hiragana"><span lang="ja">Hiragana</span></a>.</dd>
</dl>

@xfq xfq added the css-ruby-1 Current Work label Nov 7, 2018
@upsuper
Member

upsuper commented Nov 10, 2018

Yeah... I see no reason why they need to be marked that way.

@upsuper
Member

upsuper commented Nov 10, 2018

They are not really English, though. Strictly speaking, they are romanizations of words in those languages. Not sure how they should be marked.

@heycam
Contributor

heycam commented Nov 10, 2018

You could use zh-Latn or zh-Latn-pinyin I guess.

@heycam
Contributor

heycam commented Nov 10, 2018

(For the first one, at least. I didn't look up in http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry what script names to use in the other languages.)

@xfq xfq added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Nov 10, 2018
@frivoal
Collaborator

frivoal commented Nov 12, 2018

I believe zh-Latn, ja-Latn and ko-Latn would be correct.
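
For illustration only, here is a minimal sketch of what this suggestion might look like when applied to three of the glossary entries quoted above. The `id` values and definitions come from the quoted source; the `lang` values are the ones proposed in this comment, and nothing here is taken from an actual commit.

```html
<!-- Hypothetical markup: glossary terms tagged as romanized (Latin-script)
     text in their language of origin, as proposed in the comment above. -->
<dl>
  <dt id="g-bopomofo" lang="zh-Latn">Bopomofo</dt>
  <dd>37 characters and 4 tone markings used as phonetics in Chinese,
  especially standard Mandarin.</dd>
  <dt id="g-hanja" lang="ko-Latn">Hanja</dt>
  <dd>Subset of the Korean writing system that utilizes ideographic
  characters borrowed or adapted from the Chinese writing system.</dd>
  <dt id="g-kana" lang="ja-Latn">Kana</dt>
  <dd>Collective term for hiragana and katakana.</dd>
</dl>
```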

@xfq xfq changed the title from "[css-ruby-1] Incorrect language tags in the source file?" to "[css-ruby] Incorrect language tags in the source file?" Dec 1, 2018
@r12a
Contributor

r12a commented Aug 29, 2019

I think this is one of the scenarios where language tagging is not clear-cut, and very much falls back on the question: "What effect are you hoping to achieve by tagging for language?" (Or, actually, here perhaps what effect do you not want to produce?) Here's my 2p:

If you label these terms as described above you'll find that the browser applies different CJK fonts to the text unless overridden by styling (not the case here), which to me looks quite odd because they aren't usually written that way in the native language either. For myself, i see these as technical terms used in English that happen to be derived from CJK terms. I wouldn't put any language tag on them, because:
a. i don't know what that would achieve,
b. actually it can be detrimental in terms of styling

@frivoal
Collaborator

frivoal commented Apr 10, 2020

Arguably, UAs ought to respond to the full language tag, including the writing system, and not switch to Chinese when they see lang="zh-Latn". However, even if that were true, I now think it would still be wrong to tag them that way: if we think of a screen reader instead, we wouldn't expect it to switch to a Chinese voice when reading "bopomofo". The document is in English and to be read / listened to by English speakers, and there's no benefit in switching to Chinese phonetics.

So I agree with @r12a: these are better seen as technical terms used in English that happen to be derived from CJK terms, best left without a language tag.
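
As a rough sketch of that conclusion, the same kind of glossary entries could be written with no `lang` attribute on the terms, so they inherit the language of the surrounding document. This assumes an ancestor element (for example the root) declares `lang="en"`; the markup below is illustrative and is not copied from the commits referenced in this issue.

```html
<!-- Hypothetical markup: terms left untagged, so they inherit the
     document language (assumed here to be lang="en" on an ancestor). -->
<dl>
  <dt id="g-bopomofo">Bopomofo</dt>
  <dd>37 characters and 4 tone markings used as phonetics in Chinese,
  especially standard Mandarin.</dd>
  <dt id="g-kana">Kana</dt>
  <dd>Collective term for hiragana and katakana.</dd>
</dl>
```

With no language-specific tag on the terms, a browser has no reason to pick a CJK font for them and a screen reader has no reason to switch voices, which are the two effects discussed in the comments above.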

frivoal added a commit to frivoal/csswg-drafts that referenced this issue Apr 10, 2020
@frivoal frivoal self-assigned this Apr 10, 2020
frivoal added a commit to frivoal/csswg-drafts that referenced this issue Apr 13, 2020
@frivoal frivoal added Closed Accepted as Editorial Testing Unnecessary Memory aid - issue doesn't require tests labels Apr 20, 2020
JTensai pushed a commit to JTensai/csswg-drafts that referenced this issue May 13, 2020
5 participants