[css-fonts] Handling of Standardized Variation Sequences #1710

Open
@hfhchan

In Step 2a of Section 5.3 Cluster matching of CSS Fonts 3:

If c1 is a variation selector, system fallback must be used to find
a font that supports the full sequence of b + c1.

I've tested Chrome and Firefox, and neither performs any system font fallback when the given font contains a glyph for b but not for b + c1.

I tested with 齋󠄁齋 (the first character has the variation selector U+E0101 appended after it) and a CSS declaration of font-family: SimSun, HanaMinA;. SimSun does not contain a glyph for the variation sequence, but HanaMinA does:
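For anyone reproducing the test, the sample string can be built explicitly so that the invisible selector is not lost in copy and paste. This is a minimal sketch; the names `BASE`, `VS18`, `sample` and `toCodePointLabels` are made up for illustration.

```typescript
// U+9F4B is the base character 齋; U+E0101 is the variation selector VS18.
const BASE = "\u9F4B";
const VS18 = "\u{E0101}";

// First character carries the selector, second one does not.
const sample = BASE + VS18 + BASE;

// List the code points of a string as "U+XXXX" labels.
function toCodePointLabels(text: string): string[] {
  return Array.from(text).map((ch) => {
    const hex = (ch.codePointAt(0) ?? 0).toString(16).toUpperCase();
    return "U+" + (hex.length < 4 ? "0".repeat(4 - hex.length) + hex : hex);
  });
}

console.log(toCodePointLabels(sample).join(" "));
// prints "U+9F4B U+E0101 U+9F4B"
```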

Result in Firefox:
image

Result in Chrome:
image
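The gap between the expected and the observed result can be modelled roughly as follows. This is a simplified sketch, not the spec's algorithm or any browser's code: `FontModel` and both matching functions are hypothetical, the font data is hard-coded to mirror the test above, and the listed fonts stand in for whatever fonts fallback could reach.

```typescript
// Hypothetical font model; a real engine reads this from the font's cmap table.
interface FontModel {
  name: string;
  characters: Set<number>; // supported base code points
  sequences: Set<string>;  // supported "base,selector" pairs
}

interface MatchResult {
  font: string;
  usesSelector: boolean;
}

const sequenceKey = (base: number, selector: number): string => `${base},${selector}`;

// Expected: prefer any font that supports the full sequence b + c1,
// and only then fall back to matching b alone.
function matchPreferringSequence(fonts: FontModel[], base: number, selector: number): MatchResult | undefined {
  for (const f of fonts) {
    if (f.sequences.has(sequenceKey(base, selector))) return { font: f.name, usesSelector: true };
  }
  for (const f of fonts) {
    if (f.characters.has(base)) return { font: f.name, usesSelector: false };
  }
  return undefined;
}

// Observed: the first font with a glyph for b wins, and the selector is
// silently dropped when that font lacks the sequence.
function matchFirstBase(fonts: FontModel[], base: number, selector: number): MatchResult | undefined {
  for (const f of fonts) {
    if (f.characters.has(base)) {
      return { font: f.name, usesSelector: f.sequences.has(sequenceKey(base, selector)) };
    }
  }
  return undefined;
}

// Assumed coverage, mirroring the test: both fonts have 齋, only HanaMinA has the sequence.
const simSun: FontModel = { name: "SimSun", characters: new Set([0x9f4b]), sequences: new Set<string>() };
const hanaMinA: FontModel = {
  name: "HanaMinA",
  characters: new Set([0x9f4b]),
  sequences: new Set([sequenceKey(0x9f4b, 0xe0101)]),
};

console.log(matchPreferringSequence([simSun, hanaMinA], 0x9f4b, 0xe0101));
console.log(matchFirstBase([simSun, hanaMinA], 0x9f4b, 0xe0101));
```

Under these assumptions the first call selects HanaMinA with the selector honored, while the second selects SimSun and drops the selector, which matches what both browsers displayed.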

The spec should either be amended to reflect the behavior implemented by browsers, or the browsers' behavior should be changed. Unicode Variation Selectors involve a lookup in the cmap format 14 subtable, and it would be understandable if reordering complex table lookups in the font-selection phase were prohibitively expensive.
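To make the cost concrete, here is a rough in-memory model of what a variation sequence lookup in a cmap format 14 subtable involves. It is only a sketch: the type and function names are hypothetical, the glyph ID is invented, and a real implementation would parse and binary-search the font's binary tables instead.

```typescript
// One record per variation selector, as in a format 14 subtable.
interface UvsRange { start: number; additionalCount: number } // covers start..start+additionalCount

interface VariationSelectorRecord {
  selector: number;
  defaultRanges: UvsRange[];        // bases whose default glyph already is the variant
  nonDefault: Map<number, number>;  // base code point -> glyph ID of the variant
}

type UvsLookup =
  | { kind: "default" }                  // use the glyph from the regular cmap lookup
  | { kind: "glyph"; glyphId: number }   // use this specific variant glyph
  | { kind: "unsupported" };             // the font does not support the sequence

function lookupVariationSequence(
  records: VariationSelectorRecord[],
  base: number,
  selector: number,
): UvsLookup {
  const record = records.find((r) => r.selector === selector);
  if (!record) return { kind: "unsupported" };
  for (const range of record.defaultRanges) {
    if (base >= range.start && base <= range.start + range.additionalCount) {
      return { kind: "default" };
    }
  }
  const glyphId = record.nonDefault.get(base);
  return glyphId === undefined ? { kind: "unsupported" } : { kind: "glyph", glyphId };
}

// Hypothetical data: 齋 + VS17 maps to the default glyph, 齋 + VS18 to glyph 1234.
const records: VariationSelectorRecord[] = [
  { selector: 0xe0100, defaultRanges: [{ start: 0x9f4b, additionalCount: 0 }], nonDefault: new Map() },
  { selector: 0xe0101, defaultRanges: [], nonDefault: new Map([[0x9f4b, 1234]]) },
];

console.log(lookupVariationSequence(records, 0x9f4b, 0xe0101));
```

The "unsupported" result is the case this issue is about: the font has a glyph for b but nothing for b + c1.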

Due to the nature of the Han script, it is often hard to objectively determine what is the same character and what is not. Different people have different expectations. CJK Unification in ISO/IEC 10646 was a very controversial decision and remains controversial today. Reliably rendering Unicode variation sequences is necessary and may have legal ramifications.

The fallback-to-b behavior is problematic because it may not be what the author intended, and the user has no way of knowing that a substitution took place. More often, the preferred behavior is to display a "tofu" (.notdef) instead.

In addition, China and TCA are likely to use Unicode Variation Selectors to encode historic variants of CJK Unified Ideographs (assuming the decision by the IRG is approved by WG2 in the coming meeting in September). Variant characters with visually significant differences will be approved for unification with their more common character, provided that the variant is similar in structure, rare in modern use, and attested to be exactly equivalent to the base character in semantics. In these cases, getting .notdef is usually preferred over getting the base character's glyph if a given font doesn't have that specific glyph variant.

At the same time, it may be useful in historical text digitization projects to dynamically switch between showing characters in the glyphs that appear in the books and the glyphs that are used in the modern day. This could be accomplished by stripping all the variation selectors out via regex and innerHTML or, preferably, by a CSS property or feature / flag.
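The script-based workaround could look roughly like the sketch below. The name `stripVariationSelectors` is made up for illustration, and the pattern only covers VS1 to VS16 (U+FE00 to U+FE0F) and VS17 to VS256 (U+E0100 to U+E01EF).

```typescript
// Matches VS1-VS16 and VS17-VS256. The "u" flag is required so that the
// supplementary-plane range is treated as code points, not surrogate halves.
const VARIATION_SELECTORS = /[\uFE00-\uFE0F\u{E0100}-\u{E01EF}]/gu;

// Return the text with every variation selector removed, leaving base characters intact.
function stripVariationSelectors(text: string): string {
  return text.replace(VARIATION_SELECTORS, "");
}

const withSelector = "\u9F4B\u{E0101}\u9F4B"; // 齋 + VS18, then plain 齋
console.log(stripVariationSelectors(withSelector) === "\u9F4B\u9F4B");
// prints true
```

In a page, this would be applied to text content, for example `element.textContent = stripVariationSelectors(element.textContent ?? "")`. It is destructive unless the original text is kept somewhere, which is one reason a declarative CSS switch would be preferable.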

To cater for such behavior, I suggest that a new CSS property and/or OpenType feature / flag is introduced.

These behaviors could be implemented via a new CSS property such as font-variation-sequences with the following values:

  • auto (fall back to b for VS16 and below if b + c1 is missing; use .notdef for VS17 and above if b + c1 is missing),
  • ignore-missing (fall back to b if b + c1 is missing),
  • tofu-missing (fall back to .notdef if b + c1 is missing),
  • ignore-all (ignore all Unicode Variation Selectors).
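The intended semantics of the four values can be written down as a small decision function. This is an illustrative sketch of the proposal only: font-variation-sequences does not exist in any spec or browser, and the type and function names below are hypothetical.

```typescript
type VariationSequenceMode = "auto" | "ignore-missing" | "tofu-missing" | "ignore-all";
type Rendering = "sequence-glyph" | "base-glyph" | "notdef";

// VS1-VS16 occupy U+FE00-U+FE0F; VS17-VS256 occupy U+E0100-U+E01EF.
function isVs16OrBelow(selector: number): boolean {
  return selector >= 0xfe00 && selector <= 0xfe0f;
}

// Decide what to render for b + c1, given whether the font has a glyph for the sequence.
function resolveRendering(
  mode: VariationSequenceMode,
  selector: number,
  fontHasSequence: boolean,
): Rendering {
  if (mode === "ignore-all") return "base-glyph";       // selectors are never honored
  if (fontHasSequence) return "sequence-glyph";         // nothing is missing
  if (mode === "ignore-missing") return "base-glyph";
  if (mode === "tofu-missing") return "notdef";
  // auto: lenient for VS16 and below, strict for VS17 and above
  return isVs16OrBelow(selector) ? "base-glyph" : "notdef";
}

console.log(resolveRendering("auto", 0xe0101, false)); // prints "notdef"
console.log(resolveRendering("auto", 0xfe0f, false));  // prints "base-glyph"
```

Splitting auto at VS16 follows the proposal as written: a missing ideographic variation sequence (VS17 and above) would surface as tofu, while a missing sequence with VS16 or below would degrade quietly to the base glyph.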

Alternatively, this could piggyback on a new OpenType flag, perhaps named "tofu", so that the different behaviors could be activated directly via font-variation-settings.

    Labels

    css-fonts-4 (Current Work), i18n-tracker (Group bringing to attention of Internationalization, or tracked by i18n but not needing response.)