
Commit 740c287

attempt to clarify voice selection
1 parent 24a4ef8 commit 740c287

2 files changed

Lines changed: 60 additions & 40 deletions


css3-speech/Overview.html

Lines changed: 35 additions & 23 deletions
@@ -1753,11 +1753,13 @@ <h3 id=voice-props-voice-family><span class=secno>10.1. </span>The
 
 <p>The &lsquo;<a href="#voice-family"><code
 class=property>voice-family</code></a>&rsquo; property specifies a
-comma-separated, prioritized list of values that designate speech
-synthesis voices (analogous to &lsquo;<code class=css><a
-href="#font-family-def"><code
+comma-separated, prioritized list of component values that are separated
+by a comma to indicate that they are alternatives (this is analogous to
+&lsquo;<code class=css><a href="#font-family-def"><code
 class=property>font-family</code></a></code>&rsquo; in visual style
-sheets), where:
+sheets). Each component value potentially designates a speech synthesis
+voice instance, by specifying match criteria (see the <a
+href="#voice-selection">voice selection</a> section on this topic).
 
 <p> <strong>&lt;generic-voice&gt;</strong> = [&lt;age&gt;? &lt;gender&gt;
 &lt;integer&gt;?]
@@ -1867,27 +1869,37 @@ <h4 class=no-toc id=voice-selection><span class=secno>10.1.1. </span>Voice
 
 <p>The &lsquo;<a href="#voice-family"><code
 class=property>voice-family</code></a>&rsquo; property is used to guide
-the selection of the speech synthesis voice. As part of this selection
-process, speech-capable user agents must also take into account the
-language of the selected element within the markup content. The "name",
-"gender", "age", and preferred "index" are voice selection hints that get
-carried down the content hierarchy as the &lsquo;<a
-href="#voice-family"><code class=property>voice-family</code></a>&rsquo;
-property value gets inherited by descendant elements. At any point within
-the content structure, the language takes precedence (i.e. has a higher
-priority) over the specified CSS voice characteristics. The following list
-outlines the selection algorithm (note that the definition of "language"
-is loose here, in order to cater for dialectic variants):
+the selection of the speech synthesis voice instance. As part of this
+selection process, speech-capable user agents must also take into account
+the language of the selected element within the markup content. The
+"name", "gender", "age", and preferred "variant" (index) are voice
+selection hints that get carried down the content hierarchy as the
+&lsquo;<a href="#voice-family"><code
+class=property>voice-family</code></a>&rsquo; property value gets
+inherited by descendant elements. At any point within the content
+structure, the language takes precedence (i.e. has a higher priority) over
+the specified CSS voice characteristics.
+
+<p> The following list outlines the voice selection algorithm (note that
+the definition of "language" is loose here, in order to cater for
+dialectic variants):
 
 <ol>
-<li> If only a single voice is available for the language of the selected
-content, then this voice must be used, regardless of the specified CSS
-voice characteristics.
-
-<li> If several voices are available for the language of the selected
-content, then the chosen voice is the one that most closely matches the
-specified gender, age, and preferred voice variant. The actual definition
-of "best match" is processor-dependent.
+<li> If only a single voice instance is available for the language of the
+selected content, then this voice must be used, regardless of the
+specified CSS voice characteristics.
+
+<li> If several voice instances are available for the language of the
+selected content, then the chosen voice is the one that most closely
+matches the specified name, or gender, age, and preferred voice variant.
+The actual definition of "best match" is processor-dependent (e.g. a
+reasonable match for "voice-family: 10 male;" may well be a higher
+pitched female voice suitable for a young boy's vocal rendition). If no
+voice instance matches the characteristics provided by any of the
+&lsquo;<a href="#voice-family"><code
+class=property>voice-family</code></a>&rsquo; component values, the first
+available voice instance (amongst those suitable for the language of the
+selected content) must be used.
 
 <li> If no voice is available for the language of the selected content, it
 is recommended that user-agents let the user know about the lack of

css3-speech/Overview.src.html

Lines changed: 25 additions & 17 deletions
@@ -1383,9 +1383,12 @@ <h3 id="voice-props-voice-family">The 'voice-family' property</h3>
 </tr>
 </tbody>
 </table>
-<p>The 'voice-family' property specifies a comma-separated, prioritized list of values that
-designate speech synthesis voices (analogous to '<a href="#font-family-def"><code
-class="property">font-family</code></a>' in visual style sheets), where: </p>
+<p>The 'voice-family' property specifies a comma-separated, prioritized list of component values
+that are separated by a comma to indicate that they are alternatives (this is analogous to '<a
+href="#font-family-def"><code class="property">font-family</code></a>' in visual style
+sheets). Each component value potentially designates a speech synthesis voice instance, by
+specifying match criteria (see the <a href="#voice-selection">voice selection</a> section on
+this topic). </p>
 <p>
 <strong>&lt;generic-voice&gt;</strong> = [&lt;age&gt;? &lt;gender&gt; &lt;integer&gt;?] </p>
 <p class="note"> Note that the functionality provided by this property is related to the <a
@@ -1462,21 +1465,26 @@ <h3 id="voice-props-voice-family">The 'voice-family' property</h3>
 voice-family: "john doe", "Henry the-8th";</pre>
 </div>
 <h4 class="no-toc" id="voice-selection">Voice selection, content language</h4>
-<p>The 'voice-family' property is used to guide the selection of the speech synthesis voice. As
-part of this selection process, speech-capable user agents must also take into account the
-language of the selected element within the markup content. The "name", "gender", "age", and
-preferred "index" are voice selection hints that get carried down the content hierarchy as the
-'voice-family' property value gets inherited by descendant elements. At any point within the
-content structure, the language takes precedence (i.e. has a higher priority) over the
-specified CSS voice characteristics. The following list outlines the selection algorithm (note
-that the definition of "language" is loose here, in order to cater for dialectic
-variants):</p>
+<p>The 'voice-family' property is used to guide the selection of the speech synthesis voice
+instance. As part of this selection process, speech-capable user agents must also take into
+account the language of the selected element within the markup content. The "name", "gender",
+"age", and preferred "variant" (index) are voice selection hints that get carried down the
+content hierarchy as the 'voice-family' property value gets inherited by descendant elements.
+At any point within the content structure, the language takes precedence (i.e. has a higher
+priority) over the specified CSS voice characteristics. </p>
+<p> The following list outlines the voice selection algorithm (note that the definition of
+"language" is loose here, in order to cater for dialectic variants):</p>
 <ol>
-<li> If only a single voice is available for the language of the selected content, then this
-voice must be used, regardless of the specified CSS voice characteristics. </li>
-<li> If several voices are available for the language of the selected content, then the chosen
-voice is the one that most closely matches the specified gender, age, and preferred voice
-variant. The actual definition of "best match" is processor-dependent.</li>
+<li> If only a single voice instance is available for the language of the selected content,
+then this voice must be used, regardless of the specified CSS voice characteristics. </li>
+<li> If several voice instances are available for the language of the selected content, then
+the chosen voice is the one that most closely matches the specified name, or gender, age,
+and preferred voice variant. The actual definition of "best match" is processor-dependent
+(e.g. a reasonable match for "voice-family: 10 male;" may well be a higher pitched female
+voice suitable for a young boy's vocal rendition). If no voice instance matches the
+characteristics provided by any of the 'voice-family' component values, the first available
+voice instance (amongst those suitable for the language of the selected content) must be
+used. </li>
 <li> If no voice is available for the language of the selected content, it is recommended that
 user-agents let the user know about the lack of appropriate TTS voice. </li>
 </ol>
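
Illustration (not part of the commit): the three-step voice selection list in the added text could be sketched roughly as below. This is only one possible reading, since the spec text leaves "best match" processor-dependent. The names `Voice`, `Criteria`, `same_language` and `select_voice` are hypothetical, the "loose" language comparison is assumed to mean comparing primary language subtags, and the scoring (exact gender, nearest age, then variant index) is an assumption. In particular, this sketch would not pick a female voice for "10 male", which the spec text explicitly allows.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Voice:
    """A hypothetical installed TTS voice instance."""
    name: str
    language: str  # e.g. "en-GB"
    gender: str    # "male", "female" or "neutral"
    age: int       # approximate speaker age in years


@dataclass(frozen=True)
class Criteria:
    """One comma-separated 'voice-family' component value (hypothetical model)."""
    name: Optional[str] = None
    gender: Optional[str] = None
    age: Optional[int] = None
    variant: Optional[int] = None  # 1-based preferred variant index


def same_language(voice_lang: str, content_lang: str) -> bool:
    # Assumption: "loose" language matching compares primary subtags only,
    # so "en-GB" and "en-US" are treated as the same language.
    return voice_lang.split("-")[0].lower() == content_lang.split("-")[0].lower()


def select_voice(voices: Sequence[Voice], content_lang: str,
                 family: Sequence[Criteria]) -> Optional[Voice]:
    candidates = [v for v in voices if same_language(v.language, content_lang)]
    if not candidates:
        return None  # step 3: no voice for this language; the UA should tell the user
    if len(candidates) == 1:
        return candidates[0]  # step 1: single voice wins regardless of CSS hints
    # Step 2: try each component value in priority order.
    for c in family:
        if c.name is not None:
            hits = [v for v in candidates if v.name.lower() == c.name.lower()]
        else:
            hits = [v for v in candidates if c.gender in (None, v.gender)]
        if not hits:
            continue
        if c.age is not None:
            # Stable sort: closest age first, original order breaks ties.
            hits = sorted(hits, key=lambda v: abs(v.age - c.age))
        index = c.variant - 1 if c.variant else 0
        return hits[min(index, len(hits) - 1)]
    # No component value matched: first available voice for the language.
    return candidates[0]
```

With several English voices installed, `select_voice(voices, "en-US", [Criteria(age=10, gender="male")])` would return the male voice whose age is closest to 10, while any criteria applied to a language with a single installed voice would return that one voice.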
