fixed voice-volume, various minor fixes. · xfq/csswg-drafts@5496d07 · GitHub

Commit 5496d07

fixed voice-volume, various minor fixes.
1 parent fd7d85d commit 5496d07

2 files changed

Lines changed: 140 additions & 153 deletions


css3-speech/Overview.html

Lines changed: 81 additions & 91 deletions
@@ -90,13 +90,13 @@
9090

9191
<h1 id=top>CSS Speech Module</h1>
9292

93-
<h2 class="no-num no-toc" id=longstatus-date>Editor's Draft 06 July 2011</h2>
93+
<h2 class="no-num no-toc" id=longstatus-date>Editor's Draft 07 July 2011</h2>
9494

9595
<dl>
9696
<dt>This version:
9797

9898
<dd>
99-
<!--<a href="http://www.w3.org/TR/2011/WD-css3-speech-20110706">http://www.w3.org/TR/2011/ED-css3-speech-20110706/</a>-->
99+
<!--<a href="http://www.w3.org/TR/2011/WD-css3-speech-20110707">http://www.w3.org/TR/2011/ED-css3-speech-20110707/</a>-->
100100
<a
101101
href="http://dev.w3.org/csswg/css3-speech">http://dev.w3.org/csswg/css3-speech</a>
102102

@@ -442,6 +442,7 @@ <h2 id=example><span class=secno>3. </span>Example</h2>
442442
voice-family: paul;
443443
voice-stress: moderate;
444444
cue-before: url(../audio/ping.wav);
445+
voice-volume: medium 6dB;
445446
}
446447
p.heidi
447448
{
@@ -516,13 +517,13 @@ <h3 id=mixing-props-voice-volume><span class=secno>5.1. </span>The
516517
<tr>
517518
<td> <em>Value:</em>
518519

519-
<td>normal | silent | x-soft | soft | medium | loud | x-loud |
520-
&lt;decibel&gt;
520+
<td>silent | [[x-soft | soft | medium | loud | x-loud] ||
521+
&lt;decibel&gt;]
521522

522523
<tr>
523524
<td> <em>Initial:</em>
524525

525-
<td>normal
526+
<td>medium
526527

527528
<tr>
528529
<td> <em>Applies&nbsp;to:</em>
@@ -547,7 +548,7 @@ <h3 id=mixing-props-voice-volume><span class=secno>5.1. </span>The
547548
<tr>
548549
<td> <em>Computed value:</em>
549550

550-
<td>specified value
551+
<td>keyword value, and decibel offset (if not zero)
551552
</table>
552553

553554
<p>The &lsquo;<a href="#voice-volume"><code
@@ -563,12 +564,13 @@ <h3 id=mixing-props-voice-volume><span class=secno>5.1. </span>The
563564
attribute of the <code>prosody</code> element</a> from the SSML markup
564565
language <a href="#SSML" rel=biblioentry>[SSML]<!--{{!SSML}}--></a>.
565566

566-
<dl>
567-
<dt> <strong>normal</strong>
568-
569-
<dd>
570-
<p> Corresponds to +0.0dB, which means that there is no modification of
571-
volume level. This value overrides the inherited value.</p>
567+
<dl><!-- dt>
568+
<strong>normal</strong>
569+
</dt>
570+
<dd>
571+
<p> Corresponds to +0.0dB, which means that there is no modification of volume level. This
572+
value overrides the inherited value.</p>
573+
</dd -->
572574

573575
<dt> <strong>silent</strong>
574576

@@ -582,9 +584,9 @@ <h3 id=mixing-props-voice-volume><span class=secno>5.1. </span>The
582584
&lsquo;<code class=property>silent</code>&rsquo;, and an element whose
583585
&lsquo;<a href="#speak"><code class=property>speak</code></a>&rsquo;
584586
property has the value &lsquo;<code class=property>none</code>&rsquo;.
585-
With the former, the selected takes up the same time as if it had been
586-
spoken, including any pause before and after the element, but no sound
587-
is generated (descendants can override the &lsquo;<a
587+
With the former, the selected element takes up the same time as if it
588+
was spoken, including any pause before and after the element, but no
589+
sound is generated (descendants can override the &lsquo;<a
588590
href="#voice-volume"><code class=property>voice-volume</code></a>&rsquo;
589591
value and may therefore generate audio output). With the latter, the
590592
selected element is not rendered in the aural dimension and no time is
@@ -598,8 +600,8 @@ <h3 id=mixing-props-voice-volume><span class=secno>5.1. </span>The
598600
<dd>
599601
<p> This sequence of keywords corresponds to monotonically non-decreasing
600602
volume levels, mapped to implementation-dependent values (i.e. inferred
601-
by the user-agent) that meet user's requirements in terms of perceived
602-
sound loudness . The keyword &lsquo;<code
603+
by the user-agent) that meet the user's requirements in terms of
604+
perceived sound loudness . The keyword &lsquo;<code
603605
class=property>x-soft</code>&rsquo; maps to the user's <em>minimum
604606
audible</em> volume level, &lsquo;<code
605607
class=property>x-loud</code>&rsquo; maps to the user's <em>maximum
@@ -614,10 +616,17 @@ <h3 id=mixing-props-voice-volume><span class=secno>5.1. </span>The
614616
<dd>
615617
<p>A <a href="#number-def">number</a> immediately followed by "dB"
616618
(decibel unit). This represents a change (positive or negative) relative
617-
to the default value for the root element, or to the inherited volume
618-
level otherwise. This is expressed as the ratio of the squares of the
619-
new signal amplitude (a1) and the current amplitude (a0), as per the
620-
following logarithmic equation: volume(dB) = 20 log10 (a1 / a0)</p>
619+
to the given keyword value (see enumeration above), or to the default
620+
value for the root element, or otherwise to the inherited volume level
621+
(which may itself be be a combination of a keyword value and of a
622+
decibel offset). When the inherited volume level is &lsquo;<code
623+
class=property>silent</code>&rsquo;, this &lsquo;<a
624+
href="#voice-volume"><code class=property>voice-volume</code></a>&rsquo;
625+
resolves to &lsquo;<code class=property>silent</code>&rsquo; too,
626+
regardless of the provided &lt;decibel&gt; value. Decibels express the
627+
ratio of the squares of the new signal amplitude (a1) and the current
628+
amplitude (a0), as per the following logarithmic equation: volume(dB) =
629+
20 log10 (a1 / a0)</p>
621630

622631
<p class=note> Note that -6.0dB is approximately half the amplitude of
623632
the audio signal, and +6.0dB is approximately twice the amplitude.</p>
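
As an illustration of the decibel arithmetic described in the hunk above, the following minimal Python sketch inverts volume(dB) = 20 log10 (a1 / a0) to recover the amplitude ratio, and models the rule that an inherited 'silent' value wins over any decibel offset. The function names and the use of `None` to stand for 'silent' are assumptions made for this example; they are not part of the specification or of any user agent. Note that 20 log10 (a1 / a0) equals 10 log10 (a1² / a0²), which is why the text speaks of a ratio of squares.

```python
import math
from typing import Optional


def db_to_amplitude_ratio(db: float) -> float:
    """Invert volume(dB) = 20 * log10(a1 / a0) to obtain a1 / a0."""
    return 10 ** (db / 20)


def amplitude_ratio_to_db(a1: float, a0: float) -> float:
    """Decibel change when moving from amplitude a0 to amplitude a1."""
    return 20 * math.log10(a1 / a0)


def resolve_offset(inherited_is_silent: bool, db_offset: float) -> Optional[float]:
    """Hypothetical helper: amplitude multiplier for a <decibel> offset.

    Returns None (standing for 'silent') when the inherited volume is
    'silent', regardless of the offset, as the hunk above describes.
    """
    if inherited_is_silent:
        return None
    return db_to_amplitude_ratio(db_offset)


if __name__ == "__main__":
    print(round(db_to_amplitude_ratio(6.0), 3))   # roughly twice the amplitude
    print(round(db_to_amplitude_ratio(-6.0), 3))  # roughly half the amplitude
    print(resolve_offset(True, 12.0))             # silent stays silent
```

The sketch only covers the offset arithmetic; how keywords such as 'medium' map to actual levels is implementation-dependent according to the draft.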
@@ -1369,9 +1378,8 @@ <h3 id=rest-props-rest-before-after><span class=secno>8.1. </span>The
13691378
<dt> <strong>none</strong>
13701379

13711380
<dd>
1372-
<p> Equivalent to 0ms (no prosodic break in the speech output). This
1373-
value can be used to inhibit a prosodic break which the processor would
1374-
otherwise produce.</p>
1381+
<p> Equivalent to 0ms (no prosodic break is produced by the speech
1382+
processor).</p>
13751383

13761384
<dt> <strong>x-weak</strong>, <strong>weak</strong>,
13771385
<strong>medium</strong>, <strong>strong</strong>, and
@@ -1579,23 +1587,18 @@ <h3 id=cue-props-cue-before-after><span class=secno>9.1. </span>The
15791587
<dd>
15801588
<p>A <a href="#number-def">number</a> immediately followed by "dB"
15811589
(decibel unit). This represents a change (positive or negative) relative
1582-
to the default sound level of audio clip. This is expressed as the ratio
1590+
to the computed value of the &lsquo;<a href="#voice-volume"><code
1591+
class=property>voice-volume</code></a>&rsquo; property within the <a
1592+
href="#aural-model">aural "box" model</a> of the selected element. When
1593+
the &lsquo;<a href="#voice-volume"><code
1594+
class=property>voice-volume</code></a>&rsquo; property is set to
1595+
&lsquo;<code class=property>silent</code>&rsquo;, the audio cue is also
1596+
set to &lsquo;<code class=property>silent</code>&rsquo; (regardless of
1597+
the value provided for this &lt;decibel&gt;). Decibels express the ratio
15831598
of the squares of the new signal amplitude (a1) and the current
15841599
amplitude (a0), as per the following logarithmic equation: volume(dB) =
15851600
20 log10 (a1 / a0)</p>
15861601

1587-
<p>Audio cues apply to the selected element within the <a
1588-
href="#aural-model">audio "box" model</a>, so when the inherited value
1589-
from the &lsquo;<a href="#voice-volume"><code
1590-
class=property>voice-volume</code></a>&rsquo; property is &lsquo;<code
1591-
class=property>silent</code>&rsquo;, the volume level for the audio cue
1592-
is resolved to -infinity decibels (which effectively silences the audio
1593-
cue), regardless of the value provided for this &lt;decibel&gt;. In
1594-
other words, a selected element can be entirely silenced (i.e. including
1595-
its associated audio cues) by setting the &lsquo;<a
1596-
href="#voice-volume"><code class=property>voice-volume</code></a>&rsquo;
1597-
property to &lsquo;<code class=property>silent</code>&rsquo;.</p>
1598-
15991602
<p class=note> Note that -6.0dB is approximately half the amplitude of
16001603
the audio signal, and +6.0dB is approximately twice the amplitude.</p>
16011604

@@ -1802,6 +1805,12 @@ <h3 id=voice-props-voice-family><span class=secno>10.1. </span>The
18021805
rel=biblioentry>[SSML]<!--{{!SSML}}--></a>, voice names are
18031806
space-separated and cannot contain whitespace characters.</p>
18041807

1808+
<p> It is recommended to quote voice names that contain white space,
1809+
digits, or punctuation characters other than hyphens - even if these
1810+
voice names are valid in unquoted form - in order to improve code
1811+
clarity. For example: <code>voice-family: "john doe", "Henry
1812+
the-8th";</code></p>
1813+
18051814
<dt> <strong>&lt;age&gt;</strong>
18061815

18071816
<dd>
@@ -1855,15 +1864,6 @@ <h3 id=voice-props-voice-family><span class=secno>10.1. </span>The
18551864
voice-family: john 1st; /* identifier cannot start with digit */</pre>
18561865
</div>
18571866

1858-
<div class=example>
1859-
<p> This is an example of valid voice names that contain white space,
1860-
digits, or punctuation characters other than hyphens, but which are
1861-
quoted nonetheless, for reading clarity.</p>
1862-
1863-
<pre>
1864-
voice-family: "john doe", "Henry the-8th";</pre>
1865-
</div>
1866-
18671867
<h4 class=no-toc id=voice-selection><span class=secno>10.1.1. </span>Voice
18681868
selection, content language</h4>
18691869

@@ -2079,10 +2079,12 @@ <h3 id=voice-props-voice-pitch><span class=secno>10.3. </span>The &lsquo;<a
20792079

20802080
<p>The &lsquo;<a href="#voice-pitch"><code
20812081
class=property>voice-pitch</code></a>&rsquo; property specifies the
2082-
average pitch of generated speech output, and depends on the &lsquo;<a
2083-
href="#voice-family"><code class=property>voice-family</code></a>&rsquo;.
2084-
For example, the default average pitch for a common male voice is around
2085-
120Hz, whereas it is around 210Hz for a female voice.
2082+
"baseline" pitch of the generated speech output, which depends on the used
2083+
&lsquo;<a href="#voice-family"><code
2084+
class=property>voice-family</code></a>&rsquo; instance, and varies across
2085+
speech synthesis processors (it approximately corresponds to the average
2086+
pitch of the output). For example, the common pitch for a male voice is
2087+
around 120Hz, whereas it is around 210Hz for a female voice.
20862088

20872089
<p class=note> Note that the functionality provided by this property is
20882090
related to the <a
@@ -2095,24 +2097,18 @@ <h3 id=voice-props-voice-pitch><span class=secno>10.3. </span>The &lsquo;<a
20952097

20962098
<dd>
20972099
<p> A value in <a href="#frequency-def">frequency</a> units (Hertz or
2098-
kiloHertz, e.g. "100Hz", "+2kHz"). Unless the &lsquo;<code
2099-
class=property>relative</code>&rsquo; keyword is used, values are
2100-
restricted to positive numbers (using negative numbers results in the
2101-
property value being ignored). When the &lsquo;<code
2102-
class=property>relative</code>&rsquo; keyword is used, the provided
2103-
value specifies a relative change (decrement or increment) to the
2104-
inherited value. When the &lsquo;<code
2105-
class=property>relative</code>&rsquo; keyword is not used, the provided
2106-
value specifies the average pitch of the speaking voice, expressed as an
2107-
absolute frequency.</p>
2100+
kiloHertz, e.g. "100Hz", "+2kHz"). Values are restricted to positive
2101+
numbers (unless the &lsquo;<code class=property>relative</code>&rsquo;
2102+
keyword is used), and using negative numbers results in the property
2103+
value being ignored.</p>
21082104

21092105
<dt> <strong>relative</strong>
21102106

21112107
<dd>
21122108
<p> This keyword specifies that the provided frequency value is expressed
2113-
relatively to another base value. This disambiguates absolute positive
2114-
&lt;frequency&gt; values from increments (e.g. "+2kHz" can either be an
2115-
increment or an absolute value).</p>
2109+
relatively to the inherited value, with positive or negative numbers.
2110+
For example, "+2kHz relative" is an increment, unlike "+2kHz" which is a
2111+
positive absolute value.</p>
21162112

21172113
<dt> <strong>&lt;semitones&gt;</strong>
21182114

@@ -2132,7 +2128,7 @@ <h3 id=voice-props-voice-pitch><span class=secno>10.3. </span>The &lsquo;<a
21322128
<p> Only non-negative <a href="#percentage-def">percentage</a> values are
21332129
allowed. Computed values are calculated relative to the inherited value.
21342130
For example, 50% means that the inherited value gets multiplied by 0.5,
2135-
which results in half the inherited average pitch of the voice.</p>
2131+
which results in half the inherited pitch of the voice.</p>
21362132

21372133
<dt><strong>x-low</strong>, <strong>low</strong>, <strong>medium</strong>,
21382134
<strong>high</strong>, <strong>x-high</strong>
@@ -2150,8 +2146,10 @@ <h3 id=voice-props-voice-pitch><span class=secno>10.3. </span>The &lsquo;<a
21502146
h1 { voice-pitch: +250Hz; } /* identical to the line above */
21512147
h2 { voice-pitch: +30Hz relative; }
21522148
h2 { voice-pitch: 30Hz relative; } /* identical to the line above */
2153-
h3 { voice-pitch: relative -2st; } /* the swapped keyword placement is a legal syntax */
2154-
h4 { voice-pitch: -2st; } /* Illegal syntax ! ("relative" keyword is missing) */</pre>
2149+
h3 { voice-pitch: relative -20Hz; } /* the swapped keyword placement is a legal syntax */
2150+
h4 { voice-pitch: -20Hz; } /* Illegal syntax ! ("relative" keyword is missing for negative frequency) */
2151+
h4 { voice-pitch: -3.5st; } /* Legal syntax: semitones are always relative, no need for the keyword. */
2152+
</pre>
21552153
</div>
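
The voice-pitch rules changed in this hunk can be sketched in Python as follows. This is a hedged illustration only: the function names are hypothetical, `None` is used here to mean "the declaration is ignored", and semitone values are left out because this diff does not show how they are computed.

```python
from typing import Optional


def resolve_frequency(inherited_hz: float, value_hz: float,
                      relative: bool) -> Optional[float]:
    """Resolve a <frequency> value for voice-pitch.

    With the 'relative' keyword the value is an increment or decrement
    applied to the inherited pitch. Without it the value is absolute,
    and a negative number causes the declaration to be ignored
    (modelled here by returning None).
    """
    if relative:
        return inherited_hz + value_hz
    if value_hz < 0:
        return None
    return float(value_hz)


def resolve_percentage(inherited_hz: float, percent: float) -> Optional[float]:
    """Resolve a percentage: only non-negative values are allowed, and the
    result is the inherited value scaled by percent / 100."""
    if percent < 0:
        return None
    return inherited_hz * percent / 100.0


if __name__ == "__main__":
    print(resolve_frequency(120.0, 250.0, relative=False))  # absolute value
    print(resolve_frequency(120.0, -20.0, relative=True))   # decrement
    print(resolve_frequency(120.0, -20.0, relative=False))  # ignored
    print(resolve_percentage(210.0, 50.0))                  # half the inherited pitch
```

The 120Hz and 210Hz inputs reuse the example male and female pitches quoted in the draft text.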
21562154

21572155
<h3 id=voice-props-voice-pitch-range><span class=secno>10.4. </span>The
@@ -2204,11 +2202,12 @@ <h3 id=voice-props-voice-pitch-range><span class=secno>10.4. </span>The
22042202

22052203
<p> The &lsquo;<a href="#voice-pitch-range"><code
22062204
class=property>voice-pitch-range</code></a>&rsquo; property specifies the
2207-
variability in average pitch, i.e. how much the fundamental frequency may
2208-
deviate from the average pitch. The dynamic pitch range of the generated
2209-
speech output typically increases for a highly animated voice, for example
2210-
when variations in inflection are used to convey meaning and emphasis in
2211-
speech.
2205+
variability in the "baseline" pitch, i.e. how much the fundamental
2206+
frequency may deviate from the average pitch of the speech output. The
2207+
dynamic pitch range of the generated speech generally increases for a
2208+
highly animated voice, for example when variations in inflection are used
2209+
to convey meaning and emphasis in speech. Typically, a low range produces
2210+
a flat, monotonic voice, whereas a high range produces an animated voice.
22122211

22132212
<p class=note> Note that the functionality provided by this property is
22142213
related to the <a
@@ -2221,27 +2220,18 @@ <h3 id=voice-props-voice-pitch-range><span class=secno>10.4. </span>The
22212220

22222221
<dd>
22232222
<p> A value in <a href="#frequency-def">frequency</a> units (Hertz or
2224-
kiloHertz, e.g. "100Hz", "+2kHz"). Unless the &lsquo;<code
2225-
class=property>relative</code>&rsquo; keyword is used, values are
2226-
restricted to positive numbers (using negative numbers results in the
2227-
property value being ignored). When the &lsquo;<code
2228-
class=property>relative</code>&rsquo; keyword is used, the provided
2229-
value specifies a relative change (decrement or increment) to the
2230-
inherited value. When the &lsquo;<code
2231-
class=property>relative</code>&rsquo; keyword is not used, the provided
2232-
value specifies the average pitch of the speaking voice, expressed as an
2233-
absolute frequency.</p>
2234-
2235-
<p class=note> Low ranges produce a flat, monotonic voice. A high range
2236-
produces animated voices.</p>
2223+
kiloHertz, e.g. "100Hz", "+2kHz"). Values are restricted to positive
2224+
numbers (unless the &lsquo;<code class=property>relative</code>&rsquo;
2225+
keyword is used), and using negative numbers results in the property
2226+
value being ignored.</p>
22372227

22382228
<dt> <strong>relative</strong>
22392229

22402230
<dd>
22412231
<p> This keyword specifies that the provided frequency value is expressed
2242-
relatively to another base value. This disambiguates absolute positive
2243-
&lt;frequency&gt; values from increments (e.g. "+2kHz" can either be an
2244-
increment or an absolute value).</p>
2232+
relatively to the inherited value, with positive or negative numbers.
2233+
For example, "+2kHz relative" is an increment, unlike "+2kHz" which is a
2234+
positive absolute value.</p>
22452235

22462236
<dt> <strong>&lt;semitones&gt;</strong>
22472237

@@ -2260,7 +2250,7 @@ <h3 id=voice-props-voice-pitch-range><span class=secno>10.4. </span>The
22602250
<p> Only non-negative <a href="#percentage-def">percentage</a> values are
22612251
allowed. Computed values are calculated relative to the inherited value.
22622252
For example, 50% means that the inherited value gets multiplied by 0.5,
2263-
which results in half the inherited average pitch range of the voice.</p>
2253+
which results in half the inherited pitch range of the voice.</p>
22642254

22652255
<dt><strong>x-low</strong>, <strong>low</strong>, <strong>medium</strong>,
22662256
<strong>high</strong> and <strong>x-high</strong>
@@ -2958,10 +2948,10 @@ <h2 class=no-num id=property-index>Appendix A &mdash; Property index</h2>
29582948
<tr>
29592949
<td><a class=property href="#voice-volume">voice-volume</a>
29602950

2961-
<td>normal | silent | x-soft | soft | medium | loud | x-loud |
2962-
&lt;decibel&gt;
2951+
<td>silent | [[x-soft | soft | medium | loud | x-loud] ||
2952+
&lt;decibel&gt;]
29632953

2964-
<td>normal
2954+
<td>medium
29652955

29662956
<td>all elements
29672957
