Commit d487da9

final paste of my notes for voice-balance.
ready for review.
1 parent 9efe1a0 commit d487da9

2 files changed

Lines changed: 127 additions & 61 deletions

css3-speech/Overview.html

Lines changed: 77 additions & 37 deletions
@@ -706,43 +706,6 @@ <h3 id=mixing-props-voice-balance><span class=secno>5.2. </span>The
 match in the SSML markup language <a href="#SSML"
 rel=biblioentry>[SSML]<!--{{!SSML}}--></a>.
 
-<ul>
-<li> When user-agents produce audio via a mono-aural sound system (i.e.
-single-speaker setup), the &lsquo;<a href="#voice-balance"><code
-class=property>voice-balance</code></a>&rsquo; property has no effect.
-
-<li> When user-agents produce audio through a stereo sound system (e.g.
-two speakers, a pair of headphones), the left-right distribution of audio
-signals precisely match the authored values for the &lsquo;<a
-href="#voice-balance"><code
-class=property>voice-balance</code></a>&rsquo; property.
-
-<li> When user-agents are capable of mixing audio signals through more
-than 2 channels (e.g. 5-speakers surround sound system, including a
-dedicated center channel), the physical distribution of audio signals
-resulting from the application of the &lsquo;<a
-href="#voice-balance"><code
-class=property>voice-balance</code></a>&rsquo; property must be performed
-so that the listener perceives sound as if it was coming from a basic
-stereo layout. For example, the center channel as well as the left/right
-speakers may be used altogether in order to emulate the behavior of the
-&lsquo;<code class=property>center</code>&rsquo; value (zero, on the
-[-100,100] scale of the &lsquo;<a href="#voice-balance"><code
-class=property>voice-balance</code></a>&rsquo; property).
-</ul>
-
-<p class=note> Note that sound systems may be configured by users in such a
-way that it would interfere with the left-right audio distribution
-specified by document authors. Typically, the various "surround" modes
-available in modern sound systems (including systems based on basic stereo
-speakers) tend to greatly alter the perceived spatial arrangement of audio
-signals. Some users may even configure their system to "downgrade" any
-rendered sound to a single mono channel, in which case the effect of the
-&lsquo;<a href="#voice-balance"><code
-class=property>voice-balance</code></a>&rsquo; property would obviously
-not be perceivable at all. The rendering fidelity of authored content is
-therefore dependent on such user customizations.
-
 <dl>
 <dt> <strong>&lt;number&gt;</strong>
 
@@ -793,13 +756,90 @@ <h3 id=mixing-props-voice-balance><span class=secno>5.2. </span>The
 the resulting number to &lsquo;<code class=css>100</code>&rsquo;.</p>
 </dl>
 
+<p> User agents may be connected to different kinds of sound systems,
+featuring varying audio mixing capabilities. The expected behavior for
+mono, stereo, and surround sound systems is defined as follows:
+
+<ul>
+<li> When user-agents produce audio via a mono-aural sound system (i.e.
+single-speaker setup), the &lsquo;<a href="#voice-balance"><code
+class=property>voice-balance</code></a>&rsquo; property has no effect.
+
+<li> When user-agents produce audio through a stereo sound system (e.g.
+two speakers, a pair of headphones), the left-right distribution of audio
+signals can precisely match the authored values for the &lsquo;<a
+href="#voice-balance"><code
+class=property>voice-balance</code></a>&rsquo; property.
+
+<li> When user-agents are capable of mixing audio signals through more
+than 2 channels (e.g. 5-speakers surround sound system, including a
+dedicated center channel), the physical distribution of audio signals
+resulting from the application of the &lsquo;<a
+href="#voice-balance"><code
+class=property>voice-balance</code></a>&rsquo; property should be
+performed so that the listener perceives sound as if it was coming from a
+basic stereo layout. For example, the center channel as well as the
+left/right speakers may be used altogether in order to emulate the
+behavior of the &lsquo;<code class=property>center</code>&rsquo; value.
+</ul>
+
+<p class=note> Note that sound systems may be configured by users in such a
+way that it would interfere with the left-right audio distribution
+specified by document authors. Typically, the various "surround" modes
+available in modern sound systems (including systems based on basic stereo
+speakers) tend to greatly alter the perceived spatial arrangement of audio
+signals. The illusion of a three-dimensional sound stage is often achieved
+using a combination of phase shifting, digital delay, volume control
+(channel mixing), and other techniques. Some users may even configure
+their system to "downgrade" any rendered sound to a single mono channel,
+in which case the effect of the &lsquo;<a href="#voice-balance"><code
+class=property>voice-balance</code></a>&rsquo; property would obviously
+not be perceivable at all. The rendering fidelity of authored content is
+therefore dependent on such user customizations, and the &lsquo;<a
+href="#voice-balance"><code class=property>voice-balance</code></a>&rsquo;
+property merely specifies the desired end-result.
+
 <p class=note> Note that many speech synthesizers only generate mono sound,
 and therefore do not intrinsically support the &lsquo;<a
 href="#voice-balance"><code class=property>voice-balance</code></a>&rsquo;
 property. The sound distribution along the left-right axis consequently
 occurs at post-synthesis stage (when the speech-enabled user-agent mixes
 the various audio sources authored within the document)
 
+<p> Future revisions of the CSS Speech module may include support for
+three-dimensional audio, which would effectively enable authors to specify
+"azimuth" and "elevation" values. In the future, content authored using
+the current specification may therefore be consumed by user-agents which
+are compliant with the version of CSS Speech that supports
+three-dimensional audio. In order to prepare for this possibility, the
+values enabled by the current &lsquo;<a href="#voice-balance"><code
+class=property>voice-balance</code></a>&rsquo; property are designed to
+remain compatible with "azimuth" angles. More precisely, the mapping
+between the current left-right audio axis (lateral sound stage) and the
+envisioned 360 degrees plane around the listener's position is defined as
+follows:
+
+<ul>
+<li>The value &lsquo;<code class=css>0</code>&rsquo; maps to zero degrees
+(&lsquo;<code class=property>center</code>&rsquo;). This is in "front" of
+the listener, not from "behind".
+
+<li>The value &lsquo;<code class=css>-100</code>&rsquo; maps to -40
+degrees (&lsquo;<code class=property>left</code>&rsquo;). Negative angles
+are in the counter-clockwise direction (the audio stage is seen from the
+top).
+
+<li>The value &lsquo;<code class=css>100</code>&rsquo; maps to 40 degrees
+(&lsquo;<code class=property>right</code>&rsquo;). Positive angles are in
+the clockwise direction (the audio stage is seen from the top).
+
+<li>Intermediary values on the scale from &lsquo;<code
+class=css>-100</code>&rsquo; to &lsquo;<code class=css>100</code>&rsquo;
+map to the angles between -40 and 40 degrees in a numerically
+linearly-proportional manner. For example, &lsquo;<code
+class=css>-50</code>&rsquo; maps to -20 degrees.
+</ul>
+
 <h2 id=speaking-props><span class=secno>6. </span>Speaking properties</h2>
 
 <h3 id=speaking-props-speak><span class=secno>6.1. </span>The &lsquo;<a

css3-speech/Overview.src.html

Lines changed: 50 additions & 24 deletions
@@ -416,33 +416,9 @@ <h3 id="mixing-props-voice-balance">The 'voice-balance' property</h3>
 lateral sound stage: one extremity is on the left, the other extremity is on the right hand
 side, relative to the listener's position. Authors can specify intermediary steps between left
 and right extremities, to represent the audio separation along the resulting left-right axis. </p>
-
 <p class="note"> Note that the functionality provided by this property has no match in the SSML
 markup language [[!SSML]]. </p>
 
-<ul>
-<li> When user-agents produce audio via a mono-aural sound system (i.e. single-speaker setup),
-the 'voice-balance' property has no effect. </li>
-<li> When user-agents produce audio through a stereo sound system (e.g. two speakers, a pair
-of headphones), the left-right distribution of audio signals precisely match the authored
-values for the 'voice-balance' property. </li>
-<li> When user-agents are capable of mixing audio signals through more than 2 channels (e.g.
-5-speakers surround sound system, including a dedicated center channel), the physical
-distribution of audio signals resulting from the application of the 'voice-balance' property
-must be performed so that the listener perceives sound as if it was coming from a basic
-stereo layout. For example, the center channel as well as the left/right speakers may be
-used altogether in order to emulate the behavior of the 'center' value (zero, on the
-[-100,100] scale of the 'voice-balance' property). </li>
-</ul>
-
-<p class="note"> Note that sound systems may be configured by users in such a way that it would
-interfere with the left-right audio distribution specified by document authors. Typically, the
-various "surround" modes available in modern sound systems (including systems based on basic
-stereo speakers) tend to greatly alter the perceived spatial arrangement of audio signals.
-Some users may even configure their system to "downgrade" any rendered sound to a single mono
-channel, in which case the effect of the 'voice-balance' property would obviously not be
-perceivable at all. The rendering fidelity of authored content is therefore dependent on such
-user customizations. </p>
 <dl>
 <dt>
 <strong>&lt;number&gt;</strong>
@@ -488,10 +464,60 @@ <h3 id="mixing-props-voice-balance">The 'voice-balance' property</h3>
 clamping the resulting number to '100'.</p>
 </dd>
 </dl>
+
+<p> User agents may be connected to different kinds of sound systems, featuring varying audio
+mixing capabilities. The expected behavior for mono, stereo, and surround sound systems is
+defined as follows: </p>
+<ul>
+<li> When user-agents produce audio via a mono-aural sound system (i.e. single-speaker setup),
+the 'voice-balance' property has no effect. </li>
+<li> When user-agents produce audio through a stereo sound system (e.g. two speakers, a pair
+of headphones), the left-right distribution of audio signals can precisely match the
+authored values for the 'voice-balance' property. </li>
+<li> When user-agents are capable of mixing audio signals through more than 2 channels (e.g.
+5-speakers surround sound system, including a dedicated center channel), the physical
+distribution of audio signals resulting from the application of the 'voice-balance' property
+should be performed so that the listener perceives sound as if it was coming from a basic
+stereo layout. For example, the center channel as well as the left/right speakers may be
+used altogether in order to emulate the behavior of the 'center' value. </li>
+</ul>
+
+<p class="note"> Note that sound systems may be configured by users in such a way that it would
+interfere with the left-right audio distribution specified by document authors. Typically, the
+various "surround" modes available in modern sound systems (including systems based on basic
+stereo speakers) tend to greatly alter the perceived spatial arrangement of audio signals. The
+illusion of a three-dimensional sound stage is often achieved using a combination of phase
+shifting, digital delay, volume control (channel mixing), and other techniques. Some users may
+even configure their system to "downgrade" any rendered sound to a single mono channel, in
+which case the effect of the 'voice-balance' property would obviously not be perceivable at
+all. The rendering fidelity of authored content is therefore dependent on such user
+customizations, and the 'voice-balance' property merely specifies the desired end-result. </p>
+
 <p class="note"> Note that many speech synthesizers only generate mono sound, and therefore do
 not intrinsically support the 'voice-balance' property. The sound distribution along the
 left-right axis consequently occurs at post-synthesis stage (when the speech-enabled
 user-agent mixes the various audio sources authored within the document) </p>
+
+<p> Future revisions of the CSS Speech module may include support for three-dimensional audio,
+which would effectively enable authors to specify "azimuth" and "elevation" values. In the
+future, content authored using the current specification may therefore be consumed by
+user-agents which are compliant with the version of CSS Speech that supports three-dimensional
+audio. In order to prepare for this possibility, the values enabled by the current
+'voice-balance' property are designed to remain compatible with "azimuth" angles. More
+precisely, the mapping between the current left-right audio axis (lateral sound stage) and the
+envisioned 360 degrees plane around the listener's position is defined as follows: </p>
+<ul>
+<li>The value '0' maps to zero degrees ('center'). This is in "front" of the listener, not
+from "behind".</li>
+<li>The value '-100' maps to -40 degrees ('left'). Negative angles are in the
+counter-clockwise direction (the audio stage is seen from the top).</li>
+<li>The value '100' maps to 40 degrees ('right'). Positive angles are in the clockwise
+direction (the audio stage is seen from the top).</li>
+<li>Intermediary values on the scale from '-100' to '100' map to the angles between -40 and 40
+degrees in a numerically linearly-proportional manner. For example, '-50' maps to -20
+degrees.</li>
+</ul>
+
 <h2 id="speaking-props">Speaking properties</h2>
 <h3 id="speaking-props-speak">The 'speak' property</h3>
 <table class="propdef" summary="name: syntax">
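
Review note: the azimuth mapping added in this commit ('0' to 0 degrees, '-100' to -40 degrees, '100' to 40 degrees, linear in between) can be checked with a few lines of Python. This is only an illustrative sketch, not part of the commit or the specification. The name `balance_to_azimuth` and the constant `MAX_AZIMUTH_DEGREES` are hypothetical, and clamping out-of-range input to [-100, 100] is an assumption based on the clamping the property text describes for computed values.

```python
# Hypothetical helper for illustration; the CSS Speech module defines no such function.
MAX_AZIMUTH_DEGREES = 40.0  # angle that 'voice-balance: 100' ('right') maps to


def balance_to_azimuth(balance: float) -> float:
    """Map a 'voice-balance' number to an azimuth angle in degrees.

    Negative results are counter-clockwise (left), positive results are
    clockwise (right), with the sound stage seen from the top.
    """
    # Assumption: out-of-range input is clamped to the [-100, 100] scale.
    clamped = max(-100.0, min(100.0, balance))
    # Linear proportional mapping from [-100, 100] to [-40, 40] degrees.
    return clamped * MAX_AZIMUTH_DEGREES / 100.0
```

Under these assumptions, `balance_to_azimuth(-50)` gives `-20.0`, which agrees with the example in the added list item.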
