
<!DOCTYPE html>

<!--Converted with LaTeX2HTML 2002-2-1 (1.71)
original version by:  Nikos Drakos, CBLU, University of Leeds
* revised and updated by:  Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
  Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>

<meta name="viewport" content="width=device-width, initial-scale=1.0">


<TITLE>Phase</TITLE>
<META NAME="description" CONTENT="Phase">
<META NAME="keywords" CONTENT="book">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">

<META NAME="Generator" CONTENT="LaTeX2HTML v2002-2-1">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">

<LINK REL="STYLESHEET" HREF="book.css">

<LINK REL="next" HREF="node177.html">
<LINK REL="previous" HREF="node172.html">
<LINK REL="up" HREF="node163.html">
<LINK REL="next" HREF="node176.html">
</HEAD>

<BODY >
<!--Navigation Panel-->
<A ID="tex2html3186"
  HREF="node176.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
 SRC="next.png"></A>
<A ID="tex2html3180"
  HREF="node163.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
 SRC="up.png"></A>
<A ID="tex2html3174"
  HREF="node174.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
 SRC="prev.png"></A>
<A ID="tex2html3182"
  HREF="node4.html">
<IMG WIDTH="65" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="contents"
 SRC="contents.png"></A>
<A ID="tex2html3184"
  HREF="node201.html">
<IMG WIDTH="43" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="index"
 SRC="index.png"></A>
<BR>
<B> Next:</B> <A ID="tex2html3187"
  HREF="node176.html">Phase relationships between channels</A>
<B> Up:</B> <A ID="tex2html3181"
  HREF="node163.html">Fourier analysis and resynthesis</A>
<B> Previous:</B> <A ID="tex2html3175"
  HREF="node174.html">Timbre stamping (classical vocoder)</A>
 &nbsp; <B>  <A ID="tex2html3183"
  HREF="node4.html">Contents</A></B>
 &nbsp; <B>  <A ID="tex2html3185"
  HREF="node201.html">Index</A></B>
<BR>
<BR>
<!--End of Navigation Panel-->

<H1><A ID="SECTION001350000000000000000"></A>
<A ID="sect9.phase"></A>
<BR>
Phase
</H1>

<P>
So far we have operated on signals by altering the magnitudes of their
windowed Fourier transforms, but leaving phases intact.  The magnitudes
encode the spectral envelope of the sound.  The phases, on the other hand,
encode frequency and time, in the sense that the phase change from
one window to a later one accumulates, over time, in proportion to frequency.
A transformation that allows independent control over frequency and
time therefore requires analyzing and reconstructing the phase.

<P>

<DIV ALIGN="CENTER"><A ID="fig09.10"></A><A ID="12612"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.10:</STRONG>
Phase in windowed Fourier analysis: (a) a complex sinusoid analyzed
on three successive windows; (b) the result for a single channel (k=3), for
the three windows.</CAPTION>
<TR><TD><IMG
 WIDTH="447" HEIGHT="623" BORDER="0"
 SRC="img1176.png"
 ALT="\begin{figure}\psfig{file=figs/fig09.10.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>

<P>
In the analysis/synthesis examples of the previous section, the phase of the
output is copied directly from the phase of the input.  This is appropriate
when the output signal corresponds in time with the input signal.  Sometimes, however,
we want to modify the time scale, for instance to stretch or contract a sound in time.
Alternatively, the output phase might depend on more than one input, for instance
to morph between one sound and another.

<P>
Figure <A HREF="#fig09.10">9.10</A> shows how the phase of the Fourier transform
changes from window to window, given a complex sinusoid as input.  The
sinusoid's frequency is <!-- MATH
 $\alpha = 3\omega$
 -->
<IMG
 WIDTH="53" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img1177.png"
 ALT="$\alpha = 3\omega$">, so that the peak in the Fourier transform
is centered at <IMG
 WIDTH="41" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img1178.png"
 ALT="$k=3$">.  If the initial phase is <IMG
 WIDTH="13" HEIGHT="30" ALIGN="MIDDLE" BORDER="0"
 SRC="img77.png"
 ALT="$\phi$">, then the neighboring
phases can be filled in as:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
\begin{array}{lll}
        {\angle S[0, 2] = \phi + \pi} &
            {\angle S[0, 3] = \phi} &
            {\angle S[0, 4] = \phi + \pi}\\
        {\angle S[1, 2] = \phi + H\alpha + \pi } &
            {\angle S[1, 3] = \phi + H\alpha} &
            {\angle S[1, 4] = \phi + H\alpha + \pi}\\
        {\angle S[2, 2] = \phi + 2H\alpha + \pi } &
            {\angle S[2, 3] = \phi + 2H\alpha} &
            {\angle S[2, 4] = \phi + 2H\alpha + \pi}\\
    \end{array}
\end{displaymath}
 -->

<IMG
 WIDTH="494" HEIGHT="64" BORDER="0"
 SRC="img1179.png"
 ALT="\begin{displaymath}
\begin{array}{lll}
{\angle S[0, 2] = \phi + \pi} &amp;
{\angl...
...ha} &amp;
{\angle S[2, 4] = \phi + 2H\alpha + \pi}\\
\end{array}\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
This gives an excellent way of estimating the frequency <IMG
 WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img7.png"
 ALT="$\alpha $">: pick any
channel whose amplitude is dominated by the sinusoid and subtract two
successive phases to get <IMG
 WIDTH="28" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img1180.png"
 ALT="$H\alpha$">:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
H \alpha = \angle S[1, 3] - \angle S[0, 3]
\end{displaymath}
 -->

<IMG
 WIDTH="169" HEIGHT="28" BORDER="0"
 SRC="img1181.png"
 ALT="\begin{displaymath}
H \alpha = \angle S[1, 3] - \angle S[0, 3]
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
\alpha = {{\angle S[1, 3] - \angle S[0, 3] + 2 p \pi} \over H}
\end{displaymath}
 -->

<IMG
 WIDTH="203" HEIGHT="40" BORDER="0"
 SRC="img1182.png"
 ALT="\begin{displaymath}
\alpha = {{\angle S[1, 3] - \angle S[0, 3] + 2 p \pi} \over H}
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
where <IMG
 WIDTH="11" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
 SRC="img57.png"
 ALT="$p$"> is an integer.  There are <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img25.png"
 ALT="$H$"> possible frequencies, spaced by
<IMG
 WIDTH="43" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img1183.png"
 ALT="$2\pi/H$">.  If we are using an overlap of 4, that is, <IMG
 WIDTH="68" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img1154.png"
 ALT="$H=N/4$">, the frequencies
are spaced by <!-- MATH
 $8\pi/N = 4 \omega$
 -->
<IMG
 WIDTH="82" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img1184.png"
 ALT="$8\pi/N = 4 \omega$">.  Happily, this is the width of the main lobe
for the Hann window, so no more than one possible value of <IMG
 WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img7.png"
 ALT="$\alpha $"> can explain
any measured phase difference within the main lobe of a peak.  The correct value
of <IMG
 WIDTH="11" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
 SRC="img57.png"
 ALT="$p$"> to choose is that which gives a frequency closest to the nominal
frequency of the channel, <IMG
 WIDTH="22" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img1145.png"
 ALT="$k\omega$">.
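<P>
A minimal sketch of this estimate in Python, assuming a periodic Hann window
of length 64, an overlap of 4, and a test sinusoid lying slightly above
channel 3.  The helper names <TT>hann_dft_bin</TT> and
<TT>estimate_frequency</TT>, and all of the numeric values, are hypothetical
choices made for the illustration:

```python
import cmath
import math


def hann_dft_bin(x, start, n_win, k):
    """Channel k of the Hann-windowed DFT of x[start : start + n_win]."""
    total = 0j
    for n in range(n_win):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * n / n_win)  # periodic Hann window
        total += w * x[start + n] * cmath.exp(-2j * math.pi * k * n / n_win)
    return total


def estimate_frequency(s0, s1, k, n_win, hop):
    """Estimate alpha (radians per sample) from channel k of two successive windows."""
    omega = 2 * math.pi / n_win                 # fundamental analysis frequency
    dphi = cmath.phase(s1) - cmath.phase(s0)    # equals hop * alpha up to multiples of 2*pi
    # Choose the integer p that puts the estimate nearest the nominal frequency k * omega.
    p = round((hop * k * omega - dphi) / (2 * math.pi))
    return (dphi + 2 * math.pi * p) / hop


N, H = 64, 16                       # overlap of 4: H = N / 4
alpha = 3.3 * 2 * math.pi / N       # assumed test frequency, slightly above channel 3
phi = 0.7                           # arbitrary initial phase
x = [cmath.exp(1j * (alpha * n + phi)) for n in range(N + H)]

s0 = hann_dft_bin(x, 0, N, 3)       # S[0, 3]
s1 = hann_dft_bin(x, H, N, 3)       # S[1, 3]
alpha_est = estimate_frequency(s0, s1, 3, N, H)
print(alpha, alpha_est)
```

<P>
The two printed values should agree to within rounding error, since for a
single complex sinusoid the two windows differ only by the phase shift that
the estimate recovers.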

<P>
When computing phases for synthesizing a new or modified signal, we want to
maintain the appropriate phase relationships between successive resynthesis
windows, and also, simultaneously, between adjacent channels. These two sets of
relationships are not always compatible, however.  We will make it our first
obligation to honor the relations between successive resynthesis windows, and
worry about phase relationships between channels afterward.

<P>
Suppose we want to construct the <IMG
 WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img111.png"
 ALT="$m$">th spectrum <IMG
 WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img64.png"
 ALT="$S[m, k]$"> for resynthesis
(having already constructed the previous one, number <IMG
 WIDTH="44" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
 SRC="img1185.png"
 ALT="$m-1$">).  Suppose
we wish the phase relationships between windows <IMG
 WIDTH="44" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
 SRC="img1185.png"
 ALT="$m-1$"> and <IMG
 WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img111.png"
 ALT="$m$"> to be those of
a signal <IMG
 WIDTH="31" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img80.png"
 ALT="$x[n]$">, but that the phases of window number <IMG
 WIDTH="44" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
 SRC="img1185.png"
 ALT="$m-1$"> might have come
from somewhere else and can't be assumed to be in line with our wishes.

<P>

<DIV ALIGN="CENTER"><A ID="fig09.11"></A><A ID="12631"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.11:</STRONG>
Propagating phases in resynthesis.  Each phase, such as that of
<IMG
 WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img64.png"
 ALT="$S[m, k]$"> here, depends on the previous output phase and the difference of the
input phases.</CAPTION>
<TR><TD><IMG
 WIDTH="459" HEIGHT="405" BORDER="0"
 SRC="img1186.png"
 ALT="\begin{figure}\psfig{file=figs/fig09.11.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>

<P>

<DIV ALIGN="CENTER"><A ID="fig09.12"></A><A ID="12636"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.12:</STRONG>
Phases of one channel of the analysis windows and two successive
resynthesis windows.</CAPTION>
<TR><TD><IMG
 WIDTH="355" HEIGHT="244" BORDER="0"
 SRC="img1187.png"
 ALT="\begin{figure}\psfig{file=figs/fig09.12.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>

<P>
To find out how much the phase of each channel should differ from the previous
one, we do two analyses of the signal <IMG
 WIDTH="31" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img80.png"
 ALT="$x[n]$">, separated by the same hop size
<IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img25.png"
 ALT="$H$"> that we're using for resynthesis:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
T[k] = {\cal FT}(W(n)X[n]) (k)
\end{displaymath}
 -->

<IMG
 WIDTH="180" HEIGHT="28" BORDER="0"
 SRC="img1188.png"
 ALT="\begin{displaymath}
T[k] = {\cal FT}(W(n)X[n]) (k)
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
T'[k] = {\cal FT}(W(n)X[n+H]) (k)
\end{displaymath}
 -->

<IMG
 WIDTH="219" HEIGHT="28" BORDER="0"
 SRC="img1189.png"
 ALT="\begin{displaymath}
T'[k] = {\cal FT}(W(n)X[n+H]) (k)
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
Figure <A HREF="#fig09.11">9.11</A> shows the process of phase accumulation, in which the
output phases each depend on the previous output phase and the phase difference
for two windowed analyses of the input.  Figure <A HREF="#fig09.12">9.12</A> illustrates the
phase relationship in the complex plane.
The phase of the new output <IMG
 WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img64.png"
 ALT="$S[m, k]$"> should be that of the previous one plus the
difference between the phases of the two analyses:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
\angle S[m, k] = \angle S[m-1, k] +
    \left ( \angle T'[k] -  \angle T[k] \right )
\end{displaymath}
 -->

<IMG
 WIDTH="299" HEIGHT="28" BORDER="0"
 SRC="img1190.png"
 ALT="\begin{displaymath}
\angle S[m, k] = \angle S[m-1, k] +
\left ( \angle T'[k] - \angle T[k] \right )
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
= \angle \left (
        {{S[m-1, k] T'[k]}
            \over
        {T[k]}}
    \right )
\end{displaymath}
 -->

<IMG
 WIDTH="163" HEIGHT="45" BORDER="0"
 SRC="img1191.png"
 ALT="\begin{displaymath}
= \angle \left (
{{S[m-1, k] T'[k]}
\over
{T[k]}}
\right )
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
Here we used the fact that multiplying or dividing two complex numbers gives
the sum or difference of their arguments.

<P>
If the desired magnitude is a real number <IMG
 WIDTH="11" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img4.png"
 ALT="$a$">, then we should set <IMG
 WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img64.png"
 ALT="$S[m, k]$">
to:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
S[m, k] \; = \;
    a
    \; \cdot \;
    {
        {  \left |
            {{S[m-1, k] T'[k]}
            \over
            {T[k]}}
        \right |}
        ^
        {-1}
    }
    \; \cdot \;
    {
        {{S[m-1, k] T'[k]}
        \over
        {T[k]}}
    }
\end{displaymath}
 -->

<IMG
 WIDTH="383" HEIGHT="48" BORDER="0"
 SRC="img1192.png"
 ALT="\begin{displaymath}
S[m, k] \; = \;
a
\; \cdot \;
{
{ \left \vert
{{S[m-...
...{-1}
}
\; \cdot \;
{
{{S[m-1, k] T'[k]}
\over
{T[k]}}
}
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
The magnitudes of the second and third terms cancel out, so that the magnitude
of <IMG
 WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img64.png"
 ALT="$S[m, k]$"> reduces to <IMG
 WIDTH="11" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img4.png"
 ALT="$a$">; the first two terms are positive real numbers (for a positive magnitude), so the
argument is determined entirely by the last term.
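<P>
The formula can be sketched for a single channel as follows.  This is only an
illustration: the function name <TT>propagate_bin</TT> and the sample values
are hypothetical, and the sketch assumes that neither the previous output nor
the first analysis is zero in this channel.

```python
import cmath


def propagate_bin(s_prev, t, t_next, a):
    """Return S[m, k] with magnitude a and the phase of S[m-1, k] * T'[k] / T[k]."""
    z = s_prev * t_next / t      # carries the desired argument; its magnitude is discarded
    return a * z / abs(z)        # a * |z|^(-1) * z


# Hypothetical values for one channel k.
s_prev = cmath.rect(0.2, 1.0)    # previous output S[m-1, k]: magnitude 0.2, phase 1.0
t = cmath.rect(3.0, -0.4)        # first analysis T[k]
t_next = cmath.rect(5.0, 0.9)    # second analysis T'[k], one hop later
s_new = propagate_bin(s_prev, t, t_next, a=2.0)

print(abs(s_new))                # magnitude of the new output
print(cmath.phase(s_new))        # phase of the new output
```

<P>
With these values the result should have magnitude 2.0 and phase
1.0 + (0.9 + 0.4) = 2.3 radians, up to rounding.  A practical implementation
would need to guard against division by zero in silent channels.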

<P>
If we also want the output to take its magnitude from the analyzed spectrum <IMG
 WIDTH="15" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img557.png"
 ALT="$T$"> (more precisely, from the second of the two analyses), we can
set <IMG
 WIDTH="75" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img1193.png"
 ALT="$a = \vert T'[k]\vert$"> and simplify:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
S[m, k] \; = \;
    {
        {  \left |
            {{S[m-1, k]}
            \over
            {T[k]}}
        \right |}
        ^
        {-1}
    }
    \; \cdot \;
    {
        {{S[m-1, k] T'[k]}
        \over
        {T[k]}}
    }
\end{displaymath}
 -->

<IMG
 WIDTH="321" HEIGHT="48" BORDER="0"
 SRC="img1194.png"
 ALT="\begin{displaymath}
S[m, k] \; = \;
{
{ \left \vert
{{S[m-1, k]}
\over
{...
...{-1}
}
\; \cdot \;
{
{{S[m-1, k] T'[k]}
\over
{T[k]}}
}
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
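<P>
To see how this recurrence decouples frequency from time, the following
Python sketch applies it to one channel of a stationary complex sinusoid.
As an illustrative assumption, the position at which the input is analyzed
advances by only half a hop per output frame (a time stretch by a factor of
two), while the two analyses stay one full hop apart.  The names
<TT>hann_dft_bin</TT> and <TT>next_output</TT> and all numeric values are
hypothetical.

```python
import cmath
import math


def hann_dft_bin(x, start, n_win, k):
    """Channel k of the Hann-windowed DFT of x[start : start + n_win]."""
    total = 0j
    for n in range(n_win):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * n / n_win)  # periodic Hann window
        total += w * x[start + n] * cmath.exp(-2j * math.pi * k * n / n_win)
    return total


def next_output(s_prev, t, t_next):
    """S[m, k] = |S[m-1, k] / T[k]|^(-1) * S[m-1, k] * T'[k] / T[k]."""
    return s_prev * t_next / t / abs(s_prev / t)


N, H, K = 64, 16, 3
alpha = 3.2 * 2 * math.pi / N                # assumed test frequency near channel 3
x = [cmath.exp(1j * alpha * n) for n in range(4 * N)]

frames = [hann_dft_bin(x, 0, N, K)]          # seed: the first output copies the input spectrum
for m in range(1, 8):
    loc = m * H // 2                         # read position advances half a hop per frame
    t = hann_dft_bin(x, loc, N, K)           # T[k]
    t_next = hann_dft_bin(x, loc + H, N, K)  # T'[k], one full hop later
    frames.append(next_output(frames[-1], t, t_next))

# Phase advance between successive output frames, to compare with H * alpha (mod 2*pi).
advances = [cmath.phase(frames[m] / frames[m - 1]) for m in range(1, 8)]
print(advances)
print(cmath.phase(cmath.exp(1j * H * alpha)))
```

<P>
Each printed advance should match the hop size times the sinusoid's frequency,
wrapped into the range from -pi to pi, so the output keeps the original
frequency even though the input is read at half speed.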

<P>
<BR><HR>
<!--Table of Child-Links-->
<A ID="CHILD_LINKS"><STRONG>Subsections</STRONG></A>

<UL>
<LI><A ID="tex2html3188"
  HREF="node176.html">Phase relationships between channels</A>
</UL>
<!--End of Table of Child-Links-->
<HR>
<ADDRESS>
Miller Puckette
2006-12-30
</ADDRESS>
</BODY>
</HTML>