<!DOCTYPE html>
<!--Converted with LaTeX2HTML 2002-2-1 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<TITLE>Phase</TITLE>
<META NAME="description" CONTENT="Phase">
<META NAME="keywords" CONTENT="book">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="LaTeX2HTML v2002-2-1">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="book.css">
<LINK REL="next" HREF="node177.html">
<LINK REL="previous" HREF="node172.html">
<LINK REL="up" HREF="node163.html">
<LINK REL="next" HREF="node176.html">
</HEAD>
<BODY >
<!--Navigation Panel-->
<A ID="tex2html3186"
HREF="node176.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="next.png"></A>
<A ID="tex2html3180"
HREF="node163.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="up.png"></A>
<A ID="tex2html3174"
HREF="node174.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="prev.png"></A>
<A ID="tex2html3182"
HREF="node4.html">
<IMG WIDTH="65" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="contents"
SRC="contents.png"></A>
<A ID="tex2html3184"
HREF="node201.html">
<IMG WIDTH="43" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="index"
SRC="index.png"></A>
<BR>
<B> Next:</B> <A ID="tex2html3187"
HREF="node176.html">Phase relationships between channels</A>
<B> Up:</B> <A ID="tex2html3181"
HREF="node163.html">Fourier analysis and resynthesis</A>
<B> Previous:</B> <A ID="tex2html3175"
HREF="node174.html">Timbre stamping (classical vocoder)</A>
&nbsp; <B> <A ID="tex2html3183"
HREF="node4.html">Contents</A></B>
&nbsp; <B> <A ID="tex2html3185"
HREF="node201.html">Index</A></B>
<BR>
<BR>
<!--End of Navigation Panel-->
<H1><A ID="SECTION001350000000000000000"></A>
<A ID="sect9.phase"></A>
<BR>
Phase
</H1>
<P>
So far we have operated on signals by altering the magnitudes of their
windowed Fourier transforms, but leaving phases intact. The magnitudes
encode the spectral envelope of the sound. The phases, on the other hand,
encode frequency and time, in the sense that the phase change from
one window to the next accumulates, over time, at a rate set by the frequency.
To make a transformation that allows independent control over frequency and
time requires analyzing and reconstructing the phase.
<P>
<DIV ALIGN="CENTER"><A ID="fig09.10"></A><A ID="12612"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.10:</STRONG>
Phase in windowed Fourier analysis: (a) a complex sinusoid analyzed
on three successive windows; (b) the result for a single channel (k=3), for
the three windows.</CAPTION>
<TR><TD><IMG
WIDTH="447" HEIGHT="623" BORDER="0"
SRC="img1176.png"
ALT="\begin{figure}\psfig{file=figs/fig09.10.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>
<P>
In the analysis/synthesis examples of the previous section, the phase of the
output is copied directly from the phase of the input. This is appropriate
when the output signal corresponds in time with the input signal. Sometimes time
modifications are desired, for instance to do time stretching or contraction.
Alternatively the output phase might depend on more than one input, for instance
to morph between one sound and another.
<P>
Figure <A HREF="#fig09.10">9.10</A> shows how the phase of the Fourier transform
changes from window to window, given a complex sinusoid as input. The
sinusoid's frequency is <!-- MATH
$\alpha = 3\omega$
-->
<IMG
WIDTH="53" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img1177.png"
ALT="$\alpha = 3\omega$">, so that the peak in the Fourier transform
is centered at <IMG
WIDTH="41" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img1178.png"
ALT="$k=3$">. If the initial phase is <IMG
WIDTH="13" HEIGHT="30" ALIGN="MIDDLE" BORDER="0"
SRC="img77.png"
ALT="$\phi$">, then the neighboring
phases can be filled in as:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
\begin{array}{lll}
{\angle S[0, 2] = \phi + \pi} &
{\angle S[0, 3] = \phi} &
{\angle S[0, 4] = \phi + \pi}\\
{\angle S[1, 2] = \phi + H\alpha + \pi } &
{\angle S[1, 3] = \phi + H\alpha} &
{\angle S[1, 4] = \phi + H\alpha + \pi}\\
{\angle S[2, 2] = \phi + 2H\alpha + \pi } &
{\angle S[2, 3] = \phi + 2H\alpha} &
{\angle S[2, 4] = \phi + 2H\alpha + \pi}\\
\end{array}
\end{displaymath}
-->
<IMG
WIDTH="494" HEIGHT="64" BORDER="0"
SRC="img1179.png"
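ALT="\begin{displaymath}
\begin{array}{lll}
{\angle S[0, 2] = \phi + \pi} &amp;
{\angle S[0, 3] = \phi} &amp;
{\angle S[0, 4] = \phi + \pi}\\
{\angle S[1, 2] = \phi + H\alpha + \pi } &amp;
{\angle S[1, 3] = \phi + H\alpha} &amp;
{\angle S[1, 4] = \phi + H\alpha + \pi}\\
{\angle S[2, 2] = \phi + 2H\alpha + \pi } &amp;
{\angle S[2, 3] = \phi + 2H\alpha} &amp;
{\angle S[2, 4] = \phi + 2H\alpha + \pi}\\
\end{array}\end{displaymath}">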
</DIV>
<BR CLEAR="ALL">
<P></P>
This gives an excellent way of estimating the frequency <IMG
WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img7.png"
ALT="$\alpha $">: pick any
channel whose amplitude is dominated by the sinusoid and subtract two
successive phases to get <IMG
WIDTH="28" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img1180.png"
ALT="$H\alpha$">:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
H \alpha = \angle S[1, 3] - \angle S[0, 3]
\end{displaymath}
-->
<IMG
WIDTH="169" HEIGHT="28" BORDER="0"
SRC="img1181.png"
ALT="\begin{displaymath}
H \alpha = \angle S[1, 3] - \angle S[0, 3]
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
\alpha = {{\angle S[1, 3] - \angle S[0, 3] + 2 p \pi} \over H}
\end{displaymath}
-->
<IMG
WIDTH="203" HEIGHT="40" BORDER="0"
SRC="img1182.png"
ALT="\begin{displaymath}
\alpha = {{\angle S[1, 3] - \angle S[0, 3] + 2 p \pi} \over H}
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
where <IMG
WIDTH="11" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
SRC="img57.png"
ALT="$p$"> is an integer. There are <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img25.png"
ALT="$H$"> possible frequencies, spaced by
<IMG
WIDTH="43" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img1183.png"
ALT="$2\pi/H$">. If we are using an overlap of 4, that is, <IMG
WIDTH="68" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img1154.png"
ALT="$H=N/4$">, the frequencies
are spaced by <!-- MATH
$8\pi/N = 4 \omega$
-->
<IMG
WIDTH="82" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img1184.png"
ALT="$8\pi/N = 4 \omega$">. Happily, this is the width of the main lobe
for the Hann window, so no more than one possible value of <IMG
WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img7.png"
ALT="$\alpha $"> can explain
any measured phase difference within the main lobe of a peak. The correct value
of <IMG
WIDTH="11" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
SRC="img57.png"
ALT="$p$"> to choose is that which gives a frequency closest to the nominal
frequency of the channel, <IMG
WIDTH="22" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img1145.png"
ALT="$k\omega$">.
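<P>
The choice of the integer p can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not code from the text: the
function and variable names are hypothetical, frequencies are in radians per
sample, and the two phases are assumed to come from the same channel of two
analyses spaced H samples apart.
<PRE>
```python
import math

def wrap(angle):
    """Wrap an angle into the interval (-pi, pi], as a phase measurement would."""
    return math.atan2(math.sin(angle), math.cos(angle))

def estimate_frequency(phase0, phase1, k, n_window, hop):
    """Estimate angular frequency from two successive phases of channel k.

    phase0, phase1: phases of channel k in two analyses `hop` samples apart.
    Returns the candidate (diff + 2*pi*p) / hop closest to k * omega.
    """
    omega = 2 * math.pi / n_window      # fundamental frequency of the analysis
    nominal = k * omega                 # nominal frequency of channel k
    diff = phase1 - phase0              # equals hop * alpha up to a multiple of 2*pi
    # Integer p that moves the estimate closest to the nominal frequency.
    p = round((hop * nominal - diff) / (2 * math.pi))
    return (diff + 2 * math.pi * p) / hop

# Assumed example values: N = 64, overlap of 4 (H = 16), sinusoid near channel 3.
N, H = 64, 16
alpha_true = 3.3 * 2 * math.pi / N
phi = 0.5
alpha_est = estimate_frequency(wrap(phi), wrap(phi + H * alpha_true), 3, N, H)
```
</PRE>
With H = N/4 the candidate frequencies are 4&omega; apart, so any true
frequency within 2&omega; of the channel's nominal frequency is recovered by
this nearest-candidate rule.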
<P>
When computing phases for synthesizing a new or modified signal, we want to
maintain the appropriate phase relationships between successive resynthesis
windows, and also, simultaneously, between adjacent channels. These two sets of
relationships are not always compatible, however. We will make it our first
obligation to honor the relations between successive resynthesis windows, and
worry about phase relationships between channels afterward.
<P>
Suppose we want to construct the <IMG
WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img111.png"
ALT="$m$">th spectrum <IMG
WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img64.png"
ALT="$S[m, k]$"> for resynthesis
(having already constructed the previous one, number <IMG
WIDTH="44" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
SRC="img1185.png"
ALT="$m-1$">). Suppose
we wish the phase relationships between windows <IMG
WIDTH="44" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
SRC="img1185.png"
ALT="$m-1$"> and <IMG
WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img111.png"
ALT="$m$"> to be those of
a signal <IMG
WIDTH="31" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img80.png"
ALT="$x[n]$">, but that the phases of window number <IMG
WIDTH="44" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
SRC="img1185.png"
ALT="$m-1$"> might have come
from somewhere else and can't be assumed to be in line with our wishes.
<P>
<DIV ALIGN="CENTER"><A ID="fig09.11"></A><A ID="12631"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.11:</STRONG>
Propagating phases in resynthesis. Each phase, such as that of
<IMG
WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img64.png"
ALT="$S[m, k]$"> here, depends on the previous output phase and the difference of the
input phases.</CAPTION>
<TR><TD><IMG
WIDTH="459" HEIGHT="405" BORDER="0"
SRC="img1186.png"
ALT="\begin{figure}\psfig{file=figs/fig09.11.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>
<P>
<DIV ALIGN="CENTER"><A ID="fig09.12"></A><A ID="12636"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.12:</STRONG>
Phases of one channel of the analysis windows and two successive
resynthesis windows.</CAPTION>
<TR><TD><IMG
WIDTH="355" HEIGHT="244" BORDER="0"
SRC="img1187.png"
ALT="\begin{figure}\psfig{file=figs/fig09.12.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>
<P>
To find out how much the phase of each channel should differ from the previous
one, we do two analyses of the signal <IMG
WIDTH="31" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img80.png"
ALT="$x[n]$">, separated by the same hop size
<IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img25.png"
ALT="$H$"> that we're using for resynthesis:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
T[k] = {\cal FT}(W(n)X[n]) (k)
\end{displaymath}
-->
<IMG
WIDTH="180" HEIGHT="28" BORDER="0"
SRC="img1188.png"
ALT="\begin{displaymath}
T[k] = {\cal FT}(W(n)X[n]) (k)
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
T'[k] = {\cal FT}(W(n)X[n+H]) (k)
\end{displaymath}
-->
<IMG
WIDTH="219" HEIGHT="28" BORDER="0"
SRC="img1189.png"
ALT="\begin{displaymath}
T'[k] = {\cal FT}(W(n)X[n+H]) (k)
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
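<P>
As a hedged sketch of these two analyses, the Python below computes one
channel of each with a naive DFT and a Hann window. The helper names and the
test signal are assumptions made for illustration; a practical implementation
would use an FFT over all channels.
<PRE>
```python
import cmath
import math

def hann(n_window):
    """Hann window of length n_window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_window)
            for n in range(n_window)]

def windowed_dft_channel(x, start, k, n_window):
    """Channel k of the DFT of the Hann-windowed block starting at x[start]."""
    w = hann(n_window)
    omega = 2 * math.pi / n_window
    return sum(w[n] * x[start + n] * cmath.exp(-1j * omega * k * n)
               for n in range(n_window))

# Assumed test input: a complex sinusoid slightly above channel 3.
N, H, k = 64, 16, 3
alpha = 3.2 * 2 * math.pi / N
x = [cmath.exp(1j * (alpha * n + 0.7)) for n in range(N + H)]

T = windowed_dft_channel(x, 0, k, N)        # analysis of x[n]
T_prime = windowed_dft_channel(x, H, k, N)  # analysis of x[n + H]

# For a pure sinusoid the phase advance equals H * alpha modulo 2*pi.
advance = cmath.phase(T_prime / T)
expected = cmath.phase(cmath.exp(1j * H * alpha))
```
</PRE>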
Figure <A HREF="#fig09.11">9.11</A> shows the process of phase accumulation, in which the
output phases each depend on the previous output phase and the phase difference
for two windowed analyses of the input. Figure <A HREF="#fig09.12">9.12</A> illustrates the
phase relationship in the complex plane.
The phase of the new output <IMG
WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img64.png"
ALT="$S[m, k]$"> should be that of the previous one plus the
difference between the phases of the two analyses:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
\angle S[m, k] = \angle S[m-1, k] +
\left ( \angle T'[k] - \angle T[k] \right )
\end{displaymath}
-->
<IMG
WIDTH="299" HEIGHT="28" BORDER="0"
SRC="img1190.png"
ALT="\begin{displaymath}
\angle S[m, k] = \angle S[m-1, k] +
\left ( \angle T'[k] - \angle T[k] \right )
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
= \angle \left (
{{S[m-1, k] T'[k]}
\over
{T[k]}}
\right )
\end{displaymath}
-->
<IMG
WIDTH="163" HEIGHT="45" BORDER="0"
SRC="img1191.png"
ALT="\begin{displaymath}
= \angle \left (
{{S[m-1, k] T'[k]}
\over
{T[k]}}
\right )
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
Here we used the fact that multiplying or dividing two complex numbers gives
the sum or difference of their arguments.
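<P>
A small numerical check of this identity, as a sketch only: the values below
are made up for a single channel, and the function name is hypothetical. It
does not guard against a zero value of T[k], which would make the quotient
undefined.
<PRE>
```python
import cmath

def propagate_phase(s_prev, t, t_prime):
    """Phase for S[m, k], taken as the argument of S[m-1, k] * T'[k] / T[k]."""
    return cmath.phase(s_prev * t_prime / t)

# Hypothetical values for one channel, given as (magnitude, phase).
s_prev = cmath.rect(2.0, 0.3)     # previous output S[m-1, k]
t = cmath.rect(1.5, -1.0)         # first analysis T[k]
t_prime = cmath.rect(0.8, -0.2)   # second analysis T'[k]

new_phase = propagate_phase(s_prev, t, t_prime)
# Sum-and-difference form of the same quantity: 0.3 + (-0.2 - (-1.0)) = 1.1
direct = 0.3 + (-0.2 - (-1.0))
```
</PRE>
The magnitudes 2.0, 1.5 and 0.8 play no role in the result; only the three
arguments matter.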
<P>
If the desired magnitude is a real number <IMG
WIDTH="11" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img4.png"
ALT="$a$">, then we should set <IMG
WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img64.png"
ALT="$S[m, k]$">
to:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
S[m, k] \; = \;
a
\; \cdot \;
{
{ \left |
{{S[m-1, k] T'[k]}
\over
{T[k]}}
\right |}
^
{-1}
}
\; \cdot \;
{
{{S[m-1, k] T'[k]}
\over
{T[k]}}
}
\end{displaymath}
-->
<IMG
WIDTH="383" HEIGHT="48" BORDER="0"
SRC="img1192.png"
ALT="\begin{displaymath}
S[m, k] \; = \;
a
\; \cdot \;
{
{ \left \vert
{{S[m-1, k] T'[k]}
\over
{T[k]}}
\right \vert}
^
{-1}
}
\; \cdot \;
{
{{S[m-1, k] T'[k]}
\over
{T[k]}}
}
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
The magnitudes of the second and third factors cancel out, so that the magnitude
of <IMG
WIDTH="52" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img64.png"
ALT="$S[m, k]$"> reduces to <IMG
WIDTH="11" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img4.png"
ALT="$a$">; the first two factors are nonnegative real numbers, so the
argument is controlled by the last factor.
<P>
If we want to end up with the magnitude from the spectrum <IMG
WIDTH="15" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img557.png"
ALT="$T$"> as well, we can
set <IMG
WIDTH="75" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img1193.png"
ALT="$a = \vert T'[k]\vert$"> and simplify:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
S[m, k] \; = \;
{
{ \left |
{{S[m-1, k]}
\over
{T[k]}}
\right |}
^
{-1}
}
\; \cdot \;
{
{{S[m-1, k] T'[k]}
\over
{T[k]}}
}
\end{displaymath}
-->
<IMG
WIDTH="321" HEIGHT="48" BORDER="0"
SRC="img1194.png"
ALT="\begin{displaymath}
S[m, k] \; = \;
{
{ \left \vert
{{S[m-1, k]}
\over
{T[k]}}
\right \vert}
^
{-1}
}
\; \cdot \;
{
{{S[m-1, k] T'[k]}
\over
{T[k]}}
}
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
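<P>
The two formulas for S[m, k] can be compared numerically. The Python below is
a sketch under assumptions: the function name and the sample values are
hypothetical, and there is no guard for channels where S[m-1, k] or T[k] is
zero, which a real implementation would need.
<PRE>
```python
import cmath

def next_spectrum_value(s_prev, t, t_prime, magnitude=None):
    """One channel of S[m, k] with propagated phase and a chosen magnitude.

    If magnitude is None, the magnitude of T'[k] is used, as in the text.
    """
    z = s_prev * t_prime / t          # carries the desired phase
    if magnitude is None:
        magnitude = abs(t_prime)
    return magnitude * z / abs(z)     # rescale z to the desired magnitude

# Hypothetical values for one channel, given as (magnitude, phase).
s_prev = cmath.rect(2.0, 0.3)
t = cmath.rect(1.5, -1.0)
t_prime = cmath.rect(0.8, -0.2)

s_new = next_spectrum_value(s_prev, t, t_prime)

# Simplified form: |S[m-1, k] / T[k]|^(-1) * S[m-1, k] * T'[k] / T[k]
s_simplified = (s_prev * t_prime / t) / abs(s_prev / t)
```
</PRE>
Both expressions give a value with the magnitude of T'[k] and the propagated
phase, so they agree up to rounding error.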
<P>
<BR><HR>
<!--Table of Child-Links-->
<A ID="CHILD_LINKS"><STRONG>Subsections</STRONG></A>
<UL>
<LI><A ID="tex2html3188"
HREF="node176.html">Phase relationships between channels</A>
</UL>
<!--End of Table of Child-Links-->
<HR>
<ADDRESS>
Miller Puckette
2006-12-30
</ADDRESS>
</BODY>
</HTML>