<!DOCTYPE html>
<!--Converted with LaTeX2HTML 2002-2-1 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<TITLE>Fourier analysis and reconstruction of audio signals</TITLE>
<META NAME="description" CONTENT="Fourier analysis and reconstruction of audio signals">
<META NAME="keywords" CONTENT="book">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="LaTeX2HTML v2002-2-1">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<LINK REL="STYLESHEET" HREF="book.css">
<LINK REL="next" HREF="node175.html">
<LINK REL="previous" HREF="node171.html">
<LINK REL="up" HREF="node163.html">
<LINK REL="next" HREF="node173.html">
</HEAD>
<BODY >
<!--Navigation Panel-->
<A NAME="tex2html3144"
HREF="node173.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
SRC="next.png"></A>
<A NAME="tex2html3138"
HREF="node163.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
SRC="up.png"></A>
<A NAME="tex2html3132"
HREF="node171.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
SRC="prev.png"></A>
<A NAME="tex2html3140"
HREF="node4.html">
<IMG WIDTH="65" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="contents"
SRC="contents.png"></A>
<A NAME="tex2html3142"
HREF="node201.html">
<IMG WIDTH="43" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="index"
SRC="index.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html3145"
HREF="node173.html">Narrow-band companding</A>
<B> Up:</B> <A NAME="tex2html3139"
HREF="node163.html">Fourier analysis and resynthesis</A>
<B> Previous:</B> <A NAME="tex2html3133"
HREF="node171.html">Fourier analysis of non-periodic</A>
&nbsp; <B> <A NAME="tex2html3141"
HREF="node4.html">Contents</A></B>
&nbsp; <B> <A NAME="tex2html3143"
HREF="node201.html">Index</A></B>
<BR>
<BR>
<!--End of Navigation Panel-->
<H1><A NAME="SECTION001340000000000000000">
Fourier analysis and reconstruction of audio signals</A>
</H1>
<P>
Fourier analysis can sometimes be used to resolve the component sinusoids in an
audio signal. Even when it can't go that far, it can separate a
signal into frequency regions, in the sense that for each <IMG
WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img58.png"
ALT="$k$">, the <IMG
WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img58.png"
ALT="$k$">th point
of the Fourier transform would be affected only by components close to
the nominal frequency <IMG
WIDTH="22" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img1145.png"
ALT="$k\omega$">. This suggests many interesting operations
we could perform on a signal: take its Fourier transform, modify
the result, and then reconstruct a new signal from the
modified transform.
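<P>
As a rough illustration of this separation into frequency regions, the
following sketch transforms two complex sinusoids and reports which point of
the transform is strongest. The helper names <TT>dft</TT> and
<TT>strongest_bin</TT>, the choice of 16 points, and the test frequencies are
assumptions made for the example, not part of the text.
<PRE>
```python
import cmath
import math

def dft(x):
    """N-point Fourier transform: sum over n of x[n] * e^(-i*k*omega*n), omega = 2*pi/N."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def strongest_bin(x):
    """Index k of the transform point with the largest magnitude."""
    spectrum = dft(x)
    return max(range(len(x)), key=lambda k: abs(spectrum[k]))

N = 16
omega = 2 * math.pi / N
# Complex sinusoids at 3.0 and 3.3 times the fundamental frequency omega.
on_bin = [cmath.exp(1j * 3.0 * omega * n) for n in range(N)]
off_bin = [cmath.exp(1j * 3.3 * omega * n) for n in range(N)]

print(strongest_bin(on_bin), strongest_bin(off_bin))
```
</PRE>
Both sinusoids show up most strongly at the point whose nominal frequency is
nearest to their own, although the second one also leaks into neighboring points.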
<P>
<DIV ALIGN="CENTER"><A NAME="fig09.07"></A><A NAME="12565"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.7:</STRONG>
Sliding-window analysis and resynthesis of an audio signal using
Fourier transforms. In this example the signal is filtered by multiplying the
Fourier transform with a desired frequency response.</CAPTION>
<TR><TD><IMG
WIDTH="444" HEIGHT="680" BORDER="0"
SRC="img1146.png"
ALT="\begin{figure}\psfig{file=figs/fig09.07.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>
<P>
Figure <A HREF="#fig09.07">9.7</A> shows how to carry out a Fourier analysis, modification,
and reconstruction of an audio signal. The first step is to divide the
signal into
<A NAME="12569"></A><I>windows</I>,
which are segments of the signal, of <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img3.png"
ALT="$N$"> samples each, usually with some
overlap. Each window is then shaped by multiplying it by a windowing
function (Hann, for example). Then the Fourier transform is calculated for
the <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img3.png"
ALT="$N$"> points <!-- MATH
$k = 0, 1, \ldots, N-1$
-->
<IMG
WIDTH="133" HEIGHT="30" ALIGN="MIDDLE" BORDER="0"
SRC="img1147.png"
ALT="$k = 0, 1, \ldots, N-1$">. (Sometimes it is desirable to
calculate
the Fourier transform for more points than this, but these <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img3.png"
ALT="$N$"> points will
suffice here.)
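<P>
The analysis of a single window might be sketched as follows. This is only a
minimal illustration: the function names are hypothetical, the Hann window is
assumed to be 1/2 - 1/2 cos(2 pi n / N), and the transform is computed directly
from its defining sum rather than by a fast algorithm.
<PRE>
```python
import cmath
import math

def hann(N):
    """Hann window, assumed here to be 1/2 - 1/2 cos(2*pi*n/N) for n = 0 .. N-1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def dft(x):
    """N-point Fourier transform: sum over n of x[n] * e^(-i*k*omega*n), omega = 2*pi/N."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def analyze_window(signal, start, N):
    """Cut N samples beginning at `start`, shape them with the window, and transform."""
    w = hann(N)
    segment = signal[start:start + N]
    shaped = [w[n] * segment[n] for n in range(N)]
    return dft(shaped)

N = 16
# A test signal: a cosine at twice the fundamental frequency of the window.
signal = [math.cos(2 * math.pi * 2 * n / N) for n in range(64)]
spectrum = analyze_window(signal, 0, N)
print([round(abs(value), 3) for value in spectrum[:5]])
```
</PRE>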
<P>
The Fourier analysis gives us a two-dimensional array of complex numbers.
Let <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img25.png"
ALT="$H$"> denote the
<A NAME="12571"></A><I>hop size</I>,
the number of samples each window is advanced past the
previous window. Then for each <!-- MATH
$m = \ldots, 0, 1, \ldots$
-->
<IMG
WIDTH="115" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
SRC="img1148.png"
ALT="$m = \ldots, 0, 1, \ldots$">, the <IMG
WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img111.png"
ALT="$m$">th window
consists of the <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img3.png"
ALT="$N$"> points starting at the point <IMG
WIDTH="32" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img1149.png"
ALT="$mH$">. The <IMG
WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img75.png"
ALT="$n$">th point
of the <IMG
WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img111.png"
ALT="$m$">th window is sample number <IMG
WIDTH="61" HEIGHT="30" ALIGN="MIDDLE" BORDER="0"
SRC="img1150.png"
ALT="$mH+n$"> of the signal. The windowed Fourier transform is thus
equal to:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
S[m, k] = {\cal FT}\{w(n)X[n-mH]\} (k)
\end{displaymath}
-->
<IMG
WIDTH="247" HEIGHT="28" BORDER="0"
SRC="img1151.png"
ALT="\begin{displaymath}
S[m, k] = {\cal FT}\{w(n)X[n-mH]\} (k)
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
This is both a function of time (<IMG
WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img111.png"
ALT="$m$">, in units of <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img25.png"
ALT="$H$"> samples) and of
frequency (<IMG
WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img58.png"
ALT="$k$">, as a multiple of the fundamental frequency <IMG
WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img27.png"
ALT="$\omega $">). Fixing
the frame number <IMG
WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
SRC="img111.png"
ALT="$m$"> and looking
at the windowed Fourier transform as a function of <IMG
WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img58.png"
ALT="$k$">:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
S[k] = S[m, k]
\end{displaymath}
-->
<IMG
WIDTH="98" HEIGHT="28" BORDER="0"
SRC="img1152.png"
ALT="\begin{displaymath}
S[k] = S[m, k]
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
gives us a measure of the momentary spectrum of the signal <IMG
WIDTH="36" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img669.png"
ALT="$X[n]$">. On the other
hand, fixing a frequency <IMG
WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img58.png"
ALT="$k$"> we can look at it as the <IMG
WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img58.png"
ALT="$k$">th channel of an
<IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img3.png"
ALT="$N$">-channel signal:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
C[m] = S[m, k]
\end{displaymath}
-->
<IMG
WIDTH="104" HEIGHT="28" BORDER="0"
SRC="img1153.png"
ALT="\begin{displaymath}
C[m] = S[m, k]
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
From this point of view, the windowed Fourier transform separates the original
signal <IMG
WIDTH="36" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img669.png"
ALT="$X[n]$"> into <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img3.png"
ALT="$N$"> narrow frequency regions, called <I>bands</I>.
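<P>
The two ways of reading the windowed Fourier transform can be made concrete
with a small sketch. It assumes a finite signal, so the frame number m runs
only over windows that fit entirely inside it, and it takes the n-th point of
the m-th window from sample mH+n as described above. The names
<TT>windowed_ft</TT>, <TT>spectrum_at_m</TT> and <TT>channel_k</TT> are
invented for the example.
<PRE>
```python
import cmath
import math

def hann(N):
    """Hann window, assumed here to be 1/2 - 1/2 cos(2*pi*n/N) for n = 0 .. N-1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def dft(x):
    """N-point Fourier transform: sum over n of x[n] * e^(-i*k*omega*n), omega = 2*pi/N."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def windowed_ft(X, N, H):
    """Return the two-dimensional array S[m][k] for every window that fits inside X."""
    w = hann(N)
    frames = (len(X) - N) // H + 1
    return [dft([w[n] * X[m * H + n] for n in range(N)]) for m in range(frames)]

N, H = 16, 4
X = [math.cos(2 * math.pi * 2 * n / N) for n in range(64)]
S = windowed_ft(X, N, H)

spectrum_at_m = S[5]                   # fix m = 5: momentary spectrum, a function of k
channel_k = [frame[2] for frame in S]  # fix k = 2: one channel, a function of m
print(len(S), len(spectrum_at_m), len(channel_k))
```
</PRE>
For this steady test tone the magnitude of channel 2 stays the same from frame
to frame while its phase advances, which is what we would expect of one channel
of a bank of band-pass filters.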
<P>
Having computed the windowed Fourier transform, we next apply any desired
modification. In the figure, the modification is simply to replace the upper
half of the spectrum by zero, which gives a highly selective low-pass filter.
(Two other possible modifications, narrow-band companding and vocoding, are
described in the following sections.)
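<P>
One possible reading of this modification is sketched below. It assumes that
"upper half" refers to frequencies between half the Nyquist frequency and the
Nyquist frequency, and that the mirror-image points N-k are zeroed along with
the points k so that the reconstructed window stays real-valued. The function
name <TT>zero_upper_half</TT> is hypothetical.
<PRE>
```python
def zero_upper_half(spectrum):
    """Zero every point whose frequency lies in the upper half of the range 0..Nyquist.

    Points k and N-k are mirror images for a real signal, so they are
    treated alike; this keeps the reconstructed window real-valued.
    """
    N = len(spectrum)
    cutoff = N // 4  # half of the Nyquist index N/2
    return [0j if min(k, N - k) >= cutoff else value
            for k, value in enumerate(spectrum)]

flat = [1 + 0j] * 16            # a flat test spectrum with N = 16
filtered = zero_upper_half(flat)
print([int(abs(value)) for value in filtered])
```
</PRE>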
<P>
Finally we reconstruct an output signal. To do this we apply the inverse of
the Fourier transform (labeled "iFT" in the figure). As shown in
Section <A HREF="node166.html#sect9-IFT">9.1.2</A> this can be done by taking another Fourier transform,
normalizing, and flipping the result backwards. Since a modified window
need not go smoothly to zero at its two ends, we apply the Hann
windowing function a second time. Having done this for each successive window of
the input, we add the outputs, using the same overlap as for the analysis.
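<P>
The reconstruction step might look like the following sketch. The function
names are invented for the example, the frames are assumed to be stored as a
list of N-point spectra, and only the real part of each reconstructed window
is kept, which presumes the modification has preserved the symmetry of a
real signal's spectrum.
<PRE>
```python
import cmath
import math

def hann(N):
    """Hann window, assumed here to be 1/2 - 1/2 cos(2*pi*n/N) for n = 0 .. N-1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def dft(x):
    """N-point Fourier transform: sum over n of x[n] * e^(-i*k*omega*n), omega = 2*pi/N."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def inverse_ft(S):
    """Inverse transform done as another forward transform, normalized and flipped."""
    N = len(S)
    Y = dft(S)                                  # equals N * x[-n], indices taken mod N
    return [Y[(-n) % N] / N for n in range(N)]

def overlap_add(frames, N, H):
    """Window each reconstructed frame a second time and add it in at offset m*H."""
    w = hann(N)
    out = [0.0] * ((len(frames) - 1) * H + N)
    for m, spectrum in enumerate(frames):
        segment = inverse_ft(spectrum)
        for n in range(N):
            out[m * H + n] += w[n] * segment[n].real
    return out

# Check that the forward transform followed by inverse_ft returns the input.
x = [0.0, 1.0, -2.0, 3.5, 0.25, -1.0, 4.0, 2.0]
roundtrip = [value.real for value in inverse_ft(dft(x))]
print([round(value, 6) for value in roundtrip])
```
</PRE>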
<P>
If we use the Hann window and an overlap of four (that is, choose <IMG
WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img3.png"
ALT="$N$"> a multiple
of four and space each window <IMG
WIDTH="68" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img1154.png"
ALT="$H=N/4$"> samples past the previous one), we can
reconstruct the original signal faithfully by omitting the "modification"
step. This is because the iFT undoes the work of the <IMG
WIDTH="27" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
SRC="img1155.png"
ALT="$FT$">, and so we
are multiplying each window by the Hann function squared. The output is
thus the input, times the Hann window function squared, overlap-added by four.
An easy check shows that this comes to the constant <IMG
WIDTH="27" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
SRC="img1156.png"
ALT="$3/2$">, so the output
equals the input times a constant factor.
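<P>
The "easy check" can also be done numerically. The sketch below assumes the
Hann window 1/2 - 1/2 cos(2 pi n / N) and a window size of 16, and adds up the
squared window at the four offsets that overlap at each output position.
<PRE>
```python
import math

def hann(N):
    """Hann window, assumed here to be 1/2 - 1/2 cos(2*pi*n/N) for n = 0 .. N-1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

N = 16
H = N // 4
w = hann(N)
# At position n within a hop (n = 0 .. H-1), four windows overlap; they
# contribute the squared window at offsets n, n+H, n+2H and n+3H.
sums = [sum(w[n + j * H] ** 2 for j in range(4)) for n in range(H)]
print([round(s, 12) for s in sums])
```
</PRE>
Every entry comes out as 3/2 up to rounding error. The reason is that the
squared window equals 3/8 - 1/2 cos(2 pi n / N) + 1/8 cos(4 pi n / N), and the
two cosine terms cancel when summed over four equally spaced offsets.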
<P>
The ability to reconstruct the input signal exactly (up to a constant gain) is useful because some
types of modification may be done by degrees, and so the output can be made
to vary smoothly between the input and some transformed version of it.
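<P>
One simple way to apply a modification by degrees, offered here only as an
assumed illustration, is to mix the unmodified and modified spectra point by
point before reconstruction. The function <TT>blend</TT> and its parameter
<TT>amount</TT> are hypothetical names.
<PRE>
```python
def blend(original, modified, amount):
    """Mix two spectra point by point; amount = 0 gives original, 1 gives modified."""
    return [(1 - amount) * a + amount * b for a, b in zip(original, modified)]

original = [4 + 0j, 2 - 2j, 0 + 1j, 2 + 2j]
modified = [4 + 0j, 2 - 2j, 0j, 0j]   # for example, some points replaced by zero
halfway = blend(original, modified, 0.5)
print(halfway)
```
</PRE>
Because the inverse transform, windowing, and overlap-add are all linear,
mixing the spectra in this way gives the same output as mixing the two
reconstructed signals.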
<P>
<BR><HR>
<!--Table of Child-Links-->
<A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></A>
<UL>
<LI><A NAME="tex2html3146"
HREF="node173.html">Narrow-band companding</A>
<LI><A NAME="tex2html3147"
HREF="node174.html">Timbre stamping (classical vocoder)</A>
</UL>
<!--End of Table of Child-Links-->
<HR>
<ADDRESS>
Miller Puckette
2006-12-30
</ADDRESS>
</BODY>
</HTML>