<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

<!--Converted with LaTeX2HTML 2002-2-1 (1.71)
original version by: Nikos Drakos, CBLU, University of Leeds
* revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan
* with significant contributions from:
Jens Lippmann, Marek Rouchal, Martin Wilck and others -->
<HTML>
<HEAD>
<TITLE>Fourier analysis and reconstruction of audio signals</TITLE>
<META NAME="description" CONTENT="Fourier analysis and reconstruction of audio signals">
<META NAME="keywords" CONTENT="book">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">

<META NAME="Generator" CONTENT="LaTeX2HTML v2002-2-1">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">

<LINK REL="STYLESHEET" HREF="book.css">

<LINK REL="next" HREF="node175.html">
<LINK REL="previous" HREF="node171.html">
<LINK REL="up" HREF="node163.html">
<LINK REL="next" HREF="node173.html">
</HEAD>

<BODY >
<!--Navigation Panel-->
<A NAME="tex2html3144"
 HREF="node173.html">
<IMG WIDTH="37" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next"
 SRC="file:/usr/local/share/lib/latex2html/icons/next.png"></A>
<A NAME="tex2html3138"
 HREF="node163.html">
<IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up"
 SRC="file:/usr/local/share/lib/latex2html/icons/up.png"></A>
<A NAME="tex2html3132"
 HREF="node171.html">
<IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous"
 SRC="file:/usr/local/share/lib/latex2html/icons/prev.png"></A>
<A NAME="tex2html3140"
 HREF="node4.html">
<IMG WIDTH="65" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="contents"
 SRC="file:/usr/local/share/lib/latex2html/icons/contents.png"></A>
<A NAME="tex2html3142"
 HREF="node201.html">
<IMG WIDTH="43" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="index"
 SRC="file:/usr/local/share/lib/latex2html/icons/index.png"></A>
<BR>
<B> Next:</B> <A NAME="tex2html3145"
 HREF="node173.html">Narrow-band companding</A>
<B> Up:</B> <A NAME="tex2html3139"
 HREF="node163.html">Fourier analysis and resynthesis</A>
<B> Previous:</B> <A NAME="tex2html3133"
 HREF="node171.html">Fourier analysis of non-periodic</A>
<B> <A NAME="tex2html3141"
 HREF="node4.html">Contents</A></B>
<B> <A NAME="tex2html3143"
 HREF="node201.html">Index</A></B>
<BR>
<BR>
<!--End of Navigation Panel-->

<H1><A NAME="SECTION001340000000000000000">
Fourier analysis and reconstruction of audio signals</A>
</H1>

<P>
Fourier analysis can sometimes be used to resolve the component sinusoids in an
audio signal. Even when it can't go that far, it can separate a
signal into frequency regions, in the sense that for each <IMG
 WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img58.png"
 ALT="$k$">, the <IMG
 WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img58.png"
 ALT="$k$">th point
of the Fourier transform is affected only by components close to
the nominal frequency <IMG
 WIDTH="22" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img1145.png"
 ALT="$k\omega$">. This suggests many interesting operations
we could perform on a signal by taking its Fourier transform, modifying
the result, and then reconstructing a new, transformed signal from the
modified transform.

<P>

<DIV ALIGN="CENTER"><A NAME="fig09.07"></A><A NAME="12565"></A>
<TABLE>
<CAPTION ALIGN="BOTTOM"><STRONG>Figure 9.7:</STRONG>
Sliding-window analysis and resynthesis of an audio signal using
Fourier transforms. In this example the signal is filtered by multiplying the
Fourier transform with a desired frequency response.</CAPTION>
<TR><TD><IMG
 WIDTH="444" HEIGHT="680" BORDER="0"
 SRC="img1146.png"
 ALT="\begin{figure}\psfig{file=figs/fig09.07.ps}\end{figure}"></TD></TR>
</TABLE>
</DIV>

<P>
Figure <A HREF="#fig09.07">9.7</A> shows how to carry out a Fourier analysis, modification,
and reconstruction of an audio signal. The first step is to divide the
signal into
<A NAME="12569"></A><I>windows</I>,
which are segments of the signal, of <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img3.png"
 ALT="$N$"> samples each, usually with some
overlap. Each window is then shaped by multiplying it by a windowing
function (Hann, for example). Then the Fourier transform is calculated for
the <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img3.png"
 ALT="$N$"> points <!-- MATH
 $k = 0, 1, \ldots, N-1$
 -->
<IMG
 WIDTH="133" HEIGHT="30" ALIGN="MIDDLE" BORDER="0"
 SRC="img1147.png"
 ALT="$k = 0, 1, \ldots, N-1$">. (Sometimes it is desirable to
calculate
the Fourier transform for more points than this, but these <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img3.png"
 ALT="$N$"> points will
suffice here.)

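<P>
The windowing and transform steps can be sketched in a few lines of Python
with NumPy. This is only an illustration under stated assumptions, not the
patch shown in the figure: the function name <TT>windowed_frames</TT>, the
test sinusoid, and the choice of 16 samples per window with a hop of 4 are all
made up for the example.
<PRE>
```python
import numpy as np

def windowed_frames(x, N, H):
    """Cut x into Hann-shaped windows of N samples, advancing H samples each time."""
    n = np.arange(N)
    w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)   # periodic Hann window
    count = 1 + (len(x) - N) // H               # windows that fit entirely inside x
    starts = H * np.arange(count)
    return np.stack([w * x[s:s + N] for s in starts])

# Hypothetical test input: a sinusoid making 5 cycles per 16 samples.
x = np.cos(2 * np.pi * 5 * np.arange(64) / 16)
frames = windowed_frames(x, N=16, H=4)
S = np.fft.fft(frames, axis=1)                  # N transform points per window
```
</PRE>
Each row of <TT>S</TT> holds the transform of one shaped window; for this
input the largest magnitudes sit at point 5 and its mirror image, point 11.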
<P>
The Fourier analysis gives us a two-dimensional array of complex numbers.
Let <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img25.png"
 ALT="$H$"> denote the
<A NAME="12571"></A><I>hop size</I>,
the number of samples each window is advanced past the
previous window. Then for each <!-- MATH
 $m = \ldots, 0, 1, \ldots$
 -->
<IMG
 WIDTH="115" HEIGHT="29" ALIGN="MIDDLE" BORDER="0"
 SRC="img1148.png"
 ALT="$m = \ldots, 0, 1, \ldots$">, the <IMG
 WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img111.png"
 ALT="$m$">th window
consists of the <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img3.png"
 ALT="$N$"> points starting at the point <IMG
 WIDTH="32" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img1149.png"
 ALT="$mH$">. The <IMG
 WIDTH="13" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img75.png"
 ALT="$n$">th point
of the <IMG
 WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img111.png"
 ALT="$m$">th window is <IMG
 WIDTH="61" HEIGHT="30" ALIGN="MIDDLE" BORDER="0"
 SRC="img1150.png"
 ALT="$mH+n$">. The windowed Fourier transform is thus
equal to:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
S[m, k] = {\cal FT}\{w(n)X[n-mH]\} (k)
\end{displaymath}
 -->

<IMG
 WIDTH="247" HEIGHT="28" BORDER="0"
 SRC="img1151.png"
 ALT="\begin{displaymath}
S[m, k] = {\cal FT}\{w(n)X[n-mH]\} (k)
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
This is a function of both time (<IMG
 WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img111.png"
 ALT="$m$">, in units of <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img25.png"
 ALT="$H$"> samples) and
frequency (<IMG
 WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img58.png"
 ALT="$k$">, as a multiple of the fundamental frequency <IMG
 WIDTH="14" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img27.png"
 ALT="$\omega $">). Fixing
the frame number <IMG
 WIDTH="17" HEIGHT="13" ALIGN="BOTTOM" BORDER="0"
 SRC="img111.png"
 ALT="$m$"> and looking
at the windowed Fourier transform as a function of <IMG
 WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img58.png"
 ALT="$k$">:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
S[k] = S[m, k]
\end{displaymath}
 -->

<IMG
 WIDTH="98" HEIGHT="28" BORDER="0"
 SRC="img1152.png"
 ALT="\begin{displaymath}
S[k] = S[m, k]
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
gives us a measure of the momentary spectrum of the signal <IMG
 WIDTH="36" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img669.png"
 ALT="$X[n]$">. On the other
hand, fixing a frequency <IMG
 WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img58.png"
 ALT="$k$">, we can look at it as the <IMG
 WIDTH="12" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img58.png"
 ALT="$k$">th channel of an
<IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img3.png"
 ALT="$N$">-channel signal:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
 \begin{displaymath}
C[m] = S[m, k]
\end{displaymath}
 -->

<IMG
 WIDTH="104" HEIGHT="28" BORDER="0"
 SRC="img1153.png"
 ALT="\begin{displaymath}
C[m] = S[m, k]
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
From this point of view, the windowed Fourier transform separates the original
signal <IMG
 WIDTH="36" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img669.png"
 ALT="$X[n]$"> into <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img3.png"
 ALT="$N$"> narrow frequency regions, called <I>bands</I>.

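<P>
The two ways of reading the array can be made concrete with a short NumPy
sketch. It follows the indexing given in the prose, in which the
<I>n</I>th point of the <I>m</I>th window is sample <I>mH+n</I>; the function
name <TT>stft</TT>, the window size of 32, the hop of 8, and the test
sinusoid are assumptions chosen for illustration.
<PRE>
```python
import numpy as np

def stft(x, N, H):
    """Return S[m, k]: row m is the N-point transform of the Hann-shaped window starting at m*H."""
    n = np.arange(N)
    w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)   # periodic Hann window
    count = 1 + (len(x) - N) // H               # windows that fit entirely inside x
    return np.array([np.fft.fft(w * x[m * H + n]) for m in range(count)])

N, H = 32, 8
t = np.arange(256)
x = np.cos(2 * np.pi * 4 * t / N)   # hypothetical input centered on band k = 4

S = stft(x, N, H)
spectrum = S[3, :]    # fix the frame m = 3: momentary spectrum over k
channel = S[:, 4]     # fix the band k = 4: one channel as a function of m
```
</PRE>
For this steady sinusoid the channel at band 4 keeps a constant magnitude from
frame to frame, while the spectrum of any single frame peaks at band 4.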
<P>
Having computed the windowed Fourier transform, we next apply any desired
modification. In the figure, the modification is simply to replace the upper
half of the spectrum by zero, which gives a highly selective low-pass filter.
(Two other possible modifications, narrow-band companding and vocoding, are
described in the following sections.)

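<P>
One way to express this modification on a single frame is sketched below. It
rests on an interpretation that should be read as an assumption: "upper half
of the spectrum" is taken to mean every band at or above half the Nyquist
frequency, counting band <I>k</I> and band <I>N-k</I> as the same frequency.
The function name <TT>zero_upper_half</TT> and the two test sinusoids are
invented for the example.
<PRE>
```python
import numpy as np

def zero_upper_half(spectrum):
    """Zero every band at or above half the Nyquist frequency in one N-point transform."""
    N = len(spectrum)
    k = np.arange(N)
    folded = np.minimum(k, N - k)       # bands k and N-k share one frequency
    keep = np.less(4 * folded, N)       # True below one half of the Nyquist band N/2
    return np.where(keep, spectrum, 0)

N = 16
n = np.arange(N)
low = np.cos(2 * np.pi * 2 * n / N)     # band 2: lower half, should pass
high = np.cos(2 * np.pi * 6 * n / N)    # band 6: upper half, should be removed
Y = zero_upper_half(np.fft.fft(low + high))
y = np.fft.ifft(Y).real
```
</PRE>
Here <TT>y</TT> matches <TT>low</TT>, since the component in band 6 has been
removed and the one in band 2 is untouched.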
<P>
Finally we reconstruct an output signal. To do this we apply the inverse of
the Fourier transform (labeled "iFT" in the figure). As shown in
Section <A HREF="node166.html#sect9-IFT">9.1.2</A>, this can be done by taking another Fourier transform,
normalizing, and flipping the result backwards. In case the reconstructed
window does not go smoothly to zero at its two ends, we apply the Hann
windowing function a second time. Doing this to each successive window of
the input, we then add the outputs, using the same overlap as for the analysis.

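<P>
The recipe for the inverse transform (another forward transform, a
normalization, and a reversal) can be checked numerically. The sketch below
assumes NumPy's convention for the forward transform; the function name
<TT>inverse_via_forward</TT> and the random test vector are made up for the
example.
<PRE>
```python
import numpy as np

def inverse_via_forward(spectrum):
    """Inverse transform computed with a forward transform, a 1/N scale, and a reversal."""
    N = len(spectrum)
    forward = np.fft.fft(spectrum) / N      # another Fourier transform, normalized
    flipped = forward[(-np.arange(N)) % N]  # index n reads position -n modulo N
    return flipped

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
x_back = inverse_via_forward(np.fft.fft(x))
```
</PRE>
The reversal leaves point 0 in place and swaps point <I>n</I> with point
<I>N-n</I>; the result agrees with <TT>np.fft.ifft</TT> to rounding error.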
<P>
If we use the Hann window and an overlap of four (that is, choose <IMG
 WIDTH="18" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img3.png"
 ALT="$N$"> a multiple
of four and space each window <IMG
 WIDTH="68" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img1154.png"
 ALT="$H=N/4$"> samples past the previous one), we can
reconstruct the original signal faithfully by omitting the "modification"
step. This is because the iFT undoes the work of the <IMG
 WIDTH="27" HEIGHT="14" ALIGN="BOTTOM" BORDER="0"
 SRC="img1155.png"
 ALT="$FT$">, and so we
are multiplying each window by the Hann function squared. The output is
thus the input, times the Hann window function squared, overlap-added by four.
An easy check shows that this comes to the constant <IMG
 WIDTH="27" HEIGHT="32" ALIGN="MIDDLE" BORDER="0"
 SRC="img1156.png"
 ALT="$3/2$">, so the output
equals the input times a constant factor.

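<P>
The "easy check" can also be run numerically. The following sketch, which
assumes a periodic Hann window of 64 samples and eight windows of random test
input (both arbitrary choices), overlap-adds the squared window at a hop of
<I>N/4</I> and then pushes a signal through analysis and resynthesis with no
modification step.
<PRE>
```python
import numpy as np

N = 64
H = N // 4                                   # overlap of four
n = np.arange(N)
w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)    # periodic Hann window

# Overlap-add the squared window at hops of H.
total = np.zeros(N + 7 * H)
for m in range(8):
    total[m * H : m * H + N] += w ** 2
steady = total[N - H : 8 * H]                # samples covered by four windows

# Analysis and resynthesis with the modification step omitted.
x = np.random.default_rng(1).standard_normal(N + 7 * H)
y = np.zeros_like(x)
for m in range(8):
    seg = slice(m * H, m * H + N)
    frame = np.fft.ifft(np.fft.fft(w * x[seg])).real
    y[seg] += w * frame                      # Hann applied a second time
```
</PRE>
Away from the two ends, where fewer than four windows overlap,
<TT>steady</TT> is 3/2 everywhere and <TT>y</TT> is 3/2 times <TT>x</TT>.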
<P>
The ability to reconstruct the input signal exactly (up to a constant factor)
is useful because some
types of modification may be done by degrees, and so the output can be made
to vary smoothly between the input and some transformed version of it.

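<P>
As one hypothetical illustration of a modification done by degrees, the
low-pass filter of the figure could be given a control, here called
<TT>amount</TT>, that scales the upper-half bands instead of zeroing them
outright. The function name, the parameter, and the reading of "upper half"
as the bands at or above half the Nyquist frequency are all assumptions made
for this sketch.
<PRE>
```python
import numpy as np

def partial_lowpass(spectrum, amount):
    """Scale the upper-half bands by (1 - amount): 0 leaves the frame alone, 1 removes them."""
    N = len(spectrum)
    k = np.arange(N)
    folded = np.minimum(k, N - k)                   # bands k and N-k share one frequency
    upper = np.logical_not(np.less(4 * folded, N))  # bands at or above half of Nyquist
    gain = np.where(upper, 1.0 - amount, 1.0)
    return gain * spectrum

X = np.fft.fft(np.random.default_rng(2).standard_normal(16))
untouched = partial_lowpass(X, 0.0)   # identical to the input frame
halfway = partial_lowpass(X, 0.5)     # upper bands at half strength
filtered = partial_lowpass(X, 1.0)    # upper bands removed entirely
```
</PRE>
Sweeping <TT>amount</TT> from 0 to 1 moves the resynthesized output smoothly
from the unmodified input to the fully filtered version.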
<P>
<BR><HR>
<!--Table of Child-Links-->
<A NAME="CHILD_LINKS"><STRONG>Subsections</STRONG></A>

<UL>
<LI><A NAME="tex2html3146"
 HREF="node173.html">Narrow-band companding</A>
<LI><A NAME="tex2html3147"
 HREF="node174.html">Timbre stamping (classical vocoder)</A>
</UL>
<!--End of Table of Child-Links-->
<HR>
<ADDRESS>
Miller Puckette
2006-12-30
</ADDRESS>
</BODY>
</HTML>