Using PyWavelets to Remove High Frequency Noise

I ran across an interesting blog post from 2012 that described how to use the PyWavelets module to remove noise from signals. I had been looking for a technique for smoothing signals without smoothing over peaks and sharp shifts, and I had completely forgotten about using wavelets.

Wavelets can be used to decompose a signal into a series of coefficients. The first coefficients represent the lowest frequencies, and the last coefficients represent the highest frequencies. By removing the higher frequency coefficients and then reconstructing the signal with the truncated coefficients, we can smooth the signal without smoothing over all of the interesting peaks the way we would with a moving average.

To do this you’ll need to install NumPy and Seaborn, but those are pretty straightforward. The pywt module is installed by pip-installing PyWavelets.

$ sudo pip install PyWavelets

Next, we import some stuff. If you are using an IPython notebook, then you want to include %pylab inline in the first line of your notebook. Also, the seaborn module isn’t necessary for plotting, I just like the plots it produces.

Next, we define a waveletSmooth() function. The wavelet argument determines the type of wavelet, more wavelet types can be found here. I’ve specified the "db4" wavelet as the default, but the PyWavelets module supports over seventy different types of wavelets. Below is a list of possible wavelet parameters,

  • The Haar wavelet, "haar", produces a square signal
  • the “Discrete” Meyer wavelet, "dmey"
  • The Daubechies family of wavelets, "db1" through "db20"
  • The Symlets family, "sym2" through "sym20"
  • The Coiflet family, "coif1" through "coif5"
  • The Biorthognal and Reverse Biorthogonal families
    • "bior1.1", "rbio1.1"
    • "bior1.3", "rbio1.3"
    • "bior1.5", "rbio1.5"
    • "bior2.2", "rbio2.2"
    • "bior2.4", "rbio2.4"
    • "bior2.6", "rbio2.6"
    • "bior2.8", "rbio2.8"
    • "bior3.1", "rbio3.1"
    • "bior3.3", "rbio3.3"
    • "bior3.5", "rbio3.5"
    • "bior3.7", "rbio3.7"
    • "bior3.9", "rbio3.9"
    • "bior4.4", "rbio4.4"
    • "bior5.5", "rbio5.5"
    • "bior6.8", "rbio6.8"

The level argument determines the level of smoothing, but it depends on the length of the signal that you are smoothing, so start with 1, and then move up to 2, etc, until you get an error. (Scientific, I know.)

%inline pylab # <-- add this if you're in an IPython notebook
import pywt
import numpy as np
import seaborn
from statsmodels.robust import mad

def waveletSmooth( x, wavelet="db4", level=1, title=None ):
    # calculate the wavelet coefficients
    coeff = pywt.wavedec( x, wavelet, mode="per" )
    # calculate a threshold
    sigma = mad( coeff[-level] )
    # changing this threshold also changes the behavior,
    # but I have not played with this very much
    uthresh = sigma * np.sqrt( 2*np.log( len( x ) ) )
    coeff[1:] = ( pywt.threshold( i, value=uthresh, mode="soft" ) for i in coeff[1:] )
    # reconstruct the signal using the thresholded coefficients
    y = pywt.waverec( coeff, wavelet, mode="per" )
    f, ax = plt.subplots()
    plot( x, color="b", alpha=0.5 )
    plot( y, color="b" )
    if title:
        ax.set_title(title)
    ax.set_xlim((0,len(y)))

I used this code to compare wavelet smoothing parameters on several signals shown below,

We can see several features emerge from the noise that we can use to further process the signals.