28 February 2009

Some thoughts on digital wow/flutter demodulation and analysis

Background

Accurate analysis of the speed stability of turntables requires a wow/flutter spectrum: a frequency analysis of speed deviation. This is performed by treating a recorded sine-wave test tone as a carrier modulated by the speed deviation. Perform a frequency demodulation of the tone and you have your speed deviation; run an amplitude spectrum on that deviation and you have your wow/flutter spectrum, or apply the IEC wow/flutter weighting filter and obtain a peak wow/flutter measurement to compare against rated specs. However, peak wow/flutter measurements tend to be vanishingly low for virtually all turntables nowadays and the spectrum is generally considered far more useful. Ladegaard's classic app note on vibration analysis contains good explanations of the importance of wow/flutter spectra when analyzing turntable sound fidelity, and it is quite unfortunate that their measurement has remained generally out of the reach of the lay audiophile.

Historically this measurement has required expensive ($10,000 or more!) lab hardware. Nowadays there are a few ways to do this digitally, but the most commonly encountered ones are rather poor. The crudest method is simply to look at a spectrogram of the tone: this gives the rough outline of the modulation, but is generally horribly CPU-intensive and has a time and frequency resolution so low as to be meaningless. A more useful method is to look at the high-resolution spectrum of the wow/flutter tone; this is what Stereophile typically uses in its turntable tests. This is a trivial operation in any audio package, although some care is necessary to set block lengths, window parameters and overlap to maximize dynamic range. Alternatively, the signal may be mixed (multiplied) with a synthesized sine wave with a frequency matching the average frequency of the test tone. This shifts the test tone at F into mirror tones at 0 and 2F; the latter can be lowpassed out and the former can be run through an amplitude spectrum plot to obtain a baseband spectrum. This might yield improved performance by folding sidebands together to reduce noise, but any error in the synthesis frequency will cause the sidebands to not line up exactly, leading to a sort of "double vision" in the spectrum plot, where every peak appears twice.

All of these methods have major issues specific to their implementation, but by far, their biggest weakness is that they do not demodulate the signal. Accurate FM demodulation serves many extremely useful roles that simpler FFT-based schemes cannot:
  • It yields an accurate time-domain waveform of the instantaneous frequency. This waveform is absolutely necessary for computing IEC weighted wow/flutter figures.
  • It removes the many sidebands present in FM signals and integrates them into a more accurate result. Sidebands from low frequency wow may overlap the carrier band belonging to higher frequency flutter in an FFT-based scheme, ultimately leading to a loss of precision in the final spectrum.
  • Usually, it is very insensitive to amplitude modulation. All FFT-based schemes will flatten both the AM and FM signals into a single spectrum, which may be a major inaccuracy. In some cases, the FM demodulation can also output the AM signal along with the FM one.
  • Because an exact lock is made at the instantaneous frequency, frequency drift does not affect the precision of the result. Drift is an unavoidable problem for FFT-based schemes.

Analog and digital FM demodulation is based on a variety of schemes. I implemented quadrature demodulation: a direct application of the Hilbert transform. This allows computation of the instantaneous frequency on a sample-by-sample basis, at effectively unbounded frequency resolution. What's more, quadrature demodulation separates the amplitude modulation signal from the FM signal, and once the FM is computed, the AM signal is effectively free. Also, in comparison to PLL schemes, there are very few tweakable parameters to a quadrature demodulator: set up the filter lengths, set up an input conditioning filter, and go to town. It is one of the most computationally intensive demodulation schemes, but with some light optimization it still runs faster than real time. I got the exact frequency demodulation equation from the comp.dsp archives; it is a corrected version of an equation in Frerking:


I(n)*Q'(n) - Q(n)*I'(n)
----------------------- = omega (F)
I^2(n) + Q^2(n)


Where I(n) is the original signal, Q(n) is the Hilbert transform of I(n), and I'(n) and Q'(n) are their respective bandlimited derivatives. The formula for amplitude demodulation is:


A = sqrt(I(n)^2 + Q(n)^2)


For a pure single-tone signal, these equations are exact. However, real recordings are noisy, so errors creep into the demodulation, most prominently through clicks and pops in the recording.
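As a rough illustration of the two equations above (not the LabVIEW implementation), here is a minimal Python/NumPy sketch. It assumes the whole recording is processed as one periodic block, so both the Hilbert transform and the bandlimited derivative are done with a single FFT rather than with FIR filters; the function names and the synthetic test tone are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Return I + jQ, where Q is the Hilbert transform of x, via the FFT."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0                        # keep DC as is
    if n % 2 == 0:
        gain[n // 2] = 1.0               # keep the Nyquist bin as is
        gain[1:n // 2] = 2.0             # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)   # negative frequencies zeroed

def spectral_derivative(x):
    """Bandlimited derivative (per sample), assuming x is periodic in the block."""
    omega = 2.0 * np.pi * np.fft.fftfreq(len(x))    # radians per sample
    return np.fft.ifft(1j * omega * np.fft.fft(x)).real

def quadrature_demodulate(x, sample_rate):
    """Return (instantaneous frequency in Hz, envelope amplitude)."""
    z = analytic_signal(np.asarray(x, dtype=float))
    i, q = z.real, z.imag
    di, dq = spectral_derivative(i), spectral_derivative(q)
    power = i**2 + q**2
    omega = (i * dq - q * di) / power               # radians per sample
    return omega * sample_rate / (2.0 * np.pi), np.sqrt(power)

# Synthetic check: a 3150 Hz tone with 10 Hz peak deviation at a 4 Hz wow rate.
fs = 44100
t = np.arange(fs) / fs
f0, dev, fm = 3150.0, 10.0, 4.0
tone = 0.5 * np.cos(2 * np.pi * f0 * t + (dev / fm) * np.sin(2 * np.pi * fm * t))
freq_hz, envelope = quadrature_demodulate(tone, fs)
expected_hz = f0 + dev * np.cos(2 * np.pi * fm * t)
```

On this clean synthetic tone the recovered frequency tracks `expected_hz` closely and the envelope stays at the tone amplitude. A real recording is not periodic in the block, so the edges would need trimming or the FIR approach described in the Implementation section.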

Implementation

I reused my existing RIAA filter equations to apply reverse RIAA if needed - my thinking here is that when recording a wow/flutter tone, the non-flat filter response around the tone frequency will lead to inaccuracies in the flutter spectrum. It should also yield a smaller AM signal amplitude. The filter has Orban's hardcoded coefficients for 44100hz and 96000hz sampling rates. For other sample rates the bilinear transform filter is used, which is notoriously inaccurate at low sample rates.

Input conditioning is currently handled with a Kaiser FIR bandpass, centered on the tone frequency, with beta=5 and (IIRC) 2047 taps. Obviously, the filter must be linear-phase. The subjective cleanliness of the demodulated output waveforms, as well as the quality of Audacity's declick operations on the output files, demands that this filter exist and that it have relatively gentle transition bands. But the demodulation operation itself does not appear that sensitive to wideband noise, so the filter may be optional for a spectrum analysis. The stopband rejection and the linearity of the demodulation and further analysis are high enough that post-demodulation filtering is unnecessary.
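For readers who want to try something similar, a Kaiser-windowed bandpass of this kind can be sketched in Python with NumPy. This is a generic windowed-sinc design, not the filter actually used here; the 44100hz sample rate and the helper names are assumptions, and the 700hz bandwidth is the figure quoted in the Results section.

```python
import numpy as np

def kaiser_bandpass(num_taps, low_hz, high_hz, sample_rate, beta=5.0):
    """Linear-phase windowed-sinc bandpass; num_taps should be odd."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0   # centered tap index

    def ideal_lowpass(cutoff_hz):
        fc = cutoff_hz / sample_rate                 # cycles per sample
        return 2.0 * fc * np.sinc(2.0 * fc * n)

    # Bandpass = difference of two lowpasses, shaped by a Kaiser window.
    return (ideal_lowpass(high_hz) - ideal_lowpass(low_hz)) * np.kaiser(num_taps, beta)

def gain_at(taps, freq_hz, sample_rate):
    """Magnitude response of the FIR taps at a single frequency."""
    k = np.arange(len(taps))
    return abs(np.sum(taps * np.exp(-2j * np.pi * freq_hz / sample_rate * k)))

fs = 44100            # assumed sample rate
tone_hz = 3150.0
taps = kaiser_bandpass(2047, tone_hz - 350.0, tone_hz + 350.0, fs)
passband_gain = gain_at(taps, tone_hz, fs)
stopband_gain = gain_at(taps, 1000.0, fs)
```

Because the taps are symmetric about the center, the filter is linear-phase with a delay of (2047 - 1) / 2 = 1023 samples, which has to be accounted for when lining up any other signal path.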

Some care must be taken to keep the time delays of all the various data paths the same. Most importantly, the simple first-difference derivative filter x[1]-x[0] will not work because its effective time delay is 0.5 samples. A FIR filter must be constructed with an integer delay, along with delay lines to keep I(n) and Q(n) delayed by the same amount as I'(n) and Q'(n).
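The delay problem is easy to see on a toy signal. The sketch below uses x[n] = n^2, whose true derivative is 2n, and compares the first difference against a three-tap central difference. The central difference is only a stand-in for a longer hand-built derivative filter, chosen because its delay is exactly one sample; all names are hypothetical.

```python
import numpy as np

def causal_filter(x, taps):
    """Apply FIR taps causally and return the first len(x) output samples."""
    return np.convolve(x, taps)[:len(x)]

first_difference = np.array([1.0, -1.0])          # delay of 0.5 samples
central_difference = np.array([0.5, 0.0, -0.5])   # delay of exactly 1 sample
delay_one = np.array([0.0, 1.0])                  # matching 1-sample delay line

n = np.arange(16, dtype=float)
x = n**2                                          # true derivative is 2n

d_half = causal_filter(x, first_difference)       # equals 2(n - 0.5)
d_whole = causal_filter(x, central_difference)    # equals 2(n - 1)
x_delayed = causal_filter(x, delay_one)           # equals (n - 1)^2
```

After the start-up samples, `d_whole` is the exact derivative of `x_delayed`, so the two paths line up sample for sample. `d_half` is the derivative at a point halfway between samples, which no integer delay line can match.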

The Hilbert FIR filter was constructed by crafting an impulse signal and applying a Hilbert transform to it; the derivative FIR filter was hand constructed. I'm using FFT-based convolution, so the number of taps available for the Hilbert and derivative filters is scandalously large. I use 262143-tap filters and the whole thing runs 3x realtime. At those sizes, windowing the filters is unnecessary.
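The impulse trick for the Hilbert filter can be sketched in a few lines of Python with NumPy. This is an assumption about how such a filter might be generated, using an FFT-based Hilbert transform on an even-length block; the function name and the short 1024-point length are hypothetical and chosen only to keep the example small.

```python
import numpy as np

def hilbert_fir_from_impulse(num_points):
    """Hilbert-transform a centered unit impulse via the FFT (num_points even)."""
    impulse = np.zeros(num_points)
    center = num_points // 2
    impulse[center] = 1.0
    # Ideal Hilbert response: -j for positive and +j for negative frequencies.
    response = -1j * np.sign(np.fft.fftfreq(num_points))
    response[num_points // 2] = 0.0               # zero the Nyquist bin (DC is already 0)
    taps = np.fft.ifft(np.fft.fft(impulse) * response).real
    return taps, center

taps, center = hilbert_fir_from_impulse(1024)
```

The resulting taps are antisymmetric about `center`, zero at even offsets, and close to 2/(pi*k) at small odd offsets k, which is the familiar shape of a Hilbert transformer. The filter delay is `center` samples, so the I(n) path needs a delay line of the same length.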

Stereo recordings are summed to mono, and there is some control over the L/R balance to ensure an optimum signal. FM signal outputs are in units of the specified frequency: If the test tone is set to be 3150hz in the demodulator, then 0dbFS = 3150hz, -20dbFS = 10% and so on. The AM signal is in PCM amplitude units but makes no attempt to reverse the RIAA eq if applied in the demodulator.
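To make the FM output scaling concrete, here is a small Python sketch of the conversion between a frequency deviation, its level relative to the tone frequency, and a percentage. The function names are hypothetical; only the scaling rule (full scale equals the tone frequency) comes from the description above.

```python
import math

def deviation_to_dbfs(deviation_hz, tone_hz):
    """Level of a frequency deviation when 0 dBFS corresponds to tone_hz."""
    return 20.0 * math.log10(abs(deviation_hz) / tone_hz)

def dbfs_to_percent(level_dbfs):
    """Speed deviation in percent for a given FM output level."""
    return 100.0 * 10.0 ** (level_dbfs / 20.0)

level = deviation_to_dbfs(315.0, 3150.0)   # a 315hz deviation on a 3150hz tone
percent = dbfs_to_percent(level)           # the same deviation as a percentage
```

By the same rule, a component at -60dbFS in an FM spectrum plot corresponds to a 0.1% speed deviation.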

Signals had DC removal done in Audacity. If there were extremely strong clicks they were usually removed manually.

Normalization of AM signals (to a fraction of the carrier amplitude) is not yet done. This means that while the AM results may be shown in the same chart, they are not directly comparable against one another. They are nevertheless provided for comparison against the FM plots.

Spectrum analysis was done in my usual handrolled code. Windowing is Hamming, to maximize frequency resolution and because dynamic range requirements are not great; each FFT window overlaps the previous one by 75%.
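A comparable averaged spectrum can be sketched in Python with NumPy as follows. This is not the handrolled code mentioned above: averaging magnitudes (rather than powers) and scaling by the window sum are assumptions, as are the function name and the 1000hz rate of the synthetic deviation signal.

```python
import numpy as np

def averaged_amplitude_spectrum(x, sample_rate, window_len, overlap=0.75):
    """Average Hamming-windowed FFT magnitudes over overlapping frames."""
    window = np.hamming(window_len)
    hop = max(1, int(round(window_len * (1.0 - overlap))))
    scale = 2.0 / window.sum()          # a sine of amplitude A reads about A
    frames = []
    for start in range(0, len(x) - window_len + 1, hop):
        frame = x[start:start + window_len] * window
        frames.append(np.abs(np.fft.rfft(frame)) * scale)
    freqs = np.fft.rfftfreq(window_len, d=1.0 / sample_rate)
    return freqs, np.mean(frames, axis=0)

# Synthetic deviation signal: a 10hz flutter component of amplitude 0.02.
rate = 1000
t = np.arange(4 * rate) / rate
deviation = 0.02 * np.sin(2 * np.pi * 10.0 * t)
freqs, spectrum = averaged_amplitude_spectrum(deviation, rate, window_len=1000)
peak_index = int(np.argmax(spectrum))
```

With a one-second window the bins are 1hz apart, so the peak lands on the 10hz bin with a height close to 0.02. Longer windows trade the number of averages for finer frequency resolution.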

IEC wow/flutter measurements are not yet implemented. IEC drift measurements are not yet implemented, although even if they were, they allegedly require a 30s averaging time which might be problematic when dealing with 60s wow/flutter tracks.

I implemented this in LabVIEW 8.21. (Full disclosure: I am a former employee of NI and I own shares.)

Samples

  • AT-LP2D-USB recording of the Ultimate LP wow/flutter, supplied by knowzy.com
  • My recording of the STR-151 3khz tone - Technics SL-1200, Audio Technica AT-OC9MLII. Recorded with flat eq on Yamaha GO46.
  • A recording of a HFS75 wow/flutter track from a currently-unnamed contributor. Unfortunately this track is only ~10 seconds long, so the wow results in particular will have a lower resolution. The noise levels seem to be much higher too.

Results

Assuming a carrier frequency of 3.15khz, a peak speed deviation of 2% and a peak modulation frequency of 250hz, Carson's Rule suggests a signal bandwidth of 626hz. I therefore bandpassed the input signal to a 700hz bandwidth before demodulating. (A discussion and analysis of higher frequency flutter is well beyond the scope of this paper.)
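The Carson's Rule arithmetic can be checked with a few lines of Python, reading the peak modulation frequency as 250hz (the value that reproduces the quoted 626hz). The function name is hypothetical.

```python
def carson_bandwidth(carrier_hz, peak_deviation_fraction, max_modulation_hz):
    """Carson's Rule: bandwidth = 2 * (peak deviation + highest modulation frequency)."""
    peak_deviation_hz = carrier_hz * peak_deviation_fraction
    return 2.0 * (peak_deviation_hz + max_modulation_hz)

# 2% of 3150hz is 63hz of peak deviation; add 250hz and double it.
bandwidth_hz = carson_bandwidth(3150.0, 0.02, 250.0)
```

The result is about 626hz, so a 700hz bandpass leaves a small margin on either side of the tone.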

Spectrum analysis is done at window lengths of 28.8 and 1.8 seconds. At 28.8s, the record warp peaks are extremely visible while the underlying spectrum is easily seen. At 1.8s, many more windows are available to average together and the warp harmonics completely merge, resulting in a clearer, more accurate plot. The 28.8s plots go from 0hz to 20hz on a linear scale while the 1.8s plots progress logarithmically from 1hz to 250hz. Plot smoothing begins at 100hz for the 1.8s plots.

For the following spectrum plots, the AT-LP2D results are in blue, my results are in red, and the HFS75's in green. The AT-LP2D had an average speed deviation of 0.167% relative to 3.15khz; the Technics deviated by 0.42% against 3khz, and the HFS75 recording deviated by -2.2% against 3khz. I don't have a good explanation for the latter, except perhaps that the HFS75 track might be something like 2950hz instead of 3000hz. Also, the test records for the latter two tables are considerably older than the first, and this might reflect the quality of the lathes rather than the quality of the turntable.

LP2D demodulated waveforms: black is FM, red is AM


Technics demodulated waveforms: black is FM, red is AM



HFS75 demodulated waveforms: black is FM, red is AM


Frequency modulation spectrum, 0-20hz


Frequency modulation spectrum, 1-250hz



Amplitude modulation spectrum, 0-20hz



Amplitude modulation spectrum, 1-250hz


Discussion

Analysis of the process:
  • In general, because different test records were used for each recording, the turntables are not entirely comparable based on these measurements except in situations where a clear interpretation can be made. They are plotted on the same charts primarily for easier viewing.
  • Demodulation is extremely sensitive to input noise - the HFS75 recording was substantially noisier and that really did compromise all its plots. Its modulation spectrum above 50hz simply cannot be trusted. In general, maximizing the SNR of the tone will maximize the SNR of the flutter spectrum. Using a declick utility may improve performance.
  • The averaged flutter spectra required roughly 30s of input signal to mostly average out, so at least this length is necessary for a good plot. The HFS75 track is 10 seconds long which is way too short. Testing suggests that enabling/disabling inverse RIAA eq does not significantly change the AM or FM waveforms.
  • Testing at higher bandwidths does not show significant flutter beyond 200hz. The noise levels are markedly higher starting at 500hz, and background noise tones in the recording (such as a 1khz sine and harmonics on the LP2D) show up spuriously in the modulation spectra.
  • Compare to Ladegaard's B&K plots - is my noise floor higher or lower? I can't really tell at the moment.

Analysis of plots:
  • The LP2D plots show clear peaks at 18hz, 40-50hz and 55hz, with 20-30db of SNR. Warp harmonics can be very clearly separated from the background noise floor. These are good signs: the flutter analysis is showing genuinely useful information.
  • The Technics AM flutter plot rises remarkably above 200hz, while the other tables do not. This is potentially remarkable because the speed control loop is supposed to operate somewhere around this frequency. However, it does not show up on the Technics FM plot, which is a strike against this being control-related.
  • While I am not automatically normalizing the AM results yet, I can say that the average AM DC value for the LP2D was 0.20955, which means the signal has a peak-to-peak fluctuation of roughly 3.6%. This is astonishingly high. The Technics AM peak-to-peak value is roughly 1.6%, which is also very high. There are no really good explanations for this yet.
  • There are other puzzling things about the AM spectra. Besides the rising response on the Technics plot and the large magnitude, there is a hump in the AM response of the LP2D at 140hz, several peaks on the FM flutter plot that do not exist in AM plots, and the FM wow plot indicates very many tones that do not exist on the AM plot.

2 comments:

Unknown said...

Very interesting ideas, tests and analysis. I'm surprised more people and/or software vendors have not included something like this in their tool offerings. Do you have any interest in sharing the LabView file(s)?

JC said...

RT, I'm curious if the Feickhart software's wow & flutter analysis incorporates something as exhaustive as your technique discussed here?