Clipping clamps the signal to a constant value, and it tends to occur right in the middle of high-power signal content. If you take the derivative of the signal, the DC component of the clipped region is effectively eliminated, bringing those values to 0, while the values before and after the clipping are relatively unmodified. (Specifically, low frequencies are attenuated and high frequencies are boosted.) So naively, one could use this derivative as the basis for a clipping detector: compare y' to y, and if y' is zero or very close to 0 while y is at a high level, you may have clipping. This technique would be immune to attenuated clipping: if the clipping occurs at -10 dB it should work as well as if it occurs at 0 dB.
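To make the naive detector concrete, here is a minimal sketch in plain Python. The function names, the `level_ratio` and `eps_ratio` parameters, and their default values are my own placeholders, not tuned numbers. Both thresholds are taken relative to the peak of the signal, which is what makes the check indifferent to overall attenuation.

```python
import math

def first_difference(y):
    """Discrete stand-in for the derivative: d[n] = y[n+1] - y[n]."""
    return [b - a for a, b in zip(y, y[1:])]

def flat_top_candidates(y, level_ratio=0.5, eps_ratio=1e-4):
    """Indices n where y' is (nearly) zero while |y[n]| is high.

    level_ratio and eps_ratio are illustrative assumptions, both
    expressed as fractions of the peak absolute sample value.
    """
    peak = max(abs(v) for v in y)
    if peak == 0:
        return []
    d = first_difference(y)
    return [
        n for n, dv in enumerate(d)
        if abs(dv) <= eps_ratio * peak and abs(y[n]) >= level_ratio * peak
    ]

# Demo: a 1 kHz sine sampled at 44.1 kHz, hard-clipped at 0.8,
# and then the same clipped signal attenuated by 10 dB.
sine = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(441)]
clipped = [max(-0.8, min(0.8, v)) for v in sine]
quiet = [v * 10 ** (-10 / 20) for v in clipped]

for name, sig in (("sine", sine), ("clipped", clipped), ("quiet", quiet)):
    print(name, len(flat_top_candidates(sig)))
```

One caveat with this sketch: the derivative of any waveform also passes through zero at its natural peaks, so a real detector would probably want to require a run of several consecutive near-zero samples rather than a single one.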
However, this approach fails when gross frequency response distortions are introduced, such as those that exist on vinyl. As discussed earlier, there are examples of vinyl clipping which are sloped, not flat. The derivative of a sloped line is a constant nonzero value. The workaround for this is simple: take another derivative, the second derivative, so that this constant nonzero value collapses to 0. In theory this could be extended to an arbitrary number of derivatives, but because each derivative amplifies high frequencies, background noise tends to dominate the response after the 2nd derivative, so the 3rd and beyond are pretty useless for vinyl analysis.
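A small sketch of both points, again in plain Python with names of my own choosing. `nth_difference` just repeats the first difference, and `white_noise_gain` sums the squared coefficients of the resulting filter, which is the factor by which the power of ideal white noise would grow. Real vinyl surface noise is not white, so treat that number as a rough indication only.

```python
def nth_difference(y, order):
    """Apply the first difference y[n+1] - y[n] `order` times."""
    for _ in range(order):
        y = [b - a for a, b in zip(y, y[1:])]
    return y

def white_noise_gain(order):
    """Power gain of the order-th difference for ideal white noise.

    Computed as the sum of squared filter coefficients, which are
    read off by differencing a unit impulse.
    """
    impulse = [0] * order + [1] + [0] * order
    return sum(c * c for c in nth_difference(impulse, order))

# A sloped "clipped" segment: the first difference is a nonzero
# constant, and the second difference collapses to zero.
ramp = [100 + 3 * n for n in range(8)]
print(nth_difference(ramp, 1))
print(nth_difference(ramp, 2))

# How fast the noise grows with each extra derivative.
for order in (1, 2, 3):
    print(order, white_noise_gain(order))
```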
What I'm ultimately hoping for is a final output in the form of a histogram, with a threshold applied to it to estimate how many clipped samples exist in the signal. Because a histogram does not depend on sample positions, this allows comparisons between signals that are not sample-aligned (as is usually the case with vinyl vs. CD comparisons).
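Roughly what I have in mind, as a sketch only. The binning (6 dB wide bins of |y''| relative to the peak), the -60 dB threshold, the -120 dB floor and the `level_ratio` gate are all assumptions I picked for illustration, and the function names are made up.

```python
import math
from collections import Counter

def second_difference(y):
    """d2[k] = y[k+2] - 2*y[k+1] + y[k], centred on sample k+1."""
    return [c - 2 * b + a for a, b, c in zip(y, y[1:], y[2:])]

def curvature_histogram(y, bin_db=6.0, floor_db=-120.0, level_ratio=0.5):
    """Histogram of |y''| in dB relative to peak |y|, loud samples only.

    Keys are the lower edge of each bin in dB; values are sample counts.
    """
    hist = Counter()
    peak = max(abs(v) for v in y)
    if peak == 0:
        return hist
    for n, c in enumerate(second_difference(y), start=1):
        if abs(y[n]) < level_ratio * peak:
            continue  # quiet samples are not clipping candidates
        ratio = abs(c) / peak
        db = floor_db if ratio == 0 else max(floor_db, 20 * math.log10(ratio))
        hist[math.floor(db / bin_db) * bin_db] += 1
    return hist

def estimate_clipped(hist, threshold_db=-60.0):
    """Count the samples that fall in bins below the threshold."""
    return sum(count for edge, count in hist.items() if edge < threshold_db)

# Demo: the same 1 kHz test tone, unclipped and hard-clipped at 0.8.
sine = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(441)]
clipped = [max(-0.8, min(0.8, v)) for v in sine]
print(estimate_clipped(curvature_histogram(sine)))
print(estimate_clipped(curvature_histogram(clipped)))
```

Since only the counts matter, two rips of the same track can be compared by their estimates (or by the whole histograms) without lining them up first.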
Here's what I have so far. First, some clipped stuff from Leyendecker again, on CD:
data:image/s3,"s3://crabby-images/e045b/e045baae5d7e6c040f6314807e5d16f263b37e2c" alt=""
Now for the LP version, different part of the file:
data:image/s3,"s3://crabby-images/13579/13579b78fccce47b45a593bb4d87f8c4a23580f2" alt=""
But it's a start.