After going through the steps of the subspace detection algorithm in the previous blog post, I read a paper on the empirical subspace detection algorithm, which can be found
here.
The idea of the empirical subspace detection algorithm is this: once you have the design set (a matrix of repeating waveforms aligned on the P wave), you don't need to compute the SVD to find an orthogonal basis, as the original subspace detection algorithm does. Instead, you compute the stack of the waveforms in the design set and the time derivative of that stack. The author of the paper found that the stacked waveform is similar to the first basis vector from the SVD, and that the derivative of the stacked waveform is similar to the second basis vector. The first result is easy to understand: the first basis vector captures the features common to all the waveforms, which is why it mimics the stack. Why the second basis vector resembles the derivative of the stacked signal is harder to see. The author claims that the second singular vector carries information about the variations produced by slight offsets in the locations of the design-set earthquakes.
This is the empirical part of the algorithm: use the stacked waveform and its time derivative to stand in for the first two basis vectors from the SVD, so there is no need to compute the SVD in practice.
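To make the idea concrete, here is a minimal sketch of how the two empirical basis vectors could be built with NumPy. The function name `empirical_basis`, the `dt` parameter, and the Gaussian test pulses are my own choices for illustration and are not taken from the paper. The sketch assumes the design set is stored with one event per row.

```python
import numpy as np

def empirical_basis(design, dt=1.0):
    """Return the stack and its time derivative as two unit-norm rows.

    design : 2-D array, shape (n_events, n_samples), already aligned on P.
    dt     : sample spacing, used only to scale the derivative.
    (Hypothetical helper for illustration, not the paper's code.)
    """
    design = np.asarray(design, dtype=float)
    stack = design.mean(axis=0)        # stacked waveform
    dstack = np.gradient(stack, dt)    # finite-difference time derivative
    basis = np.vstack([stack, dstack])
    # Scale each row to unit length so it can be compared with SVD vectors.
    basis /= np.linalg.norm(basis, axis=1, keepdims=True)
    return basis

# Tiny usage example: three slightly shifted Gaussian pulses.
t = np.linspace(-5.0, 5.0, 200)
design = np.array([np.exp(-(t - s) ** 2) for s in (-0.1, 0.0, 0.1)])
basis = empirical_basis(design, dt=t[1] - t[0])
print(basis.shape)
```

One nice property: a waveform that decays to zero at both ends is orthogonal to its own time derivative, so the two rows come out (nearly) orthogonal without any extra step, just like the SVD basis vectors.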
I went ahead and tested this idea with some simulated signals. This is what I did:
(1) I generate a signal from Mexican hat wavelets: one pulse to represent the P wave, and a second, later pulse with twice the amplitude to represent the S wave. Then I add white noise in the background.
(2) I generate another 9 waveforms from the signal above, each with the S wave shifted by a small step. (Note: the first pulses are aligned to represent the P wave, and the second pulse is shifted to represent the change in the S arrival caused by the different event locations.) See the following figure for an example:
(3) I calculate the stack of these signals and the time derivative of the stacked signal.
(4) I compare the signals from step (3) with the first and second basis vectors from the SVD.
(5) I plot the normalized comparison (maximum amplitude scaled to 1) in the following figures:
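The steps above can be sketched in a few lines of NumPy. This is not the code from my GitHub, just a rough reconstruction: the pulse width, arrival times, shift step, noise level, and the S-wave window are all assumed values picked for illustration, and instead of plotting it prints correlation coefficients.

```python
import numpy as np

def mexican_hat(t, center, width):
    """Mexican hat (Ricker) pulse centered at `center`."""
    x = (t - center) / width
    return (1.0 - x**2) * np.exp(-0.5 * x**2)

# All numbers below are assumptions for this sketch.
rng = np.random.default_rng(0)
dt = 0.01
t = np.arange(2000) * dt
width, p_time, s_time = 0.3, 5.0, 12.0
shifts = 0.02 * np.arange(10)   # S-wave shift for each of the 10 events
noise_std = 0.02

# Steps (1)-(2): P pulse aligned, S pulse (2x amplitude) shifted, plus noise.
design = np.array([
    mexican_hat(t, p_time, width)
    + 2.0 * mexican_hat(t, s_time + s, width)
    + noise_std * rng.standard_normal(t.size)
    for s in shifts
])

# Step (3): stack and time derivative of the stack.
stack = design.sum(axis=0)
dstack = np.gradient(stack, dt)

# Step (4): events are rows, so the time-domain basis vectors are rows of vt.
_, _, vt = np.linalg.svd(design, full_matrices=False)
v1, v2 = vt[0], vt[1]

def normalize(x):
    """Step (5): scale the maximum absolute amplitude to 1 (for plotting)."""
    return x / np.max(np.abs(x))

def abs_corr(a, b):
    """Absolute correlation; the sign of an SVD vector is arbitrary."""
    return abs(np.corrcoef(a, b)[0, 1])

# Compare the derivative only around the S wave, where the shifts happen.
s_win = np.abs(t - (s_time + shifts.mean())) < 1.5
c1 = abs_corr(stack, v1)
c2_s = abs_corr(dstack[s_win], v2[s_win])
print("stack vs 1st basis vector:", c1)
print("d(stack)/dt vs 2nd basis vector (S window):", c2_s)
```

Because the P pulses are perfectly aligned here, the second basis vector has almost no energy around the P wave, while the derivative of the stack does. That is why the comparison is restricted to a window around the S wave.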
Conclusion: It does seem the author is correct that the stacked waveform and the time derivative of the stacked waveform resemble the first two basis vectors from the SVD. For the derivative of the stack, the S wave part matches the second basis vector very well (similar to the results in Fig. 4 of the author's paper). I also tried different pulse widths; you can find them in the code or the figures on my GitHub. Still, I cannot give a solid physical reason why the time derivative of the stack matches the 2nd basis vector. If you know the answer, please let me know ^)^
You can find my code to generate these figures
here on my GitHub. The code also generates results for different widths of the Mexican hat wavelet, so you can play with different frequencies. Try it.
Acknowledgment: Thanks to
Taka for the discussion, it was really helpful!