Model compensation is a technique for noise-robustness. It takes a model of the speech and a model of the noise, and produces a model of the corrupted speech. In this toolkit, the models of the speech and noise must both be Gaussian. Only static speech recogniser coefficients are compensated.
The cross-entropy toolkit implements the following well-known model compensation techniques.
Compute the corrupted speech distribution with data-driven parallel model combination (DPMC; Gales 1995). This draws samples from the distribution of the corrupted speech that follows from the distributions of the speech, the noise, and the phase factor. It then fits a Gaussian or a mixture of Gaussians to these samples.
| Returns: | approximate corrupted speech distribution. If componentNum is 1, this is a Gaussian; otherwise, it is a mixture of Gaussians. |
|---|---|
| Parameters: | |
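The sampling scheme described above can be sketched for a single log-spectral coefficient. This is only an illustration, not the toolkit's implementation: the function names (`mismatch`, `sample_phase_factor`, `dpmc_gaussian`), the `(mean, variance)` tuple representation, and the specific mismatch function y = log(exp(x) + exp(n) + 2α exp((x + n)/2)) are all assumptions made here, and only the single-Gaussian case (componentNum equal to 1) is covered.

```python
import math
import random


def mismatch(x, n, alpha):
    """Assumed log-spectral mismatch function with phase factor alpha."""
    return math.log(
        math.exp(x) + math.exp(n) + 2.0 * alpha * math.exp(0.5 * (x + n))
    )


def sample_phase_factor(rng, std):
    """Draw alpha from a zero-mean Gaussian, rejecting values outside (-1, 1).

    Keeping alpha above -1 guarantees that the argument of the logarithm
    in mismatch() stays positive.
    """
    while True:
        alpha = rng.gauss(0.0, std)
        if -1.0 < alpha < 1.0:
            return alpha


def dpmc_gaussian(speech, noise, alpha_std, sample_num, seed=0):
    """Fit one Gaussian to sampled corrupted speech.

    speech and noise are hypothetical (mean, variance) pairs for one
    coefficient. Returns the (mean, variance) of the fitted Gaussian.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(sample_num):
        x = rng.gauss(speech[0], math.sqrt(speech[1]))
        n = rng.gauss(noise[0], math.sqrt(noise[1]))
        alpha = sample_phase_factor(rng, alpha_std)
        samples.append(mismatch(x, n, alpha))
    # Maximum-likelihood estimates of the Gaussian parameters.
    mean = sum(samples) / sample_num
    variance = sum((y - mean) ** 2 for y in samples) / sample_num
    return mean, variance


if __name__ == "__main__":
    # Speech well above the noise level: the result stays close to the speech model.
    print(dpmc_gaussian((10.0, 1.0), (0.0, 1.0), alpha_std=0.2, sample_num=20000))
```

Fitting a mixture instead of a single Gaussian would replace the final two estimates with, for example, expectation-maximisation on the same samples.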
Apply VTS compensation (Moreno 1996). This uses a first-order vector Taylor series approximation to the mismatch function. The distributions for the speech, noise, and phase factor must be Gaussian. Because the mismatch function is linearised, the resulting approximate corrupted speech distribution is also Gaussian.
| Returns: | the approximate corrupted speech distribution as a Gaussian. |
|---|---|
| Parameters: | |
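To make the linearisation concrete, here is a minimal scalar sketch under the same assumed mismatch function, y = log(exp(x) + exp(n) + 2α exp((x + n)/2)), expanded around the speech mean, the noise mean, and α = 0. The name `vts_compensate` and the `(mean, variance)` tuples are hypothetical and do not reflect the toolkit's interface.

```python
import math


def vts_compensate(speech, noise, alpha_var=0.0):
    """First-order VTS compensation for one log-spectral coefficient.

    speech and noise are hypothetical (mean, variance) pairs; the phase
    factor is assumed to be zero-mean with variance alpha_var.
    Returns the (mean, variance) of the approximate corrupted speech Gaussian.
    """
    mu_x, var_x = speech
    mu_n, var_n = noise

    ex = math.exp(mu_x)
    en = math.exp(mu_n)
    total = ex + en  # exp(y) at the expansion point, where alpha = 0

    # Mean: the mismatch function evaluated at the expansion point.
    mean_y = math.log(total)

    # Partial derivatives of the mismatch function at the expansion point.
    jac_x = ex / total
    jac_n = en / total
    jac_alpha = 2.0 * math.exp(0.5 * (mu_x + mu_n)) / total

    # A linear function of independent Gaussians is Gaussian, so the
    # variances combine through the squared derivatives.
    var_y = jac_x ** 2 * var_x + jac_n ** 2 * var_n + jac_alpha ** 2 * alpha_var
    return mean_y, var_y


if __name__ == "__main__":
    print(vts_compensate((0.0, 1.0), (0.0, 1.0)))
```

With equal speech and noise means of 0 and unit variances, both derivatives are 0.5, so the sketch returns a mean of log 2 and a variance of 0.5. In the vector case, the scalar derivatives become Jacobian matrices and the variances become covariance matrices.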