Given N samples of speech, we would like to compute estimates of the
predictor coefficients $a_k$ that result in the best fit. One reasonable
way to define ``best fit'' is in terms of mean squared error. These
estimates can also be regarded as the ``most probable'' parameters if the
distribution of errors is assumed to be Gaussian and there are no a priori
restrictions on the values of $a_k$.
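The error at any time $n$, $e_n$, is the difference between the sample and its prediction from the previous $p$ samples:

$$e_n = s_n - \sum_{k=1}^{p} a_k s_{n-k}$$

Hence the summed squared error, E, over a finite window of length N is:

$$E = \sum_{n=0}^{N-1} e_n^2 = \sum_{n=0}^{N-1} \left( s_n - \sum_{k=1}^{p} a_k s_{n-k} \right)^2 \tag{67}$$
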
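The summed squared error can be evaluated directly for any candidate set of coefficients. The following is a minimal sketch, assuming the error is defined as $e_n = s_n - \sum_k a_k s_{n-k}$ and that the array `s` stores the $p$ history samples $s_{-p}, \ldots, s_{-1}$ ahead of the $N$ window samples; the function name and storage layout are illustrative choices, not part of the original text.

```python
import numpy as np

def summed_squared_error(s, a, N):
    """Return E = sum_{n=0}^{N-1} (s_n - sum_k a_k s_{n-k})^2.

    Assumed layout: s holds s_{-p}..s_{-1} followed by s_0..s_{N-1},
    so the sample s_n is stored at index p + n.
    """
    s = np.asarray(s, dtype=float)
    a = np.asarray(a, dtype=float)
    p = len(a)
    if len(s) < N + p:
        raise ValueError("need at least N + p samples")
    E = 0.0
    for n in range(N):
        # Previous samples in the order s_{n-1}, s_{n-2}, ..., s_{n-p}.
        past = s[n : n + p][::-1]
        e_n = s[p + n] - np.dot(a, past)
        E += e_n ** 2
    return E
```

A signal that obeys the predictor exactly gives zero error for the true coefficients, and a larger error for any other choice.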
The minimum of E occurs when the derivative is zero with respect to
each of the parameters, $a_j$. As can be seen from equation 67, E is
quadratic in each of the $a_j$, and therefore there is a single
stationary point. Very large positive or negative values of $a_j$ must
lead to poor prediction, and hence the solution to
$\partial E / \partial a_j = 0$ must be a minimum.
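
The same conclusion can be checked from the second derivatives. As a sketch, assuming the error has the form $e_n = s_n - \sum_{k=1}^{p} a_k s_{n-k}$ (the sign convention is an assumption here), differentiating E twice with respect to a single coefficient gives:

$$\frac{\partial^2 E}{\partial a_j^2} = 2 \sum_{n=0}^{N-1} s_{n-j}^2 \ge 0$$

so, provided the samples $s_{n-j}$ are not all zero, E is an upward-opening parabola along each $a_j$ and the stationary point cannot be a maximum.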

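Figure 38: Schematic showing single minimum of a quadratic

Hence differentiating equation 67 with respect to $a_j$ and setting the result equal to zero gives the set of p equations:

$$\frac{\partial E}{\partial a_j} = -2 \sum_{n=0}^{N-1} s_{n-j} \left( s_n - \sum_{k=1}^{p} a_k s_{n-k} \right) = 0, \qquad j = 1, \ldots, p \tag{69}$$
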
Rearranging equation 69 gives:

$$\sum_{n=0}^{N-1} s_n s_{n-j} = \sum_{k=1}^{p} a_k \sum_{n=0}^{N-1} s_{n-k} s_{n-j} \tag{70}$$
Define the covariance matrix, $\Phi$, with elements $\phi_{j,k}$:

$$\phi_{j,k} = \sum_{n=0}^{N-1} s_{n-j} s_{n-k}$$
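Now we can write equation 70 as:

$$\phi_{j,0} = \sum_{k=1}^{p} \phi_{j,k} a_k, \qquad j = 1, \ldots, p$$

or in matrix form:

$$\begin{bmatrix} \phi_{1,0} \\ \phi_{2,0} \\ \vdots \\ \phi_{p,0} \end{bmatrix} = \begin{bmatrix} \phi_{1,1} & \phi_{1,2} & \cdots & \phi_{1,p} \\ \phi_{2,1} & \phi_{2,2} & \cdots & \phi_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{p,1} & \phi_{p,2} & \cdots & \phi_{p,p} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix}$$

or simply:

$$\boldsymbol{\phi}_0 = \Phi \mathbf{a}$$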
Hence the Covariance method solution is obtained by matrix inverse:

$$\mathbf{a} = \Phi^{-1} \boldsymbol{\phi}_0$$
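
The covariance method can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than a reference implementation: the function name `lpc_covariance` is hypothetical, the predictor is taken to be $\hat{s}_n = \sum_k a_k s_{n-k}$, and the input array is assumed to hold the $p$ history samples before the $N$ window samples. It solves the linear system directly instead of forming the inverse explicitly, which gives the same coefficients.

```python
import numpy as np

def lpc_covariance(s, p, N):
    """Estimate p predictor coefficients by the covariance method.

    Assumed layout: s holds s_{-p}..s_{-1} followed by s_0..s_{N-1},
    so the sample s_n is stored at index p + n.
    """
    s = np.asarray(s, dtype=float)
    if len(s) < N + p:
        raise ValueError("need at least N + p samples")
    # Row i holds the lagged window s_{n-i} for n = 0..N-1.
    lagged = np.stack([s[p - i : p - i + N] for i in range(p + 1)])
    # phi[i, k] = sum_n s_{n-i} s_{n-k}, for i, k = 0..p.
    phi = lagged @ lagged.T
    Phi = phi[1:, 1:]    # p x p covariance matrix (lags 1..p)
    phi0 = phi[1:, 0]    # right-hand side, elements phi_{j,0}
    # Solve Phi a = phi0 without forming the inverse explicitly.
    return np.linalg.solve(Phi, phi0)

# Usage: a noise-free signal generated by s_n = 1.3 s_{n-1} - 0.4 s_{n-2}.
signal = [0.0, 1.0]
for _ in range(20):
    signal.append(1.3 * signal[-1] - 0.4 * signal[-2])
coeffs = lpc_covariance(signal, p=2, N=20)
```

Because the example signal follows the recursion exactly, the estimated coefficients should match the generating ones to within numerical precision.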
Note that $\Phi$ is symmetric, i.e. $\phi_{j,k} = \phi_{k,j}$,
and that this symmetry can be exploited in inverting $\Phi$ (see [9]).
These equations reference the samples $s_{-p}, \ldots, s_{N-1}$.
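
One common way to exploit the symmetry is a Cholesky factorisation, sketched below. This is not necessarily the method described in [9]; it additionally assumes that $\Phi$ is positive definite, which holds when the lagged sample vectors are linearly independent. The function name `solve_symmetric` is hypothetical.

```python
import numpy as np

def solve_symmetric(Phi, phi0):
    """Solve Phi a = phi0 using the factorisation Phi = L L^T.

    Assumes Phi is symmetric positive definite; np.linalg.cholesky
    raises LinAlgError otherwise.
    """
    Phi = np.asarray(Phi, dtype=float)
    phi0 = np.asarray(phi0, dtype=float)
    p = len(phi0)
    L = np.linalg.cholesky(Phi)  # lower-triangular factor
    # Forward substitution: solve L y = phi0.
    y = np.zeros(p)
    for i in range(p):
        y[i] = (phi0[i] - L[i, :i] @ y[:i]) / L[i, i]
    # Back substitution: solve L^T a = y.
    a = np.zeros(p)
    for i in reversed(range(p)):
        a[i] = (y[i] - L[i + 1 :, i] @ a[i + 1 :]) / L[i, i]
    return a
```

The factorisation needs roughly half the arithmetic of a general elimination, and the two triangular solves replace the explicit inverse.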