Joint Uncertainty Decoding for Robust Large Vocabulary Speech Recognition
H. Liao and M.J.F. Gales
CUED/F-INFENG/TR-552. Nov 2006.

Standard techniques to increase the noise robustness of automatic speech recognition typically assume that the recognition models are trained on clean data. This "clean" training data may in fact not be clean at all, but may contain channel variations, varying noise conditions, and different speakers. Hence, rather than considering noise robustness techniques as compensating clean acoustic models for environmental noise, they may be thought of as reducing the acoustic mismatch between training and test conditions. This report examines the application of vector Taylor series (VTS) model compensation or model-based Joint uncertainty decoding to clean and multistyle trained systems. An EM-based noise estimation procedure is also presented to produce ML VTS or Joint noise models, depending on the form of compensation used.

As an alternative to multistyle training, adaptive training with Joint uncertainty transforms, referred to as JAT in this work, provides a better method for handling heterogeneous data. With JAT, the uncertainty bias added to the model variances de-weights observations in proportion to the noise level. In this way, Joint transforms normalise the noise from the data, allowing the canonical model to represent solely the underlying "clean" acoustic signal. This report presents a novel Joint adaptive training framework, including formulae for estimating the transforms and the canonical model parameters.

Lastly, large vocabulary systems are often trained on multistyle data sets, such as broadcast news or conversational telephone speech, that have a variety of noise conditions. To date, however, little research has addressed compensating such systems, which are built with data that is not artificially corrupted. In this report, experiments are conducted on an artificially corrupted Resource Management database and on the large vocabulary Broadcast News corpus of collected broadcast recordings.
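
The de-weighting effect of the uncertainty bias can be illustrated with a toy scalar sketch. It assumes, as an illustration only, that the compensated likelihood is a Gaussian evaluated on an affinely transformed observation, with a bias added to the model variance. The function names, the two-component model and all numbers below are hypothetical and are not taken from the report.

```python
import math


def joint_log_likelihood(y, mean, var, a=1.0, b=0.0, var_bias=0.0):
    """Log-likelihood of a scalar observation y for one Gaussian component.

    Assumed form: log|a| + log N(a*y + b; mean, var + var_bias).
    The parameters a, b and var_bias play the role of a transform;
    their values here are illustrative, not estimated from data.
    """
    x_hat = a * y + b                 # affine transform of the observation
    total_var = var + var_bias        # model variance plus uncertainty bias
    return (math.log(abs(a))
            - 0.5 * math.log(2.0 * math.pi * total_var)
            - 0.5 * (x_hat - mean) ** 2 / total_var)


def component_posteriors(y, means, variances, var_bias):
    """Posterior over equally weighted components for one observation."""
    logs = [joint_log_likelihood(y, m, v, var_bias=var_bias)
            for m, v in zip(means, variances)]
    peak = max(logs)                  # subtract the max for numerical stability
    weights = [math.exp(value - peak) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


means, variances = [0.0, 2.0], [1.0, 1.0]
low_noise = component_posteriors(0.5, means, variances, var_bias=0.0)
high_noise = component_posteriors(0.5, means, variances, var_bias=9.0)
print(low_noise, high_noise)
```

With a larger variance bias, the posterior over components moves towards uniform, so a noisy observation discriminates less between components and contributes less to any statistics weighted by those posteriors.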