Advanced techniques for error robust audio and speech communications

Bibliographic Details
Main Author: Oztoprak, Huseyin
Published: University of Surrey 2011
Online Access: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.549469
Description
Summary: The past decades have seen very rapid growth in the telecommunications industry. Mobile telephony has evolved from a specialist application into a commonplace, affordable, mass-market service. Multimedia communications has evolved alongside it, with voice, video and data all to be integrated into one device. Today's audio and speech communication systems are characterised by heterogeneous networks and varying natural environment conditions. The resilience of the coding paradigms employed against network-related problems is one of the principal factors determining end-user satisfaction. The aim of the research presented here is to improve the error resilience of audio and speech codecs by using dedicated redundancy in a source-aware way. Firstly, Index Assignment based Channel Coding (IACC), a joint source-channel codec designed to alleviate the effects of bit errors on speech and audio codecs, is introduced. Although IACC is a type of joint source-channel coding, it does not interfere with the source codec design. The proposed scheme takes source characteristics into account and adjusts the amount of coding according to the sensitivity of the different values of the source parameters. It is shown that source characteristics play an important role in the performance of IACC. A scheme which concatenates IACC and convolutional coding is also presented. The performance of IACC-based schemes has been evaluated by applying them to the parameters generated by the AMR-WB+ audio codec. A method for perceptual training of IACC codes is also proposed. Subjective tests comparing the performance of IACC-based schemes with that of established convolutional coding have also been performed. Next, various new techniques for improving the performance of multiple description coding in protecting audio in networks with packet losses are presented. AAC is chosen as the underlying audio codec.
Firstly, two methods for improving the performance of multiple description transform coding when applied to spectral coefficients are proposed. Secondly, multiple description vector quantisation is adapted to AAC spectral coefficients and a method for improving its performance is presented. Thirdly, a coding scheme which lowers the side information burden in multiple description coding is proposed. Lastly, the performance of these techniques is compared with that of single description coding in networks with various packet loss rates. Useful operating points for all these schemes are obtained. A scalable multiple description scheme is introduced as the last contribution of the thesis. The proposed system provides multiple descriptions for both of its two hierarchical layers. The trade-off between the first and second layers and the trade-off between the central and side distortions are controlled parametrically.