An investigation into the generation, encoding and retrieval of CCTV-derived knowledge
Main Author: | |
---|---|
Published: | Kingston University, 2008 |
Subjects: | |
Online Access: | https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.523148 |
Summary: | Modern video surveillance systems generate diverse forms of data, and exchanging these data effectively requires a methodical approach. This thesis proposes the Video Surveillance Content Description Interface (VSCDI), a component of ISO/IEC 23000-10 - Information technology - Multimedia application format (MPEG-A) - Part 10: Video surveillance application format. The interface is designed to describe content associated with and generated by a surveillance system. In particular, it includes a set of descriptors for content-based image retrieval, for user-defined Classification Schemes that impose any required description ontology, and for consistent descriptions across multiple sources. The VSCDI is evaluated by comparison with other metadata frameworks and in terms of the performance of its colour descriptor components. Two new data sets of pedestrians in indoor environments, captured from multiple camera views, are created for re-identification experiments. The experiments use a novel application of colour constancy for cross-camera comparisons. Two evaluation measures are used: the Average Normalised Modified Retrieval Rank (ANMRR) for ranked estimates, and the Information Gain metric for probabilistic estimates. Techniques are investigated for using more than one descriptor, both to provide the estimate and to represent a person whose image is split into Top and Bottom clothing components. The re-identification of pedestrians is discussed both in the context of providing a coherent description of the overall scene activity and in the context of an embedded system. |
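The summary names ANMRR as the measure for ranked estimates but does not spell out how it is computed. The sketch below follows the definition commonly used in MPEG-7 retrieval evaluation, which may differ in detail from the formulation used in the thesis. The function names `nmrr` and `anmrr` and the input layout (a ranked list plus a set of ground-truth items per query) are assumptions made for illustration only.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def nmrr(ranked: Sequence[Hashable], relevant: Set[Hashable], gtm: int) -> float:
    """Normalised score for one query: 0.0 is perfect, 1.0 means nothing was found."""
    ng = len(relevant)
    if ng == 0:
        raise ValueError("each query needs at least one ground-truth item")
    k = min(4 * ng, 2 * gtm)   # only ranks 1..k count as retrieved
    penalty = 1.25 * k         # rank assigned to items missing from the top k

    # First position (1-based) at which each item appears in the ranked list.
    position = {}
    for rank, item in enumerate(ranked, start=1):
        position.setdefault(item, rank)

    ranks = []
    for item in relevant:
        rank = position.get(item)
        ranks.append(rank if rank is not None and rank <= k else penalty)

    avr = sum(ranks) / ng                 # average rank of the ground-truth items
    mrr = avr - 0.5 * (1 + ng)            # subtract the best achievable average
    return mrr / (penalty - 0.5 * (1 + ng))


def anmrr(queries: List[Tuple[Sequence[Hashable], Set[Hashable]]]) -> float:
    """Mean of the per-query scores; gtm is the largest ground-truth set size."""
    gtm = max(len(relevant) for _, relevant in queries)
    return sum(nmrr(ranked, relevant, gtm) for ranked, relevant in queries) / len(queries)
```

Lower values indicate better retrieval: a query whose ground-truth items all appear at the top of the ranking scores 0, and a query that retrieves none of them within the top `k` positions scores 1.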