Summary: | Image database systems are very useful in many applications. To design an effective image
database system, high dimensional image feature vectors have to be extracted from the
images automatically. Each comparison between them tends to be expensive, so sequential
comparisons are usually impractical. Moreover, traditional multi-dimensional
indexing structures are incapable of handling these high-dimensional vectors efficiently.
Thus, it has been proposed to abstract a lower-dimensional k-D vector from the original
N-D feature vector, where k << N. 2-level filtering is then used: the k-D vectors fit
in the indexing structure for coarse filtering, so that far fewer comparisons between
N-D vectors are needed in the fine filtering stage. Unfortunately, both stages cannot be
efficient at the same time. A major contribution of this thesis is to propose the idea of
multi-level filtering by adding additional intermediate levels so that both the coarsest
and finest filtering stages can be very efficient. Based on the cost models developed, the
trends of 2-level and multi-level filtering are analysed and compared. The experimental
evaluations further confirm that 3-level filtering usually requires less CPU and I/O
time than 2-level filtering does. Compared with the previous approach of 2-level filtering,
3-level filtering can save from 15% to over 400% of the time needed. Another contribution is to develop
optimizers that can find the (near-)optimal configuration of 2-level and 3-level
filtering for any image dataset. Experimental results show that in about 32 seconds,
the developed optimizer can find a configuration whose total run-time per query exceeds
that of the true optimal configuration by less than 2.5%.
|
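
The multi-level filtering idea summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and it rests on assumptions the summary does not state: the k-D vector is taken to be the first k coordinates of the N-D vector, similarity is squared Euclidean distance, the query is a range query, and the coarsest level is a linear scan rather than an indexing structure. The names `prefix_sq_dist` and `multilevel_range_query` are hypothetical. Under these assumptions, the distance over a prefix never exceeds the full distance, so no true match is discarded at a coarser level.

```python
import random


def prefix_sq_dist(a, b, k):
    """Squared Euclidean distance over the first k coordinates only."""
    return sum((a[i] - b[i]) ** 2 for i in range(k))


def multilevel_range_query(data, query, radius, levels):
    """Filter candidates through increasingly fine levels.

    levels is an increasing list of dimensionalities ending in N,
    e.g. [k, N] for 2-level or [k, m, N] for 3-level filtering.
    Returns the matching indices and the number of comparisons
    performed at each level.
    """
    candidates = list(range(len(data)))
    comparisons = []
    for k in levels:
        comparisons.append(len(candidates))
        # A prefix distance is a lower bound on the full distance, so
        # discarding a candidate here can never lose a true match.
        candidates = [
            i for i in candidates
            if prefix_sq_dist(data[i], query, k) <= radius ** 2
        ]
    return candidates, comparisons


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-in for image feature vectors (N = 16).
    vectors = [[rng.randint(0, 9) for _ in range(16)] for _ in range(200)]
    for levels in ([2, 16], [2, 6, 16]):
        hits, counts = multilevel_range_query(vectors, vectors[0], 8, levels)
        print(levels, "matches:", len(hits), "comparisons per level:", counts)
```

In this sketch the intermediate level shrinks the candidate set before the expensive N-D comparisons, at the cost of extra cheaper comparisons. Whether that trade pays off depends on the chosen dimensionalities, which is the question the cost models and optimizers address.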