Summary: | This thesis considers the problem of searching a large set of items, such as emails, for a small subset that is relevant to a given query. The search can be carried out sequentially, using knowledge from items that have already been screened to guide the selection of subsequent items to screen. Often the items being searched have an underlying network structure. Exploiting this structure, together with the modelling assumption that relevant items and participants are likely to cluster together, can greatly increase the rate at which relevant items are screened. However, inference in this type of model is computationally expensive. In the first part of this thesis, we show that Bayes linear methods provide a natural approach to modelling these data. We formulate a new optimisation problem for Bernoulli random variables, called constrained Bayes linear, which incorporates additional constraints into the Bayes linear optimisation problem. When the relationship between the latent variable and the observations is non-linear, Bayes linear gives a poor approximation. We propose a novel sequential Monte Carlo method for sequential inference on the network, which copes better with non-linear relationships. We also give a method, based on the Bayes linear methodology, for simulating the random variables. Finally, we examine how the ordering of binary random variables affects their joint probability distribution when they are simulated using the proposed Bayes linear method.
|