Mining for co-occurring motion trajectories : sport analysis

Bibliographic Details
Main Author: Dimitrijević, Maja
Language: English
Published: 2009
Online Access: http://hdl.handle.net/2429/12060
Description
Summary: This thesis investigates whether Apriori, a data mining algorithm for automatic pattern discovery that is widely used on conventional databases, can be applied to a new domain: 2D motion trajectory data. This is one of the first attempts to analyze motion trajectory data in the data mining style, i.e., to develop methods that automatically find interesting patterns or rules in object motion trajectories. While our focus is on hockey game analysis, similar methods could also be used in video surveillance, in the analysis of sport game strategies, or more generally in geographic applications. More specifically, we focus on discovering hockey game patterns that contain frequent motion trajectories of hockey players, where frequency is defined with respect to a motion trajectory similarity measure. Furthermore, the patterns relate the motion of players on the same or opposing teams, which should be correlated according to the players' roles in the game. We design and implement a system that discovers such patterns, and test its effectiveness and efficiency on a real-life data set and a semi-randomly generated data set. Our effectiveness tests support the choice of motion trajectory similarity measure and the validity of the algorithm. Our tests also compare the Apriori algorithm with a semi-naive algorithm; Apriori outperforms the semi-naive algorithm for various choices of parameters and data sizes, which demonstrates the importance of using it.
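For readers unfamiliar with Apriori, the following is a minimal, generic sketch of the level-wise search it performs. It is not the thesis's implementation. It assumes each trajectory has already been reduced to a discrete label (the hypothetical names "T1", "T2", "T3" below), whereas the thesis defines frequency through a trajectory similarity measure. The function name `apriori` and the parameter `min_support` are illustrative choices.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    transactions: iterable of collections of hashable items.
    min_support: minimum number of transactions that must contain an itemset.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: discard candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support only for the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical data: each set lists the trajectory labels seen in one game segment.
segments = [
    {"T1", "T2", "T3"},
    {"T1", "T2"},
    {"T1", "T3"},
    {"T2", "T3"},
    {"T1", "T2", "T3"},
]
frequent_sets = apriori(segments, min_support=3)
```

The prune step is what separates Apriori from a naive enumeration: a candidate is counted against the data only if all of its smaller subsets are already known to be frequent.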