Learning to track and identify players from broadcast sports videos

Tracking and identifying players in sports videos filmed with a single pan-tilt-zoom camera has many applications, but it is also a challenging problem. This thesis introduces the first intelligent system that tackles this difficult task. The system possesses the ability to detect and track multiple...


Bibliographic Details
Main Author:	Lu, Wei-Lwun
Language:	English
Published:	University of British Columbia 2012
Online Access:	http://hdl.handle.net/2429/39956

id	ndltd-UBC-oai-circle.library.ubc.ca-2429-39956
record_format	oai_dc
spelling	ndltd-UBC-oai-circle.library.ubc.ca-2429-39956 2018-01-05T17:25:33Z Learning to track and identify players from broadcast sports videos Lu, Wei-Lwun Science, Faculty of Computer Science, Department of Graduate 2012-01-09T19:09:18Z 2012-01-09T19:09:18Z 2011 2012-05 Text Thesis/Dissertation http://hdl.handle.net/2429/39956 eng Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ University of British Columbia
collection	NDLTD
language	English
sources	NDLTD
description	Tracking and identifying players in sports videos filmed with a single pan-tilt-zoom camera has many applications, but it is also a challenging problem. This thesis introduces the first intelligent system that tackles this difficult task. The system detects and tracks multiple players, estimates the homography between video frames and the court, and identifies the players. The tracking system is based on the tracking-by-detection philosophy. We first localize players using a player detector, categorize detections based on team colors, and then group them into tracks of specific players. Rather than using visual cues to distinguish between players, we rely on their short-term motion patterns. The homography estimation is solved using a variant of the Iterated Closest Points (ICP) algorithm. Unlike most existing algorithms, which rely on matching robust feature points, we propose to match edge points in two images. We also introduce a technique to update the model online to accommodate logos and patterns in different stadiums. The identification system utilizes both visual and spatial cues, and exploits both temporal and mutual exclusion constraints in a Conditional Random Field. In addition, we propose a novel Linear Programming Relaxation algorithm for predicting the best player identification in a video clip. To reduce the amount of labeled training data required to learn the identification system, we pioneer the use of weakly supervised learning with the assistance of play-by-play texts. Experiments show promising results in tracking, homography estimation, and identification. Moreover, weakly supervised learning with play-by-play texts greatly reduces the amount of labeled training data required: with merely 200 labels it achieves accuracies similar to a strongly supervised approach, which requires at least 20,000 labels.
=== Science, Faculty of === Computer Science, Department of === Graduate
author	Lu, Wei-Lwun
spellingShingle	Lu, Wei-Lwun Learning to track and identify players from broadcast sports videos
author_facet	Lu, Wei-Lwun
author_sort	Lu, Wei-Lwun
title	Learning to track and identify players from broadcast sports videos
title_short	Learning to track and identify players from broadcast sports videos
title_full	Learning to track and identify players from broadcast sports videos
title_fullStr	Learning to track and identify players from broadcast sports videos
title_full_unstemmed	Learning to track and identify players from broadcast sports videos
title_sort	learning to track and identify players from broadcast sports videos
publisher	University of British Columbia
publishDate	2012
url	http://hdl.handle.net/2429/39956
work_keys_str_mv	AT luweilwun learningtotrackandidentifyplayersfrombroadcastsportsvideos
_version_	1718583188230504448
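
Note on the homography mentioned in the description: the record does not reproduce the thesis's estimation procedure, so the following is only a minimal sketch of what a frame-to-court homography does once it has been estimated. It assumes the homography is given as a 3x3 matrix; the function name `apply_homography` and the matrix `H_EXAMPLE` are hypothetical and chosen purely for illustration. It does not implement the ICP variant or the edge-point matching described in the thesis.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def apply_homography(h: Matrix3, point: Tuple[float, float]) -> Tuple[float, float]:
    """Map an image point (x, y) to court coordinates with a 3x3 homography."""
    x, y = point
    # Lift the point to homogeneous coordinates (x, y, 1) and multiply by H.
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to 2D.
    return (u / w, v / w)


# Hypothetical homography (assumption, not from the thesis):
# scale by 0.5 and translate by (10, 20).
H_EXAMPLE: Matrix3 = [
    [0.5, 0.0, 10.0],
    [0.0, 0.5, 20.0],
    [0.0, 0.0, 1.0],
]

# A detected player's foot position in the video frame, in pixels (made up).
court_xy = apply_homography(H_EXAMPLE, (100.0, 40.0))
print(court_xy)
```

With a mapping of this kind, player positions detected in each frame can be expressed in a common court coordinate system even as the camera pans, tilts, and zooms.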


Similar Items