Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation

Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2016. === Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. === This electronic version was submitted by the s...

Full description

Bibliographic Details
Main Author: Leng, Yan, Ph.D. Massachusetts Institute of Technology
Other Authors: Jinhua Zhao, Larry Rudolph and Haris N. Koutsopoulos.
Format: Others
Language:English
Published: Massachusetts Institute of Technology 2016
Subjects:
Online Access:http://hdl.handle.net/1721.1/104156
id ndltd-MIT-oai-dspace.mit.edu-1721.1-104156
record_format oai_dc
collection NDLTD
language English
format Others
sources NDLTD
topic Civil and Environmental Engineering.
Electrical Engineering and Computer Science.
spellingShingle Civil and Environmental Engineering.
Electrical Engineering and Computer Science.
Leng, Yan, Ph.D. Massachusetts Institute of Technology
Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
description Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2016. === Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. === This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. === Cataloged from student-submitted PDF version of thesis. === Includes bibliographical references (pages 145-152). === Urban computing fuses computer science with other fields, such as transportation, in the context of urban spaces by connecting ubiquitous sensing technologies, analytical models and visualizations to solve challenging problems in urban environment and operation systems. This paper focuses on Call Detail Records, one widely collected opportunistic sensing data source for billing purposes, to understand presence patterns, develop mobility prediction methods and reduce traffic congestions with location recommendations. Understanding human mobility and presence patterns at locations are the building blocks for behavior prediction, service design and system improvements. In the first part, this thesis focuses on 1) understanding presence patterns at user locations with a proposed metric Normalized Hourly Presence, 2) extracting common presence patterns across the population with Principal Component Analysis; 3) and infer home and workplaces using K-means Clustering and Fuzzy C-means Clustering. The proposed method was implemented on MIT Reality Mining data, by which we demonstrate that with inference rates of 56% and 82%, the method can improve 79% and 34% in accuracy respectively in home and workplace inference comparing to the baseline model. In addition, it was implemented on the CDR data collected in a crowded city in China to prove its scalability and applicability in real-world applications. With Fuzzy C-means Clustering, we could flexibly trade-off between inference rate and accuracy to understand the interplay between the two and apply it for various purposes. With an understanding of mobility patterns, the next crucial foundation in urban computing is mobility prediction, enabling transportation practitioners to take actions beforehand and commercial organizations to send location-based advertisements, etc. Specifically, this paper focuses on next-location prediction from Call Detail Records. Mobility traces was analogized to language models, mapping cell towers to words and individual location traces to sentences. Recurrent Neural Network is a successful tool in natural language processing, which is applied in mobility prediction due to its acceptance of sequential input, variable input length and ability to learn the 'meaning' of cell towers. By implementing the method on Call Detail Records collected in Andorra, we show that the method improved more than 40% over the baseline model, with 67% and 78% accuracy in next location at cell tower and merged cell tower level respectively. The 'meanings' of the cell tower could also be inferred, the same as learning the meanings of words in sentences, from the embedding layer of Recurrent Neural Network. The last project aims at tackling the challenge of severe traffic congestions with location recommendations. The availability of large-scale longitudinal geolocation data, such as Call Detail Records, offers planners and service providers an unprecedented opportunity to understand location preferences and alleviate traffic congestions. Location recommendation is a potential tool to achieve these two objectives. Previous research on location recommendations has focused on automatically and accurately inferring users' preferences, while little attention has been devoted to the constraints of service capacity. The ignorance may lead to congestion and long waiting time. We argue that Call Detail Records could help planners and authorities make interventions by providing personalized recommendations given the comprehensive urban-wide picture of historical behaviors and preferences. In this research, we propose a method to make location recommendations for system efficiency, defined as maximizing satisfactions toward recommendations subject to capacity constraints, exploiting travelers' choice flexibilities. We infer implicit location preferences based on sparse and passively-collected Call Detail Records. We then formulate an optimization model the defined system efficiency. As a proof-of-concept experiment, we implement the method in Andorra, a small European country heavily relying on tourism. By extensive simulations, we demonstrate that the method can reduce the travel time increased by congestion during peak hour from 11.73 minutes to 5.6 minutes with idealized trips under full compliance rates. We show that the average travel time increased by congestion is 6.17, 6.98, 8.37 and 10.98 minutes with 80%, 60%, 40% and 20% compliance rates. Overall, our results indicate that Call Detail Records can be used to make locations recommendation while reduce traffic congestion for system efficiency. The proposed method can be applied to other large-scale location traces and extended to other location or events recommendation applications. === by Yan Leng. === S.M. in Transportation === S.M.
author2 Jinhua Zhao, Larry Rudolph and Haris N. Koutsopoulos.
author_facet Jinhua Zhao, Larry Rudolph and Haris N. Koutsopoulos.
Leng, Yan, Ph.D. Massachusetts Institute of Technology
author Leng, Yan, Ph.D. Massachusetts Institute of Technology
author_sort Leng, Yan, Ph.D. Massachusetts Institute of Technology
title Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
title_short Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
title_full Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
title_fullStr Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
title_full_unstemmed Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
title_sort urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation
publisher Massachusetts Institute of Technology
publishDate 2016
url http://hdl.handle.net/1721.1/104156
work_keys_str_mv AT lengyanphdmassachusettsinstituteoftechnology urbancomputingusingcalldetailrecordsmobilitypatternminingnextlocationpredictionandlocationrecommendation
_version_ 1719334151711096832
spelling ndltd-MIT-oai-dspace.mit.edu-1721.1-1041562020-07-31T05:06:55Z Urban computing using call detail records : mobility pattern mining, next-location prediction and location recommendation Leng, Yan, Ph.D. Massachusetts Institute of Technology Jinhua Zhao, Larry Rudolph and Haris N. Koutsopoulos. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Civil and Environmental Engineering Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Civil and Environmental Engineering. Electrical Engineering and Computer Science. Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2016. Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 145-152). Urban computing fuses computer science with other fields, such as transportation, in the context of urban spaces by connecting ubiquitous sensing technologies, analytical models and visualizations to solve challenging problems in urban environment and operation systems. This paper focuses on Call Detail Records, one widely collected opportunistic sensing data source for billing purposes, to understand presence patterns, develop mobility prediction methods and reduce traffic congestions with location recommendations. Understanding human mobility and presence patterns at locations are the building blocks for behavior prediction, service design and system improvements. In the first part, this thesis focuses on 1) understanding presence patterns at user locations with a proposed metric Normalized Hourly Presence, 2) extracting common presence patterns across the population with Principal Component Analysis; 3) and infer home and workplaces using K-means Clustering and Fuzzy C-means Clustering. The proposed method was implemented on MIT Reality Mining data, by which we demonstrate that with inference rates of 56% and 82%, the method can improve 79% and 34% in accuracy respectively in home and workplace inference comparing to the baseline model. In addition, it was implemented on the CDR data collected in a crowded city in China to prove its scalability and applicability in real-world applications. With Fuzzy C-means Clustering, we could flexibly trade-off between inference rate and accuracy to understand the interplay between the two and apply it for various purposes. With an understanding of mobility patterns, the next crucial foundation in urban computing is mobility prediction, enabling transportation practitioners to take actions beforehand and commercial organizations to send location-based advertisements, etc. Specifically, this paper focuses on next-location prediction from Call Detail Records. Mobility traces was analogized to language models, mapping cell towers to words and individual location traces to sentences. Recurrent Neural Network is a successful tool in natural language processing, which is applied in mobility prediction due to its acceptance of sequential input, variable input length and ability to learn the 'meaning' of cell towers. By implementing the method on Call Detail Records collected in Andorra, we show that the method improved more than 40% over the baseline model, with 67% and 78% accuracy in next location at cell tower and merged cell tower level respectively. The 'meanings' of the cell tower could also be inferred, the same as learning the meanings of words in sentences, from the embedding layer of Recurrent Neural Network. The last project aims at tackling the challenge of severe traffic congestions with location recommendations. The availability of large-scale longitudinal geolocation data, such as Call Detail Records, offers planners and service providers an unprecedented opportunity to understand location preferences and alleviate traffic congestions. Location recommendation is a potential tool to achieve these two objectives. Previous research on location recommendations has focused on automatically and accurately inferring users' preferences, while little attention has been devoted to the constraints of service capacity. The ignorance may lead to congestion and long waiting time. We argue that Call Detail Records could help planners and authorities make interventions by providing personalized recommendations given the comprehensive urban-wide picture of historical behaviors and preferences. In this research, we propose a method to make location recommendations for system efficiency, defined as maximizing satisfactions toward recommendations subject to capacity constraints, exploiting travelers' choice flexibilities. We infer implicit location preferences based on sparse and passively-collected Call Detail Records. We then formulate an optimization model the defined system efficiency. As a proof-of-concept experiment, we implement the method in Andorra, a small European country heavily relying on tourism. By extensive simulations, we demonstrate that the method can reduce the travel time increased by congestion during peak hour from 11.73 minutes to 5.6 minutes with idealized trips under full compliance rates. We show that the average travel time increased by congestion is 6.17, 6.98, 8.37 and 10.98 minutes with 80%, 60%, 40% and 20% compliance rates. Overall, our results indicate that Call Detail Records can be used to make locations recommendation while reduce traffic congestion for system efficiency. The proposed method can be applied to other large-scale location traces and extended to other location or events recommendation applications. by Yan Leng. S.M. in Transportation S.M. 2016-09-13T18:11:02Z 2016-09-13T18:11:02Z 2016 2016 Thesis http://hdl.handle.net/1721.1/104156 958280452 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 152 pages application/pdf Massachusetts Institute of Technology