Binary classification for predicting propensity to buy flight tickets: A study on whether binary classification can be used to predict Scandinavian Airlines customers’ propensity to buy a flight ticket within the next seven days.

Bibliographic Details
Main Authors: Andersson, Martin; Mazouch, Marcus
Format: Others
Language: English
Published: Umeå universitet, Institutionen för matematik och matematisk statistik 2019
Subjects:
rbf
ai
sas
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-160855
Description
Summary: A customer's propensity to buy a given product is a widely researched topic with applications in many industries. This thesis shows that statistical binary classification models applied to data from Scandinavian Airlines can predict a customer's propensity to book a flight within the next seven days. Logistic regression is compared with a support vector machine, and logistic regression with a reduced number of variables is chosen as the final model because of its simplicity and accuracy. The explanatory variables consist exclusively of booking history; customer demographics and search history are shown to be insignificant.
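
As a rough illustration of the kind of model the summary describes, the following sketch fits a logistic regression for a binary "books within seven days" label using only booking-history style inputs. It is not the authors' implementation: the data are synthetic, the two feature names (`recent_bookings`, `years_since_last`) and the coefficients used to generate labels are assumptions made for the example, and plain gradient descent stands in for whatever fitting procedure the thesis used.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    # w[0] is the intercept; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic customers (assumption: NOT Scandinavian Airlines data).
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    recent_bookings = rng.randint(0, 10)     # hypothetical: bookings in the past year
    years_since_last = rng.uniform(0.0, 2.0)  # hypothetical: years since last booking
    # Assumed generating process, chosen only for illustration.
    true_logit = -1.0 + 0.5 * recent_bookings - 1.5 * years_since_last
    label = 1 if rng.random() < sigmoid(true_logit) else 0
    # Scale both features to [0, 1] so one learning rate suits both.
    X.append([recent_bookings / 10.0, years_since_last / 2.0])
    y.append(label)

w = fit_logistic(X, y)

# Estimated propensity for a frequent, recent booker vs. a lapsed customer.
frequent = predict_proba(w, [0.9, 0.05])
lapsed = predict_proba(w, [0.0, 1.0])
print(f"frequent booker: {frequent:.2f}, lapsed customer: {lapsed:.2f}")
```

The fitted probability can be thresholded (for example at 0.5) to obtain the binary prediction. A support vector machine would replace `fit_logistic` with a different classifier on the same inputs.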