Demographics and Personality Discovery on Social Media: A Machine Learning Approach
This research proposes a new feature extraction algorithm using aggregated user engagements on social media in order to achieve demographics and personality discovery tasks. Our proposed framework can discover seven essential attributes, including gender identity, age group, residential area, educat...
Main Authors: | Sarach Tuomchomtam, Nuanwan Soonthornphisaj |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-08-01 |
Series: | Information |
Subjects: | demographic attributes; personality prediction; social media; machine learning |
Online Access: | https://www.mdpi.com/2078-2489/12/9/353 |
id |
doaj-d97e91949ab24e3bbf2562a772729b2b |
---|---|
record_format |
Article |
spelling |
doaj-d97e91949ab24e3bbf2562a772729b2b; 2021-09-26T00:26:25Z; eng; MDPI AG; Information; 2078-2489; 2021-08-01; 12; 353; 353; 10.3390/info12090353; Demographics and Personality Discovery on Social Media: A Machine Learning Approach; Sarach Tuomchomtam [0]; Nuanwan Soonthornphisaj [1]; [0] Artificial Intelligence and Knowledge Discovery Laboratory, Department of Computer Science, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; [1] Artificial Intelligence and Knowledge Discovery Laboratory, Department of Computer Science, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; This research proposes a new feature extraction algorithm using aggregated user engagements on social media in order to achieve demographics and personality discovery tasks. Our proposed framework can discover seven essential attributes, including gender identity, age group, residential area, education level, political affiliation, religious belief, and personality type. Multiple feature sets are developed, including comment text, community activity, and hybrid features. Various machine learning algorithms are explored, such as support vector machines, random forest, multi-layer perceptron, and naïve Bayes. An empirical analysis is performed on various aspects, including correctness, robustness, training time, and the class imbalance problem. We obtained the highest prediction performance by using our proposed feature extraction algorithm. The result on personality type prediction was 87.18%. For the demographic attribute prediction task, our feature sets also outperformed the baseline at 98.1% for residential area, 94.7% for education level, 92.1% for gender identity, 91.5% for political affiliation, 60.6% for religious belief, and 52.0% for the age group. Moreover, this paper provides the guideline for the choice of classifiers with appropriate feature sets.; https://www.mdpi.com/2078-2489/12/9/353; demographic attributes; personality prediction; social media; machine learning |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Sarach Tuomchomtam, Nuanwan Soonthornphisaj |
spellingShingle |
Sarach Tuomchomtam Nuanwan Soonthornphisaj Demographics and Personality Discovery on Social Media: A Machine Learning Approach Information demographic attributes personality prediction social media machine learning |
author_facet |
Sarach Tuomchomtam, Nuanwan Soonthornphisaj |
author_sort |
Sarach Tuomchomtam |
title |
Demographics and Personality Discovery on Social Media: A Machine Learning Approach |
title_short |
Demographics and Personality Discovery on Social Media: A Machine Learning Approach |
title_full |
Demographics and Personality Discovery on Social Media: A Machine Learning Approach |
title_fullStr |
Demographics and Personality Discovery on Social Media: A Machine Learning Approach |
title_full_unstemmed |
Demographics and Personality Discovery on Social Media: A Machine Learning Approach |
title_sort |
demographics and personality discovery on social media: a machine learning approach |
publisher |
MDPI AG |
series |
Information |
issn |
2078-2489 |
publishDate |
2021-08-01 |
description |
This research proposes a new feature extraction algorithm using aggregated user engagements on social media in order to achieve demographics and personality discovery tasks. Our proposed framework can discover seven essential attributes, including gender identity, age group, residential area, education level, political affiliation, religious belief, and personality type. Multiple feature sets are developed, including comment text, community activity, and hybrid features. Various machine learning algorithms are explored, such as support vector machines, random forest, multi-layer perceptron, and naïve Bayes. An empirical analysis is performed on various aspects, including correctness, robustness, training time, and the class imbalance problem. We obtained the highest prediction performance by using our proposed feature extraction algorithm. The result on personality type prediction was 87.18%. For the demographic attribute prediction task, our feature sets also outperformed the baseline at 98.1% for residential area, 94.7% for education level, 92.1% for gender identity, 91.5% for political affiliation, 60.6% for religious belief, and 52.0% for the age group. Moreover, this paper provides the guideline for the choice of classifiers with appropriate feature sets. |
topic |
demographic attributes; personality prediction; social media; machine learning |
url |
https://www.mdpi.com/2078-2489/12/9/353 |
work_keys_str_mv |
AT sarachtuomchomtam demographicsandpersonalitydiscoveryonsocialmediaamachinelearningapproach AT nuanwansoonthornphisaj demographicsandpersonalitydiscoveryonsocialmediaamachinelearningapproach |
_version_ |
1717366193098063872 |