Clustering Educational Digital Library Usage Data: Comparisons of Latent Class Analysis and K-Means Algorithms

There are common pitfalls and neglected areas when using clustering approaches to solve educational problems. A clustering algorithm is often used without the choice being justified. Few comparisons between a selected algorithm and a competing algorithm are presented, and results are presented witho...


Bibliographic Details
Main Author: Xu, Beijie
Format: Others
Published: DigitalCommons@USU 2011
Online Access: https://digitalcommons.usu.edu/etd/954
https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=1950&context=etd
id ndltd-UTAHS-oai-digitalcommons.usu.edu-etd-1950
record_format oai_dc
datestamp 2019-10-13T06:07:59Z
date 2011-05-01T07:00:00Z
type text (application/pdf)
rights Copyright for this work is held by the author. Transmission or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. Works not in the public domain cannot be commercially exploited without permission of the copyright owner. Responsibility for any use rests exclusively with the user. For more information contact Andrew Wesolek (andrew.wesolek@usu.edu).
series All Graduate Theses and Dissertations
collection NDLTD
format Others
sources NDLTD
topic Clustering
Data Mining
Digital Library
educational data mining
K-Means
Latent class analysis
Computer Science
Computer Engineering
description Studies that use clustering approaches to solve educational problems share several common pitfalls and neglected areas. A clustering algorithm is often used without the choice being justified. Few comparisons between a selected algorithm and a competing algorithm are presented, and results are presented without validation. Lastly, few studies fully utilize the data provided in an educational environment to evaluate their findings. In response to these problems, this thesis describes a rigorous study comparing two clustering algorithms in the context of an educational digital library service called the Instructional Architect (IA). First, a detailed description of the chosen clustering algorithm, latent class analysis (LCA), is presented. Second, both the selected algorithm and a competing algorithm, K-means, are applied separately to three kinds of preprocessed data. Third, the clustering results of LCA and K-means are compared through a series of evaluations on four aspects of each result: intra-cluster and inter-cluster distances, the Davies-Bouldin index, users' demographic profiles, and cluster evolution. Evaluation results show that LCA outperforms K-means in producing consistent clustering results at different settings, finding compact clusters, and finding connections between users' teaching experience and their effectiveness in using the IA. The implications, contributions, and limitations of this research are discussed.
author Xu, Beijie
title Clustering Educational Digital Library Usage Data: Comparisons of Latent Class Analysis and K-Means Algorithms
publisher DigitalCommons@USU
publishDate 2011
url https://digitalcommons.usu.edu/etd/954
https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=1950&context=etd