Classification of computer programs in the Scratch online community
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 133-136). Scratch is a graphical programming platform that empowers...
Main Author: | Abdalla, Lena(Lena A.) |
---|---|
Other Authors: | Andrew Sliwinski. |
Format: | Others |
Language: | English |
Published: | Massachusetts Institute of Technology 2021 |
Subjects: | Electrical Engineering and Computer Science. |
Online Access: | https://hdl.handle.net/1721.1/129862 |
id | ndltd-MIT-oai-dspace.mit.edu-1721.1-129862 |
---|---|
record_format | oai_dc |
spelling | ndltd-MIT-oai-dspace.mit.edu-1721.1-129862 2021-02-21T05:17:09Z Classification of computer programs in the Scratch online community Abdalla, Lena(Lena A.) Andrew Sliwinski. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Electrical Engineering and Computer Science. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020 Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 133-136). Scratch is a graphical programming platform that empowers children to create computer programs and realize their ideas. Although the Scratch online community is filled with a variety of diverse projects, many of these projects also share similarities. For example, they tend to fall into certain categories, including games, animations, stories, and more. Throughout this thesis, I describe the application of Natural Language Processing (NLP) techniques to vectorize and classify Scratch projects by type. This effort included constructing a labeled dataset of 873 Scratch projects and their corresponding types, to be used for training a supervised classifier model. This dataset was constructed through a collective process of consensus-based annotation by experts. To realize the goal of classifying Scratch projects by type, I first train an unsupervised model of meaningful vector representations for Scratch blocks based on the composition of 500,000 projects. Using the unsupervised model as a basis for representing Scratch blocks, I then train a supervised classifier model that categorizes Scratch projects by type into one of: "animation", "game", and "other". After an extensive hyperparameter tuning process, I am able to train a classifier model with an F1 Score of 0.737. I include in this paper an in-depth analysis of the unsupervised and supervised models, and explore the different elements that were learned during training. Overall, I demonstrate that NLP techniques can be used in the classification of computer programs to a reasonable level of accuracy. by Lena Abdalla. M. Eng. M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science 2021-02-19T20:26:08Z 2021-02-19T20:26:08Z 2020 2020 Thesis https://hdl.handle.net/1721.1/129862 1237279491 eng MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582 136 pages application/pdf Massachusetts Institute of Technology |
collection | NDLTD |
language | English |
format | Others |
sources | NDLTD |
topic | Electrical Engineering and Computer Science. |
spellingShingle | Electrical Engineering and Computer Science. Abdalla, Lena(Lena A.) Classification of computer programs in the Scratch online community |
description | Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 133-136). Scratch is a graphical programming platform that empowers children to create computer programs and realize their ideas. Although the Scratch online community is filled with a variety of diverse projects, many of these projects also share similarities. For example, they tend to fall into certain categories, including games, animations, stories, and more. Throughout this thesis, I describe the application of Natural Language Processing (NLP) techniques to vectorize and classify Scratch projects by type. This effort included constructing a labeled dataset of 873 Scratch projects and their corresponding types, to be used for training a supervised classifier model. This dataset was constructed through a collective process of consensus-based annotation by experts. To realize the goal of classifying Scratch projects by type, I first train an unsupervised model of meaningful vector representations for Scratch blocks based on the composition of 500,000 projects. Using the unsupervised model as a basis for representing Scratch blocks, I then train a supervised classifier model that categorizes Scratch projects by type into one of: "animation", "game", and "other". After an extensive hyperparameter tuning process, I am able to train a classifier model with an F1 Score of 0.737. I include in this paper an in-depth analysis of the unsupervised and supervised models, and explore the different elements that were learned during training. Overall, I demonstrate that NLP techniques can be used in the classification of computer programs to a reasonable level of accuracy. by Lena Abdalla. M. Eng. M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science |
author2 | Andrew Sliwinski. |
author_facet | Andrew Sliwinski. Abdalla, Lena(Lena A.) |
author | Abdalla, Lena(Lena A.) |
author_sort | Abdalla, Lena(Lena A.) |
title | Classification of computer programs in the Scratch online community |
title_short | Classification of computer programs in the Scratch online community |
title_full | Classification of computer programs in the Scratch online community |
title_fullStr | Classification of computer programs in the Scratch online community |
title_full_unstemmed | Classification of computer programs in the Scratch online community |
title_sort | classification of computer programs in the scratch online community |
publisher | Massachusetts Institute of Technology |
publishDate | 2021 |
url | https://hdl.handle.net/1721.1/129862 |
work_keys_str_mv | AT abdallalenalenaa classificationofcomputerprogramsinthescratchonlinecommunity |
_version_ | 1719377877931130880 |
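
The abstract reports classifier quality as a single F1 score over three project types ("animation", "game", "other"). As a rough illustration of how such a score can be computed, the sketch below derives per-class F1 from true and predicted labels and then averages across classes. The macro averaging, the function names, and the example labels are assumptions made for illustration; this record does not state which averaging scheme the thesis uses, and none of the values below come from it.

```python
from collections import Counter

# The three project types named in the abstract.
LABELS = ("animation", "game", "other")


def f1_per_class(y_true, y_pred, labels=LABELS):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class that never appears in truth or predictions scores 0.0.
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (an assumed scheme)."""
    scores = f1_per_class(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Hypothetical labels, invented for this example only.
y_true = ["game", "game", "animation", "other", "animation", "game"]
y_pred = ["game", "animation", "animation", "other", "game", "game"]

if __name__ == "__main__":
    print(f1_per_class(y_true, y_pred))
    print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every class equally regardless of how many projects it contains; a support-weighted or micro average would generally give a different number on an imbalanced dataset.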