Classification of computer programs in the Scratch online community

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 133-136). Scratch is a graphical programming platform that empowers...

Bibliographic Details
Main Author: Abdalla, Lena (Lena A.)
Other Authors: Andrew Sliwinski.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2021
Subjects: Electrical Engineering and Computer Science.
Online Access: https://hdl.handle.net/1721.1/129862
id ndltd-MIT-oai-dspace.mit.edu-1721.1-129862
record_format oai_dc
spelling ndltd-MIT-oai-dspace.mit.edu-1721.1-129862 2021-02-21T05:17:09Z 2021-02-19T20:26:08Z 2020 Thesis https://hdl.handle.net/1721.1/129862 1237279491 eng 136 pages application/pdf MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582
collection NDLTD
sources NDLTD
topic Electrical Engineering and Computer Science.
description Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 133-136). Scratch is a graphical programming platform that empowers children to create computer programs and realize their ideas. Although the Scratch online community is filled with a variety of diverse projects, many of these projects also share similarities. For example, they tend to fall into certain categories, including games, animations, stories, and more. Throughout this thesis, I describe the application of Natural Language Processing (NLP) techniques to vectorize and classify Scratch projects by type. This effort included constructing a labeled dataset of 873 Scratch projects and their corresponding types, to be used for training a supervised classifier model. This dataset was constructed through a collective process of consensus-based annotation by experts. To realize the goal of classifying Scratch projects by type, I first train an unsupervised model of meaningful vector representations for Scratch blocks based on the composition of 500,000 projects. Using the unsupervised model as a basis for representing Scratch blocks, I then train a supervised classifier model that categorizes Scratch projects by type into one of: "animation", "game", and "other". After an extensive hyperparameter tuning process, I am able to train a classifier model with an F1 Score of 0.737. I include in this paper an in-depth analysis of the unsupervised and supervised models, and explore the different elements that were learned during training. Overall, I demonstrate that NLP techniques can be used in the classification of computer programs to a reasonable level of accuracy. By Lena Abdalla. M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science.