A Procedure for Extending Input Selection Algorithms to Low Quality Data in Modelling Problems with Application to the Automatic Grading of Uploaded Assignments

When selecting relevant inputs in modeling problems with low quality data, the ranking of the most informative inputs is also uncertain. In this paper, this issue is addressed through a new procedure that allows the extending of different crisp feature selection algorithms to vague data. The partial...

Bibliographic Details
Main Authors: José Otero, Ana Palacios, Rosario Suárez, Luis Junco, Inés Couso, Luciano Sánchez
Format: Article
Language: English
Published: Hindawi Limited 2014-01-01
Series: The Scientific World Journal
Online Access: http://dx.doi.org/10.1155/2014/468405
id doaj-7a8cb82f473a4c34b68ffa078ae502bd
record_format Article
spelling doaj-7a8cb82f473a4c34b68ffa078ae502bd
2020-11-25T01:37:59Z
eng
Hindawi Limited
The Scientific World Journal
2356-6140
1537-744X
2014-01-01
2014
10.1155/2014/468405
468405
A Procedure for Extending Input Selection Algorithms to Low Quality Data in Modelling Problems with Application to the Automatic Grading of Uploaded Assignments
José Otero: Computer Science Department, Universidad de Oviedo, Sedes Departamentales, Edificio 1, Campus de Viesques, 33203 Gijón, Spain
Ana Palacios: Computer Science Department, Universidad de Granada, C/Periodista Daniel Saucedo Arana s/n, 18071 Granada, Spain
Rosario Suárez: Computer Science Department, Universidad de Oviedo, Sedes Departamentales, Edificio 1, Campus de Viesques, 33203 Gijón, Spain
Luis Junco: Computer Science Department, Universidad de Oviedo, Sedes Departamentales, Edificio 1, Campus de Viesques, 33203 Gijón, Spain
Inés Couso: Statistics Department, E. U. I. T. Industrial, Universidad de Oviedo, Módulo 1, Planta 4, Campus de Viesques, 33203 Gijón, Spain
Luciano Sánchez: Computer Science Department, Universidad de Oviedo, Sedes Departamentales, Edificio 1, Campus de Viesques, 33203 Gijón, Spain
http://dx.doi.org/10.1155/2014/468405
collection DOAJ
language English
format Article
sources DOAJ
author José Otero
Ana Palacios
Rosario Suárez
Luis Junco
Inés Couso
Luciano Sánchez
title A Procedure for Extending Input Selection Algorithms to Low Quality Data in Modelling Problems with Application to the Automatic Grading of Uploaded Assignments
publisher Hindawi Limited
series The Scientific World Journal
issn 2356-6140
1537-744X
publishDate 2014-01-01
description When selecting relevant inputs in modeling problems with low quality data, the ranking of the most informative inputs is also uncertain. In this paper, this issue is addressed through a new procedure that allows the extending of different crisp feature selection algorithms to vague data. The partial knowledge about the ordinal of each feature is modelled by means of a possibility distribution, and a ranking is hereby applied to sort these distributions. It will be shown that this technique makes the most use of the available information in some vague datasets. The approach is demonstrated in a real-world application. In the context of massive online computer science courses, methods are sought for automatically providing the student with a qualification through code metrics. Feature selection methods are used to find the metrics involved in the most meaningful predictions. In this study, 800 source code files, collected and revised by the authors in classroom Computer Science lectures taught between 2013 and 2014, are analyzed with the proposed technique, and the most relevant metrics for the automatic grading task are discussed.
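The idea summarized in the description can be illustrated with a small sketch. It is not the authors' procedure: it assumes the vague data are given as intervals, uses absolute Pearson correlation as a stand-in for the crisp feature selection algorithm, approximates the possibility distribution of each feature's rank by sampling crisp datasets inside the intervals, and sorts those distributions by their centroid. All function names (`abs_correlation`, `crisp_ranks`, `rank_possibilities`, `order_features`) and the toy data are hypothetical.

```python
import random
from statistics import mean


def abs_correlation(xs, ys):
    """Crisp relevance score (assumed here): absolute Pearson correlation."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def crisp_ranks(columns, target):
    """Rank crisp features by score; rank 1 is the most relevant."""
    scores = [abs_correlation(col, target) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: -scores[j])
    ranks = [0] * len(columns)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def rank_possibilities(interval_columns, target, n_samples=200, seed=0):
    """Approximate, per feature, a possibility distribution over its rank.

    Each sample draws one crisp value inside every interval, runs the crisp
    ranker, and counts the rank obtained. Counts are normalized so that the
    most frequent rank of each feature has possibility 1.
    """
    rng = random.Random(seed)
    n_features = len(interval_columns)
    counts = [[0] * n_features for _ in range(n_features)]
    for _ in range(n_samples):
        sample = [[rng.uniform(lo, hi) for lo, hi in col]
                  for col in interval_columns]
        for j, r in enumerate(crisp_ranks(sample, target)):
            counts[j][r - 1] += 1
    return [[c / max(row) for c in row] for row in counts]


def order_features(possibilities):
    """Sort features by the centroid of their rank distribution.

    The centroid is only one possible way to rank the distributions.
    """
    def centroid(pi):
        return sum((r + 1) * p for r, p in enumerate(pi)) / sum(pi)
    return sorted(range(len(possibilities)),
                  key=lambda j: centroid(possibilities[j]))


# Toy interval-valued data (hypothetical): three features, six examples.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = [
    [(t - 0.1, t + 0.1) for t in target],  # precise and informative
    [(0.0, 10.0) for _ in target],         # completely vague
    [(t - 1.5, t + 1.5) for t in target],  # informative but imprecise
]
possibilities = rank_possibilities(features, target)
order = order_features(possibilities)
print(order)
```

With this toy data the tightly measured feature 0 should come first; the relative position of the other two depends on how much their vagueness spreads the possible ranks.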
url http://dx.doi.org/10.1155/2014/468405