Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable.

The quality of a country's judicial system can be gauged by the overall duration of its lawsuits, or lead time. When the lead time is excessive, the country's economy can be affected, which has led to measures such as the creation of the Saturn Center in Europe. Although there...

Bibliographic Details
Main Authors: Lúcia Adriana Dos Santos Gruginskie, Guilherme Luís Roehe Vaccaro
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5983432?pdf=render
id doaj-c84a238b60124274a4be74b03c5c44d3
record_format Article
spelling doaj-c84a238b60124274a4be74b03c5c44d3 | 2020-11-24T20:47:59Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2018-01-01 | 13 | 6 | e0198122 | 10.1371/journal.pone.0198122 | Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. | Lúcia Adriana Dos Santos Gruginskie | Guilherme Luís Roehe Vaccaro | http://europepmc.org/articles/PMC5983432?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Lúcia Adriana Dos Santos Gruginskie
Guilherme Luís Roehe Vaccaro
title Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2018-01-01
description The quality of a country's judicial system can be gauged by the overall duration of its lawsuits, or lead time. When the lead time is excessive, the country's economy can be affected, which has led to measures such as the creation of the Saturn Center in Europe. Although there are performance indicators to measure the lead time of lawsuits, the analysis and fitting of prediction models remain underdeveloped themes in the literature. To contribute to this subject, this article compares different prediction models according to their accuracy, sensitivity, specificity, precision, and F1 measure. The data came from TRF4 (Tribunal Regional Federal da 4a Região), a federal court in southern Brazil, and cover the 2nd Instance civil lawsuits completed in 2016. The models were fitted using support vector machine, naive Bayes, random forest, and neural network approaches with categorical predictor variables. The response variable was the lead time of the 2nd Instance judgment, measured in days and categorized in bands. The comparison showed that the support vector machine and random forest approaches produced measurements superior to those of the other models. The models were evaluated using k-fold cross-validation similar to that applied to the test models.
url http://europepmc.org/articles/PMC5983432?pdf=render
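
The description above names five comparison measures (accuracy, sensitivity, specificity, precision, and F1) and a response variable categorized in bands. As a rough illustration only, the sketch below shows one common way such measures can be computed for a single band treated as the positive class (one-vs-rest). The function names, the band edges of 365 and 730 days, and the one-vs-rest treatment are assumptions made for this example; the record does not state the article's actual bands or how the measures were aggregated across bands.

```python
def band(days, edges=(365, 730)):
    """Map a lead time in days to a band index.

    The edges are hypothetical; the record does not give the article's bands.
    """
    for index, edge in enumerate(edges):
        if days <= edge:
            return index
    return len(edges)


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Compute the five measures with one band treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = ratio(tp + tn, len(pairs))
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }
```

Calling one_vs_rest_metrics once per band and averaging the results would give macro-averaged figures; whether the article averaged its measures this way is not stated in this record.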