Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable.
The quality of a country's judicial system can be gauged by the overall duration of lawsuits, or the lead time. When the lead time is excessive, a country's economy can be affected, leading to the adoption of measures such as the creation of the Saturn Center in Europe. Although there...
Main Authors: | Lúcia Adriana Dos Santos Gruginskie, Guilherme Luís Roehe Vaccaro |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2018-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC5983432?pdf=render |
id |
doaj-c84a238b60124274a4be74b03c5c44d3 |
---|---|
record_format |
Article |
spelling |
doaj-c84a238b60124274a4be74b03c5c44d3 2020-11-24T20:47:59Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2018-01-01 13 6 e0198122 10.1371/journal.pone.0198122 Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. Lúcia Adriana Dos Santos Gruginskie Guilherme Luís Roehe Vaccaro The quality of a country's judicial system can be gauged by the overall duration of lawsuits, or the lead time. When the lead time is excessive, a country's economy can be affected, leading to the adoption of measures such as the creation of the Saturn Center in Europe. Although there are performance indicators to measure the lead time of lawsuits, the analysis and fitting of prediction models remain underdeveloped themes in the literature. To contribute to this subject, this article compares different prediction models according to their accuracy, sensitivity, specificity, precision, and F1 measure. The database used was from TRF4 (Tribunal Regional Federal da 4ª Região), a federal court in southern Brazil, and corresponds to the 2nd Instance civil lawsuits completed in 2016. The models were fitted using support vector machine, naive Bayes, random forest, and neural network approaches with categorical predictor variables. The lead time of the 2nd Instance judgment, measured in days and categorized into bands, was selected as the response variable. The comparison among the models showed that the support vector machine and random forest approaches produced measurements superior to those of the other models. The models were evaluated using k-fold cross-validation, similar to that applied to the test models. http://europepmc.org/articles/PMC5983432?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Lúcia Adriana Dos Santos Gruginskie Guilherme Luís Roehe Vaccaro |
spellingShingle |
Lúcia Adriana Dos Santos Gruginskie Guilherme Luís Roehe Vaccaro Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. PLoS ONE |
author_facet |
Lúcia Adriana Dos Santos Gruginskie Guilherme Luís Roehe Vaccaro |
author_sort |
Lúcia Adriana Dos Santos Gruginskie |
title |
Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. |
title_short |
Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. |
title_full |
Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. |
title_fullStr |
Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. |
title_full_unstemmed |
Lawsuit lead time prediction: Comparison of data mining techniques based on categorical response variable. |
title_sort |
lawsuit lead time prediction: comparison of data mining techniques based on categorical response variable. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2018-01-01 |
description |
The quality of a country's judicial system can be gauged by the overall duration of lawsuits, or the lead time. When the lead time is excessive, a country's economy can be affected, leading to the adoption of measures such as the creation of the Saturn Center in Europe. Although there are performance indicators to measure the lead time of lawsuits, the analysis and fitting of prediction models remain underdeveloped themes in the literature. To contribute to this subject, this article compares different prediction models according to their accuracy, sensitivity, specificity, precision, and F1 measure. The database used was from TRF4 (Tribunal Regional Federal da 4ª Região), a federal court in southern Brazil, and corresponds to the 2nd Instance civil lawsuits completed in 2016. The models were fitted using support vector machine, naive Bayes, random forest, and neural network approaches with categorical predictor variables. The lead time of the 2nd Instance judgment, measured in days and categorized into bands, was selected as the response variable. The comparison among the models showed that the support vector machine and random forest approaches produced measurements superior to those of the other models. The models were evaluated using k-fold cross-validation, similar to that applied to the test models. |
url |
http://europepmc.org/articles/PMC5983432?pdf=render |
work_keys_str_mv |
AT luciaadrianadossantosgruginskie lawsuitleadtimepredictioncomparisonofdataminingtechniquesbasedoncategoricalresponsevariable AT guilhermeluisroehevaccaro lawsuitleadtimepredictioncomparisonofdataminingtechniquesbasedoncategoricalresponsevariable |
_version_ |
1716809322095181824 |
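
The description above names five measures (accuracy, sensitivity, specificity, precision, and F1) for a response variable categorized in bands. As a minimal sketch of how such measures can be derived from a multiclass confusion matrix using a one-vs-rest reading per band, consider the following. The band labels, the counts, and the function names are hypothetical and chosen only for illustration; they are not taken from the article, and the article may have computed or averaged its measures differently.

```python
from typing import Dict, List


def accuracy(confusion: List[List[int]]) -> float:
    """Share of all cases that fall on the diagonal (correct band)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total if total else 0.0


def per_class_metrics(
    confusion: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """One-vs-rest measures per band.

    confusion[i][j] is the number of cases whose true band is i
    and whose predicted band is j.
    """
    total = sum(sum(row) for row in confusion)
    results: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                  # true band k, predicted elsewhere
        fp = sum(row[k] for row in confusion) - tp   # predicted k, true band elsewhere
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results[label] = {
            "sensitivity": sensitivity,
            "specificity": specificity,
            "precision": precision,
            "f1": f1,
        }
    return results


# Hypothetical lead-time bands and invented counts (not data from the article).
bands = ["short", "medium", "long"]
matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]

overall_accuracy = accuracy(matrix)
band_metrics = per_class_metrics(matrix, bands)

if __name__ == "__main__":
    print(f"accuracy: {overall_accuracy:.3f}")
    for band, values in band_metrics.items():
        print(band, {name: round(v, 3) for name, v in values.items()})
```

Treating each band in turn as the positive class keeps sensitivity and specificity well defined when there are more than two bands; how the per-band values are then averaged into a single figure is a separate choice that this sketch leaves open.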