Multi-class and Multi-label classification of Darkweb Data
abstract: In this research, I try to solve a multi-class, multi-label classification problem, where the goal is to automatically assign one or more labels (tags) to discussion topics seen in the deepweb. I observed a natural hierarchy in our dataset, and I used different techniques to ensure hierarchical integ...
Other Authors: | Patil, Revanth (Author) |
---|---|
Format: | Dissertation |
Language: | English |
Published: | 2018 |
Subjects: | Computer science |
Online Access: | http://hdl.handle.net/2286/R.I.48469 |
id |
ndltd-asu.edu-item-48469 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-asu.edu-item-48469 2018-06-22T03:09:11Z Multi-class and Multi-label classification of Darkweb Data abstract: In this research, I try to solve a multi-class, multi-label classification problem, where the goal is to automatically assign one or more labels (tags) to discussion topics seen in the deepweb. I observed a natural hierarchy in our dataset, and I used different techniques to enforce a hierarchical integrity constraint on the predicted tag list. To solve the 'class imbalance' and 'scarcity of labeled data' problems, I developed a semi-supervised model based on the Elasticsearch (ES) document relevance score. I evaluate our models using the standard K-fold cross-validation method. Enforcing hierarchical integrity constraints improved the F1 score by 11.9% over standard supervised learning, while our ES-based semi-supervised learning model outperformed other models in terms of precision (78.4%) while maintaining a comparable recall (21%). Dissertation/Thesis Patil, Revanth (Author) Shakarian, Paulo (Advisor) Doupe, Adam (Committee member) Davulcu, Hasan (Committee member) Arizona State University (Publisher) Computer science eng 40 pages Masters Thesis Computer Science 2018 Masters Thesis http://hdl.handle.net/2286/R.I.48469 http://rightsstatements.org/vocab/InC/1.0/ All Rights Reserved 2018 |
collection |
NDLTD |
language |
English |
format |
Dissertation |
sources |
NDLTD |
topic |
Computer science |
spellingShingle |
Computer science Multi-class and Multi-label classification of Darkweb Data |
description |
abstract: In this research, I try to solve a multi-class, multi-label classification problem, where
the goal is to automatically assign one or more labels (tags) to discussion topics seen
in the deepweb. I observed a natural hierarchy in our dataset, and I used different
techniques to enforce a hierarchical integrity constraint on the predicted tag list. To
solve the 'class imbalance' and 'scarcity of labeled data' problems, I developed a semi-supervised
model based on the Elasticsearch (ES) document relevance score. I evaluate
our models using the standard K-fold cross-validation method. Enforcing hierarchical
integrity constraints improved the F1 score by 11.9% over standard supervised learning,
while our ES-based semi-supervised learning model outperformed other models in
terms of precision (78.4%) while maintaining a comparable recall (21%). === Dissertation/Thesis === Masters Thesis Computer Science 2018 |
author2 |
Patil, Revanth (Author) |
author_facet |
Patil, Revanth (Author) |
title |
Multi-class and Multi-label classification of Darkweb Data |
title_short |
Multi-class and Multi-label classification of Darkweb Data |
title_full |
Multi-class and Multi-label classification of Darkweb Data |
title_fullStr |
Multi-class and Multi-label classification of Darkweb Data |
title_full_unstemmed |
Multi-class and Multi-label classification of Darkweb Data |
title_sort |
multi-class and multi-label classification of darkweb data |
publishDate |
2018 |
url |
http://hdl.handle.net/2286/R.I.48469 |
_version_ |
1718701683549143040 |
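
The abstract mentions a hierarchical integrity constraint on the predicted tag list but does not say how it is enforced. The Python sketch below illustrates two common ways such a constraint could be applied after a classifier has produced its tags: adding every missing ancestor of a predicted tag, or dropping tags whose ancestors were not predicted. It is an illustration under stated assumptions, not the thesis's method. The tag names, the `PARENT` map, and both function names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical tag hierarchy (child -> parent); None marks a top-level tag.
# These tags are invented for illustration and are not taken from the thesis.
PARENT: Dict[str, Optional[str]] = {
    "malware": None,
    "ransomware": "malware",
    "marketplace": None,
    "drugs": "marketplace",
}


def add_ancestors(predicted: Iterable[str],
                  parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the predicted tags plus every ancestor of each tag."""
    closed: Set[str] = set()
    for tag in predicted:
        node: Optional[str] = tag
        # Walk up the hierarchy; stop early once a node is already included,
        # because its ancestors were added when it was first seen.
        while node is not None and node not in closed:
            closed.add(node)
            node = parent.get(node)
    return closed


def drop_orphans(predicted: Iterable[str],
                 parent: Dict[str, Optional[str]]) -> Set[str]:
    """Keep only tags whose entire ancestor chain was also predicted."""
    predicted_set = set(predicted)
    kept: Set[str] = set()
    for tag in predicted_set:
        node = parent.get(tag)
        consistent = True
        while node is not None:
            if node not in predicted_set:
                consistent = False
                break
            node = parent.get(node)
        if consistent:
            kept.add(tag)
    return kept


if __name__ == "__main__":
    raw = ["ransomware", "drugs", "marketplace"]
    print(sorted(add_ancestors(raw, PARENT)))
    print(sorted(drop_orphans(raw, PARENT)))
```

Adding ancestors tends to favor recall, while dropping orphaned tags tends to favor precision; which trade-off is preferable depends on the evaluation goals.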