Generation of Synthetic Data for Handwritten Word Alteration Detection

Fraudsters often alter handwritten content in a document for illicit purposes. At times, this may result in financial and mental loss to an individual or an organization. Hence, ink analysis is necessary to identify such an alteration. A convolutional neural network (CNN) can be used to...

Bibliographic Details
Main Authors:	Prabhat Dansena, Soumen Bag, Rajarshi Pal
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Convolution neural network; document forensics; handwritten; ink analysis; synthetic data
Online Access:	https://ieeexplore.ieee.org/document/9354159/

id	doaj-cf8c40f532b149cbb63eb0f6e294ca03
record_format	Article
spelling	doaj-cf8c40f532b149cbb63eb0f6e294ca03 | 2021-03-30T15:21:17Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2021-01-01 | vol. 9, pp. 38979-38990 | DOI 10.1109/ACCESS.2021.3059342 | IEEE Xplore document 9354159 | Generation of Synthetic Data for Handwritten Word Alteration Detection | Prabhat Dansena (https://orcid.org/0000-0001-5982-1215; Department of Computer Science and Engineering, Indian Institute of Technology (ISM) Dhanbad, Dhanbad, India) | Soumen Bag (Department of Computer Science and Engineering, Indian Institute of Technology (ISM) Dhanbad, Dhanbad, India) | Rajarshi Pal (Institute for Development and Research in Banking Technology, Hyderabad, India) | https://ieeexplore.ieee.org/document/9354159/ | Convolution neural network; document forensics; handwritten; ink analysis; synthetic data
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Prabhat Dansena, Soumen Bag, Rajarshi Pal
title	Generation of Synthetic Data for Handwritten Word Alteration Detection
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2021-01-01
description	Fraudsters often alter handwritten content in a document for illicit purposes. At times, this may result in financial and mental loss to an individual or an organization. Hence, ink analysis is necessary to identify such an alteration. A convolutional neural network (CNN) can be used to identify such cases of alteration, as CNNs have been highly successful in computer vision across a variety of classification tasks. However, a CNN requires a large amount of labeled data for training. Hence, there is a need to generate a large data set for experiments on handwritten word alteration detection. Collecting, digitizing, and cropping a large number of altered and unaltered handwritten words is tedious and time-consuming. To overcome this issue, this paper presents an approach to synthetic word data generation for handwritten word alteration detection experiments. The scheme is designed so that the synthetically generated words are very similar to the original ones. To achieve this, a handwritten character data set is prepared using 10 blue and 10 black pens. These handwritten characters are used to create a synthetic word alteration data set. The presented approach uses a relatively small number of handwritten character images to create a huge word alteration data set. Further, deep learning models are trained on the synthetically generated data set for word alteration detection.
topic	Convolution neural network; document forensics; handwritten; ink analysis; synthetic data
url	https://ieeexplore.ieee.org/document/9354159/

Similar Items