Data extraction of digitized old newspaper content to streamline the search process for users with a genealogy perspective

This thesis presents the extraction of data from digitized old newspapers and the implementation of a search function that simplifies searching for the user. It was developed as a master’s degree project at Linköping University. The application allows the user to search for content of interest in a database of...

Bibliographic Details
Main Author: Pettersson, Sandra
Format: Others
Language: English
Published: Linköpings universitet, Medie- och Informationsteknik 2019
Subjects: genealogy; MySQL database; newspapers; Media and Communication Technology; Medieteknik
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160533
id ndltd-UPSALLA1-oai-DiVA.org-liu-160533
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-liu-160533
2019-09-26T04:20:42Z
Data extraction of digitized old newspaper content to streamline the search process for users with a genealogy perspective
eng
Pettersson, Sandra
Linköpings universitet, Medie- och Informationsteknik
Linköpings universitet, Tekniska fakulteten
2019
genealogy
MySQL database
newspapers
Media and Communication Technology
Medieteknik
This thesis presents the extraction of data from digitized old newspapers and the implementation of a search function that simplifies searching for the user. It was developed as a master’s degree project at Linköping University. The application allows the user to search for content of interest in a database of articles and can be used by genealogists, local historians and novices alike. The database is filled with data from OCR-scanned newspapers, and users can search it either on their own or with the help of their family tree. The family tree is implemented by reading the user's GEDCOM file and extracting useful information, which is then used to improve the search results. The results are returned to the user in the form of digital articles. The work concludes that the information in GEDCOM files can be used to find new and interesting facts, and that the user should be allowed to influence how the data is reduced, through article categorization and filtering.
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160533
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic genealogy
MySQL database
newspapers
Media and Communication Technology
Medieteknik
description This thesis presents the extraction of data from digitized old newspapers and the implementation of a search function that simplifies searching for the user. It was developed as a master’s degree project at Linköping University. The application allows the user to search for content of interest in a database of articles and can be used by genealogists, local historians and novices alike. The database is filled with data from OCR-scanned newspapers, and users can search it either on their own or with the help of their family tree. The family tree is implemented by reading the user's GEDCOM file and extracting useful information, which is then used to improve the search results. The results are returned to the user in the form of digital articles. The work concludes that the information in GEDCOM files can be used to find new and interesting facts, and that the user should be allowed to influence how the data is reduced, through article categorization and filtering.
author Pettersson, Sandra
title Data extraction of digitized old newspaper content to streamline the search process for users with a genealogy perspective
publisher Linköpings universitet, Medie- och Informationsteknik
publishDate 2019
url http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160533
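
The description above mentions reading the user's GEDCOM file and extracting information that is then used to improve search results. The record does not say how the thesis does this, so the following is only a minimal sketch of the general idea, assuming the standard GEDCOM line layout (`LEVEL [@XREF@] TAG [VALUE]`). The names `Person`, `parse_gedcom`, `search_terms`, the `margin` parameter and the sample data are all hypothetical and not taken from the thesis.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Made-up sample data, used only to illustrate the sketch.
SAMPLE_GEDCOM = """0 HEAD
0 @I1@ INDI
1 NAME Anna /Karlsson/
1 BIRT
2 DATE 12 MAR 1850
2 PLAC Linköping
1 DEAT
2 DATE 1921
0 @F1@ FAM
1 HUSB @I2@
0 TRLR"""

YEAR = re.compile(r"\b(\d{4})\b")


@dataclass
class Person:
    """Hypothetical container for the fields this sketch extracts."""
    name: str = ""
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    birth_place: str = ""


def parse_gedcom(text: str) -> List[Person]:
    """Collect name, birth and death data for every INDI record."""
    people: List[Person] = []
    current: Optional[Person] = None
    event: Optional[str] = None  # the level-1 event we are inside, if any
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        level = int(parts[0])
        if level == 0:
            # "0 @I1@ INDI" opens an individual; any other record closes it.
            is_indi = len(parts) == 3 and parts[2].strip() == "INDI"
            current = Person() if is_indi else None
            if current is not None:
                people.append(current)
            event = None
            continue
        if current is None:
            continue
        tag = parts[1]
        value = parts[2].strip() if len(parts) == 3 else ""
        if level == 1:
            event = tag if tag in ("BIRT", "DEAT") else None
            if tag == "NAME" and not current.name:
                # GEDCOM wraps the surname in slashes: "Anna /Karlsson/".
                current.name = " ".join(value.replace("/", " ").split())
        elif level == 2 and event is not None:
            if tag == "DATE":
                match = YEAR.search(value)
                if match and event == "BIRT":
                    current.birth_year = int(match.group(1))
                elif match:
                    current.death_year = int(match.group(1))
            elif tag == "PLAC" and event == "BIRT":
                current.birth_place = value
    return people


def search_terms(person: Person, margin: int = 5) -> dict:
    """Build a hypothetical search filter: a name plus a widened year range."""
    year_from = person.birth_year - margin if person.birth_year is not None else None
    year_to = person.death_year + margin if person.death_year is not None else None
    return {
        "name": person.name,
        "place": person.birth_place,
        "year_from": year_from,
        "year_to": year_to,
    }
```

Calling `search_terms(parse_gedcom(SAMPLE_GEDCOM)[0])` yields a small dictionary that could, for example, be passed as parameters to a query against an article database. Widening the year range by `margin` is an illustrative design choice here, on the assumption that newspaper mentions may fall somewhat outside the exact birth and death years.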