Extracting Transaction Information from Financial Press Releases

The use cases of Information Extraction (IE) are more or less endless, often consisting of a combination of Named Entity Recognition (NER) and Relation Extraction (RE). One use case of IE is the extraction of transaction information from Norwegian insider transaction Press Releases (PRs), where a tr...

Bibliographic Details
Main Author:	Sjöberg, Agaton
Format:	Others
Language:	English
Published:	Linköpings universitet, Artificiell intelligens och integrerade datorsystem 2021
Subjects:	Natural Language Processing Information Extraction Named Entity Recognition Relation Extraction Latent Structure Refinement Financial Press Release Insider Transaction Language Technology (Computational Linguistics) Språkteknologi (språkvetenskaplig databehandling)
Online Access:	http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177688

id	ndltd-UPSALLA1-oai-DiVA.org-liu-177688
record_format	oai_dc
spelling	ndltd-UPSALLA1-oai-DiVA.org-liu-177688; 2021-07-05T05:23:09Z; Extracting Transaction Information from Financial Press Releases; eng; Extrahering av Transaktionsdata från Finansiella Pressmeddelanden; Sjöberg, Agaton; Linköpings universitet, Artificiell intelligens och integrerade datorsystem; 2021; Natural Language Processing; Information Extraction; Named Entity Recognition; Relation Extraction; Latent Structure Refinement; Financial Press Release; Insider Transaction; Language Technology (Computational Linguistics); Språkteknologi (språkvetenskaplig databehandling); Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177688; doi:21/039; application/pdf; info:eu-repo/semantics/openAccess
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Natural Language Processing Information Extraction Named Entity Recognition Relation Extraction Latent Structure Refinement Financial Press Release Insider Transaction Language Technology (Computational Linguistics) Språkteknologi (språkvetenskaplig databehandling)
description	The use cases of Information Extraction (IE) are more or less endless, often consisting of a combination of Named Entity Recognition (NER) and Relation Extraction (RE). One use case of IE is the extraction of transaction information from Norwegian insider transaction Press Releases (PRs), where a transaction consists of at most four entities: the name of the owner performing the transaction, the number of shares transferred, the transaction date, and the price of the shares bought or sold. The relationships between the entities define which entity belongs to which transaction, and whether shares were bought or sold. This report investigates how a pair of supervised NER and RE models extracts this information. Because these Norwegian PRs were not labeled, two approaches to annotating the transaction entities and their associated relations were compared; annotating only the entities that occur in a relation proved better than annotating all occurrences. The report also investigates how many PRs are needed to achieve a satisfactory result in the IE pipeline: training on about 400 PRs is sufficient for the results to converge, at an F1-score of around 0.85. Finally, the report shows that there is little difference between a complex RE model and a simple rule-based approach when applied to the studied corpus.
author	Sjöberg, Agaton
title	Extracting Transaction Information from Financial Press Releases
publisher	Linköpings universitet, Artificiell intelligens och integrerade datorsystem
publishDate	2021
url	http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177688

Similar Items