iSentenizer-μ: Multilingual Sentence Boundary Detection Model

A sentence boundary detection (SBD) system is normally quite sensitive to the genre of the data on which it is trained. Here, genre usually refers to shifts in text topic and to new language domains. Although new detection models can be retrained for different languages or new text genre...


Bibliographic Details
Main Authors: Derek F. Wong, Lidia S. Chao, Xiaodong Zeng
Format: Article
Language: English
Published: Hindawi Limited 2014-01-01
Series: The Scientific World Journal
Online Access: http://dx.doi.org/10.1155/2014/196574