Summary: | In this thesis, we show how text-based recommendation systems can benefit greatly from neural statistical language models, in particular BERT. We evaluate the framework on a digital, collaborative platform for radiologists by automatically suggesting scientific papers from the medical database PubMed as evidence in diagnostic radiology. The models represent text with contextualized vectors, accounting for writing style, misspellings, and jargon. By using pre-computed representations of text passages, we are able to use compute-heavy statistical language models in production environments, where supercomputers are not available during inference. The results suggest that pre-computed embeddings are very effective when the texts come from the same domain, and less effective (but still useful) at capturing the interaction between clinical and scientific text. Nonetheless, the suggested solutions hold promise in this and other areas of medicine. The results may also transfer to other domains, such as the processing of legal documents and patent search.
|