Domain Heuristic Fusion of Multi-Word Embeddings for Nutrient Value Prediction

Being both a poison and a cure for many lifestyle and non-communicable diseases, food is becoming a prime focus of precision medicine. Monitoring a few groups of nutrients is crucial for some patients, and methods for easing their calculation are emerging. Our proposed machine le...

Bibliographic Details
Main Authors: Gordana Ispirova, Tome Eftimov, Barbara Koroušić Seljak
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Mathematics
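Subjects: domain-specific embeddings, domain knowledge, machine learning, data mining, macronutrient prediction, representation learning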
Online Access: https://www.mdpi.com/2227-7390/9/16/1941
id doaj-56b4b66ab47d4b508d6ceebcd795729f
record_format Article
spelling doaj-56b4b66ab47d4b508d6ceebcd795729f | 2021-08-26T14:02:20Z | eng | MDPI AG | Mathematics | 2227-7390 | 2021-08-01 | 9 | 1941 | 1941 | 10.3390/math9161941 | Gordana Ispirova [0] | Tome Eftimov [1] | Barbara Koroušić Seljak [2] | Computer Systems Department, Jožef Stefan Institute, 1000 Ljubljana, Slovenia [0, 1, 2]
collection DOAJ
language English
format Article
sources DOAJ
author Gordana Ispirova
Tome Eftimov
Barbara Koroušić Seljak
title Domain Heuristic Fusion of Multi-Word Embeddings for Nutrient Value Prediction
publisher MDPI AG
series Mathematics
issn 2227-7390
publishDate 2021-08-01
description Being both a poison and a cure for many lifestyle and non-communicable diseases, food is becoming a prime focus of precision medicine. Monitoring a few groups of nutrients is crucial for some patients, and methods for easing their calculation are emerging. Our proposed machine learning pipeline predicts nutrient values from learned vector representations of short texts, namely recipe names. In this study, we explored how the prediction results change when, instead of the vector representation of the recipe description, we use the embeddings of the list of ingredients. The nutrient content of a food depends on its ingredients; therefore, the ingredient text contains more relevant information. We define a domain-specific heuristic for merging the ingredient embeddings that incorporates the quantity of each ingredient, and we use the merged embeddings as features in machine learning models for nutrient prediction. The experiments indicate that predictions improve when the domain-specific heuristic is used. The models for protein prediction were highly effective, with accuracies of up to 97.98%. The domain-specific heuristic for combining multi-word embeddings yields better results than conventional merging heuristics, with up to 60% more accuracy in some cases.
topic domain-specific embeddings
domain knowledge
machine learning
data mining
macronutrient prediction
representation learning
url https://www.mdpi.com/2227-7390/9/16/1941
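
The description above contrasts a quantity-aware merge of ingredient embeddings with conventional merging. The record does not give the authors' formula, so the following is only a minimal sketch of one plausible reading: a quantity-weighted mean compared with a plain mean. The function names, the two-dimensional toy vectors, and the gram amounts are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import List, Sequence

Vector = List[float]


def average_merge(vectors: Sequence[Vector]) -> Vector:
    """Conventional merge: unweighted mean of the ingredient embeddings."""
    if not vectors:
        raise ValueError("need at least one embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def quantity_weighted_merge(
    vectors: Sequence[Vector], quantities_g: Sequence[float]
) -> Vector:
    """Assumed domain heuristic: mean weighted by ingredient quantity (grams)."""
    if not vectors or len(vectors) != len(quantities_g):
        raise ValueError("need one quantity per embedding")
    total = sum(quantities_g)
    if total <= 0:
        raise ValueError("total quantity must be positive")
    dim = len(vectors[0])
    return [
        sum(q * v[i] for v, q in zip(vectors, quantities_g)) / total
        for i in range(dim)
    ]


# Toy data (made up): real embeddings would come from a trained model.
toy_embeddings = {"chicken breast": [1.0, 0.0], "olive oil": [0.0, 1.0]}
recipe = [("chicken breast", 300.0), ("olive oil", 100.0)]

vecs = [toy_embeddings[name] for name, _ in recipe]
grams = [amount for _, amount in recipe]

plain = average_merge(vecs)                      # both ingredients count equally
weighted = quantity_weighted_merge(vecs, grams)  # chicken dominates by mass

print(plain)     # [0.5, 0.5]
print(weighted)  # [0.75, 0.25]
```

In this toy case the weighted vector leans toward the ingredient that makes up most of the recipe by mass, which is the intuition behind using quantities when the merged vector serves as the feature input to a nutrient prediction model.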