How to make use of unlabeled observations in species distribution modeling using point process models

Abstract Species distribution modeling, which allows users to predict the spatial distribution of species with the use of environmental covariates, has become increasingly popular, with many software platforms providing tools to fit such models. However, the species observations used can have varyin...


Bibliographic Details
Main Authors: Emy Guilbault, Ian Renner, Michael Mahony, Eric Beh
Format: Article
Language: English
Published: Wiley 2021-05-01
Series: Ecology and Evolution
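Subjects: classification; ecological statistics; EM algorithm; machine learning; misidentification; mixture modeling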
Online Access: https://doi.org/10.1002/ece3.7411
id doaj-01476a3ea0a34ffe80e16927c740f4b2
record_format Article
spelling doaj-01476a3ea0a34ffe80e16927c740f4b2
2021-05-19T04:56:22Z
eng
Wiley
Ecology and Evolution
2045-7758
2021-05-01
Volume 11, Issue 10, pp. 5220-5243
10.1002/ece3.7411
How to make use of unlabeled observations in species distribution modeling using point process models
Emy Guilbault (0): Faculty of Science, School of Mathematical and Physical Sciences, The University of Newcastle, Callaghan, NSW, Australia
Ian Renner (1): Faculty of Science, School of Mathematical and Physical Sciences, The University of Newcastle, Callaghan, NSW, Australia
Michael Mahony (2): Faculty of Science, School of Environmental and Life Sciences, The University of Newcastle, Callaghan, NSW, Australia
Eric Beh (3): Faculty of Science, School of Mathematical and Physical Sciences, The University of Newcastle, Callaghan, NSW, Australia
https://doi.org/10.1002/ece3.7411
classification; ecological statistics; EM algorithm; machine learning; misidentification; mixture modeling
collection DOAJ
language English
format Article
sources DOAJ
author Emy Guilbault
Ian Renner
Michael Mahony
Eric Beh
title How to make use of unlabeled observations in species distribution modeling using point process models
publisher Wiley
series Ecology and Evolution
issn 2045-7758
publishDate 2021-05-01
description Abstract Species distribution modeling, which allows users to predict the spatial distribution of species with the use of environmental covariates, has become increasingly popular, with many software platforms providing tools to fit such models. However, the species observations used can have varying levels of quality and can have incomplete information, such as uncertain or unknown species identity. In this paper, we develop two algorithms to classify observations with unknown species identities which simultaneously predict several species distributions using spatial point processes. Through simulations, we compare the performance of these algorithms using 7 different initializations to the performance of models fitted using only the observations with known species identity. We show that performance varies with differences in correlation among species distributions, species abundance, and the proportion of observations with unknown species identities. Additionally, some of the methods developed here outperformed the models that did not use the misspecified data. We applied the best‐performing methods to a dataset of three frog species (Mixophyes). These models represent a helpful and promising tool for opportunistic surveys where misidentification is possible or for the distribution of species newly separated in their taxonomy.
topic classification
ecological statistics
EM algorithm
machine learning
misidentification
mixture modeling
url https://doi.org/10.1002/ece3.7411