A Pseudo-Value Approach to Analyze the Semantic Similarity of the Speech of Children With and Without Autism Spectrum Disorder

Conversational impairments are well known among people with autism spectrum disorder (ASD), but their measurement requires time-consuming manual annotation of language samples. Natural language processing (NLP) has shown promise in identifying semantic difficulties when compared to clinician-annotat...


Bibliographic Details
Main Authors: Joel R. Adams, Alexandra C. Salem, Heather MacFarlane, Rosemary Ingham, Steven D. Bedrick, Eric Fombonne, Jill K. Dolata, Alison Presmanes Hill, Jan van Santen
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-07-01
Series: Frontiers in Psychology
Subjects: autism; language disorder; semantics; natural language processing; child; statistical methods
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.668344/full
id doaj-19e21d37ea5d4af2a417a271bd64e602
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Joel R. Adams
Alexandra C. Salem
Heather MacFarlane
Rosemary Ingham
Steven D. Bedrick
Eric Fombonne
Jill K. Dolata
Alison Presmanes Hill
Jan van Santen
title A Pseudo-Value Approach to Analyze the Semantic Similarity of the Speech of Children With and Without Autism Spectrum Disorder
publisher Frontiers Media S.A.
series Frontiers in Psychology
issn 1664-1078
publishDate 2021-07-01
description Conversational impairments are well known among people with autism spectrum disorder (ASD), but their measurement requires time-consuming manual annotation of language samples. Natural language processing (NLP) has shown promise in identifying semantic difficulties when compared to clinician-annotated reference transcripts. Our goal was to develop a novel measure of lexico-semantic similarity, based on recent work in NLP and recent applications of pseudo-value analysis, that could be applied to transcripts of children’s conversational language without recourse to a ground-truth reference document. We hypothesized that (a) semantic coherence, as measured by this method, would discriminate between children with and without ASD and (b) more variability would be found in the group with ASD. We used data from 70 males aged 4 to 8 years, with ASD (N = 38) or typically developing (TD; N = 32), who were enrolled in a language study. Participants were administered a battery of standardized diagnostic tests, including the Autism Diagnostic Observation Schedule (ADOS). The ADOS was recorded and transcribed, and we analyzed children’s language output during the conversation/interview ADOS tasks. Transcripts were converted to vectors via a word2vec model trained on the Google News Corpus. Pairwise similarity across all subjects and a sample grand mean were calculated. Using a leave-one-out algorithm, a pseudo-value representing each subject’s contribution to the grand mean was generated. Means of pseudo-values were compared between the two groups. Analyses were co-varied for nonverbal IQ, mean length of utterance, and number of distinct word roots (NDR). Statistically significant differences were observed in means of pseudo-values between the TD and ASD groups (p = 0.007). TD subjects had higher pseudo-value scores, suggesting that the similarity scores of TD subjects were closer to the overall group mean. Variance of pseudo-values was greater in the ASD group. Nonverbal IQ, mean length of utterance, and NDR did not account for the between-group differences. The findings suggest that our pseudo-value-based method can be effectively used to identify specific semantic difficulties that characterize children with ASD without requiring a reference transcript.
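The leave-one-out step summarized in the description can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes each transcript has already been reduced to a single vector (for example by averaging word2vec embeddings, which this record does not specify), that similarity is cosine similarity, and that the pseudo-value takes the standard jackknife form n·θ − (n−1)·θ₋ᵢ. The function names are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between row vectors (one row per subject)."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


def grand_mean(sim):
    """Mean similarity over all distinct subject pairs (upper triangle, no diagonal)."""
    upper = np.triu_indices(sim.shape[0], k=1)
    return sim[upper].mean()


def pseudo_values(sim):
    """Jackknife pseudo-value per subject (assumed form, not taken from the record).

    pv[i] = n * theta - (n - 1) * theta_without_i, where theta is the grand
    mean of pairwise similarity and theta_without_i is the same statistic
    recomputed with subject i left out.
    """
    n = sim.shape[0]
    theta = grand_mean(sim)
    pv = np.empty(n)
    for i in range(n):
        keep = np.delete(np.arange(n), i)  # indices of all subjects except i
        theta_without_i = grand_mean(sim[np.ix_(keep, keep)])
        pv[i] = n * theta - (n - 1) * theta_without_i
    return pv


if __name__ == "__main__":
    # Toy data: three similar "transcripts" and one that points elsewhere.
    toy = np.array([[1.0, 0.0], [1.0, 0.1], [1.0, -0.1], [0.0, 1.0]])
    print(pseudo_values(cosine_similarity_matrix(toy)))
```

Under these assumptions, a subject whose transcript vector lies far from the others lowers the grand mean, so leaving that subject out raises it and the subject's pseudo-value comes out low. Group means and variances of the pseudo-values could then be compared with any standard two-sample procedure.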
topic autism
language disorder
semantics
natural language processing
child
statistical methods
url https://www.frontiersin.org/articles/10.3389/fpsyg.2021.668344/full
spelling doaj-19e21d37ea5d4af2a417a271bd64e602
Timestamp: 2021-07-21T15:25:46Z
Volume: 12
DOI: 10.3389/fpsyg.2021.668344
Article number: 668344
Author affiliations:
Joel R. Adams: Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, United States
Alexandra C. Salem: Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, United States; Department of Psychiatry, Oregon Health & Science University, Portland, OR, United States
Heather MacFarlane: Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, United States; Department of Psychiatry, Oregon Health & Science University, Portland, OR, United States
Rosemary Ingham: Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, United States
Steven D. Bedrick: Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, United States
Eric Fombonne: Department of Psychiatry, Oregon Health & Science University, Portland, OR, United States; Institute on Development and Disability, Oregon Health & Science University, Portland, OR, United States
Jill K. Dolata: Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, United States; Institute on Development and Disability, Oregon Health & Science University, Portland, OR, United States
Alison Presmanes Hill: Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, United States
Jan van Santen: BioSpeech Inc., Portland, OR, United States