Summary: | Master's === National Chengchi University === Department of Applied Mathematics === 106 === This study explores the characteristics that underlie the popularity of Tang poems and aims to suggest a new research direction for the study of Tang poetry. First, we apply multivariate statistical methods, namely principal component analysis and factor analysis, to the data given in the book Ranking on Tang Poems. From the results we extract characteristics of the popularity of Tang poetry and compare modern and ancient reading preferences. Finally, we use word embedding techniques to assess the suitability of the characteristics extracted by principal component analysis and factor analysis.
Applied to the data in Ranking on Tang Poems, principal component analysis suggests two characteristics: time difference and poem integrity. “Time difference” refers to the idea that each era has its own pre-understanding and therefore its own aesthetic standard, so ancient and modern readers appreciate poems differently. “Poem integrity” refers to the fact that a poem is selected either in complete form or in partial form, according to editorial requirements.
Based on factor analysis, we identify two factors that may influence the popularity of Tang poetry: history-related strength and poetic classicism. “History-related strength” means that the preferences of ancient and modern readers may be influenced by the history-related strength of a poem. “Poetic classicism” means that, from an academic perspective, a poem can be regarded as leading a school of thought.
Using word embedding techniques to measure the textual similarity of poems, we find that the first principal component and each of the two factors have a significant rank correlation with textual similarity to the top-ranking poems under the corresponding component or factor.
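The rank-correlation check described above can be illustrated with a minimal sketch. It is not the thesis's actual procedure: the centroid-of-top-k similarity measure, the function names, and the toy embeddings and scores are all assumptions made for illustration. The sketch ranks poems by a component or factor score, averages the embeddings of the top k poems, measures each poem's cosine similarity to that centroid, and computes Spearman's rank correlation between scores and similarities.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def similarity_to_top(embeddings, scores, k):
    """Cosine similarity of each poem to the centroid of the k top-scoring poems.

    Assumption: this centroid-based measure is a stand-in; the source does not
    say how textual similarity to the top-ranking poems was computed.
    """
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    dim = len(embeddings[0])
    centroid = [sum(embeddings[i][d] for i in top) / k for d in range(dim)]
    return [cosine(e, centroid) for e in embeddings]

# Hypothetical toy data: five poems with 2-dimensional embeddings and
# made-up first-principal-component scores.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [0.0, 1.0]]
scores = [2.1, 1.7, 0.3, -1.2, -2.0]

sims = similarity_to_top(embeddings, scores, k=2)
rho = spearman(scores, sims)
print(f"Spearman rho = {rho:.3f}")
```

The sketch computes only the correlation coefficient. Judging significance, as the abstract does, would additionally require a p-value, which a routine such as `scipy.stats.spearmanr` returns alongside the coefficient.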
|