Summary: | Master's === Yuan Ze University === Department of Information Management === 103 === Since the invention of the PC, the development of Internet and cloud technology has tied computer technology ever more closely to people's lives. Since 2012, "Big Data" has become one of the most closely watched new concepts.
Big Data, also called massive data, keeps growing in part because information is collected extensively from many sources, such as mobile devices, aerial sensing technology, online social media, and software logs. Through 2020, the volume of this data is expected to double every two years. Its importance, however, lies not in how much data there is but in how tools are used to find clues and trends across a variety of sources. 60% of respondents believed their organization could analyze more data in order to innovate and achieve differentiation, which is the real key to competition. The purpose of this thesis is to use Lucene as the basis for data indexing and search in a Java development environment. The resulting system indexes the data and calculates TF-IDF values, giving the number of occurrences and the weight of each word. It can also count keyword occurrences in the queried text and select emotion words that match given part-of-speech features, which analysts can then use for follow-up text cloud or emotion word analysis. The number of times a keyword appears along a timeline can be calculated through different APIs and used for predictive analysis.
|
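To make the TF-IDF weighting mentioned in the summary concrete, the following is a minimal sketch in plain Java. It assumes the textbook formulation (term frequency normalized by document length, multiplied by the natural log of the inverse document frequency), which is not necessarily the weighting that Lucene or the thesis applies. The class name `TfIdfSketch`, its methods, and the sample sentences are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; not taken from the thesis or from Lucene.
public class TfIdfSketch {

    // Split text into lowercase tokens on any run of non-letter, non-digit characters.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Count how many times each token occurs in one document.
    public static Map<String, Integer> termCounts(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : tokens) {
            counts.merge(t, 1, Integer::sum);
        }
        return counts;
    }

    // Number of documents in the corpus that contain the term at least once.
    public static int documentFrequency(String term, List<List<String>> corpus) {
        int df = 0;
        for (List<String> doc : corpus) {
            if (doc.contains(term)) {
                df++;
            }
        }
        return df;
    }

    // Assumed formula: (count / document length) * ln(N / df).
    // Returns 0.0 when the term is absent from the document or the corpus.
    public static double tfIdf(String term, List<String> doc, List<List<String>> corpus) {
        int count = termCounts(doc).getOrDefault(term, 0);
        int df = documentFrequency(term, corpus);
        if (count == 0 || df == 0) {
            return 0.0;
        }
        double tf = (double) count / doc.size();
        double idf = Math.log((double) corpus.size() / df);
        return tf * idf;
    }

    public static void main(String[] args) {
        // Made-up sample documents, used only to exercise the methods.
        List<List<String>> corpus = new ArrayList<>();
        corpus.add(tokenize("big data needs big tools"));
        corpus.add(tokenize("search tools index text"));
        corpus.add(tokenize("emotional words in text"));

        for (String term : new String[] {"big", "tools", "text"}) {
            System.out.println(term + " -> " + tfIdf(term, corpus.get(0), corpus));
        }
    }
}
```

In this formulation a word that occurs in every document gets a weight of zero, so common words drop out and words concentrated in a few documents rise to the top. A Lucene-based system would normally obtain term and document frequencies from the index rather than recount them per query as this sketch does.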