Summary: | Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 104 === In in-memory computing frameworks such as Apache Spark, objects (i.e., intermediate data) can be kept in main memory to speed up execution. When the main memory of a worker node cannot accommodate a newly computed or retrieved object, Apache Spark uses the Least Recently Used (LRU) eviction policy to free enough space. When an evicted object is needed later, it can be retrieved by re-computing it or by reading it from external storage devices. However, the cost of retrieving evicted objects can be large, because the intuitive LRU eviction policy considers only recency of use and because evicted objects are handled by a straightforward policy. In this thesis, we propose a cost-aware object management method for in-memory computing frameworks. When the main memory of a worker node cannot accommodate a newly computed or retrieved object, we first pick appropriate objects already held in main memory as eviction candidates, and then evict the objects with the minimum total creation cost and the maximum total occupied main memory space. According to the experimental results, the proposed method achieves this goal under different access scenarios (i.e., the 80/20 and 50/50 principles).
|
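The selection step described in the summary can be illustrated with a minimal sketch. The abstract does not give the thesis's actual procedure, so everything here is an assumption made for illustration: the `CachedObject` fields, the `choose_evictions` helper, the exhaustive search over candidate subsets, and the interpretation that the lowest total creation cost comes first with the largest freed memory as a tie-breaker.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class CachedObject:
    """Hypothetical record for an object held in a worker's main memory."""
    name: str
    size_mb: int          # main memory occupied by the object
    creation_cost: float  # assumed cost to re-compute or re-read the object


def choose_evictions(candidates: Sequence[CachedObject],
                     needed_mb: int) -> List[CachedObject]:
    """Pick the candidates to evict so that at least needed_mb is freed.

    Assumed objective: minimize the total creation cost of the evicted
    objects; among equally cheap choices, prefer the one freeing more memory.
    Exhaustive search, so only suitable for a small candidate set.
    """
    if needed_mb <= 0:
        return []  # nothing has to be evicted
    best: Optional[Tuple[CachedObject, ...]] = None
    best_key: Optional[Tuple[float, int]] = None
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            freed = sum(o.size_mb for o in subset)
            if freed < needed_mb:
                continue  # this subset does not free enough space
            cost = sum(o.creation_cost for o in subset)
            key = (cost, -freed)  # lower cost first, then more freed memory
            if best_key is None or key < best_key:
                best, best_key = subset, key
    # An empty list means even evicting every candidate is not enough.
    return list(best) if best is not None else []


if __name__ == "__main__":
    cache = [
        CachedObject("A", size_mb=40, creation_cost=9.0),
        CachedObject("B", size_mb=30, creation_cost=2.0),
        CachedObject("C", size_mb=30, creation_cost=3.0),
        CachedObject("D", size_mb=50, creation_cost=4.0),
    ]
    # A new object needs 60 MB; B and C free exactly 60 MB at total cost 5.0.
    print([o.name for o in choose_evictions(cache, needed_mb=60)])  # ['B', 'C']
```

By contrast, an LRU policy would choose victims purely by access recency, so it could evict the expensive object A even though cheaper objects would free the same space. The exhaustive search is exponential in the number of candidates; a real worker would need a cheaper heuristic, which is why the first step in the summary narrows the objects down to a candidate set.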