Resumen de Managing dynamic non-uiform cache architectures

Ayuda

Resumen de Managing dynamic non-uiform cache architectures

Javier Lira Rueda

Researchers from both academia and industry agree that future CMPs will accommodate large shared on-chip last-level caches. However, the exponential increase in multicore processor cache sizes accompanied by growing on-chip wire delays make it difficult to implement traditional caches with a single, uniform access latency. Non-Uniform Cache Access (NUCA) designs have been proposed to address this situation. A NUCA cache divides the whole cache memory into smaller banks that are distributed along the chip and can be accessed independently. Response time in NUCA caches does not only depend on the latency of the actual bank, but also on the time required to reach the bank that has the requested data and to send it to the core. So, the NUCA cache allows those banks that are located next to the cores to have lower access latencies than the banks that are further away, thus mitigating the effects of the cache’s internal wires. These cache architectures have been traditionally classified based on their placement decisions as static (S-NUCA) or dynamic (DNUCA). In this thesis, we have focused on D-NUCA as it exploits the dynamic features that NUCA caches offer, like data migration. The flexibility that D-NUCA provides, however, raises new challenges that hardens the management of this kind of cache architectures in CMP systems. We have identified these new challenges and tackled them from the point of view of the four NUCA policies: replacement, access, placement and migration. First, we focus on the challenges introduced by the replacement policy in D-NUCA. Data migration makes most frequently accessed data blocks to be concentrated on the banks that are closer to the processors. This creates big differences in the average usage rate of the NUCA banks, being the banks that are close to the processors the most accessed banks, while the banks that are further away are not accessed so often. Upon a replacement in a particular bank of the NUCA cache, the probabilities of the evicted data block to be reused by the program will differ if its last location in the NUCA cache was a bank that are close to the processors, or not. The decentralized nature of NUCA, however, prevents a NUCA bank from knowing that other bank is constantly evicting data blocks that are later being reused. We propose three different techniques to dealwith the replacement policy, being The Auction the most successful one. Then, we deal with the challenges in the access policy. As data blocks can be mapped in multiple banks within the NUCA cache. Finding the requesting data in a D-NUCA cache is a difficult task. In addition, data can freely move between these banks, thus the search scheme must look up all banks where the requesting data block can be mapped to ascertain if it is in the NUCA cache, or not. We have proposed HK-NUCA. This is a search scheme that uses home knowledge to effectively reduce the average number of messages introduced to the on-chip network to satisfy a memory request. With regard to the placement policy, this thesis shows the implementation of a hybrid NUCA cache. We have proposed a novel placement policy that accomodates both memory technologies, SRAM and eDRAM, in a single NUCA cache. Finally, in order to deal with the migration policy in D-NUCA caches, we propose The Migration Prefetcher. This is a technique that anticipates data migrations. Summarizing, in this thesis we propose different techniques to efficiently manage future D-NUCA cache architectures on CMPs. We demonstrate the effectivity of our techniques to deal with the challenges introduced by D-NUCA caches. Our techniques outperform existing solutions in the literature, and are in most cases more energy efficient.

Acceso de usuarios registrados

¿Olvidó su contraseña?

¿Es nuevo? Regístrese

Ventajas de registrarse

Dialnet Plus

Coordinado por: