One of the main concerns of today's data centers is to maximize server utilization. In each server processor, multiple applications are executed concurrently, increasing resource efficiency. However, performance and fairness highly depend on the share of resources that each application receives, leading to performance unpredictability. The rising number of cores (and running applications) with every new generation of processors is leading to a growing concern for interference at the shared resources. This thesis focuses on addressing resource interference when different applications are consolidated on the same server processor from two main perspectives: high-performance computing (HPC) and cloud computing.
In the context of HPC, resource management approaches are proposed to reduce inter-application interference at two major critical resources: the last level cache (LLC) and the processor cores. The LLC plays a key role in the system performance of current multi-cores by reducing the number of long-latency main memory accesses. LLC partitioning approaches are proposed for both inclusive and non-inclusive LLCs, as both designs are present in current server processors. In both cases, newly problematic LLC behaviors are identified and efficiently detected, granting a larger cache share to those applications that use best the LLC space. As for processor cores, many parallel applications, like graph applications, do not scale well with an increasing number of cores. Moreover, the default Linux time-sharing scheduler performs poorly when running graph applications, which process vast amounts of data. To maximize system utilization, this thesis proposes to co-locate multiple graph applications on the same server processor by assigning the optimal number of cores to each one, dynamically adapting the number of threads spawned by the running applications.
When studying the impact of system-shared resources on cloud computing, this thesis addresses three major challenges: the complex infrastructure of cloud systems, the nature of cloud applications, and the impact of inter-VM interference. Firstly, this thesis presents the experimental platform developed to perform representative cloud studies with the main cloud system components (hardware and software). Secondly, an extensive characterization study is presented on a set of representative latency-critical workloads which must meet strict quality of service (QoS) requirements. The aim of the studies is to outline issues cloud providers should consider to improve performance and resource utilization. Finally, we propose an online approach that detects and accurately estimates inter-VM interference when co-locating multiple latency-critical VMs. The approach relies on metrics that can be easily monitored in the public cloud as VMs are handled as ``black boxes''. The research described above is carried out following the restrictions and requirements to be applicable to public cloud production systems.
In summary, this thesis addresses contention in the main system shared resources in the context of server consolidation, both in HPC and cloud computing. Experimental results show that important gains are obtained over the Linux OS scheduler by reducing interference. In inclusive LLCs, turnaround time (TT) is reduced by over 40% while improving IPC by more than 3%. In non-inclusive LLCs, fairness and TT are improved by 44% and 24%, respectively, while improving performance by up to 3.5%. By distributing core resources efficiently, almost perfect fairness can be obtained (94%), and TT can be reduced by up to 80%. In cloud computing, performance degradation due to resource contention can be estimated with an overall prediction error of 5%. All the approaches proposed in this thesis have been designed to be applied in commercial server processors without requiring any prior information, making decisions dynamically with data collected from hardware performance counters.
© 2001-2024 Fundación Dialnet · Todos los derechos reservados