Towards multiprogrammed GPUs

  • Author: Ivan Tanasić
  • Thesis supervisors: Josep Ignasi Navarro Mas (supervisor), Isaac Gelado Fernández (co-supervisor), Eduard Ayguadé Parra (co-supervisor)
  • Defended at: Universitat Politècnica de Catalunya (UPC), Spain, in 2017
  • Language: Spanish
  • Thesis committee: Wen Mei Hwu (chair), Antonio José Peña Monferrer (secretary), Abraham Mendelson (member)
  • Full text not available
  • Abstract:
      Programmable Graphics Processing Units (GPUs) have recently become the most pervasive massively parallel processors. They have come a long way, from fixed-function ASICs designed to accelerate graphics tasks to a programmable architecture that can also execute general-purpose computations. Because of their performance and efficiency, an increasing amount of software relies on them to accelerate data-parallel and computationally intensive sections of code. They have earned a place in many systems, from low-power mobile devices to the biggest data centers in the world. However, GPUs are still plagued by the fact that they essentially have no multiprogramming support, resulting in low system performance if the GPU is shared among multiple programs. In this dissertation we set out to provide rich GPU multiprogramming support by improving the multitasking capabilities and increasing the virtual memory functionality and performance.

      The main issue hindering multitasking support in GPUs is the non-preemptive execution of GPU kernels. Here we propose two preemption mechanisms with different design philosophies that can be used by a scheduler to preempt execution on GPU cores and make room for another process. We also argue for spatial sharing of the GPU and propose a concrete hardware scheduler implementation that dynamically partitions the GPU cores among running kernels according to their set priorities (see the first sketch below). Contrary to the assumptions made in related work, we demonstrate that preemptive execution is feasible and is the desired approach to GPU multitasking. We further show improved system fairness and responsiveness with our scheduling policy.

      We also pinpoint that at the core of the insufficient virtual memory support lies the exception handling mechanism used by modern GPUs. Currently, GPUs offload the actual exception handling work to the CPU, while the faulting instruction is stalled in the GPU core (see the second sketch below). This stall-on-fault model prevents some virtual memory features and optimizations, and it is especially harmful in multiprogrammed environments because it prevents context switching the GPU unless all in-flight faults are resolved. In this dissertation, we propose three GPU core organizations with varying performance-complexity trade-offs that get rid of stall-on-fault execution and enable preemptible exceptions on the GPU (i.e., the faulting instruction can be squashed and restarted later). Building on this support, we implement two use cases and demonstrate their utility. One is a scheme that context switches the faulted threads and tries to find other useful work to do in the meantime, hiding the latency of the fault and improving system performance. The other lets the fault handling code run locally, on the GPU, instead of relying on CPU offloading, and we show that local fault handling can also improve performance.
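
The priority-driven spatial partitioning described in the second paragraph of the abstract is a hardware scheduler, so it cannot be reproduced as code here. As a hedged point of reference only, the host-side CUDA sketch below shows the far weaker facility that stock CUDA already exposes, stream priorities within a single process: pending blocks of the high-priority kernel are favored, but running blocks are never preempted and no partitioning of cores between processes is guaranteed. The kernel name busy_kernel and all sizes are made up for illustration; this is not code from the thesis.

    // Illustrative only -- existing CUDA stream priorities, not the thesis's
    // hardware scheduler. A numerically lower priority value is a higher priority.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void busy_kernel(float *data, int n) {
        // Keep the cores busy so the effect of priorities is visible.
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            for (int k = 0; k < 1000; ++k)
                data[i] = data[i] * 1.0001f + 1.0f;
    }

    int main() {
        int least, greatest;  // valid priority range reported by the device
        cudaDeviceGetStreamPriorityRange(&least, &greatest);

        cudaStream_t high_prio, low_prio;
        cudaStreamCreateWithPriority(&high_prio, cudaStreamNonBlocking, greatest);
        cudaStreamCreateWithPriority(&low_prio,  cudaStreamNonBlocking, least);

        const int n = 1 << 20;
        float *a, *b;
        cudaMalloc((void **)&a, n * sizeof(float));
        cudaMalloc((void **)&b, n * sizeof(float));

        // Blocks from the high-priority stream are scheduled ahead of pending
        // low-priority blocks, but already-running blocks are never preempted
        // and there is no spatial guarantee -- the gap the dissertation targets.
        busy_kernel<<<(n + 255) / 256, 256, 0, low_prio>>>(a, n);
        busy_kernel<<<(n + 255) / 256, 256, 0, high_prio>>>(b, n);

        cudaDeviceSynchronize();
        printf("both kernels finished\n");
        cudaFree(a); cudaFree(b);
        cudaStreamDestroy(high_prio); cudaStreamDestroy(low_prio);
        return 0;
    }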

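The stall-on-fault model criticized in the last paragraph of the abstract can be observed with today's CUDA unified memory. The sketch below, again only an illustration and not code from the thesis, allocates demand-paged managed memory; on GPUs with hardware page-fault support, the first device access to a host-resident page raises a GPU page fault that the driver services on the CPU while the faulting access is stalled in the core, exactly the model the proposed core organizations replace with squash-and-replay so that the core can context switch or handle the fault locally.

    // Illustrative only -- unified-memory demand paging exercises the
    // CPU-offloaded, stall-on-fault exception handling described above.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void increment(int *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] += 1;  // first GPU touch of a host-resident page may fault
    }

    int main() {
        const int n = 1 << 20;
        int *data;
        cudaMallocManaged((void **)&data, n * sizeof(int));  // demand-paged

        // Initialize on the CPU so the pages start out resident on the host.
        for (int i = 0; i < n; ++i)
            data[i] = i;

        // The GPU faults on these pages; the driver migrates them from the CPU
        // while the faulting warps stall, then execution resumes.
        increment<<<(n + 255) / 256, 256>>>(data, n);
        cudaDeviceSynchronize();

        printf("data[0] = %d (expected 1)\n", data[0]);
        cudaFree(data);
        return 0;
    }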
