The ever-growing data sets processed on HPC platforms raise major challenges for the underlying storage layer. A promising alternative to traditional file-based storage systems is the simpler blob (binary large object) model. Blobs offer lower overhead and better performance, at the cost of dropping largely unused features such as file hierarchies or permissions. Similarly, blobs are increasingly considered as a replacement for distributed file systems in Big Data Analytics (BDA), or as a base for storage abstractions such as key-value stores or time-series databases. From these observations, we argue that blobs provide a solid storage model for convergence between HPC and BDA platforms. We identify data consistency as a hard problem in this context because the two communities have made different choices: while BDA developers typically rely on the storage system to coordinate data access, the lack of such semantics on HPC platforms requires developers to use application-level tools for this task. In this thesis we propose the key design principles of Týr, a converging storage system designed to answer the needs of both HPC and BDA applications, natively offering data access coordination in the form of transactions. We demonstrate the relevance and efficiency of its design for convergence in multiple application contexts from both communities. These experiments validate that Týr delivers on its promise of high throughput and versatility, hence fueling storage-based convergence between HPC and BDA.