
Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12008/53706
Title: A synchronization-free incomplete LU factorization for GPUs with level-set analysis
Authors: Freire, Manuel
Dufrechou, Ernesto
Ezzatti, Pablo
Type: Preprint
Keywords: ILU, CSR, Synchronization Free
Publication date: 2025
Abstract: Incomplete factorization methods are powerful algebraic preconditioners widely used to accelerate the convergence of linear solvers. The parallelization of ILU methods has been extensively studied, particularly for GPUs, which are ubiquitous parallel computing devices. In recent years, synchronization-free methods have become the mainstream approach for solving sparse triangular linear systems. Although the sparse triangular solver and the ILU factorization are closely related, synchronization-free strategies have been explored far less for ILU factorization than for the triangular solver. In this work, we present synchronization-free implementations of the ILU-0 preconditioner on GPUs. Specifically, we propose three implementations that differ in how row updates are handled after each coefficient elimination, as well as an additional approach that leverages a prior level-set analysis to optimize the execution schedule. [...] Unlike the LU decomposition, which computes the full factorization of A, ILU performs only an incomplete factorization, discarding certain fill-ins that would otherwise appear in L and U. This approach preserves sparsity in the factors, helping to control memory usage and computational cost. ILU is a widely used algebraic preconditioner and is often chosen when no further information about the problem is available. However, ILU factorizations can be computationally expensive, especially for large sparse matrices, partly because ILU parallelism is limited by serial dependencies in the Gaussian elimination sequence. To address this, various efforts have been made to parallelize ILU on GPUs, including approaches based on level-set analysis [3], [4], graph coloring [5], and iterative methods [6], [7].
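To make the two ideas in the abstract concrete, the following is a minimal sequential sketch in Python, not the authors' GPU implementation. It assumes a CSR matrix with sorted column indices and a nonzero diagonal entry in every row; the function names `ilu0_csr` and `level_sets` are hypothetical. The first function performs ILU(0), discarding any update that would fall outside the sparsity pattern of A; the second assigns each row a level so that rows in the same level have no mutual dependencies.

```python
def ilu0_csr(row_ptr, col_idx, vals):
    """ILU(0) stored on A's pattern: strict lower part holds L (unit diagonal
    implied), diagonal and upper part hold U. Assumes sorted columns and a
    stored diagonal entry in every row."""
    n = len(row_ptr) - 1
    lu = list(vals)
    # Locate the diagonal entry of each row.
    diag = [0] * n
    for i in range(n):
        for p in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[p] == i:
                diag[i] = p
    for i in range(n):
        # Column -> position map, so updates only touch existing entries.
        pos = {col_idx[p]: p for p in range(row_ptr[i], row_ptr[i + 1])}
        for p in range(row_ptr[i], row_ptr[i + 1]):
            k = col_idx[p]
            if k >= i:
                break  # remaining entries belong to U
            lu[p] /= lu[diag[k]]  # multiplier l_ik
            for q in range(diag[k] + 1, row_ptr[k + 1]):
                j = col_idx[q]
                if j in pos:  # otherwise it would be fill-in: discard
                    lu[pos[j]] -= lu[p] * lu[q]
    return lu


def level_sets(row_ptr, col_idx):
    """Level of each row: 0 if it depends on no earlier row, otherwise
    1 + the highest level among the rows it depends on."""
    n = len(row_ptr) - 1
    level = [0] * n
    for i in range(n):
        for p in range(row_ptr[i], row_ptr[i + 1]):
            k = col_idx[p]
            if k < i:
                level[i] = max(level[i], level[k] + 1)
    return level


# Arrow matrix [[4,1,1],[1,4,0],[1,0,4]]: eliminating row 0 would create
# fill-in at positions (1,2) and (2,1), which ILU(0) drops.
row_ptr = [0, 3, 5, 7]
col_idx = [0, 1, 2, 0, 1, 0, 2]
vals = [4.0, 1.0, 1.0, 1.0, 4.0, 1.0, 4.0]
factors = ilu0_csr(row_ptr, col_idx, vals)
levels = level_sets(row_ptr, col_idx)
```

In this small example rows 1 and 2 both depend only on row 0, so they share a level and could be factorized concurrently. A level-set schedule exploits exactly this kind of independence, whereas a synchronization-free scheme would instead let each row wait on its own dependencies.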
Funders: FCE_3_2022_1_172419 - MODELAR: Modelado del desempeñO de métoDos numÉricos en pLataformas de hArdware heteRogéneas.
Citation: Freire, M., Dufrechou, E. and Ezzatti, P. A synchronization-free incomplete LU factorization for GPUs with level-set analysis [Preprint]. Published in: 33rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), Turin, Italy, 2025, pp. 217-225, DOI: 10.1109/PDP66500.2025.00037.
License: Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND 4.0)
Appears in collections: Publicaciones académicas y científicas - Instituto de Computación

Files in this item:
File        Description   Size        Format
FDE25.pdf   Preprint      441.91 kB   Adobe PDF

