Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects E Agullo, J Demmel, J Dongarra, B Hadri, J Kurzak, J Langou, H Ltaief, ... Journal of Physics: Conference Series 180 (1), 012037, 2009 | 563 | 2009 |

Dense linear algebra solvers for multicore with GPU accelerators S Tomov, R Nath, H Ltaief, J Dongarra 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 318 | 2010 |

Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA G Bosilca, A Bouteiller, A Danalis, M Faverge, A Haidar, T Herault, ... 2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011 | 227* | 2011 |

A hybridization methodology for high-performance linear algebra software for GPUs E Agullo, C Augonnet, J Dongarra, H Ltaief, R Namyst, S Thibault, ... GPU Computing Gems Jade Edition, 473-484, 2012 | 146 | 2012 |

Scheduling dense linear algebra operations on multicore processors J Kurzak, H Ltaief, J Dongarra, RM Badia Concurrency and Computation: Practice and Experience 22 (1), 15-44, 2010 | 145 | 2010 |

QR factorization on a multicore node enhanced with multiple GPU accelerators E Agullo, C Augonnet, J Dongarra, M Faverge, H Ltaief, S Thibault, ... 2011 IEEE International Parallel & Distributed Processing Symposium, 932-943, 2011 | 141 | 2011 |

Trends in data locality abstractions for HPC systems D Unat, A Dubey, T Hoefler, J Shalf, M Abraham, M Bianco, ... IEEE Transactions on Parallel and Distributed Systems 28 (10), 3007-3020, 2017 | 104 | 2017 |

Comparative study of one-sided factorizations with multiple software packages on multi-core hardware E Agullo, B Hadri, H Ltaief, J Dongarra Proceedings of the Conference on High Performance Computing Networking …, 2009 | 101 | 2009 |

Multicore-optimized wavefront diamond blocking for optimizing stencil updates T Malas, G Hager, H Ltaief, H Stengel, G Wellein, D Keyes SIAM Journal on Scientific Computing 37 (4), C439-C464, 2015 | 92 | 2015 |

LU factorization for accelerator-based systems E Agullo, C Augonnet, J Dongarra, M Faverge, J Langou, H Ltaief, ... 2011 9th IEEE/ACS International Conference on Computer Systems and …, 2011 | 88 | 2011 |

Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels A Haidar, H Ltaief, J Dongarra Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 80 | 2011 |

A scalable high performant Cholesky factorization for multicore with GPU accelerators H Ltaief, S Tomov, R Nath, P Du, J Dongarra High Performance Computing for Computational Science–VECPAR 2010: 9th …, 2011 | 74 | 2011 |

Energy footprint of advanced dense numerical linear algebra using tile algorithms on multicore architectures J Dongarra, H Ltaief, P Luszczek, VM Weaver 2012 Second International Conference on Cloud and Green Computing, 274-281, 2012 | 68 | 2012 |

PLASMA users guide E Agullo, J Dongarra, B Hadri, J Kurzak, J Langou, J Langou, H Ltaief, ... Technical report, ICL, UTK, 2009 | 66 | 2009 |

Tile low rank Cholesky factorization for climate/weather modeling applications on manycore architectures K Akbudak, H Ltaief, A Mikhalev, D Keyes High Performance Computing: 32nd International Conference, ISC High …, 2017 | 64 | 2017 |

ExaGeoStat: A high performance unified software for geostatistics on manycore systems S Abdulah, H Ltaief, Y Sun, MG Genton, DE Keyes IEEE Transactions on Parallel and Distributed Systems 29 (12), 2771-2784, 2018 | 62* | 2018 |

Two-stage tridiagonal reduction for dense symmetric matrices using tile algorithms on multicore architectures P Luszczek, H Ltaief, J Dongarra 2011 IEEE International Parallel & Distributed Processing Symposium, 944-955, 2011 | 54 | 2011 |

The compute and control for adaptive optics (CACAO) real-time control software package O Guyon, A Sevin, D Gratadour, J Bernard, H Ltaief, D Sukkari, S Cetre, ... Adaptive Optics Systems VI 10703, 469-480, 2018 | 51 | 2018 |

Programming abstractions for data locality A Tate, A Kamil, A Dubey, A Groblinger, B Chamberlain, B Goglin, ... Office of Scientific and Technical Information (OSTI), 2014 | 51 | 2014 |

Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting J Dongarra, M Faverge, H Ltaief, P Luszczek Concurrency and Computation: Practice and Experience 26 (7), 1408-1431, 2014 | 50 | 2014 |