| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Meta-dataset: A dataset of datasets for learning to learn from few examples | E Triantafillou, T Zhu, V Dumoulin, P Lamblin, U Evci, K Xu, R Goroshin, ... | arXiv preprint arXiv:1903.03096 | 735 | 2019 |
| Rigging the lottery: Making all tickets winners | U Evci, T Gale, J Menick, PS Castro, E Elsen | International Conference on Machine Learning, 2943-2952 | 633 | 2020 |
| Scaling vision transformers to 22 billion parameters | M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ... | International Conference on Machine Learning, 7480-7512 | 485 | 2023 |
| Empirical analysis of the Hessian of over-parametrized neural networks | L Sagun, U Evci, VU Guney, Y Dauphin, L Bottou | arXiv preprint arXiv:1706.04454 | 403 | 2017 |
| The difficulty of training sparse neural networks | U Evci, F Pedregosa, A Gomez, E Elsen | arXiv preprint arXiv:1906.10732 | 107 | 2019 |
| Head2Toe: Utilizing intermediate representations for better transfer learning | U Evci, V Dumoulin, H Larochelle, MC Mozer | International Conference on Machine Learning, 6009-6033 | 86 | 2022 |
| The dormant neuron phenomenon in deep reinforcement learning | G Sokar, R Agarwal, PS Castro, U Evci | International Conference on Machine Learning, 32145-32168 | 78 | 2023 |
| Gradient flow in sparse neural networks and how lottery tickets win | U Evci, Y Ioannou, C Keskin, Y Dauphin | Proceedings of the AAAI Conference on Artificial Intelligence 36 (6), 6577-6586 | 78 | 2022 |
| GradMax: Growing neural networks using gradient information | U Evci, B van Merrienboer, T Unterthiner, M Vladymyrov, F Pedregosa | arXiv preprint arXiv:2201.05125 | 58 | 2022 |
| Comparing transfer and meta learning approaches on a unified few-shot classification benchmark | V Dumoulin, N Houlsby, U Evci, X Zhai, R Goroshin, S Gelly, H Larochelle | arXiv preprint arXiv:2104.02638 | 55* | 2021 |
| A practical sparse approximation for real time recurrent learning | J Menick, E Elsen, U Evci, S Osindero, K Simonyan, A Graves | arXiv preprint arXiv:2006.07232 | 55* | 2020 |
| The state of sparse training in deep reinforcement learning | L Graesser, U Evci, E Elsen, PS Castro | International Conference on Machine Learning, 7766-7792 | 37 | 2022 |
| Scaling laws for sparsely-connected foundation models | E Frantar, C Riquelme, N Houlsby, D Alistarh, U Evci | arXiv preprint arXiv:2309.08520 | 14 | 2023 |
| Dynamic sparse training with structured sparsity | M Lasby, A Golubeva, U Evci, M Nica, Y Ioannou | arXiv preprint arXiv:2305.02299 | 11 | 2023 |
| Training recipe for N:M structured sparsity with decaying pruning mask | A Yazdanbakhsh, SC Kao, S Agrawal, S Subramanian, T Krishna, U Evci | arXiv preprint arXiv:2209.07617 | 10 | 2022 |
| Detecting dead weights and units in neural networks | U Evci | arXiv preprint arXiv:1806.06068 | 10 | 2018 |
| Progressive gradient flow for robust N:M sparsity training in transformers | AR Bambhaniya, A Yazdanbakhsh, S Subramanian, SC Kao, S Agrawal, ... | arXiv preprint arXiv:2402.04744 | 6 | 2024 |
| JaxPruner: A concise library for sparsity research | JH Lee, W Park, NE Mitchell, J Pilault, JSO Ceron, HB Kim, N Lee, ... | Conference on Parsimony and Learning, 515-528 | 5 | 2024 |
| One step from the locomotion to the stepping pattern | R Boulic, U Evci, E Molla, P Pisupati | Proceedings of the 29th International Conference on Computer Animation and ... | 1 | 2016 |
| Learning parameter sharing with tensor decompositions and sparsity | C Üyük, M Lasby, M Yassin, U Evci, Y Ioannou | arXiv preprint arXiv:2411.09816 | | 2024 |