Mikhail Smelyanskiy
2020 – today
- 2024
- [i18]Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan M. Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick LeGresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Long, Ameya Sunil Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, Chen Zhu:
Nemotron-4 340B Technical Report. CoRR abs/2406.11704 (2024) - 2022
- [c41]Ehsan K. Ardestani, Changkyu Kim, Seung Jae Lee, Luoshang Pan, Jens Axboe, Valmiki Rampersad, Banit Agrawal, Fuxun Yu, Ansha Yu, Trung Le, Hector Yuen, Dheevatsa Mudigere, Shishir Juluri, Akshat Nanda, Manoj Wodekar, Krishnakumar Nair, Maxim Naumov, Chris Petersen, Mikhail Smelyanskiy, Vijay Rao:
Supporting Massive DLRM Inference through Software Defined Memory. ICDCS 2022: 302-312 - [c40]Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, Yinbin Ma, Junjie Yang, Ellie Wen, Hong Li, Lin Yang, Chonglin Sun, Whitney Zhao, Dimitry Melts, Krishna Dhulipala, K. R. Kishore, Tyler Graf, Assaf Eisenman, Kiran Kumar Matam, Adi Gangidi, Guoqiang Jerry Chen, Manoj Krishnan, Avinash Nayak, Krishnakumar Nair, Bharath Muthiah, Mahmoud khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, Vijay Rao:
Software-hardware co-design for fast and scalable training of deep learning recommendation models. ISCA 2022: 993-1011 - [c39]Assaf Eisenman, Kiran Kumar Matam, Steven Ingram, Dheevatsa Mudigere, Raghuraman Krishnamoorthi, Krishnakumar Nair, Misha Smelyanskiy, Murali Annavaram:
Check-N-Run: a Checkpointing System for Training Deep Learning Recommendation Models. NSDI 2022: 929-943 - [c38]Colin Unger, Zhihao Jia, Wei Wu, Sina Lin, Mandeep Baines, Carlos Efrain Quintero Narvaez, Vinay Ramakrishnaiah, Nirmal Prajapati, Patrick S. McCormick, Jamaludin Mohd-Yusof, Xi Luo, Dheevatsa Mudigere, Jongsoo Park, Misha Smelyanskiy, Alex Aiken:
Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization. OSDI 2022: 267-284 - 2021
- [j11]Zhaoxia Deng, Jongsoo Park, Ping Tak Peter Tang, Haixin Liu, Jie Yang, Hector Yuen, Jianyu Huang, Daya Shanker Khudia, Xiaohan Wei, Ellie Wen, Dhruv Choudhary, Raghuraman Krishnamoorthi, Carole-Jean Wu, Nadathur Satish, Changkyu Kim, Maxim Naumov, Sam Naghshineh, Mikhail Smelyanskiy:
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale. IEEE Micro 41(5): 93-100 (2021) - [i17]Daya Shanker Khudia, Jianyu Huang, Protonu Basu, Summer Deng, Haixin Liu, Jongsoo Park, Mikhail Smelyanskiy:
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference. CoRR abs/2101.05615 (2021) - [i16]Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng, Yinbin Ma, Junjie Yang, Ellie Wen, Hong Li, Lin Yang, Chonglin Sun, Whitney Zhao, Dimitry Melts, Krishna Dhulipala, K. R. Kishore, Tyler Graf, Assaf Eisenman, Kiran Kumar Matam, Adi Gangidi, Guoqiang Jerry Chen, Manoj Krishnan, Avinash Nayak, Krishnakumar Nair, Bharath Muthiah, Mahmoud khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, Vijay Rao:
High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models. CoRR abs/2104.05158 (2021) - [i15]Zhaoxia Deng, Jongsoo Park, Ping Tak Peter Tang, Haixin Liu, Jie Yang, Hector Yuen, Jianyu Huang, Daya Shanker Khudia, Xiaohan Wei, Ellie Wen, Dhruv Choudhary, Raghuraman Krishnamoorthi, Carole-Jean Wu, Nadathur Satish, Changkyu Kim, Maxim Naumov, Sam Naghshineh, Mikhail Smelyanskiy:
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale. CoRR abs/2105.12676 (2021) - [i14]Ehsan K. Ardestani, Changkyu Kim, Seung Jae Lee, Luoshang Pan, Valmiki Rampersad, Jens Axboe, Banit Agrawal, Fuxun Yu, Ansha Yu, Trung Le, Hector Yuen, Shishir Juluri, Akshat Nanda, Manoj Wodekar, Dheevatsa Mudigere, Krishnakumar Nair, Maxim Naumov, Chris Peterson, Mikhail Smelyanskiy, Vijay Rao:
Supporting Massive DLRM Inference Through Software Defined Memory. CoRR abs/2110.11489 (2021) - [i13]Ravi Krishna, Aravind Kalaiah, Bichen Wu, Maxim Naumov, Dheevatsa Mudigere, Misha Smelyanskiy, Kurt Keutzer:
Differentiable NAS Framework and Application to Ads CTR Prediction. CoRR abs/2110.14812 (2021) - 2020
- [c37]Udit Gupta, Carole-Jean Wu, Xiaodong Wang, Maxim Naumov, Brandon Reagen, David Brooks, Bradford Cottel, Kim M. Hazelwood, Mark Hempstead, Bill Jia, Hsien-Hsin S. Lee, Andrey Malevich, Dheevatsa Mudigere, Mikhail Smelyanskiy, Liang Xiong, Xuan Zhang:
The Architectural Implications of Facebook's DNN-Based Personalized Recommendation. HPCA 2020: 488-501 - [c36]Liu Ke, Udit Gupta, Benjamin Youngjae Cho, David Brooks, Vikas Chandra, Utku Diril, Amin Firoozshahian, Kim M. Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Meng Li, Bert Maher, Dheevatsa Mudigere, Maxim Naumov, Martin Schatz, Mikhail Smelyanskiy, Xiaodong Wang, Brandon Reagen, Carole-Jean Wu, Mark Hempstead, Xuan Zhang:
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing. ISCA 2020: 790-803 - [i12]Maxim Naumov, John Kim, Dheevatsa Mudigere, Srinivas Sridharan, Xiaodong Wang, Whitney Zhao, Serhat Yilmaz, Changkyu Kim, Hector Yuen, Mustafa Ozdal, Krishnakumar Nair, Isabel Gao, Bor-Yiing Su, Jiyan Yang, Mikhail Smelyanskiy:
Deep Learning Training in Facebook Data Centers: Design of Scale-up and Scale-out Systems. CoRR abs/2003.09518 (2020) - [i11]Assaf Eisenman, Kiran Kumar Matam, Steven Ingram, Dheevatsa Mudigere, Raghuraman Krishnamoorthi, Murali Annavaram, Krishnakumar Nair, Misha Smelyanskiy:
Check-N-Run: A Checkpointing System for Training Recommendation Models. CoRR abs/2010.08679 (2020)
2010 – 2019
- 2019
- [c35]Misha Smelyanskiy:
Zion: Facebook Next-Generation Large Memory Training Platform. Hot Chips Symposium 2019: 1-22 - [c34]Assaf Eisenman, Maxim Naumov, Darryl Gardner, Misha Smelyanskiy, Sergey Pupyrev, Kim M. Hazelwood, Asaf Cidon, Sachin Katti:
Bandana: Using Non-Volatile Memory for Storing Deep Learning Models. SysML 2019 - [i10]Dhiraj D. Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, Jiyan Yang, Jongsoo Park, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Abhisek Kundu, Misha Smelyanskiy, Bharat Kaul, Pradeep Dubey:
A Study of BFLOAT16 for Deep Learning Training. CoRR abs/1905.12322 (2019) - [i9]Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G. Azzolini, Dmytro Dzhulgakov, Andrey Mallevich, Ilia Cherniavskii, Yinghai Lu, Raghuraman Krishnamoorthi, Ansha Yu, Volodymyr Kondratenko, Stephanie Pereira, Xianjie Chen, Wenlin Chen, Vijay Rao, Bill Jia, Liang Xiong, Misha Smelyanskiy:
Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR abs/1906.00091 (2019) - [i8]Udit Gupta, Xiaodong Wang, Maxim Naumov, Carole-Jean Wu, Brandon Reagen, David Brooks, Bradford Cottel, Kim M. Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Andrey Malevich, Dheevatsa Mudigere, Mikhail Smelyanskiy, Liang Xiong, Xuan Zhang:
The Architectural Implications of Facebook's DNN-based Personalized Recommendation. CoRR abs/1906.03109 (2019) - [i7]Liu Ke, Udit Gupta, Carole-Jean Wu, Benjamin Youngjae Cho, Mark Hempstead, Brandon Reagen, Xuan Zhang, David M. Brooks, Vikas Chandra, Utku Diril, Amin Firoozshahian, Kim M. Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Meng Li, Bert Maher, Dheevatsa Mudigere, Maxim Naumov, Martin Schatz, Mikhail Smelyanskiy, Xiaodong Wang:
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing. CoRR abs/1912.12953 (2019) - 2018
- [c33]Kim M. Hazelwood, Sarah Bird, David M. Brooks, Soumith Chintala, Utku Diril, Dmytro Dzhulgakov, Mohamed Fawzy, Bill Jia, Yangqing Jia, Aditya Kalro, James Law, Kevin Lee, Jason Lu, Pieter Noordhuis, Misha Smelyanskiy, Liang Xiong, Xiaodong Wang:
Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective. HPCA 2018: 620-629 - [i6]Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Summer Deng, Roman Dzhabarov, James Hegeman, Roman Levenstein, Bert Maher, Nadathur Satish, Jakob Olesen, Jongsoo Park, Artem Rakhov, Misha Smelyanskiy:
Glow: Graph Lowering Compiler Techniques for Neural Networks. CoRR abs/1805.00907 (2018) - [i5]Assaf Eisenman, Maxim Naumov, Darryl Gardner, Misha Smelyanskiy, Sergey Pupyrev, Kim M. Hazelwood, Asaf Cidon, Sachin Katti:
Bandana: Using Non-volatile Memory for Storing Deep Learning Models. CoRR abs/1811.05922 (2018) - [i4]Jongsoo Park, Maxim Naumov, Protonu Basu, Summer Deng, Aravind Kalaiah, Daya Shanker Khudia, James Law, Parth Malani, Andrey Malevich, Nadathur Satish, Juan Miguel Pino, Martin Schatz, Alexander Sidorov, Viswanath Sivakumar, Andrew Tulloch, Xiaodong Wang, Yiming Wu, Hector Yuen, Utku Diril, Dmytro Dzhulgakov, Kim M. Hazelwood, Bill Jia, Yangqing Jia, Lin Qiao, Vijay Rao, Nadav Rotem, Sungjoo Yoo, Mikhail Smelyanskiy:
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications. CoRR abs/1811.09886 (2018) - 2017
- [c32]Xi He, Dheevatsa Mudigere, Mikhail Smelyanskiy, Martin Takác:
Distributed Hessian-Free Optimization for Deep Neural Network. AAAI Workshops 2017 - [c31]Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tak Peter Tang:
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. ICLR 2017 - 2016
- [j10]Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Md. Mostofa Ali Patwary, Vadim O. Pirogov, Pradeep Dubey, Xing Liu, Carlos Rosales, Cyril Mazauric, Christopher S. Daley:
Optimizations in a high-performance conjugate gradient benchmark for IA-based multi- and many-core processors. Int. J. High Perform. Comput. Appl. 30(1): 11-27 (2016) - [j9]Edmond Chow, Xing Liu, Sanchit Misra, Marat Dukhan, Mikhail Smelyanskiy, Jeff R. Hammond, Yunfei Du, Xiangke Liao, Pradeep Dubey:
Scaling up Hartree-Fock calculations on Tianhe-2. Int. J. High Perform. Comput. Appl. 30(1): 85-102 (2016) - [j8]Field G. Van Zee, Tyler M. Smith, Bryan Marker, Tze Meng Low, Robert A. van de Geijn, Francisco D. Igual, Mikhail Smelyanskiy, Xianyi Zhang, Michael Kistler, Vernon Austel, John A. Gunnels, Lee Killough:
The BLIS Framework: Experiments in Portability. ACM Trans. Math. Softw. 42(2): 12:1-12:19 (2016) - [c30]Hongbo Rong, Jongsoo Park, Lingxiang Xiang, Todd A. Anderson, Mikhail Smelyanskiy:
Sparso: Context-driven Optimizations of Sparse Linear Algebra. PACT 2016: 247-259 - [c29]Scott Sallinen, Nadathur Satish, Mikhail Smelyanskiy, Samantika S. Sury, Christopher Ré:
High Performance Parallel Stochastic Gradient Descent in Shared Memory. IPDPS 2016: 873-882 - [c28]Thomas Häner, Damian S. Steiger, Mikhail Smelyanskiy, Matthias Troyer:
High performance emulation of quantum circuits. SC 2016: 866-874 - [i3]Mikhail Smelyanskiy, Nicolas P. D. Sawaya, Alán Aspuru-Guzik:
qHiPSTER: The Quantum High Performance Software Testing Environment. CoRR abs/1601.07195 (2016) - [i2]Xi He, Dheevatsa Mudigere, Mikhail Smelyanskiy, Martin Takác:
Large Scale Distributed Hessian-Free Optimization for Deep Neural Network. CoRR abs/1606.00511 (2016) - [i1]Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tak Peter Tang:
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. CoRR abs/1609.04836 (2016) - 2015
- [j7]Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind Girkar, Pradeep Dubey:
Can traditional programming bridge the ninja performance gap for parallel computing applications? Commun. ACM 58(5): 77-86 (2015) - [c27]Dheevatsa Mudigere, Srinivas Sridharan, Anand M. Deshpande, Jongsoo Park, Alexander Heinecke, Mikhail Smelyanskiy, Bharat Kaul, Pradeep Dubey, Dinesh K. Kaushik, David E. Keyes:
Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems. IPDPS 2015: 723-732 - [c26]Jongsoo Park, Mikhail Smelyanskiy, Ulrike Meier Yang, Dheevatsa Mudigere, Pradeep Dubey:
High-performance algebraic multigrid solver optimized for multi-core based distributed parallel systems. SC 2015: 54:1-54:12 - 2014
- [c25]Tyler M. Smith, Robert A. van de Geijn, Mikhail Smelyanskiy, Jeff R. Hammond, Field G. Van Zee:
Anatomy of High-Performance Many-Threaded Matrix Multiplication. IPDPS 2014: 1049-1059 - [c24]Karthikeyan Vaidyanathan, Kiran Pamnany, Dhiraj D. Kalamkar, Alexander Heinecke, Mikhail Smelyanskiy, Jongsoo Park, Daehyun Kim, Aniruddha G. Shet, Bharat Kaul, Bálint Joó, Pradeep Dubey:
Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters. IPDPS 2014: 1083-1092 - [c23]Alexander Heinecke, Alexander Breuer, Sebastian Rettenberger, Michael Bader, Alice-Agnes Gabriel, Christian Pelties, Arndt Bode, William Barth, Xiangke Liao, Karthikeyan Vaidyanathan, Mikhail Smelyanskiy, Pradeep Dubey:
Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers. SC 2014: 3-14 - [c22]Simon Heybrock, Bálint Joó, Dhiraj D. Kalamkar, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Tilo Wettig, Pradeep Dubey:
Lattice QCD with Domain Decomposition on Intel® Xeon Phi Co-Processors. SC 2014: 69-80 - [c21]Jongsoo Park, Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Alexander Heinecke, Dhiraj D. Kalamkar, Xing Liu, Md. Mostofa Ali Patwary, Yutong Lu, Pradeep Dubey:
Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices. SC 2014: 945-955 - [c20]Jongsoo Park, Mikhail Smelyanskiy, Narayanan Sundaram, Pradeep Dubey:
Sparsifying Synchronization for High-Performance Shared-Memory Sparse Triangular Solver. ISC 2014: 124-140 - 2013
- [j6]Jongsoo Park, Ping Tak Peter Tang, Mikhail Smelyanskiy, Daehyun Kim, Thomas Benson:
Efficient backprojection-based synthetic aperture radar computation with many-core processors. Sci. Program. 21(3-4): 165-179 (2013) - [c19]Xing Liu, Mikhail Smelyanskiy, Edmond Chow, Pradeep Dubey:
Efficient sparse matrix-vector multiplication on x86-based many-core processors. ICS 2013: 273-282 - [c18]Alexander Heinecke, Karthikeyan Vaidyanathan, Mikhail Smelyanskiy, Alexander Kobotov, Roman Dubtsov, Greg Henry, Aniruddha G. Shet, George Chrysos, Pradeep Dubey:
Design and Implementation of the Linpack Benchmark for Single and Multi-node Systems Based on Intel® Xeon Phi Coprocessor. IPDPS 2013: 126-137 - [c17]Simon J. Pennycook, Christopher J. Hughes, Mikhail Smelyanskiy, Stephen A. Jarvis:
Exploring SIMD for Molecular Dynamics, Using Intel® Xeon® Processors and Intel® Xeon Phi Coprocessors. IPDPS 2013: 1085-1097 - [c16]Bálint Joó, Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Mikhail Smelyanskiy, Kiran Pamnany, Victor W. Lee, Pradeep Dubey, William A. Watson III:
Lattice QCD on Intel® Xeon Phi™ Coprocessors. ISC 2013: 40-54 - 2012
- [c15]Xing Liu, Edmond Chow, Karthikeyan Vaidyanathan, Mikhail Smelyanskiy:
Improving the Performance of Dynamical Simulations Via Multiple Right-Hand Sides. IPDPS 2012: 36-47 - [c14]Dhiraj D. Kalamkar, Joshua D. Trzasko, Srinivas Sridharan, Mikhail Smelyanskiy, Daehyun Kim, Armando Manduca, Yunhong Shu, Matt A. Bernstein, Bharat Kaul, Pradeep Dubey:
High Performance Non-uniform FFT on Modern X86-based Multi-core Systems. IPDPS 2012: 449-460 - [c13]Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind Girkar, Pradeep Dubey:
Can traditional programming bridge the Ninja performance gap for parallel computing applications? ISCA 2012: 440-451 - [c12]Jongsoo Park, Ping Tak Peter Tang, Mikhail Smelyanskiy, Daehyun Kim, Thomas Benson:
Efficient backprojection-based synthetic aperture radar computation with many-core processors. SC 2012: 28 - [c11]Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian van Straalen, Mikhail Smelyanskiy, Ann S. Almgren, Pradeep Dubey, John Shalf, Leonid Oliker:
Optimization of geometric multigrid for emerging multi- and manycore processors. SC 2012: 96 - [c10]Mikhail Smelyanskiy, Jason Sewall, Dhiraj D. Kalamkar, Nadathur Satish, Pradeep Dubey, Nikita Astafiev, Ilya Burylov, Andrey Nikolaev, Sergey Maidanov, Shuo Li, Sunil Kulkarni, Charles H. Finan, Ekaterina Gonina:
Analysis and Optimization of Financial Analytics Benchmark on Modern Multi- and Many-core IA-Based Architectures. SC Companion 2012: 1154-1162 - 2011
- [j5]Michael Deisher, Mikhail Smelyanskiy, Brian Nickerson, Victor W. Lee, Michael Chuvelev, Pradeep Dubey:
Designing and dynamically load balancing hybrid LU for multi/many-core. Comput. Sci. Res. Dev. 26(3-4): 211-220 (2011) - [j4]Daehyun Kim, Joshua Trzasko, Mikhail Smelyanskiy, Clifton Haider, Pradeep Dubey, Armando Manduca:
High-Performance 3D Compressive Sensing MRI Reconstruction Using Many-Core Architectures. Int. J. Biomed. Imaging 2011: 473128:1-473128:11 (2011) - [c9]Mikhail Smelyanskiy, Karthikeyan Vaidyanathan, Jee W. Choi, Bálint Joó, Jatin Chhugani, Michael A. Clark, Pradeep Dubey:
High-performance lattice QCD for multi-core based parallel systems using a cache-friendly hybrid threaded-MPI approach. SC 2011: 69:1-69:11 - [e1]Mikhail Smelyanskiy, Matthew Dixon, David Daly, Maria Eleftheriou, José E. Moreira, Kyung Dong Ryu:
WHPCF'11, Proceedings of the Fourth Workshop on High Performance Computational Finance, co-located with SC11, Seattle, WA, USA, November 13, 2011. ACM 2011, ISBN 978-1-4503-1108-3 [contents] - 2010
- [c8]Victor W. Lee, Changkyu Kim, Jatin Chhugani, Michael Deisher, Daehyun Kim, Anthony D. Nguyen, Nadathur Satish, Mikhail Smelyanskiy, Srinivas Chennupaty, Per Hammarlund, Ronak Singhal, Pradeep Dubey:
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. ISCA 2010: 451-460
2000 – 2009
- 2009
- [j3]Mikhail Smelyanskiy, David R. Holmes III, Jatin Chhugani, Alan Larson, Doug Carmean, Dennis P. Hanson, Pradeep Dubey, Kurt Augustine, Daehyun Kim, Alan Kyker, Victor W. Lee, Anthony D. Nguyen, Larry Seiler, Richard A. Robb:
Mapping High-Fidelity Volume Rendering for Medical Imaging to CPU, GPU and Many-Core Architectures. IEEE Trans. Vis. Comput. Graph. 15(6): 1563-1570 (2009) - 2008
- [j2]José Luis Morales, Jorge Nocedal, Mikhail Smelyanskiy:
An algorithm for the fast solution of symmetric linear complementarity problems. Numerische Mathematik 111(2): 251-266 (2008) - [j1]Yen-Kuang Chen, Jatin Chhugani, Pradeep Dubey, Christopher J. Hughes, Daehyun Kim, Sanjeev Kumar, Victor W. Lee, Anthony D. Nguyen, Mikhail Smelyanskiy:
Convergence of Recognition, Mining, and Synthesis Workloads and Its Implications. Proc. IEEE 96(5): 790-807 (2008) - [c7]Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Yen-Kuang Chen, Jatin Chhugani, Christopher J. Hughes, Changkyu Kim, Victor W. Lee, Anthony D. Nguyen:
Atomic Vector Operations on Chip Multiprocessors. ISCA 2008: 441-452 - 2007
- [c6]Mikhail Smelyanskiy, Victor W. Lee, Daehyun Kim, Anthony D. Nguyen, Pradeep Dubey:
Scaling performance of interior-point method on large-scale chip multiprocessor system. SC 2007: 22 - 2004
- [c5]Mikhail Smelyanskiy, Scott A. Mahlke, Edward S. Davidson:
Probabilistic Predicate-Aware Modulo Scheduling. CGO 2004: 151-162 - 2003
- [c4]Kevin Fan, Nathan Clark, Michael L. Chu, K. V. Manjunath, Rajiv A. Ravindran, Mikhail Smelyanskiy, Scott A. Mahlke:
Systematic Register Bypass Customization for Application-Specific Processors. ASAP 2003: 64-74 - [c3]Mikhail Smelyanskiy, Scott A. Mahlke, Edward S. Davidson, Hsien-Hsin S. Lee:
Predicate-Aware Scheduling: A Technique for Reducing Resource Constraints. CGO 2003: 169-178 - 2001
- [c2]Hsien-Hsin S. Lee, Mikhail Smelyanskiy, Chris J. Newburn, Gary S. Tyson:
Stack Value File: Custom Microarchitecture for the Stack. HPCA 2001: 5-14 - 2000
- [c1]Mikhail Smelyanskiy, Gary S. Tyson, Edward S. Davidson:
Register Queues: A New Hardware/Software Approach to Efficient Software Pipelining. IEEE PACT 2000: 3-12
last updated on 2024-10-07 21:25 CEST by the dblp team
all metadata released as open data under CC0 1.0 license