Medusa: Simplified Graph Processing on GPUs (2012-2014)
Publications
Abstract
Graphs are common data structures for many applications, and efficient graph processing is a must for application
performance. Recently, the graphics processing unit (GPU) has been adopted to accelerate various graph processing algorithms
such as BFS and shortest paths. However, it is difficult to write correct and efficient GPU programs and even more difficult for graph
processing due to the irregularities of graph structures. To simplify graph processing on GPUs, we propose a programming
framework called Medusa which enables developers to leverage the capabilities of GPUs by writing sequential C/C++ code.
Medusa offers a small set of user-defined APIs and embraces a runtime system to automatically execute those APIs in parallel on the
GPU. We develop a series of graph-centric optimizations based on the architecture features of GPUs for efficiency. Additionally,
Medusa is extended to execute on multiple GPUs within a machine. Our experiments show that 1) Medusa greatly simplifies
implementation of GPGPU programs for graph processing, with many fewer lines of source code written by developers and 2) the
optimization techniques significantly improve the performance of the runtime system, making its performance comparable with or
better than manually tuned GPU graph operations.
Highlights
Our system Medusa is the pioneering system of accelerating graph processing on GPUs,
which inspired the usage of heterogeneous architectures into large graph processing in academia and industry players
such as BlazeGraph. (Reference is avaiable upon request)
In the following, we present more details on the "impact factors" of this project (see definition of "impact factors").
Citations
Relevance to Industry and Open-Source Community
This system has inspired other open-source systems and industry systems.
- [TACO] Zhang, Yu, Xiaofei Liao, Lin Gu, Hai Jin, Kan Hu, Haikun Liu, and Bingsheng He
AsynGraph: Maximizing Data Parallelism for Efficient Iterative Graph Processing on GPUs, TACO 2020.
- [ATC] Zheng, Long, Xianliang Li, Yaohui Zheng, Yu Huang, Xiaofei Liao, Hai Jin, Jingling Xue, Zhiyuan Shao, and Qiang-Sheng Hua
Scaph: Scalable GPU-Accelerated Graph Processing with Value-Driven Differential Scheduling, ATC 2020.
- [TPDS] Xu, Xianghao, Fang Wang, Hong Jiang, Yongli Cheng, Dan Feng, and Yongxuan Zhang
A Hybrid Update Strategy for I/O-Efficient Out-of-Core Graph Processing, TPDS 2020.
- [ICS] Jin, Ruoming, Zhen Peng, Wendell Wu, Feodor Dragan, Gagan Agrawal, and Bin Ren
Parallelizing pruned landmark labeling: dealing with dependencies in graph algorithms, ICS 2020.
-
[PPoPP] Wang, Yangzihao, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens.
Gunrock: A high-performance graph processing library on the GPU, PPoPP 2016.
-
[PACT] Khorasani, Farzad, Rajiv Gupta, and Laxmi N. Bhuyan.
Scalable simd-efficient graph processing on gpus, PACT 2015.
-
[HPCE] Che, Shuai.
GasCL: A vertex-centric graph model for GPUs, HPCE 2014.
-
[TPDS] Zhong, Wenyong, Jianhua Sun, Hao Chen, Jun Xiao, Zhiwen Chen, Chang Cheng, and Xuanhua Shi.
Optimizing graph processing on gpus, TPDS 2016.
-
[ICTIS] Aher, Shweta Nitin, and Sandip M. Walunj.
Accelerate the execution of graph processing using GPU, ICTIS 2018.
-
[Industry] SYSTAP.
Blazegraph: High Performance Graph Database with Semantic Web and Tinkerpop Support.
System Repeatability and Academic Impacts
The system is used in the evaluation of the following papers:
-
[PPoPP] Wang, Yangzihao, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens.
Gunrock: A high-performance graph processing library on the GPU, PPoPP 2016.
-
[GRADES] Fu, Zhisong, Michael Personick, and Bryan Thompson.
MapGraph: A high level API for fast development of high performance graph analytics on GPUs, GRADES 2014.
-
[CSUR] Shi, Xuanhua, Zhigao Zheng, Yongluan Zhou, Hai Jin, Ligang He, Bo Liu, and Qiang-Sheng Hua.
Graph processing on GPUs: A survey, CSUR 2018.
-
[OOPSLA] Pai, Sreepathi, and Keshav Pingali.
A compiler for throughput optimization of graph algorithms on GPUs, OOPSLA 2016.
-
[IPDPS] Pan, Yuechao, Yangzihao Wang, Yuduo Wu, Carl Yang, and John D. Owens.
Multi-GPU graph analytics, IPDPS 2017.
-
[CCGRID] Guo, Yong, Ana Lucia Varbanescu, Alexandru Iosup, and Dick Epema.
An empirical performance evaluation of gpu-enabled graph-processing systems, CCGRID 2015.
-
[VLDB] Jia, Zhihao, Yongkee Kwon, Galen Shipman, Pat McCormick, Mattan Erez, and Alex Aiken.
A distributed multi-GPU system for fast graph processing, VLDB 2017.
-
[TPDS] Zhong, Wenyong, Jianhua Sun, Hao Chen, Jun Xiao, Zhiwen Chen, Chang Cheng, and Xuanhua Shi.
Optimizing graph processing on gpus, TPDS 2017.
-
[Cluster Computing] Lai, Siyan, Guangda Lai, Fangzhou Lu, Guojun Shen, Jing Jin, and Xiaola Lin.
A BSP model graph processing system on many cores, Cluster Computing 2017.
Educational Adoptions
- [Book] Julian Shun, Morgan & Claypool.
Shared-Memory Parallelism Can be Simple, Fast, and Scalable,
1 Jun 2017.
- [Book] Hamid Sarbazi-Azad, Morgan Kaufmann.
Advances in GPU Research and Practice,
15 Sep 2016.
- [Book] Rajkumar Buyya, Rodrigo N. Calheiros, Amir Vahid Dastjerdi, Morgan Kaufmann.
Big Data: Principles and Paradigms,
7 Jun 2016.
- [Book] Xiaolin Li, Judy Qiu, Springer.
Cloud Computing for Data-Intensive Applications,
2 Dec 2014.
- [Book] Sherif Sakr, Mohamed Gaber. CRC Press.
Large Scale and Big Data: Processing and Management,
9 Jan 2015.
- [Book] Scholarly Editions.
Issues in Computer Programming: 2013 Edition,
1 May 2013.
- [Book] Da Yan, Bu Yingyi, Yuanyuan Tian, Amol Deshpande.
Big Graph Analytics Platforms,
12 Jan 2017.
- [Book] Da Yan, Yuanyuan Tian, James Cheng,Springer.
Systems for Big Graph Analytics,
31 May 2017 .
- [Book] Michael W. Berry, Azlinah Mohamed, Bee Wah Yap,Springer Nature.
Supervised and Unsupervised Learning for Data Science,
4 Sep 2019.
- [Course] National University of Singapore.
CS6284 Topics in Computer Science: Big Data Meets New Hardware
- [Course] Massachusetts Institute of Technology.
6.886 Graph Analytics Spring 2018
- [Course] University of Cambridge.
Data Centric Systems and Networking (2014-2015 Michaelmas Term)
- [Course] UC Berkeley.
CS 267 Spring 2014
- [Course] Ulsan National Institute of Science and Technology.
ECE519 Massively Parallel Programming (Fall 2013)
- [Course] Daegu Gyeongbuk Institute of Science and Technology.
IC611: Database Systems Design (Fall 2013)
Media Coverage
Back to Bingsheng's Influential Works © 2020