Data management on emerging storage devices (2009-2014 for Flash, 2014-present for NVRAM)
Publications
Abstract
The non-volatile memory (NVM) has DRAM-like performance and disk-like persistency which make it possible to replace both disk and DRAM to build single level systems. To keep data consistency in such systems is non-trivial because memory writes may be reordered by
CPU and memory controller. In this paper, we study the consistency cost for an important and common data
structure, B+Tree. Although the memory fence and CPU cacheline flush instructions can order memory writes to
achieve data consistency, they introduce a significant overhead (more than 10X slower in performance). Based
on our quantitative analysis of consistency cost, we propose NV-Tree, a consistent and cache-optimized B+Tree
variant with reduced CPU cacheline flush. We implement and evaluate NV-Tree and NV-Store, a key-value store based on NV-Tree, on an NVDIMM server. NVTree outperforms the state-of-art consistent tree structures by up to 12X under write-intensive workloads.
NV-Store increases the throughput by up to 4.8X under YCSB workloads compared to Redis.
Highlights
This line of research has led the ground rethinking of data management systems on emerging storage devices, and have attracted broad industry interests (e.g., Huawei and Microsoft).
In the following, we present more details on the "impact factors" of this project (see definition of "impact factors").
Citations
- Those papers collectively have received over 500 citations since 2009.
Relevance to Industry and Open-Source Community
This system has inspired other open-source systems and industry systems.
- [WWW] Wang, Sheng, Xiaolin Qin, Zhifeng Bao, and Bohan Li.
Tide-tree: A self-tuning indexing scheme for hybrid storage system, WWW 2016.
- [TKDE] Choi, Gyu Sang, Byung-Won On, and Ingyu Lee.
PB+-Tree: PCM-Aware B+-Tree, TKDE 2015.
- [CIKM] Thonangi, Risi, Shivnath Babu, and Jun Yang.
A practical concurrent index for solid-state drives, CIKM 2012.
- [PVLDB] Athanassoulis, Manos, and Anastasia Ailamaki.
BF-Tree: Approximate Tree Indexing, PVLDB 2014.
- [arVix] Wang, Chundong, Gunavaran Brihadiswarn, Xingbin Jiang, and Sudipta Chattopadhyay.
Circ-Tree: A B+-Tree Variant with Circular Design for Persistent Memory, arVix 2019.
- [LCTES] Wang, Chundong, Sudipta Chattopadhyay, and Gunavaran Brihadiswarn.
Crash recoverable ARMv8-oriented B+-tree for byte-addressable persistent memory, LCTES 2019.
- [PVLDB] Arulraj, Joy, Justin Levandoski, Umar Farooq Minhas, and Per-Ake Larson.
BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory, PVLDB 2018.
- [TOS] Chen, Cheng, Jun Yang, Qingsong Wei, Chundong Wang, and Mingdi Xue.
Optimizing File Systems with Fine-grained Metadata Journaling on Byte-addressable NVM, TOS 2017.
System Repeatability and Academic Impacts
The system is used in the evaluation of the following papers:
- [PVLDB] Roh, Hongchan, Sanghyun Park, Sungho Kim, Mincheol Shin, and Sang-Won Lee.
B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives, PVLDB 2011.
- [Data Eng] Roh, Hongchan, Sanghyun Park, Mincheol Shin, and Sang-Won Lee.
MPSearch: Multi-Path Search for Tree-based Indexes to Exploit Internal Parallelism of Flash SSDs, Data Eng 2014.
- [WISA] Jiang, Zhiwen, Yongji Wu, Yong Zhang, Chao Li, and Chunxiao Xing.
AB-Tree: A Write-Optimized Adaptive Index Structure on Solid State Disk, WISA 2014.
- [FAST] Lee, Se Kwon, K. Hyun Lim, Hyunsub Song, Beomseok Nam, and Sam H. Noh.
WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems, FAST 2017.
Educational Adoptions
- [Book] Joy Arulraj, Andrew Pavlo, Morgan & Claypool Publishers.
Non-Volatile Memory Database Management Systems,
12 Feb 2019.
- [Book] Hideto Hidaka, Springer.
Embedded Flash Memory for Embedded Systems: Technology, Design for Sub-systems, and Innovations,
9 Sep 2017.
- [Book] Martin Kleppmann, "O'Reilly Media, Inc.".
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems,
16 Mar 2017.
- [Book] Alex Petrov, "O'Reilly Media, Inc.".
Database Internals: A Deep Dive into How Distributed Data Systems Work,
13 Sep 2019.
- [Course] Sogang University.
Topics in Distributed Systems [CSE 6468]
Media Coverage
Back to Bingsheng's Influential Works © 2020