1.School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
2.MIIT Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Xi'an 710072, China
3.National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Northwestern Polytechnical University, Xi'an 710072, China
4.School of Software, Northwestern Polytechnical University, Xi'an 710072, China
‡Corresponding author
limy@mail.nwpu.edu.cn
Print publication date: 2023-07-0
Received: 2022-10-21
Revised: 2023-03-05
Zhang X, Li MY, Ngulube M, et al., 2023. MyWAL: performance optimization by removing redundant input/output stack in key-value store. Frontiers of Information Technology & Electronic Engineering, 24(7):980-993. https://doi.org/10.1631/FITEE.2200496
Based on a log-structured merge (LSM) tree, the key-value (KV) storage system can provide high read performance and optimize random write performance. It is widely used in modern data storage systems such as e-commerce, online analytics, and real-time communication. An LSM tree stores new KV data in memory and flushes it to disk in batches. To prevent loss of in-memory data after an unexpected crash, RocksDB appends updates to a write-ahead log (WAL) before updating the memory. However, synchronous WAL significantly reduces write performance. In this paper, we analyze the drawbacks of saving the WAL in a local file system and present a new WAL mechanism named MyWAL. It directly manages raw devices (or partitions) instead of saving data on a traditional file system, which avoids useless metadata updates and writes data to disks sequentially. Experimental results show that for small KV data on solid-state disks (SSDs), MyWAL improves the data write performance of RocksDB by five to eight times compared with the traditional WAL. On non-volatile memory express solid-state drives (NVMe SSDs) and non-volatile memory (NVM), MyWAL improves data write performance by 10%–30%. Furthermore, the results of YCSB (Yahoo! Cloud Serving Benchmark) show that write latency decreases by 50% compared with SpanDB.
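The core idea described above — appending fixed-format records at a tracked tail offset so that no file-system metadata (size, mtime) needs updating on each sync — can be illustrated with a minimal sketch. This is not the paper's implementation: it uses an ordinary file standing in for a raw partition, and all names and the record layout (length + CRC32 header) are illustrative assumptions.

```python
import os
import struct
import zlib

# Illustrative record header: 4-byte payload length + 4-byte CRC32.
RECORD_HDR = struct.Struct("<II")

def append_record(fd, offset, payload):
    """Append one WAL record at `offset` with a single positional write.

    Because the writer tracks the tail offset itself, no file-system
    metadata update is needed per record. Returns the next tail offset.
    """
    hdr = RECORD_HDR.pack(len(payload), zlib.crc32(payload))
    os.pwrite(fd, hdr + payload, offset)
    return offset + RECORD_HDR.size + len(payload)

# Demo on an ordinary file standing in for a raw device/partition.
path = "/tmp/mywal_demo.log"
flags = os.O_RDWR | os.O_CREAT
if hasattr(os, "O_DSYNC"):  # force each write to reach the device
    flags |= os.O_DSYNC
fd = os.open(path, flags, 0o644)

tail = 0
tail = append_record(fd, tail, b"put k1 v1")
tail = append_record(fd, tail, b"put k2 v2")

# Crash recovery: scan from offset 0, validating each record's CRC.
records = []
pos = 0
while pos < tail:
    length, crc = RECORD_HDR.unpack(os.pread(fd, RECORD_HDR.size, pos))
    payload = os.pread(fd, length, pos + RECORD_HDR.size)
    assert zlib.crc32(payload) == crc, "corrupt record"
    records.append(payload)
    pos += RECORD_HDR.size + length
os.close(fd)
print(records)  # [b'put k1 v1', b'put k2 v2']
```

On a real raw device one would additionally align writes to the device block size (e.g., for `O_DIRECT`); the sketch omits alignment for clarity.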
Key-value (KV) store; Log-structured merge (LSM) tree; Non-volatile memory (NVM); Non-volatile memory express solid-state drive (NVMe SSD); Write-ahead log (WAL)
Absalyamov I, Carey MJ, Tsotras VJ, 2018. Lightweight cardinality estimation in LSM-based systems. Proc Int Conf on Management of Data, p.841-855. https://doi.org/10.1145/3183713.3183761
Athanassoulis M, Chen SM, Ailamaki A, et al., 2011. MaSM: efficient online updates in data warehouses. Proc ACM SIGMOD Int Conf on Management of Data, p.865-876. https://doi.org/10.1145/1989323.1989414
Chen H, Ruan CY, Li C, et al., 2021. SpanDB: a fast, cost-effective LSM-tree based KV store on hybrid storage. 19th USENIX Conf on File and Storage Technologies, p.17-32.
Dayan N, Idreos S, 2018. Dostoevsky: better space-time trade-offs for LSM-tree based key-value stores via adaptive removal of superfluous merging. Proc Int Conf on Management of Data, p.505-520. https://doi.org/10.1145/3183713.3196927
Dong SY, Kryczka A, Jin YQ, et al., 2021. Evolution of development priorities in key-value stores serving large-scale applications: the RocksDB experience. 19th USENIX Conf on File and Storage Technologies, p.33-49.
Facebook, 2019. RocksDB, a persistent key-value store for fast storage environments. http://rocksdb.org/ [Accessed on Jan. 7, 2021].
Izraelevitz J, Yang J, Zhang L, et al., 2019. Basic performance measurements of the Intel Optane DC persistent memory module. https://arxiv.org/abs/1903.05714
Kaiyrakhmet O, Lee S, Nam B, et al., 2019. SLM-DB: single-level key-value store with persistent memory. 17th USENIX Conf on File and Storage Technologies, p.191-205.
Kannan S, Bhat N, Gavrilovska A, et al., 2018. Redesigning LSMs for nonvolatile memory with NoveLSM. Proc USENIX Conf on Usenix Annual Technical Conf, p.993-1005.
Leavitt N, 2010. Will NoSQL databases live up to their promise? Computer, 43(2):12-14. https://doi.org/10.1109/MC.2010.58
Lu LY, Pillai TS, Gopalakrishnan H, et al., 2017. WiscKey: separating keys from values in SSD-conscious storage. ACM Trans Stor, 13(1):5. https://doi.org/10.1145/3033273
Luo C, Carey MJ, 2019. Efficient data ingestion and query processing for LSM-based storage systems. Proc VLDB Endow, 12(5):531-543. https://doi.org/10.14778/3303753.3303759
Mei F, Cao Q, Jiang H, et al., 2018. SifrDB: a unified solution for write-optimized key-value stores in large datacenter. Proc ACM Symp on Cloud Computing, p.477-489. https://doi.org/10.1145/3267809.3267829
Pan FF, Yue YL, Xiong J, 2017. dCompaction: delayed compaction for the LSM-tree. Int J Parallel Prog, 45(6):1310-1325. https://doi.org/10.1007/s10766-016-0472-z
Papagiannis A, Saloustros G, González-Férez P, et al., 2018. An efficient memory-mapped key-value store for flash storage. Proc ACM Symp on Cloud Computing, p.490-502. https://doi.org/10.1145/3267809.3267824
Qader MA, Cheng SW, Hristidis V, 2018. A comparative study of secondary indexing techniques in LSM-based NoSQL databases. Proc Int Conf on Management of Data, p.551-566. https://doi.org/10.1145/3183713.3196900
Raju P, Kadekodi R, Chidambaram V, et al., 2017. PebblesDB: building key-value stores using fragmented log-structured merge trees. Proc 26th Symp on Operating Systems Principles, p.497-514. https://doi.org/10.1145/3132747.3132765
Ren K, Zheng Q, Arulraj J, et al., 2017. SlimDB: a space-efficient key-value storage engine for semi-sorted data. Proc VLDB Endow, 10(13):2037-2048. https://doi.org/10.14778/3151106.3151108
Stonebraker M, 2010. SQL databases v. NoSQL databases. Commun ACM, 53(4):10-11. https://doi.org/10.1145/1721654.1721659
Teng DJ, Guo L, Lee R, et al., 2017. LSbM-tree: re-enabling buffer caching in data management for mixed reads and writes. IEEE 37th Int Conf on Distributed Computing Systems, p.68-79. https://doi.org/10.1109/ICDCS.2017.70
Wu XB, Xu YH, Shao ZL, et al., 2015. LSM-trie: an LSM-tree-based ultra-large key-value store for small data items. USENIX Annual Technical Conf, p.71-82.
Yao T, Wan JG, Huang P, et al., 2017. A light-weight compaction tree to reduce I/O amplification toward efficient key-value stores. Proc 33rd Int Conf on Massive Storage Systems and Technology, p.1-13.
Yao T, Zhang YW, Wan JG, et al., 2020. MatrixKV: reducing write stalls and write amplification in LSM-tree based KV stores with a matrix container in NVM. Proc USENIX Conf on Usenix Annual Technical Conf, Article 2.
Zhang YM, Li YK, Guo F, et al., 2018. ElasticBF: fine-grained and elastic bloom filter towards efficient read for LSM-tree-based KV stores. Proc 10th USENIX Conf on Hot Topics in Storage and File Systems, Article 11.
Zhang ZG, Yue YL, He BS, et al., 2014. Pipelined compaction for the LSM-tree. IEEE 28th Int Parallel and Distributed Processing Symp, p.777-786. https://doi.org/10.1109/IPDPS.2014.85
Zhu YC, Zhang Z, Cai P, et al., 2017. An efficient bulk loading approach of secondary index in distributed log-structured data stores. Proc 22nd Int Conf on Database Systems for Advanced Applications, p.87-102. https://doi.org/10.1007/978-3-319-55753-3_6