NVMeVirt is a virtual NVMe SSD emulator implemented as a Linux kernel module. NVMeVirt consists of a PCIe device emulator, an NVMe controller emulator, and various storage backends, including but not limited to NVM (Non-Volatile Memory) SSD, Conventional SSD, ZNS (Zoned NameSpace) SSD, and KV (Key-Value) SSD. NVMeVirt opens up new opportunities to design highly intelligent storage devices over the NVMe interface.
The management firmware inside an SSD is usually called the Flash Translation Layer (FTL). The FTL plays an important role in determining the performance and reliability of the SSD. We are working on the following research projects related to FTLs.
The goal of vStream is to provide application developers with a large number of virtual streams that are independent of the physical streams available in recent multi-streamed SSDs.
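To illustrate the general idea of decoupling virtual streams from physical streams, here is a minimal sketch. It is not the vStream algorithm. It assumes each virtual stream comes with a hypothetical estimated data lifetime, sorts the virtual streams by that estimate, and splits them into contiguous groups, one group per physical stream. The names `map_virtual_streams`, `lifetimes`, and `num_physical` are illustrative only.

```python
def map_virtual_streams(lifetimes, num_physical):
    """Assign each virtual stream ID to one of num_physical physical streams.

    lifetimes: dict mapping virtual stream ID -> estimated data lifetime
               (a hypothetical metric; any comparable number works).
    Returns a dict mapping virtual stream ID -> physical stream ID.
    """
    if num_physical <= 0:
        raise ValueError("num_physical must be positive")
    # Order virtual streams so that those with similar lifetimes are adjacent.
    ordered = sorted(lifetimes, key=lambda vid: (lifetimes[vid], vid))
    mapping = {}
    for rank, vid in enumerate(ordered):
        # Split the sorted order into num_physical nearly equal contiguous groups.
        mapping[vid] = rank * num_physical // len(ordered)
    return mapping


# Six virtual streams share three physical streams.
example = map_virtual_streams({0: 5, 1: 100, 2: 7, 3: 90, 4: 50, 5: 55}, 3)
```

In this sketch, an application can open as many virtual streams as it likes, and only the mapping has to change when the device exposes more or fewer physical streams.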
We are actively participating in the OpenSSD Project, an initiative to promote research and education on recent SSD technology by providing easy access to OpenSSD platforms. We are also interested in building an ecosystem around the OpenChannel SSD, based on the latest Cosmos+ OpenSSD platform.
NVMeDirect is a novel user-space I/O framework that improves performance by allowing user applications to access the storage device directly. Running MongoDB on the recent NVMeDirect 2.0 framework improves its performance by 10.8% without any code changes.
Storage devices have remained dumb for the past several decades. Now it’s time to make them smarter. We are conducting several research projects toward intelligent SSD-based storage systems.
ForestDB is a key-value storage engine we have developed in collaboration with Couchbase Inc. ForestDB uses a new hybrid indexing scheme called HB+Trie, which allows for efficient indexing and retrieval of arbitrary-length string keys. We are currently working on a version of ForestDB called ForestDB-raw that works on the raw block device. We plan to use ForestDB-raw to implement a user-level file system for NVMeDirect and a storage engine for the Ceph distributed storage system.
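As a rough illustration of how a trie can index arbitrary-length keys by cutting them into fixed-size chunks, consider the toy sketch below. It is a simplification and not ForestDB's actual HB+Trie implementation: plain Python dicts stand in for the per-node index structures, and the 4-byte `CHUNK_SIZE`, the `ChunkTrie` class, and `split_key` are all illustrative assumptions.

```python
CHUNK_SIZE = 4  # illustrative chunk size in bytes (assumption)
_VALUE = object()  # sentinel slot holding the value stored at a node


def split_key(key: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a key into fixed-size chunks; the last chunk may be shorter."""
    return [key[i:i + chunk_size] for i in range(0, len(key), chunk_size)]


class ChunkTrie:
    """Toy trie keyed by fixed-size chunks; dicts stand in for per-node indexes."""

    def __init__(self):
        self.root = {}

    def put(self, key: bytes, value):
        node = self.root
        for chunk in split_key(key):
            # Descend one level per chunk, creating nodes as needed.
            node = node.setdefault(chunk, {})
        node[_VALUE] = value

    def get(self, key: bytes, default=None):
        node = self.root
        for chunk in split_key(key):
            node = node.get(chunk)
            if node is None:
                return default
        return node.get(_VALUE, default)


trie = ChunkTrie()
trie.put(b"user:0001:name", "alice")
trie.put(b"user:0001:mail", "alice@example.org")
```

Keys that share a long prefix, such as the two above, share the same path through the upper levels, so each lookup step compares only one short chunk regardless of the total key length.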
Ceph is a scalable, reliable, and high-performance storage solution that is widely used in cloud computing environments. Our goal is to develop a new storage engine that gets the most out of SSD performance. In addition, we are optimizing the Ceph file system for HPC (High-Performance Computing) environments.
Software Stack Development Optimized for Advanced High Performance Storage, SW Star Lab., IITP & Ministry of Science and ICT, April 2021 ~ December 2028. (PI)
A Smart Distributed Key-Value System for Machine Learning Applications, National Research Foundation of Korea (NRF), September 2019 ~ February 2024. (PI)