Write-ahead logging (WAL) is a common approach for providing recovery capability while allowing no-force buffer management and improving performance in most storage systems, ranging from traditional DBMSes to emerging NoSQL systems (a.k.a. distributed key-value stores) such as HBase and Cassandra. Nevertheless, because this approach keeps the log separate from the application data, it incurs write overheads that become visible in write-heavy applications and adversely affect both write throughput and recovery time.
To avoid these problems, LogBase adopts log-only storage, which removes the write bottleneck and supports fast system recovery. The overall architecture of LogBase is depicted in the above figure. Like other NoSQL systems, LogBase is designed to be deployed dynamically in a cluster environment so that it can take advantage of elastic scaling. We design a multi-version index strategy to support efficient access to the data maintained in the log. We also enhance LogBase to support transactions that bundle multiple read and write operations across multiple data records and provide snapshot isolation. Since LogBase is optimized for writes, the main cost lies in reads and scans. Therefore, novel indexing structures and processing strategies are designed to ensure query performance.
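
The idea of log-only storage with a multi-version index can be illustrated with a minimal, single-process sketch. This is not LogBase's actual implementation: the class name `LogOnlyStore`, its methods, the in-memory list standing in for the log, and the integer version counter are all assumptions made for illustration. The sketch only shows that the log can be the sole copy of the data, that an index mapping each key to its (version, log offset) pairs makes reads possible, and that the index can be rebuilt by scanning the log.

```python
import bisect
from collections import defaultdict


class LogOnlyStore:
    """Toy log-only store (hypothetical): the append-only log is the only data copy."""

    def __init__(self):
        self.log = []                    # append-only records: (key, version, value)
        self.index = defaultdict(list)   # key -> [(version, log offset)], ascending
        self.clock = 0                   # assumed monotonically increasing version counter

    def put(self, key, value):
        """Append one record to the log and index its offset; no separate data file."""
        self.clock += 1
        offset = len(self.log)
        self.log.append((key, self.clock, value))
        self.index[key].append((self.clock, offset))
        return self.clock

    def get(self, key, as_of=None):
        """Return the newest value of key, or the newest one with version <= as_of."""
        versions = self.index.get(key)
        if not versions:
            return None
        if as_of is None:
            _, offset = versions[-1]
        else:
            # Find the last index entry whose version does not exceed as_of.
            pos = bisect.bisect_right(versions, (as_of, float("inf")))
            if pos == 0:
                return None
            _, offset = versions[pos - 1]
        return self.log[offset][2]

    def rebuild_index(self):
        """Recovery in this sketch: one scan of the log reconstructs the index."""
        self.index = defaultdict(list)
        self.clock = 0
        for offset, (key, version, _value) in enumerate(self.log):
            self.index[key].append((version, offset))
            self.clock = max(self.clock, version)
```

For example, after `v1 = store.put("a", "x")` followed by `store.put("a", "y")`, calling `store.get("a")` returns the latest value while `store.get("a", as_of=v1)` returns the older one. Reading at a fixed version in this way is the kind of access on which a snapshot-style read could be built, although the sketch does not implement transactions.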