ES2: Elastic Storage System
Overview
A typical data management system has to handle both real-time
updates by individual users and periodic large-scale
analytical processing, indexing, and data extraction.
We provide an elastic cloud data storage system, called ES2, which
is designed to support both kinds of workload within the same
storage system. Its features include:
- Elastic scaling
- Hybrid storage - supporting both OLTP and OLAP
- Flexible data partitioning based on the database
workload
- Load-adaptive replication
- Transactional semantics for bundled updates
- DBMS-like index functionality
- Multiple indexes of different types: hash, range,
multi-dimensional, and bitmap
System Architecture
Overview of Elastic Storage System (ES2)
- Data import control module supports
efficient bulk loading of data from external sources. Data
can be loaded from a variety of sources, such as
databases stored in conventional DBMSes, plain or structured
data files, and the intermediate data generated by other
Cloud applications.
- Physical storage module contains the major components:
a distributed file system (DFS), a meta-data catalog, and
distributed indexing. The DFS is where the imported data are
actually stored. The meta-data catalog maintains both meta
information about the tables in the storage and the
fine-grained statistics required by the data
access control module.
- Data access control module is responsible for
serving data access requests from the upper layer
applications. It has two sub-components: the data access
interface and the data manipulator. The data access interface
parses data access requests into the corresponding
internal representations that the data manipulator operates
on, and a near-optimal data access plan (a parallel
sequential scan, an index scan, or a hybrid) is chosen for
locating and operating on the target data stored in the
physical storage module.
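
The choice among a parallel sequential scan, an index scan, and a hybrid plan can be illustrated with a minimal Python sketch. It is not ES2's actual planner: the names TableStats and choose_access_plan, the selectivity rule, and both thresholds are assumptions made for illustration. The only idea taken from the description above is that the plan is picked using statistics of the kind kept in the meta-data catalog.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical per-table statistics, standing in for catalog entries."""
    row_count: int
    indexed_columns: frozenset


def choose_access_plan(stats, column, estimated_matches,
                       index_threshold=0.05, scan_threshold=0.5):
    """Return a plan name based on estimated selectivity.

    The thresholds are illustrative assumptions, not values from ES2.
    """
    # Without rows or a usable index, only a full scan is possible.
    if stats.row_count == 0 or column not in stats.indexed_columns:
        return "parallel_sequential_scan"
    selectivity = estimated_matches / stats.row_count
    if selectivity <= index_threshold:
        return "index_scan"                # few matching rows: follow the index
    if selectivity >= scan_threshold:
        return "parallel_sequential_scan"  # most rows match: scan in parallel
    return "hybrid"                        # in between: combine both strategies


stats = TableStats(row_count=1_000_000,
                   indexed_columns=frozenset({"user_id"}))
print(choose_access_plan(stats, "user_id", 100))
print(choose_access_plan(stats, "user_id", 200_000))
print(choose_access_plan(stats, "name", 100))
```

Under these assumed thresholds, a highly selective lookup on an indexed column maps to an index scan, a request on a column without an index falls back to a parallel sequential scan, and intermediate selectivities map to the hybrid plan.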