Data Format
COOL uses a native column-oriented data format to facilitate cohort and analytical queries. The storage hierarchy is summarized in the figure.
COOL supports several popular input data formats and can automatically convert them into its native storage format. Currently, we have basic support for CSV, Parquet, Avro, and Arrow files. To load data into COOL, the input data must have a compatible schema and be accompanied by a YAML file that describes the table according to the COOL specification. Additional requirements for each format are listed below. Notably, we require that all records of the same user be grouped together.
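As an illustration of the grouping requirement, the following is a minimal sketch (not part of COOL itself) that reorders the rows of a CSV file so that each user's records are contiguous before loading. The helper name group_rows_by_user, the column name user_id, and the sample data are assumptions made for this example; use the user key field declared in your own table YAML file.

```python
import csv
import io


def group_rows_by_user(csv_text: str, user_column: str) -> str:
    """Return CSV text reordered so that each user's records are contiguous.

    `user_column` is a hypothetical name; pass the user key column of your table.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames
    if fieldnames is None or user_column not in fieldnames:
        raise ValueError(f"column {user_column!r} not found in header")

    rows = list(reader)
    # list.sort() is stable, so records of the same user keep
    # their original relative order.
    rows.sort(key=lambda row: row[user_column])

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Illustrative input in which the records of users u1 and u2 are interleaved.
sample = (
    "user_id,event,time\n"
    "u2,launch,1\n"
    "u1,launch,2\n"
    "u2,shop,3\n"
    "u1,shop,4\n"
)
grouped = group_rows_by_user(sample, "user_id")
print(grouped)
```

This sketch reads the whole input into memory, which is fine for small files; for large datasets the same grouping would more likely be done with an external sort or in the system that exports the data.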
How to Load Data to COOL