Falcon Overview

A Secure and Interpretable Federated Learning Platform

Overview

Falcon currently targets vertical federated learning (VFL), allowing multiple parties to train machine learning (ML) models without disclosing their raw data. First, it supports VFL training and prediction with strong and efficient privacy protection for a wide range of ML models. The protection is achieved by a hybrid strategy that combines threshold partially homomorphic encryption (PHE) with an additive secret sharing scheme (SSS). Second, it facilitates understanding of VFL model predictions through a flexible and privacy-preserving interpretability framework, which enables state-of-the-art interpretability methods to be implemented in a decentralized setting. Third, it supports efficient data parallelism for VFL tasks and optimizes the parallelism factors to reduce the overall execution time.
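
To make these building blocks concrete, the following is a minimal Python sketch of additive secret sharing, the SSS half of the hybrid strategy. It is an illustration only: Falcon's actual field size, share representation, and PHE-to-SSS conversion protocol may differ.

    import secrets

    PRIME = 2**61 - 1  # an illustrative prime modulus; Falcon's field may differ

    def share(secret, n_parties):
        # Split the secret into n additive shares that sum to it modulo PRIME.
        shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
        shares.append((secret - sum(shares)) % PRIME)
        return shares

    def reconstruct(shares):
        # Recover the secret by summing all shares modulo PRIME.
        return sum(shares) % PRIME

    # Each party holds one share; any proper subset of shares reveals nothing.
    x_shares = share(42, n_parties=3)
    y_shares = share(100, n_parties=3)

    # The scheme is linearly homomorphic: parties add their shares locally
    # to obtain shares of x + y, without any communication.
    z_shares = [(x + y) % PRIME for x, y in zip(x_shares, y_shares)]
    assert reconstruct(z_shares) == 142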

Motivating Example

Falcon focuses on cross-silo data collaboration in the vertical federated learning setting. A motivating example is in the digital banking scenario, where a bank and a Fintech company aim to jointly build a machine learning model that evaluates credit card applications.

[Figure: the digital banking collaboration scenario]

The bank has some partial information about the users (e.g., account balances), while the Fintech company has complementary information (e.g., the users’ online transactions). Through this data collaboration, the bank obtains a more accurate model, while the Fintech company can benefit from a pay-per-use model for its contribution to training and prediction.

System Architecture

The Falcon system consists of three main components: coordinator, agent, and executor.

[Figure: Falcon system architecture]

The coordinator schedules jobs across the other components of the system. It accepts jobs, such as model training, prediction, and interpretability computation, from users (e.g., data analysts) and returns the results. A job specifies the parties involved, the location of each party’s private data, and the job’s hyper-parameters, as illustrated below.
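
As an illustration, a training job submitted to the coordinator might look like the following; the field names and layout are hypothetical, not Falcon's actual job schema.

    # A hypothetical job specification; field names are illustrative only.
    job_spec = {
        "job_type": "train",            # e.g., train, predict, or interpret
        "model": "logistic_regression",
        "parties": [
            {"party_id": 0, "role": "active",  "data_path": "/data/bank/users.csv"},
            {"party_id": 1, "role": "passive", "data_path": "/data/fintech/txns.csv"},
        ],
        "hyper_parameters": {"learning_rate": 0.01, "batch_size": 64, "epochs": 10},
    }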

The agent is local to each party. It receives a task from the coordinator and creates one or more task managers (i.e., TM workers). Each TM worker creates an executor, manages that executor’s life-cycle, and periodically reports its status to the TM master on the coordinator, as sketched below.
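
The worker pattern can be sketched roughly as follows; the names, threading model, and reporting mechanism here are purely illustrative, not Falcon's implementation.

    import threading
    import time

    def report_to_master(task_id, status):
        # Stand-in for the periodic status report sent to the TM master.
        print(f"[TM master] {task_id}: {status}")

    def tm_worker(task_id):
        # Each TM worker launches an executor, tracks it, and reports status.
        report_to_master(task_id, "executor started")
        time.sleep(0.1)  # stand-in for the executor performing its task
        report_to_master(task_id, "finished")

    workers = [threading.Thread(target=tm_worker, args=(f"task-{i}",))
               for i in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()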

The executor of each party maintains the data, cryptographic keys, and trained VFL models. It contains a cryptographic primitive module that provides the PHE and SSS building blocks, a secure operator module that supports operations based on these primitives, an algorithm module that leverages the secure operators to build privacy-preserving VFL models, a model trainer for model fitting, a predictor for computing predictions, and an interpretability module that computes prediction explanations.
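
As a flavor of how the PHE building block enables secure operators, the toy example below aggregates encrypted partial results using the open-source python-paillier (phe) library as a stand-in. Note that Falcon uses a threshold variant of PHE, in which the decryption key is split so that no single party can decrypt on its own; the single-key library here is for illustration only.

    from phe import paillier

    public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

    # Each party encrypts its local partial result (e.g., a partial dot product).
    enc_parts = [public_key.encrypt(v) for v in (0.7, -1.2, 3.4)]

    # Ciphertexts are additively homomorphic: the sum is computed under
    # encryption, so no party ever sees the others' plaintext inputs.
    enc_total = enc_parts[0] + enc_parts[1] + enc_parts[2]

    print(private_key.decrypt(enc_total))  # approximately 2.9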


Open Source

Our GitHub repository can be found here.


Publications

Our research papers can be found here.


NUS DBsystem

AI- and Data-driven Financial Management and Analytics