1 November 2022 – Congrats to Zacchaeus Chok Yong Hsin, a second-year student from NUS School of Computing who clinched the grand prize at the 2022 Asia Pacific Huawei Developer Competition.
His idea, “Genesis”, a synthetic data platform that helps firms anonymize, generate and analyze synthetic datasets, beat more than 260 participants from nine countries to come out tops. The competition attracted more than 130 teams across the region, with nine finalists pitching their projects to the judges at the finals in Bangkok in September 2022.
Tell us a bit more about your project ‘Genesis’. How was it conceptualized, and what are its key objectives?
Artificial intelligence begins with data. A powerful AI application depends on having the right data, which is both the most important and most difficult part of the process. In the real world, collecting quality data is time-consuming, expensive, and complicated. This is especially true in healthcare.
Recent advancements in AI have precipitated breakthroughs in generative systems, from synthetic voices to photorealistic artwork. GENESIS improves on this foundational technology (e.g. generative adversarial networks, attention-based models) to synthetically generate large-scale, privacy-preserving healthcare datasets that are representative of the wider population.
As Singapore’s population ages and healthcare costs rise, it is shifting its focus to preventive health. This means we need to be more creative and calibrated in healthcare delivery. Synthetic data not only defrays the cost of lengthy data collection workflows but also uncovers more and varied insights into how we can optimise healthcare resource allocation. Better data leads to better intelligence.
Could you give us a real-life example of how Genesis can be used by businesses?
Healthcare institutions (e.g. Primary Care Networks, Hospital Clusters) have access to fragmented healthcare data. Using GENESIS, these smaller datasets could be transformed into large-scale anonymised datasets for downstream analysis. Importantly, GENESIS's proprietary algorithms can be customised to create representative samples of our population (e.g. by gender or ethnicity). Datasets synthesised via GENESIS could be used to build health risk assessments.
Synthetic data can also help to:
- Increase dataset size, enabling researchers to apply state-of-the-art deep learning models to extract insights.
- Selectively augment datasets to account for protected and sensitive attributes (e.g. gender, ethnicity).
- Mitigate algorithmic bias in healthcare machine learning applications.
- Retain all the properties and patterns of original data while not relating to real individuals.
Since synthetic data is artificial, it can be shared among healthcare practitioners without the privacy concerns that come with real patient records.
How did you feel about winning the Grand Prize?
Said Zacchaeus, “This represents a milestone for my project and I want to thank all the people who supported me. I definitely want to further develop the project. Winning strong technical validation from leading CTOs and Cloud/AI experts is a testament to GENESIS's technology roadmap. This paves the way for wide commercial adoption.”
Genesis is currently supported by the School of Computing Venture Initiation Programme, which provides financial grants to students with innovative ideas to support their startup ventures.