We develop machine learning methods to understand the diversity of cancer, using data that range from genomes to digital pathology images.

On the genome side, we are interested in understanding intratumour heterogeneity and cancer evolution. Here are the highlights:

  • Ccube: A fast Bayesian mixture model to estimate cancer cell fractions of simple somatic mutations (e.g. SNVs, indels) paper.
  • SVclone: A method that uses Ccube as its engine to estimate cancer cell fractions of structural variants paper.
  • PCAWG: Ccube and SVclone were used in the successful ICGC-PCAWG project, which analysed ~2700 whole genomes across ~40 cancer types paper1 paper2.
  • BitPhylogeny: A Bayesian nonparametric method to reconstruct intratumour phylogenies paper.

On the image side, we are interested in uncovering the hidden diversity of the tumour microenvironment using digital pathology images.

  • PathologyGAN: A customised generative adversarial network that synthesises high-quality images while learning an interpretable manifold of real images paper1 paper2.

Ongoing collaborations: