Our research addresses data science questions in decoding the epigenomic mechanisms of cancer. Our approaches include novel method development, large-scale data integration, and close wet-lab collaboration. Our main projects are listed below.


Integrative Epigenomics

The growing body of public epigenomic datasets enables us to better interpret gene regulation through integrative approaches. These datasets include those measuring chromatin protein binding (ChIP-seq, CUT&Tag, etc.), chromatin accessibility (ATAC-seq, DNase-seq, etc.), 3D chromatin looping (Hi-C, HiChIP, etc.), and sequence editing or mutation in non-coding regions (CRISPR screens, targeted sequencing, etc.). Our lab develops computational methods to integrate and interpret these datasets.

  • Understanding enhancer function with multi-omics data integration. Key questions in interpreting enhancer function include which factors are involved and how enhancers organize gene regulation. We integrate sequencing assays that measure enhancer activity, enhancer-protein binding, gene expression, and enhancer-gene interactions to provide quantitative answers to these questions. With well-designed integrative approaches, we uncovered novel insights into the internal organization of super-enhancers. By integrating public datasets into in-house domain-specific applications, we identified critical enhancer regulators in cancers.

  • Harmonizing epigenomic sequencing variability across a wide range of biological conditions. One barrier to efficiently integrating epigenomic sequencing data is data heterogeneity, both within a single data modality and across modalities. This heterogeneity arises from the diverse biological and technical parameters of different studies. For example, ChIP-seq datasets are subject to high measurement variability due to PCR-induced GC-content bias and other intrinsic bias factors. We develop statistical models to deconvolve such data heterogeneity and better interpret epigenomic sequencing signals.


Cancer Genomics

Cancers are the main biological setting in which we apply our computational techniques. Our recent work focuses on understanding oncogenesis mechanisms through close collaboration with wet-lab colleagues. We study oncogenesis mechanisms triggered by oncoviruses (e.g., Epstein–Barr virus (EBV) and human papillomavirus (HPV)) and by clonal hematopoiesis.

  • Oncogenesis driven by virus-associated 3D chromatin looping. One of our key hypotheses is that viruses alter the 3D looping of the host genome during oncogenesis. This gives us a unique angle from which to evaluate key biomarkers in virus-triggered cancers. We found that viruses may generate critical super-enhancers involved in reorganizing human gene expression.

  • Clonal hematopoiesis biomarkers across solid tumors. Clonal hematopoiesis has been found to be strongly associated with reduced survival in cancer patients. The molecular mechanisms, however, remain unclear. We work closely with the ORIEN network to decode the relationships between cancers and clonal hematopoiesis, and to understand the common and cancer-specific roles of clonal hematopoiesis.