I work on applying machine learning, algorithmics, and statistical approaches to genomic data to answer questions related to cancer.


Next-generation sequencing technologies generate genomic and transcriptomic data of extremely high dimensionality. Machine learning techniques and statistical approaches are well suited to mining such data, tackling a variety of problems related to complex diseases, and paving the way to personalized medicine.


My research projects include using circulating miRNAs as biomarkers for breast cancer (BC) screening in Rwandan women and designing an integrative signature to predict response to neoadjuvant chemotherapy (NAC) in BC. For these projects, I developed an optimized machine learning pipeline that aims to discover short molecular biomarker signatures for use in the clinic.


I have been involved in many projects. I designed a flexible RNA-seq analysis pipeline comprising the following steps: QC and cleaning, mapping, UMI deduplication, summarization, and visualization. I applied gene co-expression network approaches to RNA-seq data to study gene module preservation and information flow. My research projects have led me to work with RNA-seq data generated using different library kits: Ovation SOLO, CATS (Diagenode), and Illumina. I have also worked with miRNA data generated by qPCR and metabolite data generated by mass spectrometry (MS).





