Upcoming events and workshops

Our latest events and workshops

We deliver regular events and workshops about computation and statistics and their application to various disciplines of health, food sciences and conservation.

Statistical Bioinformatics Seminar

The Statistical Bioinformatics Seminar is hosted jointly by the Sydney Precision Data Science Centre and by the Integrative Systems and Modelling Theme and the Judith and David Coffey Life Lab in the Charles Perkins Centre. The aim of this series is to provide a forum for people working within the broad area of computation and statistics, and their application to various aspects of biology, to present their work and showcase their ongoing projects. It is intended to foster the exchange of ideas and build potential collaborations across multiple disciplines.

To be added to the mailing list, fill out this form. For any other information, please contact data-science.admin@sydney.edu.au.

The seminars are held at 1:00 pm on Mondays and broadcast using Zoom. Each talk runs for approximately 25 minutes, followed by 5 minutes of questions.

The seminar series has concluded for Semester 1 and will resume on July 29.

Upcoming Seminars in Semester 2, 2024

Judith and David Coffey Speaker

Speaker: Prof Robert Lanfear (ANU)

Abstract: How do you estimate a good phylogeny? Phylogenetic trees form the backbone of much of our understanding of evolution, so it's important that we get them right. Many of us had hoped that the recent deluge of sequence data would help to resolve most of the difficult branches in the tree of life, but this has rarely been the case. Debates rage on, fuelled by the observation that small changes in models or data often lead to dramatically different phylogenetic conclusions. In this talk, I'll introduce a number of attempts we've made recently to try to improve phylogenetic inference. This will include various approaches to improving and extending phylogenetic models, as well as ways of revealing and investigating topological variation within individual datasets. I'll also preview our plans for the future.

About the speaker: Robert Lanfear is a professor of molecular evolution and phylogenetics at the Australian National University in Canberra. He works on rates and patterns of molecular evolution, from somatic mutations that accumulate in individuals to substitutions that accumulate over millions of years. He also tries to develop and extend phylogenetic methods to better understand evolutionary history using large genomic datasets.

This event will be held in-person and online.

Join on Zoom or in-person at the Mackenzie Seminar Room, Level 6, Charles Perkins Centre.

Past Seminars

June 3 - What was old is new again: anecdotes from the analysis of new single cell technologies

Judith and David Coffey Seminar

Speaker: Prof Alicia Oshlack (Peter MacCallum Cancer Centre)

Abstract: Single cell RNA sequencing technologies have been available for a decade. However, new technologies are still being developed and refined to provide deeper and more accurate insights into single cells. New technologies such as the Chromium-compatible 10X FLEX are designed to help with clinical sample collection and use a probe-based capture design for RNA. We are using this technology to develop a cell atlas of the pediatric airways. Along the way we have been exploring aspects of design and analysis for this technology. This includes low-level analysis and QC of cells as well as probe design and gene summarisation. Other new technologies include using long-read sequencing to explore full-length isoforms in single cells. We are working on approaches to simulate and analyse differential transcript usage across single cells.

About the speaker: Professor Alicia Oshlack has been at the forefront of bioinformatics research for nearly 20 years. She is the Head of the Computational Biology Program and group leader at the Peter MacCallum Cancer Centre. She is best known for her large body of work on transcriptional analysis and medical genomics. In addition, Oshlack is internationally recognised for her development of bioinformatics methods for a range of applications including single cell RNA-seq, methylation and genomic analysis. Oshlack is involved in many cutting-edge collaborative projects utilising high throughput sequencing to investigate disease and development. She has published more than 120 papers and has developed more than twenty software packages. Oshlack has been recognised with several awards including the Australian Academy of Science Gani Medal for Human Genetics (2011), the Georgina Sweet Award for Women in Quantitative Biomedical research (2016) and the Senior Fellow award from the Australian Bioinformatics and Computational Biology Society (ABACBS, 2023), and was recently elected as a member of the Australian Academy of Health and Medical Sciences.

May 27 - Connecting the brain with body in neuroimaging research and its implication in neuropsychiatry

Speaker: Dr Ye Ella Tian (University of Melbourne)

Abstract: Integrated research into brain and body systems holds substantial clinical potential in addressing multimorbidity and physical illness burden in people with neuropsychiatric disorders. In the first part of my talk, I will introduce a multiorgan aging network to demonstrate how the aging of one organ system selectively and characteristically influences the aging of other brain and body systems. I will show multiorgan aging profiles for 16 chronic brain and body disorders and their relationship to mortality risk prediction. Paper link: https://www.nature.com/articles/s41591-023-02296-6

In the second part of my talk, I will present a multiorgan, system-wide characterization of brain and body health for common neuropsychiatric disorders. I will show that individuals diagnosed with these neuropsychiatric disorders are not only characterized by deviations from normative reference ranges for brain phenotypes but also present considerably poorer physical health across multiple body systems compared to their healthy peers. I will show that poor physical health is a more pronounced manifestation of neuropsychiatric illness compared to brain changes. To close the talk, I will call for integrated and holistic mental and physical health care in psychiatry. Paper link: https://jamanetwork.com/journals/jamapsychiatry/fullarticle/2804355

About the speaker: Dr Ye Ella Tian is an NHMRC Emerging Leadership Fellow at the Department of Psychiatry, The University of Melbourne. She is a psychiatrist and neuroscientist by training and holds a PhD in systems neuroscience. She works at the interface between neuroscience, computation and translational research, applying brain imaging techniques to clinical research. She leads the development of the Melbourne Subcortex Atlas. Her current research focuses on brain-body relationships in mental illness across the lifespan.

May 20 - A statistical approach for removing joint and individual unwanted variation from single-cell multi-omics data

Speaker: Hsiao-Chi Liao (University of Melbourne)

Abstract: Single-cell multimodal technologies provide an opportunity to study biological mechanisms in a more comprehensive manner. The CITE-seq (cellular indexing of transcriptomes and epitopes) assay simultaneously measures mRNA and surface proteins at the single-cell level, and is one of the most popular single-cell multi-omics platforms. The integrated analysis of mRNA expression and protein abundance can help reveal biological insight that would not have been possible from separate analyses of each modality. Unwanted variation from sources such as shared batches and domain-specific library size effects inevitably exists in data from both domains. If not properly corrected, the unwanted variation can lead to misleading conclusions being drawn from the downstream analyses. We propose a method for removing unwanted variation from matched single-cell multi-omics data that allows us to estimate joint and modality-specific unwanted effects. In our preliminary study with four matched single-cell multi-omics datasets, we have shown that our approach is generally competitive in terms of preserving biological signals and mitigating the undesired technical effects compared to current methods such as Seurat, and can do better when the biological and unwanted variation are associated, as it can avoid removing too much biological signal from the data.

About the speaker: Hsiao-Chi is a doctoral candidate at the School of Mathematics and Statistics, The University of Melbourne, supervised by Dr. Agus Salim, Dr. Terry Speed, and Dr. Davis McCarthy. Her research interests are integrative analysis of single-cell multi-omics datasets and methods development for analysing omics data. She is currently working on her PhD projects that aim to develop statistical methods for removing unwanted variation from proteomics and transcriptomics data.

May 13 - Genetics of sensory nutrition

Speaker: Dr Daniel Hwang (University of Queensland)

Abstract: Perception of taste and smell shapes our food preferences and choices, playing a pivotal role in dietary habits. Given that dietary intake is a key risk factor for various chronic conditions, including obesity, cardioembolic disorders, and cancers, comprehending the impact of individual variations in sensory perception on eating behaviour and its subsequent effects on health is crucial. This presentation will delve into the genetics of taste perception and food intake, address current challenges and explore future directions for utilizing this knowledge in personalized nutrition and intervention strategies.

About the speaker: Daniel Hwang is an ARC DECRA Fellow at the Institute for Molecular Bioscience, The University of Queensland. He studied Biochemistry as an undergraduate at the National Taiwan University and a Master in Biotechnology at the University of Pennsylvania. Following graduation, he conducted research at the Monell Chemical Senses Center where he first developed a keen interest in genetics and the perception of smell and taste (See his taste work in The Conversation). He was later awarded scholarships to complete a master’s degree in Nutrition at the University of Washington and a PhD degree in Genetic Epidemiology at the QIMR Berghofer Medical Research Institute. Daniel develops and applies statistical methodologies to large-scale high-dimensional data to understand how genes influence human sensory perception, dietary behaviour and related health conditions. He is an EMCR member of the National Committee for Nutrition of the Australian Academy of Science and a Leadership Team member of the Global Consortium for Chemosensory Research. He is on the editorial boards of BMC Medicine and Twin Research and Human Genetics. A complete list of work can be found at: https://researchers.uq.edu.au/researcher/20291.

May 6 - Bacterial-host interactions in the context of the tumor microenvironment

Speaker: Dr Jorge Galeano (Fred Hutchinson Cancer Center)

Abstract: The tumor-associated microbiota has been gaining significant attention due to its ability to promote cancer progression in tumors from the gastrointestinal tract. Microbiome analysis has identified the microorganisms that could be associated with cancer cell malignancy during metastasis and chemoresistance. For instance, the enrichment of Fusobacterium nucleatum in the tumor tissue has been associated with poor clinical outcomes in patients with colorectal cancer. However, it is still unknown how these microorganisms are spatially distributed across the tumor tissue and with which elements of the tumor tissue they interact to promote cancer progression. In this work, by modifying existing technologies that can map the spatial distribution of RNA transcripts and protein molecules along the tumor tissue, we found that intratumoral bacteria reside in distinct microniches that are functionally different from other microcompartments from the same tumor sample. The bacteria-infected microniches were characterized as hypoxic and largely immunosuppressive, with an increased infiltration of pro-inflammatory myeloid cells such as neutrophils and macrophages and an exclusion of T cells. Cancer epithelial cells that resided in microniches containing bacteria exhibited limited capacity to proliferate, with severe chromosome instability. We conclude that intratumoral bacteria are not randomly distributed across the tumor tissue; instead, they are highly organized in distinct microniches that can modulate the biological function of other elements of the tumor microenvironment, including the anti-tumor immune response and cancer epithelial cell compartments.

About the speaker: Jorge completed his medical degree at the National University of Colombia. He then migrated to Sydney, Australia, where he studied actin dynamics in cytotoxic T cells during antigen engagement with cancer cells at the University of Sydney. During his doctoral degree he investigated the molecular mechanisms that drive T cell recruitment in the tumor tissue at the EMBL Australia UNSW node. He is currently doing his postdoctoral training at Fred Hutch in Seattle, USA, where he is studying the influence of bacteria in promoting cancer development.

April 29 - Micro-macro adventures in cellular diversity

Speaker: Dr Fabio Zanini (UNSW)

Abstract: Single cell sequencing has recently opened a new frontier in biomedicine, enabling characterisation of cell identities and behaviours with granularity and scale. Most analyses take place at a mesoscopic scale of sorts: thousands of cells from a specific organ and organism (e.g. adult human blood) are individually sequenced and later computationally grouped (clustering, pseudotime) to gain biological insights. In this talk, I will tell two stories that challenge this paradigm in quite opposite ways. First, I will describe HyperSeq, a new experimental method combining microscopy and transcriptomics to find cellular diversity where there appears to be none: a healthy cell line. Second, I will outline atlas approximations, a desperate quest to fit all cellular diversity on Earth into a single online place.

About the speaker: Fabio Zanini is a group leader at UNSW focussing on single cell biomedicine and open source software development. After a PhD at the Max Planck Institute (Germany) and a postdoc at Stanford University (USA), he moved to Australia in 2019 to create the Data Driven Biomedicine lab (https://fabilab.org). His research covers three areas: (i) Data science on specific biomedical systems including development of the lung, viral infections, immunology, and cancer, (ii) Bioinformatic software including HTSeq 2.0 and igraph, and (iii) Innovation in single cell methods including computational algorithms and experimental protocols. Fabio's overall research passion is cellular diversity.

April 22 - Systematic comparison of sequencing-based spatial transcriptomic methods with cadasSTre and SpatialBench

Judith and David Coffey Seminar

Speaker: Prof Matthew Ritchie (WEHI)

Abstract: Sequencing-based Spatial Transcriptomics (sST) allows gene expression to be measured within complex tissue contexts. Although a wide array of sST technologies are currently available to researchers, efforts to comprehensively benchmark different platforms are currently lacking. The inherent variability across technologies and datasets poses challenges in formulating standardized evaluation metrics. To address this, we established a collection of reference tissues and regions characterized by well-defined histological architecture and other biological ground truth and used them to generate the cadasSTre and SpatialBench datasets that compare 11 sST methods. We highlight molecular diffusion as a variable parameter across different methods and tissues, significantly impacting the effective resolution. Furthermore, we observed that spatial transcriptomic data demonstrate unique attributes beyond merely adding a spatial axis to single-cell data, including an enhanced ability to capture patterned rare cell states along with specific markers, albeit being influenced by multiple factors including sequencing depth and resolution. For the 10X Visium platform, we benchmarked the performance of different sample handling approaches after preprocessing, explored spatially variable gene detection and the ability of clustering and cell deconvolution to identify expected cell types and tissue regions. Multi-sample differential expression analysis was able to recover known gene signatures related to biological sex or gene knockout. Our datasets and analyses serve as a practical guide for sST users and will be useful in future benchmarking studies.

About the speaker: Professor Matt Ritchie has been a lab head at WEHI for the past 11 years. His team develops analysis methods and open-source software tailored to new applications of genomic technology in biomedical research. In the single-cell and spatial biology field, this work includes tools for data preprocessing (scPipe), benchmarking at scale (CellBench) and new protocols and analysis methods (FLAMES) for applying long-read sequencing to single-cell research. His most recent research is on developing benchmarking resources for sequencing-based spatial transcriptomics technologies (cadasSTre and SpatialBench). Matt completed his PhD on microarray data analysis at WEHI in 2005 under the supervision of Professor Gordon Smyth, which was followed by a period of post-doctoral research at the EBI (Hinxton, UK) and University of Cambridge before returning to WEHI as a Senior Research Officer in 2008. He is a keen advocate of open-source software, having served on both the Technical Advisory Board and Community Advisory Board of the Bioconductor project.

April 15 - High dimensional tensor methods for multi-modal single cell genomics data

Speaker: Kwangmoon Park (University of Wisconsin-Madison)

Abstract: Emerging single cell technologies that simultaneously capture long-range interactions of genomic loci together with their DNA methylation levels are advancing our understanding of 3D genome structure and its interplay with the epigenome at the single cell level. While methods to analyze data from single cell high throughput chromatin conformation capture (scHi-C) experiments are maturing, methods that can jointly analyze multiple modalities with scHi-C data are lacking. In this talk, I present two tensor modeling frameworks, Muscle and SHOPS, to jointly analyze 3D conformation and DNA methylation data measured at the single cell level. First, I present Muscle, a joint decomposition of Multiple single cell tensors. Muscle is a novel tensor decomposition method that can integrate the scHi-C and DNA methylation modalities with direct interpretability. Next, I introduce SHOPS, Sparse Higher Order Partial Least Squares, which provides inference on the direct association between Hi-C and DNA methylation. SHOPS is a new tensor response regression method that simultaneously denoises the scHi-C tensor and selects the most relevant methylation sites with dimension reduction.

About the speaker: Kwangmoon Park is a Statistics Ph.D. Candidate at the University of Wisconsin-Madison. He is currently working on statistical genomics and high dimensional statistics with Professor Sündüz Keleş. Before joining UW-Madison, he earned a master’s degree in Statistics at Yonsei University in 2020. He earned a B.A. in Economics and Statistics at Yonsei University in Korea and studied Economics as an exchange student at Erasmus Universiteit Rotterdam in the Netherlands. Kwangmoon Park is mainly interested in questions related to understanding how genes are regulated by distal regions in the genome, particularly by functional non-coding regions. For that purpose, he develops statistical tools for analyzing high-dimensional genomic data, including Hi-C and HiChIP, and for linking diverse types of genomic or epigenomic data with better statistical interpretation. The statistical methodologies he works on are related to tensor factorization/regression and dimension reduction techniques, including Partial Least Squares.

April 8 - scNovel: a scalable deep learning-based network for novel rare cell discovery in single-cell transcriptomics

Speaker: Yixuan Wang (CUHK)

Abstract: Single-cell RNA sequencing has achieved massive success in biological research fields. Discovering novel cell types from single-cell transcriptomics has been demonstrated to be essential in the field of biomedicine, yet is time-consuming and needs prior knowledge. With the unprecedented boom in cell atlases, auto-annotation tools have become more prevalent due to their speed, accuracy, and user-friendly features. However, existing tools have mostly focused on general cell type annotation and have not adequately addressed the challenge of discovering novel rare cell types. In this work, we introduce scNovel, a powerful deep learning-based neural network that specifically focuses on novel rare cell discovery. By testing our model on diverse datasets with different scales, protocols, and degrees of imbalance, we demonstrate that scNovel significantly outperforms previous state-of-the-art novel cell detection models, achieving the best AUROC performance (it is the only method whose average AUROC is above 94%, up to 16.26% higher than the second-best method). We validate scNovel's performance on a million-scale dataset to further illustrate its scalability. Applying scNovel to a clinical COVID-19 dataset, we identify three potential novel subtypes of macrophages, in which the COVID-related differential genes are also found to have consistent expression patterns through deeper analysis. We believe that our proposed pipeline will be an important tool for high-throughput clinical data in a wide range of applications.

About the speaker: Yixuan Wang is a second-year Ph.D. student in the Department of Computer Science and Engineering at The Chinese University of Hong Kong, co-advised by Prof. Yu Li and Prof. Irwin King. She received her B.S. degree at the Harbin Institute of Technology in 2022. She focuses on developing innovative deep learning approaches to address computational issues in the realms of biology and healthcare, with a specific emphasis on tackling challenges related to single-cell data. She has published six papers in Nature Communications, Bioinformatics, Briefings in Bioinformatics, and RECOMB.

March 25 - Benchmarking long-read RNA-sequencing analysis tools using in silico mixtures

Speaker: Dr Xueyi Dong (WEHI)

Abstract: The lack of benchmark datasets with inbuilt ground-truth makes it challenging to compare the performance of existing long-read isoform detection and differential expression analysis workflows. Here, we present a benchmark experiment using two human lung adenocarcinoma cell lines that were each profiled in triplicate together with synthetic, spliced, spike-in RNAs (“sequins”). Samples were deeply sequenced on both Illumina short-read and Oxford Nanopore Technologies long-read platforms. Alongside the ground-truth available via the sequins, we created in silico mixture samples to allow performance assessment in the absence of true positives or true negatives. Our results show that StringTie2 and bambu outperformed the other tools among the 6 isoform detection tools tested; DESeq2, edgeR and limma-voom were best amongst the 5 differential transcript expression tools tested; and there was no clear front-runner for performing differential transcript usage analysis among the 5 tools compared, which suggests further methods development is needed for this application.

About the speaker: Dr Xueyi Dong is a postdoctoral research officer in the Chen lab in the ACRF Cancer Biology and Stem Cells division at the Walter and Eliza Hall Institute of Medical Research (WEHI). She completed her undergraduate degree at Zhejiang University in China, majoring in biological science (2014-2018). She completed her PhD at WEHI in 2023 under the supervision of Prof. Matthew Ritchie, Dr. Charity Law and Prof. Gordon Smyth. Her current research primarily involves the analysis of spatial transcriptomics data and the investigation of RNA splicing.

March 18 - Leveraging genetic diversity to fine-map causal variants of complex traits

Speaker: Dr Mingxuan Cai (City University of Hong Kong)

Abstract: Fine-mapping prioritizes risk variants identified by genome-wide association studies (GWASs), serving as a critical step to uncover biological mechanisms underlying complex traits. The major challenges of fine-mapping arise from the homogeneous LD patterns and unadjusted confounding bias in GWAS samples, leading to sub-optimal power and false positives. Here, we develop a statistical method for cross-population fine-mapping (XMAP) by leveraging genetic diversity and accounting for confounding bias. By using cross-population GWAS summary statistics from global biobanks and genomic consortia, we show that XMAP can achieve greater statistical power, better control of false positive rate, and substantially higher computational efficiency for identifying multiple causal signals, compared to existing methods. Importantly, we show that the output of XMAP can be integrated with single-cell datasets, which greatly improves the interpretation of putative causal variants in their cellular context at single-cell resolution.

About the speaker: Dr. Cai is an Assistant Professor at the Department of Biostatistics, City University of Hong Kong. He obtained his PhD degree from The Hong Kong University of Science and Technology in 2022. His broad area of interest lies in statistical machine learning and data science with applications in genetics and genomics data analysis.

March 11 - From genetic variants to cellular function: morphological profiling genes associated with atrial fibrillation

Speaker: Dr Ling Xiao (Harvard Medical School)

Abstract: Atrial fibrillation (AF) is a complex disease and the molecular mechanisms leading to AF in the general population remain unknown. Genome-wide association studies (GWAS) have identified over 100 genetic loci associated with AF. However, it is still a big challenge to identify the cellular programs through which genes from AF-associated variants and genetic loci modulate the risk for AF. To address this challenge, we systematically applied a high-content imaging assay to analyze AF genes in a human pluripotent stem cell (hPSC)-derived atrial cardiomyocyte (aCM) cell model. We performed CRISPR-based perturbations to delete AF candidate genes in Cas9-expressing hPSC-derived aCMs and used the Cell Painting pipeline to analyze atrial cell structures obtained with high-content imaging. Our results indicate that cardiomyocyte structural abnormalities contribute to AF pathogenesis. Understanding the functions of candidate AF genes at the GWAS loci has the potential to uncover novel biological mechanisms and potential drug targets for novel therapeutics.

About the speaker: Dr. Ling Xiao is an Instructor in Investigation at the Cardiovascular Research Center of Massachusetts General Hospital and an Instructor in Medicine at Harvard Medical School. Her research focuses on leveraging large human genetic datasets and state-of-the-art functional genomic approaches to study cardiovascular diseases.

March 4 - Leveraging natural genetic variation to understand human biology  

Judith and David Coffey Seminar

Speaker: Prof Daniel MacArthur (Garvan Institute)

Abstract: The human population, through explosive growth, has performed a comprehensive saturation mutagenesis experiment on itself: any single base substitution that is compatible with life is expected to be present somewhere in the genome of at least one of the nearly 8 billion living humans. Our species has thus, in effect, done many of the natural experiments required to understand our own genotype-phenotype map; the goal of geneticists is to generate the right data from the right people to understand this map, and to convert it into actionable information that can be used in the prediction, diagnosis, and treatment of disease. In this presentation I will discuss the impact of large-scale genomics in international biobanks and healthcare settings on our understanding of human biology; review some of the most important goals for the field of human genetics over the next 5 years; and emphasise the urgent need for more representative resources of human genetic and genomic data to ensure the equitable benefit of future advances in genomic medicine.

About the speaker: Daniel is the Director of the Centre for Population Genomics, based jointly at the Garvan Institute of Medical Research, Sydney and the Murdoch Children’s Research Institute in Melbourne.

He completed his PhD at the University of Sydney before moving to postdoctoral studies at the Wellcome Trust Sanger Institute in Cambridge, UK, and then a faculty position at Harvard Medical School, Massachusetts General Hospital, and the Broad Institute of MIT and Harvard in Boston. In this position he co-directed the Broad Institute’s Program in Medical and Population Genetics, as well as the NIH-funded Center for Mendelian Genomics, which sequenced the exomes, genomes, and/or transcriptomes of over 10,000 individuals from families affected by severe Mendelian disease. He also led the Genome Aggregation Database (gnomAD) consortium, which produces the world’s largest publicly accessible catalogue of human genetic variation, now spanning data from more than 800,000 individual exomes and genomes.

Daniel returned to Australia in 2020 to lead the new Centre for Population Genomics (CPG), now a team of 40 researchers, software engineers, community engagement experts, and other professional staff. The Centre's mission is to establish respectful partnerships with diverse Australian communities, to work with those communities to collect and analyse genomic data at transformative scale, and to use these data to drive both novel genomic discovery and the development of equitable genomic medicine. The CPG currently leads national projects in the development of more representative resources of genetic variation in Australian communities; improving the genomic diagnosis of rare disease; and combining large-scale genetic and cellular genomic data to understand gene function.

February 26 - Single-cell and Spatial Transcriptomics Clustering with an Optimized Adaptive K-Nearest Neighbor Graph

Speaker: Dr Jia Li (Vanderbilt University Medical Center) 

Abstract: Single-cell and spatial transcriptomics have been widely used to characterize the cellular landscape in complex tissues. To understand cellular heterogeneity, one essential step is to define cell types through unsupervised clustering. While typical clustering methods have difficulty in identifying rare cell types, approaches specifically tailored to detect rare cell types gain their ability at the cost of poorer performance for grouping abundant ones. Here, we developed aKNNO, a method to identify abundant and rare cell types simultaneously based on an adaptive k-nearest neighbor graph with optimization. Benchmarked on 38 simulated and 20 single-cell and spatial transcriptomics datasets, aKNNO identified both abundant and rare cell types accurately. Without sacrificing performance for clustering abundant cell types, aKNNO discovered known and novel rare cell types that typical and even specifically tailored methods failed to detect. aKNNO, using the transcriptome alone, stereotyped fine-grained anatomical structures more precisely than integrative approaches that combine expression with spatial locations and histology images.

About the speaker: Dr Jia Li is currently a Postdoctoral Fellow in the Department of Biostatistics at Vanderbilt University Medical Center. Her research is focused on analysis and method development for single cell RNA sequencing and spatial transcriptomics data.

October 30 - Integrative annotation scores of variants for impact on RNA binding protein activities

Speaker: Jingqi Duan (University of Wisconsin-Madison)

Abstract: The ENCODE project generated a large collection of eCLIP-seq RNA binding protein (RBP) profiling data with accompanying transcriptome data from RNA-seq experiments of RBP knockdowns by shRNA. However, these datasets are not fully exploited to elucidate the impact of genetic variants on RBP activities. We implement INCA (Integrative annotation scores of variants for impact on RBP activities) as a multi-step genetic variant scoring approach that leverages the ENCODE RBP data together with ClinVar and integrates multiple computational approaches to aggregate evidence. INCA hinges upon evaluating the impact of the variants on the RBP activities by leveraging the genotyped cell lines that harbor these variants. We show that INCA provides critical specificity for the set of candidate variants and their linkage disequilibrium partners even after they are generically scored for impact on RBP binding. As a result, it can augment scoring of 46.2% of the candidate variants for follow-up on average.

About the speaker: Jingqi Duan is a fourth-year Ph.D. student in the Department of Statistics at the University of Wisconsin-Madison. She is currently a research assistant in the Keles Research Group. Her research focuses on advancing statistical and computational methods tailored for the analysis of high-throughput sequencing data, such as eCLIP-seq and Perturb-seq, with the aim of enhancing gene regulation analysis.

October 23 - Statistical and computational methods for spatial transcriptomics data analysis

Speaker: Dr Ying Ma (Brown University)

Abstract: Spatial transcriptomics technologies have enabled gene expression profiling on complex tissues with spatial localization information. The majority of these technologies, however, effectively measure the average gene expression from a mixture of cells of potentially heterogeneous cell types on each tissue location. Here, I develop a deconvolution method, CARD, that combines cell-type-specific expression information from single-cell RNA sequencing (scRNA-seq) with correlation in cell-type composition across tissue locations. Modeling spatial correlation allows us to borrow the cell-type composition information across locations, improving accuracy of deconvolution even with a mismatched scRNA-seq reference. CARD can also impute cell-type compositions and gene expression levels at unmeasured tissue locations to enable the construction of a refined spatial tissue map with a resolution arbitrarily higher than that measured in the original study and can perform deconvolution without a scRNA-seq reference. In a real data application on the human pancreatic ductal adenocarcinoma (PDAC) dataset, CARD identified multiple cell types and molecular markers with distinct spatial localization that define the progression, heterogeneity, and compartmentalization of pancreatic cancer. In addition, if time allows, I will also discuss my other methodological work on integrative differential expression and gene set enrichment analysis in scRNA-seq studies, integrative reference-informed tissue segmentation in SRT studies, and collaborative work on polygenic risk scores for common health-related exposure traits in the Michigan Genomics Initiative (MGI) cohort.

About the speaker: Dr. Ying Ma is an Assistant Professor at the Department of Biostatistics and the Center for Computational Molecular Biology at Brown University. Her research interests focus on developing efficient statistical learning methods to address a variety of biological problems and computational challenges in genomics and genetics. These challenges typically arise with the high-dimensional data generated by rapidly evolving sequencing technologies, e.g., single-cell RNA-seq (scRNA-seq) and spatially resolved transcriptomics (SRT). With the emergence of these large-scale data, she has been continually motivated to develop tailored statistical models to advance our understanding of cellular heterogeneity, tissue organization, and the underlying mechanisms of various types of cancers. Besides her genomics research, she also works on genetic risk prediction and polygenic risk score problems in large biobanks such as UK Biobank and MGI.

October 9 - Modelling Host-Microbiome Interactions in Physiological and Pathological Processes

Judith and David Coffey Seminar

Speaker: Dr Elaine Holmes (Health Futures Institute, Murdoch University)

Abstract: The use of metabolic profiling to define metabolic phenotypes associated with a wide range of pathologies is expanding, and demand for sensitive, high-quality disease diagnostics has facilitated the development of new technological and statistical methods for extracting biomarkers from spectroscopic data obtained from biofluids such as urine, serum and stool extracts. These metabolite signatures can subsequently be modelled with other ‘-omic’ data, including next generation sequencing data, in order to establish connections between the gut bacteria and human (patho)physiology. Examples of urinary or faecal metabolites that are products of the microbiota, or of microbiota-host interactions, include phenols, indoles, bile acids, short chain fatty acids and choline derivatives, all of which can be quantitatively profiled using spectroscopic technology. Thus the metabolic phenotype can provide a window onto dynamic biochemical responses to physiological and pathological stimuli and also contains information relating to the metabolic activity and function of the gut microbiome.

In order to optimise information recovery from the spectra, analytical strategies for spectral alignment, scaling, curve resolution and quantification, statistical correlation and annotation are necessary. Some exemplar analytical pipelines are presented here, with particular focus on a series of methods for enhancing biomarker detection via a family of statistical correlation algorithms. Cross-correlation of multiplatform data allows further characterisation and extraction of improved molecular descriptors of metabolites identified as candidate biomarkers, which, in turn, can provide new insights into perturbed pathways and aetiopathogenetic mechanisms through correlation hierarchies of related metabolites. This systems analysis framework extends to encompass other data types such as metagenomic or metatranscriptomic data and can identify new correlates between datasets and establish biological coherence across metabolic pathways and networks.

About the speaker: Elaine Holmes is an ARC Laureate Fellow at Murdoch University, where she runs the Centre for Computational and Systems Medicine in the Health Futures Institute. She was elected as a Fellow of the Academy of Medical Sciences in 2018 and the Australian Academy of Science in 2022. Holmes is one of the pioneers in the development and implementation of metabolic phenotyping in translational clinical paradigms. The analytical framework conceptualised for metabolic phenotyping and biomarker discovery has been applied across several disease areas. She also co-developed the Metabolome-Wide Association Study concept and has shown that the microbial component of the metabolic profile is associated with a wide range of conditions including obesity, inflammatory bowel disease, allergies, and certain cancers. Her current focus is around computational modelling of metabolic and metagenomic data to understand the role of the gut microbiome in healthy aging with specific interest in the influence of nutrition on the microbiome.

September 25 - Statistical methods for complex kidney disease data 

Speaker: Dr Yunwei Zhang (Macquarie University)

Abstract: This talk gives an accessible and intuition-based overview of my PhD project, "Statistical methods for complex kidney disease data". I will start by highlighting some of the useful statistical modelling and data science techniques within the kidney disease domain. Motivated by improving personalised healthcare for end-stage kidney disease patients, I will walk you through my journey of tackling the statistical challenges that arise when simulating deceased kidney donor allocation approaches, predicting post-transplant all-type survival and devising visualisation tools for individualised decision-making. This talk highlights how applied statistical and data science methods make a real impact on addressing real-world health challenges.

About the speaker: Yunwei Zhang began her PhD in July 2019 under the supervision of Jean Yang and Samuel Muller. She submitted her thesis in December 2022 and has since moved into postdoctoral roles, first at the University of Melbourne and now at Macquarie University. She is interested in and works on methods that improve personalised healthcare in interdisciplinary fields by using statistics.

September 18 - Harnessing the power of artificial intelligence for improved MRI-guided cancer radiotherapy

Speaker: Dr David Waddington (The University of Sydney)

Abstract: Real-time tumour targeting with MRI-guidance has recently become possible with the advent of the MRI-Linac, which combines the unrivalled image quality of MRI with a linear accelerator (Linac) for x-ray radiation therapy. However, the relatively low spatio-temporal resolution of real-time MRI reduces the accuracy with which radiation beams can be adapted to tumour motion (e.g. respiration). New low-latency imaging techniques are thus essential to improving the quality of MRI-Linac treatments. In this seminar, I will describe artificial-intelligence-based tools we have developed that can improve target tracking during radiation treatments. I will discuss the successes we have had (and the challenges faced) in deploying these AI tools to clinical systems.

About the speaker: Dr David Waddington is an NHMRC Emerging Leadership Fellow at the Image X Institute in the Faculty of Medicine and Health at the University of Sydney. As an early career researcher (PhD(Science), The University of Sydney, 2018), he specializes in developing new imaging technologies based on Magnetic Resonance Imaging (MRI) that can be used for the targeting of cancer therapeutics. David has published high impact, first-author research articles in multidisciplinary journals including Nature Communications and Science Advances that have led to two patent applications. He has been the recipient of prestigious academic awards including a 2013-14 Postgraduate Fulbright Scholarship (Harvard University) and a University Medal in Physics (UNSW - 2010). His work has won international conference prizes including two Best in Physics awards at the American Association of Physicists in Medicine (AAPM - 2020, 2022) and two Summa Cum Laude awards at the International Society of Magnetic Resonance in Medicine (ISMRM - 2016, 2020). David has given several invited talks at international imaging conferences and presented at public events such as TEDx.

September 4 - Understanding cancer in adolescents and young adults through clinical panel sequencing 

Speaker: Dr Siyuan Zheng (UT Health, San Antonio)

Abstract: Cancer poses unique challenges in adolescents and young adults (AYAs) for several reasons. First, because of their relatively young age, they are often diagnosed late. Second, AYAs often face reproductive and developmental issues that are usually of less concern for older cancer patients. Third, long-term health problems associated with cytotoxic treatments can cause morbidity affecting decades of patients’ lives. Fourth, for unknown reasons, AYAs are less often enrolled in clinical trials in the US. Finally, research attention is directed more to pediatric and older adult patients. Many cancer hospitals have not yet established AYA-specific clinical care programs. To improve our understanding of AYA cancers, we analyzed clinical panel sequencing data from AACR GENIE. GENIE is a consortium effort that comprises sequencing data for more than 100,000 patients. The dataset is highly heterogeneous, but it is also a rich resource for genomic studies due to its sample size and diversity. In this presentation, I will discuss our findings on the genomic and clinical disparities of AYA cancers.

About the speaker: Dr. Zheng is an Assistant Professor and PI at Greehey Children’s Cancer Research Institute, the University of Texas Health Science Center at San Antonio. He received his B.S. in Biochemistry from Nanjing University, China, and PhD in Bioinformatics from Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, China. After his PhD, he received postdoctoral training at Vanderbilt University and UT MD Anderson Cancer Center. During his time at MD Anderson, he participated in TCGA and led the TCGA adrenal cancer project. At UT Health, his group mines cancer genomic datasets to understand cancer aneuploidy, telomeres, and cancer disparity. He is a CPRIT Scholar in Cancer Research and a recipient of the UT Rising STARs Award in 2019. More information about him can be found on his GitHub site.

August 28 - Polygenic risk score analysis with the addition of higher order interactions provides insight into protective and risk components of type 2 diabetes 

Speaker: Keri Multerer (Victoria University of Wellington)

Abstract: Polygenic risk scores (PRS) based on genome-wide association studies (GWAS) have been studied since about 2008. The PRS concept is based on obtaining individual DNA sequence information that is then compared for disease association using summary statistics (derived from GWAS) in public databases. However, current PRS are not yet robust enough to be generally used in a clinical setting for flagging high-risk individuals. Using data available in the UK Biobank with type 2 diabetes as the disease model, we have developed novel methods to incorporate higher order (epistatic) interaction weights in PRS. This will help us to better understand whether, and by how much, higher order interactions explain individual genetic risk. Currently we have improved feature selection for both main effects and epistatic interactions without sacrificing interpretability of the results, leading to novel insights into the risk and protective variants driving type 2 diabetes.

About the speaker: Keri Multerer is a 3rd year PhD student at Victoria University of Wellington in New Zealand under co-supervision of Andrew Munkacsi and Paul Atkinson. She earned an MSc in Genetics from George Washington University in Washington DC and worked in cancer research at Fred Hutchinson Cancer Research Center in Seattle before taking time off to raise three daughters. Teaching herself to code (in Python) with specialised machine learning courses, Keri has integrated these two passions in her PhD thesis, contributing to the fields of public health and personalised medicine.

August 21 - Genome Canada Transplant Consortium (GCTC) computational approaches in kidney transplantation 

Speaker: Dr Oliver P. Günther (Günther Analytics; for the Genome Canada Transplant Consortium/Vancouver Immunology Lab)

Abstract: Chronic kidney disease is a major societal challenge, costing Canada over $30 billion/year. Transplantation can restore life-long health with tremendous cost savings, but antibody-mediated rejection (AMR) and complications of long-term immune suppression cause premature graft loss or early death in 50% of recipients, with increased costs of care. There are no proven treatments for AMR so prevention and early detection are crucial. Two Genome Canada Transplant Consortium (GCTC) projects will be presented: (1) Use of HLA genotyping and donor-recipient eplet-matching strategies in organ allocation simulation models, and (2) Longitudinal monitoring of gene expression in the peripheral blood of transplant patients to identify immune quiescence or activity. Results from simulations are used to inform organ allocation strategies involving eplet-based matching in Canada while results from the longitudinal genomics analysis will be used to inform AMR-monitoring strategies in combination with other data sets.

About the speaker: Dr Günther has worked with the Immunology Lab at Vancouver General Hospital/University of British Columbia, Canada, as a consultant for the past 5 years, providing customized data analysis, modeling and simulation for projects related to kidney transplantation. He received a Ph.D. in physics from the Goethe University Frankfurt, Germany in 1998. Since then he has focused on data analysis, visualization, modeling and simulation, as well as algorithm and software development, including positions at the University of British Columbia, the British Columbia Centre for Disease Control, and the Prevention of Organ Failure Centre of Excellence in Vancouver. Since 2014, Dr Günther has provided consulting services through Gunther Analytics.

August 14 - Learning consistent subcellular landmarks to quantify changes in multiplexed protein maps

Speaker: Dr Scott Berry (UNSW)

Abstract: Highly multiplexed imaging holds enormous promise for understanding how spatial context shapes the activity of the genome and its products at multiple length scales. We have recently developed a deep-learning framework called CAMPA (Conditional Autoencoder for Multiplexed Pixel Analysis) to learn representations of molecular pixel-profiles that are consistent across heterogeneous cell populations and experimental perturbations. CAMPA identifies consistent subcellular landmarks, which can be quantitatively compared in terms of their sizes, shapes, molecular compositions, and relative spatial organisation. Using high-resolution multiplexed immunofluorescence, this reveals how subcellular organisation changes upon perturbation of RNA production, RNA processing, or cell size, and uncovers links between the molecular composition of membraneless organelles and cell-to-cell variability in bulk RNA synthesis rates. By capturing interpretable cellular phenotypes, we anticipate that CAMPA will accelerate the systematic mapping of multiscale atlases of biological organisation to identify the rules by which context shapes physiology and disease.

About the speaker: Scott has a background in Theoretical Physics and Molecular Biology. He studied for a PhD at the John Innes Centre in Norwich, UK, on mechanisms of epigenetic memory in plants, before moving to the University of Zurich in Switzerland as an HFSP and EMBO postdoctoral fellow. In Zurich, Scott worked on mechanisms of mRNA concentration homeostasis in mammalian cells and on experimental and computational approaches for acquiring and analysing highly multiplexed image data. In 2021, Scott started his group at Single Molecule Science, the EMBL Australia node at the University of New South Wales in Sydney. His group works on quantitative regulation of gene expression at the single-cell level, primarily employing microscopy and systems biology approaches, including mathematical modelling.