Read the Publication

This week we profile a recent publication in Cell from Dr. Gordon Robertson
and colleagues at the Michael Smith Genome Sciences Centre.

Can you provide a brief overview of the work you do for the TCGA?

In a recent Science in the City article on the TCGA uveal melanoma (UVM) publication in Cancer Cell, I outlined TCGA’s goals, structure, projects, and analysis working groups (AWGs); how the BC Cancer Agency’s Genome Sciences Centre (BCGSC) delivered microRNA-sequencing data for ~11,500 tumour samples to the TCGA Research Network; and the miRNA-related analyses that the BCGSC’s TCGA team contributed to marker papers for each of the 33 cancer projects.

Below, as I did for UVM, I’ll sketch some of the analyses and collaborations from the bladder cancer (BLCA) project.

Each TCGA AWG faced a specific competitive publication context and opportunity to contribute. For the TCGA muscle-invasive bladder cancer (BLCA) project, the Research Network had published in 2014 an initial marker paper on a cohort of 131 primary tumours. The cohort had grown to include 412 primary tumours, with 15 adjacent tissue normals, and RPPA protein data for 343 primary tumours. Given the larger cohort, we understood that we should be able to extend the 2014 insights.

As we started miRNA-seq data analysis at the BCGSC, unsupervised consensus clustering quickly identified candidate sets of molecular subtypes. We focused on a 4-cluster solution that separated the mRNA basal-squamous subtype into two subsets. As we had done in other TCGA projects, we then identified miRNA mature strands that were differentially abundant across mRNA subtypes and miRNA subtypes, including adjacent tissue normals. We identified miRNAs whose abundance was influenced by somatic copy number alterations. Luda Danilova at Johns Hopkins (Baltimore MD) identified miRNAs whose abundance was influenced by DNA methylation.
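
For readers unfamiliar with unsupervised consensus clustering, the idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the BCGSC pipeline: the base clusterer (k-means with farthest-first initialisation), the 80% resampling fraction, the function names and the eight two-feature "profiles" are all invented for the example. The sketch repeatedly clusters random subsets of samples and records how often each pair of samples lands in the same cluster; stable subtypes show up as blocks of high pairwise consensus.

```python
import random
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, iters=20):
    """Lloyd's k-means with farthest-first initialisation; returns one label per point."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Add the point that is farthest from all centres chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centre where it was
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, n_resamples=50, frac=0.8, seed=0):
    """For each pair of samples: of the resamples containing both, the fraction that co-clustered them."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    drawn = [[0] * n for _ in range(n)]
    size = max(k, int(frac * n))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        labels = kmeans([points[i] for i in idx], k, rng)
        for (a, lab_a), (b, lab_b) in combinations(zip(idx, labels), 2):
            drawn[a][b] += 1
            drawn[b][a] += 1
            if lab_a == lab_b:
                together[a][b] += 1
                together[b][a] += 1
    return [[1.0 if i == j else (together[i][j] / drawn[i][j] if drawn[i][j] else 0.0)
             for j in range(n)] for i in range(n)]


# Hypothetical abundance profiles: samples 0-3 and samples 4-7 form two clear groups.
profiles = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1),
            (8.0, 9.0), (8.3, 9.1), (7.9, 8.8), (8.1, 9.2)]
cm = consensus_matrix(profiles, k=2)
print(cm[0][1], cm[0][4])  # consensus within a group vs. across groups
```

In practice one would compare consensus matrices for several values of k and choose a solution, such as the 4-cluster solution described above, on both statistical and biological grounds.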

Starting in 2015 with the 80-tumour UVM project, Ewan Gibb and I had brought long noncoding RNAs (lncRNAs) into a number of TCGA marker papers, expanding the BCGSC’s contributions from miRNAs to ‘noncoding RNAs’ that included both miRNAs and lncRNAs. We anticipated that lncRNA expression could be more specific to different biological states than coding genes, and the BLCA cohort was large enough to yield results that were consistent with this. Specifically, two of our lncRNA subtypes subdivided the mRNA luminal subtype; one of these lncRNA subtypes was strongly enriched in FGFR3 mutations while strongly depleted in TP53 mutations, and was associated with better survival.

Mauro Castro (Bioinformatics and Systems Biology Laboratory, Federal University of Parana Polytechnic Center, Curitiba, Brazil) and I had been collaborating on analysis methods for post-transcriptional regulation. His January 2016 publication on the n≈2000 METABRIC breast cancer cohort suggested that, for regulators like ESR1, inferred activity profiles across a cohort can be more informative than the regulator’s expression profile. In the spring of 2016, we invited Mauro to join the AWG. He and Vinicius Chagas quickly showed that this ‘regulon’ analysis appeared effective with the BLCA cohort. As a first step in applying regulon analysis in bladder cancer, we identified 23 hormone receptors and transcription factors that were associated with this cancer in the literature, and demonstrated that their inferred regulon activities could be used to characterize molecular differences across the lncRNA subtypes and the miRNA subtypes.
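
The contrast between a regulator's expression and its inferred activity can be illustrated with a deliberately simplified Python sketch. It is not the regulon method used in the paper: here, as an assumption made only for illustration, a regulator's activity in each sample is scored as the mean z-score of its positively regulated targets minus the mean z-score of its negatively regulated targets. The gene names `T1`, `T2`, `T3` and all expression values are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise one gene's expression values across the cohort."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]


def regulon_activity(expr, positive_targets, negative_targets):
    """Per-sample activity score from target genes.

    expr maps a gene name to a list of expression values, one per sample.
    """
    z = {gene: zscores(values) for gene, values in expr.items()}
    n_samples = len(next(iter(expr.values())))
    activity = []
    for s in range(n_samples):
        up = mean(z[g][s] for g in positive_targets)
        down = mean(z[g][s] for g in negative_targets)
        activity.append(up - down)
    return activity


# Hypothetical cohort of four samples: T1 and T2 are activated by the
# regulator, T3 is repressed by it.
expr = {
    "T1": [9.0, 8.0, 2.0, 1.0],
    "T2": [7.0, 9.0, 1.0, 3.0],
    "T3": [1.0, 2.0, 8.0, 9.0],
}
activity = regulon_activity(expr, positive_targets=["T1", "T2"], negative_targets=["T3"])
print([round(a, 2) for a in activity])
```

Because the score pools evidence from several targets, it can separate samples even when the regulator's own measured expression is noisy or uninformative, which is the intuition behind comparing activity profiles across subtypes.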

As the regulon analyses matured, Mauro, working with Benilton Carvalho (Department of Statistics, University of Campinas, Sao Paulo, Brazil), developed methods for doing multivariate survival analyses with the project’s ~200 clinical and molecular covariates. Towards the end of the work on the manuscript, Tara Lichtenberg (Biospecimen Core Resource, The Research Institute at Nationwide Children’s Hospital, Columbus, OH) updated the survival data from the latest clinical data files available, and Mauro regenerated multivariate penalized Cox regression results from the updated data. The results indicated that subtypes for mutational processes, mRNAs, lncRNAs, and miRNAs were independent factors in survival.
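
To give a sense of what a penalized Cox regression optimizes, the sketch below evaluates a penalized Cox partial log-likelihood in plain Python. Several things here are assumptions for illustration only: a ridge (L2) penalty is used because the text does not say which penalty was applied, tied event times are not handled specially, and the function name and the four-patient dataset are hypothetical. With ~200 covariates, the penalty is what keeps such a model from overfitting.

```python
import math


def penalized_cox_loglik(beta, covariates, times, events, lam):
    """Cox partial log-likelihood minus a ridge penalty lam * sum(beta_k ** 2).

    covariates: one list of covariate values per patient
    times:      follow-up time per patient
    events:     1 if the patient died at that time, 0 if censored
    """
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    loglik = 0.0
    for i, (t_i, died) in enumerate(zip(times, events)):
        if not died:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += scores[i] - math.log(risk)
    return loglik - lam * sum(b * b for b in beta)


# Hypothetical data: one binary covariate (e.g. membership in a subtype);
# the two patients with the covariate set to 1 die earliest.
covariates = [[1.0], [1.0], [0.0], [0.0]]
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 1, 0]

null_fit = penalized_cox_loglik([0.0], covariates, times, events, lam=0.1)
risk_fit = penalized_cox_loglik([1.0], covariates, times, events, lam=0.1)
print(round(null_fit, 3), round(risk_fit, 3))
```

A fitting routine would search for the coefficients that maximize this quantity; a covariate whose coefficient stays clearly away from zero alongside the other covariates is what is meant by an independent factor in survival.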

Finally, during TCGA, the BCGSC had grown to regularly contribute profiles for bacterial and viral signals in RNA and DNA sequencing datasets, to identify when a virus has integrated into the human genome, and when ‘chimeric’ viral-human transcripts are being expressed. These analyses depended, first, on Sara Sadeghi running the BBT sequence classifier, whose speed let her screen 408 mRNA, 412 WES and 136 WGS datasets. Then, for the subset of datasets that Sara’s results indicated were positive for HPV, HHV4 and 5, and Polyomavirus, Karen Mungall’s de novo assembly team assessed evidence for genomic viral integration in RNA and DNA data. The microbe results indicated that viral integration may contribute to BLCA in some cases.

What is the significance of the findings of the current publication?

Bladder cancer is the ninth most common cancer globally, and is the most expensive to treat (Contreras-Sanz 2017, PMID: 27597124). On diagnosis, approximately 75% of bladder cancers are non-muscle-invasive, 50% of which are low grade, while 25% are muscle-invasive and tend to be high grade (Kamat 2016, PMID: 27345655). The TCGA cohort consisted of muscle-invasive tumours.

“Bladder cancer is not one but multiple diseases,” said lead corresponding author Dr. Seth Paul Lerner, Professor of Urology and Beth and Dave Swalm Chair in Urologic Oncology at Baylor College of Medicine, who is the director of Urologic Oncology and the Multidisciplinary Bladder Cancer Program. “Under the microscope, invasive bladder cancer from different patients may look very similar. We know there are many variations though that may affect response to treatment and long-term survival.” He noted that the results provide “a much broader and refined understanding of the molecular underpinnings associated with these variations.”

“One prominent feature in this study is that we report survival analysis, which we did not report in the 2014 paper,” said Lerner. “Now, we are able to show that mutation signatures, molecular subtypes, in addition to known clinical and pathological factors, have a very clear influence on overall patient survival.” Given the study’s rich molecular data and large cohort, there are likely substantial opportunities to generate additional molecular insights by extending the multivariate survival analyses reported.

The researchers propose that the information learned about bladder cancer be taken into consideration when designing precision medicine clinical trials. This study and future research on the cancer subtypes may one day help physicians better match patients with specific, personalized therapies tuned to target the particular molecular signatures and other biological characteristics of their cancer.

“Bladder cancer causes an estimated 150,000 deaths worldwide per year, and we are behind other cancer fields in terms of the clinical applications of its molecular data and biology,” Lerner said. “However, we can begin to see how we can use this information in the future to provide the best treatment for each patient.”

What other projects are you currently working on?

TCGA manuscripts on mesothelioma and testicular germ cell cancer are in the submission and review process, and may require revisions.

A number of AWGs are working in advanced stages on ‘PanCancer’ manuscripts that involve analyses done across all TCGA cancer projects, or across sets of related projects (e.g. gastrointestinal cancers, gynecological cancers, renal cancers). These can require calculations on some ten thousand tumour datasets per genomic platform. Reanne Bowlby is the BCGSC lead for miRNA and other analyses for many of the PanCan projects; for certain of these projects, I’m involved in miRNA analysis, and, through Ewan Gibb, in lncRNA analyses.

In 2016, the BCGSC and the Institute for Systems Biology (ISB) in Seattle were jointly awarded NIH funding that allows a collaborative BCGSC/ISB team to participate in the ongoing Genome Data Analysis Network (GDAN) that is administered by the Center for Cancer Genomics (CCG). In this project, I’m collaborating with the ISB’s Theo Knijnenburg, Varsha Dhankani, Sheila Reynolds and Ilya Shmulevich to run miRNA-related analyses in ISB’s cloud computing infrastructure and to make them generally available from the cloud.


This article was contributed by Dr. Gordon Robertson, who thanks Dr. Ewan Gibb (GenomeDx, Vancouver BC), Dr. Mauro AA Castro (Bioinformatics and Systems Biology Laboratory, Federal University of Parana Polytechnic Center, Curitiba, Brazil), Karen L Mungall, and Dr. Seth Paul Lerner (Professor of Urology and Beth and Dave Swalm Chair in Urologic Oncology at Baylor College of Medicine, Houston TX).