Presented here is a protocol to discover the overexpressed driver genes maintaining the established cancer stem-like cells derived from colorectal HT29 cells. RNAseq with available bioinformatics was performed to investigate and screen gene expression networks for elucidating a potential mechanism involved in the survival of targeted tumor cells.
Cancer stem cells play a vital role against clinical therapies, contributing to tumor relapse. There are many oncogenes involved in tumorigenesis and the initiation of cancer stemness properties. Since gene expression in the formation of colorectal cancer-derived tumorspheres is unclear, it takes time to discover the mechanisms working on one gene at a time. This study demonstrates a method to quickly discover the driver genes involved in the survival of the colorectal cancer stem-like cells in vitro. Colorectal HT29 cancer cells that express the LGR5 when cultured as spheroids and accompany an increase CD133 stemness markers were selected and used in this study. The protocol presented is used to perform RNAseq with available bioinformatics to quickly uncover the overexpressed driver genes in the formation of colorectal HT29-derived stem-like tumorspheres. The methodology can quickly screen and discover potential driver genes in other disease models.
Colorectal cancer (CRC) is a leading cause of death with high prevalence and mortality worldwide1,2. Due to gene mutations and amplifications, cancer cells grow without proliferative control, which contributes to cell survival3, anti-apoptosis4, and cancer stemness5,6,7. Within a tumor tissue, tumor heterogeneity allows tumor cells to adapt and survive during therapeutic treatments8. Cancer stem cells (CSCs), with a higher rate of self-renewal and pluripotency than differential cancer types, are principally responsible for tumor recurrence9,10 and metastatic CRC11. CSCs present more drug resistance12,13,14 and anti-apoptosis properties15,16, thus surviving tumor chemotherapies.
Here, in order to investigate the potential mechanism for stemness in the selected CRC stem cells, RNAseq was performed to screen differentially expressed genes in tumor spheroids. The cancer cells can form spheroids (also called tumorspheres) when grown in low adherence conditions and stimulated by growth factors added to the cultured medium, including EGF, bFGF, HGF, and IL6. Therefore, we selected CRC HT29 tumor cells that resist chemotherapies with an increase in phosphorylated STAT3 when treated with oxaliplatin and irinotecon17. In addition, HT29 expressed higher stemness markers when cultured in the described culture conditions. The HT29-derived CSC model expressed higher amounts of leucine-rich repeat-containing G-protein-coupled receptor 5 (LGR5)18, a specific marker of CRC stem cells19,20. Moreover, CD133, considered a general biomarker for cancer stem cells, is also highly expressed in the HT29 cell line21. This protocol's purpose is to discover groups of driver genes in the established cancer stem-like tumorspheres based on bioinformatics datasets as opposed to investigating individual oncogenes22. It investigates potential molecular mechanisms through RNAseq analysis followed by available bioinformatics analyses.
Next generation sequencing is a high-throughput, easily available, and reliable DNA sequencing method based on computational help, used to comprehensively screen driver genes for guiding tumor therapies23. The technology is also used for detecting gene expression from reverse transcription of an isolated RNA sample24. However, when screening with RNAseq, the most important genes to target with therapy may not have the highest expression differential between experimental and control samples. Therefore, some bioinformatics were developed for classifying and identifying genes based on current datasets such as KEGG25, GO26,27, or PANTHER28, including Ingenuity Pathway Analysis (IPA)29 and NetworkAnalyst30. This protocol shows the integration of RNAseq and NetworkAnalyst to quickly discover a group of genes in the selected HT29-derived spheroids compared to parental HT29 cells. Application of this method to other disease models is also suggested for discovering differences in important genes.
Compared to investigation of individual gene expression, a high-throughput technique provides advantages to find potential driver genes easily for tumor precision medicine. With useful datasets such as KEGG, GO, or PANTHER, specific genes can be identified based on the disease models, signaling pathways, or specific functions, and this allows quickly focusing on specific, important genes, saving time and research costs. A similar application is used in previous studies14,18,31. Particularly, a tumor is more complicated because different types of tumors express distinguishing genes and pathways for survival and proliferation. Therefore, this protocol can pick up genes distinguishing different tumor types under different circumstances. There is the potential to find effective strategies against cancers by understanding the mechanism of specific gene expression.
1. Cell culture and tumorsphere formation
2. RNA isolation
NOTE: Use a commercial kit (see Table of Materials) with a rapid column for RNA isolation following the manufacturer’s instructions.
3. RNAseq profiling and bioinformatics analysis
NOTE: RNAseq analysis was performed commercially (see Table of Materials) to investigate the differential genes in the HT29-derived tumorspheres compared to parental HT29 cells.
4. Driver gene selection
To establish the model for investigating the mechanism in cancer stem cells, colorectal HT29 cells were used to culture the cancer stem-like tumorspheres in vitro in a low-attachment plate containing B27, EGF, bFGF, HGF, and IL6. The tumorspheres >100 µm in diameter were formed in 7 days (Figure 1A). The tumorspheres were trypsinized to single cells and analyzed using flow cytometry to detect LGR5 and CD133 expression. LGR5 increased in the HT29-drived tumorspheres from 1.1% to 11.4% and the cells were detected using flow cytometry (Figure 1B). Another stemness marker, CD133, also increased from 61.8% to 81.1% in the cultured HT29-derived tumorspheres compared to parental HT29 cells (Figure 1C). Then, the tumorspheres were ready for RNAseq investigation.
RNAseq was used to investigate the gene expression profile in the HT29-derived tumorspheres compared to the parental HT29 cells. The error rate in nucleotide reading was 0.03% for both samples. The total gene mapping rate was 87.87% for HT29 and 87.25% for HT29-derived tumorspheres. The fragments per kilobase of transcript per million (FPKM), normalizing the detective counts for indicating the transcript (mRNA) expression, interval between 0 and 1 was 75.87% for HT29 cells, and 77.16% for HT29-derived tumorspheres. The results were suitable for consequent sequencing. After sequencing reads, the genes with log2 fold change >1 in upregulation and <-1 in downregulation with p value < 0.05 shown by the heatmap (Figure 2A,Table 1,Table 2) were selected. There were 79 upregulated genes and 33 downregulated genes selected. In addition, the Volcano plot using log2 fold change and p value (-log10 p value) was used to distinguish the significant genes between HT29-derived tumorspheres and parental HT29 cells (Figure 2B). Based on the preliminary selection, three potentially upregulated genes were identified, including ACSS2, HMGCS1, and PCSK9, in the HT29-derived tumorspheres (Figure 2A). Furthermore, in order to identify the driver genes not solely according to the log2 fold change, NetworkAnalyst was used. The upregulated genes with log2 fold change >1 with p value <0.05 were analyzed and resulted in 10 seed genes cross-linking the gene networks, including HSPA5, HSP90AA1, BRCA1, SFN, E2F1, CYCS, CDC6, ALYREF, SQSTM1, and TOMM40 (Figure 3A). To identify the function and significance of the seed genes, the classification interface could be used to perform PANTHER BP to determine the genes involved in anti-apoptosis in HT29-derived tumorspheres. The results indicated that HSPA5 and SQSTM1 were associated with negative regulation of apoptosis (Figure 3A). Moreover, the selected 10 genes were consequently validated; expression increased in the HT29-derived tumorspheres as confirmed using qPCR (Figure 3B).
Figure 1: Establishment of cancer stem-like tumorsphere in vitro. (A) HT29 was used to form tumorspheres, and (B) LGR5 was detected in the HT29-derived tumorspheres (Scale bar = 100 µm). (C) CD133 was used as another marker to identify the stemness characters in the established tumorspheres. Please click here to view a larger version of this figure.
Figure 2: Heatmap and Volcano plot were used to select the differential genes in the HT29-derived tumorspheres. Log2 fold change >1 and <-1 with read count >100, p value < 0.05 were used to exclude nonsignificant genes and make sure the data were accurate. Please click here to view a larger version of this figure.
Figure 3: NetworkAnalyst was used to identify the driver genes in the HT29-derived tumorspheres. (A) The upregulated genes were selected and analyzed consequently, and a classification interface, PANTHER BP, provided the potential functions for the differential driver genes. (B) Then, qPCR was used to validate the 10 genes upregulated in HT29-derived tumorspheres compared to parental HT29 cells. Please click here to view a larger version of this figure.
Table 1: The upregulated genes in HT29-serived tumorspheres compared to parental HT29 cells analyzed by RNAseq. Please click here to download this table.
Table 2: The downregulated genes in HT29-serived tumorspheres compared to parental HT29 cells analyzed by RNAseq. Please click here to download this table.
In this study, cultured cancer stem-like tumorspheres were used as a model in analyzing RNAseq data with available bioinformatics. For a disease model, HT29-derived tumorspheres were used. Because the tumorspheres have drug resistance against tumor therapies, the established model can be used to investigate the detailed mechanisms of resistance by investigating differences in gene expression. Moreover, genomic technology using RNAseq with available bioinformatics provides rapid understanding of the study model so the genes potentially involved can be validated with higher confidence. Also, the kind of genes involved in the formation of tumorspheres can be identified.
RNA quality is critical for the RNAseq analysis32. Ensure that the sample has RIN >7, because it increases certainty between mapping reads, mapping genes, and FPKM. When analyzing RNAseq data, IPA29 and NetworkAnalyst30 were available to identify potential genes and signaling pathways. However, it is essential to rule out the unnecessary genes according to the following parameters: genes showing a >1 log2 fold change with read counts >100 in the experimental group, and genes <-1 log2 fold change with read count > 100 in the control group. Higher read counts are easier for consequent validation using qPCR or Western blots.
Based on the understanding of biological functions and processes, there are many bioinformatics tools allowing rapid investigation of the potential mechanisms of diseases of interest. Combined with a high-throughput tool to screen for gene expression such as RNAseq, potential mechanisms regulating the development of the studied diseases can be proposed and investigated. Here, the method for investigating the formation of CRC stem-like tumorspheres derived from HT29 revealed that SQSTM1 and HSPA5 were the target genes upregulated and involved in anti-apoptosis in tumorspheres. Therefore, more experiments can be designed to investigate the detailed mechanism of these genes, which results in more confidence and efficacy when conducting research studies.
Here, only the upregulated genes in the tumorspheres were analyzed because upregulation was considered to be induced by the addition of the growth factors. Otherwise, if the experiment used gene knockdown in the tumor cells, the downregulated genes would be considered as targets that can be selected for investigation using the bioinformatics. Although the methodology is rapid and dependable, subsequent validation is still needed via qPCR and Western blots. It is also suggested that more cell lines be used for gene expression validation, especially for clinical samples.
The authors have nothing to disclose.
The authors thank the Radiation Biology Core Laboratory of Institute for Radiological Research, Chang Gung Memorial Hospital, for technical support. This study was supported by grants from Chang Gung Memorial hospital (CMRPD1J0321), Cheng Hsin General Hospital (CHGH 106-06), and Mackay Memorial Hospital (MMH-CT-10605 and MMH-106-61). Funding bodies did not have any influence in the design of the study and data collection, analysis and interpretation of data or in writing the manuscript.
iRiS Digital Cell Imaging System | Logos Biosystems, Inc | I10999 | for observing the formation of tumorspheres |
Flow cytometry | BD biosciences | FACSCalibur | for detecting the LGR5 and CD133 in the tumorspheres |
anti-LGR5-PE | Biolegend | 373803 | LGR5 detection reagent |
anti-CD133-PE | Biolegend | 372803 | CD133 detection reagent |
EGF | GenScript | Z00333 | for culture of tumorspheres |
bFGF | GenScript | Z03116 | for culture of tumorspheres |
HGF | GenScript | Z03229 | for culture of tumorspheres |
IL6 | GenScript | Z03034 | for culture of tumorspheres |
PureLink RNA extraction kit | Invitrogen | 12183025 | isolate total RNA for RNAseq analysis |
RNAseq performance | Biotools, Taiwan | RNAseq analysis is done commerially by Biotools, Ttaiwan | |
NetworkAnalyst | Institute of Parasitology, McGill University, Montreal, Quebec, Canada | http://www.networkanalyst.ca/ | |
Prism | GraphPad Software | a statistical analysis software |