This article details the use of an anchored multiplex polymerase chain reaction-based library preparation kit followed by next-generation sequencing to assess for oncogenic gene fusions in clinical solid tumor samples. Both wet-bench and data analysis steps are described.
Gene fusions frequently contribute to the oncogenic phenotype of many different types of cancer. Additionally, the presence of certain fusions in samples from cancer patients often directly influences diagnosis, prognosis, and/or therapy selection. As a result, the accurate detection of gene fusions has become a critical component of clinical management for many disease types. Until recently, clinical gene fusion detection was predominantly accomplished through the use of single-gene assays. However, the ever-growing list of gene fusions with clinical significance has created a need for assessing fusion status of multiple genes simultaneously. Next generation sequencing (NGS)-based testing has met this demand through the ability to sequence nucleic acid in massively parallel fashion. Multiple NGS-based approaches that employ different strategies for gene target enrichment are now available for use in clinical molecular diagnostics, each with its own strengths and weaknesses. This article describes the use of anchored multiplex PCR (AMP)-based target enrichment and library preparation followed by NGS to assess for gene fusions in clinical solid tumor specimens. AMP is unique among amplicon-based enrichment approaches in that it identifies gene fusions regardless of the identity of the fusion partner. Detailed here are both the wet-bench and data analysis steps that ensure accurate gene fusion detection from clinical samples.
The fusion of two or more genes into a single transcriptional entity can occur as the result of large scale chromosomal variations including deletions, duplications, insertions, inversions, and translocations. Through altered transcriptional control and/or altered functional properties of the expressed gene product, these fusion genes can confer oncogenic properties to cancer cells1. In many cases, fusion genes are known to act as primary oncogenic drivers by directly activating cellular proliferation and survival pathways.
The clinical relevance of gene fusions for cancer patients first became apparent with the discovery of the Philadelphia chromosome and the corresponding BCR-ABL1 fusion gene in chronic myelogenous leukemia (CML)2. The small molecule inhibitor imatinib mesylate was developed to specifically target this fusion gene and demonstrated remarkable efficacy in BCR-ABL1-positive CML patients3. Therapeutic targeting of oncogenic gene fusions has also been successful in solid tumors, with inhibition of ALK and ROS1 fusion genes in non-small cell lung cancer serving as primary examples4,5. Recently, the NTRK inhibitor larotrectinib was FDA approved for NTRK1/2/3 fusion-positive solid tumors, regardless of disease site6. Beyond therapy selection, gene fusion detection also has roles in disease diagnosis and prognosis. This is particularly prevalent in various sarcoma and hematologic malignancy subtypes that are diagnostically defined by the presence of specific fusions and/or for which presence of a fusion directly informs prognosis7,8,9,10,11. These are but a few of the examples of the clinical application of gene fusion detection for cancer patients.
Because of its critical role in clinical decision making, accurate gene fusion detection from clinical samples is of vital importance. Numerous techniques have been applied in clinical laboratories for fusion or chromosomal rearrangement analysis, including cytogenetic techniques, reverse transcription polymerase chain reaction (RT-PCR), fluorescence in situ hybridization (FISH), immunohistochemistry (IHC), and 5’/3’ expression imbalance analysis (among others)12,13,14,15. The rapidly expanding list of actionable gene fusions in cancer has created a need to assess the fusion status of multiple genes simultaneously. Consequently, some traditional techniques that can query only one or a few genes at a time are becoming inefficient, especially because clinical tumor samples are often limited in quantity and cannot be divided among several assays. Next-generation sequencing (NGS), however, is an analysis platform that is well suited for multi-gene testing, and NGS-based assays have become common in clinical molecular diagnostic laboratories.
Currently used NGS assays for fusion/rearrangement detection vary in the input material used, the chemistries employed for library preparation and target enrichment, and the number of genes queried within an assay. NGS assays can be based on RNA or DNA (or both) extracted from the sample. Although RNA-based analysis is hampered by the tendency of clinical samples to contain highly degraded RNA, it circumvents the need to sequence the large and often repetitive introns that are the targets of DNA-based fusion testing and that have proven difficult for NGS data analysis16. Target enrichment strategies for RNA-based NGS assays can be largely divided into hybrid capture and amplicon-mediated approaches. Both strategies have been used successfully for fusion detection, and each has advantages over the other17,18. Hybrid capture assays generally result in more complex libraries and reduced allelic dropout, whereas amplicon-based assays generally require lower input and result in less off-target sequencing19. However, perhaps the primary limitation of traditional amplicon-based enrichment is the need for primers to all known fusion partners. This is problematic because many clinically important genes are known to fuse with dozens of different partners, and even if primer design allowed for detection of all known partners, novel fusion events would remain undetected. A recently described technique termed anchored multiplex PCR (AMP) addresses this limitation20. In AMP, a ‘half-functional’ NGS adapter is ligated to cDNA fragments that are derived from input RNA. Target enrichment is achieved by amplification between gene-specific primers and a primer to the adapter. As a result, all fusions to genes of interest, even if a novel fusion partner is involved, should be detected (see Figure 1).
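The difference between the two amplicon strategies can be illustrated with a toy model. The sketch below is a deliberate simplification and not part of the assay software: a fusion is reduced to a (partner, target) pair, the gene sets are illustrative, and `NOVEL_PARTNER` is a hypothetical placeholder for a previously unreported partner gene.

```python
def amplified_by_opposing_primers(fusion, target_genes, known_partners):
    """Traditional amplicon enrichment: a primer is needed on both sides of the junction."""
    partner, target = fusion
    return target in target_genes and partner in known_partners


def amplified_by_amp(fusion, target_genes):
    """AMP: a gene-specific primer on the target side, an adapter primer on the other.

    The adapter is ligated to every cDNA fragment, so the identity of the
    partner gene does not matter.
    """
    partner, target = fusion
    return target in target_genes


# Illustrative panel content (assumption, not the kit's primer design)
target_genes = {"RET", "ALK"}
known_partners = {"KIF5B", "EML4"}

known_fusion = ("KIF5B", "RET")
novel_fusion = ("NOVEL_PARTNER", "RET")  # hypothetical novel partner

for fusion in (known_fusion, novel_fusion):
    print(
        fusion,
        "opposing primers:", amplified_by_opposing_primers(fusion, target_genes, known_partners),
        "AMP:", amplified_by_amp(fusion, target_genes),
    )
```

In this model both strategies amplify the known fusion, but only the AMP function returns `True` for the fusion with the unlisted partner.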
This article describes the use of the ArcherDx FusionPlex Solid Tumor kit, an NGS-based assay that employs AMP for target enrichment and library preparation, for the detection of oncogenic gene fusions in solid tumor samples (see Supplementary Table 1 for complete gene list). The wet-bench protocol and data analysis steps have been rigorously validated in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory.
1. Library preparation and sequencing
2. Data analysis
Figure 3, Figure 4, and Figure 5 show screenshots from the analysis user interface for a lung adenocarcinoma sample. The top of Figure 3 shows the sample summary, which lists the called strong-evidence fusions as well as the QC status (circled in red). The ADCK4-NUMBL fusion (of which 3 isoforms are listed) is immediately ignored because it is a persistent transcriptional readthrough event (noted by the broken circle icons next to the listings). The bottom of Figure 3 is a screenshot of the Read Statistics page. Particularly informative metrics are circled in green. The critical QC metric, which determines the pass/fail status shown in the summary above, is highlighted in red. If this value is below 20, a negative fusion result is deemed ‘uninformative’. Figure 4 and Figure 5 show the fusion schematics (top) and JBrowse views of fusion-supporting reads (bottom) for the two fusions that warranted further investigation. The KIF5B-RET fusion (Figure 4) demonstrates a high number of supporting reads, a high percentage of reads from the primer supporting the fusion, and a high number of start sites. Additionally, several contiguous exons of the fusion partner (KIF5B) are included in the alignment, the fusion was verified to be in frame, and the reads supporting the fusion are generally free of mismatches. For these reasons, the fusion is deemed real and reportable. The LOC101927681-PDGFRB fusion (Figure 5) demonstrates lower supporting metrics. Furthermore, the portions of the reads mapping to the partner are relatively short and contain a high error rate, which strongly suggests misalignment and an artifactual fusion call. Finally, intronic sequence of PDGFRB is included, further suggesting an artifact (most, but not all, legitimate fusions are composed solely of sequence that maps to coding regions of both genes). For these reasons, this fusion is deemed an artifact and not reportable.
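The interpretation logic described above can be summarized in a short sketch. It is an illustration only, not the vendor's analysis software: the class and function names are hypothetical, the manual-review outcome is supplied by the user rather than computed, and the rule that a reviewed, reportable fusion is still reported when the QC metric is low is an assumption inferred from the statement that only negative results are deemed uninformative.

```python
from __future__ import annotations

from dataclasses import dataclass

# Threshold stated in the text: below this value, a negative result is uninformative.
QC_THRESHOLD = 20


@dataclass
class FusionCall:
    name: str
    is_known_readthrough: bool = False  # e.g., a persistent transcriptional readthrough event
    passed_manual_review: bool = False  # reviewer's decision after inspecting reads and metrics


def interpret_sample(qc_metric: float, calls: list[FusionCall]) -> str:
    """Combine the sample QC metric with reviewed fusion calls into a final result."""
    reportable = [
        c.name for c in calls
        if not c.is_known_readthrough and c.passed_manual_review
    ]
    if reportable:
        return "positive: " + ", ".join(reportable)
    if qc_metric < QC_THRESHOLD:
        return "uninformative"
    return "negative"


# The lung adenocarcinoma example; the QC value of 50 is an arbitrary passing value.
calls = [
    FusionCall("ADCK4-NUMBL", is_known_readthrough=True),
    FusionCall("KIF5B-RET", passed_manual_review=True),
    FusionCall("LOC101927681-PDGFRB", passed_manual_review=False),
]
print(interpret_sample(50, calls))
```

A sample with no reportable calls would return `negative` when the QC metric is 20 or higher and `uninformative` when it is below 20.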
Figure 1: Schematic representation of AMP approach. Traditional amplicon-mediated approaches for target enrichment are limited by the fact that primers are needed for all fusion partners. Thus, novel fusion partners will not be detected. In AMP, the opposing primer is specific to the adapter, thus novel partners are detected.
Figure 2: Correlation between PreSeq QC score and post-sequencing QC. The PreSeq Ct and average unique start sites per GSP2 Control values were correlated for a series of 100 samples. Generally, as expected, low PreSeq scores correlate with higher SS/GSP2 values (both are indicators of good quality RNA).
Figure 3: Example sample summary and read statistic views. Top: view of a sample summary in the user interface with strong evidence fusions listed. Bottom: view of the read statistics page in the user interface.
Figure 4: Example of a legitimate fusion call. Top: view of the fusion schematic in the user interface. Bottom: JBrowse view of individual reads supporting the fusion.
Figure 5: Example of an artifactual fusion call. Top: view of the fusion schematic in the user interface. Bottom: JBrowse view of individual reads supporting the fusion.
AKT3 | EWSR1 | NOTCH1 | PRKCA |
ALK | FGFR1 | NOTCH2 | PRKCB |
ARHGAP26 | FGFR2 | NRG1 | RAF1 |
AXL | FGFR3 | NTRK1 | RELA |
BRAF | FGR | NTRK2 | RET |
BRD3 | INSR | NTRK3 | ROS1 |
BRD4 | MAML2 | NUMBL | RSPO2 |
EGFR | MAST1 | NUTM1 | RSPO3 |
ERG | MAST2 | PDGFRA | TERT |
ESR1 | MET | PDGFRB | TFE3 |
ETV1 | MSMB | PIK3CA | TFEB |
ETV4 | MUSK | PKN1 | THADA |
ETV5 | MYB | PPARG | TMPRSS2 |
ETV6 | | | |
The assay includes gene-specific primers to various exons (and some introns) in the above 53 genes.
Supplementary Table 1: List of genes for which gene-specific primers are included in the assay (to various exons and introns).
Anchored multiplex PCR-based target enrichment and library preparation followed by next-generation sequencing is well suited for multiplexed gene fusion assessment in clinical tumor samples. By focusing on RNA input rather than genomic DNA, the need to sequence large and repetitive introns is avoided. Additionally, since this approach amplifies gene fusions regardless of the identity of the fusion partner, novel fusions are detected. This is a critical advantage in the clinical realm, and there have been many examples of actionable novel gene fusions identified through AMP reported in the literature21,22,23,24,25.
Because the assay is RNA based, it is critical to preserve RNA quality in samples during processing. It is equally critical to identify samples whose RNA sequencing data are too poor for negative fusion results to be trusted. This is achieved by assessing sequencing data from primers to four housekeeping genes. If sequencing of these genes is poor, negative fusion results are deemed uninformative. In addition, given the complexity and number of wet-bench steps in the assay, it is important to include a fusion-positive control in every assay run. Compromised assay runs then become apparent through analysis of the expected fusion events in the control.
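The positive-control check lends itself to a simple set comparison. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the fusion names are illustrative only. The actual list of expected fusions should be taken from the control material's documentation.

```python
def check_positive_control(expected_fusions, detected_fusions):
    """Return (run_ok, missing), where missing lists expected fusions that were not detected."""
    missing = sorted(set(expected_fusions) - set(detected_fusions))
    return (not missing, missing)


# Illustrative fusion names (assumption, not the control's actual contents)
expected = ["EML4-ALK", "KIF5B-RET", "CD74-ROS1"]

# A run in which every expected fusion was called
print(check_positive_control(expected, ["KIF5B-RET", "EML4-ALK", "CD74-ROS1"]))
# (True, [])

# A run in which one expected fusion was missed, flagging a possibly compromised run
print(check_positive_control(expected, ["KIF5B-RET", "EML4-ALK"]))
# (False, ['CD74-ROS1'])
```

A run that returns `False` would prompt review of the whole batch before any negative patient results are trusted.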
As with all amplicon-based approaches, AMP is highly reliant on individual primer performance. When assessing multiple exons of multiple genes, it is inevitable that some primers will not perform as well as others. Therefore, it is critical for users to know where the assay underperforms due to primer inefficiency. Additionally, NGS-based assays require complex bioinformatic data analysis. If the algorithms employed are not thoughtfully designed, false-negative and false-positive results are likely. It is very important that all gene fusions called by analysis algorithms be manually inspected by the user.
With an ever-growing list of actionable gene fusions that should be assessed in clinical tumor specimens, use of multiplexed assays like AMP will continue to increase in clinical laboratories. Future applications of the technique will likely focus on combining fusion and mutation assessment within a single assay. Regardless of the molecular assay approach, users must always be aware of assay limitations and should always establish quality control metrics to guide data interpretation.
The authors have nothing to disclose.
This work was supported by the Molecular Pathology Shared Resource of the University of Colorado (National Cancer Institute Cancer Center Support Grant No. P30-CA046934) and by the Colorado Center for Personalized Medicine.
Name | Company | Catalog Number | Comments |
10 mM Tris HCl pH 8.0 | IDT | 11-05-01-13 | Used for TNA dilution |
1 M Tris pH 7.0 | Thermo Fisher | AM9850G | Used in library pooling |
25 mL Reagent Reservoir with divider | USA Scientific | 9173-2000 | For use with multi-channel pipetters and large reagent volumes |
96-well TemPlate Semi-Skirt 0.1 mL PCR plate-natural | USA Scientific | 1402-9700 | Plate used for thermocycler steps |
Agencourt AMPure XP Beads | Beckman Coulter | A63881 | Used in purification after several assay steps |
Agencourt Formapure Kit | Beckman Coulter | A33343 | Used in TNA extraction |
Archer FusionPlex Solid Tumor kit | ArcherDX | AB0005 | This kit contains most of the reagents necessary to perform library preparation for Illumina sequencing (kits for Ion Torrent sequencing are also available) |
Cold block, 96-well | Light Labs | A7079 | Used for keeping samples chilled at various steps |
Ethanol | Decon Labs | DSP-MD.43 | Used for bead washes |
Library Quantification for Illumina Internal Control Standard | Kapa Biosystems | KK4906 | Used for library quantitation |
Library Quantification Primers and ROX Low qPCR mix | Kapa Biosystems | KK4973 | Used for library quantitation |
Library Quantification Standards | Kapa Biosystems | KK4903 | Used for library quantitation |
Magnet Plate, 96-well (N38 grade) | Alpaqua | A32782 | Used in bead purification steps |
MBC Adapters Set B | ArcherDX | AK0016-48 | Adapters that contain sample-specific indexes to enable multiplex sequencing |
Micro Centrifuge | USA Scientific | 2641-0016 | Used for spinning down PCR tubes |
MicroAmp EnduraPlate Optical 96-well Plate | Thermo Fisher | 4483485 | Used for PreSeq QC and library quantitation |
Microamp Optical Film Compression Pad | Applied Biosystems | 4312639 | Used for library quantitation |
Mini Plate Spinner | Labnet | MPS-1000 | Used for collecting liquid at bottom of plate wells |
MiSeq Reagent Kit v3 (600 cycle) | Illumina | MS-102-3003 | Contains components necessary for a MiSeq sequencing run |
MiSeqDx System | Illumina | | NGS Sequencing Instrument |
Model 9700 Thermocycler | Applied Biosystems | | Used for several steps during assay |
Nuclease-free water | Ambion | 9938 | Used as general diluent |
Optical ABI 96-well PCR plate covers | Thermo Fisher | 4311971 | Used for PreSeq QC and library quantitation |
PCR Workstation Model 600 | Air Clean Systems | BZ10119636 | Wet-bench assay steps performed in this 'dead air box' |
Proteinase K | Qiagen | 1019499 | Used in TNA extraction |
QuantStudio 5 | Applied Biosystems | LSA28139 | qPCR instrument used for PreSeq and library quantitation |
Qubit RNA HS Assay Kit | Life Technologies | Q32855 | Used for determining RNA concentration in TNA samples |
RNase Away | Fisher | 12-402-178 | Used for general RNase decontamination of work areas |
Seraseq FFPE Tumor Fusion RNA Reference Material v2 | SeraCare | 0710-0129 | Used as the assay positive control |
Sodium Hydroxide | Fisher | BP359-212 | Used in clean-up steps and for sequencing setup |
SYBR Green Supermix | Bio-Rad | 172-5120 | Component of PreSeq QC Assay |
TempAssure PCR 8-tube Strips | USA Scientific | 1402-2700 | Used for reagent and sample mixing etc. |
Template RT PCR film | USA Scientific | 2921-7800 | Used for covering 96-well plates |
U-Bottom 96-well Microplate | LSP | MP8117-R | Used during bead purification |