Typically, cancer is thought to arise due to DNA mutations, dysregulated transcription and/or aberrant signalling. Recently, it has become clear that dysregulated mRNA processing, mRNA export and translation also contribute to malignancy. RNA processing events result in major modifications to the physical nature of mRNAs such as the addition of the methyl-7-guanosine cap, the removal of introns and the addition of polyA tails. mRNA processing is a critical determinant of the protein-coding capacity of mRNAs since these physical changes impact the efficiency with which a given transcript can be exported to the cytoplasm and translated into protein. While many of these mRNA metabolism steps were once considered constitutive housekeeping activities, they are now known to be highly regulated with combinatorial and multiplicative impacts, i.e. one event influences the capacity to undergo others. Furthermore, alternative splicing and/or cleavage and polyadenylation can produce transcripts with alternative messages and new functionalities. The coordinated processing of groups of functionally related RNAs can potently re-wire signalling pathways, modulate survival pathways and even re-structure the cell. As postulated by the RNA regulon model, combinatorial regulation of these groups is achieved by the presence of shared cis-acting elements (known as USER codes) which recruit machinery for processing, export or translation. In all, dysregulated RNA metabolism in cancer gives rise to an altered proteome that in turn elicits biological responses related to malignancy. Studies of these events in cancer have revealed new mechanisms underpinning malignancies and unearthed novel therapeutic opportunities. In summary, cancer cells coopt RNA processing, export and translation to support their oncogenic activity.
Introduction
Global analyses of cancer transcriptomes have yielded important insights into cancer pathogenesis. However, the transcriptome does not always predict the proteome [1]. This disconnect can arise because of the reprogramming of RNA processing, export and/or translation, which in turn enables the mRNA instructions sent from the DNA to be hijacked. Specifically, these events modify the physical nature of the transcripts and/or regulate the conversion of these messages into protein [2]. In cancer, mRNA metabolism can be reprogrammed to generate a proteome that drives malignancy even in the absence of altered mRNA levels or genetic mutation. In this review, the major mRNA processing steps, mRNA export and translation are described in relation to their ability to subvert the mRNA message and thereby alter the proteome. As predicted by the RNA regulon model, the mRNA processing, export and translation of groups of mRNAs, often encoding factors in the same pathways, can be regulated simultaneously to elicit powerful biological responses [3,4]. These RNAs are selected based on cis-acting elements referred to as USER (untranslated sequence for regulation) codes which recruit the necessary machinery. Due to space limitations, some important RNA modifications are not covered here but are reviewed elsewhere, e.g. [5].
Starting the message: capping and re-capping RNAs
Typically, the first step in mRNA processing is referred to as ‘capping’ and involves the addition of the methyl 7-guanosine (m7G) ‘cap’ onto the 5′ end of coding and some non-coding RNAs [6–8]. While this step is usually considered to occur co-transcriptionally, re-capping after decapping can also occur in the cytoplasm [9–11]. Most subsequent mRNA processing steps rely on the cap to engage cap-binding proteins which permit recruitment of relevant processing machinery [2]. Two well-established cap-binding factors are the eukaryotic translation initiation factor eIF4E and the cap-binding complex (CBC), which is composed of the cap-binding protein CBP20 (NCBP2) and its co-factor CBP80 (NCBP1) [2]. While it was once considered that CBC was restricted to the nucleus and eIF4E to the cytoplasm, it is now clear that these factors are found in both compartments where they act in a variety of cap-dependent RNA processing events [2]. In mammals, capping requires two enzymes: RNGTT (RNA guanylyltransferase and 5′ phosphatase) and RNMT (RNA guanine-7-methyltransferase) [8]. Indeed, these proteins are required for cell survival, highlighting the importance of this modification [8]. RNGTT removes the terminal phosphate at the 5′ end of the transcript and then adds a guanosine, forming a 5′–5′ pyrophosphate linkage. Then, RNMT methylates the cap guanylate at the 7-position; this activity is stimulated by its co-factor, RAM [8].
Intriguingly, capping of specific transcripts is modulated during development and differentiation, and is dysregulated in primary cancer specimens [7,12–14]. Interestingly, RNAs can undergo cycles of decapping and re-capping suggestive of cap homeostasis [10,11,15]. Newly developed methods revealed that capping levels for subsets of RNAs are much lower (∼30%) than the anticipated 100% [8,9,14]. The three best-defined factors known to regulate capping are Myc, RNMT and eIF4E, each of which employs a distinct stratagem (Figure 1). Myc recruits RNGTT and RNMT to transcriptional sites of specific transcripts and also increases levels of S-adenosylhomocysteine hydrolase to neutralize inhibition of RNMT by S-adenosylhomocysteine (SAH), the by-product of methylation reactions [7]. RNMT overexpression only elevates capping of selected RNAs; the basis of this specificity is not yet understood [7]. Regulation of capping plays physiological roles. During differentiation of embryonic stem cells, RNMT and RAM elevation is required for the expression of pluripotency-associated genes; then, ERK1/2 phosphorylates RAM to trigger its degradation, which in turn represses these genes, allowing differentiation [6]. eIF4E likely modulates capping by two mechanisms: (1) Increased nuclear export and translation efficiency of RNAs that encode the capping machinery, thereby elevating levels of the resultant proteins and (2) Physical interaction with the RNMT protein [14,16]. eIF4E does not increase the RNA levels of RNMT or RNGTT. Similar to RNMT and Myc regulation of capping, eIF4E modulates the capping of selected (not all) transcripts. Many eIF4E capping target transcripts encode cancer-related proteins such as Myc, MDM2, CTNNB1, CCND1 and, interestingly, RNMT [8,14]. A cap sensitivity element (CapSE) was identified in the 3′ UTR of model RNAs, and this USER code engaged RNMT, presumably to enhance eIF4E-dependent capping for some RNAs [14]. There are likely other such elements yet to be identified.
The first study to show capping was dysregulated in any malignancy was in primary high-eIF4E AML specimens. Here, increased capping of specific transcripts was observed as was elevation of RNMT and RNGTT levels [14]. Collectively, these observations position capping as an important control point modulating the ultimate protein-coding capacity of mRNAs as well as impacting the biochemical activity of some non-coding RNAs [7,12–14]. In this way, these factors control the capability of transcripts to undergo other RNA processing steps as well as resultant protein-coding capacity.
Revising the message: RNA splicing
After transcripts are capped, many mammalian coding and some non-coding RNAs undergo splicing. mRNA splicing involves the removal of introns and joining of flanking exons in pre-mRNAs [17,18]. Splicing provides an important capacity to diversify the proteome and ncRNA isoforms [17,19,20]. The splicing machine, known as the spliceosome, is composed of ∼150 proteins and five U-rich small nuclear RNAs (UsnRNAs) [17,18]. Approximately 80% of splicing occurs co-transcriptionally, and the remaining ∼20%, which is post-transcriptional, likely transpires in nuclear speckles [21,22]. Interestingly, for a given RNA, some introns can be spliced co-transcriptionally and others post-transcriptionally [21]. Approximately 95% of multi-exonic transcripts undergo alternative splicing (AS) [18,19,23]. AS events include the skipping of exons, inclusion of mutually exclusive exons, use of alternative splice sites and intron retention [17,24]. All of these can alter the embedded coding information, which can ultimately result in production of proteins with variant functions (Figure 1). Indeed, AS can generate proteins with opposing functions to their constitutive counterparts, alter localization of RNAs or protein products, and/or produce RNAs that are rapidly degraded, resulting in reduced protein levels. For instance, the long isoform of MCL1 is anti-apoptotic and the short form is pro-apoptotic [24]. The use of alternative 3′ splice sites switches VEGF isoforms from angiogenic to anti-angiogenic [24].
Splicing is regulated during differentiation and development. Notably, the proteomic outputs of dysregulated splicing contribute to the hallmarks of cancer [24]. Mutation or dysregulation of splicing factors contributes to hematologic malignancies and solid tumours [19,25–27]. For example, AML is amongst the cancers with the largest number of AS events [25,26,28]. The most frequently mutated splicing factors (SFs) in heme-malignancies are SF3B1, SRSF2, and U2AF1. These mutations are associated with progression from myelodysplastic syndromes to AML and are also prevalent in types of bladder cancer and glioma [19,25–28]. Interestingly, abnormal mRNA splicing is common in AML even in the absence of SF mutations, suggesting other mechanisms also drive alternative splicing [29]. Some splicing factors are elevated rather than mutated in a diverse set of solid tumours including breast, colon, and lung [19]. Importantly, modulation of the core splicing machinery often elicits splicing alterations to specific pre-mRNAs, i.e. it does not have global impacts [30]. Further, mutations within specific pre-mRNAs, e.g. at splice sites, also contribute to AS and disease [24,25,28]. Recently it was found that eIF4E can impact splicing, likely through its capacity to increase production of splice factors by elevating the nuclear export of the corresponding mRNAs and, in some cases, via physical interaction with pre-mRNAs and spliceosome components [31]. eIF4E-dependent splicing impacts ∼1000 events in model U2OS cells and ∼4000 in AML primary specimens [31]. Many of these transcripts encode factors involved in malignancy.
Splicing dysregulation is the subject of therapeutic development in several malignancies. In this regard, there has been intense focus on the inhibition of SF3B1 [26,32], a core component of the spliceosome. Importantly, SF3B1 inhibitors such as Spliceostatin A or Pladienolide B only alter ∼10% of splicing events [25,26,32]. These inhibitors often target tumour cells with SF3B1 mutations more so than other tumour types or normal cells, thereby providing a therapeutic window [25,33,34]. SF3B1 was targeted in early phase clinical trials in epithelial malignancies with E7107, which was generally well tolerated, but the trials were halted due to partial vision loss in two patients [35,36]. A next generation inhibitor (H3B-8800) is currently in trial for AML (ClinicalTrials.gov NCT02841540). A landmark use of splicing therapeutics is showcased in spinal muscular atrophy (SMA). Here, the therapeutic is an antisense oligonucleotide (ASO) that targets splicing of SMN2, thereby reconstituting functional SMN protein activity in SMA patients [24,37]. This allows patients to regain mobility, and the treatment received FDA approval. Investigations are underway into the utility of a similar strategy in cancer cells, i.e. redirecting the functionalities of proteins via antisense-directed alteration of their splicing [24].
Ending the message: cleavage and polyadenylation
The formation of polyadenylated tails on the 3′ end of mRNAs is another key step in RNA processing which can also impact the mRNA composition as well as its capacity to be exported from the nucleus and translated in the cytoplasm. This process, known as cleavage and polyadenylation (CPA), comprises two steps. First, the transcript undergoes cleavage downstream from a polyadenylation signal (PAS) within the 3′ UTR; this is followed by the addition of the polyA tail by polyA polymerases [38,39]. Cleavage is considered a nuclear process; by contrast, polyA tail lengths can be modulated in the nucleus or cytoplasm. While the cleavage event is generally considered to be co-transcriptional, there are ample examples of post-transcriptional cleavage, indicating that RNAs can be further modified after transcription. At least 70% of mammalian RNAs undergo alternative polyadenylation (APA) [40]. APA includes: (1) Altered PAS cleavage efficiency, (2) Cleavage at alternative PASs within the 3′ UTR, introns or exons and/or (3) Alterations in polyA tail length [38–40] (Figure 1). APA can affect RNA stability, export, localization, translation efficiency, and sometimes protein product functionality by altering the coding sequence via production of C-terminal truncations [38–42].
In cancer, APA generally results in mRNAs with shorter 3′UTRs, providing a mechanism to evade regulation by microRNAs or RNA binding proteins (RBPs) through the removal of key regulatory sequences [38–40,43]. These shorter UTRs are usually generated through selection of PASs close to the stop codon [43]. However, shorter 3′UTRs are not always oncogenic and thus effects are context specific. Moreover, global shifts in 3′UTR length across the entire transcriptome are not observed during oncogenic transformation; rather, the relative abundance of 3′UTR isoforms shifts for specific mRNAs [42,44]. In colorectal cancers, the progression from adenomas to carcinomas is characterized by elevated expression of the APA machinery [38,45]. For instance, elevation of the cleavage co-factor Fip1L drives increased PAS cleavage efficiency of RNAs acting in self-renewal, promoting this biological activity [38,46]. mTOR activation leads to shortening of 3′UTRs [42]. eIF4E overexpression alters production of CPA factors, physically interacts with some cleavage machinery [47] and impacts the APA of many transcripts, including in AML specimens (unpublished observations). These are just a few examples of the contribution of APA dysregulation to oncogenic capacity.
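The consequence of proximal PAS selection described above can be illustrated with a minimal Python sketch. It is a toy model, not a published method: the class `ThreePrimeUTR`, the function `retained_sites`, the site names and all coordinates are hypothetical, and cleavage is simplified to occur exactly at the chosen PAS position rather than a short distance downstream of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreePrimeUTR:
    """Toy 3'UTR; positions are nucleotides downstream of the stop codon."""
    pas_positions: tuple[int, ...]       # candidate polyadenylation signals
    regulatory_sites: dict[str, int]     # hypothetical site name -> position


def retained_sites(utr: ThreePrimeUTR, chosen_pas: int) -> list[str]:
    """Return the regulatory sites that remain after cleavage at chosen_pas.

    Simplifying assumption: the transcript ends at the PAS position itself,
    so only sites upstream of that position are kept.
    """
    if chosen_pas not in utr.pas_positions:
        raise ValueError("chosen_pas must be one of the UTR's PAS positions")
    return sorted(
        name for name, pos in utr.regulatory_sites.items() if pos < chosen_pas
    )


# Hypothetical transcript with a proximal (300) and a distal (1500) PAS.
utr = ThreePrimeUTR(
    pas_positions=(300, 1500),
    regulatory_sites={"miRNA_site_1": 450, "RBP_site_1": 900, "miRNA_site_2": 120},
)

proximal_isoform = retained_sites(utr, 300)   # short 3'UTR isoform
distal_isoform = retained_sites(utr, 1500)    # long 3'UTR isoform
print(proximal_isoform)
print(distal_isoform)
```

In this toy case the proximal isoform keeps only the site at position 120 while the distal isoform keeps all three, which is the sense in which a shorter 3′UTR can escape microRNA or RBP regulation.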
Passport control: retention or export?
The capacity of a transcript to be exported from the nucleus reflects the balance between retention and export. For instance, some mature mRNAs are retained in the nucleus for stockpiling so they are available for rapid export and translation once cellular conditions are suitable; or in some cases, retention is a means to auto-regulate factor levels e.g. the splicing factor SRSF1 [48]. Incompletely processed mRNAs can be retained as part of nuclear surveillance mechanisms to prevent aberrant protein production. Retention mechanisms are typically mRNA and cell context specific. For instance, some transcripts are retained at paraspeckles while partially spliced mRNAs can be held at nuclear speckles where they await post-transcriptional splicing. Retained mRNAs tend to be longer and can contain specific sequences linked to their retention [48].
Nuclear mRNA export controls the accessibility of mature transcripts to the translation machinery in the cytoplasm and thus contributes to increased protein levels without increasing the transcript levels or, a priori, translation efficiency [2,49–51]. RNAs are actively transported through the nuclear pore complex (NPC). mRNAs bind appropriate nuclear receptors and export factors which in turn associate with the nuclear basket of the NPC, transit through the central channel and exit via the cytoplasmic fibrils [49–51]. Bulk mRNA export employs the nuclear receptor NXF1 and its co-factor NXT1 (also known as TAP and p15, respectively). Typically, CBC-bound mRNAs bind the large export complex known as TREX and are then recruited by NXF1/NXT1 to the NPC for transit to the cytoplasm.
There are alternative routes through the NPC which can rely on mRNAs containing specific USER codes [3] that permit engagement of selected export factors to facilitate their transit [49–51]. In this way, factors can promote the export of functionally related groups of mRNAs based on shared USER codes (Figure 1). For instance, both CRM1/XPO1 and NXF1/NXT1 act as nuclear receptors to chaperone mRNAs through the NPC [49–51]. eIF4E promotes the export of groups of cancer-causing RNAs via CRM1/XPO1, whereby these mRNAs typically have an eIF4E sensitivity element (4ESE) in their 3′UTR [52–55]. The 4ESE element is bound by the leucine-rich pentatricopeptide repeat-containing protein LRPPRC, an assembly platform that binds eIF4E, CRM1 and 4ESE RNA [52,55]. CRM1 can also host CBC-bound mRNAs in some instances [49,50]. Export of selected mRNAs can be enhanced under certain biological contexts. For instance, eIF4E elevation leads to increased export of mRNAs involved in oncogenic processes such as those that can alter the cell surface architecture, supporting increased cell motility and invasion [56]. CRM1/XPO1 dysregulation is found in many cancers, which can in turn drive inappropriate RNA and protein export [57]. Repressing this export pathway is clinically useful. Indeed, the CRM1 inhibitor Selinexor is approved for the treatment of relapsed multiple myeloma and is in trial for other malignancies.
Decoding the message: translation
Once RNAs are released into the cytoplasm, many undergo translation while some transit to cytoplasmic bodies where they are sequestered for later translation or undergo degradation [58]. Clearly, the above processing steps will impact the composition of the protein generated as well as its translation efficiency. Increased translation efficiency is defined as more ribosomes loaded per transcript, allowing more protein to be produced per RNA (Figure 1). Here, the focus is on translation initiation where eIF4E, in a complex known as eIF4F, recruits many capped mRNAs to the ribosome [58]. The eIF4F complex includes eIF4E, the eIF4A helicase which unwinds mRNAs, and eIF4G1 which acts as an assembly platform. eIF4F scans the 5′UTR of the target mRNA until it reaches a start codon. Under certain conditions, alternative start sites can be utilized to generate proteins with altered N-termini or sequence relative to their cognate counterparts. eIF4E overexpression is sufficient to increase translation efficiency of specific mRNAs, typically those with long, structured 5′ UTRs [58,59] (Figure 1). Other mRNAs such as ACTIN and GAPDH are not affected by eIF4E overexpression [58,59]. Importantly, other translation initiation complexes recruit subsets of mRNAs to preferentially enhance their translation efficiency and thus their protein production [58]. For example, CBC in association with an eIF4G1 substitute, CTIF, plays roles in translation of selected mRNAs in response to stresses and during the pioneer round of translation [2,58,60–62]. eIF3d recruits selected capped mRNAs using a distinct translation initiation complex including DAP5/eIF4G2 (rather than eIF4G1), which impacts ∼20–30% of the translatome of breast cancer cells [63]. Other specialized translation initiation complexes employ PARN or LARP to engage select mRNAs [64,65]. In all, translation responds to various environmental cues to modulate the production of specific factors without altering the entire proteome.
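The definition of translation efficiency given above (ribosomes loaded per transcript) can be made concrete with a short Python sketch. This is an illustration under stated assumptions only: the function names, the two transcript labels and every number are hypothetical, and the ribosome signal stands in for whatever ribosome-association measurement is available (for example, polysome-associated or ribosome-footprint reads).

```python
def translation_efficiency(ribosome_signal: float, mrna_abundance: float) -> float:
    """Ribosome-associated signal per unit of transcript abundance."""
    if mrna_abundance <= 0:
        raise ValueError("mrna_abundance must be positive")
    return ribosome_signal / mrna_abundance


def te_fold_change(control: tuple[float, float], treated: tuple[float, float]) -> float:
    """Fold change in translation efficiency between two conditions.

    Each condition is a (ribosome_signal, mrna_abundance) pair.
    """
    return translation_efficiency(*treated) / translation_efficiency(*control)


# Hypothetical measurements: (ribosome_signal, mrna_abundance).
# Transcript levels are unchanged in both cases; only ribosome loading differs.
measurements = {
    "structured_5utr_mRNA": {"control": (100.0, 50.0), "high_eIF4E": (300.0, 50.0)},
    "housekeeping_mRNA": {"control": (200.0, 100.0), "high_eIF4E": (200.0, 100.0)},
}

for name, m in measurements.items():
    fold = te_fold_change(m["control"], m["high_eIF4E"])
    print(f"{name}: translation efficiency fold change = {fold:.1f}")
```

With these invented values the structured-5′UTR transcript shows a threefold gain in translation efficiency at constant mRNA level, whereas the housekeeping transcript is unchanged, mirroring the selectivity described in the text.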
Controlling the message: therapeutic targeting
eIF4E has been a focal point for development of cancer therapeutics targeting RNA metabolism. Given its multiple cap-dependent functions, its targeting should have potent impacts (Figure 1). Supporting this direction, eIF4E is highly elevated in primary cancer specimens at the protein level and its activities in capping, splicing, export and/or translation are dysregulated in these specimens, e.g. [14,31,66,67]. Strategies for its pharmacological inhibition can be divided into four categories: (1) m7G cap-competitors, (2) Antisense oligonucleotides (ASO) to reduce eIF4E levels, (3) MNK inhibitors to reduce eIF4E phosphorylation, and (4) Factors that interfere with the eIF4E–eIF4G translation initiation complex [68–72]. Cap-competitors or genetic reduction in eIF4E impact all its RNA processing activities, making these strategies likely very potent. eIF4E phosphorylation inhibits RNA export [73] and may have impacts on translation, although this is highly debated [74,75]. Phosphorylation may also modulate other eIF4E activities, although this is yet to be determined. Thus, it is not yet known if inhibition of its phosphorylation will impact multiple eIF4E activities. The last group, including 4EGI-1, may only impact eIF4E-dependent translation, which may limit its clinical utility; however, its ability to inhibit other eIF4E activities has not yet been investigated and, given the proximity of 4EGI-1 to the cap-binding site [68], it may well inhibit other functions.
The first-in-class inhibitor administered to patients to target eIF4E was the cap-competitor ribavirin [76]. Ribavirin directly binds eIF4E and competes for cap-binding as seen by NMR, biophysical, biochemical and mass spectrometry methods [71,72,77,78]. RNAi-mediated reduction in eIF4E reduces ribavirin activity, supporting the view that ribavirin acts through eIF4E [79]. Ribavirin impairs eIF4E-dependent splicing, RNA export and translation, and in animal models targets eIF4E activity, reduces tumours and metastases, and increases survival [31,56,66,79–84]. Its impacts on capping and APA are not yet known. In three clinical trials in poor prognosis AML patients, ribavirin targeted eIF4E, which correlated with objective clinical responses including remissions ([76,85] and ClinicalTrials.gov NCT02073838). These first studies serve as a proof-of-principle that eIF4E can be targeted in humans and that this targeting can lead to clinical benefit. Unfortunately, clinical trials employing ASOs were not successful as eIF4E levels were not decreased in patients [86]. Clinical studies using MNK inhibitors are ongoing in prostate cancer in combination with several other agents (ClinicalTrials.gov NCT04261218). Whether the effects of MNK inhibitors are due to inhibition of eIF4E and/or reduction in phosphorylation of other MNK targets will be difficult to dissect clinically. To date, the last class of eIF4E inhibitor, 4EGI-1, has not been tested in patients; however, its use impaired proliferation of breast cancer stem cells [87], supporting this direction.
Selecting the message: RNA regulons and USER codes
A common theme emerging from the above discussion is that factors intricately involved in processing events can impact selected mRNA targets. Understanding the principles of selectivity in recruiting mRNAs is critical. How do certain factors select one set of mRNAs? How does one factor influence the same mRNA at one processing level but not at another? The specificity is best viewed through the lens of the mRNA regulon model [3,4]. In this model, mRNAs with specific USER codes recruit factors that act in a selected process. Thus, the correct combination of USER codes is required for combinatorial regulation of mRNAs at multiple levels (Figure 1). Further, multiple USER codes can be needed. For example, eIF4E and CBC bind the m7G cap, a feature common to most mature mRNAs. Here, additional USER codes recruit relevant factors to customize processing, export and translation of the transcript. Indeed, mRNAs may switch between CBC and eIF4E cap-chaperones to modulate their processing and thus their ultimate product [2]. In this way, USER codes confer selectivity on an mRNA for a given processing step under specific conditions. mRNAs with the same USER codes can be co-regulated, enabling them to be similarly processed, exported and translated. Furthermore, co-regulation of the RNA binding proteins (RBPs) that bind USER codes is another potent way to co-regulate mRNAs in these groups [88]. These sorts of mechanisms also play important roles in genetic compensation observed in knockdown or knockout experiments [88]. In all, RNA regulons provide a powerful means by which to coordinate post-transcriptional processing in combinatorial and multiplicative ways, which in turn potently drive important cellular processes in development, differentiation and malignancy [3,4].
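The grouping logic of the regulon model can be sketched in a few lines of Python. This is only an illustrative data structure, not an analysis tool: the functions `build_regulons` and `co_regulated` and the transcript names `mRNA_A` to `mRNA_D` are hypothetical, and the assignment of the 4ESE and CapSE codes to those transcripts is invented purely to show how shared codes define co-regulated groups.

```python
from collections import defaultdict


def build_regulons(user_codes_by_mrna: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert an mRNA -> USER codes map into a USER code -> mRNAs map."""
    regulons: dict[str, set[str]] = defaultdict(set)
    for mrna, codes in user_codes_by_mrna.items():
        for code in codes:
            regulons[code].add(mrna)
    return dict(regulons)


def co_regulated(regulons: dict[str, set[str]], required_codes: list[str]) -> set[str]:
    """Return the mRNAs carrying every USER code in required_codes."""
    if not required_codes:
        return set()
    groups = [regulons.get(code, set()) for code in required_codes]
    return set.intersection(*groups)


# Hypothetical transcripts and their USER codes (4ESE: export, CapSE: capping).
user_codes_by_mrna = {
    "mRNA_A": {"4ESE", "CapSE"},
    "mRNA_B": {"4ESE"},
    "mRNA_C": {"CapSE"},
    "mRNA_D": set(),
}

regulons = build_regulons(user_codes_by_mrna)
print(sorted(regulons["4ESE"]))                            # export regulon
print(sorted(co_regulated(regulons, ["4ESE", "CapSE"])))   # regulated at both levels
```

Here `mRNA_A` is the only transcript eligible for regulation at both the capping and export levels, while `mRNA_B` and `mRNA_C` are each regulated at one level only, which is one way to picture how a single factor can influence the same mRNA at one processing step but not at another.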
One example of the biological impact of coordinated processing involves the capacity of eIF4E to alter the cell surface architecture which in turn is required for eIF4E-dependent cell motility. For example, the glycosaminoglycan hyaluronan (HA) and its receptor CD44 underpin tumour-stroma interactions and are important for association with the microenvironment and cell motility [89]. Interestingly, eIF4E elevates the RNA export of all the enzymes involved in the biosynthesis of HA as well as CD44 [56]. This enhanced RNA processing driven by eIF4E increased levels of both the HA sugar and the CD44 protein [56]. Notably, eIF4E does not alter the transcript levels of any of these factors. HA is usually extruded into the extracellular space where it is a major component of the microenvironment [89]. However in the high-eIF4E context, cells become encapsulated in coats of HA and CD44 on their cell surface [56]. Genetic or pharmacological targeting of HA or CD44 production in these high-eIF4E cells impairs eIF4E-mediated cell motility [56]. In all this provides an example of an eIF4E driven RNA regulon that underpins motility, an important feature of cancer cells.
Perspectives
Cancer cells can coopt RNA processing to reprogramme the proteome.
These changes can reprogramme entire biochemical networks, re-write protein functionalities, and can imbue cells with increased oncogenic capacity.
Diverted RNA processing can lead to wide-scale changes to the proteome. This information is not captured by traditional RNA-Seq methods which employ transcript levels as surrogates of protein levels.
Some factors such as eIF4E and CBC are positioned to combinatorially re-write and amplify messages by influencing multiple processing events for groups of RNAs acting in the same biochemical pathways and thereby elicit potent biological outcomes.
Efforts are needed to better understand the flow of genetic information through the RNA processing steps, with a focus on identifying the USER codes, in order to better predict functional outcomes from transcriptome analysis.
Targeting RNA processing can lead to clinical benefit and needs to be incorporated more broadly into therapeutic strategies.
Competing Interests
The author declares that there are no competing interests associated with this manuscript.
Funding
KLBB is grateful for funding from the National Institutes of Health, Canadian Institutes of Health Research, Leukemia and Lymphoma Society and holds a Canada Research Chair.
Author Contributions
Conception, writing and figure preparation were carried out by KLBB.
Acknowledgements
Many thanks for critical reading of the manuscript by Drs Mehdi Ghram and Jean-Clement Mars.