Abstract
Background: Osteosarcoma is the most common primary bone malignancy in children and adolescents. In order to find factors related to its recurrence, and thus improve recovery prospects, a powerful clinical signature is needed. Long noncoding RNAs (lncRNAs) are essential in osteosarcoma processes and development, and here we report significant lncRNAs to aid in earlier diagnosis of osteosarcoma.
Methods: A univariate Cox proportional hazards regression analysis and a multivariate Cox regression analysis were used to analyze osteosarcoma patients’ lncRNA expression data from the Therapeutically Applicable Research To Generate Effective Treatments (TARGET), a public database.
Results: A lncRNA signature consisting of three lncRNAs (RP1-261G23.7, RP11-69E11.4 and SATB2-AS1) was selected. The signature was used to sort patients into high-risk and low-risk groups with meaningful recurrence rates (median recurrence time 16.80 vs. >128.22 months, log-rank test, P<0.001) in the training group, and predictive ability was validated in a test dataset (median 16.32 vs. >143.80 months, log-rank test, P=0.006). A multivariate Cox regression analysis showed that the significant lncRNA was an independent prognostic factor for osteosarcoma patients. Functional analysis suggests that these lncRNAs were related to the PI3K-Akt signaling pathway, the Wnt signaling pathway, and the G-protein coupled receptor signaling pathway, all of which have various, important roles in osteosarcoma development. The significant 3-lncRNA set could be a novel prediction biomarker that could aid in treatment and also predict the likelihood of recurrence of osteosarcoma in patients.
Introduction
Osteosarcoma is a common malignant bone tumor, which is predominant in childhood and adolescence [1]. Multidrug chemotherapy and surgical removal of primary tumors have shown ameliorating effects [2]. Although improvements in therapeutic strategies have been achieved, the prognosis remains poor for most patients with recurrent osteosarcoma. It is therefore imperative to identify effective and novel recurrent biomarkers and therapeutic targets for osteosarcoma.
Long noncoding RNAs (lncRNAs) are a novel class of RNA molecules, defined as transcripts >200 nucleotides that lack protein-coding potential. There have been many studies on the mechanistic role of lncRNA in osteosarcoma. Dysregulation of lncRNAs is implicated in different physiological processes in osteosarcoma, such as cell growth and metastasis [3–5]. For example, nine lncRNAs (91H, FGFR3-AS1, BCAR4, TUG1, UCA1, HIF2PUT, HOTTIP, HULC, and MALAT-1) are up-regulated in osteosarcoma, which is considered oncogenic for osteosarcoma [6]. The lncRNAs MALAT1 and HNF1A-AS1 promote cell proliferation and metastasis in osteosarcoma by activating the PI3K/Akt [7] and Wntβ-catenin [8] signaling pathways, respectively. Moreover, lncRNA TUG1 down-regulation inhibits proliferation of osteosarcoma cells and promotes apoptosis [9].
There have also been many lncRNA studies focusing on clinical prognosis in osteosarcoma. Dysregulation of lncRNA MEG3 could act as a potential predictor of the progression and poor prognosis of osteosarcoma [10], and overexpression of lncRNAs UCA1 and TUG1 are essential in the poor prognosis of osteosarcoma [11,12]. In addition, SNHG4 promotes tumor growth and is a predictor of poor survival and recurrence in human osteosarcoma [13]. HNF1A-AS1 is a negative prognostic biomarker that promotes tumorigenesis and progression in osteosarcoma [14]. However, prognostic signatures of recurrence are still rare, meaning that the identification of such signatures is necessary to improve clinical treatment.
In the present study, we identified an lncRNA signature in osteosarcoma from the Therapeutically Applicable Research To Generate Effective Treatments (TARGET) database, and the signature was an independent prognostic factor for osteosarcoma. The presence of the signature allowed patients to be separated into high-risk and low-risk groups, and this signature may enable the development of novel recurrence biomarkers and therapeutic targets for osteosarcoma.
Materials and methods
Clinical cohorts and RNA-Seq data
Clinical information (age, gender) of 92 patients with osteosarcoma was obtained from the TARGET data portal (https://ocg.cancer.gov/), including 47 recurrent samples, 32 non-recurrent samples, and 13 censored samples. The RNA expression data (level 3) of 92 osteosarcoma samples were downloaded from the freely available TARGET data portal. The RNA sequence data from the 92 samples was derived from the IlluminaHiSeq_RNASeq sequencing platforms, and the differentially expressed mRNAs and lncRNAs were obtained by using the edgeR software [15] to further analyze the recurrence vs. non-recurrence data. Fold changes (log2 absolute) ≥2 and P<0.05 indicated a statistically significant difference.
Construction of co-expression network
We examined the correlation between the expression levels of the 42 differentially expressed lncRNAs and each differentially expressed mRNA using two-sided Pearson correlation coefficients and the z-test. The mRNAs positively or negatively correlated with the 42 lncRNAs were considered to be lncRNA-related mRNAs (|Pearson correlation coefficient| > 0.4 [16–20] and P-value <0.05). The lncRNA-mRNA co-expression network was then constructed using Cytoscape V.3.8.5 (https://cytoscape.org/).
Construction of an lncRNA signature in the training dataset
Statistical analysis
We adopted the median risk score as a cut-off value in the training dataset [22], using it to divide the osteosarcoma patients into low- and high-risk groups. We used Kaplan–Meier survival analysis to estimate recurrence time and compare recurrence curves between the two groups, and a two-sided log-rank test to assess the statistical significance. We then validated the prognostic ability of the lncRNA signature in the test dataset using the receiver operating characteristic (ROC) values and Kaplan–Meier survival analysis. We carried out data stratification analysis and multivariate Cox regression analysis in order to identify whether the lncRNA were an independent factor, and to assess the signature within the available data. We considered a value of P<0.05 as significant. All analyses were performed with the R statistical program (version 3.5.1), using the survival and pROC packages downloaded from Bioconductor (http://www.bioconductor.org/).
Construction of the lncRNA-function network
The Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) and gene ontology functional enrichment analyses (GO) were carried out to better illustrate the potential functional mechanism of recurrence-related lncRNA modules of osteosarcoma. The DAVID Bioinformatics Tool (https://david.ncifcrf.gov/), which consists of an integrated biological information base and analytical tools aimed at systematically extracting biological meaning from gene/protein lists was used. We first screened the three lncRNAs correlated with differentially expressed mRNAs (|Pearson correlation coefficient| > 0.4 and P-value < 0.05), and then screened the mRNAs involved in the functional analysis. These database inputs and screening parameters allowed for construction of the lncRNA-function network using Cytoscape, version.3.8.5 (https://cytoscape.org/).
Results
Patients’ characteristics
All expression data used in the present study were from patients clinically and pathologically diagnosed with osteosarcoma. The median age was 15 years (4–40 years) for the enrolled patients. Other clinical information about the patients is in Table 1. Subsequently, we randomly divided the 92 patients into two groups (training group, n=61; test group, n=31) to seek and validate the prognostic lncRNA found in the osteosarcoma patients. The selection scheme of the recurrent lncRNA signature can be found in Figure 1.
Flowchart of the study
Characteristic . | Training set . | Testing set . | Total . |
---|---|---|---|
Gender | |||
Male | 37 | 15 | 160 |
Female | 24 | 16 | 27 |
Age | |||
Median | 15 | 15 | 15 |
Range | 4–40 | 6–21 | 4–40 |
Vital status | |||
Non-recurrence | 28 | 16 | 44 |
Recurrence | 33 | 15 | 48 |
Characteristic . | Training set . | Testing set . | Total . |
---|---|---|---|
Gender | |||
Male | 37 | 15 | 160 |
Female | 24 | 16 | 27 |
Age | |||
Median | 15 | 15 | 15 |
Range | 4–40 | 6–21 | 4–40 |
Vital status | |||
Non-recurrence | 28 | 16 | 44 |
Recurrence | 33 | 15 | 48 |
Identification of significantly differentially expressed mRNAs and lncRNAs
A total of 19495 mRNAs and 14589 lncRNAs were identified from the TARGET database. Using |Fold changes| ≥2 and P<0.05 as cutoffs, we identified 184 differentially expressed mRNAs (71 down-regulated and 113 up-regulated) and 42 differentially expressed lncRNAs (22 down-regulated and 20 up-regulated) (Supplementary Table S1).
Construction of the lncRNA-mRNA co-expression network in osteosarcoma
In order to better study the role of differentially expressed lncRNAs in osteosarcoma and to further elucidate the interaction between these differentially expressed mRNAs, we constructed an osteosarcoma lncRNA-mRNA related regulatory network. We used the 36 differentially expressed lncRNAs and 101 differentially expressed mRNAs to construct this network (Figure 2 and Supplementary Table S2), and screening criteria were the |Pearson correlation coefficient| > 0.4 and P-value <0.05. Using these criteria, we found that six up-regulated lncRNAs (RP1-261G23.7, RP11-81A22.5, RP11-69E11.4, RP11-817J15.3, SATB2-AS1 and CTB-4E7.1) functioned in regulating many mRNAs, suggesting an important regulatory role for these lncRNAs in osteosarcoma.
The lncRNA-mRNA co-expression network
Construction of the recurrent lncRNA signature with the training dataset
Determining the survival power of the lncRNA signature in the training and test datasets
The selected lncRNA signature returned a risk score for each patient. We used the median risk score to divide the training group patients into either a low-risk group (n=31) or a high-risk group (n=30). Using the Kaplan–Meier survival analysis, we found that the low-risk group patients had a significantly lower recurrence rate than those in the high-risk group (median recurrence time: >128.22 vs. 16.80 months, log-rank test P<0.001; Figure 3A). The 5-year recurrence rate of the patients in the high-risk group was less than 20%, while that of the patients in the low-risk group was greater than 60%.
lncRNA signature predicts recurrence of patients with osteosarcoma
The recurrent risk score model used to calculate the lncRNA signature-based risk scores of the 31 patients in the test dataset was also used to validate the prediction power of the lncRNA signature. Patients were divided into low-risk and high-risk groups using the median risk score as a demarcation point, in the same fashion as the training dataset. Kaplan–Meier curves were used to show the low-risk and high-risk groups in the test dataset (Figure 3B), and the results were similar to those from the training dataset. In the test dataset, the low-risk group had a significantly higher recurrence rate than the high-risk group (median recurrence time: >143.80 vs. 16.32, log-rank test P=0.006). The recurrence rate of patients in the high-risk group was less than 20% at 5 years, in contrast with more than 60% in the low-risk group.
The survival prediction power of the lncRNA signature in the training and test groups
We performed a separate ROC analysis to test the prediction power of the lncRNA signature, which considered the larger area under the ROC curve as a better model for predicting recurrence of the osteosarcoma. In the training dataset, the predictive ability of the 3-lncRNA signature was high (AUCSignature = 0.68, Figure 3C), which further demonstrated that this signature was a novel and highly accurate biomarker of recurrence. A similar result was seen in the test group (AUCSignature = 0.75, Figure 3D).
The selected 3-lncRNA signature is an independent prognostic factor
A multivariate Cox regression analysis using the signature-based risk score and other clinical characteristics (e.g. age and gender) demonstrated that the prognostic power of the lncRNA signature risk score for the prediction of recurrence was independent of clinical characteristics for the entire dataset (High-risk group vs. Low-risk group, HR = 2.06, 95% CI: 1.53–2.77, P<0.001, n=92, Table 2).
Variables . | . | HR . | 95% CI of HR . | P . | |
---|---|---|---|---|---|
. | . | . | Lower . | Upper . | . |
Univariable analysis | |||||
Gender | Male vs. female | 1.24 | 0.69 | 2.21 | 0.47 |
Age | >15 vs. ≤15 | 0.96 | 0.90 | 1.02 | 0.17 |
LncRNA signature | High risk vs. low risk | 2.10 | 1.57 | 2.81 | 0.00 |
Multivariable analysis | |||||
Gender | Male vs. female | 1.17 | 0.63 | 2.16 | 0.62 |
Age | >15 vs. ≤15 | 0.96 | 0.89 | 1.02 | 0.18 |
LncRNA signature | High risk vs. low risk | 2.06 | 1.53 | 2.77 | 0.00 |
Variables . | . | HR . | 95% CI of HR . | P . | |
---|---|---|---|---|---|
. | . | . | Lower . | Upper . | . |
Univariable analysis | |||||
Gender | Male vs. female | 1.24 | 0.69 | 2.21 | 0.47 |
Age | >15 vs. ≤15 | 0.96 | 0.90 | 1.02 | 0.17 |
LncRNA signature | High risk vs. low risk | 2.10 | 1.57 | 2.81 | 0.00 |
Multivariable analysis | |||||
Gender | Male vs. female | 1.17 | 0.63 | 2.16 | 0.62 |
Age | >15 vs. ≤15 | 0.96 | 0.89 | 1.02 | 0.18 |
LncRNA signature | High risk vs. low risk | 2.06 | 1.53 | 2.77 | 0.00 |
Functional prediction of differentially expressed mRNAs related to 3 lncRNAs
KEGG and GO analyses were used to investigate the potential involvement of the 3 lncRNAs in biological processes associated with osteosarcoma development. We performed functional analysis with co-expressed mRNAs of 3 lncRNAs in each lncRNA module (|Pearson correlation coefficient| > 0.4 and P-value <0.05, Supplementary Table S6). The 3 lncRNAs-functional network (Figure 4 and Supplementary Table S6) was created using Cytoscape. The results indicated that the 3 lncRNA signature was related to a collagen trimer, endoplasmic reticulum lumen, and protein digestion and absorption. We also found that only RP1-261G23.7 was involved in the Wnt signal pathway, which is in a variety of biological processes, including cell cycle regulation and cancer. RP11-69E11.4 was involved in the G-protein coupled receptor signaling pathway, which is linked to the activation of adenylyl cyclase activity and subsequent cAMP/PKA/CREB signaling [23] as potential drivers of tumorigenesis. Finally, SATB2-AS1 is involved in the PI3K-Akt signaling pathway, which is essential in cell proliferation and the biological characteristics of malignant cells [24].
The lncRNA-function network
Discussion
Individuals with recurrent osteosarcoma have less than a 20% chance of long-term survival, despite aggressive therapies [25]. Identification of a clinically relevant signature for prognosis is of critical need for patients. Recent studies suggested that lncRNAs could be good candidates for tumor signatures as they possess high sensitivity and high specificity [26–30]. Increasing data have demonstrated that lncRNAs are associated with osteosarcoma prognosis. For example, up-regulated FOXC2-AS1 was correlated with longer survival time [31], while HULC was associated with distant metastasis and clinical stage, and with shorter overall survival time [32,33]. Moreover, BANCR was associated with large tumor size, distant metastasis, and advanced clinical stage, and was an independent predictor of poor survival rate [34–36]. However, lncRNA signatures and their relation to recurrence of osteosarcoma has rarely been investigated.
In our study, different statistical methods were used to identify a 3-lncRNA signature associated with the recurrence of osteosarcoma in patients. We found that the lncRNA signature was an independent predictor in patient recurrence. A multivariate Cox regression analysis was used to assess the independence of the selected signature in predicting recurrence of osteosarcoma. With age and gender as covariables, the risk scores of patients based on the signature maintained an independent association with recurrence. Taken together, these results suggested that the prognostic power of the lncRNA signature for predicting recurrence of osteosarcoma in patients was independent of other clinical characteristics.
Moreover, the expression levels of RP1-261G23.7, RP11-69E11.4, and SATB2-AS1 were increased in recurrent osteosarcoma samples vs. non-recurrent samples. These lncRNAs also had important regulatory roles in osteosarcoma based on the lncRNA-mRNA co-expression network findings. Finally, we analyzed the functions of these lncRNAs, and the results showed that RP1-261G23.7 was involved in growth factor activity, single organismal cell–cell adhesion, the G-protein coupled receptor signaling pathway, and cell adhesion.
Vascular endothelial growth factor A (VEGF-A) is known to function directly in blood vessel formation and maintenance, and is a promising therapeutic target for activation in cancer. RP1-261G23.7 is antisense VEGF-A lncRNA, and a recent report showed that RP1-261G23.7 functions as a VEGF-A promoter enhancer-like element, possibly by acting as a local scaffold for proteins [37]. RP11-69E11.4 was involved in the Wnt signaling pathway, which is sensitive to alterations at various nodes of the pathway, and these alterations were identified in multiple tumor types after hyperactivation. These alterations converge into increased tumorigenicity, sustained proliferation and enhanced metastatic potential [24]. In addition, SATB2-AS1 is specifically involved in the PI3K-Akt signaling pathway, which can affect the epithelial–mesenchymal transition in numerous ways to influence tumor aggressiveness [38]. SATB2-AS1 was reported as overexpressed in osteosarcoma, and was associated with increased cell proliferation and growth [39]. These results indicated that recurrent lncRNA signatures executed a variety of biological functions in osteosarcoma tumorigenesis and development. There has been relatively little research on the three signature lncRNAs, and our results provide a foundation for further investigation into the function of the three lncRNAs.
There are a few limitations of this research regarding osteosarcoma. Most importantly, the specific prediction mechanism of the 3 lncRNAs in osteosarcoma requires further study. Moreover, the lncRNA signature has not yet been tested prospectively in multiple clinical trials. Despite these drawbacks, the significant and consistent correlation of our lncRNA signature with recurrence in two independent datasets indicated that it was a potentially powerful prognostic signature for osteosarcoma.
In conclusion, our data clearly indicate that the lncRNA signature can predict the recurrence of osteosarcoma in patients. The signature provides accurate prediction, which is clinically very significant.
Author Contribution
Qingshan Guo contributed to the conception and design of the study. Tang Ying contributed to the collection of the clinical and TARGET data of the patients. Tang Ying, Jin-ling Dong, and Cen Yuan contributed to the interpretation of the data (statistical and computational analysis) and wrote the manuscript. Peng Li, Qingshan Guo, and Tang Ying contributed to the review and revision of the manuscript. All authors read and approved the final manuscript.
Funding
This work was supported by the National Natural Science Foundation of China [grant number 81472596].
Competing Interests
The authors declare that there are no competing interests associated with the manuscript.
Abbreviations
References
Author notes
The first author.