Integrative genomics analysis of eQTL and GWAS summary data identifies PPP1CB as a novel bone mineral density risk genes

The workflow of prioritizing the BMD-associated genes

Table 1

Significant pathways enriched by identified BMD-associated genes based on discovery dataset

Pathway ID	Pathway terms	Associated genes proportion	P value	Corrected P value	Associated genes
GO:0002829	Negative regulation of type 2 immune response	0.200	0.00018	0.0018	ARG2, SOCS5, STAT6
GO:1904294	Positive regulation of ERAD pathway	0.200	0.00018	0.0018	ATXN3, EDEM2, XBP1
GO:0071498	Cellular response to fluid shear stress	0.143	0.00050	0.0045	KLF2, SOCS5, XBP1
GO:0000188	Inactivation of MAPK activity	0.111	0.00106	0.0085	DUSP1, DUSP12, DUSP3
GO:1901800	Positive regulation of proteasomal protein catabolic process	0.047	0.00122	0.0085	ATXN3, DAB2, EDEM2, SOCS5, XBP1
KEGG:03320	PPAR signaling pathway	0.056	0.00210	0.0126	CPT1A, FABP7, FADS2, NR1H3
KEGG:04142	Lysosome	0.041	0.00235	0.0118	ACP2, AP3D1, CTSB, GUSB, IDUA
GO:0048246	Macrophage chemotaxis	0.079	0.00287	0.0115	LGALS3, MMP28, S100A8
GO:1904375	Regulation of protein localization to cell periphery	0.043	0.00529	0.0159	CIB1, DAB2, EPB41L2, SPTBN1
GO:0030170	Pyridoxal phosphate binding	0.050	0.01034	0.0207	GCAT, KYAT3, PYGM
GO:1901880	Negative regulation of protein depolymerization	0.048	0.01131	0.0113	CIB1, SPTBN1, TRIOBP

Pathway ID	Pathway terms	Associated genes proportion	P value	Corrected P value	Associated genes
GO:0002829	Negative regulation of type 2 immune response	0.200	0.00018	0.0018	ARG2, SOCS5, STAT6
GO:1904294	Positive regulation of ERAD pathway	0.200	0.00018	0.0018	ATXN3, EDEM2, XBP1
GO:0071498	Cellular response to fluid shear stress	0.143	0.00050	0.0045	KLF2, SOCS5, XBP1
GO:0000188	Inactivation of MAPK activity	0.111	0.00106	0.0085	DUSP1, DUSP12, DUSP3
GO:1901800	Positive regulation of proteasomal protein catabolic process	0.047	0.00122	0.0085	ATXN3, DAB2, EDEM2, SOCS5, XBP1
KEGG:03320	PPAR signaling pathway	0.056	0.00210	0.0126	CPT1A, FABP7, FADS2, NR1H3
KEGG:04142	Lysosome	0.041	0.00235	0.0118	ACP2, AP3D1, CTSB, GUSB, IDUA
GO:0048246	Macrophage chemotaxis	0.079	0.00287	0.0115	LGALS3, MMP28, S100A8
GO:1904375	Regulation of protein localization to cell periphery	0.043	0.00529	0.0159	CIB1, DAB2, EPB41L2, SPTBN1
GO:0030170	Pyridoxal phosphate binding	0.050	0.01034	0.0207	GCAT, KYAT3, PYGM
GO:1901880	Negative regulation of protein depolymerization	0.048	0.01131	0.0113	CIB1, SPTBN1, TRIOBP

Validation of BMD-associated risk genes

To further validate our results in the discovery stage, we replicated the findings of Sherlock integrative analysis with the use of an independent of Dixon et al. eQTL dataset (Replication dataset: N = 400). By using the same parameter settings, Sherlock integrative analysis based on independent eQTL data identified 98 significant genes (Supplementary Table S3; Bonferroni corrected P < 0.05) and 2031 suggestive genes (P < 0.05). Among these 147 significant genes identified from the discovery stage, there were 48 genes validated in the replication stage (P < 0.05). In both the discovery and validation stage, there were 11 BMD-associated genes reached Bonferroni correction significance, namely CPT1A, CKB, MRPL21, ZFP57, ACP2, PPP1CB, ST7L, EPB41L2, PBX1, AMT, and VNN3 (corrected P values < 0.05; Supplementary Figure S1). Multiple cis-SNPs in these identified genes convey risk to BMD (Supplementary Tables S2 and S4).

MAGMA-based gene enrichment analysis of BMD

To ensure the reliability of these identified BMD-risk genes, we carried out an independent technical approach of MAGMA gene-based analysis. We found a number of 1106 genes were significantly associated with the phenotype of BMD based on Bonferroni correction for multiple testing (corrected P < 0.05). The top-ranked association signals were of CPED1 (corrected P = 7.98 × 10⁻²⁰²), CCDC170 (corrected P = 1.78 × 10⁻¹⁸²), and WNT16 (corrected P = 3.96 × 10⁻¹³⁷). There were 36 significant genes from MAGMA analysis overlapped with genes identified from Sherlock analysis (correlated P < 0.05 of genes from the discovery stage; Figure 2A and Table 2). We observed that none of these 36 identified genes show significant associations in the MAGMA analysis of Null GWAS (Table 2). In addition, a number of 86 MAGMA-identified significant genes were overlapped with Sherlock-identified genes (P < 0.05 of genes from the discovery stage; Supplementary Figure S2).

Convergent evidence of BMD-associated genes from independent datasets

Figure 2

(A) Venn plot of BMD-relevant genes based on three datasets: Sherlock-identified genes from the discovery dataset (corrected P<0.05); Sherlock-identified genes from the validation dataset (P<0.05); MAGMA-identified genes from the GWAS dataset (corrected P<0.05). (B) Computer-based permutation analysis (100,000 times) of the counts of risk genes from Zeller eQTL dataset (corrected P<0.05; discovery stage) overlapped with that from Dixon eQTL dataset (P<0.05; Replication stage). (C) Computer-based permutation analysis (100,000 times) of the counts of risk genes from Zeller eQTL dataset (corrected P<0.05; discovery stage) overlapped with that from MAGMA-based dataset (corrected P<0.05; technical validation stage).

Convergent evidence of BMD-associated genes from independent datasets

(A) Venn plot of BMD-relevant genes based on three datasets: Sherlock-identified genes from the discovery dataset (corrected P<0.05); Sherlock-identified genes from the validation dataset (P<0.05); MAGMA-identified genes from the GWAS dataset (corrected P<0.05). (B) Computer-based permutation analysis (100,000 times) of the counts of risk genes from Zeller eQTL dataset (corrected P<0.05; discovery stage) overlapped with that from Dixon eQTL dataset (P<0.05; Replication stage). (C) Computer-based permutation analysis (100,000 times) of the counts of risk genes from Zeller eQTL dataset (corrected P<0.05; discovery stage) overlapped with that from MAGMA-based dataset (corrected P<0.05; technical validation stage).

Table 2

Identification of BMD-associated risk genes from multiple omics datasets

Gene	LBF	Sherlock-identified P value (discovery dataset; corrected P<0.05)	Sherlock-identified P value (replication dataset; P<0.05)	MAGMA-identified P value (BMD GWAS; corrected P<0.05)	MAGMA-identified P value (Null GWAS; corrected P<0.05)
MSRA	10.59	7.87E-07	1.52E-06	5.29E-46	Non-significance
BLK	9.14	7.87E-07	9.26E-05	4.61E-35	Non-significance
ACP2	7.50	7.87E-07	7.59E-07	4.35E-14	Non-significance
EPB41L2	7.48	7.87E-07	7.59E-07	1.96E-18	Non-significance
ACSS2	6.84	7.87E-07	1.58E-03	6.81E-12	Non-significance
ZFP57	6.82	7.87E-07	7.59E-07	2.18E-06	Non-significance
MARK3	6.76	7.87E-07	1.82E-05	1.66E-37	Non-significance
FDFT1	6.66	7.87E-07	6.68E-05	5.00E-18	Non-significance
PPP1CB	6.62	7.87E-07	7.59E-07	4.01E-13	Non-significance
SPTBN1	6.61	7.87E-07	9.72E-05	9.60E-72	Non-significance
AMT	6.59	7.87E-07	7.59E-07	3.43E-11	Non-significance
MPHOSPH9	6.53	7.87E-07	8.81E-05	2.85E-09	Non-significance
CPT1A	6.49	7.87E-07	7.59E-07	1.92E-15	Non-significance
XBP1	6.43	7.87E-07	1.52E-06	1.49E-13	Non-significance
CLDN23	6.38	7.87E-07	1.67E-05	8.60E-23	Non-significance
H1F0	6.36	7.87E-07	4.28E-04	7.94E-08	Non-significance
SSH2	6.27	7.87E-07	9.11E-05	4.85E-16	Non-significance
LACTB2	6.27	7.87E-07	2.43E-05	2.19E-16	Non-significance
VNN3	6.25	7.87E-07	7.59E-07	8.52E-07	Non-significance
DGKQ	6.12	7.87E-07	6.83E-05	3.96E-16	Non-significance
LAMB2	6.08	7.87E-07	2.83E-02	6.99E-12	Non-significance
TRIOBP	5.96	7.87E-07	1.03E-04	2.42E-06	Non-significance
TPCN2	5.96	7.87E-07	1.52E-05	1.57E-21	Non-significance
BAG5	5.94	7.87E-07	1.52E-06	1.95E-25	Non-significance
MRPL21	5.84	1.57E-06	7.59E-07	1.67E-10	Non-significance
CTSB	5.79	1.57E-06	2.43E-05	8.22E-19	Non-significance
NME4	5.71	1.57E-06	2.58E-05	6.83E-13	Non-significance
TRIP11	5.70	1.57E-06	1.52E-06	1.11E-08	Non-significance
PIGN	5.55	1.57E-06	1.52E-06	2.99E-07	Non-significance
ATXN3	5.46	1.57E-06	4.56E-04	8.28E-09	Non-significance
ST7L	5.40	1.57E-06	7.59E-07	2.38E-17	Non-significance
WARS2	5.35	1.57E-06	1.52E-05	2.64E-11	Non-significance
CLMN	5.14	1.57E-06	2.38E-03	1.55E-08	Non-significance
CKB	5.14	1.57E-06	7.59E-07	9.61E-38	Non-significance
UBE2E3	5.06	1.57E-06	1.93E-04	2.85E-08	Non-significance
CIB1	5.06	1.57E-06	1.52E-06	5.56E-07	Non-significance

Gene	LBF	Sherlock-identified P value (discovery dataset; corrected P<0.05)	Sherlock-identified P value (replication dataset; P<0.05)	MAGMA-identified P value (BMD GWAS; corrected P<0.05)	MAGMA-identified P value (Null GWAS; corrected P<0.05)
MSRA	10.59	7.87E-07	1.52E-06	5.29E-46	Non-significance
BLK	9.14	7.87E-07	9.26E-05	4.61E-35	Non-significance
ACP2	7.50	7.87E-07	7.59E-07	4.35E-14	Non-significance
EPB41L2	7.48	7.87E-07	7.59E-07	1.96E-18	Non-significance
ACSS2	6.84	7.87E-07	1.58E-03	6.81E-12	Non-significance
ZFP57	6.82	7.87E-07	7.59E-07	2.18E-06	Non-significance
MARK3	6.76	7.87E-07	1.82E-05	1.66E-37	Non-significance
FDFT1	6.66	7.87E-07	6.68E-05	5.00E-18	Non-significance
PPP1CB	6.62	7.87E-07	7.59E-07	4.01E-13	Non-significance
SPTBN1	6.61	7.87E-07	9.72E-05	9.60E-72	Non-significance
AMT	6.59	7.87E-07	7.59E-07	3.43E-11	Non-significance
MPHOSPH9	6.53	7.87E-07	8.81E-05	2.85E-09	Non-significance
CPT1A	6.49	7.87E-07	7.59E-07	1.92E-15	Non-significance
XBP1	6.43	7.87E-07	1.52E-06	1.49E-13	Non-significance
CLDN23	6.38	7.87E-07	1.67E-05	8.60E-23	Non-significance
H1F0	6.36	7.87E-07	4.28E-04	7.94E-08	Non-significance
SSH2	6.27	7.87E-07	9.11E-05	4.85E-16	Non-significance
LACTB2	6.27	7.87E-07	2.43E-05	2.19E-16	Non-significance
VNN3	6.25	7.87E-07	7.59E-07	8.52E-07	Non-significance
DGKQ	6.12	7.87E-07	6.83E-05	3.96E-16	Non-significance
LAMB2	6.08	7.87E-07	2.83E-02	6.99E-12	Non-significance
TRIOBP	5.96	7.87E-07	1.03E-04	2.42E-06	Non-significance
TPCN2	5.96	7.87E-07	1.52E-05	1.57E-21	Non-significance
BAG5	5.94	7.87E-07	1.52E-06	1.95E-25	Non-significance
MRPL21	5.84	1.57E-06	7.59E-07	1.67E-10	Non-significance
CTSB	5.79	1.57E-06	2.43E-05	8.22E-19	Non-significance
NME4	5.71	1.57E-06	2.58E-05	6.83E-13	Non-significance
TRIP11	5.70	1.57E-06	1.52E-06	1.11E-08	Non-significance
PIGN	5.55	1.57E-06	1.52E-06	2.99E-07	Non-significance
ATXN3	5.46	1.57E-06	4.56E-04	8.28E-09	Non-significance
ST7L	5.40	1.57E-06	7.59E-07	2.38E-17	Non-significance
WARS2	5.35	1.57E-06	1.52E-05	2.64E-11	Non-significance
CLMN	5.14	1.57E-06	2.38E-03	1.55E-08	Non-significance
CKB	5.14	1.57E-06	7.59E-07	9.61E-38	Non-significance
UBE2E3	5.06	1.57E-06	1.93E-04	2.85E-08	Non-significance
CIB1	5.06	1.57E-06	1.52E-06	5.56E-07	Non-significance

Furthermore, we found that no matter significant or suggestive genes identified in the Sherlock discovery stage were significantly higher overlapped with genes identified in the Sherlock validation stage and MAGMA validation (permutation P < 1.0 × 10⁻⁵; see Figure 2B,C and Supplementary Figure S3A,B). More interestingly, comparing the MAGMA results of Null GWAS, we observed that Sherlock-identified genes from the discovery and validation stage have significantly high overlap rates with MAGMA-identified genes from BMD GWAS than that from Null GWAS (Figure 3A,B).

Genes identified from Sherlock Bayesian analysis validated by MAGMA-identified genes

Figure 3

(A) Sherlock-identified genes from Zeller eQTL dataset (discovery stage) were higher overlapped with MAGMA-identified genes from BMD GWAS (Dataset #1) than that from Null-based GAWS (Dataset #2) at five different P-value thresholds of 0.05, 0.01, 0.001, 0.0001, and 0.00001. (B) Sherlock-identified genes from Dixon eQTL dataset (validation stage) were higher overlapped with MAGMA-identified genes from BMD GWAS (Dataset #1) than that from Null-based GAWS (Dataset #2) at five different P-value thresholds of 0.05, 0.01, 0.001, 0.0001, and 0.00001.

Genes identified from Sherlock Bayesian analysis validated by MAGMA-identified genes

(A) Sherlock-identified genes from Zeller eQTL dataset (discovery stage) were higher overlapped with MAGMA-identified genes from BMD GWAS (Dataset #1) than that from Null-based GAWS (Dataset #2) at five different P-value thresholds of 0.05, 0.01, 0.001, 0.0001, and 0.00001. (B) Sherlock-identified genes from Dixon eQTL dataset (validation stage) were higher overlapped with MAGMA-identified genes from BMD GWAS (Dataset #1) than that from Null-based GAWS (Dataset #2) at five different P-value thresholds of 0.05, 0.01, 0.001, 0.0001, and 0.00001.

PPI analysis of 36 BMD-associated risk genes identified by Sherlock analysis

To determine whether these 36 BMD-associated risk genes identified by Sherlock and MAGMA analysis interact with each other, we conducted a PPI network-based analysis using interactions of genetic interaction, prediction links, co-expression, co-localization, and physical interactions based on the GeneMANIA database [35]. Figure 4 shows that these BMD-associated risk genes are constructed a biological network, indicating there exist highly biological connections between these identified genes. For example, the hub genes of PPP1CB, XBP1, FDFT1, and SPTBN1 have the most number of interactions with both predicted genes and BMD-associated genes (Figure 4). Additionally, the hub gene of XBP1 shows strong evidence of physical interactions with BMD-associated genes of H1F0 and predicted genes of ZNF440, SRSF1, and H1FX based on the interactions from BIOGRID and IREF databases [37,38]. The hub gene of PPP1CB has genetic interactions with BMD-associated genes of SPTBN1, TRIOP, and SSH2 (Figure 4) based on a genome-wide human genetic interaction map [39].

Protein–protein interactions between proteins encoded by 36 identified BMD-risk genes based on the application of the GeneMANIA database

Figure 4

The 36 BMD-risk genes were provided as query (red nodes) and a number of additional genes were predicted to be linked (green nodes). The predicted attributes included co-localization, genetic interactions, predicted links, co-expression, and physical interactions.

Protein–protein interactions between proteins encoded by 36 identified BMD-risk genes based on the application of the GeneMANIA database

The 36 BMD-risk genes were provided as query (red nodes) and a number of additional genes were predicted to be linked (green nodes). The predicted attributes included co-localization, genetic interactions, predicted links, co-expression, and physical interactions.

Differential expression analysis of identified genes in high BMD and low BMD subjects

Furthermore, we tested the co-expression patterns of these identified genes between high BMD and low BMD subjects based on two independent RNA expression datasets (i.e. GSE2208 and GSE7429). Based on the Pearson correlation analysis, we found that the co-expression patterns of these genes showed obvious differences between high and low BMD (Figures 5A,B and 6A,B). Subsequently, we first performed a differential gene expression (DGE) analysis to test the expression level of these genes in high BMD and low BMD subjects in the GSE2208 dataset. We found the genes of PPP1CB (P = 0.0026), FDFT1 (P = 0.0057), and XBP1 (P = 0.0023) showed significant up-regulated expression in low BMD subjects compared with high BMD subjects (Figure 5C–E). Consistently, we found the PPP1CB gene showed significantly higher expression in low BMD subjects than that in high BMD subjects with the use of an independent dataset of GSE7429 (P = 0.043; Figure 6C). In the dataset of GSE7429, we also detected that LAMB2 (P = 0.022) and ATXN3 (P = 0.038) were significantly differential expressed between low BMD and high BMD subjects (Figure 6D,E). In addition, several genes showed marginal evidence of differences between low BMD and high BMD subjects in both datasets (Supplementary Figures S4 and S5).

Differential expression profiles of 36 genes between high BMD and low BMD subjects in the GSE2208 dataset

Figure 5

(A) Co-expression patterns of 36 risk genes in low BMD subjects in the GSE2208 dataset. (B) Co-expression patterns of 36 risk genes in high BMD subjects in the GSE2208 dataset. (C) Boxplot of the different gene expression of PPP1CB between high BMD and low BMD subjects in the GSE2208 dataset. (D) Boxplot of the different gene expression of FDFT1 between high BMD and low BMD subjects in the GSE2208 dataset. (E) Boxplot of the different gene expression of XBP1 between high BMD and low BMD subjects in the GSE2208 dataset.

Differential expression profiles of 36 genes between high BMD and low BMD subjects in the GSE2208 dataset

(A) Co-expression patterns of 36 risk genes in low BMD subjects in the GSE2208 dataset. (B) Co-expression patterns of 36 risk genes in high BMD subjects in the GSE2208 dataset. (C) Boxplot of the different gene expression of PPP1CB between high BMD and low BMD subjects in the GSE2208 dataset. (D) Boxplot of the different gene expression of FDFT1 between high BMD and low BMD subjects in the GSE2208 dataset. (E) Boxplot of the different gene expression of XBP1 between high BMD and low BMD subjects in the GSE2208 dataset.

Differential expression profiles of 36 genes between high BMD and low BMD subjects in the GSE7429 dataset

Figure 6

(A) Co-expression patterns of 36 risk genes in low BMD subjects in the GSE7429 dataset. (B) Co-expression patterns of 36 risk genes in high BMD subjects in the GSE7429 dataset. (C) Boxplot of the different gene expression of PPP1CB between high BMD and low BMD subjects in the GSE7429 dataset. (D) Boxplot of the different gene expression of FDFT1 between high BMD and low BMD subjects in the GSE7429 dataset. (E) Boxplot of the different gene expression of XBP1 between high BMD and low BMD subjects in the GSE7429 dataset.

https://doi.org/10.1359/jbmr.061113

Differential expression profiles of 36 genes between high BMD and low BMD subjects in the GSE7429 dataset

(A) Co-expression patterns of 36 risk genes in low BMD subjects in the GSE7429 dataset. (B) Co-expression patterns of 36 risk genes in high BMD subjects in the GSE7429 dataset. (C) Boxplot of the different gene expression of PPP1CB between high BMD and low BMD subjects in the GSE7429 dataset. (D) Boxplot of the different gene expression of FDFT1 between high BMD and low BMD subjects in the GSE7429 dataset. (E) Boxplot of the different gene expression of XBP1 between high BMD and low BMD subjects in the GSE7429 dataset.

Replication and refinement of eQTL and GWAS results for PPP1CB

Furthermore, we concentrated our analysis on the hub gene of PPP1CB, which is a hub gene in our PPI network analysis. There were two eSNPs of rs9309664 (P_eQTL = 1.42 × 10⁻¹⁷ and P_GWAS = 1.40 × 10⁻¹¹; see Supplementary Table S2) and rs7475 (P_eQTL = 2.10 × 10⁻⁶ and P_GWAS = 1.70 × 10⁻⁷; see Supplementary Table S4) in PPP1CB identified. To validate our results, we used the tool of HaploReg [40] to examine the association of two eSNPs (rs9309664 and rs7475) with the expression of PPP1CB in independent datasets. The result confirmed that rs9309664 is significantly associated with PPP1CB expression in blood cells from two independent datasets (P = 6.19 × 10⁻⁶ and P = 3.4 × 10⁻²⁶; Supplementary Table S5), as well as rs7475 showed significant association with PPP1CB expression in a glioblastoma cell line from an independent dataset (P = 0.00024; Supplementary Table S5).

Discussion

Hitherto, more than 200 BMD-associated loci have been identified by GWAS studies [6–8,10]. However, the underlying biological mechanism of these risk SNPs and genes are still intractability.

Due to the influence of genetic LD among SNPs, these documented BMD-associated loci usually contain numerous highly LD SNPs with similar significant association signals. Thus, to confirm the exact causal SNPs of BMD-associated loci appeared to be a tough job. In view of a great number of detected susceptibility SNPs were mapped into the non-coding genomic regions, we reasonably inferred that these reported SNPs confer susceptibility to BMD via regulation of gene expression level. To address this issue, an effective and powerful inference method is warranted to be developed to combine the multi-omics information from genetic associations of existing GWAS summary data and independent eQTL data. In the current investigation, we employed a Bayesian integrative approach to combine the genetic associations from the large-scale GWAS of BMD (N = 142,487) and two independent eQTL datasets. In total, we identified 36 promising genes with eSNPs potentially implicated in BMD risk. To further prioritize the most convincing gene, we carried out several bioinformatics analyses, including MAGMA gene-based analysis, pathway-based enrichment analysis, in silico permutation analysis, PPI network-based analysis, and DGE analysis. Our results revealed that several genes, especially the PPP1CB gene, may represent authentic susceptibility genes for BMD.

The standard process of GWAS-based data analysis commonly uses millions of genetic SNPs to identify a number of common SNPs that showed significant associations with traits of interest such as osteoporosis. Nevertheless, the power of GWAS has been conspicuously limited by the enormous number of examined variants. GWAS may ignore the comprehensively integrated effects of modest variants or genes, which not reach a genome-wide significance but still importantly implicated in the pathology of osteoporosis. Furthermore, only based on the findings of GWAS analysis is impossible to infer whether the detected disease-associated SNPs contain regulatory functions. Thus, Sherlock integrative analysis is a good and effective method for combining the information of GWAS with eQTL data and has been applied to identify many novel risk genes of many complex diseases [15,19–22], which cannot be detected by GWAS alone.

With the use of Sherlock integrative analysis and MAGMA gene-based analysis, there were 36 BMD-associated risk genes identified with eSNPs at a genome-wide significance; for example, ACP2 (cis-rs11039035, P_GWAS = 1.10 × 10⁻¹¹; cis-rs2290148, P_GWAS = 2.40 × 10⁻⁸), EPB4L2 (cis-rs9375797, P_GWAS = 1.70 × 10⁻⁸; cis-rs4897473, P_GWAS = 2.40 × 10⁻¹⁵), VNN3 (cis-rs9402490, P_GWAS = 9.40 × 10⁻⁶⁷; cis-rs1856293, P_GWAS = 2.30 × 10⁻¹⁰), and CKB (cis-rs12894275, P_GWAS = 7.20 × 10⁻⁹; cis-rs2071407, P_GWAS = 1.70 × 10⁻¹²). Growing evidence indicates that disease-related genes encode functional proteins, which commonly generate a high interaction network [14,33,34,41]. By conducting a PPI network-based analysis, we found that these 36 identified genes appeared to be highly connected to each other. There were several genes of PPP1CB, XBP1, FDFT1, and SPTBN1 with the most number of interactions as the hub genes of the enriched network. For the hub gene of PPP1CB, there were two significant eSNPs of rs9309644 (P_eQTL = 1.42 × 10⁻¹⁷ and P_GWAS = 1.40 × 10⁻¹¹) and rs7475 (P_eQTL = 2.10 × 10⁻⁶ and P_GWAS = 1.70 × 10⁻⁷) identified. Consistently, these two eSNPs were reported to be significantly associated with PPP1CB expression in the other three previous studies [42–44]. Furthermore, based on two independent expression datasets, we found PPP1CB showed significantly higher expression in the low BMD group than that in the high BMD group. These findings indicate that the PPP1CB gene plays an important role in low BMD subjects, which is a strong risk factor for osteoporosis and a key factor for its diagnosis and treatment [2].

The protein encoded by PPP1CB gene, which is located on the chromosome of 2p23.2, is one of the three catalytic subunits of protein phosphatase 1 (PP1). Phosphoprotein phosphatase 1 (PPP1), a major type 1 serine/threonine phosphatase, is ubiquitously expressed and regulates various cellular functions including glycogen metabolism, cell division, and muscle contractility [45–48]. In recent, a study [49] showed a role for PPP1CB that it is the myosin light chain phosphatase responsible for Ca²⁺-transient rise and enhanced cell shortening in cardiomyocytes. Additionally, Cho et al. [45] have reported that PPP1CB regulates adipocyte differentiation partially by targeting the transcription factor C/EBPδ. Protein phosphatases are pivotal enzymes in diverse eukaryotic physiological processes including cell differentiation and proliferation, cell–cell communication, and regulation of transmembrane and intracellular signaling pathways. Dysregulation of these proteins expression or activity has been documented to be linked with numerous human diseases [45,50,51], including cancer, obesity, and osteoporosis.

In sum, our present study highlights 36 convincing genes associated with BMD risk and further provides strong evidence to support that PPP1CB represents a genuine BMD-associated risk gene with eSNPs conferring susceptibility to osteoporosis. Through integrating the GWAS-based genetic associations with gene expression data, we demonstrated a plausible elucidation of the biological function of genetic variants on osteoporosis susceptibility. Current integrative genomics analysis offers an effective pipeline for incorporating SNPs across the whole genome to genes via cis- and/or trans-regulatory effects on transcriptional abundance, which is more powerful for interpreting the biological mechanism of functional SNPs than a pure GWAS method. Identification of risk genes from our results give a group of new molecular targets for subsequent functional experiments and therapeutic testing in future, which will improve the translation of pharmacogenomics testing into clinical practice. Furthermore, more related molecular studies are warranted to reveal the biological mechanism of PPP1CB gene implicated in osteoporosis.

Competing Interests

The authors declare that there are no competing interests associated with the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China [grant number 81873320].

Author Contribution

Y.Z., L.Y., and Y.S. conducted data collection and bioinformatics analysis and wrote the manuscript. J.W. conceived the study and helped write the manuscript.

Ethics Approval

This study does not contain any studies with human subjects or animals performed by any of the authors.

Consent for Publication

The authors declare that we gave consent for publication.

Abbreviations

BMD: bone mineral density
DGE: differential gene expression
eQTL: expression quantitative trait loci
eSNP: expression-associated single-nucleotide polymorphism
GEFOS: the Genetic Factors for Osteoporosis
GHS: the Gutenberg Heart study
GO: Gene Ontology
GWAS: genome-wide association study
KEGG: Kyoto Encyclopedia of Genes and Genomes
LD: linkage disequilibrium
MAGMA: Multi-marker Analysis of GenoMic Annotation
PPI: protein–protein interaction

References

1.

Burge

R.

et al.

(

2007

)

Incidence and economic burden of osteoporosis-related fractures in the United States, 2005-2025

.

J. Bone Miner. Res.

22

,

465

–

475

[PubMed]

https://doi.org/10.1001/jama.298.23.2761

2.

Cauley

J.A.

et al.

(

2007

)

Long-term risk of incident vertebral fractures

.

JAMA

298

,

2761

–

2767

[PubMed]

https://doi.org/10.1002/jbmr.1537

3.

Liu

C.T.

et al.

(

2012

)

Heritability of prevalent vertebral fracture and volumetric bone mineral density and geometry at the lumbar spine in three generations of the Framingham study

.

J. Bone Miner. Res.

27

,

954

–

958

[PubMed]

https://doi.org/10.1038/nature14878

4.

Zheng

H.F.

et al.

(

2015

)

Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

.

Nature

526

,

112

–

117

[PubMed]

https://doi.org/10.1210/edrv.23.3.0464

5.

Peacock

M.

et al.

(

2002

)

Genetics of osteoporosis

.

Endocr. Rev.

23

,

303

–

326

[PubMed]

https://doi.org/10.1038/nrg3228

6.

Richards

J.B.

,

Zheng

H.F.

and

Spector

T.D.

(

2012

)

Genetics of osteoporosis from genome-wide association studies: advances and challenges

.

Nat. Rev. Genet.

13

,

576

–

588

[PubMed]

https://doi.org/10.1038/ng.2249

7.

Estrada

K.

et al.

(

2012

)

Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture

.

Nat. Genet.

44

,

491

–

501

[PubMed]

8.

Rivadeneira

F.

et al.

(

2009

)

Twenty bone-mineral-density loci identified by large-scale meta-analysis of genome-wide association studies

.

Nat. Genet.

41

,

1199

–

1206

[PubMed]

https://doi.org/10.1038/s41467-019-09302-x

9.

Chesi

A.

et al.

(

2019

)

Genome-scale Capture C promoter interactions implicate effector genes at GWAS loci for bone mineral density

.

Nat. Commun.

10

,

1260

[PubMed]

https://doi.org/10.1038/ng.3949

10.

Kemp

J.P.

et al.

(

2017

)

Identification of 153 new loci associated with heel bone mineral density and functional involvement of GPC6 in osteoporosis

.

Nat. Genet.

49

,

1468

–

1475

[PubMed]

https://doi.org/10.1038/ng.3314

11.

Nelson

M.R.

et al.

(

2015

)

The support of human genetic evidence for approved drug indications

.

Nat. Genet.

47

,

856

–

860

[PubMed]

https://doi.org/10.1016/j.gene.2017.08.015

12.

Choi

Y.J.

et al.

(

2017

)

Transcriptional profiling of human femoral mesenchymal stem cells in osteoporosis and its association with adipogenesis

.

Gene

632

,

7

–

15

[PubMed]

https://doi.org/10.1302/2046-3758.610.BJR-2017-0113.R1

13.

Wang

W.

et al.

(

2017

)

Integrative analysis of GWAS, eQTLs and meQTLs data suggests that multiple gene sets are associated with bone mineral density

.

Bone Joint Res.

6

,

572

–

576

[PubMed]

14.

Calabrese

G.M.

et al.

(

2017

)

Integrating GWAS and Co-expression Network Data Identifies Bone Mineral Density Genes SPTBN1 and MARK3 and an Osteoblast Functional Module

.

Cell Syst.

4

,

46e4

–

59e4

https://doi.org/10.1016/j.ajhg.2013.03.022

15.

He

X.

et al.

(

2013

)

Sherlock: detecting gene-disease associations by matching patterns of expression QTL and GWAS

.

Am. J. Hum. Genet.

92

,

667

–

680

[PubMed]

https://doi.org/10.1038/mp.2012.37

16.

Ayalew

M.

et al.

(

2012

)

Convergent functional genomics of schizophrenia: from comprehensive understanding to genetic risk prediction

.

Mol. Psychiatry

17

,

887

–

905

[PubMed]

https://doi.org/10.1038/ng1589

17.

Schadt

E.E.

et al.

(

2005

)

An integrative genomics approach to infer causal associations between gene expression and disease

.

Nat. Genet.

37

,

710

–

717

[PubMed]

https://doi.org/10.1038/ng.3538

18.

Zhu

Z.

et al.

(

2016

)

Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets

.

Nat. Genet.

48

,

481

–

487

[PubMed]

https://doi.org/10.1038/s41467-018-03247-3

19.

Yang

C.P.

et al.

(

2018

)

Comprehensive integrative analyses identify GLT8D1 and CSNK2B as schizophrenia risk genes

.

Nat. Commun.

9

,

838

[PubMed]

https://doi.org/10.1038/s41598-019-41434-4

20.

Lee

M.G.

et al.

(

2019

)

Integrative Genome-Wide Association Studies of eQTL and GWAS Data for Gout Disease Susceptibility

.

Sci. Rep.

9

,

4981

[PubMed]

https://doi.org/10.1016/j.jpsychires.2016.09.011

21.

Huo

Y.X.

et al.

(

2016

)

Identification of SLC25A37 as a major depressive disorder risk gene

.

J. Psychiatr. Res.

83

,

168

–

175

[PubMed]

22.

Zhong

J.

et al.

(

2019

)

Integration of GWAS and brain eQTL identifies FLOT1 as a risk gene for major depressive disorder

.

Neuropsychopharmacology

44

,

1542

–

1551

https://doi.org/10.1371/journal.pone.0010693

23.

Zeller

T.

et al.

(

2010

)

Genetics and beyond–the transcriptome of human monocytes and disease susceptibility

.

PLoS ONE

5

,

e10693

[PubMed]

https://doi.org/10.1038/ng2109

24.

Dixon

A.L.

et al.

(

2007

)

A genome-wide association study of global gene expression

.

Nat. Genet.

39

,

1202

–

1207

[PubMed]

https://doi.org/10.1073/pnas.0903103106

25.

Hindorff

L.A.

et al.

(

2009

)

Potential etiologic and functional implications of genome-wide association loci for human diseases and traits

.

Proc. Natl. Acad. Sci. U.S.A.

106

,

9362

–

9367

[PubMed]

https://doi.org/10.1371/journal.pcbi.1004219

26.

de Leeuw

C.A.

et al.

(

2015

)

MAGMA: generalized gene-set analysis of GWAS data

.

PLoS Comput. Biol.

11

,

e1004219

[PubMed]

https://doi.org/10.1016/j.ajhg.2009.09.012

27.

Landi

M.T.

et al.

(

2009

)

A genome-wide association study of lung cancer identifies a region of chromosome 5p15 associated with risk for adenocarcinoma

.

Am. J. Human Genet.

85

,

679

–

691

[PubMed]

https://doi.org/10.1093/bioinformatics/btp101

28.

Bindea

G.

et al.

(

2009

)

ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks

.

Bioinformatics

25

,

1091

–

1093

[PubMed]

https://doi.org/10.1101/gr.1239303

29.

Shannon

P.

et al.

(

2003

)

Cytoscape: a software environment for integrated models of biomolecular interaction networks

.

Genome Res.

13

,

2498

–

2504

[PubMed]

https://doi.org/10.1093/nar/28.1.27

30.

Kanehisa

M.

and

Goto

S.

(

2000

)

KEGG: kyoto encyclopedia of genes and genomes

.

Nucleic Acids Res.

28

,

27

–

30

[PubMed]

https://doi.org/10.1038/75556

31.

Ashburner

M.

et al.

(

2000

)

Gene Ontology: tool for the unification of biology

.

Nat. Genet.

25

,

25

[PubMed]

https://doi.org/10.1038/npp.2015.218

32.

Akula

N.

et al.

(

2016

)

An Integrative Genomic Study Implicates the Postsynaptic Density in the Pathogenesis of Bipolar Disorder

.

Neuropsychopharmacology

41

,

886

–

895

[PubMed]

https://doi.org/10.1038/s41598-017-01856-4

33.

Ma

Y.

and

Li

M.D.

(

2017

)

Establishment of a Strong Link Between Smoking and Cancer Pathogenesis through DNA Methylation Analysis

.

Sci. Rep.

7

,

1811

[PubMed]

https://doi.org/10.1016/j.cell.2013.06.049

34.

Gulsuner

S.

et al.

(

2013

)

Spatial and temporal mapping of de novo mutations in schizophrenia to a fetal prefrontal cortical network

.

Cell

154

,

518

–

529

[PubMed]

https://doi.org/10.1093/nar/gkq537

35.

Warde-Farley

D.

et al.

(

2010

)

The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function

.

Nucleic Acids Res.

38

,

W214

–

W220

[PubMed]

https://doi.org/10.1359/jbmr.080105

36.

Xiao

P.

et al.

(

2008

)

In vivo genome-wide expression study on human circulating B cells suggests a novel ESR1 and MAPK3 network for postmenopausal osteoporosis

.

J. Bone Miner. Res.

23

,

644

–

654

[PubMed]

https://doi.org/10.1016/j.cell.2010.01.044

37.

Ravasi

T.

et al.

(

2010

)

An atlas of combinatorial transcriptional regulation in mouse and man

.

Cell

140

,

744

–

752

[PubMed]

https://doi.org/10.1016/j.cell.2015.09.053

38.

Hein

M.Y.

et al.

(

2015

)

A human interactome in three quantitative dimensions organized by stoichiometries and abundances

.

Cell

163

,

712

–

723

[PubMed]

https://doi.org/10.1101/gr.104216.109

39.

Lin

A.

et al.

(

2010

)

A genome-wide map of human genetic interactions inferred from radiation hybrid genotypes

.

Genome Res.

20

,

1122

–

1132

[PubMed]

https://doi.org/10.1093/nar/gkv1340

40.

Ward

L.D.

and

Kellis

M.

(

2016

)

HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease

.

Nucleic Acids Res.

44

,

D877

–

D881

[PubMed]

https://doi.org/10.1371/journal.pgen.1001353

41.

Chen

M.

,

Cho

J.

and

Zhao

H.

(

2011

)

Incorporating biological pathways via a Markov random field model in genome-wide association studies

.

PLos Genet.

7

,

e1001353

[PubMed]

https://doi.org/10.1186/1755-8794-2-1

42.

Heap

G.A.

et al.

(

2009

)

Complex nature of SNP genotype effects on gene expression in primary human leucocytes

.

BMC Med. Genomics

2

,

1

[PubMed]

https://doi.org/10.1371/journal.pgen.1002197

43.

Fehrmann

R.S.

et al.

(

2011

)

Trans-eQTLs reveal that independent genetic variants associated with a complex phenotype converge on intermediate genes, with a major role for the HLA

.

PLos Genet.

7

,

e1002197

[PubMed]

https://doi.org/10.1093/nar/gks280

44.

Li

G.

et al.

(

2012

)

Identification of allele-specific alternative mRNA processing via transcriptome sequencing

.

Nucleic Acids Res.

40

,

e104

[PubMed]

https://doi.org/10.1016/j.bbrc.2015.10.004

45.

Cho

Y.L.

et al.

(

2015

)

Phosphoprotein phosphatase 1CB (PPP1CB), a novel adipogenic activator, promotes 3T3-L1 adipogenesis

.

Biochem. Biophys. Res. Commun.

467

,

211

–

217

[PubMed]

https://doi.org/10.1016/j.trsl.2014.07.001

46.

Korrodi-Gregorio

L.

,

Esteves

S.L.

and

Fardilha

M.

(

2014

)

Protein phosphatase 1 catalytic isoforms: specificity toward interacting proteins

.

Transl. Res.

164

,

366

–

391

[PubMed]

https://doi.org/10.1126/science.275.5305.1475

47.

Printen

J.A.

,

Brady

M.J.

and

Saltiel

A.R.

(

1997

)

PTG, a protein phosphatase 1-binding protein with a role in glycogen metabolism

.

Science

275

,

1475

–

1478

[PubMed]

https://doi.org/10.1111/j.1432-1033.1992.tb17508.x

48.

Alessi

D.

et al.

(

1992

)

The control of protein phosphatase-1 by targetting subunits. The major myosin phosphatase in avian smooth muscle is a novel form of protein phosphatase-1

.

Eur. J. Biochem.

210

,

1023

–

1035

[PubMed]

https://doi.org/10.1093/cvr/cvq252

49.

Aoyama

H.

et al.

(

2011

)

Isoform-specific roles of protein phosphatase 1 catalytic subunits in sarcoplasmic reticulum-mediated Ca(2+) cycling

.

Cardiovasc. Res.

89

,

79

–

88

[PubMed]

https://doi.org/10.3109/03008207.2015.1125480

50.

Broome

D.T.

and

Datta

N.S.

(

2016

)

Mitogen-activated protein kinase phosphatase-1: function and regulation in bone and related tissues

.

Connect. Tissue Res.

57

,

175

–

189

[PubMed]

https://doi.org/10.1016/j.biocel.2017.10.002

51.

Meeusen

B.

and

Janssens

V.

(

2018

)

Tumor suppressive protein phosphatases in human cancer: Emerging targets for therapeutic intervention and tumor stratification

.

Int. J. Biochem. Cell Biol.

96

,

98

–

134

[PubMed]