AF4 belongs to a family of proteins implicated in childhood lymphoblastic leukaemia, FRAXE (Fragile X E site) mental retardation and ataxia. AF4 is a transcriptional activator that is involved in transcriptional elongation. Although AF4 has been implicated in MLL (mixed-lineage leukaemia)-related leukaemogenesis, AF4-dependent physiological mechanisms have not been clearly defined. Proteins that interact with AF4 may also play important roles in mediating oncogenesis, and are potential targets for novel therapies. Using a functional proteomic approach involving tandem MS and bioinformatics, we identified 51 AF4-interacting proteins of various Gene Ontology categories. Approximately 60% participate in transcription regulatory mechanisms, including the Mediator complex in eukaryotic cells. In the present paper we report one of the first extensive proteomic studies aimed at elucidating AF4 protein cross-talk. Moreover, we found that the AF4 residues Thr220 and Ser212 are phosphorylated, which suggests that AF4 function depends on phosphorylation mechanisms. We also mapped the AF4-interaction site with CDK9 (cyclin-dependent kinase 9), which is a direct interactor crucial for the function and regulation of the protein. The findings of the present study significantly expand the number of putative members of the multiprotein complex formed by AF4, which is instrumental in promoting the transcription/elongation of specific genes in human cells.
INTRODUCTION
AF4 is the prevalent (37%) MLL (mixed-lineage leukaemia) fusion gene associated with spontaneous acute lymphoblastic leukaemia [1,2]. The AF4 gene transcript is ubiquitously expressed in all types of haemopoietic cells and in other human tissues, including brain [2,3]. The AF4 protein is a member of the ALF [AF4/LAF4 (lymphoid nuclear protein related to AF4)/FMR2 (Fragile X E mental retardation syndrome)] family of nuclear proteins, which includes AF4, AF5q31, LAF4 and FMR2 [4–8]. AF4, AF5q31 and LAF4 form fusion genes with MLL in leukaemia [9]. There are three regions that are conserved in ALF family members: the N-terminal homology domain, the ALF domain, which contains a proline/serine-rich region, and the C-terminal homology domain (Figure 1) [9]. Furthermore, all of the ALF family members, except FMR2, have a transactivation domain [9]. The ALF domain seems to promote ALF protein degradation through the proteasome pathway by mediating their interaction with SIAH (seven in absentia homologue) ubiquitin ligases [10,11]. A murine AF4-knockout model demonstrated that AF4 is important for normal lymphocyte development and cell growth [12]. Furthermore, Af4 was identified as the disease gene in the robotic mouse, a dominant N-ethyl-N-nitrosourea mutant that, in addition to defects in early T-cell maturation, develops ataxia because of Purkinje cell degeneration in the cerebellum [13]. This finding suggested that AF4 plays a hitherto unknown role in a function of Purkinje cells that is essential for the control of balance and motor co-ordination [14]. Moreover, AF4, partly through its interaction with P-TEFb (positive transcription elongation factor b) [15], ENL and/or AF9 [5], was found to have transcriptional regulatory properties that entail elongation and chromatin remodelling involving Pol II (RNA polymerase II) [15].
Of note, the ENL family proteins ENL and AF9 associate with DOT1L, the histone methyltransferase that modifies H3K79 and marks actively transcribed genes [15]. A recent study demonstrated that AF4 belongs to a higher-order multiprotein complex, which is composed of at least P-TEFb, ENL and AF5q31 (the ‘AEP complex’). This multiprotein complex is recruited by wild-type MLL only on some of its target promoters [i.e. HOXA9 (homeobox A9) and MEIS1 (myeloid ecotropic viral integration site 1 homologue)] [16] by context-dependent mechanisms that are still unknown. In contrast, chimaeric oncoproteins originating from the fusion of MLL with AEP components (MLL–AF4, MLL–AF5q31 and MLL–ENL) constitutively form hybrid complexes to cause sustained expression of MLL-target genes that leads to leukaemic transformation of haemopoietic cells [16].
Schematic representation of the flagged AF4 constructs used in the present study
In the present study, we used functional proteomics procedures to identify proteins that interact with human AF4, in an attempt to gain further insight into its function and into the regulatory mechanisms in which it is involved in both physiological and pathological transcriptional pathways.
EXPERIMENTAL
Plasmids
The full-length AF4 cDNA (GenBank® accession number NM_005935) and partial AF4 cDNAs, designated AF4-1 [bp (base pairs) 4–1950; aa (amino acids) 2–650], AF4-1.1 (bp 4–833; aa 2–277), AF4-1.2 (bp 696–1497; aa 232–499) and AF4-1.3 (bp 1384–1950; aa 462–650), were cloned into the N-terminal p3X-FLAG 7.1 vector (Sigma–Aldrich), to obtain recombinant proteins tagged with a FLAG epitope at the N-terminus (Figure 1). Primer sequences and cycling conditions are available upon request from the corresponding author.
Antibodies
The antibodies used were: mouse monoclonal anti-FLAG M2 and anti-α-tubulin (Sigma–Aldrich); rabbit polyclonal anti-AF4 for immunoprecipitation and anti-AF4 for Western blotting (Bethyl Laboratories); rabbit polyclonal anti-CDK9 (cyclin-dependent kinase 9), anti-ELL (eleven-nineteen lysine-rich leukaemia), anti-YWHAQ, anti-YWHAE, anti-MED1 (MED is Mediator complex subunit) (Santa Cruz Biotechnology); goat polyclonal anti-MED7, anti-CRSP3 (also known as MED23), anti-SIAH-1, anti-MED27, anti-MED24, anti-MED17, anti-MED6 and anti-MED26 (Santa Cruz Biotechnology); anti-mouse and anti-rabbit secondary antibodies (GE Healthcare); and anti-goat secondary antibody (Santa Cruz Biotechnology).
Cell culture and transfection
HEK (human embryonic kidney)-293 cells (A.T.C.C. number CRL-1573) were grown in DMEM (Dulbecco's modified Eagle's medium; Lonza), supplemented with 10% FBS (fetal bovine serum; Lonza) and 10 ml/l penicillin/streptomycin (Sigma–Aldrich). Cells were seeded 24 h before transfection at a density of approximately 6×10⁴ cells/cm² and transfected for 48 h using the calcium phosphate method. 697 cells (human pre-B lineage leukaemia) [DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen) number ACC 42] were cultured in RPMI 1640 medium (Lonza) supplemented with 20% FBS (Lonza), 2 mM L-glutamine (Sigma–Aldrich) and 10 ml/l penicillin/streptomycin (Sigma–Aldrich).
Cell lysis, protein extraction and immunoprecipitation
HEK-293 cells were transfected with either a recombinant or empty vector (mock control) and lysed in immunoprecipitation buffer A [10% glycerol, 50 mM Tris/HCl (pH 8), 150 mM NaCl, 0.1% Nonidet P40, 0.5 mM EDTA and 10 μl/ml PICM (protease inhibitor cocktail for mammalian tissues; Sigma–Aldrich)]. 697 cells were lysed in immunoprecipitation buffer B [50 mM Tris/HCl (pH 8), 150 mM NaCl, 10 mM KCl, 1.5 mM MgCl2, 10 mM sodium fluoride, 1 mM sodium orthovanadate, 0.2 mM EDTA, 0.5% Nonidet P40 and 10 μl/ml PICM].
For the FLAG immunoprecipitation assays, the lysate was incubated for 1 h at 4 °C with anti-FLAG M2–agarose affinity gel (Sigma–Aldrich) using 40 μl per 10 mg of total proteins. Immunocomplexes were eluted from the anti-FLAG affinity gel using 1 μl of elution buffer [immunoprecipitation buffer A with 200 μg/ml 3X-FLAG peptide (Sigma–Aldrich)] per μl of gel. For the immunoprecipitation assays using specific antibodies directed against endogenous interactors, lysates from either transfected HEK-293 or 697 cells were incubated overnight at 4 °C with 1–3 μg of a specific antibody, i.e. anti-MED7, anti-CRSP3, anti-CDK9, anti-ELL, anti-YWHAQ, anti-YWHAE or anti-SIAH-1 and, only for lysate from 697 cells, anti-AF4 antibody for immunoprecipitation, per 10 mg of total proteins. Subsequently, the mixture from HEK-293 cells was incubated with 30 μl of Protein A/G PLUS–agarose (Santa Cruz Biotechnology) per 1 μg of antibody, and the immunocomplexes were washed several times with immunoprecipitation buffer A. Otherwise, the mixture from 697 cells was incubated with 5 μl of Protein A–Sepharose 4 Fast Flow (GE Healthcare) per 1 μg of antibody, and the Sepharose-bound immunocomplexes were washed several times with immunoprecipitation buffer B. As a negative control, we used cell extracts incubated with serum IgGs (goat or rabbit, Sigma–Aldrich), processed in a similar manner to the samples.
Western blot analysis
Either 40 μg of WCE (whole-cell extract) from HEK-293 cells, or 40 μg of WCE from 697 cells, or 15 μl of sample from each immunoprecipitation experiment were loaded on to SDS/PAGE and then transferred on to a nitrocellulose membrane for Western blot analysis. Blots were incubated with the following primary antibodies: anti-FLAG-M2 (1:5000 dilution) or anti-α-tubulin (1:1000 dilution); anti-AF4 for Western blotting (1:8000 dilution); anti-CDK9, anti-YWHAQ, anti-YWHAE, anti-SIAH-1 (all at a 1:1000 dilution); anti-ELL (1:500 dilution); anti-MED1, anti-MED7, anti-CRSP3, anti-MED27, anti-MED24, anti-MED17, anti-MED6 and anti-MED26 (all at a 1:200 dilution). Membranes were then incubated with HRP (horseradish peroxidase)-conjugated appropriate secondary antibodies, i.e. anti-mouse or anti-rabbit (1:5000 dilution) or anti-goat (1:8000 dilution). The bands were visualized with the ECL (enhanced chemiluminescence) or ECL Plus detection system (GE Healthcare).
Preparative immunoaffinity purification
AF4-1-interacting proteins were purified using anti-FLAG-M2 affinity gel (Sigma–Aldrich) (as described above) on 50 mg of WCE from HEK-293 cells transfected with the FLAG–AF4-1 construct. The same purification protocol was also carried out with an equal amount of extract from HEK-293 cells that were transfected with the empty p3X-FLAG vector (mock control). The proteins eluted from the agarose beads were chloroform/methanol-precipitated and the pellets were dried [17].
SDS/PAGE, in-gel digestion and MS analysis
The protein pellets obtained by affinity purification from transfected and from mock control cells were resuspended in SDS/PAGE buffer and fractionated by SDS/PAGE (10% gels). Protein electrophoretic patterns were visualized using Gel Code Blue Stain Reagent (Pierce). The two gel lanes were cut to generate 38 slices (~2 mm) per lane. Subsequent in-gel digestion and nano-LC-ESI (liquid chromatography-electrospray ionization)-MS/MS (tandem MS) analyses of each of the protein slices were performed as described previously [18,19].
Raw data from nano-LC-ESI-MS/MS analyses were converted into a peak list by using LC-MSD Trap 6.0 Build 485.0 (Agilent Technologies) to create a Mascot format text. Proteins were identified in-house by means of Mascot software version 2.1 (Matrix Science) [20]. The protein search was governed by the following parameters: non-redundant protein sequence database (NCBInr, July 2009 – 20070908 database with 7415798 sequences and 2558340887 residues downloaded; Sprot, July 2009 – 57.6 database with 495880 sequences and 174780353 residues downloaded; the protein sequence entries actually searched numbered 187996 for NCBInr and 20331 for Sprot because of the taxonomy restriction to Homo sapiens); specificity of the proteolytic enzyme used for hydrolysis (trypsin); taxonomic category of the sample (H. sapiens); no protein molecular mass was considered; up to one missed cleavage; carbamidomethyl cysteine was set as a fixed modification, and protein N-acetylation, N-terminal pyroglutamate, oxidized methionine residues, and phosphorylation of serine, threonine or tyrosine were set as variable modifications; a precursor peptide maximum mass tolerance of 100 p.p.m. and a maximum fragment mass tolerance of 150 p.p.m. According to the probability-based Mowse score [20], the ion score is −10×Log(P), where P is the probability that the observed match is a random event. Individual scores >38 indicate identity or extensive homology (P<0.05). Individual MS/MS spectra for peptides with a Mascot score equal to 38 were inspected manually. All of the peptides were inspected for redundancy and represent sequences that are unique within the database used. Phosphopeptide spectra with a minimum Mascot score of 38 (P<0.05) were considered for further manual data interpretation.
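As an illustration of the two numerical criteria described above, the following minimal Python sketch converts between an ion score and the random-match probability P, and checks a mass error against a p.p.m. tolerance. It assumes that Log denotes the base-10 logarithm; the function names and the m/z values used in the usage lines are hypothetical and are not taken from the data set. The score threshold of 38 is the significance threshold reported for this search and is not derived by this code.

```python
import math


def ion_score(p_random: float) -> float:
    """Ion score as defined in the text: -10 * log10(P)."""
    return -10.0 * math.log10(p_random)


def p_from_score(score: float) -> float:
    """Inverse relation: probability that a match with this score is random."""
    return 10.0 ** (-score / 10.0)


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 100.0) -> bool:
    """True if the absolute mass error does not exceed the tolerance
    (100 p.p.m. is the precursor tolerance quoted in the text)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical usage with made-up m/z values
print(ion_score(0.001))
print(within_tolerance(1000.05, 1000.0))
```

Because the ion score is logarithmic, an increase of 10 score units corresponds to a tenfold lower probability of a random match.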
Phosphorylation site assignments by Mascot were validated and verified manually using the MS-Product tool (University of California San Francisco, Mass Spectrometry Facility, http://prospector.ucsf.edu/) with loss of H3PO4 and multiple losses enabled as an additional option, focusing on the occurrence of b- and y-ions, internal fragments, and neutral losses of H2O, NH3 or H3PO4 from these ions and the parent ion respectively.
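The neutral losses considered during manual validation can be sketched numerically as follows. This is only an illustrative calculation, assuming standard monoisotopic masses for H2O, NH3 and H3PO4; the function name and the example m/z value are hypothetical and do not come from the spectra reported here.

```python
# Monoisotopic masses (Da) of the neutral molecules named in the text
NEUTRAL_LOSSES = {
    "H2O": 18.010565,
    "NH3": 17.026549,
    "H3PO4": 97.976896,
}


def neutral_loss_mz(ion_mz: float, charge: int, loss: str) -> float:
    """Expected m/z of an ion after losing one neutral molecule.

    The neutral carries no charge, so the m/z shift is mass / charge.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return ion_mz - NEUTRAL_LOSSES[loss] / charge


# Hypothetical doubly charged parent ion at m/z 1000.0
for name in NEUTRAL_LOSSES:
    print(name, round(neutral_loss_mz(1000.0, 2, name), 4))
```

For a doubly charged ion, a loss of H3PO4 therefore appears as a shift of approximately 49 m/z units, which is the kind of signal sought when confirming a phosphoserine or phosphothreonine assignment.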
In silico analysis
To define an in silico ‘interactome’ map of AF4, we screened the HPRD (Human Protein Reference Database; freely available at http://www.hprd.org) for AF4 interactions [21]. We created a list of the known protein–protein interactions of AF4, and a list of primary (direct) and secondary (indirect) interactions was reported for each identified protein. In April 2011, the HPRD database contained 30047 human protein entries, 39194 protein–protein interactions and 453521 PubMed links. We used the HPRD, the BIND (Biomolecular Interaction Network Database; http://bond.unleashedinformatics.com/Action), DAVID bioinformatics resources (http://www.david.abcc.ncifcrf.gov) and STRING (http://string.embl.de) for protein clustering.
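The distinction between primary (direct) and secondary (indirect) interactions can be illustrated with a small graph sketch in Python. This is not the procedure used to query the HPRD; it assumes that interactions have already been exported as protein pairs. The AF4–SMAD9 and AF4–ENL pairs are the direct interactions reported in the HPRD according to this study, whereas the PLACEHOLDER entries are invented purely to show how secondary partners are collected.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected interaction graph from (protein, protein) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def primary_and_secondary(graph, protein):
    """Return (direct partners, partners of those partners) for one protein."""
    primary = set(graph.get(protein, ()))
    secondary = set()
    for partner in primary:
        secondary |= graph.get(partner, set())
    # Exclude the query protein and anything already counted as primary
    secondary -= primary | {protein}
    return primary, secondary


# AF4 pairs are from the text; PLACEHOLDER pairs are hypothetical
pairs = [
    ("AF4", "SMAD9"),
    ("AF4", "ENL"),
    ("ENL", "PLACEHOLDER_1"),
    ("SMAD9", "PLACEHOLDER_2"),
]
graph = build_graph(pairs)
primary, secondary = primary_and_secondary(graph, "AF4")
print(sorted(primary), sorted(secondary))
```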
Supporting information available (http://www.BiochemJ.org/bj/438/bj4380121add.htm)
Identification of AF4-1-interacting proteins by nanoLC/ESI-MS/MS, along with identified peptide sequences, precursor mass, charge state, mass errors, single peptide MASCOT score and sequence coverage (Supplementary Table S1). Analysis of control lane by nanoLC/ESI-MS/MS, along with identified peptide sequences, precursor mass, charge state, mass errors, single peptide MASCOT score, sequence coverage (Supplementary Table S2). Confirmation of identified phosphorylation sites (Supplementary Table S3). In silico analysis of AF4 interactors by the Human Protein Reference Database (http://www.hprd.org) (Supplementary Table S4).
RESULTS AND DISCUSSION
We have used a functional proteomics approach to identify, in the HEK-293 cell line, proteins that interact with AF4, a member of the ALF family that supplies most of the partners of the oncogenic fusion proteins involved in childhood acute lymphoblastic leukaemia [1]. The present study represents one of the first global functional proteomic analyses to determine the identity of proteins that form a multimeric complex with AF4.
In contrast with HeLa, SAOS, K562 and SK-NB-E cell lines, the human HEK-293 cell line efficiently expressed our ‘baits’, the recombinant proteins FLAG–AF4-1 and FLAG–AF4 (Figure 1). Moreover, as reported in the HPRD and Unigene databases, the AF4 transcript is expressed in kidney, and HEK-293 cells have served previously as a heterologous expression system to study mouse recombinant AF4 [15]. We transfected HEK-293 cells with the full-length AF4 and with the AF4-1 cDNA constructs (Figure 2A). Co-immunoprecipitation with the flagged full-length AF4 protein yielded protein amounts that were sufficient for Western blot analysis, but not for an exhaustive functional proteomic analysis. Consequently, we used the FLAG–AF4-1 construct for large-scale immunoaffinity purification, and then validated the results in Western blot experiments with the full-length recombinant protein. It is of note that FLAG–AF4-1 encodes the N-terminal half of the protein containing the transactivation domain (Figure 1), but not the binding domains for AF9 and ENL, the other two most common MLL fusion partners [22].
Expression of flagged proteins and affinity co-immunoprecipitations
Proteins from total lysates of HEK-293 cells transfected with FLAG–AF4-1 and with the empty vector (mock control) were immunoprecipitated, and immunocomplexes were fractionated by SDS/PAGE (10% gels) (Figure 2B). From each gel lane, we obtained 38 peptide mixtures, each of which was analysed in duplicate by MS. Peptide mixtures from the control lane were always injected before peptides from the sample lane to avoid carry-over.
Our analysis, although unable to distinguish between direct and indirect interactors, revealed 51 proteins, most of which were not previously known to be molecular partners of AF4. They are listed in Table 1, grouped according to their function. Supplementary Table S1 (at http://www.BiochemJ.org/bj/438/bj4380121add.htm) shows, for each protein entry, the peptide sequence identified, the precursor mass (m/z), the charge state (z), the mass errors (p.p.m.) on the precursor peptide, the Mascot score for each peptide and the protein sequence coverage. Each protein is reported only with the gene symbol from NCBI source to eliminate redundancy.
Gene symbol | Accession number | Protein |
---|---|---|
Transcription-regulatory proteins | ||
PPARBP, MED1 | gi∣2765322 | Activator-recruited cofactor 205 kDa component |
MED12 | gi∣4827042 | Mediator of RNA polymerase II transcription, subunit 12 homologue |
CRSP2, MED14 | gi∣4580326 | Cofactor required for Sp1 transcriptional activation, subunit 2, 150 kDa |
CRSP3, MED23 | gi∣28558969 | Cofactor required for Sp1 transcriptional activation, subunit 3, 130 kDa |
THRAP4, MED24 | gi∣8699628 | Vitamin D receptor-interacting protein complex component DRIP100 |
CRSP6, MED17 | gi∣28558975 | Cofactor required for Sp1 transcriptional activation, subunit 6, 77 kDa |
CRSP7, MED26 | gi∣28558977 | Cofactor required for Sp1 transcriptional activation, subunit 7, 70 kDa |
RBBP4 | gi∣13111851 | Retinoblastoma-binding protein 4 |
RUVBL2 | gi∣13111851 | 48 kDa TATA-box-binding protein-interacting protein |
EAF1 | gi∣27370592 | ELL-associated factor 1 |
MED4 | gi∣7141320 | p36 TRAP/SMCC/PC2 subunit (Mediator of RNA Pol II transcription, subunit 4 homologue) |
CRSP8, MED27 | gi∣7141322 | Cofactor required for Sp1 transcriptional activation, subunit 8, 34 kDa |
MED7 | gi∣13528909 | Cofactor required for Sp1 transcriptional activation, subunit 9 |
MED6 | gi∣3329506 | RNA polymerase transcriptional regulation mediator |
MED8 | gi∣33988564 | Mediator of RNA polymerase II transcription subunit 8 activator-recruited cofactor 32 kDa component |
TRFP, MED20 | gi∣4323033 | Trf (TATA binding protein-related factor)-proximal homologue |
MED21 | gi∣1515377 | RNA polymerase II holoenzyme component SRB7 |
ELL | gi∣10130023 | RNA polymerase II elongation factor ELL |
PCQAP | gi∣14043091 | MED15 |
TCEA1 | gi∣313223 | Transcription elongation factor A (SII), 1 |
DNA-directed RNA polymerase proteins | ||
POLR2A | gi∣36124 | RNA polymerase II largest subunit |
POLR2B | gi∣4505941 | RNA polymerase II second largest subunit |
POLR2C | gi∣2920711 | Polymerase (RNA) II (DNA-directed) polypeptide C, 33kDa |
DNA-binding proteins | ||
MCM3 | gi∣1552242 | DNA polymerase α-holoenzyme-associated protein P1 (p102 protein) |
RUVBL1 | gi∣15277588 | 49 kDa TATA box-binding protein-interacting protein |
RNA-binding proteins | ||
NCL | gi∣189306 | Nucleolin |
PRPF31 | gi∣40254869 | PRP31 pre-mRNA processing factor 31 homologue |
SNRPD2 | gi∣29294624 | Small nuclear ribonucleoprotein D2 polypeptide 16.5 kDa |
Serine/threonine kinase protein | ||
CDK9 | gi∣12805029 | Cyclin-dependent kinase 9 (CDC2-related kinase) |
Serine/threonine phosphatase protein | ||
PPP2R1A | gi∣178663 | Protein phosphatase 2 (formerly 2A) regulatory subunit A (PR 65) α isoform |
Translation regulator protein | ||
EEF1B2 | gi∣12652911 | Eukaryotic translation elongation factor 1 β1 |
Auxiliary transport protein | ||
GDI2 | gi∣285975 | Rab GDP dissociation inhibitor β |
Receptor signalling complex scaffold proteins | ||
YWHAE | gi∣12655169 | 14-3-3 protein, ϵ isoform (protein kinase C inhibitor protein 1) |
YWHAQ | gi∣55594676 | 14-3-3 protein θ (14-3-3 protein T-cell) |
SPIN | gi∣5730065 | SPINL |
AP3S1 | gi∣4502861 | Adapter-related protein complex 3 σ1 subunit |
Signal transduction proteins | ||
CCNT1 | gi∣2981196 | CDK9 associated C-type cyclin |
S100A8 | gi∣21614544 | S100 calcium-binding protein A8 |
S100A9 | gi∣4506773 | S100 calcium-binding protein A9 |
Transmembrane receptor protein tyrosine kinase protein | ||
FGFR2 | gi∣29432 | Fibroblast growth factor (FGR) receptor |
Lipid kinase proteins | ||
PIP5K2B | gi∣1857637 | Phosphatidylinositol-4-phosphate 5-kinase type II, β |
PIP5K2C | gi∣21322230 | Phosphatidylinositol-4-phosphate 5-kinase, type II, γ |
Heterotrimeric G-protein GTPase protein | ||
GNB4 | gi∣12654119 | Guanine-nucleotide-binding protein (G-protein), β polypeptide 4 |
Guanyl-nucleotide exchange factor protein | ||
ARHGEF4 | gi∣8809845 | Rho guanine-nucleotide-exchange factor (GEF) 4 |
Enzyme: dehydrogenase | ||
GAPDH | gi∣31645 | Glyceraldehyde-3-phosphate dehydrogenase |
Enzyme: synthase | ||
PTS | gi∣4506331 | 6-pyruvoyltetrahydropterin synthase |
Unknown | ||
SMAD9 | gi∣13959539 | MAD mothers against decapentaplegic homologue 9 |
LSM14A | gi∣16877144 | LSM14 homologue A (RNA-associated protein 55) |
DRG1 | gi∣17939479 | Developmentally regulated GTP-binding protein 1 (neural precursor cell expressed developmentally down-regulated 3) |
HCCA2 | gi∣55249549 | HCCA2 protein (MOB2) |
C20orf11 | gi∣21594655 | Chromosome 20 open reading frame 11 |
To select proteins that specifically interact with AF4-1, we subtracted species that were common to the AF4-1 and mock-transfected control lanes (Figure 2B). These common proteins are shown in Supplementary Table S2 (at http://www.BiochemJ.org/bj/438/bj4380121add.htm). All protein species identified by a single peptide were checked further. First, the peptide sequence stretch was manually reconstructed, and then the peptide sequence and the peptide precursor ion mass were inserted into the Mascot software using the sequence query mode. All searches were performed against the NCBInr and Sprot databases. The peptide sequence was also searched for using BLAST software (http://ncbi.nlm.nih.gov/blast). Supplementary Table S1 shows the full MS scan and the properly annotated MS/MS scan (with masses and fragment assignments) of proteins identified by a single peptide.
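The subtraction step and the flagging of single-peptide identifications can be sketched as simple set operations. The Python snippet below is a minimal illustration that assumes each lane has been reduced to a mapping from gene symbol to the list of identified peptides; the function names, the BACKGROUND_1 entry and the peptide strings are hypothetical placeholders, not data from this study.

```python
def specific_interactors(bait_lane, mock_lane):
    """Proteins found in the bait lane but absent from the mock control lane."""
    return sorted(set(bait_lane) - set(mock_lane))


def single_peptide_hits(lane, proteins):
    """Subset of `proteins` supported by exactly one distinct peptide,
    i.e. those that would need further manual checking."""
    return [p for p in proteins if len(set(lane[p])) == 1]


# Hypothetical lanes: gene symbol -> peptides (placeholder strings)
bait_lane = {
    "CDK9": ["PEPTIDE_A", "PEPTIDE_B"],
    "MED7": ["PEPTIDE_C"],
    "BACKGROUND_1": ["PEPTIDE_D", "PEPTIDE_E"],
}
mock_lane = {
    "BACKGROUND_1": ["PEPTIDE_D"],
}

specific = specific_interactors(bait_lane, mock_lane)
to_check = single_peptide_hits(bait_lane, specific)
print(specific, to_check)
```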
We classified the human proteins identified according to their Gene Ontology cellular localization (Figure 3A) and to the biological processes in which they are involved (Figure 3B). It is of note that the nature of the identified proteins, which are mainly localized in the nucleus (68%), points to the existence of a large protein network involving AF4. Approximately 60% of the proteins identified that take part in this cross-talk are clustered in the Pol II transcriptional complex.
Classification of the identified proteins
To check the authenticity of the AF4-interacting proteins identified by MS/MS, we verified the presence of some of these proteins by anti-FLAG co-immunoprecipitation and Western blot analysis, depending on the availability and efficiency of commercial antibodies. We confirmed that CRSP3/MED23, ELL, CDK9, CRSP33/MED7, YWHAQ and YWHAE specifically interact with AF4-1 and with the full-length AF4 (Figure 4A), thus suggesting that the latter two, despite their difference in length, have similar folding in common regions. We also identified SIAH-1, a known direct molecular interactor of AF4 [10,11] (Figure 4A). We validated the AF4 interactors further by reverse immunoprecipitation experiments using specific anti-CDK9, anti-YWHAQ, anti-CRSP3, anti-MED7, anti-ELL and anti-YWHAE antibodies on extracts from HEK-293 cells transfected with FLAG–AF4. Western blot analysis with the anti-FLAG antibody revealed the FLAG-tagged protein in each immunocomplex (Figure 4B). These interactions were also confirmed by Western blot analysis of immunocomplexes obtained from protein extracts of a haemopoietic cell line, the human pre-B lineage leukaemia 697 cells, by using specific antibodies directed against either endogenous AF4 or endogenous interactors (Figure 5). Overall, our immunoprecipitation experiments support the conclusion that endogenous CDK9, YWHAQ, CRSP3/MED23, CRSP33/MED7, ELL and YWHAE interact with FLAG–AF4 in HEK-293 cells, and that they also interact with endogenous AF4 in 697 leukaemia cells.
FLAG–AF4 associates with CRSP3, MED7, ELL, CDK9, SIAH1, YWHAQ and YWHAE
Endogenous AF4 associates with CRSP3, MED7, ELL, CDK9, SIAH1, YWHAQ and YWHAE
We also tested antibodies directed against other components of the Mediator complex that we identified in our proteomic study, namely CRSP8/MED27, THRAP4/MED24, PPARBP/MED1, CRSP6/MED17, MED6 and CRSP7/MED26. However, these antibodies failed to detect the respective target proteins in cellular extracts from HEK-293, HeLa, SAOS, SK-NB-E, K562 or 697 cells analysed by Western blotting (results not shown).
It is of note that in our system we did not identify all of the proteins previously reported to interact with AF4. This is probably due to the use of the N-terminal half of AF4 as ‘bait’. In fact, we did not find AF9, ENL, DOT1L or AF5q31, the AF4 partners known to interact with its C-terminal half [23]. However, although our bait (FLAG–AF4-1) lacks the crucial C-terminal-binding domain, we were able to identify various known and new molecular partners, thereby providing further insight into AF4 function in normal and leukaemic conditions. Obviously we cannot exclude that some of the reported interactions may be cell-lineage specific; however, almost all of the proteins that form the AF4 interactome, as well as AF4 itself, are variably expressed in various cell types, according to the UniGene database (http://www.ncbi.nlm.nih.gov/unigene).
AF4 is a phosphorylated protein [15], but its phosphorylation sites are still unknown. LC-MS/MS analysis of the bait protein revealed two phosphorylation sites (Figure 2B). The MS2 spectra from the identified phosphopeptides are shown in Figure 6. Peptide ELSPLISLPSPVPPLSPIHSNQQTLPR turned out to be phosphorylated on Thr220 and Ser212 (Figures 6A and 6B respectively). We also identified the corresponding unmodified peptide (Figure 6C). The phosphorylation sites identified were manually confirmed (Supplementary Figure S1 at http://www.BiochemJ.org/bj/438/bj4380121add.htm). Although this peptide was previously reported to be phosphorylated [24], we indicate for the first time the precise localization of the phosphorylation sites.
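The residue numbering of the two sites can be cross-checked against the peptide sequence with a short Python sketch. It assumes that the single threonine in the peptide is Thr220, which is an inference from the text rather than a statement in it; under that assumption the peptide would span residues 197–223 and position 212 would indeed be a serine. The function names are hypothetical, and the +79.966331 Da shift is the standard monoisotopic mass added by one phosphate group (HPO3).

```python
PEPTIDE = "ELSPLISLPSPVPPLSPIHSNQQTLPR"
PHOSPHO_SHIFT_DA = 79.966331  # monoisotopic mass of HPO3


def peptide_start(peptide: str, residue: str, protein_pos: int) -> int:
    """Protein coordinate of the first peptide residue, anchored on a
    residue type that occurs exactly once in the peptide."""
    if peptide.count(residue) != 1:
        raise ValueError("anchor residue must be unique in the peptide")
    return protein_pos - peptide.index(residue)


def residue_at(peptide: str, start: int, protein_pos: int) -> str:
    """Amino acid found at a given protein coordinate within the peptide."""
    return peptide[protein_pos - start]


# Assumption: the only Thr in the peptide corresponds to Thr220
start = peptide_start(PEPTIDE, "T", 220)
end = start + len(PEPTIDE) - 1

# All Ser/Thr/Tyr residues, i.e. every candidate site in this peptide
candidates = [(aa, start + i) for i, aa in enumerate(PEPTIDE) if aa in "STY"]

print(start, end, residue_at(PEPTIDE, start, 212))
print(candidates)
```

The peptide contains six candidate serine/threonine residues, which is why manual inspection of the fragment ions is needed to localize each phosphate group.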
MS/MS spectra of modified and unmodified peptides of AF4
To extend our data, we carried out an in silico proteomic analysis of proteins that are known to directly or indirectly interact with AF4. Examination of the HPRD indicated two direct interactions for AF4: SMAD9 and ENL [23,25]. In the present study, we identified SMAD9, also known as SMAD8A or SMAD8B (Table 1 and Supplementary Table S1). We carried out a detailed clustering analysis using the HPRD to look for a protein network involved in the same functional scenario. To this aim, we compiled a comprehensive list of known primary and secondary interactions for all proteins found in the present study (see Supplementary Table S4 at http://www.BiochemJ.org/bj/438/bj4380121add.htm).
Several proteins reported in the present study interact with each other, thus suggesting that they participate in the same multiprotein complex or complexes. The vast majority of proteins interacting with human AF4 are involved in Pol II-mediated transcription. Thus there is compelling evidence that AF4 plays a key role in the Pol II transcription machinery. Figure 7 shows a map of the AF4-interacting proteins, with Pol II as the core of the interactions.
Map of protein–protein interactions
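The notion of primary and secondary interactions can be illustrated with a small graph traversal. The Python sketch below is only an illustration: the edge list is a hand-picked subset of pairwise interactions stated in this Discussion (it is not an HPRD export), nucleolin is written with its gene symbol NCL, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative subset of pairwise interactions mentioned in this paper.
EDGES = [
    ("AF4", "SMAD9"), ("AF4", "ENL"), ("AF4", "CDK9"),
    ("CDK9", "CCNT1"),
    ("NCL", "POLR2A"), ("NCL", "CDK9"), ("NCL", "CCNT1"),
    ("ELL", "EAF1"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of protein pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def primary_and_secondary(graph, bait):
    """Return direct partners of `bait` and the partners of those partners."""
    primary = set(graph[bait])
    secondary = set()
    for partner in primary:
        secondary |= graph[partner]
    # Exclude the bait itself and proteins already counted as primary.
    secondary -= primary | {bait}
    return primary, secondary


graph = build_graph(EDGES)
primary, secondary = primary_and_secondary(graph, "AF4")
print("primary:", sorted(primary))
print("secondary:", sorted(secondary))
```

With a complete interaction table in place of the toy edge list, the same two-step expansion would yield the kind of primary and secondary partner lists that underlie a map centred on a protein of interest.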
Indeed, our proteomics analysis identified the serine/threonine kinase CDK9 and cyclin T1 (CCNT1), which interact with each other to form the positive transcription elongation factor P-TEFb, an activator of the Pol II elongation machinery [26]. It is known that AF4 associates with P-TEFb, positively regulates its kinase activity and stimulates Pol II transcriptional elongation [15]. Through its ability to phosphorylate the Pol II CTD (C-terminal domain), P-TEFb controls productive elongation of most eukaryotic genes, but also co-ordinates downstream events including pre-mRNA splicing and 3′-end processing [27]. P-TEFb recruits various positive regulators other than AF4, namely, the transcriptional activators HIV Tat, CIITA, c-Myc, NF-κB (nuclear factor κB), MyoD, Brd4, AF5q31 and ELL to specific gene promoters and stimulates transcriptional elongation [16,26,28,29]. A recent paper demonstrated that AF4, AF5q31 and ENL associate in an endogenous higher-order complex containing P-TEFb in haemopoietic lineage cells [16]. Our results indicate that such a complex should also be formed in HEK-293 cells. Therefore we mapped the CDK9-binding region within AF4. To this end, we produced three new FLAG-tagged constructs, namely AF4-1.1 (bp 4–833), AF4-1.2 (bp 696–1497) and AF4-1.3 (bp 1384–1950) (Figure 1), which we used to transiently transfect HEK-293 cells. We analysed the corresponding anti-FLAG immunoprecipitates by Western blotting with the anti-CDK9 antibody (Figure 8) and found that, in agreement with Yokoyama et al. [16], only AF4-1.1 (aa 2–277) was able to interact with CDK9. Intriguingly, this region contains Ser212 and Thr220, which we found to be phosphorylated in our structural characterization of AF4.
CDK9 interacts with the N-terminal region of AF4 (aa 2–277)
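The relationship between the nucleotide coordinates of the constructs and the amino acid range quoted for AF4-1.1 can be sketched as follows. This Python example assumes that bp 1 is the first base of codon 1 and counts only residues whose codon lies entirely within the construct; the function name `bp_to_aa` is hypothetical, and the ranges it returns for AF4-1.2 and AF4-1.3 are computed under that assumption rather than taken from the paper.

```python
# Nucleotide coordinates of the three constructs, as given in the text.
CONSTRUCTS = {
    "AF4-1.1": (4, 833),
    "AF4-1.2": (696, 1497),
    "AF4-1.3": (1384, 1950),
}
PHOSPHOSITES = {"Ser212": 212, "Thr220": 220}


def bp_to_aa(bp_start, bp_end):
    """Return (first, last) residue whose codon is fully inside the range.

    Codon k spans bp 3k-2 .. 3k, assuming bp 1 is the first base of codon 1.
    """
    first = (bp_start + 4) // 3  # smallest k with 3k - 2 >= bp_start
    last = bp_end // 3           # largest k with 3k <= bp_end
    return first, last


aa_ranges = {name: bp_to_aa(*bp) for name, bp in CONSTRUCTS.items()}

# Constructs that contain both phosphorylation sites.
containing = [
    name
    for name, (first, last) in aa_ranges.items()
    if all(first <= pos <= last for pos in PHOSPHOSITES.values())
]

for name, (first, last) in aa_ranges.items():
    print(f"{name}: aa {first}-{last}")
print("constructs containing Ser212 and Thr220:", containing)
```

Under this numbering the calculation reproduces aa 2–277 for AF4-1.1, and AF4-1.1 is the only construct that spans both phosphorylated residues.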
P-TEFb phosphorylates AF4 and down-regulates its transactivation activity [15]. Therefore phosphorylation is a key control mechanism of AF4 activity, and the AF4 N-terminal region might reasonably be the P-TEFb phosphorylation site. This type of control should thus be ineffective on MLL–AF4 chimaeras that lack the AF4 N-terminus. It is notable that phosphorylation events also regulate the function of other components that we identified in the AF4 protein network [e.g. 14-3-3 proteins, FGFR2 (fibroblast growth factor receptor 2), Pol II, nucleolin and MCM (minichromosome maintenance)].
In eukaryotes, Pol II is the central component of the basal Pol II transcription machinery. It moves along the template as the transcript elongates [30]. Elongation is influenced by the P-TEFb-mediated phosphorylation of the CTD of the largest Pol II subunit [31]. Consistent with the CDK9 interaction described above and with reports from other groups showing that AF4 directly interacts with P-TEFb [15,16], we identified three Pol II subunits in the AF4 multiprotein complex: the largest subunit (POLR2A), the second largest subunit (POLR2B) and a small subunit of 33 kDa (POLR2C). In addition, we found: (i) nucleolin, which forms complexes with POLR2A, CDK9 and CCNT1, and plays a role in transcriptional elongation [32]; (ii) TCEA1 (transcription elongation factor A), another Pol II transcription elongation factor, which has a role in suppression of transient pausing and strongly synergizes with p300 histone acetylase at a step subsequent to pre-initiation complex formation [33]; and (iii) MCMs, a family of proteins related to ATP-dependent helicases, which have been co-purified with Pol II after anti-MCM3 immunoaffinity chromatography [34]. The results of the present study support the idea that MCM proteins are components of the Pol II transcriptional apparatus [34].
We also found that AF4-1 interacts with the Pol II elongation factor ELL, which, like AF4, ENL and AF5q31, is frequently fused to MLL in childhood lymphoblastic leukaemia [34]. ELL is a multifunctional factor that exerts transcriptional elongation activity and inhibitory effects on the initiation of Pol II-mediated transcription [35,36], besides acting as a transcription factor [37]. In particular, the CTD of ELL interacts with P-TEFb [36], a molecular interactor of the N-terminal domain of AF4, as we demonstrate in the present study. Our proteomic analysis also identified EAF1 (ELL-associated factor 1) [38]. EAF1 interaction with the CTD of ELL is necessary and sufficient for the leukaemogenic effect of the MLL–ELL fusion protein [38]. Notably, a heterologous MLL–EAF1 fusion protein recapitulates the phenotype of MLL–ELL in vitro and in vivo [38]. EAF1 is a strong positive regulator of ELL elongation activity and contains a transactivation domain that has high sequence identity with the transactivation domain of AF4 [38]. Therefore it is not surprising that the latter may also associate with ELL and positively regulate its transactivating function [38,39]. The identification of ELL and EAF1 among AF4 interactors is in line with a recent report demonstrating that ELL and EAF1 are part of a SEC (super elongation complex), which includes P-TEFb, AF5q31, ENL and AF4, and is crucial in the control of transcription elongation [40]. Such a complex should be recruited constitutively by MLL chimaeras, which should thus be able to bypass the normal transcription initiation and elongation checkpoint steps, and activate aberrant MLL-target gene expression [16,40]. Indeed, knockdown of the central SEC component AF5q31 in MLL–AF4 leukaemia cells causes a reduction in the expression of HOXA9, a key mediator of leukaemogenesis [40].
Although we cannot be sure about the interactions that were not validated with immunoprecipitation/Western blot methods, the data discussed so far demonstrate the appropriateness of our proteomic approach. In fact, all the interactions we identified are in line with the most recent findings about the composition of the multiprotein complex involving AF4 [16,40]. Indeed, the finding of Pol II subunits in this complex supports the crucial role of AF4 in the transcriptional machinery. However, the novelty of the present study resides in the identification of 15 out of the 28 proteins that form the mammalian Mediator complex (Table 1).
The Mediator complex is a multiprotein transcription factor that is evolutionarily conserved and ubiquitously expressed in eukaryotes from yeast to man [41,42]. This large multisubunit complex is the primary regulator of the assembly of the pre-initiation complex, which includes the general transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH and Pol II [41–43]. In general, transcription factors and activators physically bind to a specific subunit of Mediator and thereby recruit other components of the complex to the target promoter. In particular, activator binding provokes specific conformational shifts in Mediator that induce a conformational state in the neighbouring subunits of the complex. Indeed, Mediator can adopt various conformations after the binding of different transcriptional activators to different Mediator subunits [44]. Therefore structural changes in Mediator afford additional opportunities to fine-tune the diverse regulatory inputs received from the DNA-binding factors and from other signals to the transcription machinery [45]. Genetic, biochemical and structural data revealed that Mediator comprises several modules (head, middle, tail and CDK8) [46]. The head, middle and tail modules form a relatively stable ‘core’ that is composed of 18–20 tightly associated subunits [40,45]. Conversely, the subunits PPARBP/MED1, MED25 and MED8, and the module constituted by CRSP3/MED23, THRAP4/MED24 and TRAP95/MED16 are variably and weakly associated with the central core [40]. The constant core and the variably associated proteins are considered to be components of the active forms of the Mediator complex. In contrast, SRB10/CDK8, MED13, MED12 and SRB11/CycC (cyclin C) form a functionally and physically separable module that has been implicated in transcriptional repression [40,45]. This potential microheterogeneity could result in a wide spectrum of mammalian Mediator complexes [47].
Our proteomic analysis of the AF4 complex identified ten Mediator central core subunits (TRAP36/MED4, TRAP32/MED6, CRSP33/MED7, CRSP2/MED14, CRSP6/MED17, TRFP/MED20, MED21, CRSP7/MED26, PCQAP/MED15 and CRSP8/MED27) and four variable subunits (PPARBP/MED1, MED8, CRSP3/MED23 and THRAP4/MED24). Apart from MED12, all of the components identified belong to the active form of Mediator. It is of note, however, that MED12 directly interacts with mammalian β-catenin and activates target genes, thus indicating that MED12 also has an alternative activating role [48]. Interestingly, β-catenin is activated during development of MLL leukaemic stem cells [49].
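The subunit counts quoted above can be tallied explicitly. The short Python sketch below groups the identified subunits as described in the text (MED12 is placed in the CDK8 module, following the module description given earlier) and checks the total against the 28 subunits of the mammalian Mediator complex; the dictionary layout and variable names are illustrative only.

```python
# Mediator subunits identified in the AF4 complex, grouped as in the text.
IDENTIFIED = {
    "core": ["MED4", "MED6", "MED7", "MED14", "MED17",
             "MED20", "MED21", "MED26", "MED15", "MED27"],
    "variable": ["MED1", "MED8", "MED23", "MED24"],
    "CDK8 module": ["MED12"],
}
TOTAL_MEDIATOR_SUBUNITS = 28  # size of the mammalian complex, as stated

counts = {group: len(members) for group, members in IDENTIFIED.items()}
all_subunits = [m for members in IDENTIFIED.values() for m in members]
n_identified = len(all_subunits)
fraction = n_identified / TOTAL_MEDIATOR_SUBUNITS

# Guard against listing the same subunit twice.
assert len(set(all_subunits)) == n_identified

print(counts)
print(f"{n_identified} of {TOTAL_MEDIATOR_SUBUNITS} subunits "
      f"({fraction:.0%}) identified")
```

The tally gives ten core subunits, four variable subunits and MED12, that is, 15 of the 28 subunits, in agreement with the figure quoted in the text.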
As our identifications indicate that AF4 interacts with Pol II and the Mediator complex, we may speculate that AF4 binding induces a structural change in Mediator, thereby activating stalled Pol II to transition to a positively elongating state and enabling effective transcription to take place. Therefore the AF4–Mediator complex interaction might be crucial for activation of specific gene expression. In this context, it is of note that Yokoyama et al. [16] demonstrated that the AEP complex (AF4, AF5q31, ENL and P-TEFb) co-localizes with wild-type MLL on specific target promoters, thereby indicating that this complex plays a role in physiological as well as in oncogenic MLL-dependent transcriptional pathways. However, they also showed that recruitment of AEP to MLL-target loci was not constitutive, because some MLL-occupied loci (e.g. HOXA7) did not contain AEP. Therefore these authors hypothesized that the MLL complex probably requires other, as yet unidentified, factors or signals for specific recruitment of AEP, and concluded that AEP recruitment, a downstream event in physiological MLL-dependent transcriptional pathways, is regulated in a context-dependent manner. This hypothesis suggests that the results of the present study may be integrated into the Yokoyama et al. [16] model. Specifically, we suggest that the Mediator complex, depending on its subunit composition, plays a crucial role in the recruitment of the AEP complex on specific MLL-target promoters. Indeed, the wild-type MLL complex may initiate the activation pathway by binding the regulatory region of target genes. It would then recruit some specific Mediator components, in a context-dependent manner. For example, MLL could recruit the head subunit MED17/TRAP80, a known direct interactor of ASH2L [ash2 (absent, small, or homeotic)-like] that, together with RBBP4 (retinoblastoma binding protein 4), an AF4 partner (Supplementary Figure S1), forms the MLL complex [50,51].
Alternatively, some Mediator components may recruit MLL directly to the target chromatin. Indeed, PPARBP/MED1 directly interacts with various ligands of nuclear receptors (thyroid hormone receptor, vitamin D receptor, peroxisome proliferator-activated receptor-γ, hepatocyte nuclear factor 4α, glucocorticoid receptor and oestrogen receptor), as well as with non-receptor type factors such as GATA1 (GATA-binding protein 1). After these initial recruitments, assembly of the pre-initiation complex, which entails the various general transcription factors and Pol II, and transcription initiation then ensue, together with the concomitant recruitment of other ad hoc components of the Mediator complex. After Pol II clears the promoter, the process can proceed directly to the elongation phase thanks to the Mediator-dependent specific recruitment of AF4 and the other components of the AEP complex described by Yokoyama et al. [16].
Lastly, we found that YWHAQ and YWHAE, two members of the 14-3-3 protein family, interact with AF4. 14-3-3s are ubiquitous proteins, usually cytosolic, that exert an extraordinarily wide-ranging influence on cellular functions, including cell-cycle control and apoptosis. They operate by binding to specific phosphorylated sites on such diverse target proteins as oncogene products, tumour suppressor proteins, and regulators of cell survival, proliferation and growth [52]. 14-3-3s are often associated with dynamic nucleo-cytoplasmic shuttling. Upon phosphorylation, many nuclear proteins, including transcription factors, bind to 14-3-3s, which control their rate of nuclear import/export thereby modulating transcriptional processes [52]. We suggest that 14-3-3s bind phosphorylated AF4 and contribute to the regulation of its movement into and out of the nucleus. In the cytosol, AF4 undergoes rapid proteasomal degradation via its well-known interaction with SIAH-1a and SIAH-2 ubiquitin ligases [10,11]. This mechanism could closely control AF4 turnover and, consequently, AF4-dependent transcriptional elongation. The phosphorylation sites that we have identified in AF4 might play a key role in the putative 14-3-3-mediated AF4 regulation/degradation pathway.
The characterization of AF4-interacting proteins reported in the present paper supports the growing body of evidence that AF4 is a crucial activator in a multimeric complex that promotes transcription in human cells, and that this multiprotein complex is functionally and structurally regulated (i.e. via the availability of components, phosphorylation, proteasome-mediated degradation, cellular localization and conformational shifts). Indeed, the results of the present study greatly increase the number of putative components participating in the multiprotein complex formed by AF4, as well as by other MLL fusion partners (ENL, ELL, AF5q31, AF9 etc.). The information reported in the present paper is useful for future studies aimed at unravelling the AF4-dependent molecular mechanisms, thereby giving insights into the molecular basis of AF4-mediated leukaemia and eventually leading to novel therapeutic targets. We previously reported that even when MLL–AF4 chimaeras lack part of the AF4 transactivation domain, they continue to give rise to acute lymphoblastic leukaemia [53]. Given results from previous studies [16,40] and the results of the present study, it is now also clear that this altered MLL–AF4 chimaera, irrespective of the transactivation domain, is able to recruit all of the protein components that are necessary for the transcription of genes that enhance and sustain cell transformation.
Note added in press
While this article was being processed, a paper appeared [54] that contains some results that overlap our findings.
Abbreviations
- aa
amino acids
- AEP complex
AF4–ENL–P-TEFb complex
- ALF
AF4/LAF4 (Lymphoid nuclear protein related to AF4)/FMR2 (Fragile X E mental retardation syndrome)
- bp
base pairs
- CDK9
cyclin-dependent kinase 9
- CTD
C-terminal domain
- ECL
enhanced chemiluminescence
- ELL
eleven-nineteen lysine-rich leukaemia
- EAF1
ELL-associated factor 1
- FBS
fetal bovine serum
- FMR2
Fragile X E mental retardation syndrome
- HEK
human embryonic kidney
- HOXA9
homeobox A9
- HPRD
Human Protein Reference Database
- LAF4
lymphoid nuclear protein related to AF4
- LC-ESI
liquid chromatography-electrospray ionization
- MCM
minichromosome maintenance
- MED
Mediator complex subunit
- MLL
mixed-lineage leukaemia
- MS/MS
tandem MS
- PICM
protease inhibitor cocktail for mammalian tissues
- Pol II
RNA polymerase II
- P-TEFb
positive transcription elongation factor b
- SEC
super elongation complex
- SIAH
seven in absentia homologue
- WCE
whole cell extract
AUTHOR CONTRIBUTION
Gabriella Esposito participated in designing the experiments, provided conceptual input, collated all the data and wrote the final version of the manuscript. Armando Cevenini and Alessandro Cuomo designed and performed the experiments. Francesca De Falco and Dario Sabbatino participated in the design and execution of the experiments. Fabrizio Pane participated in the analysis and discussion of the results. Margherita Ruoppolo participated in designing the experiments and contributed to the discussion and interpretation of the data. Francesco Salvatore conceived the rationale of the investigation and the overall experimental design, participated in the analysis and discussion of the results and in writing the paper. All authors revised the manuscript, and gave their approval to the final version of the text.
We are grateful to Jean Ann Gilder (Scientific Communication srl) for text editing and to Vittorio Lucignano for graphic editorial assistance in relation to the Figures.
FUNDING
This work was supported by the Ministero della Salute (Roma, Italy), by MIUR [grant number PS 35-126 IND] and Progetto SCoPE, by PRIN 2007 to F.S.; by a CEINGE-Regione Campania Convention [grant number DGRC 1901/2009]; by Regione Campania [grant number L.R.5/2002, Es. 2005].