It has been reported previously that the C2-L1Tc protein located in the Trypanosoma cruzi LINE (long interspersed nuclear element) L1Tc 3′ terminal end has NAC (nucleic acid chaperone) activity, an essential activity for retrotransposition of LINE-1. The C2-L1Tc protein contains two cysteine motifs of a C2H2 type, similar to those present in TFIIIA (transcription factor IIIA). The cysteine motifs are flanked by positively charged amino acid regions. The results of the present study show that the C2-L1Tc recombinant protein has at least a 16-fold higher affinity for single-stranded than for double-stranded nucleic acids, and that it exhibits a clear preference for RNA binding over DNA. The C2-L1Tc binding profile (to RNA and DNA) corresponds to a non-co-operative-binding model. The zinc fingers present in C2-L1Tc differ in their binding affinity for nucleic acid molecules and also in their NAC activity. The RRR and RRRKEK [NLS (nuclear localization sequence)] sequences, as well as the C2H2 zinc finger located immediately downstream of these basic stretches, are the main motifs responsible for the strong affinity of C2-L1Tc for RNA. These domains also contribute to the binding of single- and double-stranded DNA and have a duplex-stabilizing effect. However, the peptide containing the zinc finger situated towards the C-terminal end of the C2-L1Tc protein has a slight destabilizing effect on a mismatched DNA duplex and, like the full-length C2-L1Tc, shows a strong preference for single-stranded nucleic acids. These results provide further insight into the essential properties of the C2-L1Tc protein as a NAC.
INTRODUCTION
Retrotransposons are ubiquitous mobile genetic elements that transpose through an RNA intermediate. These genetic elements are present in the genome of most eukaryotes [1]. They can be classified into two different lineages based on the integration mechanism they utilize. The elements having LTRs (long terminal repeats) are similar in structure and retrotransposition mechanism to those of retroviruses [2]. The elements lacking LTRs, also called LINEs (long interspersed nuclear elements), are very diverse in structure probably due to host–mobile element co-evolution. LINEs use a transposition mechanism originally described for the insect R2Bm non-LTR element, termed TPRT (target-primed reverse transcription) [3].
Most non-LTR retrotransposons have two ORFs (open reading frames). The mechanism for their retrotransposition depends upon the enzymatic functions of the ORF1 and ORF2 encoded proteins [4]. The ORF2 has a high degree of similarity with the pol genes of retroviruses and encodes a protein that provides the reverse transcriptase and endonuclease activities required for TPRT [5]. The role of the ORF1-encoded protein (ORF1p) in LINE retrotransposition has been difficult to uncover since it has not been associated with the function of any known protein. It has been described that in mammalian LINE-L1 elements the ORF1 codes for a protein with RNA-binding activity [6–8] and that it facilitates rearrangements between nucleic acids, thus behaving as a NAC (nucleic acid chaperone) [9]. The highly basic region located at the C-terminal half of the ORF1p is well conserved among all mammalian ORF1 proteins and it is involved in these activities [10]. The ORF1p protein encoded by the Drosophila melanogaster I factor LINE has also been shown to have nucleic-acid-binding capacity and to be endowed in vitro with NAC activity [11]. The ORF1p from I factor contains a zinc-finger motif (CCHC) similar to the zinc fingers present in the nucleocapsid basic portion of the retroviral gag polyproteins. The motif is also present in the proteins encoded by the first ORF from most LINE-like elements [12].
The L1Tc element is the best represented autonomous non-LTR retrotransposon from the Trypanosoma cruzi genome, a protozoan parasite belonging to the Trypanosomatidae family. This parasite is the agent responsible for Chagas' disease, a parasitism that affects 16–18 million people, mainly in Central and South America (www.who.int/tdr/diseases/chagas/direction.htm). The relatively high content of retroelements in the T. cruzi genome has been related to the significant genomic polymorphism and high degree of plasticity that this protozoan pathogen presents [13,14]. L1Tc is actively transcribed in the three stages of the parasite life cycle [15]. Some L1Tc copies have been found to contain a single ORF encoding a 1574 amino acid protein that contains all functional domains [16]. They are considered, therefore, as active transposable elements [16,17]. L1Tc codes for the enzymatic machinery involved in its retrotransposition process including an AP (apurinic/apyrimidinic) endonuclease [18], a 3′ phosphatase, a 3′ phosphodiesterase [18,19], a reverse transcriptase [20] and an RNase H activity [21].
We have previously described that the L1Tc C-terminus encodes a protein, termed C2-L1Tc, which has NAC activity and binds to several types of nucleic acids [22]. C2-L1Tc catalyses the rearrangement of nucleic acids, preventing melting of perfect DNA duplexes, and moreover facilitates the strand exchange between DNAs to form stable DNA duplexes [22]. The C2-L1Tc protein contains two cysteine motifs of the C2H2 type flanked by positively charged amino acid regions. In the context of retrotransposons this is, to our knowledge, the first description of a NAC activity mediated by a protein containing C2H2 zinc-finger motifs. It has been suggested that the two zinc fingers and the basic residues located upstream of the first zinc finger co-operate and are essential for the NAC activity [22]. The C2H2 motifs were first described in the Xenopus laevis TFIIIA (transcription factor IIIA). These motifs are also present in many transcription factors, as well as in other DNA-binding proteins [23]. Furthermore, the C2H2 motifs are also found in proteins encoded by other non-LTR retroelements such as R2 from arthropods, CRE/SLACS from trypanosomes, NeSL from Caenorhabditis elegans and in the GENIE family from Giardia lamblia [5,24].
The nucleic-acid-binding properties and the different affinity that the NAC proteins have for single- and double-strand nucleic acids have been related to the mechanism of NAC activity of the L1 elements [10]; however, this mechanism has not yet been completely understood in molecular terms. In the present study we have analysed the binding of C2-L1Tc to single- and double-stranded nucleic acids and investigated the contribution of the functional domains of C2-L1Tc to duplex stabilization/destabilization. In addition, we have determined the implication of specific regions of C2-L1Tc in the nucleic-acid-binding properties of the protein and its relationship to the NAC activity.
EXPERIMENTAL
Cloning and protein purification of the recombinant C2-L1Tc
The region of the L1Tc element between positions 3976 and 4851 (GenBank® accession number AF208537) [15] was cloned into the pCAS B vector (Active motif®) as previously described [22] (Figure 1a). The C2-L1Tc protein was produced in bacteria and purified under native conditions as previously described [22]. Thus C2-L1Tc recombinant protein was recovered with more than 95% purity (Figure 1b).
Structure of L1Tc and sequence of the recombinant C2-L1Tc protein (a), and purification of the recombinant protein (b)
Peptide synthesis
Peptides were synthesized by the simultaneous multiple solid-phase synthetic method [25]. The peptides were assembled using the standard t-Boc SPPS (solid-phase peptide synthesis) strategy on a MBHA (p-methylbenzhydrylamine) resin [26]. Purity was checked by HPLC. Peptide sequences are shown in Table 1. Peptides were dissolved in sterile 1×PBS containing 30 μM zinc chloride, at a final concentration of 500 μM.
Peptide | Sequence | RNA Kd (μM) | ssDNA Kd (μM) | dsDNA Kd (μM) | ΔTm (°C) (b) | Conc50 (μM) |
---|---|---|---|---|---|---|
C2-L1Tc | | 24±5×10−3 (a) | 98±20×10−3 | 1.78±0.35 | 0 | 0.07 |
5033 | RRRKEKCPHCDSTLTGFSGLVSHCRSFHP | 1.21±0.302 | 0.42±0.07 | 1.34±0.08 | +12 | 0.4 |
5015 | TVPPSAREEDVSPVRRRTLTRRRKEKC | 0.43±0.11 | 0.21±0.06 | 1.51±0.09 | Not melted | 0.51 |
5016 | RKEKCPHCDSTLTGFSGLVSHCRSFHP | 1.86±0.02 | 1.03±0.08 | 3.12±1.01 | +7 | 1.5 |
5031* | RKEKSPHSDSTLTGFSGLVSHCRSFHP | 3.08±0.11 | 2.29±0.07 | 9.13±0.16 | +2 | 2.4 |
10987 | EHPPPLPELKCDFCDMVFPTRRSTAQHRSRCAHNPD | 6.01±0.05 | 0.30±0.02 | 25.68±4.22 | −1 | 2.9 |
5020 | ATRHRNSSARRRSLLPQDQPAST | 19.79±3.18 | 10.48±0.45 | 22.83±9.78 | −2 | NO |
5030* | TVPPSAREEDVSPV…TLTRRRKEKC | NO | NO | NO | 0 | NO |
5032* | LT……CPHCDSTLTGFSGLVSHCRSFHP | NO | NO | NO | 0 | NO |
RNA and DNA synthesis
144nt-RNA and 130nt-RNA were generated using a HindIII-digested pGR77 plasmid that contains 77 bp corresponding to the internal promoter of L1Tc [22,27] and a HindIII-digested TcKMP11n clone whose sequence is not related to L1Tc respectively (GenBank® accession number AJ000077) [28]. In vitro transcription was carried out using, as a template, 2 μg of linearized DNA and T7 RNA polymerase as described by Barroso-delJesus et al. [29]. Then, 30 μCi of [α-32P]UTP (3000 Ci/mmol) were added to the reaction to radiolabel the in-vitro-synthesized transcripts. Specific activity was determined using a Bioscan QC2000 counter. The RNA was eluted from denaturing polyacrylamide gels, precipitated and resuspended in diethyl pyrocarbonate-treated water. A 2100nt-RNA containing a fragment of L1Tc mRNA from nucleotides 232–1468 was generated in vitro using the T7 polymerase from XbaI-linearized pCMV4NL1Tc as described above. Briefly, a 1234 bp KpnI–XbaI PCR fragment from L1Tc (GenBank® accession number AF208537) [15] was cloned into the expression vector pCMV4, resulting in pCMV4NL1Tc. The RNA was transcribed from the T7 promoter. Unincorporated nucleotides were removed by gel filtration (Sephadex G-50). Single-stranded RNAs (130 nt and 2100 nt; 130nt-denatured-RNA and 2100nt-denatured-RNA) were obtained by heating for 2 min at 65 °C, followed by cooling in ice for 2 min.
2.1kb-dsDNA (where dsDNA is double-stranded DNA) and 135bp-dsDNA were obtained by PCR using a T7 oligonucleotide (5′-GTAATACGACTCACTATAGGG-3′) as the sense primer. To amplify 2.1kb-dsDNA, pCMV4NL1Tc was used as a template and a 1446 bp L1Tc oligonucleotide (5′-GCTGATGCGGCGTAGATA-3′) as the antisense primer. A 135bp-dsDNA was amplified using kmp2 (5′-TTCCTCAAGAGTGGTGGC-3′) as the antisense primer and the TcKMP11n clone as a template [25]. The PCR products were purified by gel filtration (Sephadex G-50). A 135 nt single-stranded DNA (135nt-ssDNA) fragment was generated by PCR and enzyme digestion. In this case, the Pfu DNA polymerase and the T4 polynucleotide kinase-phosphorylated kmp2 primer and the T7 primer were used to amplify blunt-end-135bp-dsDNA. Following the PCR amplification, the phosphorylated minus-strand of the PCR product was removed by digestion with λ-exonuclease (Fermentas). After the inactivation of the enzyme by heating at 80 °C for 10 min, the plus-strand was purified by gel filtration (Sephadex G-25) and precipitated with ethanol. Both of the 135bp-dsDNA and 135nt-ssDNA products were 5′-end labelled using [γ-32P]ATP and T4 polynucleotide kinase (Roche). The unincorporated isotope was removed by gel-filtration chromatography (Sephadex G-25).
EMSAs (electrophoretic mobility-shift assays)
In the dsDNA- and ssDNA-binding experiments, 32P-labelled 135bp-dsDNA or 32P-labelled 135nt-ssDNA (0.5 nM) and increasing amounts of C2-L1Tc protein (0.02–2.8 μM), or the indicated concentration of each peptide, were incubated in 20 μl of binding buffer [20 mM Hepes (pH 7.5), 100 mM NaCl, 2 mM MgCl2, 2 mM DTT (dithiothreitol), 5% glycerol and 100 μg/ml BSA], for 30 min at 37 °C. For the RNA-binding experiments, 32P-labelled 130nt-RNA (0.72 nM) was incubated with increasing concentrations of the C2-L1Tc (0.015–0.733 μM) protein or the synthetic peptides (1–30 μM) in 16 μl of binding buffer containing 5 units of RNasin (Ambion) for 30 min at 37 °C. To compare the affinity of C2-L1Tc for 130nt-RNA and 130nt-denatured-RNA (see Figure 3), the reactions containing the native or denatured in vitro transcripts and the indicated amount of C2-L1Tc protein were incubated for 5 min to avoid the formation of any secondary structure in the denatured 130nt-RNA. All of these reactions were incubated in ice and stopped by addition of 8 μl of dye solution (50% glycerol, 0.1% Bromophenol Blue and 0.1% Xylene Cyanol). Nucleic-acid–protein complexes were resolved by electrophoresis on 5% native polyacrylamide gels (39:1, acrylamide/bisacrylamide) containing 1% glycerol. The gels were dried and phosphorimaged. The images were recovered on a Storm 820 and analysed with ImageQuant 5.2 (Amersham Biosciences).
Competition assays were performed by incubation of the C2-L1Tc protein (0.67 μM) with radiolabelled 130nt-RNA (0.72 nM) and increasing amounts of the non-radioactive 130nt-denatured-RNA and non-radioactive 2100nt-denatured-RNA in binding buffer at 37 °C for 5 min to avoid the formation of any secondary structure in the competitors. In a similar way, the binding affinity of the 130bp-dsDNA and 2.1kb-dsDNA fragments was calculated by mixing increasing concentrations of these molecules with the radiolabelled 144nt-RNA transcript (0.65 nM). These reactions were also incubated in binding buffer with the C2-L1Tc protein (0.67 μM) at 37 °C for 5 min. The reactions were stopped and the products were analysed by electrophoresis as described above.
DNA-melting assays
Assays were performed as described previously [9,22]. Briefly, a preannealed mismatched duplex was made by mixing 200 mM 32P-labelled 29-mer oligonucleotide with its complementary oligonucleotide containing four internal mismatches (mm29c) in water. The mixture was heated for 5 min at 95 °C. NaCl was added to a concentration of 50 mM and the mixture was slowly cooled to room temperature (22 °C). Then, 1 nM of the 32P-29mer/mm29c preannealed duplex was mixed with 1 μM of each peptide in 50 μl of buffer [20 mM Hepes (pH 7.5), 50 mM NaCl, 1 mM MgCl2, 1 mM DTT and 0.1% Triton X-100]. The sample was incubated for 5 min at temperatures ranging from 25 °C to 55 °C. At each 5 °C interval, a 5 μl aliquot was removed and mixed with 5 μl of ice-cold stop mix (0.4 mg/ml tRNA, 0.2% SDS, 15% Ficoll, 0.2% Bromophenol Blue and 0.2% Xylene Cyanol). Gels and analysis were performed as described. The melting effect was monitored on native 15% polyacrylamide gels. The dried gel was analysed using a phosphorimager system.
RESULTS
Binding properties of the C2-L1Tc protein to nucleic acids
We have previously shown that the C2-L1Tc protein encoded by L1Tc (Figures 1a and 1b), a non-LTR retrotransposon from T. cruzi, exhibits NAC activity and that it is able to bind to several types of nucleic acids with different affinity [22]. To determine whether the differences in nucleic-acid-binding affinity are due to specific features of the nucleic acids, EMSA experiments were carried out using increasing protein concentrations and several radiolabelled molecules (RNA, ssDNA and dsDNA) having the same sequence composition. As shown in Figure 2(a), and consistent with previous studies [22], when a low concentration of C2-L1Tc was incubated with RNA (130nt-RNA) a discrete product was formed. As the protein concentration increased, the amount of reduced-mobility products also increased (Figure 2a). When a ssDNA (135nt-ssDNA) was incubated with C2-L1Tc, a single sharp complex was detected, and no additional shifted bands were formed when the DNA was incubated with increasing concentrations of the protein (Figure 2b). In contrast, when dsDNA (135bp-dsDNA) was incubated with the protein, a faint smear was observed, together with a reduction of the amount of free-form dsDNA. The fraction of the protein-bound dsDNA retained in the wells increased as the concentration of protein increased (Figure 2c). These results suggest that the binding behaviour of C2-L1Tc for double- and single-stranded nucleic acids is different. The data from three independent experiments were used to generate linear-regression curves. The protein concentration at which half of each nucleic acid remained bound to the protein (Kd) was estimated to be 24±5×10−3 μM for the RNA, 98±20×10−3 μM for ssDNA and 1.78±0.35 μM for dsDNA. Thus we may conclude that C2-L1Tc has at least a 16-fold higher affinity for ssDNA than for dsDNA, and that it exhibits a clear preference for binding RNA over DNA.
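The fold preferences quoted above follow directly from the ratios of the reported Kd values. The short sketch below only restates that arithmetic; the dictionary and function names are illustrative, and the reported errors are not propagated.

```python
# Mean Kd values (in μM) as reported in the text; a lower Kd means tighter binding.
KD_UM = {
    "RNA": 24e-3,
    "ssDNA": 98e-3,
    "dsDNA": 1.78,
}


def fold_preference(weaker, tighter):
    """Return how many-fold tighter `tighter` is bound than `weaker` (ratio of mean Kd values)."""
    return KD_UM[weaker] / KD_UM[tighter]


if __name__ == "__main__":
    # ssDNA versus dsDNA: about 18-fold, in line with "at least 16-fold".
    print(f"ssDNA over dsDNA: {fold_preference('dsDNA', 'ssDNA'):.1f}-fold")
    # RNA versus ssDNA: about 4-fold.
    print(f"RNA over ssDNA:   {fold_preference('ssDNA', 'RNA'):.1f}-fold")
```

Because only the mean values are used, the ratios are point estimates and should be read together with the experimental errors given in Table 1.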
Nucleic-acid-binding analysis of C2-L1Tc protein
To test whether the high affinity of the protein for RNA is influenced by the 2′-OH group (or the methyl group in thymidine) or by the specific secondary structure of the RNA used, we compared the binding capacity of C2-L1Tc to native 130nt-RNA and to the same RNA in a denatured state, 130nt-denatured-RNA (Figure 3). The slight decrease in affinity of C2-L1Tc for the denatured RNA form (Kd=35.6±1.5×10−3 μM for 130nt-denatured-RNA and Kd=24.2±0.1×10−3 μM for 130nt-RNA) indicated that the RNA conformation influenced the C2-L1Tc binding capability (Figures 3a and 3b). However, the affinity of C2-L1Tc for the denatured 130nt-RNA was still higher than the observed affinity for 135nt-ssDNA (Kd=98±20×10−3 μM) as an indication that the C2-L1Tc protein has a preference for RNA.
Influence of the secondary structure on the binding capacity of C2-L1Tc to RNA
To determine whether the nucleic-acid length affects the capability of C2-L1Tc to bind RNA, competition experiments using competitors of different sizes were carried out (Figure 4). Figure 4(a) shows the assays performed with a constant amount of protein and radiolabelled 130nt-RNA, using as non-radioactive competitors a denatured 130nt-RNA (130nt-denatured-RNA) or a denatured 2100nt-RNA (2100nt-denatured-RNA). The experimental data were fitted to a four-parameter logistic curve. The EC50, defined as the competitor concentration required to release half the amount of the protein bound to the radiolabelled RNA, and the Hill coefficient (αH), which reflects co-operativity, were determined. The EC50 values were 0.43±0.17 ng/μl for the 130nt-denatured-RNA and 0.12±0.03 ng/μl for the 2100nt-denatured-RNA (Figure 4a). A similar assay was performed using a labelled 144nt-RNA and increasing concentrations of two unlabelled dsDNA molecules, 130bp-dsDNA and 2.1kb-dsDNA (Figure 4b). In this case, the EC50 values were 24.69±0.17 ng/μl and 10.34±5.7 ng/μl respectively. These results revealed that, although C2-L1Tc has a lower affinity for double-stranded than for single-stranded nucleic acids, the binding affinity of C2-L1Tc for both types of nucleic acid increases as the length of the nucleic acid increases. Furthermore, the Hill coefficient obtained in all cases was close to 1, corresponding to a non-co-operative-binding model. Thus the αH values for the longer molecules were 1.13±0.24 for the 2100nt-denatured-RNA and 0.94±0.39 for the 2.1kb-dsDNA, compared with 1.06±0.29 and 1.14±0.12 respectively, for the shorter nucleic acids.
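For readers unfamiliar with the four-parameter logistic model, the sketch below writes it out in one common parameterization and shows what a Hill coefficient near 1 implies for the shape of a competition curve. The exact parameterization used for the fits is not stated in the text, so the form, the function names and the plateau values (1 and 0) are assumptions made for illustration; only the EC50 and αH values are taken from the results above.

```python
def four_pl(conc, bottom, top, ec50, hill):
    """Four-parameter logistic curve for a competition assay.

    Returns the signal (for example, the fraction of labelled RNA still bound)
    at competitor concentration `conc`. The signal falls from `top` to `bottom`
    and is exactly halfway between them at conc == ec50.
    """
    if conc <= 0:
        return top
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** hill)


def span_10_to_90(hill):
    """Fold change in competitor concentration needed to go from 10% to 90% displacement."""
    return 81.0 ** (1.0 / hill)


if __name__ == "__main__":
    # Reported values for the 130nt-denatured-RNA competitor (EC50 in ng/μl).
    ec50, alpha_h = 0.43, 1.06
    for conc in (0.05, 0.43, 4.0):
        print(conc, round(four_pl(conc, 0.0, 1.0, ec50, alpha_h), 3))
    # A Hill coefficient of 1 gives an 81-fold span; co-operative binding narrows it.
    print(round(span_10_to_90(1.0), 1), round(span_10_to_90(3.0), 1))
```

With a Hill coefficient close to 1, displacement is spread over almost two orders of magnitude of competitor concentration, which is the signature of non-co-operative binding.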
Effect of the nucleic-acid length on the C2-L1Tc nucleic-acid-binding capacity
Mapping of C2-L1Tc-binding domains to RNA
C2-L1Tc has two C2H2 zinc-finger motifs [22] flanked by domains enriched in basic residues such as RRR and RRRKEK (Figure 1a and Table 1). The RRRKEK domain has been described as a NLS (nuclear localization sequence) and also as a DNA-binding motif [30]. To determine the implication of these domains in the binding of C2-L1Tc to nucleic acids, peptides mapping the zinc fingers and the basic stretches (see sequence details in Figure 1a) were incubated with labelled 130nt-RNA. Figures 5(a)–5(c) show the band-shift assays. In order to determine the dissociation constant, Kd, the peptide-bound RNA fraction was quantified and plotted against the peptide concentration and fitted to the Hill equation (Figure 5d). The analysis indicated that peptide 5015 (Figure 5a), which covers the NLS and RRR stretches, has the strongest affinity for RNA with a Kd value of 0.43±0.11 μM. Peptide 5033 containing the NLS motif and the zinc finger located immediately downstream of this domain (named upstream-finger) was also shown to have a high affinity for the RNA molecule with a Kd value of 1.21±0.02 μM (Figure 5b). However, peptide 10987 containing the zinc finger situated towards the C-terminal end of the C2-L1Tc protein (named downstream-finger) and the peptide covering the region located downstream of this zinc finger, peptide 5020, had a lower affinity (Kd=6.01±0.05 μM and Kd=19.79±3.18 μM respectively), for the RNA molecule (Figure 5d).
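As a minimal illustration of how a Kd can be extracted from band-shift data with the Hill equation, the sketch below fits a bound-fraction curve by a coarse least-squares grid search. The data points are synthetic (generated from the equation itself with a Kd of 0.43 μM and a Hill coefficient of 1), and the fitting routine and its names are hypothetical; the software and fitting procedure actually used for Figure 5(d) are not described in the text.

```python
import math


def hill(conc, kd, n, bmax=1.0):
    """Hill equation: fraction of nucleic acid bound at peptide concentration `conc`."""
    return bmax * conc ** n / (kd ** n + conc ** n)


def fit_hill(concs, fractions, bmax=1.0):
    """Coarse least-squares grid search for Kd and the Hill coefficient n.

    Kd is scanned on a log-spaced grid between the lowest and highest
    concentration tested; n is scanned from 0.25 to 4.0 in steps of 0.05.
    """
    lo, hi = math.log10(min(concs)), math.log10(max(concs))
    best = None
    for i in range(401):
        kd = 10 ** (lo + (hi - lo) * i / 400)
        for j in range(5, 81):
            n = j * 0.05
            sse = sum((hill(c, kd, n, bmax) - f) ** 2
                      for c, f in zip(concs, fractions))
            if best is None or sse < best[0]:
                best = (sse, kd, n)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic titration (μM), not experimental data.
    concs = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
    fractions = [hill(c, 0.43, 1.0) for c in concs]
    kd, n = fit_hill(concs, fractions)
    print(f"Kd ~ {kd:.2f} uM, Hill coefficient ~ {n:.2f}")
```

A grid search is used here only to keep the example free of external dependencies; a non-linear least-squares routine would normally be preferred for real data.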
Binding analysis of C2-L1Tc-derived peptides to RNA by EMSA
To further analyse the implication of the upstream-finger and the basic regions located at the C2H2 N-terminal end on the RNA-binding capacity of C2-L1Tc, several peptides containing deletions or substitutions (see sequence details in Table 1) were studied using EMSAs (Figures 5a and 5c). The results showed that the partial or complete deletion of NLS (peptides 5016 and 5032) resulted in a decrease (Kd=1.86±0.02 μM) and a complete loss of affinity for RNA respectively, in spite of the fact that the upstream-finger was conserved. In addition, when the RRR stretch was removed from peptide 5015, to generate peptide 5030, the RNA-binding capacity was eliminated although the NLS domain was present (Figure 5c). Substitution of the CCHH motif of peptide 5016 by SSHH, which generates peptide 5031, resulted in a 2-fold decrease in binding affinity (Kd=3.08±0.11 μM) for RNA. These data confirm that the central region of C2-L1Tc containing the upstream-finger and the basic RRR and RRRKEK stretches is the main region responsible for the binding of the protein to RNA.
Mapping of C2-L1Tc-binding domains to ssDNA
A similar approach to that described above was carried out in order to determine the implication of the C2-L1Tc motifs for ssDNA-binding affinity. Thus band-shift assays were performed using a constant amount of radiolabelled 135nt-ssDNA and an increasing amount of each peptide (Figures 6a–6d). To calculate the Kd, the peptide-bound DNA fraction was quantified and plotted against the peptide concentration (Figure 6e). This analysis showed that peptide 5015 containing the RRR and NLS stretches, and peptide 10987 containing the downstream-finger, have, among the assayed peptides, the highest binding affinity for ssDNA (Kd=0.21±0.06 μM and Kd=0.30±0.02 μM respectively) (Figure 6e). The presence of a single-shifted band was observed at different concentrations of peptide 10987 (Figure 6b). Despite the fact that the 10987 peptide concentration increased from 0.09 to 1.8 μM (Figures 6b and 6e), 25% of the ssDNA remained unbound (Bmax=0.74±0.02). Interestingly, in spite of the very low affinity of peptide 5032 (containing the upstream-finger) to ssDNA, a similar shifted band was detected at a high concentration of this peptide (Figure 6c).
Binding analysis of C2-L1Tc-derived peptides to ssDNA by EMSA
Peptide 5033 containing the upstream-finger and the complete NLS motif also has a high affinity for ssDNA (Kd=0.42±0.07 μM). Deletions and mutations of these two motifs resulted in a significant reduction in binding affinity. Thus peptide 5016 which contains a partial deletion of the NLS, but maintains the zinc-finger motif, had less affinity for ssDNA (Kd=1.03±0.08 μM) than peptide 5033. Peptide 5031, in which the cysteine residues of the upstream-finger were substituted by serine residues, had a 2-fold lower affinity for ssDNA than the 5016 peptide (Kd=2.29±0.07 μM) (Figure 6d). Peptide 5020, containing the RRR stretch located downstream of the downstream-finger, exhibited a low ssDNA-binding affinity (Kd=10.48±0.45 μM). Thus, most probably, both zinc-fingers and the basic stretches flanking them (the two RRR and the RRRKEK sequences) participate in the binding of C2-L1Tc to ssDNA.
Mapping of C2-L1Tc-binding domains to dsDNA
The role of the C2-L1Tc motifs in binding to dsDNA was also investigated by incubating 135bp-dsDNA and increasing concentrations of each peptide (Figures 7a–7d). The data were analysed using the Hill equation (Figure 7e). Peptide 10987, which contains the downstream-finger, had only a slight affinity for dsDNA (Kd=25.68±4.22 μM). However, the peptides containing the upstream-finger and the complete or partial NLS motif, peptides 5033 and 5016, had a high affinity (Kd=1.34±0.08 μM and 3.12±1.01 μM respectively). The substitution of CCHH for SSHH in this zinc finger (peptide 5031) resulted in a 5-fold decrease in binding affinity (Kd= 9.13±0.16 μM), corroborating the important role of this motif in the binding of C2-L1Tc to dsDNA. Nevertheless, peptide 5032, which lacks the NLS, but maintains the zinc finger, had no affinity for dsDNA. Thus the basic stretches upstream of the zinc fingers (RRR and NLS) also participate in the binding of C2-L1Tc to dsDNA. In fact, peptide 5015 (which contains both regions) showed among the assayed peptides a high affinity (Kd=1.51±0.09 μM) for dsDNA. Moreover, the deletion of the RRR stretch (peptide 5030) resulted in a complete loss of binding affinity. However, the peptide containing the region located downstream of both C2H2 zinc fingers (peptide 5020), bearing also a basic stretch (RRR), showed a low affinity for dsDNA. The Kd for peptide 5020 was estimated to be 22.83±9.78 μM.
Analysis of the binding affinity of C2-L1Tc-derived peptides to dsDNA by EMSA
Remarkably, the concentration-dependent binding curves obtained for the binding of each peptide to dsDNA displayed significantly different shapes (Figure 7e). Sigmoidal and hyperbolic curve shapes indicate co-operative and non-co-operative binding respectively. To obtain the co-operativity value, a Hill transformation was applied to the dsDNA-binding data (Figure 7f). The Hill coefficients (αH) for peptides 5015, 5033 and 5020 (containing at least one of the complete basic stretches) were 3.34±0.21, 3.35±0.42 and 2.47±0.48 respectively, indicating a high degree of dsDNA binding co-operativity. However, peptide 10987 (containing the downstream-finger) binds to dsDNA with low co-operativity (αH=1.29±0.35). The binding co-operativity of peptide 5016 [which contains the upstream-finger and a fraction of the NLS motif (RKEK)] is also low (αH=1.63±0.35). The relatively low binding co-operativity of the 10987 and 5016 peptides is likely to be due to the presence of the zinc fingers. We observed that the substitution of the CCHH in peptide 5016 for SSHH (peptide 5031) increased the degree of co-operativity more than 2-fold (αH=1.63±0.16 and αH=3.75±0.19 respectively).
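The Hill transformation referred to above is commonly written as log[θ/(1−θ)] = αH·log[L] − αH·log Kd, where θ is the bound fraction and [L] the peptide concentration, so that the slope of the resulting straight line is the Hill coefficient. The sketch below applies this linearization to synthetic data; the exact form used for Figure 7(f) is not given in the text, so this is an assumed standard version with hypothetical function names.

```python
import math


def hill_plot_fit(concs, fractions):
    """Fit the Hill plot log10(theta / (1 - theta)) versus log10(conc) by least squares.

    Returns (hill_coefficient, kd). Points with theta <= 0 or theta >= 1 are
    skipped because the transformation is undefined there.
    """
    pairs = [(math.log10(c), math.log10(f / (1.0 - f)))
             for c, f in zip(concs, fractions) if 0.0 < f < 1.0]
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    # The line crosses zero where conc == Kd, so log10(Kd) = -intercept / slope.
    kd = 10 ** (-intercept / slope)
    return slope, kd


if __name__ == "__main__":
    # Synthetic co-operative binding curve (Kd = 1.5 μM, Hill coefficient = 3.3).
    concs = [0.5, 0.75, 1.0, 1.5, 2.0, 3.0]
    fractions = [c ** 3.3 / (1.5 ** 3.3 + c ** 3.3) for c in concs]
    alpha_h, kd = hill_plot_fit(concs, fractions)
    print(f"Hill coefficient ~ {alpha_h:.2f}, Kd ~ {kd:.2f} uM")
```

A slope near 1 corresponds to a hyperbolic (non-co-operative) curve, whereas slopes well above 1, such as those reported for peptides 5015 and 5033, correspond to sigmoidal curves.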
Effect on duplex stability of the protein motifs of C2-L1Tc
We have previously shown that the C2-L1Tc protein promotes the exchange of strands on mismatched DNA duplexes in the presence of an excess of single-stranded complementary DNA, even though the protein has no effect on the Tm (melting temperature) of the mismatched duplex [22]. This process probably requires the combination of several effects, including both stabilization and destabilization of a mismatched DNA. In order to evaluate the implication of the C2-L1Tc motifs on the Tm of mismatched duplexes, peptides bearing these motifs were tested in melting assays. As expected, peptides 5030 and 5032, which lack both ssDNA- and dsDNA-binding capability, do not have any influence on the Tm of this imperfect DNA duplex (Figure 8 and Table 1). Peptide 5015, containing both basic motifs (RRR and NLS sequence), strongly prevented the melting of the duplex. Peptide 5033, containing the NLS sequence together with the upstream-finger, also prevented the melting of the duplex, increasing the Tm from 40 °C to 52 °C. The partial deletion of the NLS motif, peptide 5016, and the additional substitution of cysteine residues by serine residues in the upstream-finger, peptide 5031, resulted in a reduction in the Tm (47 °C and 42 °C respectively compared with 52 °C of the 5033 peptide), although both of them maintained the stabilization effect. In contrast, the peptides containing the downstream-finger or the RRR region located downstream of both zinc fingers (peptides 10987 and 5020 respectively) induced a slight decrease in the Tm of the duplex containing four internal mismatches (Figures 8a and 8b).
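One simple way to turn a melting series of this kind into a Tm is to take the temperature at which half of the labelled duplex has melted, interpolating linearly between the two flanking 5 °C points. The sketch below does this for invented duplex fractions chosen so that the control gives 40 °C and the peptide-treated sample 52 °C; the 50% criterion, the data and the function name are all assumptions for illustration, since the text does not state how the Tm values were derived from the gels.

```python
def estimate_tm(temps, duplex_fractions):
    """Estimate Tm as the temperature at which the duplex fraction falls to 0.5.

    `temps` must be in increasing order. Linear interpolation is used between
    the two neighbouring measurements; None is returned when the duplex never
    drops below 50% in the tested range (reported as "Not melted" in Table 1).
    """
    points = list(zip(temps, duplex_fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    return None


if __name__ == "__main__":
    temps = [25, 30, 35, 40, 45, 50, 55]  # °C, as in the melting assay
    # Hypothetical duplex fractions, not measured values.
    control = [1.0, 0.95, 0.8, 0.5, 0.2, 0.05, 0.0]
    stabilized = [1.0, 1.0, 0.98, 0.9, 0.8, 0.6, 0.35]
    tm_control = estimate_tm(temps, control)
    tm_peptide = estimate_tm(temps, stabilized)
    print(f"Tm control = {tm_control:.0f} C, Tm with peptide = {tm_peptide:.0f} C, "
          f"delta Tm = {tm_peptide - tm_control:+.0f} C")
```

A positive ΔTm indicates duplex stabilization and a negative value destabilization, matching the sign convention used in Table 1.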
Effect of C2-L1Tc-derived peptides on the Tm of a preformed mismatched dsDNA duplex
DISCUSSION
Retrotransposition of LINEs requires, at different steps, the interaction of some of the proteins that the elements code for with nucleic acids. Thus the interaction between the proteins encoded by these elements and the intermediate RNA forming a RNP (ribonucleoparticle), as well as that between the newly formed RNP and the target DNA, are obligatory processes [31,32]. Even though there is a large diversity among non-LTR retrotransposable elements, a conserved domain containing a potential nucleic-acid-binding motif is retained in most of them [24]. It has been previously shown that the C2-L1Tc protein encoded by the sequence located at the 3′-end of the T. cruzi L1Tc element binds to nucleic acids and that it has NAC activity [22]. The results of the present study show that C2-L1Tc exhibits a preference for RNA binding. The high binding affinity that C2-L1Tc shows for RNA suggests that it may have an important role in vivo for L1Tc mRNA binding. This capability to bind RNA is also present in other proteins encoded by other non-LTR retroelements, such as ORF1-derived proteins (ORF1p) from the human and mouse L1 elements and from that encoded in the Drosophila I factor [7,32,33]. Thus the high affinity that these proteins have for RNA seems to be essential for mobilization of RNA-intermediate-mediated transposable elements. In fact, some specific point mutations in the ORF1p that reduce the binding affinity of the protein for RNA lead to the formation of altered RNPs and to a severe reduction of the retrotransposition efficiency [32].
The C2-L1Tc-binding profile to RNA corresponds to a non-co-operative-binding model. However, the affinity of the C2-L1Tc protein for this type of nucleic acid molecule increases with the RNA size, showing a clear non-specific sequence affinity. Consequently, since the L1Tc mRNA has a large size (5 kb), it is expected that the C2-L1Tc protein may bind to several positions on the RNA. C2-L1Tc may have binding preferences for specific conformations of the L1Tc RNA. In fact, we have observed that the RNA conformation influences the binding capacity of the C2-L1Tc protein for RNA molecules. Thus the affinity of C2-L1Tc for the RNA molecule is reduced when the RNA molecule is in a denatured state. Taken together, the results presented suggest that the C2-L1Tc protein plays an important role in the binding to the L1Tc transcript and consequently in the RNP formation. Furthermore, the data also suggest that the C2-L1Tc–RNA association may protect the L1Tc transcript from degradation.
We believe that the RRR and RRRKEK (NLS) domains, as well as the zinc finger located immediately downstream of these basic stretches (upstream-finger), are the main motifs responsible for the high binding affinity that the C2-L1Tc protein exhibits for the RNA molecule. The co-operation between these motifs and their relative position in the protein probably play an essential role in the binding of the protein to RNA, as the affinity of these peptides for the RNA is substantially lower than that shown by the full-length protein. Previously reported studies have indicated that the C2-L1Tc protein endowed with NAC activity stabilizes complementary DNA duplexes and does not modify the Tm of mismatched duplexes [22]. In the present study we provide some insights into the molecular mechanisms involved in this activity. In fact, C2-L1Tc has a 16-fold higher affinity for ssDNA than for dsDNA. A higher affinity for single-stranded than for double-stranded nucleic acids has also been described for the NAC proteins encoded by HIV-1 NC [34,35], the LTR retrotransposons from the yeast Ty-1 NC-like protein [36] and other non-LTR retrotransposons, such as the mouse L1 protein and Drosophila I factor ORF1p [3,11]. Since some L1 ORF1p mutant proteins, in which neither the binding affinity for RNA nor the RNP formation has been altered, exhibit a reduced retrotransposition rate [10,32], it has been suggested that this protein having NAC activity should have an additional function in the TPRT mechanism at a step subsequent to RNP formation. This additional function may be correlated with the capability of the protein to maintain a delicate balance between stabilization and destabilization of the helix, which has been demonstrated to be required for effective L1 retrotransposition [37].
EMSA analysis using synthetic peptides covering different active regions of the C2-L1Tc protein shows that the region containing the RRR and RRRKEK (NLS) domains has a high capacity for DNA binding. The affinity is higher for single-stranded than for double-stranded molecules. Deletion of either of these motifs produces a complete loss of the binding capacity for both types of nucleic acid. The binding affinity for DNA, but not for RNA, of peptides bearing these motifs is similar to that shown by the full-length protein. This suggests that the synergistic effect of the motifs present in the protein is not essential for DNA binding. The C2-L1Tc region that covers both the NLS domain and the zinc finger located immediately downstream of these basic stretches (upstream-finger) also shows a high affinity for both nucleic acid molecules. The substitution of serine residues for the cysteine residues in that finger induces a significantly greater decrease in binding affinity for dsDNA than for single-stranded nucleic acids. This indicates that the upstream-finger plays a relevant role in binding to dsDNA. However, the peptide bearing the zinc finger situated towards the C-terminal end of the C2-L1Tc protein (downstream-finger) binds mainly ssDNA. These results indicate that the two zinc fingers present in C2-L1Tc behave differently with respect to nucleic-acid binding. Other proteins containing multiple C2H2 zinc fingers (e.g. TFIIIA) are also able to bind both DNA and RNA molecules [38,39], although the zinc fingers involved in this binding are specific for each type of nucleic acid molecule [35,40].
Previous strand-exchange experiments have shown that the two C2H2 zinc fingers and the basic domains located upstream of the first C2H2 cysteine motif are involved in the NAC activity [22]. Our results show that the protein regions responsible for nucleic-acid binding are the same as those previously described to be implicated in NAC activity. However, the data presented in the present study (Table 1) suggest that the two processes are uncoupled. Thus, although binding to nucleic acids is essential for NAC activity, the affinity of isolated motifs for double- or single-stranded nucleic acids is not directly correlated with NAC activity. This seems to be a general characteristic of NAC proteins, and is probably due to the need to establish a proper balance between single-stranded and double-stranded interactions. This balance seems to be required to promote both stabilization and destabilization of the nucleic acid helix [35,41]. Previous studies have shown that the effective NAC activity (strand-annealing function) of the HIV-1 nucleocapsid protein is correlated with the protein's ability to rapidly bind to and dissociate from nucleic acids [42]. Our results show that the RRR and RRRKEK (NLS) domains and the upstream-finger, which are implicated in binding to both ssDNA and dsDNA, have a stabilizing effect on mismatched duplexes. However, both the downstream-finger and the basic motif located downstream of the finger, which are mainly implicated in binding to ssDNA, have a destabilizing effect. We suggest, therefore, that in order to function properly as a NAC, specific motifs of the C2-L1Tc protein must maintain a proper balance between the binding affinity for single- and double-stranded nucleic acids and the capacity to stabilize and destabilize the nucleic acid helix.
Abbreviations
- dsDNA: double-stranded DNA
- DTT: dithiothreitol
- EMSA: electrophoretic mobility-shift assay
- LINE: long interspersed nuclear element
- LTR: long terminal repeat
- NAC: nucleic acid chaperone
- NLS: nuclear localization sequence
- ORF: open reading frame
- RNP: ribonucleoparticle
- ssDNA: single-stranded DNA
- Tm: melting temperature
- TFIIIA: transcription factor IIIA
- TPRT: target-primed reverse transcription
AUTHOR CONTRIBUTION
Sara Heras designed and performed most of the experiments, analysed the data and wrote the initial draft of the manuscript. Carmen Thomas designed the experiments, analysed the data, discussed results and contributed to writing the manuscript. Francisco Macias carried out experiments and contributed to manuscript preparation. Manuel Patarroyo carried out the synthesis of the peptides. Carlos Alonso discussed and corrected the manuscript prior to submission. Manuel López conceived the research, designed experiments, analysed the data, was involved in scientific discussion and also wrote the manuscript.
We thank M. Caro for technical assistance in the purification of the C2-L1Tc recombinant protein. We also thank Dr Cristina Romero for her help with data analysis and Dr Javier Cáceres for critical reading of the manuscript.
FUNDING
This work was supported by Plan Nacional I+D+I [MICINN (Ministerio de Ciencia e Innovación)] [grant numbers BFU2006-07972, BFU2007-64999]; PAI (Plan Andaluz de Investigación; Junta de Andalucía) [grant number P05-CVI-01227]; ISCIII-RETIC (Instituto de Salud Carlos III-Redes Temáticas de Investigación Cooperativa en Salud), Spain [grant number RD06/0021/0014]; a PAI Predoctoral Fellowship [grant number P05-CVI-01227 (to F.M.)]; and Colciencias [grant number RC-2007 (to M.E.P.)].