Inteins are intervening protein sequences that undergo self-excision from a precursor protein with the concomitant ligation of the flanking polypeptides. Split inteins are expressed in two separated halves, and the recognition and association of two halves are the first crucial step for initiating trans-splicing. In the present study, we carried out the structural and thermodynamic analysis on the interaction of two halves of DnaE split intein from Synechocystis sp. PCC6803. Both isolated halves (IN and IC) are disordered and undergo conformational transition from disorder to order upon association. ITC (isothermal titration calorimetry) reveals that the highly favourable enthalpy change drives the association of the two halves, overcoming the unfavourable entropy change. The high flexibility of two fragments and the marked thermodynamic preference provide a robust association for the formation of the well-folded IN/IC complex, which is the basis for reconstituting the trans-splicing activity of DnaE split intein.
INTRODUCTION
Protein splicing is a post-translational process in which intervening proteins, termed inteins, are self-excised from the precursor proteins, while the two flanking polypeptides (exteins) are seamlessly ligated together with a normal peptide bond [1]. Two types of protein splicing occur naturally: cis-splicing with inteins having continuous sequences or trans-splicing with inteins having split sequences. In comparison with cis-splicing, trans-splicing has one additional binding step, in which two fragments of split inteins recognize and associate with each other to reconstitute splicing activity, followed by the four steps of the protein splicing mechanism (Figure 1) [2]. Protein trans-splicing can be easily controlled because each individual fragment can be prepared separately in an inactive form, and protein splicing occurs only after the reconstitution of the two complementary fragments. Thus, split inteins have been used for various applications in protein engineering, such as segmental isotopic labelling, protein semi-synthesis and detection of protein–protein interactions [3–5].
Schematic representation of protein trans-splicing
The DnaE intein from the split dnaE gene of Synechocystis sp. PCC6803 is a naturally occurring split intein, which has trans-splicing activity both in vivo and in vitro [6,7]. The crystal structure of an artificially fused continuous DnaE intein has been reported to adopt a horseshoe-like three-dimensional structure, termed the HINT (hedgehog/intein) fold, although the attempt to co-crystallize the two fragments of DnaE split intein failed [8]. A fast association rate (2.8×10^7 M−1·s−1) with high affinity (Kd~36 nM) of DnaE split intein was determined by FRET (fluorescence resonance energy transfer) [9]. Furthermore, the electrostatic interactions between split intein halves have been postulated to play a critical role during the binding process of the split DnaE intein [9–11]. Although the protein splicing mechanism has been extensively studied [11–14], the recognition of two fragments of split inteins, which determines the reconstitution of trans-splicing activity, is still poorly understood at the atomic level. All reported structures of inteins, including artificially fused split inteins, share a common HINT fold. However, the artificially split protein fragments from continuous inteins are prone to misfold or aggregate when expressed individually in Escherichia coli [15]. The solution behaviours of the individual halves of the naturally occurring split intein and their recognition have not been carefully studied.
To gain insight into the recognition of split inteins, we cloned and purified two halves of DnaE split intein, IN containing three extein residues and IC without extein. Our results show that, in the absence of the complementary partner, both IN and IC are unstructured under physiological conditions. The conformational changes occur upon association in an enthalpy-driven process. The inherent flexibilities with abundance of polar contacts provide an efficient approach to reconstitute protein trans-splicing.
EXPERIMENTAL
Plasmids
The gene encoding IN (amino acids 1–123) was amplified by PCR from pMEB10 with the following primers: forward, 5′-TACTTCCAATCCAATGCGTGCCTGTCTTT-3′; and reverse, 5′-TTATCCACTTCCAATGCTATTTAATTGTCCCAG-3′. The PCR product was cloned into a modified pMCSG7 vector with an N-terminal His6-GB1 (the B1 domain of protein G) tag as a SET (solubility enhancement tag) using the LIC (ligation-independent cloning) method. This construct is named pEGN for preparation of the N-terminal fragment of DnaE split intein (Figure 2A). The gene for His6-IN was amplified from pMEB10 via PCR using forward (5′- TGCGCTcatatgCACCATCATCATCACACTGCCTGTCTTT-3′, containing an NdeI site denoted in lower-case letters and a His6 tag underlined) and reverse (5′-CGgaattcCTATTTAATTGTCCCAGCGT-3′, containing an EcoRI site denoted in lower-case letters and a stop codon underlined) oligonucleotide primers. The PCR product was cloned into the NdeI–EcoRI-digested pET-21b(+) vector (Novagen). This construct is called pEN for the expression of sole IN without the SET (Figure 2B). The co-expression plasmid containing genes encoding His6-IN and IC was constructed by two-step PCR using pEN described above according to the previously published method [16]. IC and a second RBS (ribosome-binding site) were amplified with the IC PCR products as the template described above using forward (5′-TAACTTTAAGAAGGAGATATACCATGGTTAAAGTTATCGGTCGT-3′, with RBS underlined) and reverse (5′- CCGctcgagCTAGTTAGCAGCGATAGC-3′, containing an XhoI site denoted in lower-case letters and a stop codon underlined) oligonucleotide primers. The second PCR was done with the first PCR product as the template using the forward (5′-CGgaattcAAATAATTTTGTTTAACTTTAAGAAGGAGATAT-3′, containing an EcoRI site denoted in lower case letters) oligonucleotide primer and the reverse primer remained unchanged. The resulting PCR products were digested and cloned into the EcoRI–XhoI-digested pEN. 
This construct is named pENC for the co-expression of IN/IC complex (Figure 2C). The gene for IC (amino acids 2–36) was amplified from pKEC3 via PCR using forward (5′-CGAcagatgctgGTTAAAGTTATCGGTCG-3′, containing an AlwNI site denoted in lower-case letters) and reverse (5′-CCGctcgagCTAGTTAGCAGCGATAGC-3′, containing an XhoI site denoted in lower-case letters) oligonucleotide primers. The PCR product was cloned into the AlwNI–XhoI-digested pET-31b(+) vector (Novagen), a vector for high-level expression of peptides with an N-terminal KSI (ketosteroid isomerase) tag. This construct is named pEKC for preparation of the C-terminal fragment of DnaE split intein (Figure 2D). All clones were verified by DNA sequencing.
Schematic representation of the constructs used in the present study
Protein expression and purification
N-terminal fragment (IN)
Plasmid pEGN was introduced into E. coli ER2566 cells. The cells were grown in 1 litre of LB (Luria–Bertani) medium with 100 μg/ml ampicillin at 37°C until the cells reached a D600 of 0.6–0.8, then the protein expression was induced by 0.5 mM IPTG (isopropyl β-D-thiogalactoside) at 16°C for 16 h. The cells were harvested by centrifugation at 4000 g for 20 min and resuspended in buffer A (50 mM Tris/HCl, pH 8.0, 200 mM NaCl, 5 mM 2-mercaptoethanol and 10 mM imidazole), and sonicated in an ice bath. Cell lysates were centrifuged at 22324 g for 30 min at 4°C, and then the supernatant was loaded on to a column of Ni-NTA (Ni2+-nitrilotriacetate) resin (Qiagen) pre-equilibrated with buffer A. The fusion protein was eluted by buffer B (buffer A with 250 mM imidazole). The protein was concentrated and the imidazole was removed by ultrafiltration using a membrane with a 10-kDa molecular mass cut-off (Amicon). The His6-GB1 tag was digested using TEV (tobacco etch virus) protease at 16°C overnight in 50 mM Tris/HCl (pH 8.0), 100 mM NaCl, 0.5 mM EDTA and 1 mM DTT (dithiothreitol). After removal of EDTA and DTT by ultrafiltration with a 3-kDa molecular mass cut-off, the enzyme-digested products were purified again using Ni-NTA resin to remove His6-GB1 tag, TEV protease and the undigested proteins. The IN (containing three additional N-terminal residues (Ser-Asn-Ala) from TEV digestion) in the flow-through fraction was purified further by a Superdex-75 gel filtration column 16/60 (GE Healthcare) with eluent containing 20 mM Tris/HCl, pH 7.4, 100 mM NaCl, 1 mM EDTA and 5 mM 2-mercaptoethanol. The molecular mass of IN was confirmed by SDS/PAGE (15% gel) and MS. The concentration of IN was determined by UV absorption at 280 nm with the theoretical molar absorption coefficient (∊=12950 M−1·cm−1).
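The concentration determination described above follows the Beer–Lambert law, A = ε·c·l. A minimal Python sketch of that arithmetic is given here for illustration; the function name, the 1 cm path length default and the example absorbance reading are assumptions and are not values taken from the study. Only the molar absorption coefficient (12950 M−1·cm−1) comes from the text.

```python
# Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).
EPSILON_280 = 12950.0  # M^-1 cm^-1, theoretical coefficient quoted in the text


def molar_concentration(a280, epsilon=EPSILON_280, path_cm=1.0, dilution=1.0):
    """Return the protein concentration in mol/L from a blank-corrected A280.

    path_cm and dilution are illustrative parameters (assumed defaults).
    """
    if epsilon <= 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("epsilon, path_cm and dilution must be positive")
    return a280 * dilution / (epsilon * path_cm)


# Hypothetical reading, for illustration only (not a value from the study).
example_a280 = 0.259
conc_uM = molar_concentration(example_a280) * 1e6
print(f"A280 = {example_a280} corresponds to about {conc_uM:.1f} uM")
```

With this coefficient, an absorbance of 0.259 in a 1 cm cuvette would correspond to roughly 20 μM protein, the IN concentration later used in the ITC cell.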
C-terminal fragment (IC)
Plasmid pEKC was transformed into E. coli BL21 (DE3). The cells were grown in 1 litre of LB medium with 100 μg/ml ampicillin at 37°C until the cells reached a D600 of 0.6–0.8, then the protein expression was induced by 0.8 mM IPTG at 37°C for 5 h. IC was prepared using the method given in the literature [17] with the following modifications. After the CNBr digestion, the solution was diluted 1:4 (v/v) with Milli-Q water and its pH was adjusted to approximately 4.5 with ammonia to yield white precipitates containing KSI and IC. IC was released from the precipitate by Milli-Q water treatment and further purified by RP-HPLC (reverse-phase HPLC) using a C18 Kromasil column (10 mm×250 mm, 5 μm) at a flow rate of 3 ml/min with a two-step linear gradient: 0–5 min, 10–30% eluent B; 5–20 min, 30–100% eluent B, where eluent A is 0.1% TFA in Milli-Q water and eluent B is methanol. The purified IC was freeze-dried. The molecular mass of IC (containing one additional N-terminal leucine residue from CNBr cleavage) was confirmed by MS. The concentration of IC was determined by the BCA (bicinchoninic acid) assay.
Co-expression system
The co-expression plasmid pENC was transformed into ER2566 cells. The cells were grown in 1 litre of LB medium with 100 μg/ml ampicillin at 37°C. Protein expression was induced when cells reached a D600 of 0.6–0.8 with 0.1 mM IPTG at 37°C for 3 h. Cells harvested from cultures were resuspended in buffer A and sonicated in an ice bath. The IN/IC complex was isolated from the supernatant on an Ni-NTA column and further purified by a Superdex-75 gel filtration column 16/60 (GE Healthcare) as described for the purification of IN. The concentration of the IN/IC complex was also determined by UV absorption at 280 nm with the theoretical molar absorption coefficient (∊=12950 M−1·cm−1).
N-terminal fragment without SETs
For expression of IN alone, plasmid pEN was transformed into E. coli ER2566. The cells were grown in 4 ml of LB medium with 100 μg/ml ampicillin at 37°C. Protein expression was induced when cells reached a D600 of 0.6–0.8 with 0.1 mM IPTG at 37°C for 3 h. The cells were harvested in a 1.5 ml Eppendorf tube, resuspended in buffer A, and then sonicated in an ice bath. Cell lysates were centrifuged at 22324 g for 10 min at 4°C and the inclusion bodies were dissolved in 8 M urea. Both supernatant and inclusion bodies were analysed by gel electrophoresis. In order to assess the influence of IC on the expression of IN, both uninduced and induced cells harbouring the co-expression plasmid pENC were treated in the same manner as cells harbouring pEN.
15N-Labelled proteins
The uniformly 15N-labelled samples were prepared by the same procedure, except that the LB medium was replaced with M9 medium supplemented with 15NH4Cl as the sole nitrogen source. The samples of IN and the IN/IC complex were concentrated by ultrafiltration and the buffer was exchanged to NMR buffer (100 mM NaCl, 3 mM DTT and 20 mM sodium phosphate buffer at pH 7.0). The freeze-dried IC was dissolved directly in NMR buffer.
CD spectroscopy
Far-UV CD spectra were obtained with a Jasco J-810 apparatus equipped with a temperature-controlled water bath using a quartz cuvette with a 1.0 mm path length at 25°C from 190 to 260 nm in 50 mM sodium phosphate buffer (pH 7.4). Spectra were recorded with 15 μM IN with buffer solution as the blank. All measurements were repeated three times. The protein denaturation experiments were performed by adding 0–6 M GdmCl (guanidinium chloride; 1 M increments) to protein samples. CD spectra for the thermal denaturation experiments were recorded at 25, 55, 75 and 95°C. The sample was allowed to stand for 5 min at each temperature prior to the measurements.
Gel-filtration chromatography
Gel filtration was performed on an AKTA purifier liquid chromatography system using a Superdex-75 10/300 GL prepacked column (GE Healthcare), which was pre-equilibrated with 20 mM Tris/HCl containing 100 mM NaCl (pH 8.0). The column was calibrated with RNase A (13.7 kDa, 16.4 Å; where 1 Å=0.1 nm), chymotrypsinogen A (25.7 kDa, 20.9 Å), ovalbumin (44 kDa, 30.5 Å) and BSA (66.2 kDa, 35.5 Å). The results were analysed according to the method described previously [18].
DLS (dynamic light scattering)
DLS measurements were performed at 25°C on a DynaPro 99 instrument equipped with a temperature-controlled microsampler (Protein Solutions) at a laser wavelength of 824.3 nm. The purified IN was centrifuged at 22324 g for 10 min at 4°C prior to measurements. Data were recorded using 0.5 mg/ml IN in 20 mM Tris/HCl (pH 7.4) containing 100 mM NaCl, 1 mM DTT and analysed using the DYNAMICS V6.0 software from Protein Solutions.
Bioinformatic analysis
The amino acid composition of IN was analysed using the composition profiler software [19], which includes the Disprot 3.4 and PDB_Select_25 datasets. The disorder prediction of IN was performed using PONDR® software with VL-XT algorithm (http://www.pondr.com/), and then CDF (cumulative distribution function) analysis of the output of PONDR® was performed to provide the distribution of prediction scores and a linear boundary for distinguishing ordered and disordered proteins. The CH plot (charge hydropathy plot) was also done using PONDR®.
NMR spectroscopy
All 1H-15N HSQC (heteronuclear single quantum correlation) spectra were recorded at 298 K on a 500 MHz Bruker spectrometer equipped with a cryoprobe. All uniformly 15N-labelled protein samples were prepared in NMR buffer (100 mM NaCl, 3 mM DTT, 20 mM sodium phosphate buffer at pH 7.0) with 10% 2H2O. Spectra were processed and analysed using NMRPipe [20] and Sparky (T. D. Goddard and D. G. Kneller, SPARKY 3, University of California, San Francisco, CA, U.S.A.).
Limited proteolysis
Limited proteolysis was carried out on IN and the co-expressed IN/IC complex at 25°C with trypsin in 20 mM Tris/HCl, 1 mM DTT, 0.5 mM EDTA and 100 mM NaCl (pH 8.0). The enzyme to substrate ratio was 1:1000 (w/w). Aliquots were taken from the proteolysis solution at various reaction times (0, 2, 4, 8, 16, 32 and 64 min), and the reaction was stopped by adding SDS loading buffer and boiling for 5 min. Cleavage products were resolved by Tris-tricine gel electrophoresis (15% gel) and visualized by staining with Coomassie Brilliant Blue R250.
ITC (isothermal titration calorimetry)
ITC experiments were performed on a Microcal ITC 200 (GE Healthcare) by titrating 150 μM IC into 20 μM IN at 25°C in 20 mM sodium phosphate buffer (pH 7.4) and 100 mM NaCl. The heat of dilution was measured by titrating IC into blank buffer, and the net binding heat was obtained by subtracting the dilution heat from the apparent reaction heat. Data were analysed using Origin ITC software.
RESULTS AND DISCUSSION
Expression of IN and IC
To study the interaction of two DnaE split intein fragments (IN and IC) and the properties of these fragments in solution, we first constructed a plasmid to produce IN with an N-terminal His6 tag, but this fusion protein was expressed mainly in insoluble inclusion bodies at 16°C in E. coli. To improve high-level expression of soluble IN, we next constructed a plasmid to produce IN with an N-terminal His-tagged GB1 (6.2 kDa) as a SET and a TEV protease recognition site between GB1 and IN. The fusion protein (His6-GB1-IN) was purified by Ni-NTA affinity chromatography from supernatant and the fusion tag was removed by digestion with TEV protease. After protease digestion, IN was purified further by Ni-NTA affinity chromatography, followed by gel filtration. The purified IN was readily degraded at room temperature (25°C) by residual protease contaminants in the protein preparation. The degradation can be effectively inhibited with 2 mM protease inhibitor PMSF (results not shown). Therefore freshly prepared IN was used in the present study.
IC was prepared using the pET-31b(+) vector (Novagen) for the high-level expression of peptide. The protein was expressed with an insoluble expression tag (KSI tag) at the N-terminus in order to protect the peptide from proteolytic degradation in vivo. The KSI-IC fusion was extracted from inclusion bodies using 8 M urea. The fusion protein was precipitated after removing urea by dialysis against ultra-pure water, and then freeze-dried. The KSI tag was removed by CNBr digestion in 88% formic acid. IC was purified using RP-HPLC. The identities of both IN and IC were confirmed by MS.
Isolated fragments of split intein are disordered
Based on the intein structures solved so far, all inteins possess the β-rich HINT fold, which brings the N- and C-terminal splicing junctions close in space. Such a three-dimensional arrangement must play a crucial role in protein splicing. CD spectroscopy is the most commonly used method to determine the extent of secondary structure, with the spectroscopic signatures of α-helices (negative bands at 208 and 222 nm) and β-sheets (a negative minimum at 218 nm and a positive maximum at 195 nm), whereas random coils show a minimum near 198 nm. CD spectra showed that IN possesses a nearly unstructured conformation, characterized by a single negative minimum at 203 nm (Figure 3A, black curve). As the artificially fused Ssp DnaE intein showed a well-ordered structure, this observation suggests that the short peptide sequence of IC plays an important role in assisting the folding of IN. Meanwhile, the shoulder at approximately 220–230 nm in the CD spectra of IN indicates that this disordered protein contains some residual secondary structure [21]. The chemical denaturation experiments confirmed the presence of the residual secondary structure, which was gradually lost with the addition of GdmCl (Figure 3B). It has been proposed that the residual structure in unstructured proteins or protein domains serves as a primary contact site and guides correct protein folding by limiting the conformational space in protein–protein recognition [22]. Therefore the residual structure in IN may be helpful in the association of the two halves of DnaE split intein.
IN is intrinsically disordered in vitro
To test the stability of the residual secondary structure in IN, the CD spectra were recorded from 25 to 95°C. Interestingly, the CD spectra of IN exhibit very little change over the temperature range, in which the ellipticity (mDeg) at 222 nm slightly decreases with temperature increase (Figure 3A). The temperature-independent CD spectra are consistent with IDPs (intrinsically disordered proteins) [23]. Disordered proteins typically exhibit a larger hydrodynamic radius (Stokes radius) than globular proteins of the same molecular mass [24]. To test whether this is true for IN, the hydrodynamic radius of IN was first measured by gel filtration. The results showed that IN (14.2 kDa) has a larger hydrodynamic radius (24 Å) than that expected for a globular protein of the same molecular mass {19 Å, calculated by the equation: log(RNS)=−0.204+0.357×log M, where RNS is the expected Stokes radius (in Å) of a globular protein of molecular mass M (in Da) [23]} (Figure 3C). Meanwhile, the hydrodynamic radius of IN determined by DLS gave a very similar value (22 Å) to the gel filtration result (Figure 3D). Furthermore, gel electrophoresis showed that IN remains soluble during boiling rather than forming insoluble aggregates, which also supports the natively unfolded state of IN (Figure 3E) [24].
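The comparison between measured and expected Stokes radii can be reproduced with a few lines of Python. The sketch below simply evaluates the empirical equation quoted above for a 14.2 kDa protein and compares it with the two measured values; the function and variable names are illustrative assumptions, and the mass is taken in Da with the radius in Å, which is the unit choice that reproduces the 19 Å quoted in the text.

```python
import math


def globular_stokes_radius(mass_da):
    """Expected Stokes radius (in angstroms) of a globular protein.

    Evaluates log10(R) = -0.204 + 0.357 * log10(M), the empirical
    relation quoted in the text, with M in Da.
    """
    if mass_da <= 0:
        raise ValueError("mass_da must be positive")
    return 10 ** (-0.204 + 0.357 * math.log10(mass_da))


MASS_IN_DA = 14200.0      # molecular mass of IN, from the text
R_GEL_FILTRATION = 24.0   # angstroms, measured by gel filtration (text)
R_DLS = 22.0              # angstroms, measured by DLS (text)

expected = globular_stokes_radius(MASS_IN_DA)
ratio_gf = R_GEL_FILTRATION / expected
ratio_dls = R_DLS / expected

print(f"expected globular radius: {expected:.1f} A")
print(f"gel filtration / expected: {ratio_gf:.2f}")
print(f"DLS / expected: {ratio_dls:.2f}")
```

Both measured radii exceed the globular expectation, by roughly a quarter for gel filtration and somewhat less for DLS, which is the expansion the paragraph above attributes to disorder.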
For both IN and IC, two-dimensional 1H-15N HSQC NMR spectra were recorded on the uniformly 15N-labelled protein samples. The spectra showed very poor dispersion for both IN (Figure 4A, blue) and IC (Figure 4B, blue) in the isolated form, which demonstrates the disordered state of these two fragments. Taken together, these results suggest that both IN and IC are disordered when separated.
Both IN and IC undergo disorder-to-order transitions upon the association with each other
It has been reported that IDPs typically have more disorder-promoting residues and fewer order-promoting residues, based on bioinformatics analysis [25]. However, the amino acid composition analysis showed that IN does not belong to the class of IDPs compared with the Disprot 3.4 dataset (Figure 5A). This result suggested that IN has the potential to adopt an ordered structure. The intrinsic disorder propensity of IN was further analysed by a neural network-based predictor PONDR® (http://www.pondr.com/) using the VL-XT algorithm with default parameters. The CDF analysis of the output of PONDR® has been used to provide the distribution of prediction scores and a linear boundary for distinguishing ordered and disordered proteins [26]. The CDF curve showed that IN is located in the range of ordered proteins (Figure 5B). In addition, the CH plot, a plot of the normalized Kyte–Doolittle hydropathy against the mean net charge to classify ordered and disordered proteins [27], also located IN in the ordered region (Figure 5C). These theoretical predictions consistently indicate that IN has strong potential to form an ordered structure, although it is disordered experimentally. Therefore these results suggest that the disordered IN is ready to fold into a well-structured protein upon association with its complementary partner.
Bioinformatic analysis of IN
Folding upon interaction
Although both IN and IC exhibit unfolded states in the isolated form, the X-ray crystal structure indicates that the artificially fused DnaE intein possesses a HINT structure. This fold is shared by all reported structures of inteins, so that it is very likely also present in split inteins after the association of two fragments, although no structures of split intein complex have been reported. Disordered proteins are usually involved in folding upon binding [28]. To verify the conformational change of IN and IC after association, 1H-15N HSQC spectra were recorded on 15N-labelled IN in the absence and presence of non-labelled IC. The spectrum of IN shows a significant improvement in signal dispersion after adding IC (Figure 4A, red). A similar dispersion improvement was also observed on the 15N-labelled IC upon adding IN (Figure 4B, red). This result clearly indicates the disorder-to-order transition of IN upon binding to IC, and vice versa. Therefore, it can be concluded that the disordered structure of the isolated fragments of IN and IC is due to the split of the continuous DnaE intein, and the mutual synergistic folding occurs during the association of IN and IC, which is a vital step for protein trans-splicing.
Limited proteolysis
It has been reported that disordered proteins are more easily digested by proteases because the residues are largely exposed to solvent. In contrast, ordered proteins are more resistant to proteases as the compact structures protect cleavage sites from protease access [29]. Therefore limited proteolysis can be used to probe the conformational characteristics of proteins. Here, the protease sensitivities of IN and the IN/IC complex were examined using trypsin. Tris-tricine gel electrophoresis (15% gel) was used to monitor the time course of protease digestion. The results show that IN is prone to digestion as an individual fragment (Figure 6, upper panel); however, the protein gained great protease resistance in the co-expressed binary complex (Figure 6, lower panel). This result further confirmed that the disordered IN forms a folded three-dimensional structure in the binary complex.
Limited proteolysis of purified IN (upper panel) and IN/IC complex (lower panel) with trypsin
IC assists the folding of IN in vivo
We next examined the association of the two halves of DnaE intein in vivo using a co-expression system. The plasmid carrying two different genes, encoding IN and IC with two RBSs, was constructed according to the approach reported previously [16]. By transformation of the plasmid into the E. coli ER2566 strain, IN and IC were co-expressed simultaneously. In contrast with the expression of IN alone, which mainly yielded the protein in insoluble inclusion bodies in E. coli, the co-expression system produced the soluble IN/IC complex in the supernatant (Figure 7A). The formation of insoluble IN is likely attributable to the disordered nature of IN, whereas the co-expression system generated the well-structured protein complex in the supernatant. It is well known that newly synthesized proteins can escape misfolding and aggregation with the help of folding modulators, such as molecular chaperones [30–32]. Thus, the co-expression data demonstrate that the expressed IC can associate with expressed IN within E. coli and assist the folding of IN in vivo.
Characterization of the IN/IC complex produced by co-expression
To characterize further the IN/IC complex, the co-expression product was isolated using Ni-NTA affinity chromatography, and further purified by gel-filtration chromatography. The collected fractions were subjected to Tricine-SDS/PAGE (15% gel) and two protein bands corresponding to IN and IC were observed (Figure 7B). This result indicates that the IN/IC complex is highly stable during the chromatography processes. To verify the conformation of the binary complex, the 1H-15N HSQC NMR spectrum was recorded on the 15N-labelled sample from the co-expression system. The well-dispersed backbone amide resonances provide solid evidence that the IN/IC complex is well folded (Figure 7C). Gel filtration clearly showed that the folded IN/IC complex has a smaller Stokes radius than disordered IN (Figure 7D). This result reveals that the two fragments of split inteins can recognize and associate with each other in vivo to form a well-defined three-dimensional structure, which is essential for splicing activity. Similar results have also been observed in the split DnaE intein from Nostoc punctiforme (results not shown).
Thermodynamics of interaction
The thermodynamic parameters of the interaction between the two fragments of split intein were quantitatively measured by ITC. The titration of IC into IN was an exothermic process, illustrated by negative peaks in Figure 8. A non-linear best-fit binding isotherm yielded a dissociation constant (Kd) of 33 nM (Figure 8). This datum agrees with the previously reported result from the interaction of extein-containing IN and IC fragments [GFP (green fluorescent protein) as N-extein and organic fluorophore Texas Red as C-extein] measured by fluorescence [9]. The thermodynamic parameters of enthalpy change (ΔH=−29.1 kcal/mol) and entropy change (TΔS=−18.9 kcal/mol) indicate that the binding is driven by enthalpy with a very unfavourable entropic cost. This phenomenon is known as the enthalpy–entropy compensation [33], which is often associated with folding upon binding. The favourable enthalpy change arises from a large number of polar contacts between IN and IC. The unfavourable entropy change is likely due to the loss of conformational freedom caused by the disorder-to-order transition during the association [34].
ITC profile of the titration of 150 μM IC into 20 μM IN
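The thermodynamic parameters reported above can be cross-checked against each other, because the binding free energy follows both from the dissociation constant (ΔG = RT·ln Kd, with a 1 M standard state) and from ΔG = ΔH − TΔS. The following Python sketch is given for illustration only; the variable names and the choice of gas constant units are assumptions, while Kd, ΔH, TΔS and the temperature (25°C) are the values stated in the text.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15      # 25 degrees C, the ITC temperature stated in the text

KD_MOLAR = 33e-9       # dissociation constant from the ITC fit (text)
DELTA_H = -29.1        # kcal/mol, enthalpy change (text)
T_DELTA_S = -18.9      # kcal/mol, entropy term T*dS (text)

# Free energy of binding from the dissociation constant (1 M standard state).
dg_from_kd = R_KCAL * T_KELVIN * math.log(KD_MOLAR)

# Free energy of binding from the enthalpy and entropy terms.
dg_from_hs = DELTA_H - T_DELTA_S

# Dissociation constant implied by the enthalpy and entropy terms.
kd_back = math.exp(dg_from_hs / (R_KCAL * T_KELVIN))

print(f"dG from Kd:        {dg_from_kd:.1f} kcal/mol")
print(f"dG from dH - T*dS: {dg_from_hs:.1f} kcal/mol")
print(f"Kd implied by dH and T*dS: {kd_back * 1e9:.0f} nM")
```

Both routes give a binding free energy of about −10.2 kcal/mol, so the favourable enthalpy is offset by roughly two-thirds by the entropic penalty, which is consistent with the enthalpy-driven folding upon binding described above.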
It is well known that a folded structure is a prerequisite for most proteins to perform their biological functions. However, evidence accumulated over the last decade has shown that disordered proteins are involved in many cellular processes, such as transcriptional regulation and signal transduction [35]. Moreover, disordered proteins can undergo a conformational transition from disorder to a well-folded structure upon binding to their partners and then carry out their specific biological functions [29,36–38], and the availability of disordered proteins is tightly controlled in cells [39]. Protein recognition and association can proceed through different mechanisms, including lock-and-key, induced-fit, conformational-selection and fly-casting. The fly-casting mechanism suggests that disordered proteins provide a greater capture radius for speeding up association, followed by protein folding [40]. This mechanism may be applicable to the association of the two split intein halves, which results in efficient binding and reconstitution of protein splicing activity.
Conclusions
In summary, the results of the present study reveal that both halves of DnaE split intein are unfolded in isolation and undergo large conformational changes upon association. The mutual synergistic folding is characterized by a highly favourable binding enthalpy change and a highly unfavourable entropy change. The disordered nature of the split intein halves and the polar interactions between them pave the way for reconstituting protein trans-splicing activity. This work provides insights into the mutual recognition of the two halves of split inteins, which is the first crucial step for initiating protein trans-splicing.
Abbreviations
- CDF, cumulative distribution function
- CH plot, charge hydropathy plot
- DLS, dynamic light scattering
- DTT, dithiothreitol
- GB1, B1 domain of protein G
- GdmCl, guanidinium chloride
- HINT, hedgehog/intein
- HSQC, heteronuclear single quantum correlation
- IDP, intrinsically disordered protein
- ITC, isothermal titration calorimetry
- IPTG, isopropyl β-D-thiogalactoside
- KSI, ketosteroid isomerase
- LB, Luria–Bertani
- Ni-NTA, Ni2+-nitrilotriacetate
- RBS, ribosome-binding site
- SET, solubility enhancement tag
- TEV, tobacco etch virus
AUTHOR CONTRIBUTION
Yuchuan Zheng and Yangzhong Liu designed the research and co-wrote the paper, Yuchuan Zheng and Qin Wu performed the research, and Chunyu Wang and Min-qun Xu provided technical support.
FUNDING
This work was supported by the National Science Foundation of China [grant number 20873135], the National Basic Research Program of China [973 Program, grant number 2009CB918804], the China Postdoctoral Science Foundation [grant number 2012M511408], and the National Institutes of Health [grant number GM081408 (to C.W.)].