Health is fundamental for the development of individuals and the evolution of species. In that sense, it is relevant for human societies to understand how the human body has developed molecular strategies to maintain health. In the present review, we summarize diverse evidence that supports the role of peptides in this endeavor. Of particular interest to the present review are antimicrobial peptides (AMP) and cell-penetrating peptides (CPP). Different lines of experimental evidence indicate that AMP/CPP are able to regulate autophagy, which in turn regulates the immune system response. AMP also assist in the establishment of the microbiota, which in turn is critical for different behavioral and health aspects of humans. Thus, AMP and CPP are multifunctional peptides that regulate two aspects of our bodies that are fundamental to our health: autophagy and microbiota. While the multifunctional nature of these peptides is now clear, we are still in the early stages of developing computational strategies to assist experimentalists in identifying selective multifunctional AMP/CPP to control nonhealthy conditions. For instance, both AMP and CPP are computationally characterized as amphipathic and cationic, yet neither feature differentiates these peptides from non-AMP or non-CPP. The present review aims to highlight current knowledge that may facilitate the development of AMP design tools for preventing or treating illness.
During the history of drug development, scientists have explored the use of small molecules (e.g., penicillin, metformin), large molecules (e.g., antibodies, vaccines), and more recently mid-size molecules; for instance, in recent years, the FDA approved for the first time a therapy based on antisense oligonucleotides, and more peptides are being approved for pharmaceutical purposes. The molecular size of pharmaceutical drugs is relevant for different reasons. For instance, small molecules may not have the size to cover the area required for providing specificity in some cases [3,4], whereas peptides may be more expensive, yet more effective in advanced human trials. In that sense, mid-size molecules such as oligonucleotides or peptides are a new frontier in pharmaceutics that provides an intermediate solution to these problems; that is, they may be large enough to cover the areas needed for specificity, yet not so large that their synthesis becomes too expensive. In the present review, we will focus on analyzing potential therapeutic uses of peptides. In particular, we will focus on two classes of peptides: antimicrobial peptides (AMP) and cell-penetrating peptides (CPP).
Therapeutic drugs aim to treat any condition that affects health. The concept of health has been reviewed recently, and it has been proposed that health does not represent a ‘normal’ or ‘appropriate’ state of humans or living organisms. Instead, health is defined as the capacity of living organisms for adaptive variation; consequently, disease or illness should be viewed as a reduction in such capacity. From that perspective, a healthy organism should maintain homeostasis and allostasis. Homeostasis is considered a process that maintains the cellular processes of a living form without change, while allostasis refers to the stability achieved through change of its components to constantly adapt to the changing environment. These concepts are well aligned with two emergent cellular mechanisms that are now recognized as fundamental to health: autophagy and microbiota. Autophagy is a catabolic process that enables cells to maintain their functionality by degrading components that no longer work or anything that impairs cell function, while the microbiota comprises the microorganisms that live within animal tissues and help them perform their tasks. As we will review in the present work, AMP with CPP activity are able to control both autophagy and the microbiome, constituting a new class of biologically active compounds that may endow animals with the capacity to maintain allostasis and/or homeostasis, and hence promote healthy states either to treat an illness or to prevent it (prophylaxis). The idea that prevention is better than treatment of diseases has been thoroughly analyzed, and while it is true that treatment is preferred when prevention is too expensive, there are cases where humans, animals, and insects display behaviors that favor prevention over treatment.
For instance, humans have learned over many years the importance of maintaining clean spaces to prevent diseases; termites, which need to share their nest with many bacteria, have developed social strategies to control the spread of lethal fungal infections. If these organisms have adapted their behavior to prevent diseases, is it possible that during evolution some molecular mechanisms have been selected for preventing diseases? Here, we review previously published results to support the idea that if disease prevention can be genetically inherited, this may have occurred through AMP with CPP activity. We are still at an early stage in designing these peptides to prevent or treat complex traits such as gut microbiome dysbiosis; hence, it is an appropriate time to highlight where we stand and what to look for in the near future.
AMP structure–activity relationship
Peptide structure may be described in different ways. For instance, peptides may be described by their amino acid sequence (primary structure), by the content of structural patterns such as β-sheets or α-helices (secondary structure), by the arrangement of the peptide atoms in three-dimensional space (tertiary structure), by the composition of chemical groups (e.g., carboxylates, primary amines), by chemical properties (e.g., polarity, isoelectric point), or by other chemical descriptors (e.g., electrotopological states, entropy). There are different suites of computer programs that calculate such structure descriptors of proteins and peptides [11–14]. On the other hand, the function of proteins and peptides is annotated in a systematic fashion through the Gene Ontology consortium; in that effort, three aspects of peptide function are recognized: molecular function (the activity performed by the peptide without considering where or in what context the action takes place), cellular component (the cellular location where the peptide resides), and biological process (the cellular program accomplished by multiple activities). Since both peptide structure and peptide function are observed from the same molecule, these two observations should be related. However, the form of such a relationship is not yet clear. Recognizing the current limitation in relating protein structure and function, an international experiment has been conducted over the past decade to improve the formalization of the structure–function relationship of proteins; the experiment is referred to as the Critical Assessment of Function Annotation algorithms (CAFA) and it is currently in its fourth edition (CAFA4).
The most recent report, from CAFA3, showed that simple sequence comparison does not improve machine-learning (ML) methods that relate structure and function in proteins, and that the best ML methods predict cellular component (Fmax < 0.7), molecular function (Fmax < 0.7), and biological process (Fmax < 0.5) in that order of performance; furthermore, only small improvements have been observed in the performance of these methods despite the increasing amount of data they include, suggesting that the descriptors/methods currently used for that goal can be improved.
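For readers less familiar with the Fmax values quoted above, the following is a minimal Python sketch of a protein-centric Fmax. It assumes the usual definition: the maximum, over score thresholds, of the harmonic mean of precision (averaged over proteins with at least one prediction above the threshold) and recall (averaged over all benchmark proteins). The function name, the dictionary layout, and the threshold grid are illustrative assumptions, and the sketch ignores the propagation of annotations to ancestor terms in the ontology.

```python
from typing import Dict, List, Optional, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         thresholds: Optional[List[float]] = None) -> float:
    """Best F-measure over score thresholds (protein-centric evaluation).

    predictions: protein -> {term: confidence score in [0, 1]}
    truth:       protein -> non-empty set of experimentally supported terms
    """
    if thresholds is None:
        thresholds = [t / 100 for t in range(1, 101)]  # assumed grid: 0.01 .. 1.00
    best = 0.0
    for tau in thresholds:
        precisions = []      # only proteins with at least one prediction >= tau
        recall_sum = 0.0     # accumulated over every benchmark protein
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, score in scored.items() if score >= tau}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Under this definition, a method that annotates only a few proteins confidently can reach high precision but is penalized in recall, which is one reason why Fmax values below 0.7 are compatible with many individually correct predictions.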
The annotation of peptides as AMP refers to the molecular function according to the Gene Ontology consortium (see above). AMP are produced by unicellular and multicellular organisms; they constitute a line of defense against competing organisms such as bacteria, fungi, or viruses. But these peptides also have the capacity to regulate the immune system, control angiogenesis and wound healing, and prevent tumor growth; such functions correspond to biological processes. Based on the CAFA experience, we may then anticipate that AMP activity should be predicted better than angiogenesis or immunomodulation, unless these biological processes are associated by a common mechanism and consequently may be related by common structural/chemical descriptors.
In that sense, the reported mechanisms triggered by AMP associated with these activities are diverse, yet some similarities may be recognized. For instance, AMP may kill bacteria by creating membrane pores. This mechanism of action has been reported for cationic and amphipathic AMP interacting with bacterial membranes, which possess many negatively charged groups. This mechanism has also been reported to explain the antitumor activity of AMP, since in tumor cells the negatively charged phosphatidylserine, heparan sulfate, O-glycosylated mucins, and sialylated gangliosides are exposed on the outside of the cell membrane; this membrane composition, in combination with an increased transmembrane potential and membrane fluidity, makes these cells susceptible to the pore-forming mechanism of the positively charged AMP.
On the other hand, regulation of the immune system and wound healing by AMP is accomplished by mechanisms other than membrane pore formation. AMP are known to bind chemokine receptors and through these induce pro- or anti-inflammatory effects. The ability of AMP to bind targets other than membranes has been recognized as an alternative way to kill bacterial cells; for instance, AMP are known to bind ATP [24,25], DNA [26,27], and lipid II. In all these cases, the AMP-interacting molecule is negatively charged; hence, it is possible that positively charged AMP may bind to proteins that possess a negatively charged surface. Indeed, several AMP displaying antiviral activity were reported to bind glycoproteins, and such binding depended on the abundance of basic amino acids. Whether such a pattern is also observed in AMP–chemokine receptor interactions requires experimental evidence that is not currently available.
If the antimicrobial activity of AMP depends on interacting with DNA for instance, AMP should be able to penetrate cells. In fact, accumulating evidence shows that many AMP exert their antimicrobial activity by interacting with intracellular targets. For instance, several AMP are known to inhibit protein synthesis by interacting with the ribosome [31,32] or by regulating metabolic enzymes [33,34], among other targets [35,36].
Since AMP have multiple activities in vivo, these peptides should balance their ability to penetrate cells with their ability to interact with different targets, either outside cells to form pores or inside cells, to ultimately activate different biological processes. It is expected then that different AMP may have tuned their multiple functions to act specifically in different cellular contexts. Which structural features of peptides matter, and how they relate to the different AMP functions, is an important aspect of the structure–function relationship models built to predict AMP activity, and a fundamental problem in molecular biology for other polymers such as proteins, as we will describe next. Of particular interest to our review is the relationship between cell-penetrating activity and antimicrobial activity.
Modeling AMP activity from peptide structure
As noted above, there is an overlap between the different activities displayed by AMP that are associated with some structural properties of these peptides, particularly their net charge. But a single structural feature is unlikely to efficiently discriminate AMP from non-AMP and from the other activities displayed by these peptides. Here, we will review the most efficient methods to classify AMP and the structural features that relate most strongly to such activity. Later on, we will perform a similar analysis for CPP in order to compare AMP and CPP structural features.
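To make the two classical descriptors concrete, the sketch below computes a crude net charge and a helical hydrophobic moment (a common proxy for amphipathicity) from a peptide sequence. Several choices are assumptions made for illustration: the charge counts only Lys/Arg and Asp/Glu side chains (histidine and the termini are ignored), the hydrophobicity values are the Kyte-Doolittle scale, and an ideal α-helix with 100° per residue is assumed. The function names are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; other scales may be used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq: str) -> int:
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def hydrophobic_moment(seq: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment assuming a regular helix (100 deg per residue)."""
    sin_sum = 0.0
    cos_sum = 0.0
    for i, aa in enumerate(seq):
        theta = math.radians(i * angle_deg)
        sin_sum += KYTE_DOOLITTLE[aa] * math.sin(theta)
        cos_sum += KYTE_DOOLITTLE[aa] * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(seq)


# LL-37, the human cathelicidin discussed later in this review.
LL37 = "LLGDFFRKSKEKIGKEFKRIVQRIKDFLRNLVPRTES"
print(net_charge(LL37), round(hydrophobic_moment(LL37), 2))
```

Both values are cheap to compute, which explains their popularity, but as argued above they are shared by many AMP and CPP and are therefore unlikely to separate these classes on their own.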
Predictors of the different activities of AMPs are based on two main ML approaches: shallow- and deep-learning models. Shallow-learning models [37–44] require computing a set of structural descriptors to represent each peptide. The best set of predictors to date is based on the Random Forest (RF) method. The success of the method rests on the dataset used and on the modeling approach. The dataset comprises the largest collection of experimentally validated AMPs: 22642 peptide sequences from the StarPepDB database. The training dataset is selected after a clustering process by sampling, at random, 80% of the sequences in each resulting cluster for training and 20% for testing. The same selection procedure is followed for the positive (true AMP) and negative (non-AMP) instances.
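The per-cluster 80/20 sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the cited work: the clusters are assumed to be already available as a mapping from a cluster identifier to its member sequences, and the function name, the rounding rule, and the seed are hypothetical choices.

```python
import random
from typing import Dict, List, Tuple


def split_by_cluster(clusters: Dict[str, List[str]],
                     train_frac: float = 0.8,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Sample train_frac of every cluster for training, the rest for testing."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    train: List[str] = []
    test: List[str] = []
    for _, members in sorted(clusters.items()):
        shuffled = members[:]          # copy so the input is not modified
        rng.shuffle(shuffled)
        cut = round(train_frac * len(shuffled))
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test
```

Applying the same function separately to the positive and the negative clusters keeps every cluster represented in both partitions, so the test set covers the same regions of sequence space as the training set.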
To construct their set of non-AMP sequences, the authors of that work downloaded from the UniProt database (v2019_08) a total of 561046 peptides with no evidence of antimicrobial activity through the following two queries: (Golgi OR cytoplasm OR ‘endoplasmic reticulum’ OR mitochondria) AND NOT antimicrobial AND length: (5–100), and NOT antimicrobial AND reviewed: YES. Peptides with non-natural amino acids were excluded. Using the CD-HIT program, sequences are sorted by length and then alphabetically, and the sorted sequences are grouped by percentage of identity (50% or more). Each sequence is compared with the first sequence in the sorted list (the cluster’s representative); once the identity falls below 50%, a new cluster is created, and the process is repeated until all sequences have been considered. Finally, from the remaining set of non-AMPs, several peptide fragments were randomly selected to be part of the antibacterial, antifungal, antiparasitic, and antiviral sets, with a one-to-one ratio of positive (AMP) to negative (non-AMP) sequences. Similar queries have been used in other works to construct non-AMP sequences derived from UniProt [38,40,47,49]. The two queries were the same as those proposed by Gabere and Noble to carry out their benchmarking study on AMP binary classifiers.
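The greedy, representative-based grouping described above can be illustrated with a short Python sketch. It is not CD-HIT itself: the identity function below is a crude stand-in built on the standard-library `difflib.SequenceMatcher` (matched residues divided by the length of the shorter sequence), whereas CD-HIT uses its own word filtering and alignment. The function names and the longest-first ordering are assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import List


def identity(a: str, b: str) -> float:
    """Rough identity: matched residues / length of the shorter sequence."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(seqs: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy incremental clustering; the first member is the representative."""
    # Sort by length (longest first, an assumption) and then alphabetically.
    ordered = sorted(set(seqs), key=lambda s: (-len(s), s))
    clusters: List[List[str]] = []
    for seq in ordered:
        for cluster in clusters:
            if identity(cluster[0], seq) >= threshold:
                cluster.append(seq)     # close enough to this representative
                break
        else:
            clusters.append([seq])      # no match: seq founds a new cluster
    return clusters
```

Because every sequence is compared only with cluster representatives, the number of comparisons stays far below all-against-all, which is what makes this family of methods practical for hundreds of thousands of sequences.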
The modeling part uses an ensemble of feature subsets derived from six feature selection algorithms. From the set of resulting features, a second round of feature selection is performed using a wrapper method based on RF as the induction algorithm and a genetic algorithm (GA) as the optimizer. The fitness function for the GA is the classification accuracy, and the wrapper is implemented in the Weka framework.
Formally, the structure–function relationship can be written as a = g(s), where g(s) is a mathematical function of the peptide structure, s ∈ S, that is related to the biological function (the label of s) of the peptide, a ∈ A. Here, S is the universe of peptides and A is the set of biological functions. Thus, given a set of examples, S′ ⊆ S, an ML method will try to learn a function f(s) that approximates g(s); the function f(s) uses the set of structural features described above. In consequence, the success of any ML method relies on a careful and systematic selection of the training dataset, followed by an optimal selection of structural features for each activity to be modeled. A relevance analysis of the structural features of AMP revealed that some of them, computed by the ProtDCal software, are repeatedly selected within the top five to discriminate AMPs and antibacterial, antifungal, antiviral, and antiparasitic activities. These structural descriptors are of the physicochemical type and include indices of heat of formation (DHf) and isotropic surface area (ISA), as well as the index of the interfacial free energy of an unfolded state [Gs(U)]. Deep-learning models [54–58] produce competitive results; however, they do not shed light on which descriptors are relevant; hence, we will not attempt to interpret the structural features used by those methods in the present review. Notice, however, that there is a research area aimed at explaining which features a deep network recognizes for discriminating the classes; the field is known as eXplainable AI (XAI).
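The wrapper idea mentioned above (a genetic algorithm searching over feature subsets, each subset scored by the accuracy of a classifier trained on it) can be sketched in plain Python. This is only an illustration under stated assumptions: a nearest-centroid classifier is swapped in for the Random Forest to keep the example dependency-free, accuracy is measured on a single held-out validation set, and all names and GA settings (population size, tournament size, mutation rate) are hypothetical rather than those of the Weka implementation.

```python
import random
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, class label)


def accuracy(mask: List[bool], train: List[Sample], valid: List[Sample]) -> float:
    """Validation accuracy of a nearest-centroid classifier on masked features."""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in {lab for _, lab in train}:
        rows = [x for x, lab in train if lab == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    correct = 0
    for x, lab in valid:
        point = [x[c] for c in cols]
        guess = min(centroids, key=lambda k: sum(
            (p - q) ** 2 for p, q in zip(point, centroids[k])))
        correct += guess == lab
    return correct / len(valid)


def ga_select(train: List[Sample], valid: List[Sample], n_features: int,
              pop_size: int = 20, generations: int = 30,
              p_mut: float = 0.1, seed: int = 0) -> Tuple[List[bool], float]:
    """Search feature masks with a GA whose fitness is classification accuracy."""
    rng = random.Random(seed)

    def fitness(mask: List[bool]) -> float:
        return accuracy(mask, train, valid)

    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = pop[:2]                                   # elitism: keep the two best
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)    # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_features)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    return best, fitness(best)
```

In a realistic setting the fitness would be estimated by cross-validation rather than a single split, since a wrapper optimized on one validation set can overfit to it.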
CPP structure–activity relationship
CPP annotation corresponds to a molecular function according to the Gene Ontology consortium (see above); hence, its classification may be efficiently learned by current function annotation methods (see below). CPP are peptides capable of internalizing into different cell types; they are either found in nature or purposely designed. Among the first CPP described was the homeodomain of Antennapedia, a 60-amino acid residue peptide with the ability to bind DNA that was later discovered to translocate across neuronal membranes to reach the nuclei of cells; as noted above, AMP are also able to bind DNA, presumably because of charge complementarity. With the accumulation of sequences with AMP and CPP activities, it was noticed that there were structural similarities between AMP and CPP, and that such similarities were imprinted in their sequences; that is, many AMP and CPP are cationic and amphipathic, and some CPP display AMP activity and vice versa. Such dual activity was shown to depend on the target rather than on the peptide; hence, it is expected that some CPP may show this dual activity when the targets (microbes) are close by. Thus, CPP share both structural and functional similarities with AMP: some CPP and AMP are able to penetrate cell membranes, and some are able to bind DNA and display CPP or AMP activities; yet, these two classes of peptides are presumably not identical. This premise needs further testing, as we will summarize next.
CPP are currently studied as vectors to deliver molecules inside cells; such deliverable molecules may be small drugs, siRNA, AMP, or probes to diagnose a disease. CPP may penetrate cells through different mechanisms, namely: (i) endocytosis-mediated mechanisms or (ii) energy-independent mechanisms. Both of these mechanisms are also found in AMP. For instance, it has been noted that the AMP named CGA-N12 kills fungal cells after being internalized through an energy-dependent mechanism via endocytosis; alternatively, PAF26 is able to kill fungal cells at low concentrations using an energy-dependent mechanism via endocytosis, whereas at high fungicidal concentrations it is internalized through an energy-independent mechanism. But a combination of both mechanisms may also take place in some cases. For instance, if the CPP includes a peptide sequence recognized by a receptor, the peptide may be internalized through receptor-mediated endocytosis, yet sometimes the internalization, although it may use the receptor, does not necessarily involve endocytosis. Additionally, since some AMP are able to bind to external targets (e.g., receptors) in addition to their ability to penetrate cells, this dual functionality may also be present among CPP. To test this idea, cells expressing the receptors known to bind AMP should be analyzed. To the best of our knowledge, there are no reported CPP–receptor interactions, although there are many reports of CPP fused to peptide sequences recognized by receptors [72,74]. This fundamental question about the mechanism of action of CPP challenges the notion that CPP are only able to translocate into cells.
Indeed, while CPP were initially considered innocuous to living cells, it has recently been recognized that some CPP induce autophagy. This activity is also shared with some AMP [77,78]. It has been proposed that the antibacterial activity and autophagy induction of AMP are the consequence of their ATP-binding activity, which may also apply to CPP considering the cationic character of both AMP and CPP. Understanding which targets of AMP are responsible for their toxic activity will help differentiate AMP from CPP, and consequently improve the current designs of AMP and CPP.
Modeling CPP activity from peptide structure
CPP predictors can be classified as direct (or phenomenological) and ML-based approaches. In the first category, the methods compute physicochemical characteristics that are known to be related to the cell-penetration property of peptides, for instance, Z-descriptors and the number of heavy atoms, among others. Here, a peptide is labeled as CPP if the computed physicochemical characteristics fall within specified intervals computed from a set of experimentally validated CPP. Examples in this group have been reviewed elsewhere [79,80]. In the second category, the methods follow a standard ML approach, where a set of features is calculated, a feature selection algorithm is applied, and an induction algorithm is then trained on a set of positive and negative samples. In this category, Dobchev et al. were the first to propose an artificial neural network (ANN), followed by Sanders et al., who proposed a support vector machine (SVM) model; years later, Diener et al. used SVM and RF models. The three methods explored different sets of features based on physicochemical descriptors. In addition, the first two approaches included a feature selection component. In all of them, the physicochemical properties are the input features for the ML model. Many other ML-based approaches have been proposed [83–89] since the pioneering work of Dobchev et al. Central to the success of ML models are the quality and the size of the datasets. In this regard, the largest database is CPPsite 2.0, with 1699 unique peptides, which is still a small number considering, for instance, that databases of other bioactive peptides include more than 20000 sequences. All approaches use experimentally validated CPP as their positive cases. However, the construction of the negative instances is not uniform: some approaches use random sequences, others use bioactive peptides that are not CPP, and a few use experimentally validated non-CPP, yet there are only 34 validated non-CPP.
Notice that the construction of negative examples for CPP differs from the construction of negative samples for AMP.
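The direct (phenomenological) strategy described above, in which a peptide is labeled as CPP when its descriptors fall within intervals derived from validated CPP, can be sketched as follows. The two descriptors (length and a crude net charge), the function names, and the toy positive set (a TAT fragment, penetratin, and nona-arginine, used here only as familiar examples) are assumptions for illustration; published methods rely on richer descriptors such as Z-descriptors.

```python
from typing import Callable, Dict, List, Tuple

Descriptor = Callable[[str], float]

# Hypothetical descriptors chosen only for illustration.
DESCRIPTORS: Dict[str, Descriptor] = {
    "length": lambda s: float(len(s)),
    "net_charge": lambda s: float(
        sum(s.count(a) for a in "KR") - sum(s.count(a) for a in "DE")),
}


def fit_intervals(positives: List[str]) -> Dict[str, Tuple[float, float]]:
    """Record the [min, max] range of each descriptor over validated CPP."""
    intervals = {}
    for name, fn in DESCRIPTORS.items():
        values = [fn(p) for p in positives]
        intervals[name] = (min(values), max(values))
    return intervals


def is_cpp_like(peptide: str, intervals: Dict[str, Tuple[float, float]]) -> bool:
    """Label a peptide as CPP only if every descriptor falls inside its range."""
    return all(lo <= DESCRIPTORS[name](peptide) <= hi
               for name, (lo, hi) in intervals.items())


# Toy positive set; a real application would use a curated database.
KNOWN_CPP = ["YGRKKRRQRRR", "RQIKIWFQNRRMKWKK", "RRRRRRRRR"]
INTERVALS = fit_intervals(KNOWN_CPP)
print(is_cpp_like("GRKKRRQRRR", INTERVALS), is_cpp_like("DDDDEEEEDD", INTERVALS))
```

Note that this approach needs no negative examples at all, which sidesteps the scarcity of validated non-CPP but also means that its false-positive rate cannot be estimated from the positive set alone.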
Considering the diversity of CPP predictors, it is relevant to ask which of these methods is the best at identifying true CPP. To partially answer this question, a comparison of web-based tools is provided by Su et al. According to their study, the server KELM-CPPpred in its variant KELM-hybrid-AAC achieved the highest performance as measured by the Matthews correlation coefficient (MCC). The method uses a kernelized version of the extreme learning machine. To the best of our knowledge, there are no deep-learning-based approaches for the binary classification of CPP; this can be explained by the small number of CPP available to date, which allows shallow approaches to outperform deep models, as has recently been shown for the classification of AMP.
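As a reminder of how the MCC used in such comparisons is obtained, the following minimal Python sketch computes it from the four entries of a binary confusion matrix. The function name and the convention of returning 0 when the denominator vanishes are assumptions of this example.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention adopted here when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Imbalanced toy case: 10 true CPP among 100 peptides, half of them missed.
# Accuracy is 0.95, yet the MCC is noticeably lower.
print(round(mcc(tp=5, tn=90, fp=0, fn=5), 2))
```

Because the MCC uses all four counts, it is less easily inflated than accuracy when positive and negative sets differ in size, which matters for CPP given how few validated non-CPP are available.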
A recent work proposed a CPP classification model based on an ensemble of ANN, SVM, and a Gaussian process classifier. The results achieved by this ensemble outperform other state-of-the-art models such as MLCPP, CPPred-RF, and SkipCPP-Pred. The study also reveals that a combination of sequence- and structure-based descriptors along with physicochemical descriptors achieves competitive accuracy levels and requires a smaller number of descriptors (43) than when only sequence-based or only structure-based descriptors are used. The most relevant descriptors according to the normalized cumulative information entropy are the pseudo amino acid compositions (PseAACs) and the structure-based features. Regarding the sequence-based features, the abundance of cationic residues, such as lysine and arginine, plays an important role. An optimal selected combination of features consists of: structure-based and physicochemical features (nine out of 12), namely molecular weight (MW), 1-octanol/water partition coefficient (cLogP), the fraction of sp3-hybridized carbon atoms (Fsp3), hydrogen bond acceptors (HBA), number of aromatic rings (NAR), number of primary amine groups (NPA), number of guanidine groups (NG), net charge (NetC), and number of negatively charged amino acid groups (NNCAA); sequence-based features, namely the fraction of lysine residues (f[Lys]) and the fraction of arginine residues (f[Arg]) (two out of 20 AACs); ten out of 40 dipeptide composition descriptors; and 22 out of 22 PseAAC descriptors (see Table 4 in the cited work). Other works also report MW and NetC as relevant for classifying antimicrobial activity [97,98]. Also, Fsp3 has recently been proposed as a drug-likeness measure. However, further analysis is required to understand the relationship between the descriptors selected to discriminate CPPs from non-CPPs and those selected to discriminate AMPs from non-AMPs.
Such an analysis might shed light on the mechanism of action of certain AMPs that permeate the membrane and interact with some internal cellular process. To that end, further experimental characterization of both AMP and CPP is also required.
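The sequence-based descriptors mentioned above, such as f[Lys], f[Arg], and the dipeptide composition, are simple frequency counts. The sketch below shows one way to compute them; it assumes the 20 standard amino acids and the full vector of 400 ordered residue pairs, which may differ from the reduced descriptor set used in the cited work. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq: str) -> Dict[str, float]:
    """Amino acid composition: fraction of each residue in the sequence."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}


def dipeptide_composition(seq: str) -> Dict[str, float]:
    """Fraction of each ordered residue pair among all adjacent pairs."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)  # avoid division by zero for 1-residue input
    return {a + b: pairs[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}


composition = aac("YGRKKRRQRRR")
print(round(composition["K"], 2), round(composition["R"], 2))
```

Computing the same vectors for AMP and CPP datasets would be a natural first step toward the comparison of discriminating descriptors that the text calls for.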
Thus, it was recognized early on that AMP and CPP, being cationic and amphipathic, may share the ability to bind anions such as ATP, and consequently may induce autophagy as well as control the growth of microbial communities. However, structure–function studies of AMP and CPP are discovering new physicochemical features that are relevant for these activities; it is interesting to note the lack of overlap between the most relevant features that help discriminate AMP from non-AMP and those that help discriminate CPP from non-CPP. Thus, designing AMP with CPP activity may find some redundancy in terms of the cationic and amphipathic character, but other attributes should be considered to provide other functionalities such as immunomodulation or the ability to interact with other intracellular targets (see below). In the next two sections, we will summarize the relevance of autophagy and microbial communities for health, to justify the potential health uses of designing AMP with CPP activity.
Autophagy and animal health
Autophagy is a term used to describe the self-eating molecular process originally characterized in yeast cells. In this process, degradation is performed by the lysosomes (or the vacuole in the case of yeast cells). Three types of autophagy have been identified: (i) microautophagy, (ii) macroautophagy, and (iii) chaperone-mediated autophagy. These three types correspond to different mechanisms for delivering cargo to the lysosome. Macroautophagy is the first and most commonly identified form of autophagy; it uses autophagosomes to engulf cargo and deliver it to the lysosome. Microautophagy does not involve autophagosomes; instead, the lysosome directly engulfs the cargo. Chaperone-mediated autophagy uses chaperones to translocate proteins, DNA, and RNA to the lysosome.
Due to the catabolic nature of autophagy, this process is relevant for multiple biological processes, such as immunity, early development, ageing, and adaptive metabolism, among others; but autophagy not only degrades the cell’s own contents, it also eliminates invading pathogens and paternal mitochondria, among others. This degradation process then provides the substrates to synthesize new cellular components; thus, it is a fundamental component of allostasis and/or homeostasis.
Consequently, it has been observed that genetic mutations affecting autophagic genes promote cellular degeneration [103,104], early-age-related phenotypes [105,106], tumor development [107,108], and susceptibility to infections in animals [109,110]. Similarly, mutations in genes related to autophagy are present in several Mendelian diseases in humans such as Rett’s syndrome, Parkinson’s disease, cataracts, and several forms of cardiomyopathies, among others.
The focus of the present review is on the relationship between autophagy, bacterial infections, and AMP/CPP. In that sense, it is important to define an infection as any overgrowth of a microorganism within the host that affects its health. Microorganisms that affect health are considered pathogens, yet some nonpathogenic microorganisms cause disease states in immunocompromised patients or infect patients with an underlying disease; hence, the term pathogen is not always precise. Furthermore, bacterial infections are found either inside host cells (e.g., human cells) or outside of them; this condition determines whether autophagy can be used to deal with the invader, that is, only intracellular bacterial infections are subjected to xenophagy, the autophagic mechanism to deal with cellular invaders [115,116], which are recognized by macroautophagy. This observation emphasizes the relevance of having AMP with CPP activity: these AMP will exert their activity by activating autophagy as well as by acting directly against the microbes, among other activities of relevance (e.g., immunomodulation).
When microbes are internalized in host cells and cause infections, the invaded cells activate autophagy; hence, the invading microbes must inhibit that response to survive. When invaded cells activate autophagy, this also down-regulates the inflammatory response [118,119]. Thus, if autophagy is inhibited by the invading microbe, then inflammation is activated and may eliminate bacteria that are outside of cells. In such a scenario, only bacteria residing inside cells will survive. Alternatively, if autophagy is not inhibited, inflammation will resume and bacteria residing inside cells will be eliminated, while those outside of cells may be eliminated by the primary immune response, which includes AMP. In the first case, when intruder microorganisms are able to inhibit autophagy, these microbes may be eliminated by further activating autophagy [122,123]; in agreement with these observations, it has been reported that autophagy induction may prevent infections and other diseases. Recognizing that AMP and CPP may induce autophagy, it was proposed that AMP may have the dual ability to kill infectious bacteria through direct antibacterial activity and through autophagy, and there is now experimental evidence both in cells and in animals supporting this hypothesis [25,77].
Thus, AMP are a natural source of compounds that activate the immune system, activate autophagy, and have direct antimicrobial activity. Such a combination of activities has been shown to be useful in treating diseases such as specific viral and bacterial infections. In the next section, we explore evidence that AMP may also be used to maintain health; hence, they may be used to prevent diseases (prophylaxis) once we have a better understanding of how to design them.
AMP as prophylactic natural compounds to control microbiome establishment and animal health
Microbiota is the term used to refer to the microorganisms that live together with a multicellular host (e.g., humans, mice) and that, in combination with host cells, provide the host with increased capabilities [126–129]. Nowadays, the multiple roles that such microbiota play in animal health are recognized; for instance, the microbiota residing in the gut is relevant for the brain–gut axis, which in turn is relevant for behavior and the etiology of neuropsychiatric and neurodegenerative disorders; the gut microbiota is also relevant for immune system development and host metabolism [132,133], among others.
Thus, once the microbiota is established, this community assists the host in multiple tasks. But how is the microbiota established? That process may require the host immune response to adapt in order to control the size of the different microbial populations. In that sense, the relevance of the mother in the initial microbiota colonization of the newborn has been recognized [134,135]. Most of the microbiota from the mother may be transmitted to the newborn after birth. During pregnancy, the mother predominantly uses T-helper 2-type (anti-inflammatory) immunity so as not to affect the development of the fetus, hence promoting such immunity in the fetal immune system as well. But this immunity type gradually changes as the baby is ready to be born, and during the first weeks of life it turns into Th1-type (proinflammatory) activity. At birth, the newborn is exposed to the microbiota from the mother through the newborn skin; it has been reported that both mouse and human newborns overexpress AMP in their skin as a way to control first microbial infections. Later on, during the first months of life, while the adaptive immune response is not yet established, infants depend mostly on the innate immune system; AMP are an essential component of this system, and the relevance of AMP for healthy newborn survival has been recently reviewed. Among the AMP that have been detected in newborns are cathelicidins; a well-studied cathelicidin is the AMP LL-37, which has been shown to induce autophagy. As noted above, autophagy plays an anti-inflammatory role that is important for newborns to deal with the continuous exposure to and acquisition of new microorganisms. AMP are known to up-/down-regulate the release of chemokines and cytokines; chemokines and cytokines are known to exert direct antimicrobial activity, suggesting that AMP may enhance or reduce the antimicrobial arsenal through the regulation of expression of these cytokines.
Furthermore, cytokines are also able to up-/down-regulate autophagy ; hence, AMP is part of a connected network of molecular events that regulates the innate immune system response.
Later in life, the microbiota constantly changes to adapt to multiple internal and external conditions. Such changes are accompanied by AMP production as well. Examples are found in the gut microbiota of animals, where diverse studies have shown a relationship between gut microbiota composition and health in adults; for instance, changes in the gut microbiome (the microbiota inferred from sequencing techniques) are associated with cardiovascular disease. To control the microbiota, the mammalian immune system uses the inflammatory system; a component of this system is the inflammasome, a protein complex containing pattern-recognition receptors (e.g., NLRP3) that activates the inflammatory response upon recognition of its target. Such activation should be regulated; otherwise, it may lead to dysbiosis. For instance, an activating mutation in the NLRP3 receptor has been associated with autoinflammatory conditions of the skin, but with no apparent pathology in the intestines. The absence of any pathological state in the intestines of animals carrying this activating NLRP3 mutation comes with an altered composition of the gut microbiota, which provides resistance to inflammation; furthermore, this altered composition is the result of AMP production driven by the cytokine IL-1β. Many other examples have been documented where AMP help control the microbiota in animals. Thus, AMP play a role in controlling the microbiota early in life and throughout the rest of the life of the host; consequently, AMP sequences may have been adapted during the evolution of the species. In summary, AMP are involved in controlling microbiota establishment and composition during our life, hence acting as natural prophylactic drugs.
AMP as therapeutic drugs
The aim of the present review is not to cover all the peptides clinically approved to treat infections in humans. Instead, we have analyzed the multiple functions that AMP and CPP have in cells and in whole animals. The relevance of this analysis is to highlight the multifunctional nature of these peptides, which under natural conditions is relevant to health through the control of autophagy and microbiota. AMP have been used in clinical trials to treat different microbial infections, yet the mechanism of action of those AMP is not based on the combination of autophagy, immune regulation, and direct antimicrobial activities. The last AMP approved to treat microbial infections was daptomycin, back in 2003, even though the motivation to use AMP is their multifunctional capacity and, consequently, the reduced chance of promoting microbial resistance.
A recent review of the last two decades of failure to approve AMP in clinical trials noted several limitations of these AMP, among them: (i) AMP may be toxic to human cells, (ii) AMP may lose activity under physiological salt conditions, (iii) AMP in the presence of serum may bind to proteins such as albumin, and (iv) AMP are susceptible to protease degradation, along with other factors that reduce their bioavailability. Most of these limitations are being addressed by different strategies, such as chemical modifications to prevent proteolysis, shortening the AMP to reduce their capacity to interact with other proteins, or including them within delivery systems to target their action. In that same review, 43 AMP under clinical investigation were identified; among these 43 AMP, only 14 are recognized as multifunctional, in this case displaying both a direct antimicrobial activity through membrane disruption and an immunomodulatory activity (see Table 1).
Five of these 14 AMP are administered intravenously (Polymyxin E, Daptomycin, hLK1-11, AP-214, and PMX-30063); hence, their pharmaceutical action likely has to overcome all the limitations of AMP noted above. It will be relevant to review these and other AMP that enter human trials to learn how to continue developing new AMP able to control microbial infections in a complex biological system; of particular interest will be the molecular features of such AMP. For instance, are these AMP capable of inducing autophagy? Are these AMP also CPP? Which sequence, physicochemical, or topological features differentiate these successful AMP from those that failed clinical trials? As more examples of successful multifunctional AMP with CPP activity are registered, such information may eventually be helpful in designing AMP to control gut microbiota.
Our review shows that AMP are multifunctional. Due to their cationic and amphipathic nature, AMP may bind anions such as DNA or ATP; to that end, AMP may need to penetrate cells, and in doing so they may activate autophagy. Furthermore, AMP may be recognized by receptors to regulate the production of cytokines, which may also act directly as antimicrobials or regulate the production of AMP. While these are general properties of many AMP, for these peptides to act in a specific way it is important to decipher the differences between penetrating and nonpenetrating AMP; to that end, CPP may provide some guidance. Furthermore, AMP and CPP may directly interact with proteins (e.g., receptors), which may also explain their multifunctionality. Which structural properties of AMP allow them to bind selectively to different proteins remains unclear (see Figure 1).
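As an illustration of what "cationic" means at the sequence level, the following minimal Python sketch computes two crude descriptors for LL-37, the cathelicidin mentioned above. It is not a method from this review: the function names, the residue sets, and the simplifications (Lys/Arg counted as +1, Asp/Glu as -1, histidine and the termini ignored) are assumptions made only for illustration. Such descriptors describe a peptide but, as noted in this review, they are not sufficient to discriminate AMP or CPP from other peptides.

```python
# Illustrative sketch only: crude sequence descriptors, not a classifier.

# Human cathelicidin LL-37, one-letter amino acid code (37 residues).
LL37 = "LLGDFFRKSKEKIGKEFKRIVQRIKDFLRNLVPRTES"

# Assumed residue sets; other definitions are equally defensible.
CATIONIC = set("KR")          # treated as +1 near neutral pH
ANIONIC = set("DE")           # treated as -1 near neutral pH
HYDROPHOBIC = set("AILMFVW")  # one common, simplified hydrophobic set


def net_charge(sequence: str) -> int:
    """Approximate net charge: (K + R) minus (D + E); His and termini ignored."""
    seq = sequence.upper()
    positive = sum(res in CATIONIC for res in seq)
    negative = sum(res in ANIONIC for res in seq)
    return positive - negative


def hydrophobic_fraction(sequence: str) -> float:
    """Fraction of residues that belong to the assumed hydrophobic set."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(res in HYDROPHOBIC for res in seq) / len(seq)


if __name__ == "__main__":
    # Prints the approximate net charge and hydrophobic fraction of LL-37.
    print(net_charge(LL37), round(hydrophobic_fraction(LL37), 2))
```

Under these assumptions LL-37 has 11 basic and 5 acidic residues, giving an approximate net charge of +6. Amphipathicity is not captured by these counts; it would require a descriptor that accounts for the position of residues in the folded peptide, such as a hydrophobic moment.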
The authors declare that there are no competing interests associated with the manuscript.
This work was supported in part by PAPIIT IT200320 to G.dR. and CONACYT A1-S-20638 to C.B. M.A.T.P. received a scholarship from CoNaCyT during his doctoral studies.
CRediT Author Contribution
Gabriel del Rio: Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Writing—original draft. Mario A. Trejo Perez: Investigation, Visualization, Writing—review & editing. Carlos A. Brizuela: Conceptualization, Formal analysis, Funding acquisition, Investigation, Writing—original draft, Writing—review & editing.
The authors thank Maria Teresa Lara Ortíz for her technical assistance.
artificial neural network
critical assessment of function annotation
Food and Drug Administration
fraction of sp3-hybridized carbon atoms
hydrogen bond acceptor
isotropic surface area
Matthews correlation coefficient
number of aromatic rings
number of guanidine groups
number of negatively charged amino acid groups
primary amine group
pseudo amino acid composition
support vector machine
eXplainable Artificial Intelligence