A short guide to abbreviations and their use in peptide science
Published: 02 Feb 2024
Special Collection: 2024 eBook Collection
Amino Acids, Peptides and Proteins
Abbreviations, acronyms and symbolic representations are very much part of the language of peptide science – in conversational communication as much as in its literature. They are not only a convenience, either – they enable the necessary but distracting complexities of long chemical names and technical terms to be pushed into the background so the wood can be seen among the trees. Many of the abbreviations in use are so much in currency that they need no explanation. The main purpose of this editorial is to identify them and free authors from the hitherto tiresome requirement to define them in every paper. Those in the tables that follow – which will be updated from time to time – may in future be used in this Journal without explanation.
All other abbreviations should be defined. Previously published usage should be followed unless it is manifestly clumsy or inappropriate. Where it is necessary to devise new abbreviations and symbols, the general principles behind established examples should be followed. Thus, new amino-acid symbols should be of the form Abc, with due thought for possible ambiguities (Dap might be obvious for diaminopropionic acid, for example, but what about diaminopimelic acid?).
Where alternatives are indicated below, the first is preferred.
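As an illustration of the guidance on new symbols, the following minimal sketch checks that a proposed symbol has the form Abc and does not collide with a symbol already in use. The function name `check_new_symbol` and the small `in_use` registry are hypothetical and exist only for this example.

```python
import re

# The "Abc" form: one capital letter followed by two lowercase letters.
ABC_FORM = re.compile(r"[A-Z][a-z]{2}")


def check_new_symbol(symbol, intended_name, in_use):
    """Return a list of problems with a proposed symbol (empty if none found)."""
    problems = []
    if not ABC_FORM.fullmatch(symbol):
        problems.append(f"{symbol!r} is not of the form Abc")
    existing = in_use.get(symbol)
    if existing is not None and existing != intended_name:
        problems.append(f"{symbol!r} is already used for {existing}")
    return problems


# Hypothetical registry of symbols already in use (illustration only).
in_use = {"Ala": "alanine", "Dap": "diaminopropionic acid"}

print(check_new_symbol("Dap", "diaminopimelic acid", in_use))
print(check_new_symbol("DAP", "diaminopimelic acid", in_use))
```

A check of this kind catches only formal clashes; whether a symbol is clumsy or misleading remains a matter of judgement.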
Amino Acids
Proteinogenic Amino Acids
Ala | Alanine | A |
Arg | Arginine | R |
Asn | Asparagine | N |
Asp | Aspartic acid | D |
Asx | Asn or Asp | |
Cys | Cysteine | C |
Gln | Glutamine | Q |
Glu | Glutamic acid | E |
Glx | Gln or Glu | |
Gly | Glycine | G |
His | Histidine | H |
Ile | Isoleucine | I |
Leu | Leucine | L |
Lys | Lysine | K |
Met | Methionine | M |
Phe | Phenylalanine | F |
Pro | Proline | P |
Ser | Serine | S |
Thr | Threonine | T |
Trp | Tryptophan | W |
Tyr | Tyrosine | Y |
Val | Valine | V |
Copyright © 1999 European Peptide Society and John Wiley & Sons, Ltd. Reproduced with permission from J. Peptide Sci., 1999, 5, 465–471.
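The three-letter and one-letter symbols in the table above can be treated as a simple lookup. The sketch below is one possible way to convert between the two notations; the function names are assumptions made for this example, and Asx and Glx are left out because the table gives no one-letter symbol for them.

```python
# Three-letter to one-letter symbols for the 20 proteinogenic amino acids,
# transcribed from the table above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(sequence):
    """Convert a hyphenated three-letter sequence, e.g. 'Ala-Gly-Ser'.

    Raises KeyError for any symbol that is not in the table.
    """
    return "".join(THREE_TO_ONE[symbol] for symbol in sequence.split("-"))


def to_three_letter(sequence):
    """Convert a one-letter sequence back to hyphenated three-letter symbols."""
    return "-".join(ONE_TO_THREE[letter] for letter in sequence)


print(to_one_letter("Ala-Gly-Ser"))
print(to_three_letter("AGS"))
```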
Other Amino Acids
- Aad
-
α-Aminoadipic acid
- βAad
-
β-Aminoadipic acid
- Abu
-
α-Aminobutyric acid
- Aib
-
α-Aminoisobutyric acid; α-methylalanine
- βAla
-
β-Alanine; 3-aminopropionic acid (avoid Bal)
- Asu
-
α-Aminosuberic acid
- Aze
-
Azetidine-2-carboxylic acid
- Cha
-
β-Cyclohexylalanine
- Cit
-
Citrulline; 2-amino-5-ureidovaleric acid
- Dha
-
Dehydroalanine (also ΔAla)
- Gla
-
γ-Carboxyglutamic acid
- Glp
-
Pyroglutamic acid; 5-oxoproline (also pGlu)
- Hph
-
Homophenylalanine (Hse=homoserine, and so on). Caution is necessary over the use of the prefix homo in relation to α-amino-acid names and the symbols for homo-analogues. When the term first became current, it was applied to analogues in which a side-chain CH2 extension had been introduced. Thus homoserine has a side-chain CH2CH2OH, homoarginine CH2CH2CH2NHC(═NH)NH2, and so on. In such cases, the convention is that a new three-letter symbol for the analogue is derived from the parent, by taking H for homo and combining it with the first two characters of the parental symbol – hence, Hse, Har and so on. Now, however, there is a considerable literature on β-amino acids which are analogues of α-amino acids in which a CH2 group has been inserted between the α-carbon and carboxyl group. These analogues have also been called homo-analogues, and there are instances, for example, not only of ‘homophenylalanine’, NH2CH(CH2CH2Ph)CO2H, abbreviated Hph, but also of ‘homophenylalanine’, NH2CH(CH2Ph)CH2CO2H, abbreviated Hph. Further, members of the analogue class with CH2 interpolated between the α-carbon and the carboxyl group of the parent α-amino acid structure have been called both ‘α-homo’ and ‘β-homo’. Clearly great care is essential, and abbreviations for ‘homo’ analogues ought to be fully defined on every occasion. The term ‘β-homo’ seems preferable for backbone extension (emphasizing as it does that the residue has become a β-amino acid residue), with abbreviated symbolism as illustrated by βHph for NH2CH(CH2Ph)CH2CO2H.
- Hyl
-
δ-Hydroxylysine
- Hyp
-
4-Hydroxyproline
- aIle
-
allo-Isoleucine; 2S, 3R in the L-series
- Lan
-
Lanthionine; S-(2-amino-2-carboxyethyl)cysteine
- MeAla
-
N-Methylalanine (MeVal=N-methylvaline, and so on). This style should not be used for α-methyl residues, for which either a separate unique symbol (such as Aib for α-methylalanine) should be used, or the position of the methyl group should be made explicit as in αMeTyr for α-methyltyrosine.
- Nle
-
Norleucine; α-aminocaproic acid
- Orn
-
Ornithine; 2,5-diaminopentanoic acid
- Phg
-
Phenylglycine; 2-aminophenylacetic acid
- Pip
-
Pipecolic acid; piperidine-2-carboxylic acid
- Sar
-
Sarcosine; N-methylglycine
- Sta
-
Statine; (3S, 4S)-4-amino-3-hydroxy-6-methyl-heptanoic acid
- Thi
-
β-Thienylalanine
- Tic
-
1,2,3,4-Tetrahydroisoquinoline-3-carboxylic acid
- aThr
-
allo-Threonine; 2S, 3S in the L-series
- Thz
-
Thiazolidine-4-carboxylic acid, thiaproline
- Xaa
-
Unknown or unspecified (also Aaa)
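The derivation rules stated in the list above under Hph and MeAla (H plus the first two characters of the parent symbol for a side-chain homo-analogue, a β prefix for backbone extension, and a Me prefix for N-methyl residues) can be sketched as follows. The function names are hypothetical and the sketch covers only these three rules; as the text stresses, any ‘homo’ symbol should still be defined explicitly wherever it is used.

```python
def homo_symbol(parent):
    """Side-chain homo-analogue: 'H' + first two characters of the parent symbol."""
    return "H" + parent[:2].lower()


def beta_homo_symbol(parent):
    """Backbone-extended ('β-homo') analogue: 'β' prefixed to the homo symbol."""
    return "β" + homo_symbol(parent)


def n_methyl_symbol(parent):
    """N-methyl residue: 'Me' prefixed to the parent symbol.

    Not to be used for α-methyl residues, which need a unique symbol (Aib)
    or an explicit position (αMeTyr).
    """
    return "Me" + parent


for parent in ("Ser", "Arg", "Phe"):
    print(parent, homo_symbol(parent))
print(beta_homo_symbol("Phe"))
print(n_methyl_symbol("Ala"), n_methyl_symbol("Val"))
```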
The three-letter symbols should be used in accord with the IUPAC-IUB conventions, which have been published in many places (e.g. European J. Biochem. 1984; 138: 9–37), and which are (May 1999) also available with other relevant documents at: http://www.chem.qnw.ac.uk/iubmb/iubmb.html#03
It would be superfluous to attempt to repeat all the detail which can be found at the above address, and the ramifications are extensive, but a few remarks focussing on common misuses and confusions may assist. The three-letter symbol standing alone represents the unmodified intact amino acid, of the L-configuration unless otherwise stated (but the L-configuration may be indicated if desired for emphasis: e.g. L-Ala). The same three-letter symbol, however, also stands for the corresponding amino acid residue. The symbols can thus be used to represent peptides (e.g. AlaAla or Ala-Ala=alanylalanine). When nothing is shown attached to either side of the three-letter symbol it is meant to be understood that the amino group (always understood to be on the left) or carboxyl group is unmodified, but this can be emphasized, so AlaAla=H-AlaAla-OH. Note however that indicating free termini by presenting the terminal group in full is wrong; NH2AlaAlaCO2H implies a hydrazino group at one end and an α-keto acid derivative at the other. Representation of a free terminal carboxyl group by writing H on the right is also wrong because that implies a terminal aldehyde.
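The conventions for termini described above can be sketched in a few lines. This is an illustration only: the function `format_peptide` and its parameters are assumptions made for the example, residues are joined with hyphens (the Ala-Ala style), and the Boc and OMe termini in the usage lines are arbitrary choices.

```python
def format_peptide(residues, n_term="H", c_term="OH", show_free_termini=False):
    """Write a linear peptide with the amino terminus on the left.

    Free termini ('H' on the left, 'OH' on the right) are implied and are
    written out only when show_free_termini is True.
    """
    # Writing the terminal group in full is wrong: NH2 on the left implies
    # a hydrazino group, CO2H on the right an α-keto acid derivative.
    if n_term == "NH2":
        raise ValueError("write a free amino terminus as H, not NH2")
    if c_term == "CO2H":
        raise ValueError("write a free carboxyl terminus as OH, not CO2H")
    # H on the right would imply a terminal aldehyde.
    if c_term == "H":
        raise ValueError("H on the right implies an aldehyde; use OH")

    parts = list(residues)
    if n_term != "H" or show_free_termini:
        parts.insert(0, n_term)
    if c_term != "OH" or show_free_termini:
        parts.append(c_term)
    return "-".join(parts)


print(format_peptide(["Ala", "Ala"]))
print(format_peptide(["Ala", "Ala"], show_free_termini=True))
print(format_peptide(["Ala", "Ala"], n_term="Boc", c_term="OMe"))
```

The first two calls describe the same compound, once with the free termini implied and once with them written out.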
Side chains are understood to be unsubstituted if nothing is shown, but a substituent can be indicated by use of brackets or attachment by a vertical bond up or down. Thus an O-methylserine residue could be shown as 1, 2, or 3.
Note that the oxygen atom is not shown: it is contained in the three-letter symbol – showing it, as in Ser(OMe), would imply that a peroxy group was present. Bonds up or down should be used only for indicating side-chain substitution. Confusions may creep in if the three-letter symbols are used thoughtlessly in representations of cyclic peptides. Consider by way of example the hypothetical cyclopeptide threonylalanylalanylglutamic acid. It might be thought that this compound could be economically represented 4.
But this is wrong because the left-hand vertical bond implies an ester link between the two side chains and, strictly speaking, if the right-hand vertical bond means anything, it means that the two Ala α-carbons are linked by a CH2CH2 bridge. This objection could be circumvented by writing the structure as in 5.
But this is now ambiguous because the convention that the symbols are to be read as having the amino nitrogen to the left cannot be imposed on both lines. The direction of the peptide bond needs to be shown with an arrow pointing from CO to N, as in 6.
Actually the simplest representation is on one line, as in 7.
Substituents and Protecting Groups
- Ac
-
Acetyl
- Acm
-
Acetamidomethyl
- Adoc
-
1-Adamantyloxycarbonyl
- Alloc
-
Allyloxycarbonyl
- Boc
-
t-Butoxycarbonyl
- Bom
-
π-Benzyloxymethyl
- Bpoc
-
2-(4-Biphenylyl)isopropoxycarbonyl
- Btm
-
Benzylthiomethyl
- Bum
-
π-t-Butoxymethyl
- Bui
-
i-Butyl
- Bun
-
n-Butyl
- But
-
t-Butyl
- Bz
-
Benzoyl
- Bzl
-
Benzyl (also Bn); Bzl(OMe)=4-methoxybenzyl and so on
- Cha
-
Cyclohexylammonium salt
- Clt
-
2-Chlorotrityl
- Dcha
-
Dicyclohexylammonium salt
- Dde
-
1-(4,4-Dimethyl-2,6-dioxocyclohex-1-ylidene)ethyl
- Ddz
-
2-(3,5-Dimethoxyphenyl)-isopropoxycarbonyl
- Dnp
-
2,4-Dinitrophenyl
- Dpp
-
Diphenylphosphinyl
- Et
-
Ethyl
- Fmoc
-
9-Fluorenylmethoxycarbonyl
- For
-
Formyl
- Mbh
-
4,4′-Dimethoxydiphenylmethyl, 4,4′-Dimethoxybenzhydryl
- Mbs
-
4-Methoxybenzenesulphonyl
- Me
-
Methyl
- Mob
-
4-Methoxybenzyl
- Mtr
-
2,3,6-Trimethyl-4-methoxybenzenesulphonyl
- Nps
-
2-Nitrophenylsulphenyl
- OAll
-
Allyl ester
- OBt
-
1-Benzotriazolyl ester
- OcHx
-
Cyclohexyl ester
- ONp
-
4-Nitrophenyl ester
- OPcp
-
Pentachlorophenyl ester
- OPfp
-
Pentafluorophenyl ester
- OSu
-
Succinimido ester
- OTce
-
2,2,2-Trichloroethyl ester
- OTcp
-
2,4,5-Trichlorophenyl ester
- Tmob
-
2,4,6-Trimethoxybenzyl
- Mtt
-
4-Methyltrityl
- Pac
-
Phenacyl, PhCOCH2 (care! Pac also=PhCH2CO)
- Ph
-
Phenyl
- Pht
-
Phthaloyl
- Scm
-
Methoxycarbonylsulphenyl
- Pmc
-
2,2,5,7,8-Pentamethylchroman-6-sulphonyl
- Pri
-
i-Propyl
- Prn
-
n-Propyl
- Tfa
-
Trifluoroacetyl
- Tos
-
4-Toluenesulphonyl (also Ts)
- Troc
-
2,2,2-Trichloroethoxycarbonyl
- Trt
-
Trityl, triphenylmethyl
- Xan
-
9-Xanthydryl
- Z
-
Benzyloxycarbonyl (also Cbz). Z(2Cl)=2-chlorobenzyloxycarbonyl and so on
Amino Acid Derivatives
Reagents and Solvents
- BOP
1-Benzotriazolyloxy-tris-dimethylamino-phosphonium hexafluorophosphate
- CDI
Carbonyldiimidazole
- DBU
Diazabicyclo[5.4.0]-undec-7-ene
- DCCI
Dicyclohexylcarbodiimide (also DCC)
- DCHU
Dicyclohexylurea (also DCU)
- DCM
Dichloromethane
- DEAD
Diethyl azodicarboxylate (DMAD=the dimethyl analogue)
- DIPCI
Diisopropylcarbodiimide (also DIC)
- DIPEA
Diisopropylethylamine (also DIEA)
- DMA
Dimethylacetamide
- DMAP
4-Dimethylaminopyridine
- DMF
Dimethylformamide
- DMS
Dimethylsulphide
- DMSO
Dimethylsulphoxide
- DPPA
Diphenylphosphoryl azide
- EEDQ
2-Ethoxy-1-ethoxycarbonyl-1,2-dihydroquinoline
- HATU
This is the acronym for the ‘uronium’ coupling reagent derived from HOAt, which was originally thought to have the structure 8, the Hexafluorophosphate salt of the O-(7-Azabenzotriazol-1-yl)-Tetramethyl Uronium cation.
In fact this reagent has the isomeric N-oxide structure 9 in the crystalline state, the unwieldy correct name of which does not conform logically with the acronym, but the acronym continues in use.
Similarly, the corresponding reagent derived from HOBt has the firmly attached label HBTU (the tetrafluoroborate salt is also used: TBTU), despite the fact that it is not actually a uronium salt.
- HMP
Hexamethylphosphoric triamide (also HMPA, HMPTA)
- HOAt
1-Hydroxy-7-azabenzotriazole
- HOBt
1-Hydroxybenzotriazole
- HOCt
1-Hydroxy-4-ethoxycarbonyl-1,2,3-triazole
- NDMBA
N,N′-Dimethylbarbituric acid
- NMM
N-Methylmorpholine
- PAM
Phenylacetamidomethyl resin
- PEG
Polyethylene glycol
- PyBOP
1-Benzotriazolyloxy-tris-pyrrolidinophosphonium hexafluorophosphate
- SDS
Sodium dodecyl sulphate
- TBAF
Tetrabutylammonium fluoride
- TBTU
See remarks under HATU above
- TEA
Triethylamine
- TFA
Trifluoroacetic acid
- TFE
Trifluoroethanol
- TFMSA
Trifluoromethanesulphonic acid
- THF
Tetrahydrofuran
- WSCI
Water soluble carbodiimide: 1-ethyl-3-(3′-dimethylaminopropyl)-carbodiimide hydrochloride (also EDC)
Techniques
- CD
-
Circular dichroism
- COSY
-
Correlated spectroscopy
- CZE
-
Capillary zone electrophoresis
- ELISA
-
Enzyme-linked immunosorbent assay
- ESI
-
Electrospray ionization
- ESR
-
Electron spin resonance
- FAB
-
Fast atom bombardment
- FT
-
Fourier transform
- GLC
-
Gas liquid chromatography
- HPLC
-
High performance liquid chromatography
- IR
-
Infra red
- MALDI
-
Matrix-assisted laser desorption ionization
- MS
-
Mass spectrometry
- NMR
-
Nuclear magnetic resonance
- nOe
-
Nuclear Overhauser effect
- NOESY
-
Nuclear Overhauser enhanced spectroscopy
- ORD
-
Optical rotatory dispersion
- PAGE
-
Polyacrylamide gel electrophoresis
- RIA
-
Radioimmunoassay
- ROESY
-
Rotating frame nuclear Overhauser enhanced spectroscopy
- RP
-
Reversed phase
- SPPS
-
Solid phase peptide synthesis
- TLC
-
Thin layer chromatography
- TOCSY
-
Total correlation spectroscopy
- TOF
-
Time of flight
- UV
-
Ultraviolet
Miscellaneous
- Ab
-
Antibody
- ACE
-
Angiotensin-converting enzyme
- ACTH
-
Adrenocorticotropic hormone
- Ag
-
Antigen
- AIDS
-
Acquired immunodeficiency syndrome
- ANP
-
Atrial natriuretic polypeptide
- ATP
-
Adenosine triphosphate
- BK
-
Bradykinin
- BSA
-
Bovine serum albumin
- CCK
-
Cholecystokinin
- DNA
-
Deoxyribonucleic acid
- FSH
-
Follicle stimulating hormone
- GH
-
Growth hormone
- HIV
-
Human immunodeficiency virus
- LHRH
-
Luteinizing hormone releasing hormone
- MAP
-
Multiple antigen peptide
- NPY
-
Neuropeptide Y
- OT
-
Oxytocin
- PTH
-
Parathyroid hormone
- QSAR
-
Quantitative structure–activity relationship
- RNA
-
Ribonucleic acid
- TASP
-
Template-assembled synthetic protein
- TRH
-
Thyrotropin releasing hormone
- VIP
-
Vasoactive intestinal peptide
- VP
-
Vasopressin
J. H. Jones