Chapter 1: Identification of Cellular Protein–Protein Interactions
Published: 25 Nov 2020
C. Benz, E. Kassa, E. Tjärnhage, S. B. Lind, and Y. Ivarsson, in Inhibitors of Protein–Protein Interactions, ed. A. Tavassoli, The Royal Society of Chemistry, 2020, ch. 1, pp. 1-39.
There are myriad interactions between proteins in the cell. The diversity and dynamics of the interactions make them challenging to chart, and a large variety of methods have been developed to tackle the task. In this chapter we describe methods for mapping cellular protein–protein interactions. We describe the basic principles of each method, exemplify with relevant cases, and highlight strengths and weaknesses of the various assays.
Proteins are the executors of the cell. Proteins rarely perform their functions alone, but engage in binary protein–protein interactions (PPIs) and in larger complexes. The human interactome has been estimated to comprise about 650 000 interactions.1 These PPIs are vital to cellular function and are often dynamic and regulated in response to different stimuli. Based on their binding interfaces, PPIs can be divided into two main groups.2 The first group represents interactions between folded proteins that interact with each other using large interfaces, which include, for example, obligate complexes between homodimers. The second group represents interactions between short linear motifs (SLiMs, 3–10 amino acids found in intrinsically disordered regions of the proteome) and their binding partners (Figure 1.1).
SLiM-based interactions involve smaller buried surface areas and are often of low-to-medium affinities. However, the affinities of the SLiM-based interactions span a wide range, with dissociation constants (Kd values) ranging from nanomolar (e.g. the major histocompatibility complex (MHC) and its target peptides) to millimolar (e.g. between the nuclear pore complex and the FG repeat binding proteins that it transports into the nucleus) (Figure 1.1B). The diversity and dynamics of PPIs make some of them challenging to chart. Consequently, there is not one universal method for identifying cellular PPIs. Instead, a plethora of methods are being used to chart the interactome, such as mass spectrometry (MS), phage display, array-based methods and two-hybrid methods. Each method has its own strengths and limitations (Figure 1.1C), and different methods typically provide only partly overlapping information.
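To give a feel for what this affinity range means in practice, the fraction of a protein that is occupied at equilibrium can be estimated from the Kd with a simple 1:1 binding model. The sketch below is illustrative only. It assumes that the ligand is in large excess over the protein (so that free ligand is approximately equal to total ligand), and the function name and example concentrations are assumptions made for this illustration, not values from the chapter.

```python
def fraction_bound(ligand_conc_molar, kd_molar):
    """Equilibrium occupancy for simple 1:1 binding: [L] / ([L] + Kd).

    Assumes the ligand is in large excess, so free ligand ~ total ligand.
    """
    if ligand_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


# Hypothetical example: 1 uM ligand against three different affinities.
ligand = 1e-6
for label, kd in [("1 nM", 1e-9), ("1 uM", 1e-6), ("1 mM", 1e-3)]:
    print(f"Kd = {label}: fraction bound = {fraction_bound(ligand, kd):.3f}")
```

When the ligand concentration equals the Kd, half of the protein is bound. At the same ligand concentration, a nanomolar interaction is close to saturated, whereas a millimolar interaction is barely populated, which is one reason why weak interactions are easily lost in assays that include washing steps.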
In this chapter, we survey commonly used and emerging methods for the identification and validation of cellular PPIs. For each method, we describe the basic principle, provide some relevant examples, and outline the pros and cons (Figure 1.1). We start with a section on MS-based approaches for broad PPI screening, followed by dedicated methods for SLiM-based interaction screening and end with a variety of cell-based approaches. We do not attempt to cover all methods available, or to give an in-depth description of each method, but provide the basics of how the selected methods work and what type of interactions they are applicable to, which may guide the reader when choosing a suitable method.
1.2 Mass Spectrometry-based Proteomics for Identification of PPIs
MS-based methods are powerful approaches for PPI analysis.3 They are frequently used to identify binary interactions or complexes (e.g. affinity purification coupled to MS, AP-MS) or to elucidate which proteins are in the vicinity of each other in the cell (proximity-labeling MS).4,5 The main challenges with MS-based approaches to identify PPIs lie in the sample preparation that precedes the MS analysis (Figure 1.2). The challenges are centered around how to maintain the protein of interest in a native binding conformation and how to capture and preserve interactions throughout the workflow. We outline various upfront approaches in the following sections, namely AP-MS, proximity-labeling MS and crosslinking MS (Figure 1.2).6 Other approaches, such as co-fractionation–mass spectrometry (CF–MS), which can be used to study global changes in complex composition upon stimulation, or native MS, which can be used to study complexes, are beyond the scope of this chapter.7,8
After the upfront sample preparation, the samples are prepared for identification and relative protein content quantification by shotgun MS analysis using high-resolution instruments, often equipped with Orbitrap mass analyzers.3 Proteins are digested into peptides using a protease (typically trypsin), and the digested peptides are dissolved in an acidic solution and loaded onto a reversed-phase separation column. By using a liquid chromatography (LC) separation, peptides continuously reach the electrospray ionization source, and the generated ions are accelerated into the MS. The MS analyzes the peptides according to their m/z value (mass-to-charge ratio), and values for both intact and fragmented peptides are recorded to give accurate determination of peptide mass as well as sequence information. To determine the identities of the proteins in the sample, collected spectra are compared with theoretical ions from searches of databases downloaded from e.g. UniProt. MS data are routinely deposited in the PRIDE (PRoteomics IDEntifications) database.9
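The first steps of such a database search, namely generating theoretical peptides and their m/z values, can be sketched in a few lines. The example below is a simplified illustration and not the algorithm of any particular search engine. It assumes the common rule that trypsin cleaves C-terminal to K or R except when the next residue is P, it ignores missed cleavages and modifications, and the protein sequence and function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # added once per positive charge


def tryptic_digest(protein):
    """Cleave after K or R unless followed by P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mz(peptide, charge=2):
    """Theoretical m/z of an unmodified peptide ion with the given charge."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical protein sequence, used only to show the calls.
for pep in tryptic_digest("MKWVTFISLLRPAAK"):
    print(pep, round(peptide_mz(pep, charge=2), 4))
```

In a real search, the measured precursor and fragment m/z values are matched against such theoretical values within a mass tolerance, and the matches are scored statistically.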
From a general perspective, the main advantages of MS-based approaches are that the analysis is unbiased and that the m/z value of the ions can be determined with high accuracy, which gives a high confidence in the generated data. A general limitation is that the instruments are expensive. Different studies typically produce only partially overlapping data. A set of large-scale MS-based human interactome studies have been integrated in the so-called Hu.MAP database (http://proteincomplexes.org) to tackle this issue.10
1.2.1 Affinity Purification Coupled to Mass Spectrometry (AP-MS)
AP-MS is based on the possibility of using a bait protein of interest to extract binary interactors and complexes from a cell lysate through affinity purification (Figure 1.2A). The affinity purification can be performed through co-immunoprecipitation (co-IP) or through pulldown. In co-IP, complexes of interacting proteins are immunoprecipitated from cell lysate using an antibody with high affinity and specificity for the bait protein. Databases such as the Antibodypedia are useful for finding suitable antibodies.11 If there is no suitable antibody available against the bait protein, or several different bait proteins are to be screened, an epitope tag can be genetically fused to the bait protein(s). The tag can be a short amino acid sequence such as FLAG, HA or c-myc, or a protein such as GFP (green fluorescent protein) or GST (glutathione S-transferase). The tagged bait protein is then expressed in the cell and pulled down using an antibody or other affinity matrices against the tag. This approach is commonly used in high-throughput studies. For example, the BioPlex 2.0 network, which consists of 56 553 interactions, was generated using 5891 HA-FLAG tagged bait proteins.12 Similarly, Hein and co-workers identified 28 504 interactions using 1125 GFP-tagged bait proteins.13
The results of AP-MS experiments contain information on relevant binders as well as non-specific binders that have affinity towards the used affinity tags, antibodies or solid affinity supports and/or highly abundant proteins. To decrease the background and increase specificity, a tandem affinity purification (TAP) tag can be employed.14 The TAP tag consists of two different affinity tags (protein A and a calmodulin-binding peptide) separated by a tobacco etch virus (TEV) protease cleavage site. This allows for two consecutive affinity purification steps with enzymatic cleavage in between. The repeated purification steps are advantageous in case of high-affinity interactions, but low affinity and transient interactions are more likely to be lost during the long and stringent procedure. A wide range of different affinity tags following similar construction can be used for this purpose.15
To increase the confidence in interaction identification, several databases and tools have been developed. The contaminant repository for affinity purification (the CRAPome) offers a database of common contaminants from several AP-MS studies, taking experimental parameters into account.16 A multitude of different scoring tools are available to evaluate the data obtained from AP-MS experiments, such as SAINT, CompPASS and MiST.17–19 Guidelines and recommendations for AP-MS study design, scoring and data visualization have been summarized by Morris et al.20 Recent developments in instruments and data analysis software implement quantitative MS that enables measurement of relative affinity enrichment rather than affinity purification. Thus, non-specific binders do not have to be completely removed.21
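The idea of relative enrichment over a control can be illustrated with a toy calculation. The sketch below is not SAINT, CompPASS or MiST, which use more elaborate statistical models. It simply assumes that spectral counts are available for a bait purification and a matched control, and it ranks preys by a log2 ratio with a pseudocount. The protein names, counts and the threshold are hypothetical.

```python
import math


def log2_enrichment(bait_counts, control_counts, pseudocount=1.0):
    """Return {prey: log2((bait + p) / (control + p))} for preys seen with the bait."""
    return {
        prey: math.log2((count + pseudocount) /
                        (control_counts.get(prey, 0) + pseudocount))
        for prey, count in bait_counts.items()
    }


def candidate_interactors(bait_counts, control_counts, min_log2=2.0):
    """Preys whose enrichment over the control reaches the chosen threshold."""
    scores = log2_enrichment(bait_counts, control_counts)
    return sorted(prey for prey, score in scores.items() if score >= min_log2)


# Hypothetical spectral counts: STICKY_1 is abundant in both samples.
bait = {"PREY_A": 40, "PREY_B": 12, "STICKY_1": 55}
control = {"PREY_B": 10, "STICKY_1": 50}
print(candidate_interactors(bait, control))
```

A prey that is abundant in both the bait and control samples obtains a ratio close to zero and is deprioritized without having to be physically washed away, which is the point made above about affinity enrichment.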
Pros of AP-MS
There is no need for specialized equipment for the affinity purification.
PPIs are discovered in a relevant, close-to-cellular environment, where proteins carry their native post-translational modifications (PTMs).
Cons of AP-MS
Complexes need to be maintained during affinity purification. Weak, transient PPIs are often lost during washing steps.
The affinity tag can affect protein expression, folding, localization and interactions.
It generates a high frequency of false negatives (real interactors removed by stringent washing steps) and false positives (non-specific binders).
It is not suited for interactions with membrane proteins.
1.2.2 Proximity-labeling MS: BioID and APEX
Proximity-labeling MS is a more recent alternative to AP-MS (Figure 1.2). In this approach, bait proteins are expressed as fusion proteins with a specialized labeling enzyme (i.e. a mutant biotin ligase in the case of BioID and an ascorbate peroxidase in the case of APEX).22 The enzyme is activated and, using exogenous biotin, biotinylates proteins in the vicinity of the bait protein. Cells are then lysed and the labeled proteins are affinity purified (e.g. using magnetic beads coated with streptavidin) and detected by MS. Enzyme-only negative controls are commonly used to reveal proteins that are labeled due to their association with the enzyme and not the bait.23–25 Cells that do not express the fusion protein, or that are not supplemented with biotin, can also be used as negative controls.26,27 Relevant biotinylated proteins are then determined using quantitative and statistical approaches. While these methods do not provide direct proof of an interaction, the labeling radius can be small enough to label primarily the interaction partners and proteins within the same complex. Proximity-dependent labeling can further be used to determine the cellular localization.28 Despite being a relatively recent addition to the toolbox, several variants of this approach (e.g. BioID, APEX) have already been established, as detailed below.
In BioID, a promiscuous mutant biotin ligase (BirA*) is fused to the protein of interest.22 BirA catalyzes the combination of biotin and ATP to generate a highly reactive molecule called biotinoyl-5′-AMP (BioAMP). BioAMP is released and biotinylates primary amines, such as the side chain of lysine, within a radius of approximately 10 nm.29 The first version of BioID required a rather long time for effective labeling (18–24 h).27 Recently, biotin ligase variants have been generated that have more favorable properties for interaction analysis. BioID2 uses a considerably smaller, engineered biotin ligase tag that also requires less biotin for effective labeling.24 TurboID is similar in size to the BioID tag but offers labeling times as short as 10 minutes.27 Furthermore, miniTurbo is comparable in size to the BioID2 tag, while having a markedly higher activity.27 BioID MS has, for example, been used to investigate the interactome of TDP-43 aggregates, which are related to neurological diseases such as amyotrophic lateral sclerosis and the frontotemporal dementia disease spectrum.30
The 28 kDa ascorbate peroxidase (APEX) tag is used in a similar fashion to BioID.31 APEX is fused to the protein of interest which, after activation, biotinylates proteins in close proximity. APEX generates reactive biotin-phenoxyl molecules after the addition of biotin-phenol and hydrogen peroxide. These radicals have a short lifetime (<1 ms) and target electron-rich amino acid (tyrosine, tryptophan, histidine, cysteine) residues.32 The original APEX tag is capable of providing sufficient biotinylation in about 1 minute, which is considerably faster than BioID. This makes it possible to investigate dynamic processes as well as to take “snapshots” of interactions inside the cell. The labeling radius is reported to be smaller than 20 nm.32 APEX2 was generated through directed evolution.33 It carries a point mutation (A134P) that gives the enzyme an increased activity. APEX2 is also more resistant to the H2O2 used in the biotinylation process. These improvements allow the use of APEX2 at low expression levels. An example of how the APEX proximity-labeling approach can be used for interactome analysis is provided by Markmiller et al., who used a CRISPR/Cas9-generated G3BP1-APEX2-GFP fusion protein to identify interactions in stress granules in Drosophila under different conditions.34
One of the main differences between APEX and BioID lies in the side chains that the reactive biotin compound targets. Lysine residues, targeted in the BioID method, are usually more abundant and more exposed on the surface of the protein or complex than the electron-rich side chains targeted by APEX. In addition, the reactive biotin compounds have very different lifetimes. In the case of BioID, the lifetime of the BioAMP molecule is a few minutes, in contrast to the half-life of the biotin–phenoxyl radicals, which is less than 1 ms.32 This, in theory, would cause different labeling radii, but the reported ones are fairly similar. Both BioID and APEX2 have further been developed for binary PPI mapping through protein complementation assays.35,36 Method development in the proximity-labeling MS field is thus ongoing. Finally, it is worth mentioning that there are other, similar proximity-labeling approaches being developed using, for example, the SNAP-tag and photoreactive labeling molecules.37
Pros of proximity-labeling MS
Interactions can be examined and captured in a relevant cellular context.
It is suitable for identifying low affinity and transient interactions as the interaction does not have to be preserved.
Cons of proximity-labeling MS
It typically reports on the vicinity of proteins rather than binary interactions or complexes.
The fusion tags may sterically hinder interactions.
1.2.3 Crosslinking MS – XL-MS
XL-MS is emerging as a viable approach to capture low affinity and transient protein–protein interactions (Figure 1.2C).38 In these experiments, interaction partners are captured by the introduction of covalent bonds. The crosslinking can be incorporated into an AP-MS workflow and can be carried out in vitro or inside of living cells. In the latter case, the crosslinking agent is required to permeate the cell membrane. Several different crosslinking chemistries are used, from general approaches to tailored solutions, each with its unique benefits and drawbacks which are well summarized in a review by Holding.38 Commonly, homobifunctional N-hydroxysuccinimide esters are used to covalently link primary amines on the side chains of proximal lysine residues. Formaldehyde can also be used as a non-specific crosslinking agent that can pass through the cell membrane. The crosslinking can be reversed, most often by using elevated temperature.39 Other solutions include photoreactive crosslinkers that can be cleaved with UV light, crosslinkers that can be cleaved by reduction or collision in the mass spectrometer, isotopically labeled crosslinking agents or zero length crosslinkers.38 An example of XL-MS is the study of the nucleus interactome of human cells.40 In this study, an MS-cleavable crosslinker was used and the analysis identified approximately 8700 crosslinks. Strikingly, the results showed defined interaction hot spots on core histones. For crosslinked fragments with posttranslational modifications (PTMs) it was possible to extract data for how specific modifications influence nucleosome interactions. The information generated through crosslinking MS can further be used to elucidate structural details of proteins and protein complexes.41
Pros of XL-MS
It captures low-affinity and transient interactions.
Crosslinking inside living cells may provide information on biologically relevant interactions.
Cons of XL-MS
Formaldehyde can non-specifically crosslink proximal proteins.
The crosslinks may hamper tryptic digestion. Reversing the crosslinking should be considered.
Identification requires specialized software.
1.3 Display Methods
SLiM-based interactions are underrepresented in the current maps of the interactomes. Their often low-to-medium affinities (typically micromolar range), and rapid dissociation kinetics, make them difficult to capture through approaches such as AP-MS.42 However, a variety of display methods have been developed for finding these interactions, such as phage display, mRNA display and yeast surface display, which are described in this section (Figure 1.3). Other display systems, such as bacterial surface display, are described elsewhere.43 In the display methods, peptide or protein libraries are genetically encoded and are then presented by the display system, giving a link between the genotype and phenotype. Peptide display experiments are often performed using combinatorial peptide libraries where large, highly diverse peptide libraries are displayed. Such libraries are optimal for finding high-affinity ligands with shared consensus motifs. Once established, the shared consensus motif can be used to predict ligands in the human proteome, by bioinformatically scanning the proteome for matching regions.44 However, it is inherently difficult to bioinformatically predict ligands, and such analysis is prone to generate many false positive hits.45 A more straightforward approach to identify binding motifs in the proteome is to present peptides that represent intrinsically disordered regions of the proteome to a bait protein using a display system. This can be accomplished by fragmentation and incorporation of cDNA into the display system, or by computational design of synthetic oligonucleotide libraries (Figure 1.3).46,47 The latter gives better control over library quality and coverage.
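Scanning sequences for a consensus motif is, at its simplest, a pattern search. The sketch below shows the idea with a regular expression. The motif `P..P` and the two sequences are chosen purely for illustration (they are assumptions, not data from the chapter), and a lookahead is used so that overlapping matches are also reported.

```python
import re


def scan_for_motif(sequences, motif_regex):
    """Return (protein_id, 1-based start, matched peptide) for every match,
    including overlapping ones."""
    pattern = re.compile(f"(?=({motif_regex}))")
    hits = []
    for protein_id, seq in sequences.items():
        for match in pattern.finditer(seq):
            hits.append((protein_id, match.start() + 1, match.group(1)))
    return hits


# Hypothetical toy "proteome" with two short sequences.
toy_proteome = {"PROT_1": "MSAPPLPPRTK", "PROT_2": "MGGSSAAGG"}
for hit in scan_for_motif(toy_proteome, r"P..P"):
    print(hit)
```

Short, degenerate patterns of this kind match frequently by chance, which illustrates why purely sequence-based predictions tend to produce many false positives unless they are filtered, for example by restricting the search to intrinsically disordered regions.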
1.3.1 Proteomic Peptide-phage Display – ProP-PD
Of the display methods, phage display is dominating the field due to the ease and low cost of the experiments. A bacteriophage is a virus that infects and replicates in bacteria. By fusing a library of peptides, or proteins, to a phage coat protein, the peptides are displayed on the surface of the phage and can be used in selections against bait proteins of interest, as shown by the seminal work on the M13 peptide phage display by George Smith.48 The filamentous M13 phage and the lytic T7 phage are commonly used display systems.46 Here we describe the M13 phage display system, and we refer the readers to a recent review on the T7 system.49 Peptides are typically displayed on the M13 minor coat protein p3 or the major coat protein p8. In the first case, the phage will display between one and five copies of the peptide, and in the latter case the phage will display hundreds of copies of the same sequence. Multivalent display on the p8 protein is used to capture weak and transient interactions, and the monovalent display is used to capture high-affinity interactions (e.g. for inhibitor design). It is common to use a hybrid phage display approach, where a phagemid encodes a recombinant p3/p8 protein and a helper phage provides all other proteins required to make a functional phage. Libraries can be generated using fragmented cDNA, but high-quality proteomic peptide phage display (ProP-PD) libraries are constructed using synthetic oligonucleotide libraries bioinformatically designed to encode peptides that tile regions of a proteome of interest.46,50,51 The oligonucleotide libraries are fused to the gene encoding the selected phage coat protein, and the library is transformed into E. coli, which then produce the phage library.
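The tiling step of such a library design can be sketched as a sliding window over each region of interest. The example below is a minimal illustration, not the published ProP-PD design pipeline. The peptide length, the step size and the example sequence are assumptions chosen for the illustration, and reverse translation into oligonucleotides is not covered.

```python
def tile_region(sequence, peptide_length=16, step=4):
    """Return (1-based start, peptide) tiles covering the region.

    A final tile is added when needed so that the C-terminus is covered.
    """
    if len(sequence) <= peptide_length:
        return [(1, sequence)]
    starts = list(range(0, len(sequence) - peptide_length + 1, step))
    last_start = len(sequence) - peptide_length
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, sequence[s:s + peptide_length]) for s in starts]


# Hypothetical disordered region, tiled with short peptides for readability.
for start, peptide in tile_region("MSDEEPLPSTRKQAGS", peptide_length=8, step=4):
    print(start, peptide)
```

Overlapping tiles ensure that a short motif is present in full in at least one peptide, at the cost of a larger library. The overlap is therefore a design choice that trades coverage against library size.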
The phage library is used in selections (panning) against an immobilized bait protein, or against arrays of proteins immobilized in a 96-well plate.52 Unbound phage particles are washed away, and bound phage are eluted and used to infect actively growing bacteria. The infected bacteria produce more phage particles, after which the amplified binding-enriched phage library is used in a new round of selection; typically, three to five rounds are performed. The peptide-coding region of the binding-enriched phage particles is then sequenced. Traditionally, Sanger sequencing was used to sequence individual phage clones. Nowadays, next-generation sequencing (NGS) is routinely used to analyze the content of binding-enriched phage pools. This obviates the need for analyzing individual clones and provides deep information on the enriched sequences. Consensus binding motifs can be established based on the peptide sequences. The peptides can further be directly matched to the relevant proteome (or the library design), which provides direct information on which proteins the enriched peptides belong to. ProP-PD libraries have, for example, been used to chart interactions between PDZ domains and the C-terminal regions of human and viral proteomes.51 The approach has further been used to chart the docking motifs of protein phosphatases to their substrates (e.g. PP2A B56 and PP4).53,54 The method is applicable to a large part of the peptide-binding domains and proteins, and is expected to provide large-scale information on motif-based interactions in the years to come.55
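The matching of sequenced peptides back to the library design can be sketched as a counting and lookup step. The example below is a simplified illustration that assumes the NGS reads have already been translated into peptide sequences. The peptides, protein identifiers and the count cutoff are hypothetical.

```python
from collections import Counter


def summarize_selection(reads, library_design, min_count=2):
    """Count peptide reads and map them to source proteins.

    Returns rows of (peptide, protein, count, fraction of all reads),
    sorted by decreasing count. Reads that are absent from the library
    design (e.g. sequencing errors) are ignored.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    rows = []
    for peptide, n in counts.most_common():
        if n >= min_count and peptide in library_design:
            rows.append((peptide, library_design[peptide], n, n / total))
    return rows


# Hypothetical library design (peptide -> source protein) and reads.
design = {"AAAA": "PROT_1", "CCCC": "PROT_2", "DDDD": "PROT_3"}
reads = ["AAAA"] * 3 + ["CCCC"] * 2 + ["DDDD"] + ["XXXX"] * 2
for row in summarize_selection(reads, design):
    print(row)
```

In practice, the counts from selections against the bait are usually compared with those from control selections or with the unselected library, since some peptides are overrepresented for reasons unrelated to binding.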
Pros of ProP-PD
It is a scalable method for the identification of SLiM-based interactions.
Once constructed, a phage library can be used an almost unlimited number of times.
It provides largely unbiased information on motif-based interactions.
There is no need for specialized equipment for the experiments.
It is relatively low-cost.
Cons of ProP-PD
The search space is limited to the library design.
Some proteins may not behave well upon immobilization.
Some peptides can affect growth and infectivity of bacteria, which may skew the results.
False negatives can occur as a result of the competitive nature of the assay.
False positives can be caused by sticky or promiscuous peptides.
1.3.2 mRNA Display
mRNA display (Figure 1.3) is a well-established approach for finding high-affinity ligands for peptide binders.56 It takes advantage of the translation-terminating antibiotic puromycin, which is an analogue of the 3′ end of a tyrosyl-tRNA that has a non-hydrolysable amide bond. When puromycin enters the A site of ribosomes it forms a covalent bond with the nascent peptide and causes a release of the translation product. A covalent bond between a polypeptide and its encoding mRNA is accomplished during in vitro translation by having puromycin linked to the 3′ end of the mRNA templates used, using a flexible nucleotide spacer of appropriate length. This gives a physical link between the genotype and the phenotype. The generated mRNA library is then used in repeated rounds of selections, and the sequences of the selected peptides can be determined by DNA sequencing after reverse transcription and PCR amplification.
Typically, mRNA display is performed using highly diverse libraries of randomized peptide sequences.57 A main advantage of the approach is the possibility to display more diverse libraries than with other display techniques. However, mRNA display has also been used to display regions of the proteome. An mRNA-display library constructed with cDNA was, for example, used to discover calmodulin-binding proteins.58 Furthermore, an mRNA library derived from fragmented mRNAs isolated from various kinds of human cells was recently used to identify a high-affinity ligand for Keap1.59 Thus, the approach has the potential to contribute to interactome mapping.
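Why library diversity matters for randomized peptide libraries can be seen from a simple count of the sequence space. The sketch below compares the number of possible peptides of a given length with a library size on the order of 10^12 members, the figure quoted for mRNA display in this chapter. It assumes the 20 natural amino acids and treats every library member as unique, so the result is a theoretical upper bound rather than a statement about real library coverage.

```python
def sequence_space(peptide_length, alphabet_size=20):
    """Number of distinct peptides of the given length."""
    return alphabet_size ** peptide_length


def longest_fully_sampled_length(library_size, alphabet_size=20):
    """Largest n such that alphabet_size**n <= library_size."""
    n = 0
    while alphabet_size ** (n + 1) <= library_size:
        n += 1
    return n


for size in (10**8, 10**12):
    n = longest_fully_sampled_length(size)
    print(f"library of {size:.0e} members: all peptides up to length {n} "
          f"could in principle be represented ({sequence_space(n):.2e} sequences)")
```

Each additional randomized position multiplies the sequence space by 20, so even a 10 000-fold larger library extends the exhaustively sampled peptide length by only a few residues.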
Pros of mRNA display
mRNA display libraries can be larger than phage display libraries (10¹²).
It can be used for the display of non-natural/PTM containing peptides.
Cons of mRNA display
The mRNA is unstable.
It is more experimentally tedious than phage display.
1.3.3 Yeast Surface Display
The protein folding and secretory machinery of S. cerevisiae is homologous to that of mammalian cells, which can be used to efficiently express mammalian proteins that are difficult to express in bacterial hosts. Yeast surface display was pioneered by Boder and Wittrup in 1997.60 Aga2p is the protein most commonly used for yeast surface display, although other yeast cell wall proteins have been used for the display, as reviewed elsewhere.61 In the Aga2p system, proteins or peptides are displayed on the cell wall of yeast by fusion to a-agglutinin. This glycoprotein is composed of two subunits, Aga1p and Aga2p. The Aga1p subunit anchors the assembly to the cell wall via a covalent linkage, and the Aga2p subunit is linked to Aga1p via two disulfide bonds. Peptides or proteins are genetically fused to either end of Aga2p and are displayed on the yeast cell surface. Each yeast cell may display 10⁴–10⁵ copies of fusion proteins.61,62 By adding epitope tags to the N- and the C-terminus of the fusion protein it is possible to quantify the expression level of the fusion protein, which can be used to relate apparent affinities to the expression levels.
Yeast surface display libraries can be screened using labeled proteins through magnetic-activated cell sorting (MACS) or fluorescence-activated cell sorting (FACS, e.g. by using GFP-tagged baits or by staining the proteins using fluorescently labeled antibodies). The sorted yeast pools are then sequenced using NGS. Yeast surface display combined with FACS further allows for affinity discrimination between ligands, and opens up the possibility to obtain information on non-binding sequences.63 Libraries of 10⁶ clones may be screened by FACS. For larger libraries, MACS is sometimes used for a first enrichment of binders, followed by FACS sorting.64
Yeast surface displayed cDNA libraries have been constructed and used for various applications, such as the display of peptide libraries and the identification of antigens and interactions with small molecules.65–67 Yeast surface display has also been used for PPI mapping. For example, a yeast surface displayed human cDNA library displaying fragments of random length was used to find protein fragments that bind tyrosine-phosphorylated peptides, which revealed interactions with several phospho-peptide binding SH2 domains.68 Recently, the yeast display system has been engineered for screening based on the reprogramming of yeast mating.69 The authors showed that the method can be used to capture interactions with Kd values in the range of 0.5 nM to 300 μM and that the approach can be used for library-on-library screening, meaning that one library is screened for interactions against another library. It will be very interesting to see the results of this large-scale screening approach over the years to come.
Pros of yeast surface display
Proteins and peptides are expressed in yeast, which is a eukaryotic host that has some levels of PTMs.
It is possible to obtain information on non-binding sequences.
It is possible to multiplex.
Cons of yeast surface display
The library size diversity (∼10⁷–10⁸) is lower than for other display methods.
The avidity effects caused by the high number of copies of displayed peptides or proteins per yeast cell may lead to the capture of weak and irrelevant interactions.
1.4 Array-based Methods
Array-based methods are used to screen for interactions between a given ligand and an array of immobilized proteins or peptides (Figure 1.4). When combined with MS, the approaches can also be used for proteome-wide interaction screening. These approaches are excellent for characterizing SLiM-based interactions and allow the identification of interactions that rely on PTMs. A general drawback is the limited quality control of immobilized proteins or peptides.
1.4.1 Protein Arrays
PPIs can be screened using protein microarrays. For these experiments, proteins typically need to be expressed and purified. Low-throughput purification and characterization give proteins of high quality, that can then be arrayed. Alternatively, proteins can be expressed and purified in high-throughput (e.g. 96-well format) although typically with less control of the protein quality.70,71 The purified proteins are immobilized on a surface and probed for binding to a labeled peptide or protein.72–74 Unbound ligands are washed away and bound ligands are detected using the probe, which can be, for example, a fluorescent tag. Alternatively, an antibody that recognizes the tag of the query protein/peptide can be used together with a fluorescently labeled secondary antibody (or HRP conjugated antibody). The output signal can then be detected spectrophotometrically.75
Protein arrays have, for example, been used for family-wide analysis of binding specificities. The interactome of the human WW domain family was explored by screening a WW domain array against 2200 peptides from the proteome with putative WW binding motifs.76 Similarly, the interactome of murine PDZ domains was analyzed by a combination of protein microarrays and quantitative fluorescence polarization assays.77 The approach can further be used to screen for PTM-dependent interactions. For example, protein microarrays of 106 phosphotyrosine-binding SH2 domains and 41 phosphotyrosine-binding (PTB) domains were used to explore their phospho-Tyr-dependent interactions with sites on the ErbB receptors, thereby providing an interaction network for these receptors.78
A general limitation of protein arrays is the need to purify the proteins. NAPPA (Nucleic Acid Programmable Protein Array) is a variant of protein arrays that circumvents the issue of protein production and purification. In NAPPA, the cDNA of the proteins to be screened is immobilized on a microarray surface (glass or nitrocellulose).79 The cDNA is transcribed in vitro by RNA polymerase, followed by translation in situ using a cell-free expression system based on rabbit reticulocyte lysate or wheat germ extract.80 The expressed proteins are fused to a tag (typically GST) that is recognized by a capture reagent (e.g. an anti-GST antibody) immobilized next to the cDNA, which captures the expressed protein on the surface of the microarray. The immobilized proteins are then challenged with the query protein or peptide, which can be detected by a suitable antibody. The NAPPA approach was benchmarked by mapping pairwise interactions between a set of DNA replication initiation proteins, and recapitulated 85% of the previously biochemically determined interactions.75 The number of spots on a NAPPA array has typically been limited to about 2000 due to a relatively low capture efficiency of the anti-GST antibody.81 More recently, Yazaki et al. showed that the use of a HaloTag led to a four times greater capture of the expressed protein on the microarray surface.82 Using this approach, they made a high-density protein array of 12 000 Arabidopsis proteins and generated an interactome of 38 transcription factors and their transcriptional regulators.
Pros of protein arrays
They can be used to screen for interactions that require PTMs of target peptides.
Self-assembling protein arrays obviate the need for purifying proteins. In addition, once the cDNA is bound, the microarray is stable under dry room conditions until protein expression is activated.
Cons of protein arrays
The production and quality control of proteins for a classical protein array can be labor intensive.
Some proteins may not behave well upon immobilization.
The tag added for protein immobilization in NAPPA can cause steric hindrance or negatively influence protein folding.
1.4.2 Peptide Arrays
Peptide arrays are frequently used to explore SLiM-based interactions. In these arrays, overlapping peptides (typically in the range of 12–18 amino acids) are designed to tile a protein sequence (or even a proteome of interest) and are covalently attached to a solid support (a cellulose membrane or a glass slide).83,84 The peptide array is incubated with the protein of interest, and the bound protein is then detected using standard immunoblotting techniques, or by using fluorescent target proteins. Once a binding region has been identified, the binding motif can be established by systematically varying (e.g. Ala scanning) the identified sequence. Peptide arrays are often used to narrow down the binding region in a known target. For example, following such an approach, a novel LxxPTPh motif in yeast protein Kar9 was found to bind to End-binding proteins.85 The array designs are often made to tile the full sequence of the protein of interest, but it might be wiser to focus on the intrinsically disordered regions to avoid identifying false positive hits (e.g. sequences that can bind but would not be available for binding in the context of the full-length protein). Peptide arrays can also be used to chart interactions on a proteome-wide scale. Such an analysis was performed for yeast SH3 domains using a peptide array that presented all the predicted SH3 binding motifs in the species.86,87
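The tiling and Ala-scanning designs described above can be sketched in a few lines of Python. This is a minimal illustration, not a vendor design tool: the function names, the default peptide length of 15 and the step size of 3 are assumptions chosen to fall within the 12–18 amino acid range mentioned in the text, and the example sequence is a made-up placeholder.

```python
def tile_peptides(sequence, length=15, step=3):
    """Return (start_position, peptide) tuples that tile a sequence.

    Positions are 1-based. `length` and `step` are illustrative defaults
    (assumptions), not values prescribed by any particular array protocol.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [(1, sequence)]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    # Make sure the C-terminal residues are covered by a final peptide.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, sequence[s:s + length]) for s in starts]


def alanine_scan(peptide):
    """Return (label, variant) tuples with one residue at a time set to Ala."""
    variants = []
    for i, residue in enumerate(peptide):
        if residue == "A":
            continue  # position is already Ala, nothing to substitute
        label = f"{residue}{i + 1}A"
        variants.append((label, peptide[:i] + "A" + peptide[i + 1:]))
    return variants


if __name__ == "__main__":
    toy_sequence = "MSTNPKPQRKTKRNTNRRPQDVKFPGG"  # hypothetical example sequence
    for start, peptide in tile_peptides(toy_sequence, length=12, step=5):
        print(start, peptide)
    for label, variant in alanine_scan("LQKPTPH"):
        print(label, variant)
```

Restricting the input to predicted intrinsically disordered regions, as suggested above, would simply mean calling `tile_peptides` on those subsequences rather than on the full-length protein.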
Peptide arrays can further be used to explore the effects of PTMs on binding. The phospho-Tyr binding specificity of a set of SH2 domains, for example, was elucidated using a quantitative peptide microarray-based approach of 124 physiological phospho-Tyr peptides.88 Moreover, the SH2 domain interaction landscape of the human cell was elucidated using a high-density peptide chip of most of the phospho-Tyr-containing ligands of the human proteome.89 The effects of other PTMs can also be probed, as shown for the acetylation-dependent interactions of the bromodomain family.90
Another application of peptide arrays for interaction screening is peptide array crosslinking (PAX).91 PAX uses the setup of peptide arrays and combines it with crosslinking to capture interactions between peptide motifs and proteins. A photoactive crosslinker (pBpa, a phenylalanine derivative) is incorporated in the peptide sequences of the array, and the peptide array is then incubated with cell lysate and subjected to UV light (350–365 nm) to induce crosslinking. The non-crosslinked proteins are then washed off using harsh, denaturing conditions, and parts of the peptide array membrane are subjected to MS analysis after sample preparation. The array setup allows for higher throughput. The crosslinking happens very close to the interaction site, as only short peptides are presented on the array membrane, and is fairly specific. However, it is important to consider the position of the photoactive crosslinker in the sequence, to avoid disrupting interactions or introducing new ones. The amount of crosslinked interacting proteins is low, which can be a problem for the MS analysis.
A recent addition to the toolbox is protein interaction screen on peptide matrix (PRISMA), which combines peptide arrays with MS.92,93 In PRISMA, peptides of up to 25 amino acids are synthesized on a cellulose membrane. The membrane is incubated with a cell extract for affinity enrichment of binders, peptide spots are isolated, bound proteins are digested and the identities of the enriched proteins are then determined through MS. The approach is powerful as it combines the detailed analysis provided by peptide arrays with the unbiased approach of MS. PRISMA has been used to explore how mutations in intrinsically disordered regions influence PPIs, which uncovered a pathogenic gain-of-function mutation that creates a novel dileucine motif.92 The approach has further been used to explore the interactome of the CCAAT enhancer-binding protein beta (C/EBPbeta) and how it is modulated by PTMs.93
Pros of peptide arrays
They are available from several commercial providers.
Throughput is relatively high.
They allow semi-quantitative analysis.
Information on both binder and non-binder peptides is provided.
Non-natural amino acids can be incorporated, and it is possible to probe PTM-dependent interactions.
PAX allows for harsh washing conditions.
PRISMA combines the resolution of the peptide arrays with the unbiased approach of MS.
Cons of peptide arrays
The process is prone to produce both false positive and false negative hits, which in part is due to a varying yield and purity of the synthesized peptides.
1.5 Fluorescence- and Luminescence-based Methods
Fluorescence- and luminescence-based methods are often used to confirm or to screen for binary interactions.94 In these assays the bait and the prey proteins are genetically linked to a fluorescence/luminescence donor and an acceptor, respectively, and co-expressed in a cell. If the bait and prey interact and are thereby brought into close proximity, a fluorescence/luminescence signal can be detected using a suitable microscopic setup (e.g. a confocal microscope and analysis software) (Figure 1.5). A common advantage of these approaches is that they report on binary interactions, and a shared disadvantage is that they require bait and prey proteins to be physically linked to fluorescent donors and acceptors, which may affect interactions and cellular localization.
1.5.1 Förster Resonance Energy Transfer (FRET)
FRET is a mechanism by which energy is transferred from one light-sensitive molecule to another.95 The fluorophores are chosen to ensure an overlap between the emission spectrum of the donor and the absorbance spectrum of the acceptor. Examples of FRET pairs are ECFP/EYFP, mTurquoise/mCitrine and EGFP/mCherry.94 An external light source is used to excite the donor. If there is an interaction between the bait and the prey proteins, the donor and the acceptor will be brought into close proximity. There will then be a non-radiative transfer of energy from the donor to the acceptor through a dipole–dipole coupling mechanism. The fluorescent acceptor will in turn emit a fluorescent signal at wavelengths that are distinct from the light emitted by the donor fluorophore, which can be detected using microscopy.96 The allowed distance for efficient energy transfer between the two fluorophores is 1–10 nm.
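The steep distance dependence behind the 1–10 nm window follows from the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor–acceptor distance and R0 is the Förster radius at which half of the energy is transferred. The sketch below evaluates this relation; the default R0 of 5 nm is an assumed, illustrative value, since the actual Förster radius depends on the fluorophore pair and its orientation.

```python
def fret_efficiency(r_nm, r0_nm=5.0):
    """Förster transfer efficiency for a donor-acceptor distance r (nm).

    r0_nm is the Förster radius; 5.0 nm is an illustrative assumption,
    not a measured value for any specific fluorophore pair.
    """
    if r_nm <= 0 or r0_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    for r in (1.0, 2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power dependence, the efficiency drops from one half at r = R0 to under 2% at r = 2 R0, which is why FRET reports only on fluorophores that are held very close together.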
It is important to select the donor and the acceptor pair carefully, as there may be problems with crosstalk and bleed through. Crosstalk is caused by overlapping excitation spectra of the donor and the acceptor and results in the acceptor being excited directly by the applied excitation light. Bleed through occurs when the fluorescence emission from the donor is detected in the range of the acceptor fluorescence emission. FRET requires optimization of the orientation and localization of the donor and acceptor.97,98 However, once an efficient system has been established, it can be used for inhibitor screening in a high-throughput manner.99 The approach can also be used for PPI screening when combined with automated fluorescence lifetime imaging (FLIM).100
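One common way to handle crosstalk and bleed through in intensity-based measurements is to estimate both contributions from donor-only and acceptor-only control samples and subtract them from the raw signal in the FRET channel. The sketch below illustrates that arithmetic only; the function names and the example intensities are hypothetical, and real analyses also need background subtraction and instrument-specific calibration.

```python
def spectral_coefficient(fret_channel_signal, reference_channel_signal):
    """Fraction of a single-fluorophore signal that appears in the FRET channel.

    Measured on a donor-only sample this gives the bleed-through coefficient;
    measured on an acceptor-only sample it gives the crosstalk coefficient.
    """
    if reference_channel_signal <= 0:
        raise ValueError("reference signal must be positive")
    return fret_channel_signal / reference_channel_signal


def corrected_fret(i_fret, i_donor, i_acceptor, bleed_through, crosstalk):
    """Subtract donor bleed through and direct acceptor excitation (crosstalk)."""
    return i_fret - bleed_through * i_donor - crosstalk * i_acceptor


if __name__ == "__main__":
    # Hypothetical control measurements (arbitrary intensity units).
    bleed = spectral_coefficient(200.0, 1000.0)   # donor-only cells
    cross = spectral_coefficient(50.0, 500.0)     # acceptor-only cells
    print(corrected_fret(600.0, 1000.0, 500.0, bleed, cross))
```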
Pros of FRET
Dynamic intracellular equilibrium between complex formation and dissociation can be observed.
It allows the detection of the interactions in living cells.
Cons of FRET
The free fluorophores can mask energy transfer.
The spatial proximity of the fluorophores is very important for strong read out.
There are problems with crosstalk and with bleed through.
Autofluorescence in the cell can cause a strong background signal. Increasing the intensity of the external light sources leads to enhanced output signal but also increases photobleaching.
1.5.2 Bioluminescence Resonance Energy Transfer (BRET)
BRET circumvents some of the limitations of FRET.101 In this approach the bait protein is typically linked to a Renilla luciferase and the prey is linked to a green or yellow fluorescent protein (GFP/YFP) that serves as energy acceptor. The luciferase and the fluorescent protein are brought into close proximity if the bait and prey interact. When a luciferin substrate is added, the luciferase oxidizes the substrate and emits light with a peak emission wavelength of 482 nm. If the GFP/YFP energy acceptor is in close proximity, the energy is transferred to the acceptor, and is released as green/yellow light emitted by the fluorophore fused to the prey. For BRET the distance for the energy transfer has to be in the 10 nm range, which is in the same range as FRET.101 The emission intensity of the acceptor is related to the light emitted by the donor luciferase and has to be normalized to the donor luciferase signal alone in control cells. BRET has, for example, been used to study the dynamics and activity of G-protein coupled receptors (GPCRs), such as their multimerization and their trafficking.102–105
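The normalization described above is often expressed as a net BRET ratio: the acceptor-to-donor emission ratio in cells expressing both fusions, minus the same ratio in control cells expressing the donor fusion alone. The following sketch shows that calculation; the function names and the numbers are hypothetical, and plate-reader filter settings are deliberately left out.

```python
def bret_ratio(acceptor_emission, donor_emission):
    """Ratio of light detected in the acceptor channel to the donor channel."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    return acceptor_emission / donor_emission


def net_bret(sample_acceptor, sample_donor, control_acceptor, control_donor):
    """BRET ratio of the sample minus the ratio of donor-only control cells."""
    return (bret_ratio(sample_acceptor, sample_donor)
            - bret_ratio(control_acceptor, control_donor))


if __name__ == "__main__":
    # Hypothetical luminescence counts: bait-luciferase plus prey-YFP cells,
    # and control cells expressing the bait-luciferase fusion only.
    print(net_bret(900.0, 1000.0, 400.0, 1000.0))
```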
The standard BRET system has been improved by the development of NanoLuc, a small (19 kDa) stable enzyme engineered from a luciferase subunit from the deep-sea shrimp Oplophorus gracilirostris. NanoLuc has a high emission intensity and a relatively narrow spectrum, and is 150-fold brighter than the Renilla luciferase when used together with the substrate furimazine.106 Furimazine further displays enhanced stability and lower background activity, resulting in overall improved performance. Several commercial NanoLuc vectors and systems are available, as reviewed in detail elsewhere.107
Pros of BRET
Interaction can be measured in cells and in real time.
It is more sensitive than FRET.
Cons of BRET
BRET signals tend to be significantly weaker than the signals produced by FRET.
The spatial proximity of the donor and the acceptor is very important for strong read out.
The tagging of the bait and prey proteins may block interactions.
1.5.3 Luminescence-based Mammalian Interactome Mapping – LUMIER
In LUMIER, the bait is expressed in mammalian cells with an affinity tag (e.g. FLAG).108,109 The prey is genetically linked to a luciferase (e.g. Renilla luciferase) and co-expressed with the bait in mammalian cells. The complex is co-immunoprecipitated with an antibody against the affinity tag. A luciferase substrate is then added. If the prey was co-immunoprecipitated with the bait, the luciferase will catalyze the formation of a luminescent product that is used as an output signal. The intensity of the measured signal relates to the amount of captured complex. By subsequently performing an enzyme-linked immunosorbent assay (ELISA), a LUMIER-determined interaction can be analyzed to determine the interaction strength.109 The assay detects binary interactions but may also report on indirect interactions. In the original high-throughput study from 2005, the PPIs initiated by transforming growth factor-beta signaling and the influence of post-translational modifications on the related SMAD signaling pathway were investigated.108 LUMIER can further be used to study interactions of transmembrane proteins.110
Pros of LUMIER
A straightforward read out is provided.
PPIs are detected in a cellular context, which gives the opportunity to study for example the effect of post-translational modification on binding.
It is scalable and can be performed in different cell lines.
Cons of LUMIER
Cells have to be lysed before the interaction is analyzed, which may disrupt weaker PPIs or introduce artifacts by bringing together proteins that would not normally interact with each other.
The transfection efficiency and expression of the proteins influence the output.
It is necessary to tag proteins.
1.6 Protein Complementation Assays (PCAs)
PCAs, or split systems, are based on a system where the potential binding partners are fused to split domains of a protein.111 The first example of a split protein was ribonuclease S, which was found to spontaneously fold into a structure from two unfolded fragments.112 The enzymatic activity could be restored upon reassembly of the two fragments. The ribonuclease S example illustrates two important features of the split systems, namely a lack of function of each fragment in the absence of the other fragment, and restoration of function upon complementation. Several split systems have been developed. We describe a set of them in this section, including the split dihydrofolate reductase (DHFR) and split TEV, and the use of split fluorescent and bioluminescent proteins.113 In addition, several split ubiquitin-based systems have been developed. Among the shared advantages of the split systems is that they can be used for high-throughput screening. In terms of disadvantages, the use of tagged proteins may lead to blocking of interaction sites and the interaction between the split fragments can stabilize the interaction between the tagged proteins and increase the barrier for dissociation (Figure 1.6).
1.6.1 Split DHFR and Split TEV
DHFR is an essential enzyme that catalyzes the reduction of dihydrofolic acid (DHF) to tetrahydrofolic acid. The split DHFR assay uses a mutated version of murine DHFR that is insensitive to the DHFR inhibitor methotrexate (MTX). The protein can be split into two parts that form a functional enzyme when they are brought into close proximity. The DHFR fragments are genetically linked to the bait and the prey proteins, which are then co-expressed in a cellular context. Upon bait and prey interaction, the split DHFR parts fold into a functional and active enzyme. The reconstituted DHFR allows the cell to grow in MTX-containing medium, which can be used for survival selection.114 By fusing a bait to one of the fragments and a cDNA library to the other fragment, the interactome of the bait can be elucidated; the approach has been used to perform a genome-wide screen for PPIs in yeast.115 More recently, the approach has been combined with molecular barcoding and NGS analysis, which allows for multiplexed assays that can be used to screen for condition-dependent changes in PPIs.116 Using a dual barcoding system, the approach can be used for large-scale analysis.117
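To illustrate how barcode sequencing can be turned into a survival-selection read out, the sketch below compares barcode frequencies between an MTX-selected pool and an unselected control pool. It is a generic enrichment calculation under stated assumptions, not the analysis pipeline of the cited studies: the function name, the pseudocount and the barcode labels are all hypothetical.

```python
import math


def log2_enrichment(selected_counts, control_counts, pseudocount=1.0):
    """Log2 ratio of barcode frequencies (selected pool vs. control pool).

    A pseudocount (an assumption of this sketch) avoids division by zero
    for barcodes that are missing from one of the pools.
    """
    barcodes = set(selected_counts) | set(control_counts)
    n = len(barcodes)
    selected_total = sum(selected_counts.values()) + pseudocount * n
    control_total = sum(control_counts.values()) + pseudocount * n
    scores = {}
    for barcode in barcodes:
        selected_freq = (selected_counts.get(barcode, 0) + pseudocount) / selected_total
        control_freq = (control_counts.get(barcode, 0) + pseudocount) / control_total
        scores[barcode] = math.log2(selected_freq / control_freq)
    return scores


if __name__ == "__main__":
    # Hypothetical read counts per bait-prey barcode.
    mtx_pool = {"pair_A": 30, "pair_B": 10}
    control_pool = {"pair_A": 10, "pair_B": 30}
    for barcode, score in sorted(log2_enrichment(mtx_pool, control_pool).items()):
        print(barcode, round(score, 2))
```

Barcodes that become more frequent under MTX selection obtain positive scores, consistent with a reconstituted DHFR, while depleted barcodes obtain negative scores.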
The split tobacco etch virus protease (TEV) assay is based on using inactive fragments of the TEV protease that reconstitute an active enzyme only when they are co-expressed as fusion proteins with interacting proteins.118 The functional reconstitution of TEV protease is detected by the proteolytic activation of reporters. The read out may be the proteolytic release of a transcription factor and the subsequent production of a firefly luciferase or EGFP. Alternatively, the read out may be the release of an inhibited luciferase that becomes active upon proteolytic release. As the reporter is only active after proteolytic release, the approach does not report on the site of interactions. The versatility of the approach allows it to be applied to detect PPIs at the membrane and in the cytosol of mammalian cells.118 The split TEV system has, for example, been used to detect phosphorylation-dependent and transient interactions in living cells.119 A split receptor tyrosine kinase (RTK)-TEV system has further been developed, which can be used for interaction screening as well as for testing the effects of pharmacologically relevant inhibitors on the interactions.120
Pros of split DHFR and split TEV
Assays can be performed in prokaryotic and eukaryotic cells.
Assays can be performed in a high-throughput manner.
They are suitable for drug screening and analysis of the drug-induced changes in PPIs.
Cons of split DHFR and split TEV
They are difficult to quantify based on the output signal.
1.6.2 Bimolecular Fluorescence Complementation (BiFC) and Bimolecular Luminescence Complementation (BiLC) Assays
BiFC is a frequently used approach for testing PPIs in cells. In BiFC, a fluorophore, such as GFP, is split into two fragments that are fused to the bait and the prey, respectively. The two proteins are co-expressed in the cell and, upon interaction of prey and bait, the fragments of the fluorescent protein are located in close proximity and complement each other to reconstitute a functional fluorescent protein. The intrinsic fluorescence can then be determined by microscopy.121,122 Using a combination of high-resolution imaging and BiFC, it is possible to track the subcellular distribution and dynamics of individual interacting protein pairs.123 BiFC can also be used in a higher throughput approach. For example, the interactions of proliferating cell nuclear antigen (PCNA) were screened using a human cDNA library.124 In this screen, PCNA was linked to the C-terminal part of the yellow fluorescent protein Venus, and the cDNA library was fused to the N-terminal Venus fragment. The assay was combined with fluorescence-activated cell sorting and NGS to screen for PPIs in living cells. The screen confirmed previously reported interactions and identified several novel interactors.124
The BiLC method is a luminescence-based approach that is conceptually similar to BiFC.125 In BiLC, a luciferase is split into two parts and these are genetically linked to the bait and the prey proteins. Upon interaction between the bait and the prey, the reporter fragments are reassembled to a catalytically active pseudo-luciferase, which catalyzes the oxidation of the cell membrane permeable substrate. Several luciferases can be used for the assay, such as firefly luciferase (FLuc) from Photinus pyralis, RLuc from the sea pansy (Renilla reniformis) and GLuc from the copepod crustacean Gaussia princeps.125–129 FLuc uses D-luciferin/ATP as substrate, while RLuc and GLuc use coelenterazine. The luminescence generated by the decomposition of the oxidized species is detected using a luminometer. This can be used for compartment-resolved determination of PPIs.
A variant of BiLC uses dual-color click beetle luciferase heteroprotein PCA. In this approach, the split protein is reconstructed from different luciferases, such as click beetle green and click beetle red, that utilize the same substrate (D-luciferin). The inactive fragments are genetically linked to the potential interaction partners. Upon binding, the split fragments fold into an active structure. By having split fragments from several luciferases, it is possible to tag several potential interaction partners, and detect and differentiate between the interaction partners based on the different emission spectra.130
Pros of BiFC & BiLC
They allow identification of interactions in living cells together with information on where the interactions occur.
The techniques are very sensitive, allowing identification of weak and transient interactions, also at low protein concentrations.
They are low tech and low cost.
Cons of BiFC & BiLC
False positives can occur because of the inherent affinity between the BiLC or BiFC fragments.
The fusion to the fluorophore fragments can influence the localization and interactions of the bait and prey proteins.
Slow maturation times and high stability of the complemented complexes limit the possibility for real-time analysis.
1.7 Two-hybrid Methods
Two-hybrid methods are designed for binary PPI screening (Figure 1.7). These genetic approaches, and in particular yeast two-hybrid (Y2H), have together with AP-MS dominated the field of PPI analysis. The methods are based on the activation of downstream reporter genes as a consequence of a transcription factor binding an upstream activating sequence. The methods are scalable, which has been taken advantage of for binary interaction profiling of the human proteome.131 Since the first description of Y2H in 1989, several variants have been designed for specific purposes, such as screening of interactions involving membrane proteins (e.g. MYTH and MaMTH), which is highly relevant as around 30% of all proteins are integral membrane or membrane anchored.132–134 Emerging multiplexed two-hybrid methods will contribute to the speed and depth of interactome mapping.135,136 Novel n-hybrid methods are further being developed for charting interactions and for exploring the effects of mutations, which will help to unravel novel parts of the interactome and to shed light on the molecular details of the interactions.137
1.7.1 Yeast Two-hybrid (Y2H)
Y2H is a common approach for interaction screening.138 As the name implies, the assay takes place in yeast, and in particular in the yeast nucleus. Y2H utilizes the two different domains of the transcription factor GAL4, which is a yeast-specific transcription factor that binds to an upstream activating sequence (UAS) of reporter genes. Wild-type GAL4 has two domains, the DNA-binding domain (DNA-BD) that recognizes the UAS, and a transcriptional activation domain (AD) that contains binding sites for other transcriptional coregulators. In Y2H these two domains are physically separated at the genetic level and fused to the genes encoding candidate interacting proteins (the bait and the prey).132 The plasmids containing the prey fused to the AD and the bait fused to the DNA-BD are co-transformed into yeast and the fusion proteins are expressed. If the bait and prey interact, the GAL4 AD and DNA-BD will be brought into close proximity, and the domains will together function as a transcription factor, leading to expression of a reporter gene such as luciferase or GFP. Y2H has been shown to be an efficient method for large-scale screening.138 It has, for example, been used to detect interactions in a wide range of proteins from yeast, bacteria, animal and plant systems.132,139,140 Y2H interaction screening can be performed on a genome-wide scale using an ORFeome collection. Indeed, in a massive effort, Vidal and co-workers have been working on an all-by-all binary interaction mapping of human PPIs.141
Pros of Y2H
It allows binary interaction mapping.
It identifies both stable high-affinity interactions and more transient interactions.
It is an easy and economical approach that can be performed in most labs.
Cons of Y2H
It is prone to produce false positive hits. False positive hits may be caused by auto-induction, or by the high concentration of the overexpressed bait and prey proteins.
It is prone to produce false negatives. False negatives may result from low protein expression levels, problems with protein folding or lack of PTMs. Other problems relate to proteins that are excluded from the nucleus or are toxic to yeast. Further, the fusion of bait and prey proteins with the AD and the BD may lead to steric hindrance that masks the interaction sites.
1.7.2 MYTH and MaMTH
MYTH (membrane yeast two-hybrid), first described in 1998, is a method to detect interactions involving membrane proteins.142 The MYTH technique is an adaptation of a split ubiquitin assay.143 In this assay, the bait, which is the membrane protein of interest, is tagged with the C-terminal module of ubiquitin (Cub) followed by a chimeric transcription factor. The prey protein is fused to the N-terminal part of ubiquitin (Nub). The prey proteins can be membrane bound or cytosolic in nature. The spontaneous association between the Nub and the Cub is inhibited by a mutation in the Nub part of the protein. Yeast cells are transformed with the vectors encoding the bait fused to the Cub and the prey fused to the Nub. The cells already contain a stably integrated region for binding the transcription factor followed by a region coding for expression of an output. Upon bait and prey interaction, pseudo-ubiquitin is reconstituted by the association of the Nub and the Cub. The reconstituted ubiquitin is recognized by deubiquitinating enzymes (DUBs), which are recruited to the site and proteolytically cleave the polypeptide chain, whereby the transcription factor is released. The transcription factor enters the nucleus and binds the UAS of the reporter.144 Different transcription factors can be used depending on the system, for example a lexA-based binary expression system. LexA is a bacterial transactivator, which can be introduced into the yeast cell where it binds to the lexA promoter, leading to the transcription of selection markers such as LacZ, ADE2 or HIS3, which results in growth advantages.144 MYTH has, for example, been used to screen for interactions of 48 full-length human ligand-unoccupied GPCRs in their native membrane environment against a human cDNA library.145
MaMTH is a dedicated method for detecting interactions involving membrane proteins in mammalian cells.146 It is conceptually similar to MYTH but performed in mammalian cells. In MaMTH the Cub is linked to a chimeric transcription factor and fused to the bait protein (the membrane protein of interest). Different transcription factors can be used depending on the system (e.g. lexA or GAL4-based). The mammalian cells to be used for the assay are modified to contain a stably integrated transcription factor binding region (or operator) followed by a gene coding for an output signal, for example GFP or luciferase.146 The cytosolic or membrane-bound prey is genetically fused to the Nub. The plasmids encoding the bait and the prey proteins are co-transfected into the cell. If the bait and the prey interact, pseudo-ubiquitin is formed by the association of Nub and Cub, which leads to the recruitment of DUBs and the proteolytic release of the transcription factor. The transcription factor then enters the nucleus and binds the operator, which leads to the production of the output signal, which can be detected by microscopy. MaMTH has, for example, been used together with MYTH to map the genome-wide interactome of receptor tyrosine kinases with phosphatases. In the initial MYTH screen, interactions between 48 receptor tyrosine kinases and 108 phosphatases were elucidated, which identified 310 unique PPIs. Using MaMTH, nine additional interactions were identified.147 MaMTH can further be used to explore how interaction patterns change in response to ligands or other external stimuli.146
Pros of MYTH and MaMTH
They constitute a low-technology approach that detects binary interactions.
They are easy to scale up.
They are suitable for identification of interactions of full-length membrane proteins in a suitable membrane environment.
Cons of MYTH and MaMTH
In the case of MYTH, proteins from another organism are expressed and modified in a yeast host, which may differ from their native context.
Protein overexpression may lead to artifacts.
At least one terminus of the membrane protein has to be cytosolic and linked to the Cub. This may block C-terminal binding motifs.
1.7.3 Mammalian PPI Trap (MAPPIT)
MAPPIT is another two-hybrid method that can be used to identify PPIs in a cellular environment.148,149 MAPPIT is based on insights into cytokine receptor signaling.150 It uses a STAT (signal transducer and activator of transcription)-dependent complementation assay approach. Under physiological conditions, cytokine receptors dimerize upon ligand binding, which leads to a close proximity of the receptor-associated Janus kinases (JAKs). JAKs cross-phosphorylate each other on tyrosine residues in their activation loops, which increases the activity of the kinase. The activated kinase domain then phosphorylates a tyrosine on the receptor tail, which creates a docking site for the phospho-Tyr-binding SH2 domain of STAT. Upon binding of STAT, JAK phosphorylates STAT, which leads to dissociation of STAT from the receptor. Phosphorylated STAT forms a dimer and enters the nucleus where it induces the transcription of target genes.151,152 The signaling system has been modified for the purpose of MAPPIT. The STAT SH2 target site is eliminated in the C-terminal tail of the chimeric receptor. The bait is fused to the cytosolic end of the receptor. The prey protein is genetically linked to a cytokine receptor fragment with functional STAT docking sites. Upon interaction of the prey with the bait, the receptor-associated JAK phosphorylates the binding site for the STAT SH2 domain, which in the next step leads to the phosphorylation of STAT. The phosphorylated STAT homo-dimerizes, enters the nucleus and induces the expression of an output signal under control of a STAT promoter, usually a luciferase activity or a resistance signal.148,152 The output signal level is dependent on several factors, including the affinity of the interaction. The approach can be used for proteome-wide screening of binary interactions.153
A recent study demonstrated that a high-throughput MAPPIT approach can further be used to identify and map interaction surfaces of various proteins.154 In reverse MAPPIT, small molecules or peptides can be screened for their potency to inhibit PPIs.155
Pros of MAPPIT
It can be used to examine the endogenous interaction in the cellular context.
It is scalable.
There is no need for specialized equipment other than cell culture and a tool to detect the output signal such as luciferase.
It detects binary binding events.
Cons of MAPPIT
The extremes of the output signals can be determined easily, but in the intermediate range it is difficult to make a clear statement about the results.
The relatively large tag may block physical interactions at the cytoplasmic submembrane region.
It is limited to interactions that can occur in the cytoplasm close to the membrane.
Interactions that activate the receptor or STAT are difficult to study.
It is not compatible with full-length transmembrane proteins.
1.8 Proximity Ligation Assay (PLA)
PLA is a method to detect whether two proteins are in the proximity of each other in fixed cells or tissue.156 The assay requires two antibodies specific for the target proteins. The antibodies are conjugated to oligonucleotides that serve as the probes. If the two proteins of interest are interacting and are located close together (below 25 nm for directly conjugated primary probes), the oligonucleotides that are conjugated to the antibodies are brought into close proximity, and can hybridize to connector oligonucleotides.156 This leads to the formation of a circular molecule, which is amplified through rolling circle amplification using one of the probes as a primer and a highly processive polymerase. The rolling circle amplification results in a long single-stranded DNA with multiple repeats of the amplified sequence. Fluorophore-labeled oligonucleotides complementary to the amplified sequence are then hybridized, and are detected by standard fluorescence microscopy. The rolling circle reaction leads to a signal amplification, which allows for the detection of complexes under endogenous protein expression levels.156 The oligonucleotide design and conjugation chemistries, as well as the choice of affinity reagent, can either decrease or extend the distance tolerated between the proteins of interest. The use of a secondary probe can further increase the acceptable distance between the proteins of interest.157 PLA is well suited for low-throughput validations of binary PPIs, but more challenging to apply for large-scale analysis. However, it can be performed on a larger scale given access to suitable antibodies. For example, Chen et al. used PLA in HeLa cells to test over 1200 binary PPIs listed in a database of PPIs that they had previously generated (POINeT), and confirmed 46% of the suggested interactions.158,159 Recently, the probes for in situ PLA have been redesigned for increased sensitivity.160 ProxHCR is a further development of PLA that does not rely on an enzymatic step.161
Pros of PLA
PLA is a label-free method that can detect interactions between low-abundance proteins.
It is suitable for stable as well as transient and weak interactions.
It provides the cellular localization of the interacting pair.
Reagents for PLA are available from commercial providers.
Cons of PLA
PLA has a strict requirement on a specific antibody pair.
The reagents and enzyme are rather costly.
The throughput of the method is limited and the method is better suited for low-throughput studies.
1.9 Concluding Remarks
In this chapter we have surveyed a number of methods that should be broadly applicable for interaction analysis. As outlined, the identification of cellular PPIs and networks requires the combination of several methods, each with its own benefits and shortcomings. The experiments generate data sets on PPIs of varying sizes, and the results are routinely deposited in PPI databases such as BioGRID and IntAct.162,163 The data are shared through the International Molecular Exchange (IMEx) consortium.164 Several large-scale studies have reported proteome-wide maps of human PPIs. At this stage, the overlap between the large-scale interactome studies is still rather low, as different approaches uncover distinct parts of the interactome and several of the methods are prone to produce both false positives and false negatives. Condition-specific network profiling further unravels different parts of the network.
Newly generated data sets should be compared bioinformatically to previous reports on PPIs. Given the current coverage of the interactome, one can expect in most cases to find at least some overlap between the hits of a novel data set and the existing literature. To identify hits of biological relevance, the novel data sets should be further analyzed bioinformatically, taking into account the biological processes that the proteins are involved in, and the cell types and cellular compartments that the proteins are found in. Indeed, although interactions may be detected and confirmed by various approaches, it is far from given that an interaction will be of relevance in a biological context. Many of the methods discussed in this chapter may report on interactions between proteins that are located in different cellular compartments or in different cell types and will never meet in the cell. Hence, care must be taken before drawing far-reaching conclusions based on PPI screens. Dedicated follow-up studies are needed to elucidate the biological function of the interactions, but this is beyond the scope of this chapter.
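The bioinformatic comparison described above can be illustrated with a minimal Python sketch. It assumes that interactions are available as simple pairs of protein identifiers and that compartment annotations are held in a dictionary; the function names, the toy identifiers (P1 to P6) and the annotations are hypothetical and stand in for data that would in practice be retrieved from PPI and annotation databases. The sketch treats interactions as unordered pairs, reports the overlap between a novel and a reference data set, and flags novel-only hits whose two partners share no annotated compartment.

```python
def normalize_pairs(pairs):
    """Convert (A, B) tuples into a set of unordered, upper-cased pairs."""
    return {frozenset((a.strip().upper(), b.strip().upper())) for a, b in pairs}


def overlap_report(novel, reference):
    """Summarize the overlap between a novel and a reference PPI data set."""
    novel_set = normalize_pairs(novel)
    ref_set = normalize_pairs(reference)
    shared = novel_set & ref_set
    union = novel_set | ref_set
    return {
        "shared": shared,
        "novel_only": novel_set - ref_set,
        # Fraction of the novel hits already reported in the reference set.
        "fraction_supported": len(shared) / len(novel_set) if novel_set else 0.0,
        # Jaccard index: size of the intersection over size of the union.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


def share_compartment(pair, compartments):
    """Return True if both partners have at least one annotated compartment
    in common. Proteins without annotation are treated as having none."""
    proteins = sorted(pair)
    a, b = proteins[0], proteins[-1]  # a == b for a self-interaction
    return bool(compartments.get(a, set()) & compartments.get(b, set()))


# Toy data with hypothetical identifiers (assumption, not real proteins).
novel = [("P1", "P2"), ("p3", "P1"), ("P4", "P5")]
reference = [("P2", "P1"), ("P5", "P6")]
compartments = {
    "P1": {"nucleus"},
    "P2": {"nucleus", "cytosol"},
    "P4": {"mitochondrion"},
    "P5": {"nucleus"},
}

report = overlap_report(novel, reference)
print("Shared interactions:", [sorted(p) for p in report["shared"]])
print("Jaccard index:", report["jaccard"])
for pair in sorted(report["novel_only"], key=sorted):
    status = "shared compartment" if share_compartment(pair, compartments) else "no shared compartment"
    print(sorted(pair), "->", status)
```

Such a filter only prioritizes hits for follow-up. A missing or differing compartment annotation does not by itself rule out an interaction, since annotations are incomplete and proteins may relocalize.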
This work was supported by a grant from the Swedish Foundation for Strategic Research (SB16-0039).