Gygi Lab Yeast Triple Knockouts

Data downloaded from PRIDE PXD008009

The study was published in 2016:

Paulo, J.A., O’Connell, J.D. and Gygi, S.P., 2016. A triple knockout (TKO) proteomics standard for diagnosing ion interference in isobaric labeling experiments. Journal of the American Society for Mass Spectrometry, 27(10), pp.1620-1625.

Abstract: Isobaric labeling is a powerful strategy for quantitative mass spectrometry-based proteomic investigations. A complication of such analyses has been the co-isolation of multiple analytes of similar mass-to-charge resulting in the distortion of relative protein abundance measurements across samples. When properly implemented, synchronous precursor selection and triple-stage mass spectrometry (SPS-MS3) can reduce the occurrence of this phenomenon, referred to as ion interference. However, no diagnostic tool is available currently to rapidly and accurately assess ion interference. To address this need, we developed a multiplexed tandem mass tag (TMT)-based standard, termed the triple knockout (TKO). This standard is comprised of three yeast proteomes in triplicate, each from a strain deficient in a highly abundant protein (Met6, Pfk2, or Ura2). The relative abundance patterns of these proteins, which can be inferred from dozens of peptide measurements, can demonstrate ion interference in peptide quantification. We expect no signal in channels where the protein is knocked out, permitting maximum sensitivity for measurements of ion interference against a null background. Here, we emphasize the need to investigate further ion interference-generated ratio distortion and promote the TKO standard as a tool to investigate such issues.

Introduction

The RAW files were downloaded and reprocessed using two pipelines. One is an in-house pipeline (PAW) (ref-1) developed at Oregon Health & Science University (OHSU) by Phil Wilmarth that uses the MSConvert (ref-2) and Comet (ref-3) open source tools. The other processing was done with MaxQuant (ref-4). The analyses were done to compare the data for the three knockout genes to the data presented in the original publication. Global factors such as the numbers of MS2 scans for each instrument in its respective acquisition mode and the numbers of identified PSMs for each LC run were tabulated.

Plots of the total protein reporter ion intensities for each knockout gene were generated to see how the platforms and acquisition methods performed. Two platforms that could perform the cleaner SPS MS3 (ref-5) acquisition method were compared: a Thermo Fusion and a Thermo Fusion Lumos mass spectrometer. The Lumos was also used to acquire TMT reporter ions in MS2 scans.

The PAW pipeline converts the RAW files into text files using MSConvert from the ProteoWizard toolkit. These text files are parsed to produce MS2 peak lists for database searching and to extract the corresponding reporter ion peak heights.

Data from the MS2 scans are extracted for the Comet searches and the reporter peak heights are extracted from the MS3 scans. The pipeline uses the target/decoy method to make score histograms and determine score filtering thresholds. Accurate mass is used to create conditional score histograms where target/decoy delta mass histograms are used to set the accurate mass windows. Basic parsimony principles are used for protein inference and two peptides per protein were required. An additional protein grouping step was used to combine nearly identical peptide sets (often these are housekeeping genes).

The pipeline also has a script for performing internal reference scaling (IRS) normalization (ref-6). The three repeats of each 9-plex experiment, which were done in separate mass spec runs, were combined using the IRS method, where averages across the 9 channels served as stand-ins for proper reference channels. IRS is described in detail in several Jupyter notebook analyses available online.

MaxQuant version 1.6.2.10 was used with mostly default values and the same protein database as used in the PAW analysis. The Fusion and Lumos SPS MS3 data could be processed together in one analysis (the TMT label set was the same, namely, MS3 10-plex). The MS2 Lumos data was done in a separate analysis (MS2 10-plex). Second peptide identification and precursor isolation purity were not used. Mass tolerances and FDR cutoffs were kept at default values. The proteinGroups file was used to prepare the input for R.

References

1. Wilmarth, P.A., Riviere, M.A. and David, L.L., 2009. Techniques for accurate protein identification in shotgun proteomic studies of human, mouse, bovine, and chicken lenses. Journal of ocular biology, diseases, and informatics, 2(4), pp.223-234.

2. Chambers, M.C., Maclean, B., Burke, R., Amodei, D., Ruderman, D.L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J. and Hoff, K., 2012. A cross-platform toolkit for mass spectrometry and proteomics. Nature biotechnology, 30(10), p.918.

3. Eng, J.K., Jahan, T.A. and Hoopmann, M.R., 2013. Comet: an open-source MS/MS sequence database search tool. Proteomics, 13(1), pp.22-24.

4. Cox, J. and Mann, M., 2008. MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nature biotechnology, 26(12), p.1367.

5. McAlister, G.C., Nusinow, D.P., Jedrychowski, M.P., Wühr, M., Huttlin, E.L., Erickson, B.K., Rad, R., Haas, W. and Gygi, S.P., 2014. MultiNotch MS3 enables accurate, sensitive, and multiplexed detection of differential expression across cancer cell line proteomes. Analytical chemistry, 86(14), pp.7150-7158.

6. Plubell, D.L., Wilmarth, P.A., Zhao, Y., Fenton, A.M., Minnier, J., Reddy, A.P., Klimek, J., Yang, X., David, L.L. and Pamir, N., 2017. Extended multiplexing of tandem mass tags (TMT) labeling reveals age and high fat diet specific proteome changes in mouse epididymal adipose tissue. Molecular & Cellular Proteomics, 16(5), pp.873-890.

7. Kovalchik, K.A., Moggridge, S., Chen, D.D., Morin, G.B. and Hughes, C.S., 2018. Parsing and Quantification of Raw Orbitrap Mass Spectrometer Data using RawQuant. Journal of proteome research, 17(6), pp.2237-2247.

8. Liu, H., Sadygov, R.G. and Yates, J.R., 2004. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Analytical chemistry, 76(14), pp.4193-4201.



Dataset details

Nine channels of 10-plex TMT reagents were used (in increasing tag mass order) for 3 replicates of a yeast Met6 gene knockout strain, 3 replicates of a Pfk2 knockout strain, and 3 replicates of a Ura2 knockout strain. There were three platform/acquisition modes evaluated:

  • Thermo Fusion classic in SPS MS3 mode
  • Thermo Fusion Lumos in SPS MS3 mode
  • Thermo Fusion Lumos in "old school" MS2 mode

The LC separations were kept similar, and each platform/mode 9-plex was repeated three times. Comparisons can be made between the Fusion and the Lumos in SPS MS3 mode, and between the Lumos in MS2 mode and in SPS MS3 mode. Comparisons can look at a few things, such as:

  • numbers of MS2 sequencing attempts and success rates
  • overall signal levels
  • how "knocked out" each of the three genes really were

Note: With high resolution instruments, any TMT experiment can be processed assuming 11-plex reagents. Depending on the actual tags used, there may be several unused channels that can serve as controls. In this data, the 131N and 131C channels were not used. PAW and MaxQuant processing were both done assuming TMT 11-plex reagents.

Scan rates and numbers of identified PSMs

The table below shows the number of MS2 scans for each platform/mode and the (roughly) 1% FDR PSM identification numbers from PAW and MaxQuant. PAW uses wide parent ion mass tolerance Comet searches (1.25 Da), dynamic post-search accurate mass filtering, and discriminant function scoring. This has higher sensitivity compared to MaxQuant with its typical default settings. The consistency between replicates (within the same platform/mode) is very good. Note that the numbers of identifications reported in Table 1 of the original publication are about 1.5 times larger than the PAW numbers. Those numbers look a little suspect to me, since such large differences between similar pipelines are not very likely.

LC Name   Platform   Mode      MS2 Scans   PAW PSMs   MQ PSMs
L00761    Lumos      MS2           10888       4710      4337
L00763    Lumos      MS2           10843       4718      4267
L00765    Lumos      MS2           10832       4739      4275
L00760    Lumos      SPS MS3        7172       2709      2389
L00762    Lumos      SPS MS3        7151       2738      2330
L00764    Lumos      SPS MS3        7093       2691      2308
m06677    Fusion     SPS MS3        6335       2532      2087
m06679    Fusion     SPS MS3        6294       2544      2134
m06681    Fusion     SPS MS3        6241       2560      2113

The Lumos MS2 mode acquisition seems considerably faster than the SPS MS3 mode. There are many more MS2 scans per unit time and more identified PSMs. This seems a little surprising since the methods should be limited by the orbitrap scan times. The survey scan ranges and resolutions are similar independent of mode. The linear trap events are fast and can occur while the orbitrap is in use, so IT MS2 and SPS selection should be relatively quick. The times for high resolution MS2 or MS3 to get reporter ions should be similar. I would have guessed overall cycle times to be more similar across the platform/modes.

Load in PAW data

The PAW pipeline adds the reporter ion data after regular PSM filtering and protein inference. The reporter ion data is extracted during RAW file processing, so most of the PAW analysis is decoupled from the TMT data. At the end of the pipeline, when the final list of proteins has been produced and the status of unique and shared peptides has been redefined from the protein database context to the final protein list context, the reporter ions from all unique peptide PSMs can be summed into protein totals. Thus, combined protein and peptide reports for all 9 TMT LC runs are possible. This simplifies comparisons between platforms because all of the measurements are in the same table row for each protein.

The Python scripts that extract the reporter ion intensities from the MSConvert text files are pretty simple. Narrow windows are centered at the predicted theoretical accurate masses of the TMT reagents. The windows (in Da) are slightly asymmetric (the reporter ion peaks in profile mode have a little bit of a left-hand tail) but correspond to roughly a plus/minus 20 ppm width. The reporter ion intensity is taken as the maximum value in the window. This works the same for profile or centroid data. Proteome Discoverer, RawQuant, and MaxQuant work with centroided data and return the centroid peak that is closest to the expected location (independent of peak height). Comparisons to Proteome Discoverer export files and RawQuant files show very good agreement with the maximum values from PAW.
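To make the max-in-window idea concrete, here is a minimal sketch in R (the actual PAW extraction scripts are in Python; the scan values below are made up and the 20 ppm width is illustrative):

# minimal sketch of max-in-window reporter ion extraction
extract_reporter <- function(mz, intensity, target_mz, ppm = 20) {
  delta <- target_mz * ppm / 1e6                          # ppm half-width in Da
  in_window <- mz >= (target_mz - delta) & mz <= (target_mz + delta)
  if (any(in_window)) max(intensity[in_window]) else 0.0  # tallest point in window
}

# toy scan: a few centroids near the 126 and 127N reporter regions
scan_mz  <- c(126.1250, 126.1277, 127.1248)
scan_int <- c(1500, 52000, 31000)

# TMT 126 and 127N monoisotopic reporter ion masses (as commonly tabulated)
sapply(c(`126` = 126.127726, `127N` = 127.124761),
       function(m) extract_reporter(scan_mz, scan_int, m))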

MaxQuant has reporter ion values that are well correlated with those from other tools, but they are different in overall magnitude. No description of exactly how MaxQuant determines reporter ion intensities has been found, and questions posted at the MaxQuant discussion group have not been answered yet. The closest-centroid behavior has been tested, though. Older versions of MaxQuant had windows of 0.01 Da, which is wide enough to encompass both N- and C-forms of the reagents. Newer versions of MaxQuant have window defaults of 0.003 Da. The resulting reporter ion values are the same with either window width choice, suggesting that the peaks closest to the expected positions are used.

The replicates in this dataset were done as separate experiments without any common reference channels. In the original publication, the reporter ions were expressed within each 9-plex as fractional signal-to-noise ratios to allow comparisons between replicates. The fraction was out of the total across the 9 channels in each TMT experiment replicate. MaxQuant and PAW do not compute signal-to-noise ratios, so this analysis will use intensities. RawQuant (ref-7) does provide enough information to compute signal-to-noise values, and those were used instead of peak heights in PAW as a check. The range of data values for signal-to-noise ratios is different compared to peak heights. Signal-to-noise ratios are a unitless quality measurement, not a quantitative value. The noise values for each reporter ion channel do not vary much within each scan, so dividing by them rescales the reporter ions while preserving their relative values. Signal-to-noise ratios seem OK for lateral comparisons (the same peptide or protein between channels) but lose longitudinal information (relative abundances of proteins within a sample). The sums of TMT intensities into protein totals are well correlated with iBAQ values in MaxQuant, for example.
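As a small illustration of why dividing by the noise mostly rescales the values (the numbers below are made up; a RawQuant-style table with matched per-channel intensity and noise estimates is assumed):

# made-up reporter peak heights and per-channel noise estimates for one scan
heights <- c(12000, 11500, 300, 11800)
noise   <- c(105, 110, 108, 102)   # noise varies little between channels
heights / noise                    # signal-to-noise: lateral pattern is preserved
heights / sum(heights)             # the paper's fractional expression, for comparison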

Here, I will use the balanced study design to create an internal reference standard proxy by averaging the 9-plex signals for each replicate. However, the IRS method is designed to put similar reporter ion intensities on a common intensity scale, so the different platform/method data should be processed separately because of the differing overall magnitudes of the intensity values. The unified protein reports (with TMT data from all 9 LC runs) were edited to contain the related sets of three replicates plus a mock pooled standard channel created by averaging the 9 used TMT channels, and the IRS script was run to combine the data. A base R sketch of the mock pooled channel follows.
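This is a minimal sketch of the mock pooled channel idea, assuming a data frame of intensities and a (hypothetical) channel_cols vector naming the nine used reporter ion columns:

# create a mock reference channel by averaging the nine used channels row-wise
# (channel_cols is a hypothetical vector of the nine used intensity column names)
make_mock_pool <- function(df, channel_cols) {
  df$Pool <- rowMeans(df[channel_cols])
  df
}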

Normalization steps for proper IRS experiments

The more robust way to do multiple TMT labelings is not through balanced study designs. It is to allocate two channels per TMT experiment to a common pooled standard. The pooled standard consists of a small amount of protein from all (or a large fraction of) the samples in the study. Aliquots of that protein mixture are digested and labeled like the biological samples. Duplicate standard channels within each TMT kit are recommended to give a good compromise between measurement accuracy of the standard and keeping reasonable numbers of channels free for biological samples.

The basic steps for the normalizations are listed below. They are pretty straightforward and can be done in Excel, in Python with pandas, or in R. Some definitions are in order. It is assumed that the data are in a rectangular table where proteins are rows and reporter ions (and maybe other information) are in columns. It is usually a good idea to have the reporter ions labeled by groups (biological sample groups, pooled standards) and by TMT experiment. Embedding reporter ion tag information can also be done (i.e., 127N, 131C, etc.). The two pooled reporter ion channels in a given TMT experiment will be called the internal "standards". The average of those standards will be called the "reference" for that TMT experiment. The average (arithmetic or geometric mean) of the references will be the "target" common scale.

Before we do the internal reference scaling (IRS) normalization, a couple of things should be done to the data. Any contaminant proteins (common contaminants like keratins, trypsin, and BSA) and any biological contaminants (proteins from adjacent tissue, blood, etc.) should be removed from the table. There can also be proteins that do not have any reporter ion intensities (quantification is typically stricter than identification); those can be dropped from the table.

Reporter ions within each individual TMT experiment should be normalized to a common level. The same total amount of protein digest is labeled for each channel, so the total reporter ion signal for each channel should also be the same (the reporter ion signals are proxies for protein abundances). The way I do this is to form the column total for each channel and then average those totals to get the average grand total. The ratio of the average grand total to each individual column total gives a single normalization factor for that column (each protein intensity in that channel is multiplied by its respective factor). The column sums after the normalization should all equal the average grand total target value. This normalization corrects for protein/peptide assay errors, pipetting errors, digestion differences, and labeling reaction differences, and should improve the internal standard measurements. A base R sketch is below.
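A minimal sketch of this sample-loading (SL) normalization, assuming df holds only the reporter ion intensity columns for one TMT experiment:

# sample-loading (SL) normalization: scale each channel so its column total
# equals the average of all the column totals
sl_normalize <- function(df) {
  totals  <- colSums(df)             # grand total for each channel
  target  <- mean(totals)            # the average grand total
  factors <- target / totals         # one multiplicative factor per channel
  sweep(df, 2, factors, FUN = "*")   # apply each factor to its column
}

# sanity check: colSums(sl_normalize(df)) should all equal mean(colSums(df))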

Now we are ready to perform the IRS steps:

  • Find the two pooled internal standard channel vectors in each TMT experiment
  • Average (arithmetic) the two standard vectors to make the reference vector for each TMT experiment
  • Average (arithmetic or geometric) the reference vectors to make the target vector
  • Compute the IRS factor vectors (target vector divided by reference vector) for each TMT experiment
  • Multiply each channel vector in each TMT experiment by that experiment's factor vector

After this, each TMT channel in each TMT experiment will be adjusted to the same common intensity scale for the whole study. The internal reference standard channels also get adjusted so we can double check the IRS steps. The standards should be very similar after IRS adjustment. The pairs within each TMT experiment will be similar independent of IRS. Standards between TMT experiments will be different because of pseudo random MS2 triggering (ref-8) (major effect) and other sources of instrument variation.
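A compact sketch of these IRS steps for two TMT experiments, assuming SL-normalized data frames exp1 and exp2 (intensity columns only, rows aligned by protein) whose pooled standard columns are named (hypothetically) Pool_A and Pool_B:

# IRS sketch: put both TMT experiments on a common per-protein target scale
irs_normalize <- function(exp1, exp2, pool_cols = c("Pool_A", "Pool_B")) {
  ref1   <- rowMeans(exp1[pool_cols])   # reference vector for experiment 1
  ref2   <- rowMeans(exp2[pool_cols])   # reference vector for experiment 2
  target <- sqrt(ref1 * ref2)           # geometric mean of the references
  list(exp1 = exp1 * (target / ref1),   # per-protein factors applied to every channel
       exp2 = exp2 * (target / ref2))
}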

In [31]:
# load libraries first
library("psych")
library("tidyverse")
library("cowplot")
library("edgeR")
library("limma")
In [32]:
# load the data files into data frames. PAW analysis could be done for all 9 LC runs in
# a unified report. This simplifies aligning protein and peptide identifications between samples.
# The IRS method puts related reporter ion intensities on a common scale and was done separately
# for each platform/method. The main protein reports were edited to retain columns from each platform/method
# only before running the IRS script.

fusion <- read_tsv(file = "labeled_Fusion_PAW_grouped_protein_summary_TMT_8_IRS_normalized.txt")
lumos <- read_tsv(file = "labeled_Lumos_PAW_grouped_protein_summary_TMT_8_IRS_normalized.txt")
ms2 <- read_tsv(file = "labeled_MS2_Lumos_PAW_grouped_protein_summary_TMT_8_IRS_normalized.txt")
Parsed with column specification:
cols(
  .default = col_double(),
  ProtGroup = col_integer(),
  Counter = col_integer(),
  Accession = col_character(),
  Identical = col_character(),
  Similar = col_character(),
  OtherLoci = col_character(),
  Filter = col_character(),
  Missing = col_character(),
  Coverage = col_character(),
  SeqLength = col_character(),
  MW = col_character(),
  Description = col_character(),
  CountsTot = col_integer(),
  UniqueTot = col_integer(),
  Total_m06677 = col_integer(),
  Total_m06679 = col_integer(),
  Total_m06681 = col_integer(),
  Unique_m06677 = col_integer(),
  Unique_m06679 = col_integer(),
  Unique_m06681 = col_integer()
  # ... with 3 more columns
)
See spec(...) for full column specifications.
Parsed with column specification:
cols(
  .default = col_double(),
  ProtGroup = col_integer(),
  Counter = col_integer(),
  KO = col_character(),
  Accession = col_character(),
  Identical = col_character(),
  Similar = col_character(),
  OtherLoci = col_character(),
  Missing = col_character(),
  Filter = col_character(),
  Coverage = col_character(),
  SeqLength = col_character(),
  MW = col_character(),
  Description = col_character(),
  CountsTot = col_integer(),
  UniqueTot = col_integer(),
  Total_L00760 = col_integer(),
  Total_L00762 = col_integer(),
  Total_L00764 = col_integer(),
  Unique_L00760 = col_integer(),
  Unique_L00762 = col_integer()
  # ... with 4 more columns
)
See spec(...) for full column specifications.
Parsed with column specification:
cols(
  .default = col_double(),
  ProtGroup = col_integer(),
  Counter = col_integer(),
  KO = col_character(),
  Accession = col_character(),
  Identical = col_character(),
  Similar = col_character(),
  OtherLoci = col_character(),
  Missing = col_character(),
  Filter = col_character(),
  Coverage = col_character(),
  SeqLength = col_character(),
  MW = col_character(),
  Description = col_character(),
  CountsTot = col_integer(),
  UniqueTot = col_integer(),
  Total_L00761 = col_integer(),
  Total_L00763 = col_integer(),
  Total_L00765 = col_integer(),
  Unique_L00761 = col_integer(),
  Unique_L00763 = col_integer()
  # ... with 4 more columns
)
See spec(...) for full column specifications.

Get the before and after IRS normalization intensity columns

The main results tables have unnormalized columns, sample-loading (SL) normalization columns, and internal reference scaling (IRS) normalization columns (33 each, plus some intermediate columns). We only need the SL and IRS normed columns from each platform/mode. We will use a helper function to extract the columns.

In [33]:
# get quantifiable rows for each platform/mode, before and after IRS norm
# exclude contams and rows without intensities in all three replicates

# helper function for extracting columns with specified prefix strings
get_tmt_cols <- function(df, prefix) {
  # gets TMT columns from data frame "df" beginning with "prefix" string
  df %>%
    filter(is.na(Filter), is.na(Missing)) %>% # drop contaminant and missing-data rows
    select(starts_with(prefix), Accession) %>% # get data columns and accessions
    select(-contains("_Pool_")) %>% # drop the mock pooled channels
    mutate(Accession = sapply(strsplit(Accession, " "), `[`, 1)) # clean up accessions
}

fusion_sl <- get_tmt_cols(fusion, "SLNorm_")
fusion_irs <- get_tmt_cols(fusion, "IRSNorm_")
lumos_sl <- get_tmt_cols(lumos, "SLNorm_")
lumos_irs <- get_tmt_cols(lumos, "IRSNorm_")
ms2_sl <- get_tmt_cols(ms2, "SLNorm_")
ms2_irs <- get_tmt_cols(ms2, "IRSNorm_")

See how similar the yeast strains are to each other in each TMT experiment

In [34]:
# get replicate subsets of the data (we will use SL normalized data)
fusion_rep <- fusion_sl %>%
  select(ends_with("_1"))
lumos_rep <- lumos_sl %>%
  select(ends_with("_2"))
ms2_rep <- ms2_sl %>%
  select(ends_with("_3"))

# we will do a representative within replicate check for each platform/method
pairs.panels(log10(fusion_rep[1:9]), lm = TRUE, main = "Fusion SPS MS3 within replicate 1")
pairs.panels(log10(lumos_rep[1:9]), lm = TRUE, main = "Lumos SPS MS3 within replicate 2")
pairs.panels(log10(ms2_rep[1:9]), lm = TRUE, main = "Lumos MS2 within replicate 3")

How similar are the three knockout strains?

Each TMT experiment has 9 yeast samples: 3 replicates of each of the three different knockout yeast strains (Met6, Pfk2, and Ura2 genes). Each of those TMT experiments was repeated 3 times for each platform/mode. That is a lot of replicates to visualize. I picked one TMT experiment for each platform/mode and compared the 9 yeast channels to each other. I used the first TMT experiment for the Fusion, the second TMT experiment for the Lumos, and the third TMT experiment for the Lumos MS2 configuration. What we can see above is the same pattern across the 3 platform/modes; namely, each knockout has very similar replicates (they are in adjacent sets of 3 TMT channels). The Pfk2 knockout differs quite a bit from the other two knockouts. The Met6 and Ura2 knockouts are more similar to each other (lower left block of 9).

We can also use distributions of protein coefficients of variation (CV) to quantify similarity. Using a little helper function, we can make boxplots of various sample groups. I will look at the CVs for each block of knockout replicates from each of the above platform/modes. The CVs are in percent.

In [35]:
# create a row CV computing function for whole data frames
# input: data frame, output: vector of CVs (%)
CV <- function(df) {
  ave <- rowMeans(df)    # compute averages
  sd <- apply(df, 1, sd) # compute standard deviations
  100 * sd / ave         # compute CVs in percent (the last value is returned)
}

# helper to draw one CV boxplot with the median value printed at the median line
cv_boxplot <- function(df, title) {
  cv <- CV(df)
  med <- boxplot.stats(cv)$stats[3] # median of the CV distribution
  boxplot(cv, ylim = c(0, 50), notch = TRUE, main = title)
  text(x = 0.65, y = med, labels = round(med, 1))
}

# make CV distributions within KO strains for each platform representative replicate
par(mfrow = c(3, 3))
cv_boxplot(fusion_sl[1:3], "Fusion Met6 Rep1")
cv_boxplot(fusion_sl[4:6], "Fusion Pfk2 Rep1")
cv_boxplot(fusion_sl[7:9], "Fusion Ura2 Rep1")
cv_boxplot(lumos_sl[1:3], "Lumos Met6 Rep2")
cv_boxplot(lumos_sl[4:6], "Lumos Pfk2 Rep2")
cv_boxplot(lumos_sl[7:9], "Lumos Ura2 Rep2")
cv_boxplot(ms2_sl[1:3], "MS2 Met6 Rep3")
cv_boxplot(ms2_sl[4:6], "MS2 Pfk2 Rep3")
cv_boxplot(ms2_sl[7:9], "MS2 Ura2 Rep3")
par(mfrow = c(1, 1))

What about the KO replicates between TMT experiments?

We saw above that the three within-TMT replicates of a given knockout are very similar, as might be expected. For each platform/mode we have 3 separate TMT replicates, so we have 9 total channels for each knockout. Those should also be highly similar to each other. This is another opportunity to demonstrate the IRS concept (or, if you have worked through other notebooks on my site, to start sounding like a broken record...). We will look at the groups of the 9 knockout channels, picking one knockout from each platform/mode. We will compare before IRS normalization to after.

In [36]:
# separate into the three different KO channel sets
fusion_Met6 <- fusion_irs %>%
  select(contains("Met6"))
fusion_Met6_sl <- fusion_sl %>%
  select(contains("Met6"))

# compare the same KO runs before and after IRS
pairs.panels(log10(fusion_Met6_sl), lm = TRUE, main = "Fusion Met6 before IRS")
pairs.panels(log10(fusion_Met6), lm = TRUE, main = "Fusion Met6 after IRS")
In [37]:
# separate into the three different KO channel sets
lumos_Pfk2 <- lumos_irs %>%
  select(contains("Pfk2"))
lumos_Pfk2_sl <- lumos_sl %>%
  select(contains("Pfk2"))

# compare the same KO runs before and after IRS
pairs.panels(log10(lumos_Pfk2_sl), lm = TRUE, main = "Lumos Pfk2 before IRS")
pairs.panels(log10(lumos_Pfk2), lm = TRUE, main = "Lumos Pfk2 after IRS")
In [38]:
# separate into the three different KO channel sets
ms2_Ura2 <- ms2_irs %>%
  select(contains("Ura2"))
ms2_Ura2_sl <- ms2_sl %>%
  select(contains("Ura2"))

# compare the same KO runs before and after IRS
pairs.panels(log10(ms2_Ura2_sl), lm = TRUE, main = "Lumos MS2 Ura2 before IRS")
pairs.panels(log10(ms2_Ura2), lm = TRUE, main = "Lumos MS2 Ura2 after IRS")

IRS corrects inter-TMT variation from MS2 sampling effects

The random sampling of MS2 scans (ref-8) is the main reason that data without an IRS correction is variable between TMT experiments. Designing a proper IRS experiment with reference channels or, in special cases of very balanced study designs, approximating one allows for a straightforward correction procedure.

We can also look at the protein CV distributions before and after IRS and see how they compare to the within-TMT results from above.

In [39]:
# make CV distributions for the KO channel sets before and after IRS
par(mfrow = c(3, 2))
cv_boxplot(fusion_Met6_sl, "Fusion Met6 before IRS")
cv_boxplot(fusion_Met6, "Fusion Met6 after IRS")
cv_boxplot(lumos_Pfk2_sl, "Lumos Pfk2 before IRS")
cv_boxplot(lumos_Pfk2, "Lumos Pfk2 after IRS")
cv_boxplot(ms2_Ura2_sl, "MS2 Lumos Ura2 before IRS")
cv_boxplot(ms2_Ura2, "MS2 Lumos Ura2 after IRS")
par(mfrow = c(1, 1))

After IRS CV distributions are similar to within TMT CV distributions

The CV distributions are much tighter after IRS, with small values similar to the within-TMT values a few cells above. The before-IRS distributions are a mixture of within-TMT (low variance) and between-TMT (higher variance) data and represent an intermediate variance situation. Many biological experiments will have samples from each condition in such mixed scenarios (particularly when randomizing samples to available measurement slots). We also see that the less sensitive Fusion platform in the SPS MS3 mode (ref-5) has more variability than the Lumos (in either mode).

What does clustering tell us?

We need to load the data into edgeR, do the TMM normalization, and then look at MDS plots to see how the data cluster. We want just the 9 used channels from each replicate, so we need to drop the "unused" channels. The unused channels are not compatible with the assumptions for TMM normalization (or grand total or median normalizations) and will just cause trouble if we keep them in the mix.

In [40]:
# helper function for generating MDS plots
get_MDS_plot <- function(df, title) {
    df_temp <- df %>% select(-contains("unused"), -Accession) # drop the "unused" channels
    group <- c(rep( c(rep("met6", 3), rep("pfk2", 3), rep("ura2", 3)), 3)) # define groups
    colors <- c(rep( c(rep("red", 3), rep("green", 3), rep("blue", 3)), 3)) # set colors
    y <- DGEList(counts = df_temp, group = group) # load edgeR data object
    y <- calcNormFactors(y) # run TMM normalization
    plotMDS(y, col = colors, main = title) # make the MDS plot
}

par(mfrow = c(2, 1))
get_MDS_plot(fusion_sl, "Fusion Before IRS")
get_MDS_plot(fusion_irs, "Fusion After IRS")
get_MDS_plot(lumos_sl, "Lumos Before IRS")
get_MDS_plot(lumos_irs, "Lumos After IRS")
get_MDS_plot(ms2_sl, "Lumos MS2 Before IRS")
get_MDS_plot(ms2_irs, "Lumos MS2 After IRS")
par(mfrow = c(1, 1))

IRS removes batch-like TMT experiment effect

The TMT channels correctly cluster by yeast knockout strains after IRS.


Let's get to the point - what do the KO proteins look like?

The first order of business is to get the intensities for the three knockout gene products. We can do some tidyverse calls and make use of pipes. We will be using ggplot2, so we want the data in long form instead of wide form. "filter" gets the row of interest, "gather" transposes, "select" drops the accession column, and "separate" splits the column headers so we can more easily group replicates by gene. We will end up with 9 data frames (3 KO genes by 3 platform/modes).

In [41]:
# get the data for each KO gene by platform/mode
# drop Accession column and transpose so data is in long form
get_gene_row <- function(df, gene) {
    # gets the row for the specified protein from df and transposes it to long form
    row <- df %>% 
      filter(Accession == gene) %>% # get row with the right accession
      gather(var, value, -Accession) %>% # transpose row
      select(-Accession) %>% # drop accession column
      separate(var, into = c("norm", "gene", "within", "between"), sep = "_|-") # split up sample/replicate labels
    row$gene <- factor(row$gene, levels = c("Met6", "Pfk2", "Ura2", "unused"), ordered = TRUE) # order factors
    row # return this
}

fusion_Met6 <- get_gene_row(fusion_irs, "sp|P05694|METE_YEAST")
fusion_Pfk2 <- get_gene_row(fusion_irs, "sp|P16862|PFKA2_YEAST")
fusion_Ura2 <- get_gene_row(fusion_irs, "sp|P07259|PYR1_YEAST")

lumos_Met6 <- get_gene_row(lumos_irs, "sp|P05694|METE_YEAST")
lumos_Pfk2 <- get_gene_row(lumos_irs, "sp|P16862|PFKA2_YEAST")
lumos_Ura2 <- get_gene_row(lumos_irs, "sp|P07259|PYR1_YEAST")

ms2_Met6 <- get_gene_row(ms2_irs, "sp|P05694|METE_YEAST")
ms2_Pfk2 <- get_gene_row(ms2_irs, "sp|P16862|PFKA2_YEAST")
ms2_Ura2 <- get_gene_row(ms2_irs, "sp|P07259|PYR1_YEAST")

Time to plot the data

What kind of plot is better? Bar plots were used in the original paper. The 3 repeated TMT experiments get separate bars and the 3 replicated channels within each TMT experiment constitute the error bars on the plots. Since we can get all 9 data points onto a common intensity scale using IRS, we can plot the distributions of the observations as boxplots and even show the actual data points. I will keep the platform/modes separate (so the plots are tall enough to see the differences) and horizontally stack the data from the 3 knockout genes.

What should we expect to see? For a given protein in each TMT experiment, we will have 3 channels of replicates for the Met6 KO strains, 3 channels of replicates for the Pfk2 KO gene strains, and 3 channels of replicates for the Ura2 KO strains. When we look at the data for the Met6 protein product, we should see little or no signals in the Met6 KO strains, and "normal" signal levels for the other strains. Similarly, Pfk2 should be knocked down in Pfk2 KO strains, and Ura2 knocked down in the Ura2 KO strains.

In [42]:
# plot helper function (makes it easier to change plot stuff)
gene_boxplot <- function(df, title) {
    # makes the subplots given a dataframe and title
    plot <- ggplot(df, aes(x = gene, y = value, color = gene)) + # select x, y; use color
      geom_boxplot() + # create boxplots
      geom_jitter(shape = 16) + # add actual data points
      ylab("Intensity") + xlab("KO Genes") + # axis labels
      ggtitle(title) + # plot title
      theme(legend.position = "none") + # hide the legend
      scale_color_brewer(palette="Dark2") + # change the colors
      theme(axis.text.x = element_text(angle=70, vjust=0.5)) # rotate x-axis labels
}

# collect sub plots for Fusion SPS MS3
F_Met6_p  <- gene_boxplot(fusion_Met6, "Fusion - Met6")
F_Pfk2_p  <- gene_boxplot(fusion_Pfk2, "Fusion - Pfk2")
F_Ura2_p  <- gene_boxplot(fusion_Ura2, "Fusion - Ura2")

# put plots in a grid
plot_grid(F_Met6_p, F_Pfk2_p, F_Ura2_p, ncol = 3)

Met6 KO looks pretty good on the Fusion...

The left plot above is for the Met6 protein product. The Met6 knockout channels are at very low levels. The two 131 channels were not used but reporter ions were tallied at those m/z positions to serve as negative controls. If the knockouts were perfectly knocked down, they should be at intensity levels like the unused channels.

The middle plot is the data for the Pfk2 protein product. Pfk2 levels are low but not as low as the unused channels. The distribution of intensities is also somewhat broad. The right plot is for the Ura2 protein product. We see more scatter in all of the data, and the Ura2 knockout strain channels are not very close to the unused channel levels. It could be that the Ura2 levels are getting too close to the lower limits of detection for the Fusion with this short, single LC experiment.

Let's see how the higher sensitivity Lumos does:

In [43]:
# collect sub plots for Lumos SPS MS3
L_Met6_p  <- gene_boxplot(lumos_Met6, "Lumos - Met6")
L_Pfk2_p  <- gene_boxplot(lumos_Pfk2, "Lumos - Pfk2")
L_Ura2_p  <- gene_boxplot(lumos_Ura2, "Lumos - Ura2")

# put plots in a grid
plot_grid(L_Met6_p, L_Pfk2_p, L_Ura2_p, ncol = 3)

Higher signals on the Lumos are beneficial

Across the plots, the data just looks better than for the Fusion. The distributions are all tighter and there is good separation between the knocked down levels and the normal levels. The knocked down levels are still a little above the levels of the unused channels.

Does SPS MS3 improve on MS2-based TMT?

In [44]:
# collect sub plots for Lumos in MS2 mode
M_Met6_p  <- gene_boxplot(ms2_Met6, "Lumos MS2 - Met6")
M_Pfk2_p  <- gene_boxplot(ms2_Pfk2, "Lumos MS2 - Pfk2")
M_Ura2_p  <- gene_boxplot(ms2_Ura2, "Lumos MS2 - Ura2")

# put plots in a grid
plot_grid(M_Met6_p, M_Pfk2_p, M_Ura2_p, ncol = 3)

Hmmm...

Reporter ion intensities are at the highest levels for the Lumos MS2 data, and that is seen in how vertically small the boxplots are. The CV distributions in the cells above were also rather small for the Lumos (in either mode, with MS2 a bit better). However, the knocked-down protein intensity levels are relatively much higher for the MS2 TMT data. If we were to normalize the data so that the "normal" levels (6 channels) were 100%, then the knocked-down levels would be at about 30% for the MS2 TMT data.
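As a quick illustration of that rough 30% figure (the numbers below are made up, not taken from the data above):

# express knocked-out channel levels as a percent of the "normal" channel average
normal_channels <- c(9.8e5, 1.02e6, 9.9e5, 1.05e6, 9.7e5, 1.01e6) # 6 normal channels
ko_channels     <- c(3.1e5, 2.9e5, 3.0e5)                         # 3 KO channels
round(100 * mean(ko_channels) / mean(normal_channels), 1)         # about 30 (%)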

There is a sharp contrast between the knocked-down levels and the unused channel levels. The unused channel levels still measure near zero in all three platform/modes. This offers some insight into the MS2 signal distortions. I seem to recall that the early work characterizing the "ratio" compression tested whether precursor purity thresholds reduced the distortion or not. My recollection is that they did not have much effect, suggesting that the background did not come from a second peptide most of the time but is a more non-specific background effect. We see evidence of that here: we have other peptides labeled with the tags that correspond to the knocked-down channels, and those channels do not have signals as small as they should. This suggests that there is some background signal that is probably present in all 9 of the used channels. When no tags are used (the 131 channels), there is no effect.

I am guessing that there are non-specific peptides (background noise) across the m/z range that are always co-fragmented for any isolation window. I would not expect this background level to be the same for all m/z values, but then again, real peptides are not uniformly distributed in m/z space. Anyway, these non-specific co-isolated peptides probably do not generate any fragment ions intense enough to get selected for the SPS MS3 isolation step.

While the "mean biasing" of the signals in MS2 TMT (a better term to use than ratio compression because you do not need to take ratios) certainly compromises measuring accurate fold changes, it is less clear what affect it has on statistical testing. Testing is usually (but not always) to find proteins that are not at the same levels. Larger mean differences at similar variances helps make finding differences easier. Smaller variances at similar mean differences also helps. MS2 TMT is less good for the former and probably better for the latter.

Determining the real-world performance difference given all of these considerations is challenging. These types of comparisons are never as apples-to-apples as one hopes. There are many subtle layers to proteomics results. The PAW processing presents a unified protein report across all 9 LC runs (3 for each platform/mode). The final result list is a union of all results, and poor performance in some runs can be masked by good performance in other runs.

The Python script that does the IRS normalization reports many useful things. To do the IRS step correctly, it was run separately for each platform/mode (Fusion, Lumos, Lumos in MS2). We can compare some important metrics from each IRS normalization run's console output. Identification and quantification are independent, so we can (and do) have identifications that have no quantitative data (quantification is always a subset of identification). We can count how many proteins were quantifiable for each platform/mode; a sketch of one way to make such a tally follows.
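This is a minimal sketch of such a tally (the actual IRS script's definitions may differ; here a protein counts as quantifiable if every channel total is nonzero, and as having no signal if every total is zero):

# tally quantifiable and no-signal proteins from a table of intensity columns
count_quantifiable <- function(df) {
  mat <- as.matrix(df)                          # intensity columns only
  c(quantifiable = sum(rowSums(mat == 0) == 0), # nonzero in every channel
    no_signal    = sum(rowSums(mat) == 0))      # zero in every channel
}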

Platform   Mode      Identified Proteins   Quantifiable Proteins   Proteins with no signals
Lumos      MS2                       771                     700                          2
Lumos      SPS MS3                   771                     431                        202
Fusion     SPS MS3                   771                     307                        329

Now we can see a pretty big difference between the SPS MS3 methods and the MS2 method. Is being able to quantify twice as many proteins worth the trade-off in quantification accuracy? I suspect the answer is pretty obvious. The most interesting proteins to biologists are always barely detectable. The Lumos seems like a significantly better instrument for the SPS MS3 method than the first-generation Fusion instruments. The identification of only about 800 proteins says that this experiment could be scaled up (more protein and more fractions), and that would help less sensitive platforms like the Fusion. My feeling is that this experiment is on the low end for protein levels. A scaled-up experiment might significantly reduce the gap between the SPS MS3 methods and the MS2 method. We also do not know how platforms like the Q Exactives compare to the Lumos in MS2 mode. The interplay between scan speed and sensitivity is always hard to predict, and it is different for higher abundance peptides compared to peptides near detection limits.

So what platform/mode is the clear winner here? Like everything in life, the answer is that it depends. You are usually stuck with the instrument you have. Modes like SPS MS3 clearly have less interference, but also lower sensitivity. Lower sensitivity can often be improved at the cost of increased instrument time (more fractionation). MS2 modes and platforms will likely have the greatest proteome profiling depths. The biggest challenge in proteomics data analysis (in my experience) is that every experiment is different. Each experiment has different goals and the tradeoffs will have different weights. What is useful is a thorough understanding of the strengths and weaknesses of all of the reasonable options. More than one option might have to be explored to find the technique that can best meet the experimental goals.

Hey, where is the MaxQuant data?

Good question! Working up the PAW data took longer than I thought. The PAW data is a little simpler since its tables are more concise, and I can tweak my IRS normalization script for these special cases where a balanced study design is used as an approximation to reference channels. I have added several Excel sheets to the repository (described in the README.md file), including MaxQuant results. The data from MaxQuant is very similar in appearance to the PAW data when plotted similarly. I will add the MaxQuant work-up in the near future. There is also a sheet where signal-to-noise ratios were used in the PAW pipeline instead of intensities. Again, the results were qualitatively very similar.

I did not explore the lower-level peptide intensities in this analysis. That data is present in the PAW peptide reports and in the MaxQuant evidence files. Things are quite a bit more ragged at the PSM/peptide level. This should give one some pause when thinking about doing expression comparisons at these lower levels. Attempts to correct tag intensities with the isotopic purity factors from the manufacturer's spec sheets are usually done at the PSM level. It is not clear that the data is good enough to do that accurately for all PSMs. It is also not clear if the data sheets are accurate enough to do this at all, especially with the N- and C-forms of the tags.
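For what such a purity correction looks like in principle, here is a sketch with a made-up 3-channel impurity matrix (column j gives the fraction of tag j's signal that lands in each channel; a real correction would use the reagent lot's certificate values and all of the channels):

# isotopic impurity correction sketch: observed = impurity %*% true, so invert
impurity <- matrix(c(0.95, 0.03, 0.00,
                     0.04, 0.93, 0.05,
                     0.00, 0.04, 0.94),
                   nrow = 3, byrow = TRUE) # made-up purity values
observed  <- c(10500, 9800, 11200)         # one PSM's reporter ion intensities
corrected <- solve(impurity, observed)     # remove the channel cross talk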

The lesson I take from these analysis efforts is that isobaric tagging is not a perfect quantitative method, and the data analyses of real experiments will have limitations. MS2-based TMT data has non-specific background levels present (probably in all channels), and that will confound intensity measurements within single TMT experiments. There are also isotopic effects where a tag has some cross talk with "adjacent" channels (I have not thought through how the different delta masses of C12 impurities in C13-enriched isotopes and N14 impurities in N15-enriched isotopes affect the "-1" situation). This cross talk varies tag by tag depending on its composition. This means that the tags confound each other in an extremely complicated way. How this tag confounding contributes to confounding between biological samples depends on how the samples are allocated to channels within and between the different TMT experiments. I think worrying too much about all of these factors is a fool's errand. Most studies are not limited by the accuracy of the reporter ion signals, but by the quality of the study design and sample preparations.

-Phil Wilmarth, September 22, 2018.

In [45]:
sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2.2  edgeR_3.22.2    limma_3.36.3    cowplot_0.9.3  
 [5] forcats_0.3.0   stringr_1.3.1   dplyr_0.7.6     purrr_0.2.5    
 [9] readr_1.1.1     tidyr_0.8.1     tibble_1.4.2    ggplot2_3.0.0  
[13] tidyverse_1.2.1 psych_1.8.4    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18         locfit_1.5-9.1       lubridate_1.7.4     
 [4] lattice_0.20-35      assertthat_0.2.0     digest_0.6.16       
 [7] IRdisplay_0.5.0      R6_2.2.2             cellranger_1.1.0    
[10] plyr_1.8.4           repr_0.15.0          backports_1.1.2     
[13] evaluate_0.11        httr_1.3.1           pillar_1.3.0        
[16] rlang_0.2.2          lazyeval_0.2.1       uuid_0.1-2          
[19] readxl_1.1.0         rstudioapi_0.7       labeling_0.3        
[22] foreign_0.8-71       munsell_0.5.0        broom_0.5.0         
[25] compiler_3.5.0       modelr_0.1.2         pkgconfig_2.0.2     
[28] base64enc_0.1-3      mnormt_1.5-5         htmltools_0.3.6     
[31] tidyselect_0.2.4     crayon_1.3.4         withr_2.1.2         
[34] grid_3.5.0           nlme_3.1-137         jsonlite_1.5        
[37] gtable_0.2.0         magrittr_1.5         scales_1.0.0        
[40] cli_1.0.0            stringi_1.2.4        xml2_1.2.0          
[43] IRkernel_0.8.12.9000 RColorBrewer_1.1-2   tools_3.5.0         
[46] glue_1.3.0           hms_0.4.2            parallel_3.5.0      
[49] colorspace_1.3-2     rvest_0.3.2          pbdZMQ_0.3-3        
[52] bindr_0.1.1          haven_1.1.2