MSRC Proteomics Core Tools
Protein identification from MS-MS data is done with database search algorithms, which match spectra to peptide sequences. MS-MS data acquired with LC-MS-MS instruments are matched to database sequences using Sequest on a Linux cluster in the Vanderbilt Advanced Computing Center for Research and Education (ACCRE). We have also begun using the open-source application X!Tandem, which likewise matches MS-MS spectra to database sequences. For MS peptide maps and MS-MS data acquired on the ABI 4700 MALDI-TOF-TOF instrument, matching database sequences are identified with Mascot.

Because the database search algorithms attempt to match all spectra to database sequences, regardless of spectral quality, only a subset of the matches is of high quality. Evaluation of sequence-to-spectrum matches is done with Peptide Prophet, which estimates rates of false-positive identifications from database searches with Sequest, X!Tandem, and Mascot. Evaluation of protein identifications from peptide sequence-to-spectrum matches is done with the companion program Protein Prophet.

Database search algorithms for protein identification have limited utility for identifying MS-MS spectra of modified or variant sequences, particularly when the modifications cannot be anticipated prior to database searching. We have developed the P-Mod and SALSA algorithms and software, which identify MS-MS spectra for modified peptides and facilitate the discovery and mapping of protein posttranslational modifications and chemical adducts.

Outputs of database search programs and of peptide and protein identification applications are maintained in a custom-designed database system called CHIPS (Complete Hierarchical Integration of Protein Searches). CHIPS allows filtering, sorting, and comparisons between multiple datasets using flexible, user-defined criteria.
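
The idea of keeping only high-quality matches and estimating how many of them are likely to be wrong can be sketched in a few lines of Python. This is an illustration only, not the code of Peptide Prophet or CHIPS: the `PeptideMatch` record, the function names, the 0.9 threshold, and the example values are all assumptions made for the sketch. It assumes each match carries a probability of being correct, so the expected fraction of false positives in an accepted set is the mean of (1 - probability).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideMatch:
    """One sequence-to-spectrum match (hypothetical record layout)."""
    spectrum_id: str
    peptide: str
    engine: str          # search program that produced the match
    probability: float   # assumed probability that the match is correct


def filter_matches(matches, min_probability=0.9, engines=None):
    """Keep matches at or above a probability cutoff, optionally restricted
    to certain search engines, sorted from most to least confident."""
    kept = [
        m for m in matches
        if m.probability >= min_probability
        and (engines is None or m.engine in engines)
    ]
    return sorted(kept, key=lambda m: m.probability, reverse=True)


def expected_false_positive_rate(matches):
    """Mean of (1 - probability) over the given matches; 0.0 if empty."""
    if not matches:
        return 0.0
    return sum(1.0 - m.probability for m in matches) / len(matches)


# Made-up example values, for illustration only.
matches = [
    PeptideMatch("scan_0001", "LVNELTEFAK", "Sequest", 0.99),
    PeptideMatch("scan_0002", "AEFVEVTK", "X!Tandem", 0.95),
    PeptideMatch("scan_0003", "YLYEIAR", "Mascot", 0.40),
    PeptideMatch("scan_0004", "QTALVELLK", "Sequest", 0.91),
]

accepted = filter_matches(matches, min_probability=0.9)
accepted_ids = [m.spectrum_id for m in accepted]
fp_rate = expected_false_positive_rate(accepted)
print(accepted_ids)
print(round(fp_rate, 3))
```

Raising `min_probability` trades sensitivity for a lower expected false-positive rate; passing `engines={"Sequest"}` restricts the comparison to a single search program.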
