Friedman directs the proteomics laboratory, part of the Mass
Spectrometry Research Center. Mass spectrometry and bioinformatics
are used to identify proteins separated by 2D-gel electrophoresis
and other methods.
yields protein answers
by Leigh MacMillan
Bioinformatics is the key final step in assuring that the proteomics
shared resource can do what it does: identify proteins.
Those proteins might be the ones that change in cells treated with
a new chemotherapy drug, or they might be the ones associated with
a large complex. Whatever the proteins, the proteomics laboratory
draws on several different methods to separate them, and then uses
mass spectrometry and bioinformatics to identify them.
"We generate the mass spectrometry data and then rely on the
bioinformatics field to get our answers," says David Friedman,
director of the proteomics laboratory, which was established as
a component of the Mass Spectrometry Research Center under the leadership
of Richard Caprioli.
A common approach for identifying proteins uses 2D-gel electrophoresis
to separate mixtures of proteins based on two physical attributes:
isoelectric point and molecular weight. Individual protein
spots on the 2D-gel can be cut out of the gel, digested into
peptides, and analyzed by mass spectrometry.
This technology is most often directed to finding the proteins
that are changing, for example under different experimental conditions,
or in disease tissue versus normal tissue. For higher throughput,
the core takes advantage of fluorescent dye labels and laser imaging.
"This is another way we use bioinformatics," Friedman
says. "We can directly compare two or three samples, labeled
with different dyes and separated at the same time. The computer
algorithm will tell us who's changing, who's not changing,
and by how much. It's very powerful."
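The kind of comparison Friedman describes can be illustrated with a minimal Python sketch. It is not the core's software: the function name `spot_changes`, the 1.5-fold threshold, and the spot intensities are all hypothetical, and it assumes the spots have already been matched between dye channels and normalized. Real analysis packages also apply statistical tests across replicate gels.

```python
import math

def spot_changes(intensity_a, intensity_b, threshold=1.5):
    """Compare matched spot intensities from two dye channels.

    Returns {spot_id: (log2_ratio, label)}, where log2_ratio is
    log2(b / a) rounded to two decimals. The fold-change threshold
    is an illustrative assumption, not a value from the article.
    """
    results = {}
    for spot_id, a in intensity_a.items():
        b = intensity_b.get(spot_id)
        if b is None or a <= 0 or b <= 0:
            continue  # skip spots missing or empty in either channel
        log2_ratio = math.log2(b / a)
        if abs(log2_ratio) >= math.log2(threshold):
            label = "up" if log2_ratio > 0 else "down"
        else:
            label = "unchanged"
        results[spot_id] = (round(log2_ratio, 2), label)
    return results

# Hypothetical intensities for three spots in two samples.
control = {"spot1": 1000.0, "spot2": 500.0, "spot3": 800.0}
treated = {"spot1": 4000.0, "spot2": 510.0, "spot3": 200.0}
changes = spot_changes(control, treated)
for spot_id, (ratio, label) in sorted(changes.items()):
    print(spot_id, ratio, label)
```

Working in log2 ratios treats a fourfold increase and a fourfold decrease symmetrically, which is why the sketch reports the ratio on that scale rather than as a raw quotient.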
The core's automated system allows users to select spots for
automatic sampling, digestion, and mass spectrometry analysis. "Each
protein has a characteristic signature of tryptic peptides,"
Friedman says. Bioinformatic search algorithms compare an experimental
signature to a theoretical digest of every protein in
a selected database and return a match, if one exists in the database.
"Our approach is completely dependent on the protein being
in the database," Friedman says. "We rely on the databases
being properly annotated, maintained, and continuously updated."
The core makes use of databases containing complete annotated proteins
as well as those for expressed sequence tags (ESTs).
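The matching step described above, comparing an experimental peptide signature to a theoretical digest of each database entry, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the search software the core uses: the two database sequences and the observed masses are invented, the helper names are hypothetical, masses are neutral monoisotopic values with no modifications or missed cleavages, and the score is a plain count of matching peptides rather than the probabilistic scoring real search engines apply.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini

def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[r] for r in peptide) + WATER

def count_matches(observed, sequence, tol=0.2):
    """Count observed masses within tol Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)

def best_match(observed, database, tol=0.2):
    """Return (name, score) for the best entry, or None if nothing matches."""
    scores = {name: count_matches(observed, seq, tol)
              for name, seq in database.items()}
    name = max(scores, key=scores.get)
    return (name, scores[name]) if scores[name] > 0 else None

# Hypothetical two-entry database and observed peptide masses.
database = {"protA": "GASKVLTRPEK", "protB": "MKWFYRHDE"}
observed = [361.20, 841.50]
print(best_match(observed, database))
```

The `None` return mirrors Friedman's point: if the protein is not in the selected database, no amount of searching produces an identification.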
The search algorithms for matching experimental mass spectra are
either commercially available or free, Friedman says. Like the databases,
these algorithms are regularly updated and improved.
The algorithms have to be especially powerful to conduct searches
on data from complex mixtures of proteins. Andrew Link, assistant
professor of Microbiology & Immunology, and collaborators developed
a technology and analysis algorithm called SEQUEST to directly analyze
and identify all of the proteins present in a purified protein complex.
To speed the analysis, Link built a 20-node parallel processor.
The parallel processor, Friedman says, makes the experiment possible,
reducing the database search to hours as opposed to many
Improvements to processing speed will likely be the limit of bioinformatics
development efforts for the proteomics shared resource, Friedman
says. "We're advancing the field of proteomics by improving
the technologies for protein separation and detection and by developing
new technologies," Friedman says. "We rely on the expertise
of bioinformaticians to keep database searching state-of-the-art."