Software
Sites with software catalogues
DAVID (Database for Annotation, Visualization and Integrated Discovery; National Institute of Allergy and Infectious Diseases (NIAID), NIH)
Proteomics Research Resource for Integrative Biology Tools (Pacific Northwest National Laboratory)
Biological MS Data and Software Distribution Center (Pacific Northwest National Laboratory)
PatternLab for proteomics: a tool for analyzing shotgun proteomic data
The Global Proteome Machine Organization
Proteomics Tools (Proteomics centre, Medical College of Wisconsin)
Microarray, SAGE and other gene expression databases (Health Sciences Library System; University of Pittsburgh/UP Medical Center)
Microarray, SAGE and other gene expression data analysis tools (Health Sciences Library System; University of Pittsburgh/UP Medical Center)
Protein Sequence Databases and Analysis Tools (Health Sciences Library System; University of Pittsburgh/UP Medical Center)
Proteomics Resources (Health Sciences Library System; University of Pittsburgh/UP Medical Center)
Max Planck Society Wiki
Max-Planck-Institut fuer Biochemie-Bioinformatics Support Services
Analysis of proteomes
- The Proteome Analyst Specialized Subcellular Localization Server (PA-SUB); part of Proteome Analyst (PA). PA is a web server built to predict protein properties, such as general function, in a high-throughput fashion. PA-SUB is specialized to predict the subcellular localization of proteins using established machine learning techniques
Comprehensive subcellular location analysis for all E. coli proteins, created using the publicly available prediction algorithms together with experimental data and in-house manual curation.
Predicted subcellular localization for entire proteomes (Human, Arabidopsis, Fruit Fly, Worm & Yeast)
Interactomics/Protein-Protein interactions tools
The Comparative Proteomic Analysis Software Suite (CompPASS) is designed to better assess the confidence level of protein-protein interactions in large nonreciprocal parallel data sets (e.g. antibody pull-downs with identification of interacting proteins by MS/MS), for which existing methods are not well suited. CompPASS includes components that store, organize, and process data, and these components are linked to networking and functional-analysis tools. An unbiased comparative method assigns scores to the identified proteins. In addition to a conventional Z score, the algorithm calculates a D score that takes into account the uniqueness, reproducibility, and abundance of the interacting protein.
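As a rough illustration of the conventional Z score mentioned above, the sketch below standardizes the spectral counts of one prey protein across several pull-downs. The bait names and counts are hypothetical, and the exact normalization used by CompPASS, as well as its D score formula, is not given here, so this should not be read as the CompPASS implementation.

```python
from statistics import mean, pstdev


def z_scores(counts_by_bait):
    """Return a Z score for one prey protein in each pull-down.

    counts_by_bait maps a (hypothetical) bait name to the spectral count
    observed for the same prey protein in that bait's pull-down.
    """
    values = list(counts_by_bait.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # Identical counts everywhere: no pull-down stands out.
        return {bait: 0.0 for bait in counts_by_bait}
    return {bait: (x - mu) / sigma for bait, x in counts_by_bait.items()}


# Hypothetical counts: the prey is abundant only with bait_A.
counts = {"bait_A": 40, "bait_B": 2, "bait_C": 0, "bait_D": 2}
scores = z_scores(counts)
```

A prey that is abundant with a single bait gets a high Z score there and negative scores elsewhere, which is the intuition behind flagging it as a candidate specific interactor.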
iHop:
Literature navigation tool for protein interactions
Jena Centre for Bioinformatics Protein-Protein Interaction Website
PIC:
Protein Interactions Calculator
Database of Interacting Proteins
The Protein Interaction Network Analysis (PINA) platform is an integrated platform for protein interaction network construction, filtering, analysis, visualization and management. It integrates protein-protein interaction data from six public curated databases and builds a complete, non-redundant protein interaction dataset for six model organisms (Human, Fruit Fly, Worm, Mouse, Rat & Yeast). It also provides a variety of built-in tools to filter and analyze the network. In addition, PINA allows users to edit the network generated from the public data, or to combine it with uploaded private data to build more complete protein-protein interaction networks. [info]
PPIspider: Implements a robust statistical framework for the interpretation of protein lists in the context of a global PPI network.
PLIPS (Protein Lists Identified in Proteomics Studies)
The ultimate result of a proteomics study is a list of proteins found to be present (or differentially present) under various cell physiological conditions. Normally the results are presented in a publication in one or several tables. The bulk of this type of information remains dispersed in hundreds of proteomics-related publications. We have developed a web mining tool that collects this information by searching through full-text papers and automatically selecting tables that report a list of protein identifiers. By searching through major proteomics journals, we have collected approximately 800 independent studies published recently, which reported about 1000 different protein lists. Based on this data, we developed a computational tool, PLIPS (Protein Lists Identified in Proteomics Studies). PLIPS accepts as input a list of protein/gene identifiers. Using statistical analyses, PLIPS identifies recently published proteomics studies that report protein lists that significantly intersect with a query list.
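One common way to judge whether two protein lists "significantly intersect" is a hypergeometric tail probability. The sketch below is only an illustration of that idea: the statistic PLIPS actually uses is not stated here, and the function name, the universe size and the identifiers are all hypothetical.

```python
from math import comb


def overlap_p_value(universe_size, published_size, query_size, overlap):
    """P(X >= overlap) when query_size proteins are drawn at random from a
    universe of universe_size proteins, published_size of which appear in
    the published list (hypergeometric upper tail)."""
    max_overlap = min(published_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(published_size, k)
        * comb(universe_size - published_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical identifiers: the query list matches a published list exactly.
query = {"P1", "P2", "P3", "P4", "P5"}
published = {"P1", "P2", "P3", "P4", "P5"}
p = overlap_p_value(20, len(published), len(query), len(query & published))
```

A small p-value suggests the overlap is unlikely to arise by chance. In practice the choice of universe (all proteins that could have been identified) strongly affects the result.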
Reactome: a curated knowledgebase of biological pathways
Mass Spectrometry data analysis tools
Integrated suite of browser-driven, multi-functional analysis software for interpreting, comparing, and displaying tandem mass spectrometry results. BIGCAT relies on a back-end Oracle database utilizing a simple, adaptable schema for results storage. Interactive, web-deployed front-end applications allow users to filter, view, compare, analyze, and visualize mass spectrometry results. (From the Link Lab, Vanderbilt University Medical Center)
Corra
computational framework and tools for LC/MS proteomics. Extends and adapts existing algorithms used for LC-MS-based proteomics, and statistical algorithms, originally developed for microarray data analyses, appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). [info] [ref]
The MAss SPECTRometry Analysis System is a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI). Analysis modules include: 1) import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; 2) peptide validation; 3) clustering of proteins based on Markov Clustering and multiple alignments; and 4) quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio). The system provides customizable data retrieval and visualization tools, as well as export to the PRoteomics IDEntifications public repository (PRIDE). [pubmed]
Estimates false discovery rates for protein IDs in large data sets of all sizes, especially those composed of ≥100 LC/MS/MS runs. MAYU software is publicly available as an integrated part of the Trans-Proteomic Pipeline suite of tools and as a stand-alone package. [abstract]
Site includes useful programs for analyzing the mass spectra of proteins and peptides.
Sheffield ChemPuter: Isotope Patterns Calculator
Site predicts isotope pattern based on molecular formula.
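To show what predicting an isotope pattern from a molecular formula involves, here is a minimal sketch that convolves per-atom isotope distributions at nominal-mass resolution. It is not the ChemPuter's method, which is not described here. The abundance table covers only C, H, N, O and S with approximate natural abundances, and the helper names are hypothetical.

```python
import re

# Approximate natural abundances, keyed by nominal mass offset from the
# lightest isotope of each element (assumed values, small-molecule use).
ISOTOPES = {
    "C": {0: 0.9893, 1: 0.0107},
    "H": {0: 0.999885, 1: 0.000115},
    "N": {0: 0.99636, 1: 0.00364},
    "O": {0: 0.99757, 1: 0.00038, 2: 0.00205},
    "S": {0: 0.9499, 1: 0.0075, 2: 0.0425, 4: 0.0001},
}


def parse_formula(formula):
    """Turn a simple formula such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def convolve(a, b):
    """Combine two distributions of nominal mass offsets."""
    out = {}
    for shift_a, p_a in a.items():
        for shift_b, p_b in b.items():
            out[shift_a + shift_b] = out.get(shift_a + shift_b, 0.0) + p_a * p_b
    return out


def isotope_pattern(formula):
    """Return {mass offset: probability}, where offset 0 is the monoisotopic peak."""
    pattern = {0: 1.0}
    for element, count in parse_formula(formula).items():
        for _ in range(count):  # add one atom at a time
            pattern = convolve(pattern, ISOTOPES[element])
    return dict(sorted(pattern.items()))


glucose = isotope_pattern("C6H12O6")
```

The atom-by-atom convolution is simple but slow for large formulas. Formulas with parentheses or other elements are outside this sketch.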
Virtual Proteomics Data Analysis Cluster
ViPDAC is a set of free tools to be used in combination with Amazon's inexpensive "cloud computing" service, which provides the option to rent processing time on its powerful servers; and free open-source software from the National Institutes of Health (NIH) and the University of Manitoba.
Open source software that can match tandem mass spectra with peptide sequences, in a process that has come to be known as protein identification. All of the X! Series search engines calculate statistical confidence (expectation values) for all of the individual spectrum-to-sequence assignments. They also reassemble all of the peptide assignments in a data set onto the known protein sequences and assign the statistical confidence that this assembly and alignment is non-random. (From The Global Proteome Machine Organization)
MS dissociation methods
Knowing a precursor ion's charge state before submitting its MS2 spectrum to a search engine is advantageous. If the charge state is unknown, the search must be executed for every charge state hypothesis, which can add a large computational burden. Charge Prediction Machine (CPM) is software for inferring precursor charge state from low-resolution ETD mass spectra. CPM is rooted in Bayesian decision theory and introduces methods to account for different co-fragmenting precursor ion species. It has a graphical user interface, and it can also be run from the command prompt for integration into bioinformatic pipelines (this version is available upon request). [info]
MRM-SRM analysis
MRMAtlas
It is a compendium of targeted proteomics assays to detect and quantify yeast proteins in complex proteome digests by mass spectrometry.
Multiple-reaction monitoring (MRM) is a powerful method for the quantitation of specific proteins by MS/MS. The technique involves the selective MS analysis of peptide ions and fragmented product ions from proteins of interest. The pairing of the m/z of a user-specified peptide with that of its chosen product ion is called the "MRM transition". MRMaid helps select optimal MRM transitions. To identify and score the most reliable MRM transitions for a particular protein, the MRMaid program uses known optimal MRM transitions in combination with mining of an MS proteomic data repository. In this way, MRMaid obviates the need for time-consuming MS discovery studies or theoretical fragmentation predictions. In addition, the program predicts the RP-HPLC retention times of peptides; this allows the ordering of multiple transition [info]
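To make the notion of a transition concrete, the sketch below computes a precursor m/z and a y-ion product m/z for a peptide from monoisotopic residue masses. It treats a transition as a (precursor m/z, product m/z) pair, which is the usual convention. The function names are hypothetical, and this is not how MRMaid selects or scores transitions.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def precursor_mz(peptide, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide, n_residues, charge=1):
    """m/z of the y ion made of the last n_residues residues."""
    if not 1 <= n_residues < len(peptide):
        raise ValueError("n_residues must be between 1 and len(peptide) - 1")
    fragment = sum(MONO[aa] for aa in peptide[-n_residues:]) + WATER
    return (fragment + charge * PROTON) / charge


def transition(peptide, precursor_charge, y_index):
    """Return a (precursor m/z, product m/z) pair for a singly charged y ion."""
    return (precursor_mz(peptide, precursor_charge), y_ion_mz(peptide, y_index))


# Hypothetical tryptic peptide, doubly charged precursor, y4 product ion.
q1, q3 = transition("SAMPLER", 2, 4)
```

Here `q1` and `q3` correspond to the m/z values a triple-quadrupole instrument would select in its first and third quadrupoles. Modified residues and b ions are left out to keep the sketch short.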
Post-translational modifications
Free on-line KnowledgeBase; features detailed information on over 92,400 phosphorylation sites in over 13,800 human proteins, including their evolutionary analysis in over 20 other species. Highly conserved phospho-sites are likely to be the most functionally important. Kinexus has recently developed a proprietary algorithm that predicts the optimum phosphorylation site specificities of 500 human protein kinases. Our new Kinase Predictor Module lists the top 10% of these kinases that best match each of the human phospho-sites in PhosphoNET. Over 4.6 million kinase-substrate phospho-site pairs are quantified. Find out which kinases target your favorite proteins at which phospho-sites with PhosphoNET.
It is a project to support systems biology signaling research by providing interactive interrogation of MS-derived phosphorylation data from 4 different organisms (fly, human, worm and yeast).
Predicts acetylation sites across a proteome
Protein Disulphide Linkage Modeler
Systematic and sophisticated platform for proteomic PTM research, equipped not only with a knowledge base of manually curated multi-type modification data, but also with four fully developed, in-depth data mining tools. Currently, SysPTM contains data detailing nearly 50 PTM types, curated from public resources. Protein annotations including Pfam domains, KEGG pathways, GO functional classification, and ortholog groups are integrated into the database. Four online tools have been developed and incorporated: PTMBlast, to compare a user's PTM dataset with PTM data in SysPTM; PTMPathway, to map PTM proteins to KEGG pathways; PTMPhylog, to discover potentially conserved PTM sites; and PTMCluster, to find clusters of multi-site modifications.
Software/Search engines
2D-DB
Bioinformatics solution for storage, integration and analysis of quantitative proteomics data. It is based on a core data model describing fundamentals such as experiment description and identified proteins. The extended data models are built on top of the core data model to capture more specific aspects of the data. A number of public databases and bioinformatical tools have been integrated giving the user access to large amounts of relevant data. A statistical and graphical package, R, is used for statistical and graphical analysis. The current implementation handles quantitative data from 2D gel electrophoresis and multidimensional liquid chromatography/mass spectrometry experiments. Available for download at SourceForge.[pubmed]
The Computational Proteomics Analysis System (CPAS) integrates openly available software infrastructure for systematic proteomic data analysis and data management into a single Web-based platform for mining liquid chromatography-tandem mass spectrometry (LC-MS/MS) proteomic experiments. CPAS incorporates several tools currently used in proteomic analysis, including the X! Tandem search engine and the PeptideProphet and ProteinProphet data mining tools. The application is built on the open-source LabKey platform, an extensible architecture for developing high-throughput biological applications. The CPAS analysis pipeline acts on data in standardized file formats, so that researchers may use CPAS with other search engines, including Mascot or SEQUEST, that follow a standardized procedure for reporting search engine results. [pubmed]
pProRep
Web application integrating electrophoretic and mass spectral data from proteome analyses into a relational database. The graphical web interface allows users to upload, analyse and share experimental proteome data. It offers the possibility to query all previously analysed datasets and can visualize selected features, such as the presence of a certain set of ions in a peptide mass spectrum, on the level of the two-dimensional gel. Download from http://www.ptools.ua.ac.be/pProRep. Requires a web server that runs PHP 5 (http://www.php.net) and MySQL. [pubmed]
Label-free quantitation; support for replicate analyses; view protein expression plots
Informatics Solutions for Life Science Research
Proteios SE (ProSE) is built around a Web-based local data repository for proteomics experiments. The application features sample tracking, project sharing between multiple users, and automated data merging and analysis. ProSE has built-in support for several quantitative proteomics workflows, and integrates searching in several search engines, automated combination of the search results with predetermined false discovery rates, annotation of proteins and submission of results to public repositories. ProSE also provides a programming interface to enable local extensions, as well as database access using Web services. ProSE provides an analysis platform for proteomics research and is targeted at multiuser projects that need data sharing, sample tracking, and shared analysis results. ProSE is open source software, available at http://www.proteios.org, meant to be installed on a local server in a proteomics laboratory. The server is accessed through any web browser, using personal login accounts with administered access levels. With his or her own account, a user can enter data into the database, group experiments together into projects, and run protein identification engines and other analysis tools. Users can choose to share almost any database item (e.g. samples, data, experiments, files) with other users to facilitate online collaboration. [publication]
PRoteomics IDEntifications database.
PRIDE Converter: makes it straightforward to submit proteomics data to PRIDE from most common data formats [Publication]
Proteomics, Protein ID & Characterization, ESI-MS, 2DE
Protein Information Retrieval Online World Wide Web Lab
Intact protein analysis
Using a web portal, ProSight PTM allows identification and characterization of intact proteins (and their post-translational modifications, PTMs) using the Top-Down Approach. This site has many tools and graphical features to facilitate analysis of single (recombinant) proteins, proteins in mixtures, and proteins fragmented in parallel. Our ProSight Warehouses are annotated with all known post-translational modifications (PTMs), alternative splicing events and single nucleotide polymorphisms (SNPs) using the technique of Shotgun Annotation developed in the Kelleher Research Group. ProSight PTM is the only proteomics software that allows the user to search their tandem MS data against proteome warehouses containing the known biological complexity present in UniProt. [PDF]
Cross-linked peptides
A search engine for cross-linked peptides from complex samples. Presented at ASMS07; published in Nature Methods. xQuest works with small and large protein databases and features flexible fragment ion assignment, advanced scoring, and interactive evaluation tools.
Data visualization
(LC-MS)
Cytoscape
Open source bioinformatics software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data.
Integrates complex datasets with 3D virtual world platforms. The platform supports some advanced scientific operations, especially relevant for the field of genetic analysis, such as overlay of mass spectrometry and/or microarray information onto images of specimens; integration with all the major proteomics and genomics databases; and import of molecular models into the virtual space.
An Open-Source Viewer for Mass Spectrometry Data.
(2D-GE)
tool to visualize theoretical distributions of peptide pI on a given pH range, and proposes a fractionation scheme that generates fractions with similar peptide frequencies
Software for the simulation and analysis of proteomics data. Determines the theoretical isoelectric points (pI) and the calculated molecular weights (MW) of proteins and visualizes these as a virtual two-dimensional (2D) protein map. The user is able to control the presentation of the calculated 2D gel interactively by selecting a pI/MW range and an electrophoretic timescale of interest.
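The theoretical pI referred to above is usually obtained by finding the pH at which the net charge of the sequence is zero. The following sketch does this by bisection over the Henderson-Hasselbalch charge model. The pKa values are one approximate, commonly tabulated set chosen here as an assumption. Different tools use different sets and will return slightly different pI values, and this is not the algorithm of the software described above.

```python
# Approximate side-chain and terminal pKa values (assumed; sets vary by tool).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence, ph):
    """Net charge of a protein sequence at a given pH."""
    positive = 1 / (1 + 10 ** (ph - PKA_N_TERM))   # N-terminal amine
    negative = 1 / (1 + 10 ** (PKA_C_TERM - ph))   # C-terminal carboxyl
    for aa in sequence:
        if aa in PKA_POSITIVE:
            positive += 1 / (1 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1 / (1 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH where the net charge crosses zero, by bisection.

    The net charge decreases monotonically with pH, so bisection on
    the interval [0, 14] converges to the single zero crossing.
    """
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2


# Hypothetical short sequence used only to exercise the functions.
pi_value = isoelectric_point("ACDEFGHIK")
```

Plotting the resulting pI against a calculated molecular weight for every protein in a proteome gives the kind of virtual 2D map described in the entry above.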
Normalized and filtered expression files can be analyzed using TIGR Multiexperiment Viewer (MeV). MeV is a versatile microarray data analysis tool, incorporating sophisticated algorithms for clustering, visualization, classification, statistical analysis and biological theme discovery. MeV can handle several input file formats. These include the ".mev" and ".tav" files generated by TIGR Spotfinder and TIGR MIDAS, and also Affymetrix® (".txt") and Genepix® (".gpr") files.
Protein lists/MicroArrays
Utilizes a variety of powerful information visualization techniques, including Treemap, Heat Map, Heat Matrix, Line Graph, Scatter Plot, Barseries, Horizon Graph, and Stack Graph. These tools help find hidden patterns that can be buried in large datasets. Makes it easy to synthesize, present and share results with colleagues and customers.
A tool which takes advantage of the Gene Ontology (GO) to extract the proteins' main attributes
Compare multiple protein lists and get to the biology of proteomics data
GOEx :: Gene Ontology Explorer
The Gene Ontology project is a major bioinformatics initiative with the aim of standardizing the representation of gene and gene product attributes across species and databases. The project provides a controlled vocabulary of terms for describing gene product characteristics and gene product annotation data from GO Consortium members, as well as tools to access and process this data
fully automated distributed pipeline for large-scale structural and functional annotation of all major proteomes via the use of cutting edge computer GRID technologies.
Tool for biological interpretation of 'omic' data, including data from gene expression microarrays. Leverages the Gene Ontology (GO) to identify the biological processes, functions and components represented in these lists. Instead of analyzing microarray results with a gene-by-gene approach, GoMiner classifies the genes into biologically coherent categories and assesses these categories. The insights gained through GoMiner can generate hypotheses to guide additional research.
Correlation of Transcriptomics and Proteomics data
protein abundance and RNA expression correlation tool
LIMS (Laboratory Information Management Systems)
LIMS for 2-DGE-based proteomics workflow (open source)
Main features of its design are compactness, flexibility and connectivity to public databases. It supports the handling of data imported from mass spectrometry software and 2-D gel image analysis software. The LIMS is equipped with the same input interface for 2-D gel information as a clickable map on public 2DPAGE databases. The LIMS allows researchers to follow their own experimental procedures by reviewing the illustrations of 2-D gel maps and well layouts on the digestion plates and MS sample plates. [pubmed]
Workflow-optimized laboratory information management system for 2-D electrophoresis-centered proteomics [info]