NBIC Product showcase
On this page NBIC lists (in reverse chronological order) software and database projects that you should know about. These projects were created by NBIC or in collaboration with NBIC, and we invite anyone to start using them. User feedback on these tools is very welcome!
Rite
Rite is a pilot job framework written in Java that allows you to submit jobs to various compute resources (e.g. cluster, grid). It consists of a robust pilot job client and a server with an integrated MongoDB database. Key features of the system are:
- Robust pilot framework that will retry failed or timed-out jobs
- Recipes describing reusable jobs via JSON documents or a Java API
- On the fly resolution of files through indirection
- Central storage of console output and status of jobs
- Querying of job status and results with MongoDB's native query language or through the MongoDB web services
Rite is an open source project released under the GNU Lesser General Public License version 3 and can be downloaded from the NBIC Trac.
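The retry behaviour listed among the key features can be illustrated with a minimal sketch. This is not Rite's Java API or its actual implementation: the function names `run_with_retries` and `make_flaky_job` are hypothetical, and a job is modelled here simply as a Python callable that raises an exception (for example `TimeoutError`) when it fails or times out.

```python
def run_with_retries(job, max_attempts=3):
    """Call job() until it succeeds or max_attempts is reached.

    Hypothetical helper, not part of Rite. Any exception raised by the
    job, including TimeoutError, counts as a failed attempt and
    triggers a retry.
    Returns a tuple (result, attempts_used).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return job(), attempt
        except Exception as exc:
            last_error = exc  # remember why this attempt failed
    raise RuntimeError(
        f"job failed after {max_attempts} attempts: {last_error!r}"
    )


def make_flaky_job(failures):
    """Build a demo job that times out `failures` times, then succeeds."""
    state = {"calls": 0}

    def job():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError(f"attempt {state['calls']} timed out")
        return "done"

    return job


if __name__ == "__main__":
    result, attempts = run_with_retries(make_flaky_job(2))
    print(result, attempts)
```

A real pilot job framework would typically also record each attempt centrally, as the feature list describes for console output and job status. The sketch leaves that out.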
NBIC Galaxy server
NBIC Galaxy is based on the Galaxy system developed by Penn State University. BioAssist task forces use this server to build and publish their workflows. This server is maintained on an academic best-effort basis and anyone is welcome to use it.
We try to keep this machine as stable as possible, but beware that your data may vanish without notice, so make sure you keep backups of your precious data. Each registered user is entitled to a disk quota of 10 GB; anonymous users have a disk quota of 10 MB.
Peregrine
Peregrine is a very fast software package used to recognize interesting multi-word terms in human text. Peregrine was originally developed by Martijn Schuemie at the department of Medical Informatics of the Erasmus University Medical Center (EMC) in Rotterdam. The package was the first project in 2009 to be taken up by NBIC's BioAssist Engineering team, which has been preparing the open source release together with the EMC by making the program easy to use and the code easier to extend and maintain.
Peregrine can now be found at https://trac.nbic.nl/data-mining/ and downloaded under an AGPL license.
PDB_REDO
PDB_REDO is a databank of experimental macromolecular structure models optimised for bioinformatics research. Unlike experimental structures from the PDB, which were solved with the methods of their era by a diverse group of experimentalists, PDB_REDO entries are optimised using a fully automated procedure that employs the latest methods in X-ray crystallography. This procedure improves the fit with the original experimental data as well as the geometric quality of the structures. Because all structure models in PDB_REDO are treated by the same procedure, they form a more consistent and uniform data set suited for high throughput studies. PDB_REDO covers 98% of all crystallographic PDB entries for which experimental data was deposited.
The databank can be found at: http://www.cmbi.ru.nl/pdb_redo
CLI-mate: Galaxy tool generator
CLI-mate is a service that helps developers create user-friendly interfaces for command line tools.
In the agile development environment of bioinformatics, many command line tools are created quickly to fill in gaps between complex information processes. A command line interface (CLI) is sometimes sufficient for the task, but it limits adoption by a broader audience. It is therefore often necessary for the developer to create a wrapper that provides a more user-friendly interface. The CLI-mate interface generator makes this easy: it can generate different wrappers, one of which turns the program into a Galaxy tool.
CLI-mate was developed at the Department of Human Genetics, Leiden University Medical Center (LUMC).
PMID2DOI
PMID2DOI is a service that converts between two types of identifiers: the PubMed Identifier (PMID), a unique number assigned to PubMed citations of life science journal articles, and the Digital Object Identifier (DOI™), which is used for identifying digital content. DOIs are used to provide current information, including where the content (or information about it) can be found on the Internet. DOIs can also be used as part of the provenance information for each nanopublication. SOAP and REST web services are available for this conversion. In addition, a SPARQL endpoint can be used to query the conversion system.
The service can be found at: http://www.pmid2doi.org/
Taverna-Galaxy Tool Generator
Galaxy and Taverna are two widely used systems for combining bioinformatics tools to perform a larger analysis. Taverna is the more sophisticated workflow system, while Galaxy is popular among genomics researchers and used by many bioinformaticians to make scripts available to colleagues. Each has its own strengths. We therefore built a generator that constructs a Galaxy tool from a Taverna workflow, enabling it to run seamlessly in Galaxy.
The generator is available for download, and is part of http://myExperiment.org/, a community web site for computational scientists. Here, you can simply download a workflow as a Galaxy tool and install it into a Galaxy server.
Nutritional Phenotype Database
The Nutritional Phenotype Database (DbNP) helps biologists interpret the results of biology studies that involve multiple 'omics' techniques. Initially, it was aimed at medium-sized nutrigenomics intervention studies (hence the name), but it is now also used for storing studies from other areas of biology, such as environmental plant studies. DbNP can be used to store detailed information about the design of your studies, to link those study designs to actual 'omics' data, and to interpret the measured data along the axes of your study design.
See: http://www.dbnp.org/
Galaxy Virtual Machine
Galaxy is a system that makes many bioinformatics tools, especially for next generation sequencing, available through a user-friendly web interface. It is being developed by Penn State University. NBIC has its own Galaxy server that includes tools developed by BioAssist task forces (http://galaxy.nbic.nl/). That server is available to anyone, but is in practice limited to rather small data sets. A Galaxy Virtual Machine is now available (19 GB download from the web or using BitTorrent) for users who want to run their own copy of Galaxy with all NBIC tools preinstalled.
CitedIn
CitedIn is a web service with an API to find citations of scientific publications in online public data. CitedIn contains literature citations from a broad selection of online resources, including bibliographic databases (PubMed, Google Scholar, etc.), biomedical databases (UniProt, KEGG), wikis (Wikipedia, WikiPathways, Brede Wiki), social networks (Connotea, CiteULike), and blogs (Nature Blogs, Google Blogs).
CitedIn is available at http://www.citedin.org/
Warp2D
Warp2D is a tool containing a new algorithm for time alignment of multiple MS spectra, mainly in proteomics. Since running pairwise time alignment on a large set of spectra can take quite a long time, NBIC's BioAssist program and the NPC have created a web service that allows users to run Warp2D on the life science grid. This is the first tool made available using the DAF software under development in BioAssist.
The Warp2D web service is available through the web site of the Netherlands Bioinformatics for Proteomics Platform, NBPP.
CytoscapeRPC
CytoscapeRPC is an extension to Cytoscape that allows your own software to use it as a graphical front end for your data visualisation. CytoscapeRPC is used through a standard XML-RPC interface, and can therefore be used from almost any imaginable programming language.
CytoscapeRPC is open source and runs on all systems that support Cytoscape. The package can be found on the NBIC project server with descriptions on the NBIC wiki.
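Because the interface is standard XML-RPC, a client needs nothing more than an XML-RPC library. The sketch below uses Python's standard `xmlrpc.client` module to show what a call looks like on the wire. The method name `Cytoscape.createNetwork`, its argument, and the server address in the comment are assumptions made for illustration and should be checked against the CytoscapeRPC descriptions on the NBIC wiki.

```python
import xmlrpc.client

# Hypothetical method name; consult the CytoscapeRPC documentation
# for the methods the plugin really exposes.
METHOD = "Cytoscape.createNetwork"


def build_request(method, *args):
    """Encode an XML-RPC call as the XML document sent over HTTP."""
    return xmlrpc.client.dumps(args, methodname=method)


def parse_request(request_xml):
    """Decode an XML-RPC request back into (method, args)."""
    args, method = xmlrpc.client.loads(request_xml)
    return method, args


if __name__ == "__main__":
    xml = build_request(METHOD, "My network")
    print(xml)
    print(parse_request(xml))

    # Against a running Cytoscape with the plugin enabled, the same
    # call could be made through a proxy object. The address below is
    # an assumption, not a documented default:
    #
    #   server = xmlrpc.client.ServerProxy("http://localhost:9000")
    #   server.Cytoscape.createNetwork("My network")
```

The request is plain XML over HTTP, so the same call can be produced by XML-RPC libraries in Java, Perl, R or any other language.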
R/GPU
R/GPU is a package that allows programmers to use a GPU (the graphics processor in a computer) to speed up bioinformatics analysis using R. It behaves like magic: once R/GPU is installed, your R scripts will automatically use it and achieve much higher speeds. Large matrix multiplications, for example, may run 50x faster.
R/GPU is available as an open source project. It is in beta release and can be found on the NBIC project development site.
MOLGENIS
MOLGENIS is a system that takes a relatively simple description of the kind of information you would like to store, and at the push of a button generates a complete database system with associated web site that allows you to add data to the database and query it.
MOLGENIS is open source software written in Java, and it is available through its own web site.
GPCRDB
GPCRDB is an information system for G-protein coupled receptors (GPCRs). It collects, combines, validates, and disseminates large amounts of heterogeneous data. The GPCRDB contains experimental data on sequences, ligand binding constants, mutations, and oligomers, as well as many different types of computationally derived data such as multiple sequence alignments and homology models.
GPCRDB is a web resource that provides several access methods. The authors are open to collaboration on the data.
StatQuant
StatQuant is an analysis toolbox for quantitative mass spectrometry. It offers a set of statistical tools to process, filter, compare and represent data from several quantitative proteomics software packages such as MSQuant. StatQuant offers the researcher post processing methods to achieve improved confidence on the obtained protein ratios.
StatQuant runs on Windows, Mac and Linux and is available as open source from the NBIC project repository.


