Resources & Publications

Resources

Utopia

Utopia is a collection of interactive tools for analysing protein sequence and structure. Up front are user-friendly, responsive visualisation applications; behind the scenes is a sophisticated model that lets them work together and hides much of the tedious work of dealing with file formats and web services.

Utopia:Documents

The freely available Utopia:Documents scientific PDF-reader allows new ways of thinking about your documents: it brings your papers to life by linking to live resources on the web and turning static data into live interactive content. 

BioCatalogue

The BioCatalogue provides a curated catalogue of Life Science Web Services.

Linked Life Data

Linked Life Data v.0.4.1 holds more than 4 billion triples and more than half a billion resources. New in this version: UMLS auto-completion, better linking and more.

W3C blog

The Semantic Web Health Care and Life Sciences Interest Group of W3C (HCLS) maintains a blog. The mission of the Semantic Web Health Care and Life Sciences Interest Group, part of the Semantic Web Activity, is to develop, advocate for, and support the use of Semantic Web technologies for biological science, translational medicine and health care. These domains stand to gain tremendous benefit by adoption of Semantic Web technologies, as they depend on the interoperability of information from many domains and processes for efficient decision support.

Quote from a recent post: 

"A development that I am excited about is the collaborative effort involving HCLS and CWA and the Swiss Institute of Bioinformatics (SIB) to create a SPARQL endpoint for Uniprot. Such an endpoint could make it possible to perform essential bioinformatics information retrieval without ever leaving the comfort of your SPARQL query interface."
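The kind of retrieval the quote describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the endpoint URL is a placeholder, and the query assumes the data are exposed with the UniProt core vocabulary (`up:Protein`, `up:organism`), which may differ from what an actual endpoint offers. The sketch builds the HTTP GET request defined by the SPARQL protocol (the query text travels in a `query` parameter) without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint address; replace it with a real SPARQL endpoint.
ENDPOINT = "https://sparql.example.org/sparql"

# Assumed vocabulary: the UniProt core namespace. Check the terms against
# the endpoint you actually use before relying on this query.
QUERY = """\
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT ?protein
WHERE {
  ?protein a up:Protein ;
           up:organism taxon:9606 .
}
LIMIT 10
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialisation of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would execute the query against a live endpoint; that step is left out here because the endpoint above is a placeholder.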

Publications

  • The Roots of Bioinformatics. An article by David Searls in PLoS Computational Biology. It considers the unique status of bioinformatics vis-à-vis science and technology and explores historical trends in biology and related fields that anticipated and prepared the way for bioinformatics. Finally, it examines the context of key moments when computers were first taken up by early adopters, which reveals how deep the roots of bioinformatics go.
  • Scientists are struggling to make sense of the expanding scientific literature. A News Feature in Nature (Vol 463, 28 January 2010) on how semantic computational tools can help researchers deal with the over-abundance of scientific literature.
  • Nano-Publication in the e-science era

Abstract. The rate of data production in the Life Sciences has now reached such proportions that to consider it irresponsible to fund data generation without proper concomitant funding and infrastructure for storing, analyzing and exchanging the information and knowledge contained in, and extracted from, those data, is not an exaggerated position any longer. The chasm between data production and data handling has become so wide, that many data go unnoticed or at least run the risk of relative obscurity, fail to reveal the information contained in the data set or remains inaccessible due to ambiguity, or financial or legal toll-barriers. As a result, inconsistency, ambiguity and redundancy of data and information on the Web are becoming impediments to the performance of comprehensive information extraction and analysis. This paper attempts a stepwise explanation of the use of richly annotated RDF statements as carriers of unambiguous, meta-analyzed information in the form of traceable nano-publications.
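To make the idea in the abstract more concrete, here is a minimal Python sketch of a single annotated statement. It is not the data model from the paper: the class names, the field names and every example.org identifier are hypothetical, chosen only to show one subject-predicate-object assertion carried together with the annotations that make it traceable.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Statement:
    """One subject-predicate-object statement, as in RDF."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class NanoPublication:
    """Hypothetical container: one assertion plus its annotations."""
    identifier: str
    assertion: Statement
    provenance: Dict[str, str] = field(default_factory=dict)

    def annotations(self) -> List[Statement]:
        """Express each annotation as a statement about this nano-publication."""
        return [
            Statement(self.identifier, key, value)
            for key, value in sorted(self.provenance.items())
        ]


# All identifiers below are made up for illustration.
nanopub = NanoPublication(
    identifier="http://example.org/nanopub/1",
    assertion=Statement(
        "http://example.org/gene/A",
        "http://example.org/associatedWith",
        "http://example.org/disease/B",
    ),
    provenance={
        "http://example.org/source": "http://example.org/paper/42",
        "http://example.org/curatedBy": "http://example.org/person/jane",
    },
)

for statement in [nanopub.assertion] + nanopub.annotations():
    print(statement.subject, statement.predicate, statement.obj)
```

Because the annotations are themselves statements about the nano-publication's identifier, the assertion and its provenance can be stored and queried with the same machinery.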

  • A New Way of Opening Up Scientific Literature. Slides presented by Jan Velterop at the "Berlin 7" conference in Paris, 2-4 December 2009.

Open Data

Many Linked Open Data resources are of interest to bioinformaticians. The Freie Universität in Berlin maintains a clickable overview of these resources: clicking on any resource takes you to its home page. The URL to the overview is: http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-05.html

The Open Government Data Wiki has a set of Open Data Principles. Although these principles are written specifically for government data, they are generic enough to apply equally to scientific data that is to earn the label 'open'. The principles are:

[Government] data shall be considered open if they are:

  1. Complete. All public data are made available. Public data are data that are not subject to valid privacy, security or privilege limitations.
  2. Primary. Data are collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.
  3. Timely. Data are made available as quickly as necessary to preserve the value of the data.
  4. Accessible. Data are available to the widest range of users for the widest range of purposes.
  5. Machine processable. Data are reasonably structured to allow automated processing.
  6. Non-discriminatory. Data are available to anyone, with no requirement of registration.
  7. Non-proprietary. Data are available in a format over which no entity has exclusive control.
  8. License-free. Data are not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed.

Compliance must be reviewable.