Concept Web Alliance
The Concept Web Alliance (CWA) is an open collaborative community that is actively addressing the challenges associated with the production of unprecedented volumes of academic and professional data. This international effort seeks to organize the massive amounts of information flooding the biological sciences and other scientific disciplines. Challenges include storage, interoperability and analysis of such massive and disparate data sets.
CWA’s agreed approach is a ‘Semantic Web’ strategy, in which disparate data on the internet are structurally connected to each other. As the amount of scholarly communication increases, it is increasingly difficult for specific core scientific statements to be found, connected and curated. Additionally, the redundancy of these statements in multiple fora makes it difficult to determine attribution, quality, and provenance. To tackle these challenges, the Concept Web Alliance has promoted the notion of nanopublications (core scientific statements with associated context) in a manner allowing for meaningful Web-wide interconnectivity.
For this interconnectivity, and the necessary interoperability, to occur, the unambiguous and established meaning (the ‘concept’) of terms used in texts and data collections is first uniquely identified. A human validation process then completes the identification where the technology cannot (yet) resolve ambiguities. Next, the ‘assertions’ that can be made with each concept are captured in a simple, computer readable format, including the ability to weigh the validity of each individual assertion. Highly scalable and flexible, the technology will be deployed across academic disciplines, corporate applications and community-building efforts.
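The identification step described above can be illustrated with a minimal Python sketch. It is not CWA's actual technology: the lexicon, the concept identifiers (`concept:0001` and so on), and the names `Resolution` and `resolve_terms` are all hypothetical. It only shows the idea that a term with exactly one candidate concept is resolved automatically, while anything ambiguous or unknown is set aside for human validation.

```python
from dataclasses import dataclass, field

# Hypothetical lexicon: each term maps to the concept identifiers it may denote.
LEXICON = {
    "malaria": ["concept:0001"],
    "cold": ["concept:0002", "concept:0003"],  # an illness or a low temperature
}


@dataclass
class Resolution:
    resolved: dict = field(default_factory=dict)      # term -> concept identifier
    needs_review: dict = field(default_factory=dict)  # term -> candidate identifiers


def resolve_terms(terms, lexicon=LEXICON):
    """Resolve unambiguous terms; queue the rest for human validation."""
    result = Resolution()
    for term in terms:
        candidates = lexicon.get(term.lower(), [])
        if len(candidates) == 1:
            result.resolved[term] = candidates[0]
        else:
            # Zero or several candidates: a person has to decide.
            result.needs_review[term] = candidates
    return result


outcome = resolve_terms(["Malaria", "cold", "xyzzy"])
```

Here `outcome.resolved` holds the single unambiguous term, and `outcome.needs_review` holds the ambiguous term together with the unknown one.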
NBIC and CWA
The ever-increasing availability of scientific information and data has been the driver of the extraordinary advances in the biosciences in the last decade. Yet, scientific progress is constrained by the difficulty of efficiently and effectively using these large volumes of data and information, and extracting consistent knowledge from them across disparate data sources (often referred to as interoperability). As the national expertise centre for Bioinformatics in the Netherlands and a proponent of continued scientific advancement in biosciences, NBIC is a strong supporter of initiatives that solve these problems. The Concept Web Alliance (CWA) is one of the major collaborative international partnership efforts of NBIC.
User-friendly forum for biosciences and beyond
The current state of text mining in the biological sciences, the lack of standardization, and the lack of uniform access to numerous and disparate data sources all lead to redundancy and ambiguity in content. Extraction of knowledge from such ambiguous and redundant content is highly inefficient, time-consuming and often frustrating. A major initial goal of the Concept Web Alliance is to organize knowledge into a unified, user-friendly, online platform where it is stored as unique concepts and linked together into uniquely identified assertions. Often referred to in Semantic Web jargon as ‘triples’, these assertions are obtained from published texts, databases, and offline resources. They are then ‘enriched’ with attributes that preserve the multiple origins and provenance of these ‘triples’. Ultimately their ‘validity level’ becomes apparent, creating de facto ‘nanopublications’. When reasoning with the data, ‘values’ or ‘weights’ (such as provenance and credibility) can be taken into account, and the assertions can be cited for reference. Scientists can further enrich these ‘triples’ by adding their own knowledge, and will have an incentive to do so to enhance their expert user profile.
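As a rough illustration of how a triple might be enriched with provenance until a validity level emerges, consider the following Python sketch. Everything in it is an assumption made for illustration: the class names `Triple` and `Nanopublication`, the credibility values, and in particular the scoring rule, which treats sources as independent and is not a method described by CWA.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


@dataclass
class Nanopublication:
    assertion: Triple
    # Provenance: (source name, credibility between 0 and 1) pairs.
    sources: list = field(default_factory=list)

    def add_source(self, name, credibility):
        self.sources.append((name, credibility))

    def validity(self):
        # Illustrative rule only: chance that at least one source is right,
        # assuming the sources are independent of each other.
        doubt = 1.0
        for _, credibility in self.sources:
            doubt *= 1.0 - credibility
        return 1.0 - doubt


def collect(observations):
    """Merge repeated statements of the same triple into one record."""
    store = {}
    for triple, source, credibility in observations:
        if triple not in store:
            store[triple] = Nanopublication(triple)
        store[triple].add_source(source, credibility)
    return store


claim = Triple("gene_x", "is_associated_with", "disease_y")  # made-up example
store = collect([
    (claim, "journal article", 0.5),
    (claim, "curated database", 0.5),
])
```

The same statement found in two places yields a single record in `store` with two provenance entries, so redundancy adds to the evidence for an assertion instead of duplicating it.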
Interoperability of multiple data sets requires creation of a common identifier that links concepts and meaning. Highly scalable, this common identifier provides a method for assembling and sharing information and data, enabling reuse by various people across different disciplines. The CWA identifier is configured to allow incorporation of multiple, mapped identifiers for the same concept (treated as synonyms). Organisations can continue to use their internal identifiers, thus providing for the bottom-up standard setting needed for long-term sustainability and growth.
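The mapping of internal identifiers onto a common identifier can be sketched as a small registry. This is a minimal Python sketch under stated assumptions: the class `ConceptRegistry`, the `cwa:42` identifier format, and the namespaces `lab_a` and `lab_b` are hypothetical and do not reflect any published CWA scheme.

```python
class ConceptRegistry:
    """Maps organisation-specific identifiers onto one common concept identifier."""

    def __init__(self):
        self._by_external = {}  # (namespace, local id) -> common id
        self._synonyms = {}     # common id -> set of (namespace, local id)

    def register(self, common_id, namespace, local_id):
        key = (namespace, local_id)
        existing = self._by_external.get(key)
        if existing is not None and existing != common_id:
            # One internal identifier must not point at two different concepts.
            raise ValueError(f"{key} is already mapped to {existing}")
        self._by_external[key] = common_id
        self._synonyms.setdefault(common_id, set()).add(key)

    def to_common(self, namespace, local_id):
        """Return the common identifier, or None if the mapping is unknown."""
        return self._by_external.get((namespace, local_id))

    def synonyms(self, common_id):
        """Return every internal identifier mapped to this concept, sorted."""
        return sorted(self._synonyms.get(common_id, set()))


registry = ConceptRegistry()
registry.register("cwa:42", "lab_a", "X1")
registry.register("cwa:42", "lab_b", "77")
```

Each organisation keeps looking things up by its own identifier, and the registry treats the two internal identifiers as synonyms for the same concept.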
It takes more than a single organization to tackle these challenges and address all the issues. CWA proposes to treat interoperability not just as a technological challenge but as a strategy that:
- mobilizes the international community towards a common goal;
- supports the development and application of relevant technology;
- scales according to demand;
- advocates for adoption in areas and disciplines where the data integration problems are especially severe; and
- establishes a lead organization to guide and motivate the user groups and to develop the resources to ensure sustainability.
Current organisational status
After over a year of preparatory work, on May 8, 2009, the Alliance formalized its intent to actively collaborate and seek solutions to the problems of data explosion and interoperability. Numerous representatives from academia and the private sector participated in the founding meeting in New York City, creating a robust international alliance. The Alliance operates under the ad-hoc leadership of a Founding Board and an Executive Director on behalf of the participants. In 2010 the exact operational and governance structure is to be established by the participants.
Currently, the Netherlands Bioinformatics Center (NBIC) functions as the European regional office. Concept Web 4.0 is NBIC’s USA counterpart, a 501(c)(3) tax-exempt supporting organization (IRS provisional status), hosted at Stanford University.
Active participation, leadership and advocacy
The CWA’s unique structure is highly evolvable and designed for flexibility. A small common secretariat manages the workflow as defined by participants and client organizations, and ensures equitable distribution of resources. The Alliance, led by a globally respected governing board, is quickly becoming a trusted party for evaluating and certifying data models for semantic triples and for setting interoperability standards and best practices. It also advocates for the development of scalable methods of knowledge discovery.
A CWA declaration has been issued and has, so far, been signed by an impressive international group of representatives from both the public and private sectors. CWA actively collaborates with organizations complementing its strategy, such as the W3C (World Wide Web Consortium) Semantic Web initiatives, and has formed a strong global consortium of like-minded partners, effectively reaching over 200 leading institutions.
More information and details on CWA’s mandate, organizational structure and signatories can be found via the links on the left of this page.
International response and ideas about the CWA initiative have appeared in several web publications: