Specific Aims

In the 1980s and 1990s a grand vision was articulated for a “Matrix of Biological Knowledge” that would link together vast amounts of biomedical information in a worldwide information system [2]. Such a matrix would permit seamless access both to published biological knowledge and to the raw data that give rise to that knowledge, in such a manner that diverse forms of widely distributed information sources could be synthesized into new insights about biological structure and function. Although this vision may have been somewhat premature, the compelling need it addresses is even greater today than it was in the past. The vast amounts and diverse types of raw data collected in areas such as neuroscience, cancer biology and developmental biology have gone far beyond the ability of any individual to comprehend even the information in his or her own field. Yet many of the advances in science have come about through connections made between seemingly different areas, most often through serendipitous means. For this reason the vision continues to be articulated in such initiatives as the Semantic Web [3, 4], the Digital Human [5], the Human Brain Project (HBP) [6], and the recent NIH draft policy on data sharing [7].


The goal of this planning proposal is to create a Center for Excellence in Biomedical Computing that will explore the role of structure as a unifying fundamental framework to bring us closer to this grand vision. Such a framework should ultimately span the range from bioinformatics (information related to basic biological science) to clinical informatics, although we will mainly concentrate on bioinformatics during the planning process.


In planning for the Center we will bring together computer scientists, informaticists and biomedical investigators to address what we see as the fundamental problems inherent in widespread information sharing: representing and relating diverse forms of data and knowledge, and sharing data and knowledge stored in widely distributed systems. In addressing these problems we make the following assumptions: 1) no single consolidated database will be feasible for all biological data, for both technical and sociological reasons (most researchers are not willing to give up control of their data); 2) the most reasonable means for relating diverse forms of data is through the physical organization of biological systems (e.g., structure ranging from the cellular to the organ to the organismal level); and 3) a promising strategy for achieving information sharing is to enable the dynamic interconnection of information sources a few at a time in various combinations, rather than to create a monolithic combined information resource.


At the University of Washington there is demonstrated expertise in each of the areas implied by these problems and assumptions: a) the School of Medicine (SOM) is a nationally recognized leader in basic biomedical research; b) the structural informatics group (SIG) has developed methods for organizing information around a structure-based framework; c) the Biomedical and Health Informatics program (BHI), with its long-standing history of grant support for broad-based informatics research and its graduate degree program (augmented by an institutional pre- and postdoctoral research training grant funded by the National Library of Medicine), is a source of talented interdisciplinary students, who can contribute to, and gain experience from, the proposed Center’s research goals; d) the database group in Computer Science is well known for its work in data integration and peer data management systems, and is actively collaborating with BHI in the bioinformatics arena; and e) the UW Human Brain Project (UW-HBP) has demonstrated the feasibility of an incremental approach to information sharing. The purpose of the Center will be to fully leverage these resources in order to achieve the goal of widespread information sharing. We will plan for this expansion in terms of the following specific aims:

  1. Plan for the organization of the full Center
  2. Understand the information management needs of biological researchers in targeted areas of cancer biology, developmental biology and neuroscience, as a basis for building shared information systems of direct use in their research
  3. Develop generalizable anatomical structure-based methods for representing and managing biological information
  4. Develop methods and tools for ontology alignment, whereby ontologies are linked by a set of explicit mapping relations that provide coherent semantics across multiple ontologies and their data
  5. Develop a peer data management system (Bio-PDMS) and toolkit for sharing heterogeneous biological data from diverse existing data sources, leveraging new and existing ontologies
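
As an illustration of what aim 4 means by linking ontologies through explicit mapping relations, the following is a minimal sketch only, not a design for the proposed tools. The `Mapping` class, the relation names ("equivalent_to", "part_of"), the `translate` helper, and all ontology terms are hypothetical and were chosen purely for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One explicit mapping relation between terms of two ontologies."""
    source: str    # term written as "ontology:term" (hypothetical naming)
    relation: str  # assumed relation name, e.g. "equivalent_to" or "part_of"
    target: str    # term in the other ontology


# Illustrative terms only; not drawn from any real ontology.
MAPPINGS = [
    Mapping("anatomy:Hippocampus", "equivalent_to", "brainmap:hippocampus"),
    Mapping("anatomy:Dentate_gyrus", "part_of", "brainmap:hippocampus"),
]


def translate(term, mappings, relation="equivalent_to"):
    """Return terms in other ontologies linked to `term` by `relation`."""
    linked = []
    for m in mappings:
        if m.relation != relation:
            continue
        if m.source == term:
            linked.append(m.target)
        elif m.target == term and relation == "equivalent_to":
            # Equivalence is symmetric, so it can be followed in both directions.
            linked.append(m.source)
    return linked


if __name__ == "__main__":
    print(translate("anatomy:Hippocampus", MAPPINGS))
    print(translate("anatomy:Dentate_gyrus", MAPPINGS, relation="part_of"))
```

In a peer-to-peer setting such as the one envisioned in aim 5, a query phrased in one source's vocabulary could in principle be rewritten term by term with a lookup of this kind before being forwarded to another source; how the actual system would do this is left open here.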

The name of the proposed Center will be the Interdisciplinary Center for Structural Informatics. This name is based on the interdisciplinary nature of our work, and on the observation that the study of structures ranging in size from molecules to organisms is the foundation for understanding in biology. Therefore, we hypothesize that:


The creation of an Interdisciplinary Center for Structural Informatics will greatly facilitate the exploration of these hypotheses, the validation of which will be a major step towards the grand vision of widespread information sharing in biology. Moreover, as an organization that understands and promotes interdisciplinary work, the Center will provide an ideal environment for the training of investigators in biology, bioinformatics, computer science, and clinical informatics (in synergy with the existing pre- and postdoctoral training programs in the UW BHI program).

References

1. Noy NF, Mejino JL, Musen MA, Rosse C. Pushing the envelope: challenges in a frame-based representation of human anatomy. Submitted 2002.

2. Holden C. An omnifarious databank for biology? Science 1985;228:1412-1413.

3. Berners-Lee T, Hendler J, Lassila O. The semantic web. Scientific American 2001;284(5):34-43.

4. World Wide Web Consortium. The Semantic Web. http://www.w3.org/2001/sw/; 2002.

5. Federation of American Scientists. The Digital Human. http://www.fas.org/dh/; 2002.

6. Koslow SH, Huerta MF, editors. Neuroinformatics: an overview of the Human Brain Project. Mahwah, New Jersey: Lawrence Erlbaum; 1997.

7. National Institutes of Health. Draft statement on sharing research data. http://grants1.nih.gov/grants/policy/data_sharing/index.htm; 2002.