UW Structural Informatics Group: Overview

The University of Washington Structural Informatics Group (SIG) is an interdisciplinary team of computer scientists, engineers and biologists. It is part of the Department of Biological Structure and the Division of Biomedical and Health Informatics, Department of Medical Education and Biomedical Informatics, with strong ties to the Department of Computer Science and Engineering. SIG pursues bioinformatics research, with emphasis on developing methods for representing, managing, visualizing, and sharing information about the physical organization of the body, both for its own sake and as a basis for organizing non-structural information. The group is currently directed by Jim Brinkley.

Goals

To develop methods for representing both spatial and symbolic information about the physical organization of the body.
To develop methods for using these structural representations as a basis for organizing non-structural information, under the hypothesis that structure is a logical foundation for organizing and inter-relating most information in biomedicine.
To develop web-accessible computer programs which utilize these representations to solve practical problems in clinical medicine, research and education.
To initially apply these methods to the domains of biological structure and neuroscience, including macroscopic, microscopic, cellular and subcellular anatomy, the structure of biological macromolecules, and the relationship of these structures to functional properties of the brain.

History

The UW Structural Informatics Group is an outgrowth of earlier work in the UW Department of Biological Structure, dating back at least 20 years.

The group had its beginning in work related to computer graphics modeling of protein molecules by Lyle Jensen and the X-ray crystallography group in the Department of Biological Structure. Based on this initial work John Prothero and John Sundsten developed methods for reconstruction and 3-D display of biological objects from serial sections [Prothero1982, Sundsten1983, Prothero1985, Prothero1986]. The software for accomplishing this, as well as most of the graphics software still used by this group, was created by Jeff Prothero. The group was initially called the Biological Structure Computer Graphics group.

After seeing the 3-D reconstructions, Cornelius Rosse became interested in the use of these images for anatomy teaching, and was awarded a grant from the National Library of Medicine in 1988 to work in this direction. Around 1992 Cornelius became director of the group, changing its name to the Digital Anatomist Program.

Jim Brinkley was recruited from Stanford in 1988, after which he developed a classification of structural information, as well as a conceptual framework for organizing and delivering this information, that became the basis for our subsequent bioinformatics research and development [Brinkley 1989].

In 1990 Jim Brinkley coined the term Structural Informatics to capture the kind of work he had been doing throughout his career, ranging from 3-D ultrasound reconstruction to 3-D protein structure to the current work in gross anatomy. Cornelius Rosse and Jim Brinkley then applied this term to the previously described classification of structural information, concentrating on its application in anatomy [Rosse 1990]. The National Library of Medicine picked up this term in their long-range report that led to the Visible Human project, and Jim Brinkley expanded on the term in 1991 [Brinkley1991].

The name of the group was changed to the UW Structural Informatics Group in 1997 to reflect its increasingly technical nature (many of the members are now from computer science or engineering), and to provide room for expansion beyond anatomy education. The award of a Human Brain Project grant to this group in 1994 was further motivation for the change.

Rationale for a group in Structural Informatics

A large amount of information in medicine relates to physical structures in the body, encompassing the spatial arrangement of macromolecular complexes, cells, tissues, organs and body parts. Therefore, if we can find ways to represent these structures and their relationships, at levels ranging from gross anatomy to molecules, the resulting structural information framework can serve as a rational basis for organizing medical information. We define a field called "structural informatics" because of our hypothesis that the same methods and representations are applicable at all levels of organization, and can therefore be applied at multiple levels once the common problems are recognized.

Classification of structural information

To approach the development of representations for structural information, we classify it along two dimensions: spatial versus symbolic, and data versus knowledge.

We define spatial information as information that is described in a coordinate system, in one or more dimensions, as for example:

One dimension: A molecular sequence
Two dimensions: A 2-D image
Three dimensions: A 3-D volume image, or a 3-D anatomic reconstruction
Four dimensions: A time varying 3-D volume image

In fact, higher dimensions are possible with multi-spectral images, in which each spatial position (voxel or pixel) has multiple values representing different measured aspects of the underlying structures.
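As a rough illustration of the dimensions listed above, the following Python sketch describes a spatial dataset by its number of spatial axes, whether it varies over time, and how many values are stored at each position. The class name SpatialDataset and its fields are hypothetical, chosen only for this example; they do not describe any actual SIG software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatialDataset:
    """Hypothetical descriptor for a spatial dataset (illustrative names only)."""
    name: str
    spatial_axes: int        # number of spatial coordinate axes (1, 2 or 3)
    has_time: bool = False   # True if the dataset varies over time
    channels: int = 1        # values stored per position (>1 for multi-spectral)

    def dimensions(self) -> int:
        """Coordinate dimensions: spatial axes, plus one for time if present."""
        return self.spatial_axes + (1 if self.has_time else 0)

    def is_multispectral(self) -> bool:
        """True if each pixel or voxel carries more than one measured value."""
        return self.channels > 1


# The examples from the text, plus one assumed multi-spectral volume.
examples = [
    SpatialDataset("molecular sequence", spatial_axes=1),
    SpatialDataset("2-D image", spatial_axes=2),
    SpatialDataset("3-D volume image", spatial_axes=3),
    SpatialDataset("time-varying 3-D volume image", spatial_axes=3, has_time=True),
    SpatialDataset("multi-spectral 3-D volume image", spatial_axes=3, channels=4),
]

for ds in examples:
    print(ds.name, ds.dimensions(), ds.is_multispectral())
```

Keeping the channel count separate from the coordinate dimensions is one possible design choice; a multi-spectral image could equally be treated as having one additional dimension.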

Symbolic information is all the other kinds of structural information: anatomic terminology, definitions, glossaries, semantic relationships. The symbolic information gives meaning to the corresponding spatial information. Symbolic information is conventionally expressed in natural language and can be formalized for computational processing by various knowledge representation paradigms using methods of artificial intelligence.

In an anatomy, histology or molecular biology textbook spatial information is generally conveyed by figures, whereas symbolic information is conveyed by text. However, this distinction is not precise because symbolic information (for example, "anterior to") can describe spatial information as well. Other ways to distinguish these categories are quantitative versus qualitative, or image-based versus text-based. We continue to debate among ourselves as to the best way to define these categories.

We also classify structural information along the dimension of data versus knowledge. The definition of knowledge is not precise, but for our purposes we define structural knowledge as information about classes of objects, as opposed to structural data, which is information about single objects. For example, data might be an individual reconstruction of a single kidney, whereas knowledge might be a model which describes the range of variation of all normal kidneys. Or, data might be the medical record for a patient with diabetes, while knowledge might be the characteristics of all patients with diabetes.

Conceptual framework

Based on the above classifications we have designed a conceptual framework for organizing structural information, and for making that information accessible to problem-solving programs. Structural information is represented in four kinds of information resources that are made available over the Internet by means of one or more Structural Information servers. These servers are accessed by both authoring client programs, for entering new information into the resources, and by end user programs for utilizing the information. As in many other applications, the client-server approach permits us to choose the hardware and software platforms most suited to the task, and to distribute the resources over a wide area. For our purposes the resources are concerned primarily with gross anatomy, although we believe that these methods will apply at other levels as well.

The Spatial Database consists of spatial information about individual structural objects; for example, 2-D images, 3-D volume datasets, and 3-D surface reconstructions. Representations and methods for spatial data are studied in imaging and computer graphics.

The Symbolic Database consists of symbolic information about individual structural objects, often used to identify the corresponding spatial information. For an imaging study whose images are stored as files in the Spatial Database, examples include the name of the patient, the image resolution and date of acquisition, the person who segmented the images, and the names of the files containing the spatial data.

The combination of spatial and symbolic databases is studied in multimedia databases.

The Spatial Knowledge Base consists of spatial models describing the shape and range of variation of structural objects, such as all normal kidneys, or models describing the relationships among different objects. Such models are studied in model-based image understanding.

The Symbolic Knowledge Base consists of non-spatial information about classes of structural objects. This kind of information is often studied in artificial intelligence, and forms the basis for expert systems, belief networks, decision models, etc.
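The four resources correspond to the four combinations of the two classification dimensions described earlier. A minimal Python sketch of that correspondence follows; the names Form, Scope, RESOURCES and resource_for are assumptions made for illustration and are not part of any SIG system.

```python
from enum import Enum


class Form(Enum):
    """First classification dimension: how the information is expressed."""
    SPATIAL = "spatial"
    SYMBOLIC = "symbolic"


class Scope(Enum):
    """Second classification dimension: what the information is about."""
    DATA = "data"            # information about single objects
    KNOWLEDGE = "knowledge"  # information about classes of objects


# Each (form, scope) pair maps to one of the four information resources.
RESOURCES = {
    (Form.SPATIAL, Scope.DATA): "Spatial Database",
    (Form.SYMBOLIC, Scope.DATA): "Symbolic Database",
    (Form.SPATIAL, Scope.KNOWLEDGE): "Spatial Knowledge Base",
    (Form.SYMBOLIC, Scope.KNOWLEDGE): "Symbolic Knowledge Base",
}


def resource_for(form: Form, scope: Scope) -> str:
    """Return the name of the resource that holds this kind of information."""
    return RESOURCES[(form, scope)]


# A 3-D reconstruction of a single kidney is spatial data.
print(resource_for(Form.SPATIAL, Scope.DATA))
# A model of the range of variation of all normal kidneys is spatial knowledge.
print(resource_for(Form.SPATIAL, Scope.KNOWLEDGE))
```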

Research Strategy

Our research strategy follows that taken in some of the early expert systems work at Stanford. Difficult problems are chosen whose solutions can not only be of practical use, but can also drive the development of new methods and representations for structural information. Because the problems are difficult to solve we take an incremental approach: partial solutions are implemented as specific instances of our conceptual framework, which are then used as the basis for improved iterations.

In most cases the network becomes both the integration medium and the means to deliver the information sources to the greatest number of users.

The practical problems we choose are determined opportunistically, as long as they have something to do with structural information. Our current problem areas are described in the Projects pages; the research problems to which they give rise are described in the Research pages; and prototype software and content products to which these projects give rise are described in our Products pages.