September 2007 Archives

There are data-related resources all over the Twin Cities campus. This guide highlights major sources of assistance with software, hardware, analysis and locating data.

Social Sciences Data - Resources across the University of Minnesota

Biomedical Informatics and Computational Biology

"The University of Minnesota in collaboration with other leading biotechnology and health science institutions in southeast Minnesota will establish a center for biomedical informatics and quantitative and computational studies in the life sciences. This center will be located in Rochester. The research focus of this center will be quantitative biomedical research. The academic focus of the center will initially be on a graduate program in computational biology and biomedical informatics in life and health sciences. The University of Minnesota Rochester (UMR) will provide administrative and infrastructure support."

Brain Sciences Center/The Mind Institute

“The MIND Institute (Mental Illness and Neuroscience Discovery) is a consortium of universities, schools of medicine, brain research institutions and laboratories in Minnesota, Massachusetts and New Mexico. The institute supports clinical and basic neuroscience research, as well as research and technology development in the areas of instrumentation, data analysis and computational modeling for functional brain imaging. Through this unique collaboration of scientific expertise and advanced neuroimaging technology, such as MEG, fMRI and MRS; the MIND Institute’s partners have combined forces to advance research and understanding of the brain to make it into one of the most powerful scientific consortiums in the world.”

Sequence of genome of Fusarium graminearum

From the University of Minnesota News Service

"Scientists led by a team from the University of Minnesota have sequenced the genome of the fungal pathogen that causes the deadly grain disease Fusarium Head Blight (FHB). Their findings [were] published in Science (Cuomo et al, "The Fusarium graminearum genome reveals a link between localized polymorphism and pathogen specialization," vol. 317, 7 September 2007, 1400-1402).

The disease can have devastating effects on wheat and barley crops, because it creates toxins that can sicken humans and livestock who consume infected grain. It is one of the most significant plant pathogens worldwide...Fungicides are costly and not always effective.

The FHB pathogen has about 13,000 genes. However, "(because of the sequencing breakthrough), now we can look at the genes in detail, in terms of their ability to allow the pathogen to cause disease and produce toxin," said H. Corby Kistler, an adjunct professor of plant pathology at the U of M and a research geneticist for the U.S. Department of Agriculture's Agricultural Research Service.

According to Kistler, sequencing the genome will help researchers determine how the disease operates at a genetic level and eventually, how to prevent it. The research could also help fight other plant diseases."

The research was a multinational effort, with the participation of university researchers in Germany, Spain, the Netherlands, Austria, the UK, Canada, France, and Ukraine, along with U.S.-based groups (Broad Institute of MIT and Harvard, Purdue, Michigan State, Cornell, Pacific Northwest National Laboratory, St. Louis University, the Universities of Arizona and Tennessee, and the USDA). Further analyses of gene products will also prove to be data-intensive.

Podcast on cyberinfrastructure and e-research/e-science

"The Institutional Challenges of Cyberinfrastructure and E-Research" is the title of the closing keynote address delivered by Clifford Lynch on August 8 at the 2007 Seminars on Academic Computing in Snowmass Village, CO. It is podcast here:

The abstract published with the podcast:
It has become clear that scholarly practice and scholarly communication across a wide range of disciplines are being transfigured by a series of developments in IT and networked information. While this has been widely discussed at the national and international levels in the context of large-scale advanced scientific projects, the challenges at the level of individual universities and colleges may prove more complex and more difficult. This presentation will focus on these challenges, as well as the development of truly institution-wide strategies that can support and advance the promises of e-research.

More e-science at the University of Minnesota

Cedar Creek Natural History Area is an NSF-sponsored Long-Term Ecological Research (LTER) site. Its research has four major themes:

*Theme 1:* What are the impacts of major perturbations -- especially climatic variation, N deposition, land use history, changes in fire frequency, elevated CO2, exotic species, and changes in trophic structure -- on species composition, diversity and ecosystem functioning?
*Theme 2:* What processes, interactions and positive and negative feedbacks control species abundances, community assembly, and community composition, diversity and dynamics in Cedar Creek grasslands and savanna?
*Theme 3:* How do composition and biodiversity directly and indirectly impact ecosystem functioning?
*Theme 4:* What general principles allow integration across scales ranging from ecophysiological and population processes to ecosystem functioning; from single trophic levels to whole foodwebs; from single plots to landscapes; and from snapshots in time to long time series?

"We are pursuing these four themes in five inter-related types of long-term studies that form the heart of the Cedar Creek LTER. Each is guided by our research philosophy and each addresses several themes. LTER funding supports this core long-term work and the research infrastructure of Cedar Creek (computer network, analytical chemistry laboratory, herbarium and insect collections, data management and software development, and shared research equipment)."

The Long Term Ecological Research (LTER) Network is a collaborative effort involving more than 1800 scientists and students investigating ecological processes over long time periods and broad geographical scales. The Network promotes synthesis and comparative research across sites and ecosystems and among other related national and international research programs. The 26 sites that constitute the Network represent diverse ecosystems and research emphases.

UMN has 11 principal investigators working on the LTER at Cedar Creek, mostly faculty from Ecology, Evolution, and Behavior, with one from Plant Biology and two from Forest Resources. There are a couple dozen additional UMN faculty involved (see

HarvestChoice, funded by the Gates Foundation (Phil Pardey, et al.)

"Over the coming three years, HarvestChoice and its growing number of partners will deliver a series of databases, tools, analyses, findings, and syntheses designed to improve strategic investment and policy decisions. The overriding objective is to accelerate and enhance the performance of those crops and cropping systems most likely to bring significant benefits to the world's poor and undernourished."

There are several collaborative NSF plant/crop genome mapping research projects involving the University of Minnesota.

Nevin Young (Plant Pathology/Plant Biology) has an NSF grant to sequence the model legume Medicago truncatula.

Project goals include:
* Genome Sequencing: Complete, high quality genome sequence for all eight chromosomes of Medicago.
* Informatics: An integrated database of clone, map, assembly, and sequence information combined with coordinated, automated annotation of the Medicago genome sequence.
* Project Organization: This NSF project is coordinated with partners in the EU who receive funding from the 6th Framework Programme, BBSRC in the UK and ANR in France. EU partners come from the UK, France, Germany, Netherlands, and Belgium.

Ronald Phillips (CFANS) has an NSF grant to work on mapping the corn genome.

Goals include:
* To produce an efficient means to map genes of corn, our major cereal crop.
* To implement a system analogous to one used in the mapping of human genes, where special Chinese hamster cell lines containing portions of the human genome are used. We will use oat lines containing portions of the corn genome.
* To complete the series of the 10 different oat-corn chromosome addition lines and, for each chromosome, produce a series of approximately 100 radiation hybrid lines.
* Arrays of DNA samples ultimately will be available to researchers for mapping any gene or DNA fragment to a small region of a given chromosome.

Electronic laboratory notebooks

Declan Butler discusses the development, applications, and advantages of e-notebooks in his article, "A New Leaf" (Nature, vol. 436, 7 July 2005, 20-21). An editorial in Nature, dated approximately two years later, further emphasizes the benefits of e-notebooks, but notes that "academic acceptance of e-notebooks will not improve unless universities promote their use and recognize that e-notebooks can help them fulfill their responsibilities as the owners of most grant-funded data...Institutions therefore need to show leadership in this area, and funding agencies should provide additional infrastructure support earmarked for the development and upkeep of electronic notebook systems" (vol. 447, 3 May 2007, 1-2).

Pharmaceutical companies routinely use e-notebooks to document their proprietary research; the U.S. Food and Drug Administration accepts e-notebook records during a drug's evaluation and approval processes.

The University of Minnesota provides some guidelines about the 'use of electronic records' in "Guidelines for Maintaining Laboratory Notebooks"; however, the document emphasizes record-keeping in a paper format.

Universities, as well as businesses, continue to develop e-notebook software. A selected list follows, with brief descriptions.

ELN (Electronic Laboratory Notebook) from the EMSL Collaboratory, Pacific Northwest National Laboratory. The software, available for download at the EMSL site, is "a shared web-based version of the traditional paper laboratory notebook. Each notebook page shows data and time-stamped entries that include static information such as text and images, as well as dynamic information ranging from animated GIF images and video clips to rotatable 3-D protein structures and X-Y graphs of spectra that support zoom and other capabilities."

The NeuroSys Project from Montana State University "provides a set of easy to use tools for data sharing by the scientific community. Neurosys enables users to construct and store a coherent description of their data, according to the hierarchical organizational scheme that makes the most sense for their specific set of applications; allows users to design and construct their own custom GUI screens for data entry, data query and retrieval, combined with automated links to external analytical software tools; automatically creates a controlled vocabulary, and supports the extension and/or migration of that vocabulary to whatever standard might be chosen at a later date..."

The Smart Tea Project is based at the University of Southampton and is part of the Combechem eScience Research Initiative. Its "first phase will support the complete life cycle of chemists' interactions with the lab; the next phase will integrate the lab aether with the larger network of chemistry on the grid for shared information services." A related project is the "myTea Project", which focuses on "improving capture of experiments for bioinformatics practitioners."

CERF (Collaborative Electronic Research Framework) - The Electronic Lab Notebook for Biology and Multidisciplinary Life Sciences, developed by Rescentris, Ltd. It is an "enterprise scientific information system designed specifically for managing and sharing information in life sciences research organizations. CERF combines a full-featured electronic lab notebook with scientific content management, and extensible knowledge and data integration framework, and a science-driven informatics platform....The CERF server provides a central management of data storage, system functions, projects, organization of experiments, content, annotations, and rich metadata."

This National Institute of Corrections manual provides guidance on how information affects policy decision making. Topics include good management; data collection; how to locate and capture information; analyzing, interpreting, and sharing information; and getting the most from your information system.

Biometeorology and Micrometeorology
Heat and mass transfer between the biosphere and atmosphere can have important consequences for the climate system. We use biometeorological techniques to better understand the processes and feedback mechanisms that control heat and mass transfer near the Earth's surface from ecosystem to regional scales. Measurement technologies have been developed that allow rapid and continuous measurement of atmospheric properties, such as turbulence and trace gases, providing an opportunity to answer important questions related to the cycles of energy, water, carbon, and many other scalars.

There is access to real-time data, plus archived data (in cooperation with USDA-Agricultural Research Service).

Monitoring Minnesota's Landscapes

This website provides a series of maps and statistics about land cover, impervious surface area and landscape change, derived from satellite imagery, in Minnesota from 1986 to the present. Minnesota is one of the first states to have multiple dates of land cover and impervious surface, and change data, mapped statewide using satellite imagery. Other surveys have been performed by various means on smaller scales, but none have had the large area coverage as well as the historical depth of information. Quantifying the amount of impervious surface area, an important indicator of environmental quality, is particularly valuable because of its effects on stormwater runoff and lake and stream quality.

Again, the computational intensity isn't clear, but the project uses a variety of data sources, works with the Minnesota Pollution Control Agency, and displays data via UMN MapServer.

Interagency Information Cooperative

The Interagency Information Cooperative (IIC) was created from the Sustainable Forest Resources Act of 1995 (M.S. Chapter 89A.09). The overall mission of the IIC is to enhance the access and use of forest resources data in Minnesota. The following public organizations have representatives on the IIC: Minnesota Association of County Land Commissioners, United States Forest Service, Land Management Information Center, University of Minnesota, and Department of Natural Resources. The IIC Memorandum of Understanding was created in 1997, and goes into greater depth on the purposes, membership, and duties of the IIC.

The focus here is more on harmonizing data through metadata than on computationally intensive work. However, the project is boundary-spanning in that it attempts to harmonize biological, geospatial, social science and other kinds of data.


MapServer is an Open Source development environment for building spatially-enabled internet applications. MapServer is not a full-featured GIS system, nor does it aspire to be. Instead, MapServer excels at rendering spatial data (maps, images, and vector data) for the web.

Beyond browsing GIS data, MapServer allows you to create "geographic image maps", that is, maps that can direct users to content. For example, the Minnesota DNR Recreation Compass provides users with more than 10,000 web pages, reports and maps via a single application. The same application serves as a "map engine" for other portions of the site, providing spatial context where needed.

MapServer was originally developed by the University of Minnesota (UMN) ForNet project in cooperation with NASA and the Minnesota Department of Natural Resources (MNDNR). Presently, the MapServer project is hosted by the TerraSIP project, a NASA-sponsored project between the UMN and a consortium of land management interests.

The software is maintained by a growing number of developers (nearing 20) from around the world and is supported by a diverse group of organizations that fund enhancements and maintenance.


The Environmental Protection Agency (EPA), through its Science To Achieve Results (STAR) competitive grants research program, has established five regional Estuarine & Great Lakes (EaGLe) research centers at major academic research institutions with strong expertise in coastal environmental science. Additionally, NASA is supporting associated remote sensing research at three of these institutions. The researchers at these five regional centers are developing the next generation of environmental indicators to assess the biological health of the Great Lakes coast and estuaries and wetlands along the Atlantic, Pacific and Gulf coasts. Indicators evaluated and developed by the EaGLe centers will be used by the states in their long-term monitoring programs to establish the integrity and sustainability of the nation's coastal ecosystems.

Great Lakes Environmental Indicators (GLEI) Project, a subset of EaGLe, is led by the Natural Resources Research Institute at the University of Minnesota Duluth (UMD). Other cooperators include the following: the University of Minnesota Twin Cities; Minnesota Sea Grant; the University of Wisconsin Green Bay; the University of Wisconsin Madison; Cornell University, New York; John Carroll University, Ohio; the University of Michigan; the University of Windsor, Ontario; and the US EPA Mid-Continent Ecology Division, Duluth, Minnesota, and Grosse Ile, Michigan. STAR grant R828675.

GLEI is developing and testing a suite of indicators across the range of habitats that make up the Great Lakes coastal margins. The following indicator types will be tested for their efficacy and technical soundness within three subcategories: 1) the basin as a whole: climate measures, land uses, and landscape characteristics; 2) estuaries, bays and coastal margin waters: water quality, contaminant levels, and the relative abundances of amphibian, bird, diatom, fish, macroinvertebrate and plant species and communities, and 3) the land margins: measures of bird community structure. Each of these indicator types has linkages with habitat condition measures and other stressors.

Data is not currently available and is stored on EPA servers. For details, see


"The National Cancer Institute (NCI) has launched the caBIG™ (cancer Biomedical Informatics Grid™) initiative to speed research discoveries and improve patient outcomes by linking researchers, physicians, and patients throughout the cancer community. caBIG™ is a voluntary network of infrastructure, tools, and ideas that enables the collection, analysis, and sharing of data and knowledge along the entire research pathway from laboratory bench to patient bedside."

"Today, there are more than 800 individuals -- from over 80 organizations -- working on caBIG™ projects."

Excerpted from:

About this Archive

This page is an archive of entries from September 2007 listed from newest to oldest.
