Anecdotes about Data
http://blog.lib.umn.edu/westx045/data/
Copyright 2008
Tue, 08 Jan 2008 16:40:15 -0600

BioConductor

Bioconductor is

"an open source and open development software project for the analysis and comprehension of genomic data...Bioconductor is primarily based on the R programming language but we do accept contributions in any programming language. There are two releases of Bioconductor every year (they appear shortly after the corresponding R release). At any one time there is a release version, which corresponds to the released version of R, and a development version, which corresponds to the development version of R. Most users will find the release version appropriate for their needs. In addition there are a large number of meta-data packages available. They are mainly, but not solely oriented towards different types of microarrays...Although initial efforts focused primarily on DNA microarray data analysis, many of the software tools are general and can be used broadly for the analysis of genomic data, such as SAGE, sequence, or SNP data."

http://blog.lib.umn.edu/westx045/data/2008/01/bioconductor.html Biological Sciences Tue, 08 Jan 2008 16:40:15 -0600
Quantifying the transmission potential of pandemic influenza

From the abstract:

"This article reviews quantitative methods to estimate the basic reproduction number of pandemic influenza, a key threshold quantity to help determine the intensity of interventions required to control the disease."

http://blog.lib.umn.edu/westx045/data/2007/11/quantifying_the_transmission_p.html Biological Sciences Mon, 26 Nov 2007 15:32:36 -0600
How Global Is the Global Biodiversity Information Facility?
Abstract:
There is a concerted global effort to digitize biodiversity occurrence data from herbarium and museum collections that together offer an unparalleled archive of life on Earth over the past few centuries. The Global Biodiversity Information Facility provides the largest single gateway to these data. Since 2004 it has provided a single point of access to specimen data from databases of biological surveys and collections. Biologists now have rapid access to more than 120 million observations, for use in many biological analyses. We investigate the quality and coverage of data digitally available, from the perspective of a biologist seeking distribution data for spatial analysis on a global scale. We present an example of automatic verification of geographic data using distributions from the International Legume Database and Information Service to test empirically, issues of geographic coverage and accuracy. There are over 1/2 million records covering 31% of all Legume species, and 84% of these records pass geographic validation. These data are not yet a global biodiversity resource for all species, or all countries. A user will encounter many biases and gaps in these data which should be understood before data are used or analyzed. The data are notably deficient in many of the world's biodiversity hotspots. The deficiencies in data coverage can be resolved by an increased application of resources to digitize and publish data throughout these most diverse regions. But in the push to provide ever more data online, we should not forget that consistent data quality is of paramount importance if the data are to be useful in capturing a meaningful picture of life on Earth.

http://blog.lib.umn.edu/westx045/data/2007/11/how_global_is_the_global_biodi.html Biological Sciences Thu, 08 Nov 2007 12:51:50 -0600