June 13, 2006

"TreeView" of NCBI BLAST Results Now Available

BLAST results may now be shown using a new "TreeView" dendrogram, clustered on the basis of BLAST score. Either of two methods, "Fast Minimal Evolution" or "Neighbor Joining," can be used to construct the tree. To see a "TreeView," run a BLAST search from the NCBI BLAST page using the default 'pairwise' formatting option and click on the "TreeView" link on the results page. The NCBI BLAST page is found at: http://www.ncbi.nlm.nih.gov/BLAST/

For a fun sample sequence to BLAST, see http://blog.lib.umn.edu/messn006/blastit/

While BLAST results are highly heuristic and should not be construed to constitute an accurate sequence phylogeny, the graphic representation provided by TreeView may be helpful in analyzing similarity patterns in one's BLAST results.
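
For anyone curious what "Neighbor Joining" does under the hood, here is a minimal Python sketch of the textbook algorithm. It is not NCBI's implementation: how TreeView turns BLAST scores into distances is not described here, so the distance matrix is simply taken as a hypothetical input, and the function and node names are made up for illustration.

```python
def neighbor_joining(names, dist):
    """Textbook neighbor joining (illustrative sketch, not NCBI's code).

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns (joins, final_edge), where each join is
    (child_a, child_b, branch_len_a, branch_len_b, new_internal_node)
    and final_edge is (node_x, node_y, length) for the last two nodes.
    """
    nodes = list(names)
    d = dict(dist)
    joins = []

    def D(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all the others.
        totals = {a: sum(D(a, b) for b in nodes if b != a) for a in nodes}
        # Choose the pair that minimizes the Q criterion.
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * D(a, b) - totals[a] - totals[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * D(a, b) + (totals[a] - totals[b]) / (2 * (n - 2))
        len_b = D(a, b) - len_a
        new = f"node{len(joins) + 1}"
        # Distances from the new internal node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = 0.5 * (D(a, c) + D(b, c) - D(a, b))
        nodes = [c for c in nodes if c not in (a, b)] + [new]
        joins.append((a, b, len_a, len_b, new))

    x, y = nodes
    return joins, (x, y, D(x, y))
```

Each pass merges the pair of nodes with the lowest Q value, which is why the method needs only a table of pairwise distances rather than the sequences themselves.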

Posted by Kevin Messner at 10:18 AM

May 12, 2006

NCBI Field Guide Coming to Campus June 8-9

Please help spread the word! On Thursday June 8 – Friday June 9, the Bio-Medical Library is sponsoring a short course, including a hands-on workshop, taught by bioinformatics specialists with the NCBI User Services staff. The course, “NCBI Field Guide to GenBank and Molecular Biology Resources,” discusses effective use of tools including the Entrez databases and search service, the BLAST similarity search engine, genome data, Map Viewer, NCBI’s structure visualization software, and the conserved domain database.

The course consists of a 3-hour lecture on the morning of June 8, followed by a 2-hour hands-on computer workshop. Several duplicate sessions of the workshop will be held on the afternoon of June 8 and all day June 9.

Registration for the workshop will open on Monday, May 15, and is required. Please note that seating is limited, and registrations will be handled on a first-come, first-served basis. Please visit http://www.biomed.lib.umn.edu/services/instruction/ncbifieldguide for more information on the course and complete registration information.

Posted by Kevin Messner at 11:36 AM

March 16, 2006

NCBI Powerscripting Workshop

NCBI presents NCBI PowerScripting, a 4-day course including both lectures and computer workshops on effectively using the NCBI E-utilities within scripts to automate search and retrieval operations across the entire suite of Entrez databases.

Dates: May 2-5, 2006
Location: Lister Hill Center (Bldg. 38A), NLM, NIH, Bethesda, MD
Attendance: The course is free but an application is required.

For more information and to apply for the course, see the course Web page: http://www.ncbi.nlm.nih.gov/Class/PowerTools/eutils/course.html
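
To give a flavor of what scripting against the E-utilities looks like, here is a small Python sketch that only builds request URLs for ESearch and EFetch. It is not course material. The base URL is the present-day E-utilities address and the helper names are hypothetical; check the current NCBI documentation and usage guidelines before sending real requests.

```python
from urllib.parse import urlencode

# Present-day E-utilities base address (an assumption; not given in this post).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL that looks up record IDs matching a query."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db, ids, rettype="abstract", retmode="text"):
    """Build an EFetch URL that retrieves the records for a list of IDs."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


print(esearch_url("pubmed", "BLAST"))
# prints https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=BLAST&retmax=20
```

A script would typically call ESearch first, parse the returned IDs, and pass them to EFetch; the same two-step pattern applies across the Entrez databases.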

Posted by Kevin Messner at 9:50 AM

January 24, 2006

NCBI Coffee Break: Readings and Bioinf Tutorials

Just an FYI: NCBI Coffee Break provides short articles and tutorials on recent biological advances, with a focus on applications of computational biology and bioinformatics technologies. Good reading for students and researchers new to the field. http://www.ncbi.nlm.nih.gov/books/bv.fcgi?call=bv.View..ShowSection&rid=coffeebrk.TOC

Posted by Kevin Messner at 2:48 PM

January 18, 2006

NCBI Search Toolbar Now Available

A Search Toolbar from NCBI is now available for download. This program for your Web browser (IE and Firefox) allows you to quickly initiate searches of several NCBI databases using a browser-based toolbar, and to highlight where your search statement was found in your results.

For more information see: http://www.ncbi.nlm.nih.gov/projects/toolbar/

Posted by Kevin Messner at 3:11 PM

May 15, 2005

PubMed implements Ajax

PubMed has incorporated the Ajax methodology employed by Google Suggest in its "Single Citation Matcher":

Type in a journal name: http://www.ncbi.nlm.nih.gov/entrez/query/static/citmatch.html

For comparison: http://www.google.com/webhp?complete=1&hl=en

The downside is that the code seems to force the user to click the "Go" button, rather than use the return key on the keyboard.

Posted by Kevin Messner at 4:37 PM

April 11, 2005

PubMed Spell Check Feature

A spell checking feature has been added to PubMed to suggest alternative spellings for PubMed search terms that include misspellings. The spell check feature tries to determine what the user intended and then displays an alternative spelling. See the NLM Technical Bulletin article for more information: http://www.nlm.nih.gov/pubs/techbull/nd04/nd04_spell.html

Posted by Kevin Messner at 2:52 AM

March 24, 2005

NCBI Consensus CoDing Sequence (CCDS) Project

The Consensus CoDing Sequence (CCDS) project aims to identify a core set, or master list, of human protein-coding regions that are consistently annotated and of high quality. CCDS was made public on March 2, 2005 and is available at http://www.ncbi.nlm.nih.gov/CCDS/

The long-term goal is to support convergence toward a standard set of gene annotations on the human genome.

The CCDS set is built by consensus among the collaborating members, which include:
- European Bioinformatics Institute (EBI)
- National Center for Biotechnology Information (NCBI)
- University of California, Santa Cruz (UCSC)
- Wellcome Trust Sanger Institute (WTSI)

The Approach:
- Compare NCBI vs Ensembl and Vega annotation of coding sequence regions.
- Identify those that have identical locations on the genome.
- Apply quality tests to the initial candidate set.
- Reject candidates that fail the tests.
- Give those that pass QC a CCDS ID and version.
- Track the CCDS through annotation and genome sequence updates.
- Rejected candidates may be added to the CCDS set in a future release, depending on:
  - additional transcript data becoming available
  - continued improvements to automatic annotation methods
  - curation
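
The approach above can be sketched in a few lines of Python. This is only an illustration of the compare, test, and assign-ID flow, not the collaboration's actual pipeline: the `CodingRegion` type, the function names, and the way IDs are numbered here are all hypothetical.

```python
from typing import NamedTuple


class CodingRegion(NamedTuple):
    """A coding region described only by its genomic location (hypothetical type)."""
    chrom: str
    strand: str
    exons: tuple  # tuple of (start, end) coordinate pairs


def consensus_candidates(annotation_a, annotation_b):
    """Keep only regions annotated at identical locations by both groups."""
    return set(annotation_a) & set(annotation_b)


def assign_ids(candidates, qc_tests, first_id=1):
    """Give an ID and version to candidates passing every QC test; reject the rest."""
    accepted, rejected = {}, []
    next_id = first_id
    for region in sorted(candidates):
        if all(test(region) for test in qc_tests):
            # Illustrative ID scheme: a running number plus version 1.
            accepted[f"CCDS{next_id}.1"] = region
            next_id += 1
        else:
            rejected.append(region)
    return accepted, rejected
```

Rejected regions are returned rather than discarded, mirroring the point that they may be reconsidered in a future release.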

Assessing Quality. CCDS status is conservatively applied:
- Any member of the collaboration may contribute to quality testing.
- Any member of the collaboration may unilaterally 'reject' an annotation from the candidate set.

Quality assessment tests may change over time. For the first release, these measures included:
- Consensus splice sites
- Valid start and stop codons
- No internal stops (translation of the genome sequence coordinates results in the expected protein)
- Protein homology
- Supporting transcripts
- Genome conservation
- Pseudogene unlikely
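
Two of these measures, valid start and stop codons and no internal stops, are easy to illustrate. The Python sketch below assumes the standard genetic code with ATG as the start codon and a coding sequence already spliced into one string; the function names are hypothetical, and the project's real tests are certainly more involved.

```python
# Stop codons of the standard genetic code (an assumption for this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(cds):
    """Split a coding sequence into complete codons, ignoring any trailing bases."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def has_valid_start_and_stop(cds):
    """True if the sequence is a whole number of codons, starts with ATG, ends with a stop."""
    triplets = codons(cds.upper())
    return (
        len(cds) % 3 == 0
        and len(triplets) >= 2
        and triplets[0] == "ATG"
        and triplets[-1] in STOP_CODONS
    )


def has_no_internal_stops(cds):
    """True if no codon before the last one is a stop codon."""
    return not any(c in STOP_CODONS for c in codons(cds.upper())[:-1])
```

For example, "ATGAAATAA" passes both checks, while "ATGTAAAAATAA" fails the second because of the TAA in the second codon position.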

The genome browsers available at each of the project collaborators' web sites contain links to CCDS records. The links are displayed in various ways, depending on the browser. In MapViewer, if a gene has a CCDS, you will see that acronym in the links that are displayed when the Genes_seq map is the master map.

Posted by Kevin Messner at 2:28 AM

February 14, 2005

Open Mass Spectrometry Search Algorithm (OMSSA)

The Open Mass Spectrometry Search Algorithm [OMSSA] is an efficient search engine for identifying MS/MS peptide spectra by searching libraries of known protein sequences. OMSSA scores significant hits with a probability score developed using classical hypothesis testing, the same statistical method used in BLAST. OMSSA is free and in the public domain.

OMSSA is available at http://pubchem.ncbi.nlm.nih.gov/omssa/. A research article on OMSSA is available to U Minn TC affiliates: J Proteome Res. 2004 Sep-Oct;3(5):958-64.

Posted by Kevin Messner at 1:47 AM

Entrez Genome Project

A new database, Genome Project, has been added to the Entrez home page (http://www.ncbi.nlm.nih.gov/Entrez/). Entrez Genome Project is a companion database to Entrez Genome. The actual data from genome sequencing projects are contained in Entrez Genome (as complete genomes or chromosomes) and Entrez Nucleotide (as chromosome or genome fragments such as contigs). The Genome Project database, on the other hand, provides an umbrella view of the status of each genome project, links to project data in the other Entrez databases, and links to a variety of other NCBI and external resources associated with a given genome project.

A genome project's status can be complete or in-progress, and the project can include large-scale sequencing, assembly, annotation, and mapping efforts. New genome sequencing projects can be registered through the Genome project submission form. More information about the submission of data from complete genomes is provided in the Resource Guide section on Submission of complete genomes. (Although the Entrez Genome Project database does not include viral genome sequencing projects, data from those projects are submitted to GenBank and are available in the Entrez Nucleotide and Entrez Genome databases. There is also a special set of resources at NCBI dedicated to Viral Genomes.)

Posted by Kevin Messner at 1:40 AM

My NCBI Replaces the Cubby: Includes Automatic E-mailing of Search Updates and Filters

The PubMed Cubby will soon be replaced by My NCBI. My NCBI works similarly to the Cubby in that it retains user information in order to provide additional services. To use My NCBI you must be signed in. You can sign in using an existing Cubby account, or if you do not have an account, you can register for a My NCBI account.

[Editor's Note: This feature was implemented in PubMed on February 1, 2005.]

Read the full NLM Technical Bulletin article at http://www.nlm.nih.gov/pubs/techbull/jf05/jf05_myncbi.html

Posted by Kevin Messner at 1:06 AM

January 14, 2005

Entrez Genome Project Resource for Microbial Genomes

Entrez Genome now provides the Entrez Genome Project Resource for microbial genomes, offering three tabular displays for complete microbial genomes as well as those for which sequencing is in progress. New features available from the Organism Info tab include data sorting based on genomic properties, such as %GC and genome size, or biological characteristics, such as viable temperature range, oxygen requirements, or the habitat of the organisms listed. There is also a breakdown of the sequencing progress for each organism record.

URL: http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi
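
As a reminder of what the %GC and genome size columns measure, here is a toy Python sketch that computes both from a sequence and sorts on them. The record names and sequences are made-up placeholders, not data from the resource, and the function names are hypothetical.

```python
def gc_percent(seq):
    """Percentage of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sort_by_gc(records, descending=True):
    """Sort (name, sequence) pairs by %GC, breaking ties by sequence length."""
    return sorted(
        records,
        key=lambda rec: (gc_percent(rec[1]), len(rec[1])),
        reverse=descending,
    )


# Placeholder records for illustration only.
toy_records = [("organism_a", "ATATGC"), ("organism_b", "GGCCAT")]
for name, seq in sort_by_gc(toy_records):
    print(name, len(seq), round(gc_percent(seq), 1))
```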

Posted by Kevin Messner at 12:13 AM

October 29, 2004

PubChem Databases searchable through NCBI Entrez

From NCBIannounce (http://www.ncbi.nlm.nih.gov/mailman/listinfo/ncbi-announce):
The PubChem Substance, Compound, and BioAssay databases are now available through the Entrez system. PubChem is a catalog of small organic molecules that contains chemical structures and information on biological activity. PubChem is intended to support the Molecular Libraries and Imaging component of the NIH Roadmap Initiative. PubChem's chemical structure database may be searched on the basis of descriptive terms, chemical properties, and structural similarity. When possible, PubChem's chemical structure records are linked to other NCBI databases, such as PubMed and NCBI's protein 3D structure database. PubChem also contains the results of high-throughput biological screening experiments.

Posted by Kevin Messner at 12:39 PM