March 11, 2005

Why are there two entries for my title in the index? Can we fix it?

If the difference involves an initial article, this cannot be fixed for now, though it may be solved in a future version. For more information, read on.

Aleph derives two versions of an access field for browse indexing. One is heavily normalized for rough sorting; the other is more delicately normalized. Aleph uses the latter form to display the heading in the index with punctuation, upper and lower case, etc., which makes the headings more legible. When headings that sort together have different delicately normalized forms, Aleph has to decide whether to show them as one entry or as several.
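To make the two forms concrete, here is a minimal Python sketch. It is not Aleph's actual normalization code; the function names `sort_key` and `display_key` and the specific rules (lowercasing, stripping ASCII punctuation, collapsing spaces) are assumptions chosen only to illustrate the idea of a rough form and a more delicate one.

```python
import string

# Hypothetical stand-ins for Aleph's two normalization routines.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def sort_key(heading: str) -> str:
    """Heavily normalized form: lowercase, no ASCII punctuation, single spaces."""
    return " ".join(heading.lower().translate(_PUNCT_TABLE).split())

def display_key(heading: str) -> str:
    """Delicately normalized form: keeps case and punctuation, collapses spaces."""
    return " ".join(heading.split())

a = "Gone with the wind."
b = "Gone  With the Wind"

# Both headings file in the same place...
same_filing = sort_key(a) == sort_key(b)
# ...but their display forms differ, so the index has to pick one or show both.
same_display = display_key(a) == display_key(b)
```

Here `same_filing` is true and `same_display` is false, which is exactly the situation in which the index has to choose between one entry and two.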

In version 14, Aleph usually treated them as separate entries. This meant that a host of small differences--capitalization, punctuation, spacing--could cause what should be a single index entry to split into two or more entries. Version 16 improves on this significantly: Aleph now treats these kinds of differences as ignorable and displays only one form to represent all of them. But in the case of initial articles, Aleph is still unable to see the two entries as the "same." Eliminating the initial articles from titles in the bib records would solve the problem, but it would also violate standard cataloging rules, which call for including the initial article in 245s but excluding it in $t subfields and 130 uniform title entries.

What is needed is for Aleph to take the filing indicator into account when constructing the more delicate filing form. If it were able to use the filing indicator to offset the initial article correctly, the entries would merge. Possible downsides: no more non-filing initial articles in the index (some like seeing them there); and the possibility that the form selected for display in the index might be one of the 245 forms, which often (and properly) do not uppercase the first filing word. In any case, these would be smaller problems than the split in the headings we see now. I expect we'll see a fix for this at some point; but there's nothing we can do now. The underlying data is correct.
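The effect of honoring the filing indicator can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Aleph is implemented: `fold` and `build_index` are hypothetical names, and each entry is modeled as a title paired with its count of nonfiling characters (the value carried in the 245 second indicator in MARC 21).

```python
from collections import defaultdict

def fold(text: str) -> str:
    """Ignore the differences version 16 already ignores: case and spacing."""
    return " ".join(text.casefold().split())

def build_index(entries, use_indicator: bool) -> dict:
    """Group display forms by folded heading (hypothetical model of the index)."""
    groups = defaultdict(list)
    for title, nonfiling_chars in entries:
        # Skip the initial article only when the indicator is honored.
        text = title[nonfiling_chars:] if use_indicator else title
        groups[fold(text)].append(text)
    return dict(groups)

entries = [
    ("The tempest", 4),  # 245 form: article included, 4 nonfiling characters
    ("Tempest", 0),      # uniform title form: article omitted
]

split_index = build_index(entries, use_indicator=False)
merged_index = build_index(entries, use_indicator=True)
```

Without the indicator, `split_index` has two entries ("the tempest" and "tempest"). With it, `merged_index` has one. Note that the merged group contains the lowercase form "tempest" taken from the 245, which is the display downside described above.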

Posted by s-hear at March 11, 2005 02:44 PM

The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by the University of Minnesota.