
Looking for non-Unicode characters in AgEcon metadata

Problem and general solution

Some non-Unicode characters have gotten into the DSpace metadata, and we need to find them. I will print the metadata fields out to a file of the form below.

<doc> text from metadata pull </doc>

Then I will run the file through xmllint.
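A minimal Python sketch of this idea follows. It assumes the metadata values are available as raw byte strings and that "non-Unicode" means bytes that are not valid UTF-8, or control characters that XML 1.0 does not allow. The names `find_bad_values` and `build_doc` are made up for illustration. Escaping `&`, `<` and `>` and writing one value per line are also assumptions, made so that markup in a value does not cause a false alarm and a parser's line number can be traced back to a value.

```python
import re

# Control characters that XML 1.0 forbids even when the bytes are valid UTF-8.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def find_bad_values(values):
    """Return (index, reason) pairs for raw byte strings an XML parser would reject."""
    problems = []
    for index, raw in enumerate(values):
        try:
            text = raw.decode("utf-8")  # strict decoding raises on invalid bytes
        except UnicodeDecodeError as exc:
            problems.append((index, f"invalid UTF-8 at byte {exc.start}"))
            continue
        if _XML_ILLEGAL.search(text):
            problems.append((index, "character not allowed in XML 1.0"))
    return problems


def build_doc(values):
    """Build the <doc> ... </doc> file body as bytes, one metadata value per line."""
    lines = []
    for raw in values:
        # Flatten line breaks so that value i always lands on line i + 2 of the file.
        flat = raw.replace(b"\r", b" ").replace(b"\n", b" ")
        # Escape markup characters so only encoding problems trip the parser.
        flat = flat.replace(b"&", b"&amp;").replace(b"<", b"&lt;").replace(b">", b"&gt;")
        lines.append(flat)
    return b"<doc>\n" + b"\n".join(lines) + b"\n</doc>\n"
```

Writing the result of `build_doc` to a file in binary mode and running `xmllint --noout` on that file should then report the line of the first offending value, while `find_bad_values` gives the same kind of answer without leaving Python.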

SQL needed

The line below will get all the valid item_ids.
SELECT item.item_id FROM item, handle WHERE handle.resource_id=item.item_id;

The line below will pull a metadata field for a given item id.
SELECT text_value FROM metadatavalue WHERE metadata_field_id=43 AND item_id=36450;
This query returns the Series/Report value (metadata_field_id 43) for the item with item_id=36450.
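The two queries above can be chained in a short loop. The sketch below is only an illustration: it uses Python's built-in `sqlite3` with a tiny in-memory database as a stand-in for the real DSpace database, the table definitions are cut down to the columns the queries touch, and the sample rows are invented. The helper name `pull_field` is hypothetical. Against the real database you would use that database's own driver, and the parameter placeholder (`?` here) may differ.

```python
import sqlite3


def pull_field(conn, field_id):
    """Yield (item_id, text_value) for one metadata field of every item that has a handle."""
    item_ids = conn.execute(
        "SELECT item.item_id FROM item, handle "
        "WHERE handle.resource_id = item.item_id"
    ).fetchall()
    for (item_id,) in item_ids:
        rows = conn.execute(
            "SELECT text_value FROM metadatavalue "
            "WHERE metadata_field_id = ? AND item_id = ?",
            (field_id, item_id),
        )
        for (text_value,) in rows:
            yield item_id, text_value


# Stand-in database with invented rows, only to exercise the loop.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY);
    CREATE TABLE handle (resource_id INTEGER);
    CREATE TABLE metadatavalue (
        item_id INTEGER, metadata_field_id INTEGER, text_value TEXT
    );
    INSERT INTO item VALUES (36450), (36451);
    INSERT INTO handle VALUES (36450);
    INSERT INTO metadatavalue VALUES
        (36450, 43, 'Staff Paper 1'),
        (36451, 43, 'Item without a handle'),
        (36450, 64, 'A title');
""")

series = list(pull_field(conn, 43))
```

Here `series` holds only the Series/Report value of item 36450, because item 36451 has no row in the stand-in `handle` table.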

Metadata fields to check

metadata_field_id  name
3                  author
15                 date issued
25                 uri
27                 abstract
40                 Institution/Association
43                 Series/Report
57                 Keyword
63                 JEL Codes
64                 Title
67                 email
This list came from AgEconMetadata.htm
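As a possible shortcut, the field list can be kept in one place and turned into a single query instead of one query per field and item. The sketch below is an assumption about how that might look, not something from the original procedure: `FIELDS_TO_CHECK` and `build_query` are made-up names, and the generated query skips the join against `handle`, so it would also return values for items without a handle.

```python
# Field ids and names copied from the list above.
FIELDS_TO_CHECK = {
    3: "author",
    15: "date issued",
    25: "uri",
    27: "abstract",
    40: "Institution/Association",
    43: "Series/Report",
    57: "Keyword",
    63: "JEL Codes",
    64: "Title",
    67: "email",
}


def build_query(field_ids):
    """Return one SELECT that pulls every listed field from metadatavalue."""
    # int() guards against anything but plain numbers ending up in the SQL text.
    id_list = ", ".join(str(int(field_id)) for field_id in sorted(field_ids))
    return (
        "SELECT item_id, metadata_field_id, text_value FROM metadatavalue "
        f"WHERE metadata_field_id IN ({id_list});"
    )


query = build_query(FIELDS_TO_CHECK)
```

Each returned row carries its `metadata_field_id`, so a bad value can be reported with the field name looked up in `FIELDS_TO_CHECK`.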
