In UDC there are DSpace handles that are marked as collections (resource_type_id = 3) but do not show up in the collection table.
Finding the problem in the media filter log
I was looking at the filter media log, dspace-ir_filter-media.log, and found many errors of the form:
Exception in thread "main" java.lang.IllegalArgumentException: Cannot resolve 4394 to a DSpace object
This means that when the handle was passed to the static method HandleManager.resolveToObject, null was returned.
A list of these handles (handle set 1) was obtained using the UNIX command line below:
grep 'Cannot.resolve' dspace-ir_filter-media.log | perl -pe 's/^.*Cannot resolve (\d+).*$/$1/' | sort -u | sort -g
Looking at the error-producing handles in Postgres
Looking up one of the listed handles in the handle table gives its resource_id, but no row in the collection table has that value as its collection_id. I think these are collections that were deleted: they were removed from the collection table but not from the handle table.
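The manual lookup above can be done for all handles at once with an outer join. The following is a minimal sketch, not the query that was actually run: it assumes the table and column names used in the queries in this note, the function name orphan_handles_sql is made up for illustration, and the database name dspace in the usage line is an assumption.

```shell
# Print a query listing collection-type handles whose resource_id has
# no matching collection_id in the collection table.
# (orphan_handles_sql is a hypothetical helper name.)
orphan_handles_sql() {
  cat <<'SQL'
SELECT h.handle, h.resource_id
FROM handle h
LEFT JOIN collection c ON c.collection_id = h.resource_id
WHERE h.resource_type_id = 3
  AND c.collection_id IS NULL
ORDER BY h.handle;
SQL
}

# Usage (database name "dspace" is an assumption):
#   orphan_handles_sql | psql -d dspace -At
```

If the deleted-collection explanation is right, the handles returned should match handle set 1.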
SQL to get collection handles
Old SQL command: pulls every collection-type handle, with or without an entry in the collection table
The handles that were fed to filter-media came from the SQL command below:
SELECT handle FROM handle WHERE resource_type_id=3;
The above command will grab both good collection handles and handles that have no entry in the collection table.
Handles returned by this command (good and bad collection handles combined: handle set 2).
New SQL command: pulls only handles that have an entry in the collection table
The command below pulls only handles whose resource_id is a valid collection_id (i.e. exists in the collection table).
SELECT handle FROM collection, handle WHERE collection_id=resource_id AND resource_type_id=3 ORDER
Handles returned by the improved command (only handles that exist in the collection table: handle set 3).
Quick sanity check
Handle set 1 maps to null collections (the handles that failed to resolve).
Handle set 2 maps to all collections in the handle table, both null and non-null.
Handle set 3 maps to non-null collections.
So we would expect:
1) There to be no overlap between handle set 1 and handle set 3.
2) The combined contents of handle set 1 and handle set 3 should be equal to the contents of handle set 2.
Both 1) and 2) hold for the three handle sets.
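The two checks can be scripted roughly as follows. This is a sketch under assumptions, not the procedure that was actually used: it assumes each handle set has been saved to a text file with one handle per line, and the function name check_handle_sets and the file names in the usage line are hypothetical.

```shell
# check_handle_sets SET1 SET2 SET3
# Prints OK if set 1 and set 3 share no handles and their union equals
# set 2; prints MISMATCH otherwise.
check_handle_sets() {
  tmp=$(mktemp -d) || return 1
  # comm needs sorted input; -u also drops duplicate handles
  LC_ALL=C sort -u "$1" > "$tmp/s1"
  LC_ALL=C sort -u "$2" > "$tmp/s2"
  LC_ALL=C sort -u "$3" > "$tmp/s3"
  # Check 1: handles present in both set 1 and set 3 (expected: none)
  overlap=$(LC_ALL=C comm -12 "$tmp/s1" "$tmp/s3")
  # Check 2: the union of set 1 and set 3 should be identical to set 2
  LC_ALL=C sort -u "$tmp/s1" "$tmp/s3" > "$tmp/union"
  if [ -z "$overlap" ] && cmp -s "$tmp/union" "$tmp/s2"; then
    result=0
    echo "OK"
  else
    result=1
    echo "MISMATCH"
  fi
  rm -rf "$tmp"
  return $result
}

# Usage (file names are placeholders):
#   check_handle_sets set1.txt set2.txt set3.txt
```

Sorting with LC_ALL=C keeps the ordering consistent between sort and comm regardless of the local locale settings.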