mime types in AgEcon
Overview
I have found that the metadata field dc.format.mimetype (metadata_field_id = 36) only contains entries for items from before the migration to DSpace. In general, we will need to use the Format field from the bitstream table, which is populated for items from both before and after the migration.
dc.format.mimetype
The little shell script below pulls the handles of items that have non-null values for dc.format.mimetype.

    query="select handle from metadatavalue,handle where metadata_field_id=36 AND item_id=handle.resource_id AND handle.resource_type_id=2;"
    echo "$query" > temp_sql_file
    psql -U dspace_sr dspace_sr < temp_sql_file
    rm temp_sql_file

This shell script pulls all item handles.

    query="select handle from handle where handle.resource_type_id=2;"
    echo "$query" > temp_sql_file
    psql -U dspace_sr dspace_sr < temp_sql_file
    rm temp_sql_file

With these two scripts, I was able to find the handles that did and did not have the dc.format.mimetype field populated.
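The comparison of the two handle lists can be sketched as follows. This is a minimal sketch, not the exact procedure used here: the function name handles_missing_mimetype and the two input file names are hypothetical, and each file is assumed to hold one handle per line with no psql header or row-count footer (for example, output captured with psql's -A and -t options).

```shell
# Hypothetical helper: print the handles that appear in the first file
# (all item handles) but not in the second (handles with dc.format.mimetype).
# Assumes one handle per line, with no header or footer lines.
handles_missing_mimetype() {
    all_handles=$1
    with_mimetype=$2
    sorted_all=$(mktemp)
    sorted_with=$(mktemp)
    # comm needs sorted input; LC_ALL=C keeps sort and comm in agreement
    LC_ALL=C sort -u "$all_handles" > "$sorted_all"
    LC_ALL=C sort -u "$with_mimetype" > "$sorted_with"
    # -23 suppresses lines unique to file 2 and lines common to both
    LC_ALL=C comm -23 "$sorted_all" "$sorted_with"
    rm -f "$sorted_all" "$sorted_with"
}
```

For example, `handles_missing_mimetype all_handles.txt mimetype_handles.txt` would list the items whose dc.format.mimetype field is not populated.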
Comparison of metadata pre- and post-migration to DSpace
Here is a comparison of the mimetype-related metadata fields for items from before and after the migration to DSpace.

| DSpace Table Element | Present Pre-Migration (example handle 36676) | Present Post-Migration (example handle 96677) |
| --- | --- | --- |
| dc.format.mimetype | yes | no |
| Bitstream.name | yes | yes |
| Bitstream.source | no | yes |
| Bitstream.description | no | yes |
| Bitstream.format | yes | yes |
| Bitstream.user format description | no | no |
| Bitstream.license | no | yes |
Bitstream table for handle 96677:
Summary of Bitstream.format
I have found the unique mime types in AgEcon using the bitstream.format field:

    application/pdf 45264
    application/octet-stream 1
    application/vnd.ms-excel 2
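A tally like the one above can be produced from a plain list of formats. The sketch below is illustrative only: the function name count_mime_types is hypothetical, and the input is assumed to be a file with one mime type per line (one line per bitstream), however that list was exported from the database.

```shell
# Hypothetical helper: count how often each mime type occurs in a file
# that holds one mime type per line. Blank lines are ignored.
count_mime_types() {
    # Tally identical lines, then sort by count (descending) and by name
    awk 'NF { count[$0]++ } END { for (m in count) print m, count[m] }' "$1" |
        LC_ALL=C sort -k2,2nr -k1,1
}
```

Mime types contain no spaces, so the count is always the second field, which is what the sort key relies on.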
The handles for the non-pdf files are:

    application/octet-stream: 62242
    application/vnd.ms-excel: 42187, 92231
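Picking out the non-pdf handles can be sketched in the same way. This is an assumption-laden example: the function name non_pdf_handles is hypothetical, and the input is assumed to be a file of whitespace-separated "handle mimetype" pairs, one pair per line.

```shell
# Hypothetical helper: list "mimetype handle" for every bitstream whose
# format is not application/pdf. Input lines are "handle mimetype" pairs.
non_pdf_handles() {
    # Skip malformed lines, keep non-pdf rows, and group them by mime type
    awk 'NF == 2 && $2 != "application/pdf" { print $2, $1 }' "$1" |
        LC_ALL=C sort
}
```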
The Bitstream.format field will be used to determine mime type.