
dspace batch ingest

Format of files to ingest

Here is a breakdown of the files:

batch_files - top-level directory (the name does not matter)
    |
    Ingest1 - a prototype for the directories that you will create. Give these directories any name you want.
        |
        contents - contains the asset name. The fields are separated by a single tab. The file must be called "contents".
        dublin_core.xml - contains the DC metadata. This file must be called "dublin_core.xml".
        UDCsubmissionguidelines.pdf - this is the asset. You may use whatever name you want. However, the name of the asset must appear in the "contents" file.
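
The layout above can be sketched as a small shell script. This is only a minimal illustration: the BATCH_DIR variable, the placeholder title, and the empty stand-in asset are my assumptions for the example, not part of the original sample. The dcvalue element/qualifier form is the one I understand the DSpace item importer to expect, so check it against your DSpace version.

```shell
#!/bin/sh
# Sketch: build one item directory in the layout described above.
set -eu

batch_dir="${BATCH_DIR:-./batch_files}"   # top-level directory; name does not matter (BATCH_DIR is hypothetical)
item_dir="$batch_dir/Ingest1"             # one directory per item; any name works
asset="UDCsubmissionguidelines.pdf"       # asset name used in the post

mkdir -p "$item_dir"

# "contents" lists the asset file name, one asset per line.
printf '%s\n' "$asset" > "$item_dir/contents"

# Minimal Dublin Core metadata; the title is a placeholder.
cat > "$item_dir/dublin_core.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
  <dcvalue element="title" qualifier="none">Sample title (placeholder)</dcvalue>
</dublin_core>
EOF

# Stand-in for the real asset: creates an empty file only if none exists.
# In practice you would copy your real PDF here instead.
[ -f "$item_dir/$asset" ] || : > "$item_dir/$asset"
```

Repeat the item directory (Ingest2, Ingest3, and so on) for each item you want to ingest in the same batch.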

A tarball that gives a working sample of the directory structure.

Command

/dspace/dspace-ir/bin$ ./dsrun org.dspace.app.itemimport.ItemImport -a -c CollectionHandle -e Eperson -s /PATH_TO_BATCH_FILES/batch_files -m /home//PATH_TO_BATCH_FILES/Ingest1/mapfile.txt
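
Before running the import, it can help to check that every item directory is complete, since a missing asset or metadata file will stop the ingest. The function below is a rough sketch of such a check. The name check_batch is hypothetical and it is not part of DSpace. It assumes the first tab-separated field on each line of "contents" is the asset file name.

```shell
#!/bin/sh
# check_batch DIR: return 0 if every item directory under DIR has
# "contents", "dublin_core.xml", and each asset named in "contents".
# (Hypothetical helper for illustration; not a DSpace tool.)
check_batch() {
    batch_dir=$1
    status=0
    tab=$(printf '\t')
    for item_dir in "$batch_dir"/*/; do
        [ -d "$item_dir" ] || continue
        for required in contents dublin_core.xml; do
            if [ ! -f "$item_dir$required" ]; then
                echo "missing $required in $item_dir" >&2
                status=1
            fi
        done
        [ -f "${item_dir}contents" ] || continue
        # Take the first tab-separated field of each line as the asset name.
        while IFS= read -r line || [ -n "$line" ]; do
            name=${line%%"$tab"*}
            [ -n "$name" ] || continue
            if [ ! -f "$item_dir$name" ]; then
                echo "asset $name listed in ${item_dir}contents not found" >&2
                status=1
            fi
        done < "${item_dir}contents"
    done
    return $status
}
```

One way to use it is to chain it in front of the import, for example: check_batch /PATH_TO_BATCH_FILES/batch_files && ./dsrun org.dspace.app.itemimport.ItemImport ... with the same arguments as above.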

Resources

Dorothea Salo's EXCELLENT blog
ingest-export.ppt ARD Prasad
ScalabilityIssues - DSpace Wikis
