
September 17, 2009

David Naughton gave me some fast perl code for hashes

    #!/exlibris/sfx_ver/sfx_version_3/app/perl-5.8.6/bin/perl
    use strict;
    use warnings;

    # Takes export file from SFX and generates host definitions to append to
    # the ezproxy.cfg file.

    # Open SFX export file for manipulation
    my $sfx_url_file = shift @ARGV;
    open (my $fh_sfx, '<', $sfx_url_file) or die "File Open Failed: $!";

    # get hostnames into a hash
    my %hostnames;
    while (<$fh_sfx>) {
        my $line = $_;
        next if $line =~ /^#/;
        my ($sfx_target, $sfx_url) = split /\t/, $line;

        # Skip lines that have no second tab-separated field
        next unless defined $sfx_url;

        if ($sfx_url =~ m/\:\/\/(.*?)\//) {
            # Hash keys are always unique, so if a key for $1
            # already exists, this line will clobber its value:
            $hostnames{$1} = undef;

            # If you want to keep track of how many times each
            # hostname appears, you can use this magic:
            # $hostnames{$1}++;

            # More verbose version of the code above:
            # if (!(exists $hostnames{$1})) {
            #     $hostnames{$1} = 0;
            # }
            # $hostnames{$1} = $hostnames{$1} + 1;
        }
    }

    # print each unique hash key, with some added text
    for my $hostname (keys %hostnames) {
        print "HJ $hostname\n";
    }

    # close file handle
    close $fh_sfx or die "File Close Failed: $!";

September 15, 2009

dspace batch ingest

Format of files to ingest

Here is a breakdown of the files:

batch_files - top-level directory (the name does not matter)
   |
   Ingest1 - a prototype for the directories that you will create. Give these directories any name you want.
      |
      contents - contains the asset name. The fields are separated by a single tab. The file must be called "contents".
      dublin_core.xml - contains the DC metadata. This file must be called "dublin_core.xml".
      UDCsubmissionguidelines.pdf - this is the asset. You may use whatever name you want. However, the name of the asset must appear in the "contents" file.
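
The layout above can be sketched as a short shell script. This is only an illustration, not the method used to build the original sample: the empty placeholder PDF, the title value, and the optional "bundle:ORIGINAL" field in the "contents" file are assumptions (the last one based on DSpace's simple archive format), and BATCH_DIR is a hypothetical variable for choosing the top-level directory.

```shell
#!/bin/sh
# Sketch: build one item directory in the batch layout described above.
set -e

# Top-level directory; the name does not matter (BATCH_DIR is hypothetical).
batch_dir="${BATCH_DIR:-./batch_files}"
item_dir="$batch_dir/Ingest1"
asset="UDCsubmissionguidelines.pdf"

mkdir -p "$item_dir"

# Empty placeholder so the sketch is self-contained;
# in real use, copy the actual PDF here instead.
: > "$item_dir/$asset"

# "contents" names the asset; fields on a line are separated by a single tab.
# The second field (bundle:ORIGINAL) is an assumed, optional extra.
printf '%s\tbundle:ORIGINAL\n' "$asset" > "$item_dir/contents"

# Minimal Dublin Core metadata; the title text is a placeholder.
cat > "$item_dir/dublin_core.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
  <dcvalue element="title" qualifier="none">Submission guidelines (sample)</dcvalue>
</dublin_core>
EOF

# Show the resulting item directory.
ls "$item_dir"
```

Each additional item would get its own directory next to Ingest1, with its own "contents" and "dublin_core.xml" files.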

Tarball that gives a working sample of the directory structure.

command

/dspace/dspace-ir/bin$ ./dsrun org.dspace.app.itemimport.ItemImport -a -c CollectionHandle -e Eperson -s /PATH_TO_BATCH_FILES/batch_files -m /home//PATH_TO_BATCH_FILESs/Ingest1/mapfile.txt

Resources

Dorothea Salo's EXCELLENT blog
ingest-export.ppt ARD Prasad
ScalabilityIssues - DSpace Wikis

September 14, 2009

putting captchas into dspace using jcaptcha

ingest-export.ppt

code needed for a captcha

The file form.jsp had to be modified so that the email form is not produced if the captcha was not set or was answered incorrectly.
When the email form is not produced, a JSP that displays the captcha is called instead: captcha_main.jsp. To make the captcha image, captcha_main.jsp calls the following Java classes, in order:
ImageCaptchaServlet.java
CaptchaServiceSingleton.java
MyImageCaptchaEngine.java
The file dspace-web.xml also had to be modified.

X11 not on the box where tomcat runs

If X11 is not on the box where Tomcat runs, you will get an error like "port 6000 not available". To fix this, put:

-Djava.awt.headless=true

into the catalina.sh file.
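
One way this might look in practice is sketched below. It assumes the flag is passed through the JAVA_OPTS variable that catalina.sh reads; where exactly the line goes depends on your Tomcat installation, so treat the placement as an assumption.

```shell
# Sketch: add near the top of catalina.sh, before the JVM is started.
# Appends the headless flag while keeping any options that are already set.
JAVA_OPTS="${JAVA_OPTS:-} -Djava.awt.headless=true"
export JAVA_OPTS
```

Tomcat has to be restarted for the change to take effect.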

Helpful links

How to use jsp-forward tag
How do I perform browser redirection from a JSP pages
Breaking a Visual CAPTCHA