We had assumed that the XACML permissions were at fault, but this was not the case. The problem was actually caused by a TN element in the FOXML.
Here are the contents of
/swadm/local/fedora3/server/fedora-internal-use/fedora-internal-use-repository-policies-approximating-2.0
deny-apim-if-not-localhost.xml
deny-inactive-or-deleted-objects-or-datastreams-if-not-administrator.xml
deny-policy-management-if-not-administrator.xml
deny-purge-datastream-if-active-or-inactive.xml
deny-purge-object-if-active-or-inactive.xml
deny-reloadPolicies-if-not-localhost.xml
deny-unallowed-file-resolution.xml
LOGPATH
permit-anything-to-administrator.xml
permit-apia-unrestricted.xml
permit-dsstate-check-unrestricted.xml
permit-oai-unrestricted.xml
permit-serverStatus-unrestricted.xml
readme.txt
I changed the Media Filter so that it no longer uses the Unix nice command when it launches. This should speed up the process.
Crontab
@reboot /sbin/service httpd start
@reboot sudo -u tomcat /dspace/bin/start_tomcat.sh
# day of week (0 - 6) (Sunday=0)
10 1 * * 6 /dspace/dspace-ir/bin/media_launch.sh
30 22 * * 1 /dspace/dspace-sr/bin/index-all-cron
30 22 * * 2 /dspace/dspace-ir/bin/index-all-cron
30 22 * * 3 /dspace/dspace-sr/bin/index-all-cron
30 22 * * 4 /dspace/dspace-ir/bin/index-all-cron
30 22 * * 5 /dspace/dspace-sr/bin/index-all-cron
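To double-check which day and time each entry above fires, a small helper along these lines can decode the fields. This is only a sketch: `describe_cron` is a hypothetical name, and it only understands plain numeric minute, hour and day-of-week fields like the ones used in this crontab.

```shell
#!/bin/sh
# Hypothetical helper: print "Day HH:MM command" for one crontab entry.
# Only plain numeric minute/hour/day-of-week fields are handled.
describe_cron() {
    set -f                 # keep the "*" fields from being glob-expanded
    set -- $1              # split the entry on whitespace
    set +f
    min=$1 hour=$2 dow=$5
    shift 5                # what is left is the command
    case "$dow" in
        0) day=Sunday;;    1) day=Monday;;   2) day=Tuesday;;
        3) day=Wednesday;; 4) day=Thursday;; 5) day=Friday;;
        6) day=Saturday;;
        *) day="dow=$dow";;
    esac
    printf '%s %02d:%02d %s\n' "$day" "$hour" "$min" "$*"
}
```

For example, `describe_cron '10 1 * * 6 /dspace/dspace-ir/bin/media_launch.sh'` prints `Saturday 01:10 /dspace/dspace-ir/bin/media_launch.sh`.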
media_launch.sh
tstamp=`date "+%Y%m%d_%H:%M"`
echo $tstamp
nice /dspace/dspace-ir/bin/filter-media.sh > /dspace/dspace-ir/log/filter-media.sh_$tstamp.log 2>&1
cd /dspace/dspace-ir/bin/
/dspace/dspace-ir/bin/index_check_and_email.sh
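For comparison, a launch step without nice might look roughly like the following. This is a sketch only, assuming nothing else in the script changes; `DSPACE_BIN`, `DSPACE_LOG`, `log_file_for` and `run_media_filter` are hypothetical names introduced here so the paths can be overridden.

```shell
#!/bin/sh
# Sketch of the filter step of media_launch.sh without "nice".
# DSPACE_BIN and DSPACE_LOG are hypothetical overrides; the defaults
# are the paths used in the script above.
DSPACE_BIN="${DSPACE_BIN:-/dspace/dspace-ir/bin}"
DSPACE_LOG="${DSPACE_LOG:-/dspace/dspace-ir/log}"

# Build the log file name for a given timestamp.
log_file_for() {
    echo "$DSPACE_LOG/filter-media.sh_$1.log"
}

# Run filter-media.sh at normal priority, logging stdout and stderr.
run_media_filter() {
    tstamp=$(date "+%Y%m%d_%H:%M")
    echo "$tstamp"
    "$DSPACE_BIN/filter-media.sh" > "$(log_file_for "$tstamp")" 2>&1
}

# To use it: call run_media_filter, then run the index check as before.
```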
filter-media.sh
Note that the "-n" option passed to filter-media means the index is not rebuilt after each collection is OCRed. The runs that used "nice" also used "-n".
#!/bin/sh
# This script grabs the handles of each collection
# in a DSpace DB instance, then loops through the
# handles and runs the full-text indexer against each
# collection.
# This works around out-of-memory errors and
# PDFs that are too large for full-text indexing:
# when filter-media (Java app) fails on one collection,
# full-text indexing continues with the others.
# Setup the environment
JAVA_HOME=/opt/jdk1.5.0_10
PATH=$JAVA_HOME/bin:/opt/ant/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
export PATH JAVA_HOME
dbname="dspace_ir"
username="read_only"
hostname="strip3.oit.umn.edu"
# Determine if we have Postgres client installed
which psql > /dev/null
if [ $? -ne 0 ]
then
echo
echo "psql not found in your PATH, please add to your PATH and re-run script"
echo
exit 1
fi
print_usage()
{
echo 1>&2 "Usage: $0 [-d dbname] [-n hostname] [-u username]"
exit 1
}
# "n:" must be listed here, otherwise the -n hostname case below is never reached
while getopts d:hn:u: o
do case "$o" in
d) dbname="$OPTARG";;
h) print_usage;;
n) hostname="$OPTARG";;
u) username="$OPTARG";;
[?]) print_usage;;
esac
done
echo_cmd="echo SELECT handle FROM handle WHERE resource_type_id=3;"
psql_cmd="psql -t -U $username -h $hostname $dbname"
BINDIR=`dirname $0`
for handle in `$echo_cmd | $psql_cmd`
do
$BINDIR/filter-media -n -i $handle
done
$BINDIR/index-all
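The per-collection loop above can also be written so that the psql output is cleaned up first and failures are counted instead of silently skipped. This is a sketch, not the script in production; `clean_handles` and `for_each_handle` are hypothetical helper names.

```shell
#!/bin/sh
# Strip the padding and blank lines that "psql -t" leaves around values.
clean_handles() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Run a command once per handle read from stdin and keep going when
# one run fails. "$@" is the command prefix; the handle is appended.
# Returns non-zero if any run failed.
for_each_handle() {
    failed=0
    while IFS= read -r handle
    do
        # </dev/null stops the command from eating the handle list
        if ! "$@" "$handle" </dev/null
        then
            echo "failed: $handle" 1>&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Intended use with the variables from the script above:
#   echo "SELECT handle FROM handle WHERE resource_type_id=3;" \
#     | psql -t -U "$username" -h "$hostname" "$dbname" \
#     | clean_handles \
#     | for_each_handle "$BINDIR/filter-media" -n -i
```

Reading the handles with `while read` rather than a backtick `for` loop keeps each handle as one word and makes it easy to report which collections failed.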
List of datastreams in a Fedora Object
curl example
curl --user user:password http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams
browser example
http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams
MODS Datastream
curl example
curl --user user:password http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/MODS/content
browser example
http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/MODS/content
RDF Datastream
curl example
curl --user user:password http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/RELS-EXT/content
browser example
http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/RELS-EXT/content
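The datastream URLs above all follow the same pattern, so they can be built from the PID and the datastream ID. A minimal sketch, where `FEDORA_BASE` and `datastream_url` are hypothetical names and the default host is the one used in the examples above:

```shell
#!/bin/sh
# Base URL of the Fedora server (hypothetical variable; override as needed).
FEDORA_BASE="${FEDORA_BASE:-http://128.101.146.59:8080/fedora}"

# datastream_url PID [DSID]
# Without DSID: the datastream listing for the object.
# With DSID:    the content of that datastream.
datastream_url() {
    if [ -n "${2:-}" ]
    then
        echo "$FEDORA_BASE/objects/$1/datastreams/$2/content"
    else
        echo "$FEDORA_BASE/objects/$1/datastreams"
    fi
}

# Example:
#   curl --user user:password "$(datastream_url urepository:97023 MODS)"
```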
Get all triples
curl example
curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=*+*+*
browser example
http://128.101.146.59:8080/fedora/risearch?type=triples&lang=spo&format=N-Triples&stream=on&query=*+*+*
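The backslashes in the curl example are only there to protect the "&" characters from the shell. Building the URL in a function and quoting it avoids the escaping, and also keeps the shell from glob-expanding the "*". This is a sketch; `FEDORA_BASE` and `risearch_url` are hypothetical names.

```shell
#!/bin/sh
# Base URL of the Fedora server (hypothetical variable; override as needed).
FEDORA_BASE="${FEDORA_BASE:-http://128.101.146.59:8080/fedora}"

# risearch_url QUERY
# QUERY is an spo query already in URL form, with terms joined by "+".
risearch_url() {
    echo "$FEDORA_BASE/risearch?type=triples&lang=spo&format=N-Triples&stream=on&query=$1"
}

# Example (the quotes replace the backslash escapes):
#   curl --user user:password "$(risearch_url '*+*+*')"
```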
All triples with handle 36284 as the object
curl example
curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=*+*+\
browser example
http://128.101.146.59:8080/fedora/risearch?type=triples&lang=spo&format=N-Triples&stream=on&query=*+*+
All triples with handle 36284 as the subject
curl example
curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=\+*+*
browser example
http://128.101.146.59:8080/fedora/risearch?type=triples&lang=spo&format=N-Triples&stream=on&query=+*+*
Determine if 36284 is a collection
curl example
curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=*+\+\
Find the title of 36284
curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=n-triples\&stream=on\&query=\+\+*
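All of the queries above are three-term spo patterns (subject, predicate, object) where "*" matches anything. A small helper makes the position of each term explicit. This is a sketch: `spo_pattern` is a hypothetical name, and the URI in the example is a placeholder, not one of the terms used in the queries above. It assumes URIs are written in angle brackets, as spo queries normally expect.

```shell
#!/bin/sh
# spo_pattern SUBJECT PREDICATE OBJECT
# An empty argument becomes "*" (match anything). Terms are joined
# with "+" so the result can be used as the query= value of a URL.
spo_pattern() {
    echo "${1:-*}+${2:-*}+${3:-*}"
}

# Examples (placeholder URI, single-quoted to protect "<", ">" and "*"):
#   spo_pattern '' '' ''                             all triples
#   spo_pattern '<info:fedora/example:1>' '' ''      fixed subject
#   spo_pattern '' '' '<info:fedora/example:1>'      fixed object
```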