Main

October 28, 2011

Islandora was not properly pointing at https fedora sites

The issue

When we switched stage to https, Drupal gave us the white screen of death.

Location of code

The file where the troubled code is:
sites/all/modules/Islandora-islandora-6589c7c/ConnectionHelper.inc
function _fixURL

The bug

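This is the basic problem. The code below will yield:
$new_url = 'http'
for any URL that does not contain 'http://', including every https URL. strpos returns FALSE when the string is not found, and the loose comparison FALSE == 0 is TRUE in PHP, so the first branch always wins. (A strict comparison, === 0, would not have this problem.)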
<?php                   
   $url = 'https://ddd';
    if (strpos($url, 'http://') == 0) {
        $new_url = 'http';
    }
    elseif (strpos($url, 'https://') == 0) {
      $new_url = 'https';
    }
    else {
        drupal_set_message(t('Invalid URL: !url', array('!url' => $url)));
        return NULL;
    }
   echo $new_url . "\n";
?>
Rewrite of the code that properly selects https:
<?php                   
   $url = 'https://ddd';

    $url_start = substr ($url , 0, 5 );

    if ($url_start == 'http:') {
        $new_url = 'http';
    }
    elseif ($url_start == 'https') {
      $new_url = 'https';
    }
    else {
        drupal_set_message(t('Invalid URL: !url', array('!url' => $url)));
        return NULL;
    }
   echo $new_url . "\n";
?>
I inserted this code into the function _fixURL code of the file:
sites/all/modules/Islandora-islandora-6589c7c/ConnectionHelper.inc
and all is well.
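
For comparison, the same scheme check can be sketched in Java, where a prefix test returns a real boolean and cannot be confused with position 0. This is only an illustration and not Islandora code; the class name UrlScheme and the method schemeOf are made up for this note.

```java
// Hypothetical helper, not part of Islandora: pick the scheme of a URL by prefix.
class UrlScheme {
    // Returns "http" or "https", or null for anything else
    // (null plays the role of the NULL return in the PHP above).
    static String schemeOf(String url) {
        if (url == null) {
            return null;
        }
        if (url.startsWith("https://")) {
            return "https";
        }
        if (url.startsWith("http://")) {
            return "http";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(schemeOf("https://ddd"));
        System.out.println(schemeOf("http://ddd"));
        System.out.println(schemeOf("ftp://ddd"));
    }
}
```

Testing the longer prefix first is not strictly needed here, since "http://" is not a prefix of "https://", but it keeps the intent obvious.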

September 29, 2011

Bill's Solution to the Solr problem

We had problems connecting SOLR to Drupal. Here are some answers that Bill Tantzen found:

Some issues with bad characters

Special characters in the Solr REST call should be replaced with their hex (percent-encoded) values. This call:
http://128.101.146.59:8080/solr/select/?version=1.2&start=0&rows=10&indent=on&wt=standard&q=fullText:food&sort=modstitle desc&qt=mods

Should be:

http://128.101.146.59:8080/solr/select/?version=1.2&start=0&rows=10&indent=on&wt=standard&q=fullText:food++%26sort%3Dmodstitle%2Bdesc&qt=mods
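
The hex values in the second URL are what standard URL encoding produces (& becomes %26, = becomes %3D, + becomes %2B). A small Java sketch of that encoding is below; the class and method names are made up for this note, and it only shows how the escapes arise, not how Islandora builds its Solr calls.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: percent-encode one piece of a Solr request.
class SolrParamEncoding {
    static String encode(String value) {
        try {
            // URLEncoder turns spaces into '+' and reserved characters into %XX.
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this should never happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("&sort=modstitle+desc"));
        System.out.println(encode("modstitle desc"));
    }
}
```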

Fields that Solr sorts on should be single valued

Bill found out that any field used for sorting must be single valued. Our title field is not. Older versions of SOLR would handle multivalued fields for sort.

The MODS schema

<titleInfo>
    <title>[dc.title]</title>
</titleInfo>
<titleInfo type="translated">
    <title>[dc.title.alternative]</title>
</titleInfo>

The XSLT to make SOLR indices

Note that the XSLT does not distinguish between different values of the type attribute, so if there is more than one version of the title, the field will no longer be single valued.
/Users/birage/fedora/tomcat/webapps/fedoragsearch/WEB-INF/classes/config/index/GSearch_solr/demoFoxmlToSolr.xslt  

    <xsl:for-each select="foxml:datastream[@ID='MODS']/foxml:datastreamVersion[last()]/foxml:xmlContent//mods:titleInfo/mods:title">
         <xsl:if test="text() [normalize-space(.) ]"><!--don't bother with empty space-->
            <field>
               <xsl:attribute name="name">
                  <xsl:value-of select="concat('mods.', 'title')"/>
               </xsl:attribute>
               <xsl:value-of select="text()"/>
            </field>
         </xsl:if>
      </xsl:for-each>
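
One way to get a single value is to select only the titleInfo that has no type attribute, since MODS leaves the main title untyped. The Java sketch below tries that XPath predicate on a small sample. It is an untested idea for our index, not a change we made; the class name, the sample record, and the use of local-name() to sidestep namespace prefixes are all assumptions made for illustration. In the XSLT the equivalent predicate would be mods:titleInfo[not(@type)].

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical sketch: pick the one untyped (main) title out of a MODS record.
class MainTitle {
    // Selects title elements whose parent titleInfo carries no type attribute.
    static final String MAIN_TITLE_XPATH =
        "//*[local-name()='titleInfo'][not(@type)]/*[local-name()='title']";

    // Made-up sample with one main title and one translated title.
    static final String SAMPLE =
        "<mods xmlns='http://www.loc.gov/mods/v3'>"
        + "<titleInfo><title>Duplex House</title></titleInfo>"
        + "<titleInfo type='translated'><title>Maison jumelee</title></titleInfo>"
        + "</mods>";

    static String mainTitle(String modsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(modsXml.getBytes(StandardCharsets.UTF_8)));
            // evaluate() returns the string value of the first matching node.
            return XPathFactory.newInstance().newXPath()
                .evaluate(MAIN_TITLE_XPATH, doc).trim();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mainTitle(SAMPLE));
    }
}
```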

Possible values for the type attribute of title element in MODS

From: http://www.loc.gov/standards/mods/v3/mods-userguide-elements.html
type - This attribute is applied when it is necessary to identify what type of title is recorded.
For the main title (MARC 21 field 245), no type is indicated. The following values may be used with the type attribute:
abbreviated (equivalent to MARC 21 field 210)
translated (equivalent to MARC 21 field 242, 246)
alternative (equivalent to MARC 21 fields 246, 740)
uniform (equivalent to MARC 21 fields 130, 240, 730)

Reason why Islandora could not connect to SOLR on stage

The bug

When we tried to connect to solr from islandora we got the error:
Unable to connect to Solr server 

Islandora code connected to the problem

This error is generated in the Islandora file:
./sites/all/modules/Islandora-islandora_solr_search-9e474f7/solr.admin.inc
The following function is the root cause of the problem:

/**
 *
 * @param String $solr_url
 * @return boolean
 *
 * Checks availability of Solr installation
 *
 */
function solr_available($solr_url) {
  // path from url is parsed to allow graceful inclusion or exclusion of 'http://'
  $pathParts = parse_url($solr_url); 
  $path = 'http://' . $pathParts['host'] . ':' . $pathParts['port'] . $pathParts['path'] . '/admin/file';
  $test = @fopen($path, "r");
  if ($test) {
    return true;
  }
  return false;
}
    

The fix (upgrade SOLR)

It turns out that solr 3.1 cannot recognize the
"/admin/file"
at the end of a URL. We upgraded to SOLR 3.4 and it worked.
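
To see exactly which URL the check probes, here is a Java sketch that builds the same address from a configured Solr URL. The class and method names are made up. Unlike the PHP above, which always prepends http://, this sketch keeps whatever scheme the configured URL has; that is my assumption about what would be wanted on an https server, not something Islandora does.

```java
import java.net.URI;

// Hypothetical sketch: build the URL that the availability check would open.
class SolrProbe {
    static String adminFileUrl(String solrUrl) {
        // Accept URLs with or without a scheme, as the PHP comment intends.
        String withScheme = solrUrl.contains("://") ? solrUrl : "http://" + solrUrl;
        URI uri = URI.create(withScheme);
        // getPort() returns -1 when no port is given.
        String port = uri.getPort() == -1 ? "" : ":" + uri.getPort();
        String path = uri.getPath() == null ? "" : uri.getPath();
        return uri.getScheme() + "://" + uri.getHost() + port + path + "/admin/file";
    }

    public static void main(String[] args) {
        System.out.println(adminFileUrl("http://128.101.146.59:8080/solr"));
        System.out.println(adminFileUrl("localhost:8080/solr"));
    }
}
```

Pasting the printed URL into a browser is a quick way to tell whether a given Solr version answers at /admin/file.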

September 23, 2011

Reason why we could not upload foxml file (agecon_top.xml)

We had assumed that the XACML permissions were at fault, but this was not the case. The problem was actually a TN element in the foxml.

Here are the contents of the directory
/swadm/local/fedora3/server/fedora-internal-use/fedora-internal-use-repository-policies-approximating-2.0
deny-apim-if-not-localhost.xml
deny-inactive-or-deleted-objects-or-datastreams-if-not-administrator.xml
deny-policy-management-if-not-administrator.xml
deny-purge-datastream-if-active-or-inactive.xml
deny-purge-object-if-active-or-inactive.xml
deny-reloadPolicies-if-not-localhost.xml
deny-unallowed-file-resolution.xml
LOGPATH
permit-anything-to-administrator.xml
permit-apia-unrestricted.xml
permit-dsstate-check-unrestricted.xml
permit-oai-unrestricted.xml
permit-serverStatus-unrestricted.xml
readme.txt

September 6, 2011

Various curls for fedora RDF (REST interface)

List of datastreams in a Fedora Object

curl example

curl --user user:password http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams

browser example

http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams

MODS Datastream

curl example

curl --user user:password http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/MODS/content

browser example

http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/MODS/content

RDF Datastream

curl example

curl --user user:password http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/RELS-EXT/content

browser example

http://128.101.146.59:8080/fedora/objects/urepository:97023/datastreams/RELS-EXT/content

Get all triples

curl example

curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=*+*+*

browser example

http://128.101.146.59:8080/fedora/risearch?type=triples&lang=spo&format=N-Triples&stream=on&query=*+*+*

All triples with 36284 handle as an object

curl example

curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=*+*+\<info:fedora/urepository:36284\>

browser example

http://128.101.146.59:8080/fedora/risearch?type=triples&lang=spo&format=N-Triples&stream=on&query=*+*+<info:fedora/urepository:36284>

All triples with 36284 handle as a subject

curl example

curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=\<info:fedora/urepository:36284\>+*+*

browser example

http://128.101.146.59:8080/fedora/risearch?type=triples&lang=spo&format=N-Triples&stream=on&query=<info:fedora/urepository:36284>+*+*

Determine if 36284 is a collection

curl example

curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=N-Triples\&stream=on\&query=*+\<info:fedora/fedora-system:def/relations-external#isMemberOfCollection\>+\<info:fedora/urepository:36284\>

Find the title of 36284

curl --user user:password http://128.101.146.59:8080/fedora/risearch?type=triples\&lang=spo\&format=n-triples\&stream=on\&query=\<info:fedora/urepository:36284\>+\<http://purl.org/dc/elements/1.1/title\>+*
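
The backslashes in the curl lines above only protect &, < and > from the shell. When building the same risearch URL in code, the query can be percent-encoded instead. The Java sketch below does that; the class and method names are made up, and the host is just the one from the examples above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch: build a risearch triples URL for an spo query.
class RiSearchUrl {
    static String triplesUrl(String fedoraBase, String spoQuery) {
        try {
            // Spaces become '+', and < > : / become %XX escapes.
            String encoded = URLEncoder.encode(spoQuery, "UTF-8");
            return fedoraBase + "/risearch?type=triples&lang=spo"
                + "&format=N-Triples&stream=on&query=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this should never happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // All triples with 36284 as the subject.
        System.out.println(triplesUrl("http://128.101.146.59:8080/fedora",
            "<info:fedora/urepository:36284> * *"));
    }
}
```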

June 23, 2011

Changes required to have islandora_ContentModelCollection.xml upload to fedora repository

Summary

Biraj pulled some xml directly from his repository but it could not be ingested into the new Fedora repository on https://umetadata-stage.lib.umn.edu:8443/fedora/admin/. However I could ingest into umetadata-stage the examples that the Fedora repository people gave us.

Original file that would not load and modified file that does load

Version of the xml directly from Biraj's Fedora repository that would not ingest into the umetadata-stage Fedora repository:
islandora_ContentModelCollection.xml
Modified XML that can be ingested:
islandora_ContentModelCollection_noaudit.xml

Changes

Modification                                          Comment
Eliminate audit elements                              not certain that this is required
Eliminate CREATED timestamp elements                  not certain that this is required
Eliminate element with ID="TN.0" LABEL="Thumbnail"    required

June 10, 2011

crons on strip3 (DB side of DSPACE)

# Clean up the databases nightly
20 0 * * * vacuumdb -U dspace_ir --analyze dspace_ir > /dev/null 2>&1
40 0 * * * vacuumdb -U dspace_sr --analyze dspace_sr > /dev/null 2>&1

# Backup the databases nightly
2 1 * * * /var/lib/pgsql/backup.sh

# MySQL backup
5 0 * * * /opt/mysql/bin/backup.sh

March 29, 2011

I gave a talk at Macalester with Jason (March 17, 2011). Below is the PowerPoint.

techConf2011-roySilvis.zip

January 13, 2011

ABBY OCR System does not work well in a VM

I tried to run ABBY 3.0 in a VM. I used a 34 page pdf as input to ABBY:
MRC-72-3 A consumer test of canned, seasoned salad tomatoes.pdf
I ran the Microsoft Performance Monitor while using ABBY to produce the plot below.
FinalperformanceTest.gif
red line is \\DLS-OCR\Processor(_Total)\% Processor Time
green line is \\DLS-OCR\Memory\Page Faults/sec ( this maxed out at above 60K)


We also have a csv version of the data. This run was completed after several modifications were made to the VM to enhance performance. However, even at this point, it still takes almost 2 seconds per page.
Because of this, we will not be using a VM with ABBY.

June 29, 2010

Candidate data stream for METS struct map

In our Fedora archive, there will be complex objects that have children objects. The order of these objects will be specified by a METS struct map in a Fedora data stream. The METS xml will look something like:

<METS:mets OBJID="demo:UMNcard.Will001" xmlns:METS="http://www.loc.gov/METS/">
  <METS:structMap>
    <METS:div ID="UMNcard.Will001.STRUCT">
      <METS:div ORDER="1" CONTENTIDS="demo:UMNcard.Will001.01" LABEL="FRONT"/>
      <METS:div ORDER="2" CONTENTIDS="demo:UMNcard.Will001.02" LABEL="BACK"/>
    </METS:div>
  </METS:structMap>
</METS:mets>
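
A rough Java sketch of how a client might read the child order back out of such a struct map is below. This is only an illustration of the idea, not code from our system; the class name and the sample string are made up, and it assumes every child div carries both ORDER and CONTENTIDS.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: list child object ids in the order the struct map gives.
class StructMapOrder {
    static final String METS_NS = "http://www.loc.gov/METS/";

    // Made-up sample; BACK is listed first on purpose to show the sorting.
    static final String SAMPLE =
        "<METS:mets OBJID='demo:UMNcard.Will001' xmlns:METS='http://www.loc.gov/METS/'>"
        + "<METS:structMap><METS:div ID='UMNcard.Will001.STRUCT'>"
        + "<METS:div ORDER='2' CONTENTIDS='demo:UMNcard.Will001.02' LABEL='BACK'/>"
        + "<METS:div ORDER='1' CONTENTIDS='demo:UMNcard.Will001.01' LABEL='FRONT'/>"
        + "</METS:div></METS:structMap></METS:mets>";

    static List<String> orderedChildren(String metsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(metsXml.getBytes(StandardCharsets.UTF_8)));
            NodeList divs = doc.getElementsByTagNameNS(METS_NS, "div");
            // TreeMap keeps entries sorted by ORDER; a duplicate ORDER would overwrite.
            TreeMap<Integer, String> byOrder = new TreeMap<>();
            for (int i = 0; i < divs.getLength(); i++) {
                Element div = (Element) divs.item(i);
                // The outer div has neither attribute and is skipped.
                if (div.hasAttribute("ORDER") && div.hasAttribute("CONTENTIDS")) {
                    byOrder.put(Integer.parseInt(div.getAttribute("ORDER")),
                        div.getAttribute("CONTENTIDS"));
                }
            }
            return new ArrayList<>(byOrder.values());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(orderedChildren(SAMPLE));
    }
}
```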

May 26, 2010

Using an axis client to call fedora modifyDatastreamByValue method

Below is code that modifies a fedora data stream using an axis client.

I had an error message like:
SAXException: Found character data inside an array element while deserializing
I did some looking around and found out that the line:
call.setOperationStyle(org.apache.axis.constants.Style.WRAPPED);
had to be inserted (see http://www.opensubscriber.com/message/axis-user@ws.apache.org/1855611.html and thanks Anne)
public void insertSequenceData(String NewXMl, String PID, String LogMessage) throws Exception {
  byte[] normalarr = NewXMl.getBytes("UTF-8");
  String[] altIds = new String[1];
  altIds[0] = "";
  // Use an axis client to call the Fedora webserver
  Service service = new Service();
  Call call = (Call) service.createCall();
  call.setOperationName(new QName(APIM_NS, "modifyDatastreamByValue") );
  call.setTargetEndpointAddress( new URL(URL_API_M));
  call.setUsername(fedoraUser);
  call.setPassword(fedoraPswd);
  // if the WRAPPED style is not used you get the evil error:
  // SAXException: Found character data inside an array element while deserializing
  // see 
  // http://www.opensubscriber.com/message/axis-user@ws.apache.org/1855611.html
  // for the solution.
  call.setOperationStyle(org.apache.axis.constants.Style.WRAPPED);
  Object[] obj_arr = new Object[] {
    PID, // The PID of the object.
    "STRUCT",        // The datastream ID.
    altIds, // Alternate identifiers for the datastream, if any.
    "METS StructMap for this object", //  The label for the datastream.
    "text/xml", //  The mime type.
    "",  //  Optional format URI of the datastream.
    normalarr,  //  The content of the datastream.
    null, //  The algorithm used to compute the checksum. One of "DEFAULT", "DISABLED", "MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512".
    null,  //  The value of the checksum represented as a hexadecimal string.
    LogMessage, //  A log message.
    false // Force the update even if it would break a data contract.
    };
  call.invoke( obj_arr); 
}

May 5, 2010

AXIS2 client for fedora repository

Below is the source code for an AXIS2 client that talks to the fedora repository. I based this on a very helpful code fragment that I found, but I have lost the link. So thank you, friend. I really appreciate the help.
 
import org.apache.axis.client.Service;
import org.apache.axis.client.Call;
import javax.xml.namespace.QName;
import java.net.*;

public class Axis2ClientToFedora
{

  public static void main(String[] argv) {
    try {
      Service service = new Service();
      Call call = (Call) service.createCall();
      call.setOperationName(new QName("http://www.fedora.info/definitions/1/0/api/", "purgeObject") );
      call.setTargetEndpointAddress( new URL("http://chaucer.lib.umn.edu:8080/fedora/services/management") );
      call.setUsername("FedoraUserName");
      call.setPassword("FedoraPassword");
      Object[] obj_arr = new Object[] {
        "basic:data",
        "purge basic:data",
        false
      };
      call.invoke( obj_arr);
    }
    catch ( Exception e ) {
      e.printStackTrace();
    }
    return;
  }
}

rels-ext for collections

Below is an example of the rdf that I plan to use for collection objects in fedora:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="info:fedora/collection:data">
    <hasModel xmlns="info:fedora/fedora-system:def/model#" rdf:resource="info:fedora/basic:content_model"></hasModel>
    <hasModel xmlns="info:fedora/fedora-system:def/model#" rdf:resource="info:fedora/collection:content_model"></hasModel>
    <hasMember xmlns="info:fedora/fedora-system:def/relations-external#">basic:data</hasMember>
  </rdf:Description>
</rdf:RDF>
This object has all the basic methods and all the collection methods, and it has a child called basic:data.

April 22, 2010

Part of the fedora API-M methods that deal with rdf

addRelationship
Creates a new relationship in the object. Adds the specified relationship to the object's RELS-EXT or RELS-INT Datastream. If the Resource Index is enabled, the relationship will be added to the Resource Index.
An rdf tuple consists of an object or datastream (the subject), having a predicate relating it to a target (the object). The object can either be a literal value, or a URI (which can identify for example a Fedora object or a datastream).
Input parameters:
• String subject The subject. Either a Fedora object URI (eg info:fedora/demo:333) or a datastream URI (eg info:fedora/demo:333/DS1).
• String relationship The predicate.
• String object The object (target).
• boolean isLiteral A boolean value indicating whether the object is a literal.
• String datatype The datatype of the literal. Optional.
Returns:
• boolean True if and only if the relationship was added.

getRelationships
Get the relationships asserted in the object's RELS-EXT or RELS-INT Datastream that match the given criteria.
Input parameters:
• String subject The subject. Either a Fedora object URI (eg info:fedora/demo:333) or a datastream URI (eg info:fedora/demo:333/DS1).
• String relationship The predicate to match. A null value matches all predicates.
Returns:
• RelationshipTuple[]
  • String subject - The subject of the relation. Either a Fedora object URI (eg info:fedora/demo:333) or a datastream URI (eg info:fedora/demo:333/DS1).
  • String predicate - The predicate relating the subject and the object. Includes the namespace of the relation.
  • String object - The URI of the object (target) of the relation
  • boolean isLiteral - If true, the object should be read as a literal value, not a URI
  • String datatype - If the object is a literal, the datatype to parse the value as. Optional.

purgeRelationship
Delete the specified relationship. This method will remove the specified relationship(s) from the RELS-EXT or RELS-INT datastream. If the Resource Index is enabled, this will also delete the corresponding triples from the Resource Index.
Input parameters:
• String subject The subject.  Either a Fedora object URI (eg info:fedora/demo:333) or a datastream URI (eg info:fedora/demo:333/DS1).
• String relationship The predicate, null matches any predicate.
• String object The object, null matches any object.
• boolean isLiteral A boolean value indicating whether the object is a literal.
• String datatype The datatype of the literal. Optional.
Returns:
• boolean True if and only if the relationship was purged.

Extracted from here

January 28, 2010

API to enable complex objects

Bill, Jason, and I are working on an API for complex objects in Fedora.

January 11, 2010

UML for media

UML.jpg

December 10, 2009

Postcard Complex object from UW

PostcardObjectRelations.png
See: University of Wisconsin Digital Collections Center - Sample Postcard Object in FedoraCommons.

ESciDoc TOC (Table Of Contents)

I. ESciDoc Content Models

The primary type (or the category) of the content resources depicted with the CModel. Allowed values are:
* Item
* Container
* TOC (Table Of Contents)
escidoc_conceptual_model.jpg
Diagram from: eSciDoc(4).pdf
A TOC is optional and not shown above.

II. TOC Description

In eSciDoc hierarchical structures are built by means of container resources. A container resource refers to its members, which are again containers or items. The set of references is represented as a structural map (struct-map) inside the representation of a container resource. Additionally a container may contain a table of contents (TOC) which contains an ordered selection of members.

III. Example of TOC

Some attributes in the TOC xml.

div element attributes:
* ORDER: The physical page number of the scan. The physical order must begin with number "1".
* ORDERLABEL: The logical page number of the scan
* ID: The identification number of this scan (id of the item)
* TYPE: The type of this structural element (see List of structural element types)
* LABEL: The element's title
* VISIBLE: Indicates if this div (and its sub-elements) should be displayed when displaying this toc

ptr element attributes:
* ID: The identification of this pointer
* USE: The type of the file described with this locator
MIN = thumbnail size
DEFAULT = Web size
MAX = Full size
ITEM = item which contains these files
* xlink:href: The locator for this file
* LOCTYPE: The locator type
* MIMETYPE: The scan's MIME type

IV. Used by VIRR

Welcome to the "Virtueller Raum Reichsrecht" Collection of the Max Planck Institute for European History of Law. The solution will provide a published digital collection and a cooperative working environment for various artefacts of the legislation in the period of the German Holy Empire. The compilation will be indexed, structured via METS, transcribed and linked to further relevant scientific literature. The Max Planck Wiki says it has more than 20,000 scans.

V. escidoc and CMAs

A) escidoc has the concept of content models see ESciDoc Logical Data Model
Content models defines in general:
* the type and structure of the content resources (item, container, members)
* a set of services that may be associated with the content resources


Seems like CMA
B) Plans to bring CMAs in
From Roadmap Infrastructure: Status: March 16, 2009
Content Model Content Model Handler
propose XML-representation
Specification needed. May be based on the new Fedora CMA (content model architecture)

November 24, 2009

tomcat user runs fedora

Instructions from tomcat install

From tomcat manual
3.5. Running Tomcat as Non-Root User
I don't believe there are any issues with running Tomcat as root user. However, for the more security-conscious readers out there, here are some instructions on running Tomcat as a non-root user. At this stage, the Tomcat packages, files and binaries are owned by root. We will first need to create a Tomcat user and group that will own these files, and under which Tomcat will run.
Tomcat User :: tomcat
Tomcat Group :: tomcat
Not too imaginative, huh ? We will now create the Tomcat user and group. Open a terminal window and, as root,
# groupadd tomcat
# useradd -g tomcat -d /opt/tomcat tomcat
# passwd tomcat
Notice that we specified the home directory of Tomcat to be /opt/tomcat. Some people believe that this is good practice because it eliminates an additional home directory that needs to be administered. Now, we will put everything in /opt/tomcat under Tomcat user and group. As root,
# chown -R tomcat:tomcat /opt/tomcat
If /opt/tomcat is a symlink to your Tomcat install directory, you'll need to do this:
# chown -R tomcat:tomcat /opt/jakarta-tomcat-5.x.xx
Verify that JAVA_HOME and CATALINA_HOME environment variables are setup for tomcat user, and you should be good to go. Once the Tomcat binaries are under Tomcat user, the way you invoke it will be different.
To start Tomcat,
# su - tomcat -c /opt/tomcat/bin/startup.sh
To stop Tomcat,
# su - tomcat -c /opt/tomcat/bin/shutdown.sh
In my case replace these commands with
su - tomcat -c /usr/local/fedora/tomcat/bin/startup.sh and
su - tomcat -c /usr/local/fedora/tomcat/bin/shutdown.sh
Also, be aware that your web applications will need to be deployed (i.e. copied to the web application directories) as user tomcat, instead of root. A little more hassle, but possibly a little safer too.

Lines added to /etc/profile

The tomcat user needed access to a few environment variables, so I added the following lines to /etc/profile

# User specific aliases and functions
export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk.x86_64
export FEDORA_HOME=/usr/local/fedora
export PATH=$PATH:$FEDORA_HOME/server/bin:$FEDORA_HOME/client/bin:$JAVA_HOME/bin
export CATALINA_HOME=$FEDORA_HOME/tomcat
export UWDCUTIL_HOME=/usr/local/uwdcutils-1.0

November 17, 2009

Reaching the Fedora Repository from a remote box ... localhost issue

I was trying to contact a fedora box from a remote box and I had the problem that when I hit links within Fedora, the word "localhost" kept appearing in the URL. This resolved to a "404" because Fedora is not on my box. I fixed this by editing:
$FEDORA_HOME/server/config/fedora.fcfg
and changing the line:
<param name="fedoraServerHost" value="localhost">
In the line above, I replaced the word "localhost" with the actual IP address.

November 13, 2009

Turning XACML ON and OFF in Fedora

2.1 Enabling/Disabling XACML Policy Enforcement
To enable/disable XACML policy enforcement in Fedora, use the Fedora configuration file (fedora.fcfg). Whether Fedora uses XACML for authorization decisions is controlled by the ENFORCE-MODE parameter in the Authorization module:
<param name="ENFORCE-MODE" value="enforce-policies"/>
The ENFORCE-MODE parameter can contain one of three values, with the following meanings:
1. enforce-policies – enable XACML enforcement to determine whether a request is permitted or denied
2. permit-all-requests – disable XACML enforcement; PERMIT every request by default
3. deny-all-requests – disable XACML enforcement; DENY every request by default
The enforce-policies setting is used to enable the enforcement of XACML policies, and is the default setting for a Fedora repository. The permit-all-requests setting can facilitate testing code independent of security. The deny-all-requests setting can be used to quickly shut down access to the server, but requires a server restart to effect this. Tomcat container security is, of course, still a first barrier to authentication/authorization (i.e., Fedora's Tomcat web.xml specifies access protection earlier than XACML). Tomcat container security is always in place regardless of the setting for parameter ENFORCE-MODE.
See Fedora Commons on XACML.

October 6, 2009

title element wrong for media ingest

The IMAGES xml files used to ingest data into the media repository contain a flaw.
Bad version (current):
<title main="Duplex House" variant="Residence project" variant="Exterior perspective"/>
Good:
<title type="main" > Duplex House </title>
<title type="variant" > Residence project </title>
<title type="variant" > Exterior perspective </title>
Affected files:
bln-dcugranting2007.xml
botanical-dcugranting2007.xml
cbi-dcugranting2007.xml
ellis-dcugranting2007.xml
mno-dcugranting2007.xml
mss-alexanderBros.xml
mss-dcugranting2007.xml

Some more files (all the rest):

001-bell-historicalmaps
005-ymca-wwiPhotos
006-map-19thCent
008-eas-ming
011-mss-purcellMasonite

August 10, 2009

Ames collection ingest ... multiple metadata per file

The problem

While ingesting the metadata for the Ames collection I found a problem. Below is a list of metadata sets that map to the same image.
identifier local Image
ama00711 ama00711.jpg
ama00712 ama00711.jpg
amp00259 amp00259.jpg
ap00259 amp00259.jpg
amp00435 amp00435.jpg
amp00436 amp00435.jpg
amp00448 amp00449.jpg
amp00449 amp00449.jpg
amp00513 amp00531.jpg
amp00531 amp00531.jpg

Jason's solution

identifier local Image
ama00711 ama00711.jpg
ama00712 ama00711.jpg - error in the data. Should point to ama00712.jpg (I've fixed it in IMAGES)
amp00259 amp00259.jpg
ap00259 amp00259.jpg - delete this version
amp00435 amp00435.jpg
amp00436 amp00435.jpg - error in the data. Should point to amp00436.jpg (I've fixed it in IMAGES)
amp00448 amp00449.jpg - error in the data. Should point to amp00448.jpg (I've fixed it in IMAGES)
amp00449 amp00449.jpg
amp00513 amp00531.jpg - error in the data. Should point to amp00513.jpg (I've fixed it in IMAGES)
amp00531 amp00531.jpg

What to do from here

Bill will need to pull the Ames collection from IMAGES. I will need to re-ingest it and delete ap00259.

May 24, 2009

Control Groups:

a. Managed Content (M): Datastream content is stored and managed within the Fedora repository’s persistent storage. The content can be any MIME type including XML.
b. Inline XML (X): A special case of M, restricted to well-formed XML. In this case the datastream content is stored as part of the XML structure of the digital object itself and is thus included when the digital object is exported (e.g., for archival purposes).
c. Externally Referenced (E): Datastream content is external to the Fedora repository and is referenced by a URL that is recorded within the digital object. The content can be any MIME type including XML.
d. Redirected Content (R): Like E, but datastream content is delivered to the client without any mediation by Fedora; i.e., via an HTTP redirect. You should use this datastream type when the external content is a web page with relative links or it is streaming audio or video. The content can be any MIME type including XML.


State
"A", "I", "D" (Active, Inactive, Deleted)
Fedora object type(s)
O=regular data objects, D=behavior definitions, M=behavior mechanisms

April 10, 2009

Minimal DC data stream in Fedora Repository

Fedora docs say that a minimal DC datastream consists of the elements dc:title and dc:identifier.

January 20, 2009

Making a mods data stream for fedora.

Overall Plan

The function _ustore_ingest_fedora creates the dc content stream. Here is a list of functions within _ustore_ingest_fedora and what needs to be done to them to create a mods stream.
routine               change needed for mods
_fedora_doc           No change
_dc_datastream        write mods_datastream
_dc_content           write _mods_content
_append_datastream    No change
_append_dc_content    change query string
No change to the rels-ext stuff.

July 7, 2008

ingestFormat and Fedora 3.0

Version of ingestFormat that fails in Fedora 3.0

Old version of ingest, with ingestFormat equal to "foxml1.0".
This fails in Fedora 3.0, giving the error:

fedora.server.errors.ObjectValidityException: Unsupported format: foxml1.0


Code example that generates error:
 
import fedora.server.types.gen.RepositoryInfo;
import java.io.*;

public class FedoraIngest {

    private static final String protocol = "http";
    private static final String host = "localhost";
    private static final int port = 8080;
    private static final String usr = "fedoraAdmin";
    private static final String pwd = "pass";

    private static final String collection = "swhp";
//    private static final String foxmlSrc = "/Users/bill/projects/fedora/" 
//                                         + collection + "/";

    private static final String foxmlSrc = "/Users/silvi003/Desktop/bill" 
                                         + collection + "/";


    public static void main(String[] argv) throws Exception {

	String[] dir = new java.io.File(foxmlSrc).list(new FOXMLFilter());
	FedoraSOAPClient caller = new FedoraSOAPClient(protocol, host, port, usr, pwd);
	// test client connection status with the most basic call...
	for (int i = 0; i< dir.length; i++) {
	    String pid = dir[i];
	    System.out.println("FedoraIngest " + pid);
	    try {
		FileInputStream fis = null;
		String fedoraPid = null;
		File foxml = new File(foxmlSrc + pid);
		fis = new FileInputStream(foxml);
		fedoraPid = caller.ingest(fis, "foxml1.0", "ingest of " + pid);
		System.out.println("new fedora object: " + fedoraPid);
	    } 
	    catch (Exception excp) {
		System.out.println("ingest error: " + excp.getMessage());
		excp.printStackTrace();
	    }
	}
    }
}






Version of ingestFormat that works in Fedora 3.0

The ingestFormat value of "info:fedora/fedora-system:FOXML-1.1" works in Fedora 3.0.

I found this value in the config file:
$FEDORA_SRC_HOME/src/properties/server/fedora/server/resources/Server.properties
Code example that works:
 
import fedora.server.types.gen.RepositoryInfo;
import java.io.*;

public class FedoraIngestOneFile {

    private static final String protocol = "http";
    private static final String host = "localhost";
    private static final int port = 8080;
    private static final String usr = "fedoraAdmin";
    private static final String pwd = "pass";

    public static void main(String[] argv) throws Exception {

	FedoraSOAPClient caller = new FedoraSOAPClient(protocol, host, port, usr, pwd);
	String FileName = "/Users/silvi003/Desktop/umndob_msp01688";
	    try {
		File foxml = new File(FileName);
		FileInputStream fis = new FileInputStream(foxml);
		String fedoraPid = caller.ingest(fis, "info:fedora/fedora-system:FOXML-1.1", "ingest of " + FileName);
		System.out.println("new fedora object: " + fedoraPid);
	    } 
	    catch (Exception excp) {
		System.out.println("ingest error: " + excp.getMessage());
		excp.printStackTrace();
	    }
    }
}

July 1, 2008

Upload to soap

MTOM: way of sending binary in soap

SOAP Message Transmission Optimization Mechanism (MTOM)
XOP (XML-binary Optimization Packaging)

RFC 2045 section 6.8 gives description of Base64 Content-Transfer-Encoding
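
The reason MTOM is interesting is that plain SOAP has to carry binary data as Base64 text, which grows it by a third (every 3 bytes become 4 characters). The Java sketch below just measures that overhead with the standard java.util.Base64 class; the class and method names are made up for this note.

```java
import java.util.Base64;

// Hypothetical sketch: how much bigger a payload gets when Base64 encoded.
class Base64Overhead {
    // Length in bytes of the Base64 form of rawBytes bytes (no line breaks).
    static int encodedLength(int rawBytes) {
        return Base64.getEncoder().encode(new byte[rawBytes]).length;
    }

    public static void main(String[] args) {
        int raw = 3000;
        System.out.println(raw + " raw bytes -> " + encodedLength(raw) + " Base64 bytes");
    }
}
```

The MIME flavour described in RFC 2045 also inserts line breaks, so it comes out slightly larger still; Base64.getMimeEncoder() gives that variant.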

Understanding MTOM

Advantages of MTOM
Introduction to MTOM: A Hands-on Approach (more advantages to MTOM)
Sending Files in Chunks with MTOM Web Services and .NET 2.0

Possible PHP library for MTOM

I have a client/soap server in AXIS2 that sends MTOM back and forth. This could serve as the web service. PHP needs to talk to the AXIS2 webservice.
A possible choice is WSO2 Web Services Framework/PHP Proven Interoperability

WSO2 WSF/PHP features proven interoperability with Microsoft .NET, WSO2 WSAS (Apache Axis2/Java based Web services application server) and other J2EE implementations. The basic SOAP level interoperability as well as WS-* specification implementations have been tested and proven to interoperate.

Attachments with Web Services and Clients
You can send and receive attachments with SOAP messages both in optimized as well as non optimized formats with MTOM support. Attachments with MTOM/XOP
Problem: this seems to require that you know the MIME type of the attachment.

Downloading a Binary File from a Web Service using Axis2 and SOAP with Attachments

This example uses SwA (SOAP with Attachments).

June 23, 2008

Getting basic axis soap client (CalcClient) to work.

For the axis 1.4 soap client (samples.userguide.example2.CalcClient) to work I had to do the following:

No Problem with server

I copied Calculator.java to tomcat/webapps/axis/Calculator.jws and it worked just fine.

The following URLs give the right answer:
http://localhost:8080/axis/Calculator.jws
http://localhost:8080/axis/Calculator.jws?method=add&i1=1&i2=2
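
As a sketch, the second URL above can be assembled in Java rather than typed by hand. The class name JwsUrl and the helper buildUrl are hypothetical names of mine, not part of Axis. The only thing taken from the notes is the URL shape: a method parameter followed by the arguments i1 and i2.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class JwsUrl {

    // URL-encode one query component as UTF-8.
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Build endpoint?method=NAME&k1=v1&k2=v2 from alternating name/value pairs.
    static String buildUrl(String endpoint, String method, String... nameValuePairs) {
        StringBuilder sb = new StringBuilder(endpoint);
        sb.append("?method=").append(enc(method));
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            sb.append('&').append(enc(nameValuePairs[i]))
              .append('=').append(enc(nameValuePairs[i + 1]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildUrl(
            "http://localhost:8080/axis/Calculator.jws", "add", "i1", "1", "i2", "2"));
    }
}
```

Running it prints the same add URL listed above, which can then be pasted into a browser to check the server side before touching the client.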

Issues with the client

1) Changed the build.xml file so that it did not exclude CalcClient.

2) Downloaded two additional jars:
/jaf-1.1.1/activation.jar
javamail-1.4.1/mail.jar
From:
activation.jar
mail.jar


This eliminated the message:
Exception in thread "main" java.lang.NoClassDefFoundError: samples/userguide/example2/CalcClient

3) Changed the names of some of the jars that are in the $AXISCLASSPATH. The correct names are:
/usr/local/axis-1_4/lib/axis-ant.jar
/usr/local/axis-1_4/lib/axis.jar
/usr/local/axis-1_4/lib/commons-discovery-0.2.jar
/usr/local/axis-1_4/lib/commons-logging-1.0.4.jar
/usr/local/axis-1_4/lib/jaxrpc.jar
/usr/local/axis-1_4/lib/log4j-1.2.8.jar
/usr/local/axis-1_4/lib/saaj.jar
/usr/local/axis-1_4/lib/wsdl4j-1.5.1.jar

Some of the names on the AXIS install page are wrong. For instance, $AXIS_LIB/commons-discovery.jar is listed when it really should be $AXIS_LIB/commons-discovery-0.2.jar.

4) Moved activation.jar and mail.jar to my $AXIS_LIB (/usr/local/axis-1_4/lib) and made a new $AXISCLASSPATH, which yields the nice result below:


AXISCLASSPATH=/usr/local/axis-1_4/lib/axis-ant.jar:\
/usr/local/axis-1_4/lib/axis.jar:\
/usr/local/axis-1_4/lib/commons-discovery-0.2.jar:\
/usr/local/axis-1_4/lib/commons-logging-1.0.4.jar:\
/usr/local/axis-1_4/lib/jaxrpc.jar:\
/usr/local/axis-1_4/lib/log4j-1.2.8.jar:\
/usr/local/axis-1_4/lib/saaj.jar:\
/usr/local/axis-1_4/lib/wsdl4j-1.5.1.jar:\
/usr/local/axis-1_4/lib/activation.jar:/usr/local/axis-1_4/lib/mail.jar

java -cp .:$AXISCLASSPATH samples.userguide.example2.CalcClient add 8 34
Got result : 42
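
Since the main trouble in step 3 was jar names that did not exist on disk, a small check like the following might save time. It is a sketch of my own, not part of Axis: the class name ClasspathCheck and the method missingEntries are hypothetical. It just reports classpath entries that point at nothing.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ClasspathCheck {

    // Return every classpath entry that does not exist on disk.
    static List<String> missingEntries(String classpath) {
        List<String> missing = new ArrayList<String>();
        String[] entries = classpath.split(Pattern.quote(File.pathSeparator));
        for (String entry : entries) {
            if (entry.isEmpty()) {
                continue; // skip empty entries; there is no path to check
            }
            if (!new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Check the first argument if given, otherwise the JVM's own classpath.
        String cp = args.length > 0 ? args[0] : System.getProperty("java.class.path");
        for (String entry : missingEntries(cp)) {
            System.out.println("missing: " + entry);
        }
    }
}
```

Usage would be something like: java ClasspathCheck "$AXISCLASSPATH". A wrongly named jar such as commons-discovery.jar then shows up as a "missing:" line instead of a NoClassDefFoundError later on.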

Useful Links

Creating Web Services with Apache Axis

March 29, 2008

svn on chaucer

Location of my svn space on chaucer:

/var/svn/projects/jeff

web


svn:chaucer

February 5, 2008

Lunch with Colin: Clustering, Qmaster, Flash codec

Summary

I had lunch with Colin McFadden today and he told me how he was using clustering to increase Media Mill's throughput.

Qmaster and clustering

Apple Qmaster is a system made by Apple Inc. that provides automated work distribution and processing for high-volume projects created with certain digital visual effects software packages: Shake, Alias Maya, Final Cut Pro, Compressor, DVD Studio Pro and any UNIX command-line program. It processes such jobs on a cluster of Macintosh or Xserve computers. Colin says that the time from opening the box to having a new computer in the cluster is about 1 hour. In the end, the limiting factor will be network speed.

Compressor used by Qmaster

Compressor is a video and audio media compression and encoding application for use with Final Cut Studio and Logic Studio on Mac OS X. It can be used with Qmaster for clustering.

Codec and clustering

A video codec is a device or software that enables video compression and/or decompression for digital video. Flash uses VP6, a proprietary video codec developed by On2 Technologies and used in Adobe Flash Player 8 and above. Colin tells me that VP6 can only run a job on one box at a time.
Adobe is moving to H.264, a standard for video compression also known as MPEG-4 Part 10 or MPEG-4 AVC (Advanced Video Coding). It was written by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) as the product of a partnership effort known as the Joint Video Team (JVT). Colin explains that this codec is multi-machine aware.

December 21, 2007

Java classes to test fedora server

Compile the Java class SimpleClient.java with the class FedoraSOAPClient.java in the same directory. That is, take these steps:

$> export CLASSPATH=/usr/local/fedora/client/fedora-client.jar:.
$> javac SimpleClient.java
$> java SimpleClient

This should produce a result like this: