« Bill's Solution to the Solr problem | Main | Islandora was not properly pointing at https fedora sites »

Finding all the collections a DSPACE item belongs to

Multi-level tree in AgEcon

The list below gives an item that is under several layers in the DSPACE tree.
AgEcon Search
  Universitaet Hohenheim>
   Institute of Agricultural Policy and Agricultural MarketsĀ >
      Working PapersĀ >
        A 2004 Social Accounting Matrix for Israel>

SQL to determine names of ancestors of an item

Item to collection

Starting handle = 110156

SQL for item handle to collection handle

SELECT handle FROM handle WHERE resource_type_id=3 AND resource_id=(SELECT collection.collection_id FROM collection, collection2item, item, handle WHERE collection2item.item_id=item.item_id AND collection2item.collection_id=collection.collection_id AND handle.resource_id=item.item_id AND resource_type_id=2 AND handle = 110156) ;
collection handle: 93692

Collection Name SQL

select name from handle, collection where handle = 93692 AND resource_type_id= 3 AND collection_id = handle.resource_id;

Collection to Community

SQL for Collection handle to Community handle

SELECT handle FROM community2collection, handle WHERE handle.resource_type_id =4 AND community2collection.community_id = handle.resource_id AND community2collection.collection_id = (SELECT resource_id FROM handle WHERE resource_type_id= 3 AND handle= 93692);
community level 1: 93691

community name SQL

select name from handle, community where handle = 93691 AND resource_type_id= 4 AND community_id = handle.resource_id;

Next Community level

SELECT handle FROM community2collection, handle WHERE handle.resource_type_id =4 AND community2collection.community_id = handle.resource_id AND community2collection.collection_id = (SELECT resource_id FROM handle WHERE resource_type_id= 3 AND handle= 93691);
top 93690

Proof that 93690 is top

SELECT handle FROM community2community, handle WHERE handle.resource_type_id =4 AND community2community.parent_comm_id = handle.resource_id AND community2community.child_comm_id = (SELECT resource_id FROM handle WHERE resource_type_id= 4sss AND handle= 93690);
returns nothing

PHP code

The code below will pull the names of all the ancestors of an item.
#!/usr/bin/php 
 $value)
 echo $key.'=>'.$value."\n";

//
// Test should produce: (1 indicates a primary community, 0 is a lower level)
//
// Content-type: text/html
// X-Powered-By: PHP/4.3.9
// 
// Working Papers=>0
// Universitaet Hohenheim=>1
// Institute of Agricultural Policy and Agricultural Markets=>0


// if you are interesed in the parents of a collection call:
// process_collection 

// if you are interesed in the parents of a community call:
// process_community 


function ancestor_names($item_handle, &$ancestors){
  include("port_DSPACE_sql.inc.php");
  $dbh = open_DSPACE_DB();  // opens a connection to the DSPACE psql database

// This query will find a collection handle given an item handle
  $item_to_collection_query="SELECT handle FROM handle WHERE resource_type_id=3 AND 
          resource_id=(SELECT collection.collection_id 
          FROM collection, collection2item, item, handle 
          WHERE collection2item.item_id=item.item_id 
          AND collection2item.collection_id=collection.collection_id 
          AND handle.resource_id=item.item_id AND resource_type_id=2 
          AND handle = $item_handle) ; ";

  $result = pg_query($item_to_collection_query);
  if (!$result) {
    echo "Problem with query " . $item_to_collection_query . "\n";
    echo pg_last_error();
    exit();
  }


// Could have multiple collection parents
while ($collection_handle = pg_fetch_row($result)) {
     process_collection($collection_handle[0], $ancestors);
}
}
 
 function process_collection($collection_handle,  &$ancestors){
 $collection_title_query = "select name from handle, collection where handle = 93692 AND resource_type_id= 3 AND collection_id = handle.resource_id;";
  $result = pg_query($collection_title_query);
  if (!$result) {
    echo "Problem with query " . $collection_title_query . "\n";
    echo pg_last_error();
    exit();
  }

while ($collection_title = pg_fetch_row($result)) {
  $ancestors[$collection_title[0]] = 0;
}
   collection_to_community ($collection_handle, $ancestors);
 }

// climbs the tree from the collection to the comunity level
function collection_to_community ($collection_handle,  &$ancestors) {
  $collection_to_commuity_query="SELECT handle FROM community2collection, handle 
                                WHERE handle.resource_type_id =4 AND 
                                community2collection.community_id = handle.resource_id AND
                                community2collection.collection_id = 
                                (SELECT resource_id FROM handle WHERE 
                                resource_type_id= 3 AND handle= $collection_handle);";
  $result = pg_query($collection_to_commuity_query);
  if (!$result) {
    echo "Problem with query " . $collection_to_commuity_query . "\n";
    echo pg_last_error();
    exit();
  }

while ($community_handle = pg_fetch_row($result)) {
     process_community($community_handle[0],$ancestors);
}

}


// finds the community title and recursively finds all higher level collections
function process_community ($community_handle, &$ancestors) {
 $community_title_query = "select name from handle, community where handle = $community_handle AND resource_type_id= 4 AND community_id = handle.resource_id; ";
 
  $result = pg_query($community_title_query);
  if (!$result) {
    echo "Problem with query " . $community_title_query . "\n";
    echo pg_last_error();
    exit();
  }
$community_title="";
while ($row = pg_fetch_row($result)) {
  $community_title = $row[0] ;
}

  $community_to_community_query="SELECT handle FROM community2community, handle WHERE 
                                handle.resource_type_id =4 AND 
                                community2community.parent_comm_id = handle.resource_id AND 
                                community2community.child_comm_id = (SELECT resource_id 
                                FROM handle WHERE resource_type_id= 4 AND handle= $community_handle); ";
  $result = pg_query($community_to_community_query);
  if (!$result) {
    echo "Problem with query " . $collection_to_commuity_query . "\n";
    echo pg_last_error();
    exit();
  }
$this_community_is_top = 1;
while ($row = pg_fetch_row($result)) {
  // if we get into this loop there is a higher community
  $this_community_is_top = 0;
  process_community($row[0],$ancestors );
}
  $ancestors[$community_title] = $this_community_is_top;
}
 ?>

Every item maps to a collection

Number of item handles:
dspace_sr=> select count (handle) from handle where resource_type_id=2;
 count 
-------
 48653
Number of item links to collections:
dspace_sr=> select count (item_id) from collection2item;
 count 
-------
 48653

Communities do not have multiple parents

I also checked the community2community table and there were no duplicate child_comm_id values. This implies that the children have one and only one parent.

Post a comment

(If you haven't left a comment here before, you may need to be approved by the site owner before your comment will appear. Until then, it won't appear on the entry. Thanks for waiting.)