[EP-tech] Re: Merging two eprints using 'succeeds'

Andy Reid Andy.Reid at lshtm.ac.uk
Tue Sep 22 19:40:29 BST 2015


Just for the record, yes, that bit of XML does work - it really was that
simple :-)  

</FamousLastWords>

Andy

Bare-bones TEST script: minimal error checking and no safety checks.
Would need another section to set <metadata_visibility> on the old ID.
================================================================8<
<?php
# Usage:
# http://foo.bar.ac.uk/publications/utils/y_succeeds_x.php?oldID=1111&newID=2222

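$tmpDIR = "/tmp/";
$username = 'xxxxxxxxxxx';
$password = 'yyyyyyyyyyyy';
# Cast the IDs to integers so that request input cannot inject markup
# into the XML or stray characters into the temp file name.
$oldID = isset($_REQUEST['oldID']) ? (int) $_REQUEST['oldID'] : 0;
$newID = isset($_REQUEST['newID']) ? (int) $_REQUEST['newID'] : 0;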



$XML=<<< EOX
<?xml version="1.0" encoding="UTF-8"?>
<eprints xmlns="http://eprints.org/ep2/data/2.0">
    <!-- Not sure the id attribute is necessary, but I haven't tested
         without it; I think the CURLOPT_URL may be sufficient. -->
    <eprint id="http://foo.bar.ac.uk/id/eprint/$newID">
        <!-- ditto -->
        <eprintid>$newID</eprintid>
	    <succeeds>$oldID</succeeds>
    </eprint>
</eprints>
EOX;

$ch = curl_init();
curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);
$tmpFILESIZE=strlen($XML);
$tmpFILE=$tmpDIR.$newID.'_succeeds_'.$oldID.'.xml';
file_put_contents($tmpFILE, $XML) or die("unable to write to file $tmpFILE");
curl_setopt($ch, CURLOPT_PUT, 1);
$handle = fopen($tmpFILE, "r");
curl_setopt($ch, CURLOPT_INFILE, $handle);
curl_setopt($ch, CURLOPT_INFILESIZE, $tmpFILESIZE);

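curl_setopt($ch, CURLOPT_URL, "http://eprints.foo.bar.ac.uk/id/eprint/$newID");

curl_setopt($ch, CURLOPT_HEADER, 1);
# Have curl_exec() return the response (headers and body) as a string,
# so that $result below holds it rather than just true.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);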

# NB: this only checks that both IDs were supplied; it doesn't check that
# either eprint ID actually exists on the server!
if (!($newID && $oldID)) { die("Needs old and new eprint IDs"); }

$pkgheader = array(
    'X-Packaging: http://eprints.org/ep2/data/2.0',
    'Content-Type: text/xml',
    'Metadata-Relevant: true',
    'X-Verbose: true',
);
   
curl_setopt($ch,CURLOPT_HTTPHEADER,$pkgheader);


#########################################################
($result = curl_exec($ch)) || die("curl_exec failed: " . curl_error($ch));
#########################################################

echo "RESULT=". $result;


curl_close($ch);
fclose($handle);
unlink($tmpFILE);

?>
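
For the missing section mentioned above the script, here is a minimal
sketch of the matching payload for the old record. The helper name
build_visibility_xml(), its base URL argument and the value no_search are
assumptions for illustration (the thread writes the value as no-search;
check the field definition in your own repository). It has not been tested
against a live server.

```php
<?php
// Sketch only: builds the XML body for the *old* record.
// build_visibility_xml() is a hypothetical helper, not part of EPrints.
// The default value 'no_search' is an assumption; confirm it locally.
function build_visibility_xml($baseUrl, $eprintId, $visibility = 'no_search')
{
    $eprintId = (int) $eprintId;
    if ($eprintId <= 0) {
        throw new InvalidArgumentException('eprint ID must be a positive integer');
    }

    // Escape anything that ends up inside the XML document.
    $id  = htmlspecialchars(rtrim($baseUrl, '/') . "/id/eprint/$eprintId", ENT_XML1 | ENT_QUOTES, 'UTF-8');
    $vis = htmlspecialchars($visibility, ENT_XML1 | ENT_QUOTES, 'UTF-8');

    return <<<EOX
<?xml version="1.0" encoding="UTF-8"?>
<eprints xmlns="http://eprints.org/ep2/data/2.0">
    <eprint id="$id">
        <eprintid>$eprintId</eprintid>
        <metadata_visibility>$vis</metadata_visibility>
    </eprint>
</eprints>
EOX;
}

// Example: payload for the old record 1111.
echo build_visibility_xml('http://foo.bar.ac.uk', 1111), "\n";
```

The returned string would be PUT to the old record's /id/eprint/ URL in
the same way the script above sends the <succeeds> package.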

>>> "Andy Reid" <Andy.Reid at lshtm.ac.uk> 21 September 2015 15:11 >>>
Hi,
I'm trying to work out how to thread two eprints so that I can merge
the Accepted Manuscript record with the final published version, given
that I have two separate systems feeding into the repository: one
feeding manuscripts, one harvesting published metadata from PubMed etc.
I've looked at what happens when you create a linked new version of a
record, with <succeeds> set on the new record and <metadata_visibility>
set to no-search on the old one. I'm in the slightly different position
of wanting to merge two records once they are already in the repository,
but it seems like those are the fields I need to tweak. I've looked at
the code in
https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/DataObj/EPrint.pm
to see the various in_thread manipulation functions, and can just about
follow them.

What I can't find is an admin utility that says 'Merge these two
records, making this one the version of record, and retaining that one
as an earlier version.'  Am I not looking hard enough?
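
To make that 'merge these two' operation concrete, here is a rough sketch
of which field would change on which record. plan_merge() is a
hypothetical helper (no such utility is known to exist in EPrints), and
the value no_search is an assumption about how the visibility setting is
stored. The point is only that the roles decide the direction, not which
record has the higher ID or arrived first.

```php
<?php
// Hypothetical helper: returns the field updates needed on each record.
// Nothing here talks to a repository; it only plans the two changes.
function plan_merge($versionOfRecordId, $earlierId)
{
    $versionOfRecordId = (int) $versionOfRecordId;
    $earlierId = (int) $earlierId;
    if ($versionOfRecordId <= 0 || $earlierId <= 0 || $versionOfRecordId === $earlierId) {
        throw new InvalidArgumentException('need two distinct positive eprint IDs');
    }

    return array(
        // The version of record points back at the earlier version...
        $versionOfRecordId => array('succeeds' => (string) $earlierId),
        // ...and the earlier version is hidden from search (assumed value).
        $earlierId => array('metadata_visibility' => 'no_search'),
    );
}

// 991329 is the version of record, 991328 the earlier version.
print_r(plan_merge(991329, 991328));
```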

What I might also want to do is to trigger this from the external
systems, via SWORD, sending a couple of minimal XML packages to modify
each of the records:

<?xml version="1.0" encoding="UTF-8"?>
<eprints xmlns="http://eprints.org/ep2/data/2.0">
  <eprint id="http://blah.ac.uk/id/eprint/991329">
    <eprintid>991329</eprintid>
    <succeeds>991328</succeeds>
  </eprint>
</eprints>
and likewise for the <metadata_visibility> on the old record**

Does that seem like it ought to work? It seems too easy. Am I missing
deeper layers of subtlety that are going to get corrupted by my naive
approach?

Also, the eprint id is given twice in a standard eprint XML export, as
per the edited code above. Is it necessary to have both to trigger an
update correctly?

Andy Reid
** (I say 'old record' but in reality the published versions may arrive
before the manuscript versions, which depend on the authors sending them
and us having the resources to process them ;-/ )

Andy Reid
Research Information Manager
Room G43, Executive Office
London School of Hygiene & Tropical Medicine
Keppel St
LONDON WC1E 7HT
+44 020-7927-2618
http://orcid.org/0000-0002-2500-2980

