[EP-tech] OAI harvesting / records moving from live to buffer (or other non-deletion datasets).

John Salter J.Salter at leeds.ac.uk
Mon Jun 15 14:13:25 BST 2015

I'm trying to work out an elegant solution to this issue:
1. An EPrint is made live.
2. The EPrint is harvested over OAI-PMH.
3. The EPrint is moved to a non-live dataset (e.g. buffer, or in this specific case dark-archive).
4. The EPrint is no longer available publicly, but an OAI-PMH harvest will not see the record as deleted, and will therefore not remove it.

I've checked with OAI-PMH gurus, and they think that just flagging the record as deleted will be fine: if the record subsequently reappears, it should simply get re-harvested.
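
To illustrate why flagging should be safe, here is a minimal sketch of how a harvester might apply record headers to its local copy. It is written in Python for brevity and is not taken from any real harvester. The names apply_headers and make_response, and the identifier oai:example.org:123, are made up for the example. The only protocol detail relied on is that OAI-PMH marks a deleted record with status="deleted" on its header element.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def make_response(identifier, datestamp, deleted=False):
    """Build a tiny ListIdentifiers-style response (hypothetical test helper)."""
    status = ' status="deleted"' if deleted else ""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
        "<ListIdentifiers>"
        f"<header{status}>"
        f"<identifier>{identifier}</identifier>"
        f"<datestamp>{datestamp}</datestamp>"
        "</header>"
        "</ListIdentifiers>"
        "</OAI-PMH>"
    )


def apply_headers(xml_text, local_store):
    """Update local_store (identifier -> datestamp) from one response."""
    root = ET.fromstring(xml_text)
    for header in root.iter(OAI_NS + "header"):
        identifier = header.findtext(OAI_NS + "identifier")
        datestamp = header.findtext(OAI_NS + "datestamp")
        if header.get("status") == "deleted":
            # Record was withdrawn: drop our copy if we have one.
            local_store.pop(identifier, None)
        else:
            # New or changed record: (re-)harvest it.
            local_store[identifier] = datestamp
    return local_store
```

With a harvester shaped like this, a record that is flagged as deleted and later made live again is simply picked up on the next harvest, because its header comes back without the deleted status and with a newer datestamp.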

I think the solution is to add a filter to the OAI-PMH searches that looks for EPrints that have a datestamp (set when the item was first made live) but aren't in the 'archive' dataset.
To achieve this, the methods in EPrints::OpenArchives that currently check for 'deletion' status will need to be tweaked, and filters for 'has datestamp' added to cgi/oai2 or to $c->{oai}->{filters}.
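
As a rough sketch of the rule being proposed (in Python rather than Perl, and not actual EPrints code): the Record class and the oai_status function are hypothetical names, and treating never-live items as invisible to harvesters is my assumption about the desired behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    eprintid: int
    dataset: str              # e.g. 'inbox', 'buffer', 'archive', 'deletion'
    datestamp: Optional[str]  # set when the item was first made live


def oai_status(record: Record) -> Optional[str]:
    """Return how the record should appear over OAI-PMH under the proposal.

    'live'    -> expose the full record
    'deleted' -> expose a header flagged as deleted
    None      -> do not expose the record at all
    """
    if record.dataset == "archive":
        return "live"
    if record.datestamp is not None:
        # Was live at some point, so harvesters may hold a copy:
        # tell them to drop it, whichever non-live dataset it is in now.
        return "deleted"
    # Never made live (assumption: nothing to tell harvesters about).
    return None
```

For example, an item moved from 'archive' to 'buffer' keeps its datestamp and so would be reported as deleted, while a fresh deposit still sitting in 'buffer' has no datestamp and would not appear at all.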

Has anyone else come across this issue and found an elegant solution - or can see any issues with this proposal?

