[EP-tech] help on configuring OAI-PMH harvester in eprints 3.3.14

John Salter J.Salter at leeds.ac.uk
Thu Jun 29 11:41:48 BST 2017

Hi Alfredo,
I haven't used the harvester, but from a quick look at the code, the 'http://idei.fr' block renders any matching OAI identifiers as links to that service.
Unless you are harvesting from idei.fr, this code will not do anything (so you can leave it as it is, though you might want to extend it for the sources you are harvesting from).

Looking at the bin script, you need to set up some configuration before you can run a harvest.
I would copy the commented-out block (starting with $c->{oai_harvester}->{stub} ) at the end of oai_harvester.pl into [EPRINTS_ROOT]/archives/[ARCHIVEID]/cfg/cfg.d/z_oai_harvest_ABC.pl
- where 'ABC' is a name you would associate with the repository you are harvesting.

Now un-comment at least the 'url' line - and if necessary, add other values.
For testing, it might be worth adding a known set / time period so you can collect a small number of records from the source - e.g.
$c->{oai_harvester}->{ABC} = {
        url => 'http://ABC.com/oai',          # compulsory
        set => 'driver',                      # optional
        from => '2017-06-29',                 # optional, format is YYYY-MM-DD
#       'until' => '2011-12-31',              # optional
#       metadataPrefix => 'oai_dc',           # optional, should be set by the OAIPMH/* plugin
#       default_values => sub {               # optional, gives a chance to set default values
#               my( $session, $epdata, $header ) = @_;
#               $epdata->{userid} = 1234;
#               $epdata->{eprint_status} = 'archive';
#               $epdata->{FIELDNAME} = VALUE;
#       },
};

With this configuration in place, I think you should be able to do:
> bin/harvest ARCHIVE_ID --conf=ABC --plugin=OAIPMH::OAI_DC
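
For a first test run it may be safer to enable default_values so that harvested records do not go straight into the live archive. The following is only a sketch of what the un-commented block could look like: the userid of 1 and the 'buffer' status are assumptions to adapt to your own repository, 'ABC' and the URL are the placeholders used above, and $c is the configuration hash that EPrints makes available to files in cfg.d.

```perl
# Sketch for [EPRINTS_ROOT]/archives/[ARCHIVEID]/cfg/cfg.d/z_oai_harvest_ABC.pl
# 'ABC' and the URL are placeholders; $c is supplied by EPrints when it loads cfg.d files.
$c->{oai_harvester}->{ABC} = {
        url  => 'http://ABC.com/oai',   # compulsory: OAI-PMH base URL of the source
        set  => 'driver',               # optional: restrict the harvest to one set
        from => '2017-06-29',           # optional: YYYY-MM-DD, keeps the test harvest small
        default_values => sub {
                my( $session, $epdata, $header ) = @_;
                $epdata->{userid} = 1;                  # assumption: an existing user who will own the records
                $epdata->{eprint_status} = 'buffer';    # assumption: hold records for review instead of 'archive'
        },
};
```

Once the test records look right, the from and set limits can be removed or widened.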

As I said - I've never used this - and the above is from a quick skim-read of the code, but I hope this gets you started!


PS I'll look at your other email now :o)

From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Alfredo Cosco
Sent: 29 June 2017 10:00
To: eprints-tech at ecs.soton.ac.uk
Subject: [EP-tech] help on configuring OAI-PMH harvester in eprints 3.3.14

Hi all,
I have to configure the harvester module in EPrints 3.3.14, but the documentation is quite inconsistent and I need a little help.

I downloaded the module from this link: http://files.eprints.org/798/

Has anyone managed to install and use this feature?

In the cfg/cfg.d/oai_harvester.pl file, before the sample, there is a piece of code that points to http://idei.fr/. Is it a sample too that has to be configured, or should I leave it as it is?
