[EP-tech] DSpace Harvester and OAI_Bibliography.pm

Yuri yurj at alfa.it
Fri Jun 19 06:51:18 BST 2020


You're right.

OAI plugins work this way:

For every archive record, and for every plugin, the OAI interface produces that plugin's metadata format. But OAI_Bibliography works only for items that have a bibliography, not for all records. So it should not be used as a generic OAI plugin, which is expected to return valid metadata for every item. OAI_Bibliography returns valid metadata only for items with a bibliography.
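
To make the mismatch concrete, here is a rough model of the behaviour described above. It is only a sketch in Python, not EPrints code: the Record class, the two export functions and the XML they emit are all made up for illustration.

```python
# Illustrative model only: none of these names come from EPrints itself.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Record:
    eprintid: int
    title: str
    # Bibliography entries; empty for most records (assumption of this sketch).
    references: List[str] = field(default_factory=list)


def export_oai_dc(rec: Record) -> Optional[str]:
    # Every record has a title, so this format always yields metadata.
    return f"<dc:title>{rec.title}</dc:title>"


def export_oai_bibl(rec: Record) -> Optional[str]:
    # Only records that carry a bibliography produce any metadata here.
    if not rec.references:
        return None
    return "".join(f"<dc:relation>{r}</dc:relation>" for r in rec.references)


# Both formats are advertised for the whole archive, whatever each record holds.
PLUGINS: Dict[str, Callable[[Record], Optional[str]]] = {
    "oai_bibl": export_oai_bibl,
    "oai_dc": export_oai_dc,
}


def list_records(records: List[Record], prefix: str) -> Dict[int, Optional[str]]:
    """Run the plugin registered for `prefix` over every record."""
    plugin = PLUGINS[prefix]
    return {rec.eprintid: plugin(rec) for rec in records}


records = [Record(1, "Paper A", ["Smith 2001"]), Record(2, "Paper B")]
```

In this model, list_records(records, "oai_dc") gives metadata for both records, while list_records(records, "oai_bibl") gives None for record 2, which corresponds to the empty records a harvester sees.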

Since it is a plugin, you can disable it in the config:

$c->{plugins}{"Export::OAI_Bibliography"}{params}{disable} = 1;

I think this should be the default setting; it may be worth a pull request against the git repository here:

https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/plugins.pl

On 18/06/20 20:50, Tomasz Neugebauer wrote:
> Hi Yuri, thank you for the detailed info. Yes, it looks like an issue with the DSpace harvester.
>
> The issue did make me think about our oai_bibl metadata prefix, though: is the OAI_Bibliography.pm file doing anything useful if it advertises a metadata prefix in the OAI endpoint that returns empty records? If anyone has any comments on this, that's great, but the harvester question is resolved AFAIK: it should be requesting the specific prefix oai_dc.
>
> Tomasz
>
>
> -----Original Message-----
> From: eprints-tech-bounces at ecs.soton.ac.uk <eprints-tech-bounces at ecs.soton.ac.uk> On Behalf Of Yuri via Eprints-tech
> Sent: June 18, 2020 4:51 AM
> To: eprints-tech at ecs.soton.ac.uk
> Subject: Re: [EP-tech] DSpace Harvester and OAI_Bibliography.pm
>
> I would exclude this format/plugin from oai2 in:
>
> https://github.com/eprints/eprints/blob/3.3/cgi/oai2#L559
>
> or, as a weaker workaround, you can change the sort order here:
>
> https://github.com/eprints/eprints/blob/3.3/cgi/oai2#L565
>
> I think this is an issue in DSpace: it should always use oai_dc as the default format instead of checking the schema (the OAI specs name oai_dc).
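
As a sketch of the difference between the two selection strategies, the Python below parses a ListMetadataFormats response like the one quoted further down and compares "first format whose schema matches" with "ask for oai_dc by name". The function names are invented for this example, and the schema-matching function is only my reading of the harvester behaviour reported in this thread, not DSpace's actual code.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_SCHEMA = "http://www.openarchives.org/OAI/2.0/oai_dc.xsd"

# Trimmed response modelled on the one quoted in this thread.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_bibl</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def advertised_formats(xml_text: str) -> List[Tuple[Optional[str], Optional[str]]]:
    """Return (metadataPrefix, schema) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [
        (fmt.findtext(OAI_NS + "metadataPrefix"), fmt.findtext(OAI_NS + "schema"))
        for fmt in root.iter(OAI_NS + "metadataFormat")
    ]


def first_match_by_schema(formats, schema: str = DC_SCHEMA) -> Optional[str]:
    # Hypothetical model of the reported behaviour: take the first prefix
    # whose schema is the Dublin Core one.
    for prefix, fmt_schema in formats:
        if fmt_schema == schema:
            return prefix
    return None


def prefer_oai_dc(formats) -> Optional[str]:
    # Ask for the prefix by name, as suggested above.
    return "oai_dc" if any(prefix == "oai_dc" for prefix, _ in formats) else None


formats = advertised_formats(SAMPLE)
```

With this sample, first_match_by_schema(formats) picks oai_bibl because it is listed first with the same schema, while prefer_oai_dc(formats) picks oai_dc regardless of ordering.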

>
> On 17/06/20 20:22, Tomasz Neugebauer via Eprints-tech wrote:
>> Hi everyone... in attempting to harvest some EPrints repositories
>> using DSpace harvester, the following issue was reported in 2016:
>>
>> http://dspace.2283337.n4.nabble.com/Harvesting-EPrints-repository-from-DSpace-td4681086.html
>>
>> "What happens in this case is that EPrints has more than one entry for
>> the supported metadata formats using OAI_DC (oai_bibl and oai_dc
>> prefixes):
>>
>> .
>> <metadataFormat>
>>    <metadataPrefix>oai_bibl</metadataPrefix>
>>    <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
>>    <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
>> </metadataFormat>
>> <metadataFormat>
>>    <metadataPrefix>oai_dc</metadataPrefix>
>>    <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
>>    <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
>> </metadataFormat>
>> .
>>
>> DSpace's harvester is then selecting the first metadataPrefix, i.e.
>> oai_bibl, for which EPrints is returning records with no metadata."
>>
>> Someone is having a similar issue now with EPrints repositories, so
>> I'm wondering, is this still an issue, or was there a fix/modification
>> added to EPrints for this?
>>
>> I haven't tried the solution to remove OAI_Bibliography.pm from the
>> core files.
>>
>> Tomasz
>>
>>
>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>> *** Archive: http://www.eprints.org/tech.php/
>> *** EPrints community wiki: http://wiki.eprints.org/


