[EP-tech] DSpace Harvester and OAI_Bibliography.pm

Tomasz Neugebauer Tomasz.Neugebauer at concordia.ca
Thu Jun 18 19:50:58 BST 2020


Hi Yuri, thank you for the detailed info. Yes, it looks like an issue with the DSpace harvester.

The issue did make me think about our oai_bibl metadata prefix, though: is the OAI_Bibliography.pm file doing anything useful if it advertises a metadata prefix in the OAI endpoint that returns empty records? If anyone has comments on this, that would be great, but as far as I know the harvester question is resolved: the harvester should request the specific prefix oai_dc.
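
To make the "request a specific prefix" point concrete, here is a minimal Python sketch of a ListRecords request that names oai_dc explicitly. The endpoint URL and the helper name list_records_url are placeholders for illustration only; verb and metadataPrefix are the standard OAI-PMH request parameters.

```python
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords URL that names the metadata prefix explicitly."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hypothetical endpoint, used only for illustration.
print(list_records_url("https://repository.example.org/cgi/oai2"))
# prints https://repository.example.org/cgi/oai2?verb=ListRecords&metadataPrefix=oai_dc
```

Naming the prefix in the request avoids depending on the order in which the repository lists its formats.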

Tomasz


-----Original Message-----
From: eprints-tech-bounces at ecs.soton.ac.uk <eprints-tech-bounces at ecs.soton.ac.uk> On Behalf Of Yuri via Eprints-tech
Sent: June 18, 2020 4:51 AM
To: eprints-tech at ecs.soton.ac.uk
Subject: Re: [EP-tech] DSpace Harvester and OAI_Bibliography.pm

I would exclude this format/plugin from oai2 in:

https://github.com/eprints/eprints/blob/3.3/cgi/oai2#L559

or, as a weaker workaround, you can change the sort here:

https://github.com/eprints/eprints/blob/3.3/cgi/oai2#L565

I think this is an issue in DSpace: it should always use oai_dc as the default format instead of checking the schema, since oai_dc is the format the OAI specs cite.
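
As a rough illustration of that point (this is not DSpace's actual code), the Python sketch below picks the format by its metadataPrefix rather than by the first entry whose schema matches. The sample response is a trimmed version of the ListMetadataFormats output quoted further down, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Trimmed sample response: two formats sharing the oai_dc schema.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_bibl</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def advertised_prefixes(xml_text):
    """Return every metadataPrefix in the response, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text.strip()
            for el in root.iterfind(".//oai:metadataPrefix", OAI_NS)
            if el.text]


def choose_prefix(xml_text, preferred="oai_dc"):
    """Return the preferred prefix if advertised, else None.

    Matching on the prefix itself avoids picking oai_bibl just because
    it appears first and declares the same schema as oai_dc.
    """
    return preferred if preferred in advertised_prefixes(xml_text) else None


print(advertised_prefixes(SAMPLE))  # ['oai_bibl', 'oai_dc']
print(choose_prefix(SAMPLE))        # oai_dc
```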

On 17/06/20 20:22, Tomasz Neugebauer via Eprints-tech wrote:
>
> Hi everyone... In attempting to harvest some EPrints repositories
> using the DSpace harvester, the following issue was reported in 2016:
>
> http://dspace.2283337.n4.nabble.com/Harvesting-EPrints-repository-from-DSpace-td4681086.html
>
> "What happens in this case is that EPrints has more than one entry for 
> the supported metadata formats using OAI_DC (oai_bibl and oai_dc
> prefixes):
>
> .
> <metadataFormat>
>   <metadataPrefix>oai_bibl</metadataPrefix>
>   <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
>   <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
> </metadataFormat>
> <metadataFormat>
>   <metadataPrefix>oai_dc</metadataPrefix>
>   <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
>   <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
> </metadataFormat>
> .
>
> DSpace's harvester is then selecting the first metadataPrefix, i.e. 
> oai_bibl, for which EPrints is returning records with no metadata."
>
> Someone is having a similar issue now with EPrints repositories, so 
> I'm wondering, is this still an issue, or was there a fix/modification 
> added to EPrints for this?
>
> I haven't tried the suggested solution of removing
> OAI_Bibliography.pm from the core files.
>
> Tomasz
>
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
