[EP-tech] Antwort: Antwort: Re: Antwort: Re: fail to import PubMedID

martin.braendle at id.uzh.ch martin.braendle at id.uzh.ch
Wed Nov 9 07:54:33 GMT 2016


Hi,

Thanks also to Adam Field, who reviewed my code and provided useful
suggestions.

If you have not done so yet, you should get the latest revision from
https://github.com/eprintsug/PubMedID-Import . It returns an XML error
code if the NCBI server fails.

Jens has also updated the metadata_update script, which uses PubMed as
well, and provides it at the URL above.

Regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich




From:	Hiroshi Watabe <hwatabe at m.tohoku.ac.jp>
To:	eprints-tech at ecs.soton.ac.uk
Date:	09/11/2016 01:03
Subject:	Re: [EP-tech] Antwort: Antwort: Re: Antwort: Re: fail to import PubMedID
Sent by:	eprints-tech-bounces at ecs.soton.ac.uk



Dear Martin,

Thank you for your code. It now works for me (although I have to skip the
duplicate check because my table does not have eprint.pubmedid).

Regards,

Hiroshi

On Tue, 8 Nov 2016 12:52:43 +0100 martin.braendle at id.uzh.ch wrote:

> I have published our version of the PubMedID Import plugin to
>
> https://github.com/eprintsug/PubMedID-Import
>
> It has been updated to cope with the https protocol that NCBI uses
> and also contains some code that does a duplicate check in the
> EPrints repo. See also attached phrases files (English and German).
>
> Feel free to use whatever you find useful from this code for your
> implementation.
>
> Best regards,
>
> Martin
>
> --
> Dr. Martin Brändle
> Zentrale Informatik
> Universität Zürich
> Stampfenbachstr. 73
> CH-8006 Zürich
>
> mail: martin.braendle at id.uzh.ch
> phone: +41 44 63 56705
> fax: +41 44 63 54505
> http://www.zi.uzh.ch
>
>
>
> From: jens.vieler at id.uzh.ch
> To: eprints-tech at ecs.soton.ac.uk
> Date: 07/11/2016 16:05
> Subject: [EP-tech] Antwort: Re:  Antwort: Re:  fail to import PubMedID
> Sent by: eprints-tech-bounces at ecs.soton.ac.uk
>
>
>
> I think the problem is more general: XML::LibXML can't deal with https.
> The relevant code is in perl_lib/EPrints/XML/LibXML.pm (line 69), and
> 'XML::LibXML->new ();' is the wrong parser for our needs.
>
> What would you suggest? Changing Import/PubMedID.pm and
> bin/metadata_update from something like
>
> EPrints::XML::parse_url( $url );
>
> to something like
>
> - using LWP to retrieve it
> - then LibXML to decode it to xml
>
> or create a more general and new EPrints::XML module?
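
The first option (retrieve with LWP, then hand the result to LibXML) might look roughly like the sketch below. It is not the code from the PubMedID-Import repository. The helper names fetch_xml_bytes and parse_xml_bytes are made up for illustration, and fetching https URLs with LWP assumes that LWP::Protocol::https is installed.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use XML::LibXML;

# Hypothetical helper: fetch a URL with LWP and return the response body.
# https URLs only work if LWP::Protocol::https is installed.
sub fetch_xml_bytes
{
    my( $url ) = @_;

    my $ua = LWP::UserAgent->new( timeout => 30 );
    my $response = $ua->get( $url );
    die "Request failed: " . $response->status_line . "\n"
        unless $response->is_success;

    # Undo any transfer encoding but leave the character encoding alone,
    # so that libxml2 can honour the XML declaration itself.
    return $response->decoded_content( charset => 'none' );
}

# Hypothetical helper: parse an XML string into an XML::LibXML::Document.
sub parse_xml_bytes
{
    my( $xml ) = @_;
    return XML::LibXML->load_xml( string => $xml );
}

# Possible usage in place of a direct EPrints::XML::parse_url( $url ) call
# (not run here, since it needs network access):
# my $doc = parse_xml_bytes( fetch_xml_bytes( $url ) );
```

Keeping the download and the parsing in separate steps means the parser never has to open a URL itself, which is the part that fails for https.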
>
> Workarounds or other quick-and-dirty fixes are also welcome.
>
> Jens
>
>
>
> --
> Jens Vieler
> Zentrale Informatik
> Universität Zürich
> Stampfenbachstrasse 73
> CH-8006 Zürich
>
> mail:  jens.vieler at id.uzh.ch
> phone: +41 44 63 56777
> http://www.id.uzh.ch
>
>
> From: Adam Field <Adam.Field at jisc.ac.uk>
> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Date: 07.11.2016 14:39
> Subject: Re: [EP-tech] Antwort: Re:  fail to import PubMedID
> Sent by: eprints-tech-bounces at ecs.soton.ac.uk
>
>
>
> ….on, incidentally, it’s this line:
>
> https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Plugin/Import/PubMedID.pm#L58
>
> Adam Field
> SHERPA services analyst developer
>
>
>
>
> From: Adam Field <Adam.Field at jisc.ac.uk>
> Date: Monday, 7 November 2016 13:32
> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Subject: Re: [EP-tech] Antwort: Re: fail to import PubMedID
>
> I can confirm this – I can also download the metadata via https using
> curl.
>
> Jens’ suggestions are good.  We should be able to respond to this
> kind of thing as a community – it’s a non-core, simple bug.  I’m
> happy to offer advice, code review and testing if anyone wants to
> give it a stab. Alternatively, is there anyone out there who can
> offer me the same if I take a stab?
>
> Best
>
>
>
> Adam Field
> SHERPA services analyst developer
>
>
>
>
> From: <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of
> "jens.vieler at id.uzh.ch" <jens.vieler at id.uzh.ch>
> Reply-To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Date: Monday, 7 November 2016 10:45
> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Subject: [EP-tech] Antwort: Re: fail to import PubMedID
>
>
>
> Dear Adam, Hiroshi, List
>
> We have been seeing the same since this morning #-) ... they changed to
> https this weekend.
>
> Fetching the https URL with wget works fine, but we cannot simply change
> the protocol in our script, because it seems LibXML can't handle it. So
> what about fetching the https URL outside the script and changing
> parse_url to parse_file on the resulting local file? Or switching to
> LWP::Protocol::https?
>
> Jens
>
>
> --
> Jens Vieler
> Zentrale Informatik
> Universität Zürich
> Stampfenbachstrasse 73
> CH-8006 Zürich
>
> mail:  jens.vieler at id.uzh.ch
> phone: +41 44 63 56777
> http://www.id.uzh.ch
>
>
> From: Adam Field <Adam.Field at jisc.ac.uk>
> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Date: 07.11.2016 11:30
> Subject: Re: [EP-tech] fail to import PubMedID
> Sent by: eprints-tech-bounces at ecs.soton.ac.uk
>
>
>
>
> Visiting the URL, I get:
>
> <eFetchResult>
> <ERROR>WebEnv parameter is required</ERROR>
> </eFetchResult>
>
> If I add a dummy WebEnv parameter, I get:
>
> <eFetchResult>
> <ERROR>query_key parameter is required</ERROR>
> </eFetchResult>
>
> …it looks like the API the plugin is using has changed :(  It’s
> unlikely to be a local problem.
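
A caller could detect replies like the two above before attempting an import. The following is only a sketch under assumptions: the helper name efetch_error is made up, and the element names are taken from the replies quoted above rather than from the NCBI documentation.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return the text of the first <ERROR> element of an
# eFetchResult document, or undef if the document contains none.
sub efetch_error
{
    my( $xml ) = @_;

    my $doc = XML::LibXML->load_xml( string => $xml );
    my( $error ) = $doc->findnodes( '/eFetchResult/ERROR' );

    return defined $error ? $error->textContent : undef;
}

# Sample reply copied from the message above.
my $reply = '<eFetchResult><ERROR>WebEnv parameter is required</ERROR></eFetchResult>';

my $message = efetch_error( $reply );
print "NCBI returned an error: $message\n" if defined $message;
```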
>
>
>
>
> Adam Field
> SHERPA services analyst developer
>
>
>
>
> From: <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Hiroshi
> Watabe <hwatabe at m.tohoku.ac.jp>
> Organization: CYRIC
> Reply-To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Date: Monday, 7 November 2016 01:27
> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Subject: [EP-tech] fail to import PubMedID
>
> Dear all,
>
> It seems PubMed only accepts https now and I cannot import PubMed ID
> anymore. I got the following warning message.
> Unhandled warning in Import::PubMedID: http error : Unknown IO error
>
> I modified PubMedID.pm as follows, but without success.
> 27c27
> <       $self->{EFETCH_URL} = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=xml&rettype=full';
> ---
> >       $self->{EFETCH_URL} = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=xml&rettype=full';
>
> The error message is as follows:
> Unhandled exception in Import::PubMedID: Could not create file parser
> context for file
>
> Could you help me?
>
> Hiroshi
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
> *** EPrints developers Forum: http://forum.eprints.org/

