[EP-tech] Re: ISI Citation Data Import Script

Tian, Jia J.Tian at kingston.ac.uk
Wed Feb 20 14:15:13 GMT 2013


Dear Tim,

That is weird. I am sure the premium service result is embedded in a CDATA section, unless we are testing against different endpoints.

Here is the premium search web service endpoint we are testing against:

http://search.webofknowledge.com/esti/wokmws/ws/WokSearch

Although we haven't actually subscribed to the premium service, we are still able to access it. Is that a different service from the one you are using?

The response message I got is like the attachment (I just ran a fresh test in SoapUI, so it should be up to date). You can search for the nodes named "email_addr"; there are several, and they do contain the Kingston authors' valid email addresses.
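
A minimal sketch of how the "email_addr" nodes could be pulled out of the records payload once it has been extracted from the SOAP response. The sample XML, the "REC" and "UID" element names, and the email address are made up for illustration; the real response may use XML namespaces, in which case the tag lookups would need namespace-qualified names.

```python
import xml.etree.ElementTree as ET

# Hypothetical records payload; structure and values are illustrative only.
RECORDS_XML = """
<records>
  <REC>
    <UID>WOS:000000000000001</UID>
    <addresses><email_addr>a.author@example.org</email_addr></addresses>
  </REC>
  <REC>
    <UID>WOS:000000000000002</UID>
  </REC>
</records>
""".strip()


def emails_by_record(records_xml):
    """Map each record's UID to the list of email_addr values found in it."""
    root = ET.fromstring(records_xml)
    result = {}
    for rec in root.iter("REC"):
        uid = rec.findtext("UID")
        # iter() searches at any depth below the record element.
        result[uid] = [e.text.strip() for e in rec.iter("email_addr") if e.text]
    return result


found = emails_by_record(RECORDS_XML)
for uid, emails in found.items():
    print(uid, emails)
```

A record with no "email_addr" node simply maps to an empty list, which makes it easy to see which records would still need an address added by hand.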

We are not actually using the premium service, but I am a bit puzzled as to why we got different search responses.

Best Wishes,
Jia


Jia Tian
Systems Analyst, Infrastructure, Information Services

T   Internal: 62079
T   020 8417 2079

Kingston University London
Penrhyn Road, Kingston upon Thames KT1 2EE
www.kingston.ac.uk



-----Original Message-----
From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Tim Brody
Sent: 20 February 2013 13:22
To: eprints-tech at ecs.soton.ac.uk
Subject: [EP-tech] Re: ISI Citation Data Import Script

On Wed, 2013-02-20 at 12:30 +0000, Tian, Jia wrote:
> Dear Tim,
> 
> I have developed a SOAP client against the Lite Service based on the  
> CPAN module "SOAP::WSDL".
>  http://search.cpan.org/~mkutter/SOAP-WSDL-2.00.10/lib/SOAP/WSDL.pm
> 
> The main difference I see between Premium and Lite is the data
> encapsulation. The search results returned by the Lite service are in
> plain XML nodes, while the results from Premium are encapsulated in
> CDATA. Parsing the CDATA is very painful, though I succeeded. Also,
> two key metadata fields are missing from the new Lite search
> compared with the old WoS service (as I understand it, there was no
> Premium service before):

Here's a sample from premium:
http://users.ecs.soton.ac.uk/tdb2/eprints/records.xml
note: I've stripped r_id_disclaimer (ISI repeat it, causing broken XML) and pretty-printed

These are embedded in the SOAP response as escaped text (not CDATA sections).
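
A small sketch of why the distinction may matter less than it seems: to an XML parser, a CDATA section and entity-escaped text decode to the same character data, so either form can be read as a string and then parsed a second time. The "return" and "records" wrapper names and the inner payload here are assumptions for illustration, not the actual WoS response layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical inner payload; the element names are illustrative only.
INNER = "<records><REC><title>Example</title></REC></records>"

# The same payload embedded two ways inside a made-up <return> wrapper.
as_cdata = "<return><records><![CDATA[" + INNER + "]]></records></return>"
as_escaped = (
    "<return><records>"
    + INNER.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    + "</records></return>"
)


def embedded_text(envelope):
    """Return the character data of the <records> child element."""
    return ET.fromstring(envelope).findtext("records")


cdata_text = embedded_text(as_cdata)
escaped_text = embedded_text(as_escaped)

# Both forms decode to the same string, which can then be parsed again.
same = cdata_text == escaped_text
inner_root = ET.fromstring(cdata_text)
print(same, inner_root.tag)
```

The difference only shows up when the raw response is handled as text (for example with regular expressions) rather than through a parser.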

> 1, The record type is missing in Lite service. So all the records are  
> imported as "Article". Our editors need to sort them out by hand once  
> records are imported.
>
> 2, Author's email addresses are missing in Lite service. So editors  
> need to add them by hand.

It doesn't look like premium gives this either. I've done a search for field "email_addr" (which restricts to records containing that) and the record itself doesn't contain an email address.

> Of course, there are more metadata provided in the premium service,  
> such as physical addresses of authors, sponsors of projects, abstract,  
> etc. However, our repository is not very interested in those. We are
> now arguing with WoS about the record type metadata, as its removal
> actually downgrades the service level we had before.

Hope that helps.

/Tim.

This email has been scanned for all viruses by the MessageLabs Email Security System.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: WoS_premium_response_Kingston.xml
Type: text/xml
Size: 473543 bytes
Desc: WoS_premium_response_Kingston.xml
Url : http://mailman.ecs.soton.ac.uk/pipermail/eprints-tech/attachments/20130220/478a258f/attachment-0001.xml 

