[EP-tech] Re: harvester (question)
John Salter
J.Salter at leeds.ac.uk
Mon Mar 3 09:07:59 GMT 2014
Hi Jean-Marie,
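I think it's probably a namespace problem (see references below): the elements in your record are in the tef namespace, so an XPath without the prefix will not match them. The trailing /* also selects child elements of nom, and nom contains only text, so I have left it off. If you try
$xml->findnodes( "//tef:auteur/tef:nom" )
do you get any results?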
You could also do this via XSLT, if you have any experience with it.
I’m guessing it’s something like this you’re starting with: http://www.abes.fr/abes/documents/tef/recommandation/ex1_theseSimplePDF.xml
These might explain a bit more about namespaces:
http://stackoverflow.com/a/4083929/2455451
http://stackoverflow.com/questions/2673370/why-should-i-use-xpathcontext-with-perls-xmllibxml/2673452#2673452
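In case it helps, here is a minimal sketch of the XPathContext approach those answers describe. It is not tested against your harvester: the sample record is a simplified version of your fragment, and the namespace URI is an assumption on my part, so copy the real one from the xmlns:tef declaration in your harvested record. The variable names ($TEF_NS, @authors, @directors) are just for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Assumed namespace URI: replace with the value of xmlns:tef in your record.
my $TEF_NS = 'http://www.abes.fr/abes/documents/tef';

# Simplified sample record modelled on the fragment in the question.
my $record = <<"XML";
<tef:thesisAdmin xmlns:tef="$TEF_NS">
  <tef:auteur>
    <tef:nom>nom1</tef:nom>
  </tef:auteur>
  <tef:directeurThese>
    <tef:nom>nom2</tef:nom>
    <tef:prenom>Carine</tef:prenom>
  </tef:directeurThese>
  <tef:directeurThese>
    <tef:nom>nom3</tef:nom>
    <tef:prenom>Louise</tef:prenom>
  </tef:directeurThese>
</tef:thesisAdmin>
XML

my $doc = XML::LibXML->load_xml( string => $record );

# Bind the "tef" prefix used in the XPath expressions to the namespace URI.
my $xpc = XML::LibXML::XPathContext->new( $doc );
$xpc->registerNs( tef => $TEF_NS );

# Author names: only <tef:nom> elements that sit inside <tef:auteur>.
my @authors = map { $_->textContent } $xpc->findnodes( '//tef:auteur/tef:nom' );

# Directors: loop over each <tef:directeurThese> and read its own children,
# so the nom and prenom of one person stay together.
my @directors;
foreach my $dir ( $xpc->findnodes( '//tef:directeurThese' ) )
{
    push @directors, {
        nom    => $xpc->findvalue( 'tef:nom', $dir ),
        prenom => $xpc->findvalue( 'tef:prenom', $dir ),
    };
}

print "author: $_\n" for @authors;
print "director: $_->{prenom} $_->{nom}\n" for @directors;
```

Putting the parent element in the XPath (tef:auteur versus tef:directeurThese) is what distinguishes the different tef:nom elements, and collecting into arrays keeps every match rather than only the last one.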
Cheers,
John
From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Jean-Marie Le Bechec
Sent: 03 March 2014 08:18
To: eprints-tech at ecs.soton.ac.uk
Subject: [EP-tech] harvester (question)
hi Seb,
I need to harvest an OAI server in a format other than Dublin Core (the TEF format). I cannot extract specific metadata when different elements share the same name.
For example:
...
<tef:thesisAdmin>
  <tef:auteur>
    <tef:nom>nom1</tef:nom>
...
and
...
<tef:directeurThese>
  <tef:nom>nom2</tef:nom>
  <tef:prenom>Carine</tef:prenom>
  <tef:autoriteInterne>MADS_DIRECTEUR_DE_THESE_1</tef:autoriteInterne>
  <tef:autoriteExterne autoriteSource="Sudoc">073367826</tef:autoriteExterne>
</tef:directeurThese>
<tef:directeurThese>
  <tef:nom>nom3</tef:nom>
  <tef:prenom>Louise</tef:prenom>
  <tef:autoriteInterne>MADS_DIRECTEUR_DE_THESE_2</tef:autoriteInterne>
  <tef:autoriteExterne autoriteSource="Sudoc">035036672</tef:autoriteExterne>
</tef:directeurThese>
...
in the same record!
I need to extract all of this data.
I tried things like:
my $nom;
foreach my $node ($xml->findnodes( "//auteur/nom/*" ))
{
    $nom = $node->textContent;
}
but it does not work (no results).
Any ideas?
Thanks!
Jean-Marie
--
***********************************************
Jean Marie Le Bechec
Service Commun de la Documentation
Responsable ingenierie documentaire
&
Direction du Systeme d'Information
Referent Etudes
Institut National Polytechnique de Toulouse
6 allee Emile Monso - bp 34038 -
31029 Toulouse cedex 4
Tel : 05 34 32 31 16
Mail : lebechec at inp-toulouse.fr
***********************************************