<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";
        color:black;}
span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:Consolas;
        color:black;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body bgcolor=white lang=EN-GB link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Hi Jean-Marie,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>I think it’s probably a namespace problem (see the references below). If you try<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>$xml->findnodes( "//tef:auteur/tef:nom" )<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>do you get any results? (The trailing /* is dropped here: tef:nom holds only text, so it has no child elements to match.)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>You could also do this via XSLT, if you have any experience with it.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>I’m guessing it’s something like this you’re starting with: <a href="http://www.abes.fr/abes/documents/tef/recommandation/ex1_theseSimplePDF.xml">http://www.abes.fr/abes/documents/tef/recommandation/ex1_theseSimplePDF.xml</a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>These might explain a bit more about namespaces:<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><a href="http://stackoverflow.com/a/4083929/2455451">http://stackoverflow.com/a/4083929/2455451</a><o:p></o:p></span></p><p 
class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><a href="http://stackoverflow.com/questions/2673370/why-should-i-use-xpathcontext-with-perls-xmllibxml/2673452#2673452">http://stackoverflow.com/questions/2673370/why-should-i-use-xpathcontext-with-perls-xmllibxml/2673452#2673452</a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Cheers,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>John<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span lang=EN-US style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'>From:</span></b><span lang=EN-US style='font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext'> eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] <b>On Behalf Of </b>Jean-Marie Le Bechec<br><b>Sent:</b> 03 March 2014 08:18<br><b>To:</b> eprints-tech@ecs.soton.ac.uk<br><b>Subject:</b> [EP-tech] harvester (question)<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Hi Seb,<br><br>I need to harvest an OAI server in a format other than Dublin Core (the TEF format). I cannot extract specific metadata when several elements share the same name. 
<br><br>For example:<br>...<br>&lt;tef:thesisAdmin&gt;<br> &lt;tef:auteur&gt;<br> <b>&lt;tef:nom&gt;nom1&lt;/tef:nom&gt;</b><br>...<br><br>and <br>...<br>&lt;tef:directeurThese&gt;<br> <b>&lt;tef:nom&gt;nom2&lt;/tef:nom&gt;</b><br> &lt;tef:prenom&gt;Carine&lt;/tef:prenom&gt;<br> &lt;tef:autoriteInterne&gt;MADS_DIRECTEUR_DE_THESE_1&lt;/tef:autoriteInterne&gt;<br> &lt;tef:autoriteExterne autoriteSource="Sudoc"&gt;073367826&lt;/tef:autoriteExterne&gt;<br> &lt;/tef:directeurThese&gt;<br> &lt;tef:directeurThese&gt;<br> <b>&lt;tef:nom&gt;nom3&lt;/tef:nom&gt;</b><br> &lt;tef:prenom&gt;Louise&lt;/tef:prenom&gt;<br> &lt;tef:autoriteInterne&gt;MADS_DIRECTEUR_DE_THESE_2&lt;/tef:autoriteInterne&gt;<br> &lt;tef:autoriteExterne autoriteSource="Sudoc"&gt;035036672&lt;/tef:autoriteExterne&gt;<br> &lt;/tef:directeurThese&gt;<br>...<br>in the same record!<br><br>I need to extract all of this data.<br><br>I tried things like:<br><br>my $nom;<br>foreach my $node ($xml->findnodes( "//auteur/nom/*" ))<br> {<br> $nom = $node->textContent; <br> }<br><br>but it does not work (no results).<br><br>Any ideas?<br><br><br>Thanks!<br><br>Jean-Marie<br><br><br><o:p></o:p></p><pre>-- <o:p></o:p></pre><pre><o:p> </o:p></pre><pre>***********************************************<o:p></o:p></pre><pre>Jean Marie Le Bechec<o:p></o:p></pre><pre>Service Commun de la Documentation<o:p></o:p></pre><pre>Responsable ingenierie documentaire<o:p></o:p></pre><pre>&amp;<o:p></o:p></pre><pre>Direction du Systeme d'Information<o:p></o:p></pre><pre>Referent Etudes<o:p></o:p></pre><pre><o:p> </o:p></pre><pre>Institut National Polytechnique de Toulouse<o:p></o:p></pre><pre>6 allee Emile Monso - bp 34038 -<o:p></o:p></pre><pre>31029 Toulouse cedex 4<o:p></o:p></pre><pre>Tel : 05 34 32 31 16<o:p></o:p></pre><pre>Mail : <a href="mailto:lebechec@inp-toulouse.fr">lebechec@inp-toulouse.fr</a><o:p></o:p></pre><pre>*********************************************** <o:p></o:p></pre></div></body></html>