<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 12 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0cm;
        mso-margin-bottom-alt:auto;
        margin-left:0cm;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Balloon Text Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:8.0pt;
        font-family:"Tahoma","sans-serif";}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.BalloonTextChar
        {mso-style-name:"Balloon Text Char";
        mso-style-priority:99;
        mso-style-link:"Balloon Text";
        font-family:"Tahoma","sans-serif";}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="2050" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I think Seb’s suggestions should have you on the right track, but in case it helps, and being a search that does not behave itself when a word ending with an
‘s’ is involved, looks like stemming might have a say in this.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">As far as I know, the same stemming configuration is used for both searching and indexing, so re-indexing the record should have worked...<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">But you can anyway have a look at “cfg.d/indexing.pl” and temporarily place the following lines near the end of the script, just before “return( \@g , \@b );”<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">use Data::Dumper;<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">print STDERR " *** Indexing: $text\n";<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">print STDERR " *** GOODs:\n";<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">print STDERR Dumper(\@g);<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">print STDERR " *** BADs:\n";<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">print STDERR Dumper(\@b);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">That will (insistently) display in the log file the words that are being discarded and the ones that are finally used.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Best,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> Jose.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">
<a href="mailto:eprints-tech-bounces@ecs.soton.ac.uk">eprints-tech-bounces@ecs.soton.ac.uk</a> [<a href="mailto:eprints-tech-bounces@ecs.soton.ac.uk">mailto:eprints-tech-bounces@ecs.soton.ac.uk</a>]
<b>On Behalf Of </b>Phil<br>
<b>Sent:</b> 25 July 2012 20:15<br>
<b>To:</b> <a href="mailto:eprints-tech@ecs.soton.ac.uk">eprints-tech@ecs.soton.ac.uk</a><br>
<b>Subject:</b> [EP-tech] Re: We are at our wits end,<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Yes I have re indexed multiple times, my confusion comes from the fact that the index tables are correct, we are at 3.3.X
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Does the debugger in sql output to a logfile?<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">
<a href="mailto:eprints-tech-bounces@ecs.soton.ac.uk">eprints-tech-bounces@ecs.soton.ac.uk</a> [<a href="mailto:eprints-tech-bounces@ecs.soton.ac.uk">mailto:eprints-tech-bounces@ecs.soton.ac.uk</a>]
<b>On Behalf Of </b>sf2<br>
<b>Sent:</b> 25 July 2012 22:41<br>
<b>To:</b> <a href="mailto:eprints-tech@ecs.soton.ac.uk">eprints-tech@ecs.soton.ac.uk</a><br>
<b>Subject:</b> [EP-tech] Re: We are at our wits end,<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p><span lang="EN-US">Hi Phil,<o:p></o:p></span></p>
<p><span lang="EN-US">For info, which EPrints version are you using?<o:p></o:p></span></p>
<p><span lang="EN-US">Worth a try:<o:p></o:p></span></p>
<p><span lang="EN-US">- have you tried to reindex the eprint dataset? (bin/epadmin reindex <archive_id> eprint)<o:p></o:p></span></p>
<p><span lang="EN-US">- have you tried debugging the search to see which SQL statement gets executed (**) ?<o:p></o:p></span></p>
<p><span lang="EN-US">Seb<o:p></o:p></span></p>
<p><span lang="EN-US">(**) to enable SQL debugging, edit perl_lib/EPrints/Search/Condition.pm, look at the end of "sub sql" for:<o:p></o:p></span></p>
<p><span lang="EN-US">#print STDERR "\nsql=$sql\n\n";<o:p></o:p></span></p>
<p><span lang="EN-US">Un-comment the line above, restart apache and re-run the problematic search - feel free to copy/paste the SQL query here so we can have a look.<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US">On Wed, 25 Jul 2012 22:12:26 +1000, "Phil" <<a href="mailto:philpearson@iinet.net.au">philpearson@iinet.net.au</a>> wrote:<o:p></o:p></span></p>
<blockquote style="border:none;border-left:solid #1010FF 1.5pt;padding:0cm 0cm 0cm 4.0pt;margin-left:3.75pt;margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">If anyone can help we would very much appreciate it, we have posted this issue numerous times and had someone control our eprints server but the problem will
not go away. <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">I will explain it step by step<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">If I look at the database directly and query the index I get the following result<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="border:solid windowtext 1.0pt;padding:0cm"><img border="0" width="542" height="276" id="Picture_x0020_1" src="cid:image001.jpg@01CD6B1E.A214DE80" alt="Image removed by sender."></span><span lang="EN-US"><o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">You can see the word “Green” and “Bans” are in the index<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">So we should be able to find this book in the advanced or simple search using the term “green bans”<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="border:solid windowtext 1.0pt;padding:0cm"><img border="0" width="860" height="157" id="Picture_x0020_2" src="cid:image002.jpg@01CD6B1E.A214DE80" alt="Image removed by sender."></span><span lang="EN-US"><o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Unfortunately when we do we get only one result below<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="border:solid windowtext 1.0pt;padding:0cm"><img border="0" width="934" height="333" id="Picture_x0020_3" src="cid:image003.jpg@01CD6B1E.A214DE80" alt="Image removed by sender."></span><span lang="EN-US"><o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">Whilst this is correct, it is incomplete, the search should have also returned record 3594<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">If you look back at the SQL above you will see the other words in the title<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">If I put a few of them in the search as per below, I get the correct result:<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="border:solid windowtext 1.0pt;padding:0cm"><img border="0" width="829" height="124" id="Picture_x0020_4" src="cid:image004.jpg@01CD6B1E.A214DE80" alt="Image removed by sender."></span><span lang="EN-US"><o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="border:solid windowtext 1.0pt;padding:0cm"><img border="0" width="859" height="347" id="Picture_x0020_5" src="cid:image005.jpg@01CD6B1E.A214DE80" alt="Image removed by sender."></span><span lang="EN-US"><o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">I have erased the index record and reindexed the item to no avail.<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">This is not the only record that does not return from a search and it makes our catalogue unusable.<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">I have installed xapian but that also gives me problems as per a post of mine two days ago.<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span lang="EN-US">We have put a lot of effort into this migrating from the old system, if we cannot fix this we will have to go through the whole process again with something else.<o:p></o:p></span></p>
<p><span lang="EN-US"> <o:p></o:p></span></p>
</div>
</blockquote>
<p><span lang="EN-US"> <o:p></o:p></span></p>
</div>
</body>
</html>