<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p>I don't recall if you can reindex individual fields.<br>
    </p>
    <div class="moz-cite-prefix">On 30/04/2020 09:51, Yuri via
      Eprints-tech wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:EMEW3|6d8dbf3fb7da9a54f26d6558f796a2f9w3Y9rI14eprints-tech-bounces|ecs.soton.ac.uk|bfc7fde6-2d35-2d22-ace0-bc3e24da89e2@alfa.it">
      
      <p>Thanks for the pointer, maybe a check against a fixed
        vocabulary can be enough.<br>
      </p>
      <p>This also means reindexing the whole archive. Is it possible to
        reindex only title and keywords? Reindexing full text can be a
        problem if you have a lot of PDFs, for example.<br>
      </p>
      <div class="moz-cite-prefix">On 30/04/20 10:29, Christopher
        Gutteridge via Eprints-tech wrote:<br>
      </div>
      <blockquote type="cite" cite="mid:EMEW3|873b453e06baf3567d21799648f3bbcfw3Y9Va14eprints-tech-bounces|ecs.soton.ac.uk|53088265-0f14-fff8-e573-a29509f22563@soton.ac.uk">
        <p>EPrints makes some decisions on what to index. Those can be
          overridden, if I recall the old magics from the dawn of time.</p>
        <p><a class="moz-txt-link-freetext" href="https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/indexing.pl">https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/indexing.pl</a></p>
        <p>That, by default, uses the EPrints word split function <a class="moz-txt-link-freetext" href="https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Index/Tokenizer.pm#L39">https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Index/Tokenizer.pm#L39</a>
          which apparently uses the Perl regexp library to decide word
          breaks, but you can write one that does what you want.
          <tt>freetext_seperator_chars</tt> seems utterly ignored now.<br>
          </p>
        <p>This is still obeyed:<br>
          <tt>$c-&gt;{indexing}-&gt;{freetext_min_word_size} = 3;</tt></p>
        <p>Which caused some issues for people with the Chinese name
          &quot;Wu&quot;.</p>
        <p>I would suggest keeping it, but altering indexing.pl to always
          index numbers even if they are only one or two digits long.
          Something like this (of course you'd then have to reindex
          everything):<br>
        </p>
        <p><tt><br>
          </tt><tt>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # First approximation is if this word is over
            or equal</tt><tt><br>
          </tt><tt>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; # to the minimum size set in SiteInfo.</tt><tt><br>
          </tt><tt>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; my $ok = $wordlen &gt;=
            $c-&gt;{indexing}-&gt;{freetext_min_word_size};</tt><tt><br>
          </tt></p>
        <p><tt><font color="#ff0000">&nbsp; &nbsp;&nbsp;&nbsp; &nbsp; if( $word =~ m/^\d&#43;$/ ) {<br>
              &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; $ok = 1;<br>
              &nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; </font></tt><br>
        </p>
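        <p>For illustration, here is a minimal standalone sketch of that
          check outside EPrints. The names <tt>$min_word_size</tt> and
          <tt>keep_word</tt> are hypothetical stand-ins for the site
          setting and the logic in indexing.pl; this is not the actual
          EPrints code.</p>
<pre>
```perl
use strict;
use warnings;

# Hypothetical stand-in for $c->{indexing}->{freetext_min_word_size}.
my $min_word_size = 3;

# Return 1 if a word should be indexed, 0 otherwise.
sub keep_word {
    my ($word) = @_;

    # First approximation: the word is at least the minimum size.
    my $ok = length($word) >= $min_word_size;

    # Always keep purely numeric words, even one or two digits long.
    if ( $word =~ m/^\d+$/ ) {
        $ok = 1;
    }
    return $ok ? 1 : 0;
}

my @kept = grep { keep_word($_) } qw( covid 19 of Wu 2 sars );
print join( " ", @kept ), "\n";    # prints "covid 19 2 sars"
```
</pre>
        <p>With a minimum size of 3 this keeps &quot;19&quot; and
          &quot;2&quot; but still drops two-letter words such as
          &quot;Wu&quot;.</p>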
        <div class="moz-cite-prefix">On 30/04/2020 08:27, Yuri via
          Eprints-tech wrote:<br>
        </div>
        <blockquote type="cite" cite="mid:EMEW3|b966302d0df9e000a031a1f8b6e8872cw3Y8YW14eprints-tech-bounces|ecs.soton.ac.uk|d32bf79e-5ac1-3c16-f6d0-c890b0b95c0d@alfa.it">
          <p>Hi!</p>
          <p>&nbsp;I've found that the virus can also be referred to as
            &quot;SARS COV-2&quot;, so maybe you can add this too. But
            beware that EPrints search has a problem with &quot;-&quot;:
            it splits the word on it.<br>
          </p>
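          <p>A rough sketch of the effect, assuming a tokenizer that
            splits on every non-word character. The function
            <tt>split_words</tt> here is a hypothetical illustration, not
            the EPrints implementation.</p>
<pre>
```perl
use strict;
use warnings;

# Hypothetical tokenizer: lower-case the text and split on any run of
# characters that are not letters, digits or underscores.
sub split_words {
    my ($text) = @_;
    return grep { length($_) > 0 } split /\W+/, lc($text);
}

my @words = split_words("SARS COV-2 and covid-19");
print join( "|", @words ), "\n";    # prints "sars|cov|2|and|covid|19"
```
</pre>
          <p>The hyphenated terms end up as separate words, so a search
            for &quot;covid-19&quot; is really a search for
            &quot;covid&quot; and &quot;19&quot;.</p>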
          <div class="moz-cite-prefix">On 27/04/20 17:06, James Kerwin
            via Eprints-tech wrote:<br>
          </div>
          <blockquote type="cite" cite="mid:EMEW3|942a457e724595d3b487147e73b60d14w3QG9014eprints-tech-bounces|ecs.soton.ac.uk|CAKkNZ9Bp5Hpsxb-G9oKRtnn6-pHfcG10ob8mfYBkQ-KFcAF6Sw@mail.gmail.com">
            <div dir="ltr">Hello All,<br>
              <div><br>
              </div>
              <div>I hope everyone is well in body and mind.</div>
              <div><br>
              </div>
              <div>I need some help with the EPrints search function. I
                have been asked to add a box to the repository homepage
                that lists the latest coronavirus-related deposits.</div>
              <div><br>
              </div>
              <div>I'm hoping to search via keywords for &quot;coronavirus&quot;
                and &quot;covid-19&quot;. I also want to search for either of
                these terms in titles. To do this I'm currently
                butchering&nbsp;a copy of cgi/latest_tool.</div>
              <div><br>
              </div>
              <div>I can get the keywords part to work using:</div>
              <div><br>
              </div>
              <blockquote style="margin:0 0 0 40px;border:none;padding:0px">
                <pre>$c-&gt;{latest_rona_modes} = {
    default =&gt; { citation =&gt; &quot;noauth&quot; },
    fplatest =&gt; {
        citation =&gt; &quot;popular&quot;, max =&gt; 5,
        #citation =&gt; &quot;result&quot;, max =&gt; 3,
        filters =&gt; [
            #{ meta_fields =&gt; [ &quot;full_text_status&quot;,&quot;full_text_status&quot; ], value =&gt; (&quot;none&quot;||&quot;public&quot;) }
            { meta_fields =&gt; [ &quot;keywords&quot; ], value =&gt; &quot;covid-19&quot; }
        ],
    },
};</pre>
              </blockquote>
              This also works with &quot;title&quot; as you would expect.
              <div><br>
              </div>
              <div>What I really want is to do a search where the
                keywords can be &quot;covid-19&quot; OR &quot;coronavirus&quot; as well as
                including some allowance for adding an:</div>
              <div><br>
              </div>
              <div>&nbsp;&quot;OR title LIKE '%covid-19%' OR title LIKE
                '%coronavirus%'&quot; in MySQL-speak.</div>
              <div><br>
              </div>
              <div>Am I able to do this using the&nbsp;EPrints::Search
                plugin? I've tried reading the documentation and
                experimenting with it, but I'm not getting very far.</div>
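              <div>To pin down the condition being asked for, here is a
                sketch in plain Perl. It does not use the EPrints::Search
                API at all; the records, the <tt>@terms</tt> list and the
                <tt>matches</tt> function are hypothetical and only
                illustrate &quot;either term, in either title or
                keywords&quot;.</div>
<pre>
```perl
use strict;
use warnings;

# Hypothetical records standing in for eprints.
my @records = (
    { id => 1, title => "Modelling COVID-19 spread", keywords => "epidemiology" },
    { id => 2, title => "Soil chemistry survey",     keywords => "soil; pH" },
    { id => 3, title => "Vaccine candidates",        keywords => "coronavirus; vaccines" },
);

my @terms = ( "covid-19", "coronavirus" );

# Return 1 if any term appears, case-insensitively, in title or keywords.
sub matches {
    my ($record) = @_;
    for my $term (@terms) {
        for my $field (qw( title keywords )) {
            return 1 if index( lc( $record->{$field} ), lc($term) ) >= 0;
        }
    }
    return 0;
}

my @ids = map { $_->{id} } grep { matches($_) } @records;
print join( ",", @ids ), "\n";    # prints "1,3"
```
</pre>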
              <div><br>
              </div>
              <div>If it's not possible I can think of a number of
                bodges for it, but decided it was best to attempt the
                proper way first.</div>
              <div><br>
              </div>
              <div>Thanks,</div>
              <div>James</div>
            </div>
            <br>
            <fieldset class="mimeAttachmentHeader"></fieldset>
            <pre class="moz-quote-pre" wrap="">*** Options: <a class="moz-txt-link-freetext" href="http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech">http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech</a>
*** Archive: <a class="moz-txt-link-freetext" href="http://www.eprints.org/tech.php/">http://www.eprints.org/tech.php/</a>
*** EPrints community wiki: <a class="moz-txt-link-freetext" href="http://wiki.eprints.org/">http://wiki.eprints.org/</a></pre>
          </blockquote>
        </blockquote>
        <pre class="moz-signature" cols="72">-- 
Christopher Gutteridge <a class="moz-txt-link-rfc2396E" href="mailto:totl@soton.ac.uk">&lt;totl@soton.ac.uk&gt;</a> 
You should read our team blog at <a class="moz-txt-link-freetext" href="http://blog.soton.ac.uk/webteam/">http://blog.soton.ac.uk/webteam/</a></pre>
      </blockquote>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
Christopher Gutteridge <a class="moz-txt-link-rfc2396E" href="mailto:totl@soton.ac.uk">&lt;totl@soton.ac.uk&gt;</a> 
You should read our team blog at <a class="moz-txt-link-freetext" href="http://blog.soton.ac.uk/webteam/">http://blog.soton.ac.uk/webteam/</a></pre>
  </body>
</html>