<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<html><body style='font-family: Verdana,Geneva,sans-serif'>
<p>Hello Martin,</p>
<p>So:</p>
<p>1- No real issues, no, but it's a fairly basic implementation. I think it indexes fields which shouldn't be indexed (e.g. a search for "00" will match every record cos the "dir" field is indexed). And there are no advanced features like record caching, faceting, field collapsing or suggestions, all of which Xapian itself supports.</p>
<p>2- Nope. I think EPS have one install that uses it, but that was being implemented around the time I left, so I don't know where it went. I'd say it's at the "please test this" stage. But I appreciate it's not an easy task to take on, as it requires knowledge of Xapian (or Solr or whatever else) "under the hood".</p>
<p>3- My test dataset had 22k records:&nbsp;http://vmdev1.eprints.org/cgi/xapian (note I don't control that URL anymore) and it seemed fast. It scales in O(n*m) if I recall correctly, with n the number of (matching) records and m the max number of facet values (m=1 for single fields and max(m) = 5 by default for multiple fields).</p>
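<p>To illustrate the over-indexing point in 1-, here is a minimal sketch of a toy inverted index. It is not the eprints-xapian code: the records, the "dir" values and the helper names (build_index, search) are all made up for the example. It only shows why indexing a path-like field makes a term such as "00" match everything.</p>

```python
from collections import defaultdict

# Hypothetical records; the "dir" values only imitate a zero-padded storage path.
RECORDS = {
    1: {"title": "Faceted search with Xapian", "dir": "disk0/00/00/00/01"},
    2: {"title": "Repository statistics", "dir": "disk0/00/00/00/02"},
    3: {"title": "Open access in 2014", "dir": "disk0/00/00/00/03"},
}


def build_index(records, fields):
    """Map each lowercased token to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, record in records.items():
        for field in fields:
            # Treat path separators as whitespace so "00" becomes its own token.
            for token in record[field].replace("/", " ").lower().split():
                index[token].add(rec_id)
    return index


def search(index, term):
    """Return the sorted ids of the records whose indexed fields contain term."""
    return sorted(index.get(term.lower(), set()))


everything = build_index(RECORDS, ["title", "dir"])
titles_only = build_index(RECORDS, ["title"])

print(search(everything, "00"))   # every record matches through "dir"
print(search(titles_only, "00"))  # nothing matches once "dir" is left out
```

<p>Restricting the list of indexed fields is the whole fix in this toy version; how that is configured in a real EPrints install is not covered here.</p>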
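<p>The O(n*m) estimate in 3- can be pictured with a minimal sketch of facet counting. This is an assumption about the general shape of the work, not the xapianv2 code: count_facets, MAX_VALUES_PER_RECORD and the sample records are hypothetical names and data chosen for the example.</p>

```python
from collections import Counter

MAX_VALUES_PER_RECORD = 5  # mirrors the default cap of 5 mentioned above


def count_facets(matching_records, field):
    """Count how often each facet value occurs among the matching records.

    The outer loop runs over the n matching records and the inner loop over
    at most m values per record, so the work is bounded by n * m.
    """
    counts = Counter()
    for record in matching_records:
        values = record.get(field, [])
        if not isinstance(values, list):
            values = [values]  # single field: m = 1
        for value in values[:MAX_VALUES_PER_RECORD]:
            counts[value] += 1
    return counts


# Hypothetical search results: "type" is a single field, "subjects" is multiple.
matches = [
    {"type": "article", "subjects": ["physics", "maths"]},
    {"type": "article", "subjects": ["physics"]},
    {"type": "book", "subjects": ["a", "b", "c", "d", "e", "f"]},
]

print(count_facets(matches, "type"))
print(count_facets(matches, "subjects"))
```

<p>With the cap in place, a record carrying six subject values only contributes its first five, which is what keeps m bounded.</p>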
<p>Hope this helps,</p>
<p>Seb</p>
<p>&nbsp;</p>
<p>On 23.12.2014 08:55, martin.braendle@id.uzh.ch wrote:</p>
<blockquote type="cite" style="padding-left:5px; border-left:#1010ff 2px solid; margin-left:5px">
<p><span style="font-family: sans-serif; font-size: small;">Hi Seb,</span><br /><br /><span style="font-family: sans-serif; font-size: small;">Can you answer the following questions:</span><br /><br /><span style="font-family: sans-serif; font-size: small;">- What do you mean by "be careful using the default eprints-xapian indexing" (shipped with EPrints 3.3.12)? Are there any known problems?</span><br /><span style="font-family: sans-serif; font-size: small;">- To what extent can the code at </span><span style="font-family: sans-serif; font-size: small;"><a href="https://github.com/eprints/xapianv2">https://github.com/eprints/xapianv2</a></span><span style="font-family: sans-serif; font-size: small;">&nbsp;</span><span style="font-family: sans-serif; font-size: small;">be considered finished and recommended for production?</span><span style="font-family: sans-serif; font-size: small;">&nbsp;</span><br /><span style="font-family: sans-serif; font-size: small;">- Was faceting tested on a real-world repository with tens of thousands of records (and not only on 93 as with </span><span style="color: #0000ff; font-family: sans-serif; font-size: small;"><a href="http://puffin.ecs.soton.ac.uk/">http://puffin.ecs.soton.ac.uk/</a></span><span style="font-family: sans-serif; font-size: small;">)? Is performance still good? E.g., on </span><a href="http://www.zora.uzh.ch/"><span style="font-family: sans-serif; font-size: small;">http://www.zora.uzh.ch/</span></a><span style="font-family: sans-serif; font-size: small;">, depending on the search terms used, one may obtain thousands of records.</span><br /><br /><span style="font-family: sans-serif; font-size: small;">Best regards,</span><br /><br /><span style="font-family: sans-serif; font-size: small;">Martin</span><br /><br /><br /><span style="font-family: sans-serif; font-size: small;">--</span><br /><span style="font-family: sans-serif; font-size: small;">Dr. Martin Br&auml;ndle</span><br /><span style="font-family: sans-serif; font-size: small;">Zentrale Informatik</span><br /><span style="font-family: sans-serif; font-size: small;">Universit&auml;t Z&uuml;rich</span><br /><span style="font-family: sans-serif; font-size: small;">Winterthurerstr. 190</span><br /><span style="font-family: sans-serif; font-size: small;">CH-8057 Z&uuml;rich</span><br /><br /><span style="font-family: sans-serif; font-size: small;">mail: martin.braendle@id.uzh.ch</span><br /><span style="font-family: sans-serif; font-size: small;">phone: +41 44 63 56705</span><br /><span style="font-family: sans-serif; font-size: small;">fax: +41 44 63 54505</span><br /><span style="font-family: sans-serif; font-size: small;"><a href="http://www.id.uzh.ch">http://www.id.uzh.ch</a></span><br /><br /><span style="color: #5f5f5f; font-family: sans-serif; font-size: xx-small;">From: </span><span style="font-family: sans-serif; font-size: xx-small;">sf2 &lt;sf2@ecs.soton.ac.uk&gt;</span><br /><span style="color: #5f5f5f; font-family: sans-serif; font-size: xx-small;">To: </span><span style="font-family: sans-serif; font-size: xx-small;">eprints-tech@ecs.soton.ac.uk</span><br /><span style="color: #5f5f5f; font-family: sans-serif; font-size: xx-small;">Date: </span><span style="font-family: sans-serif; font-size: xx-small;">19/12/2014 21:51</span><br /><span style="color: #5f5f5f; font-family: sans-serif; font-size: xx-small;">Subject: </span><span style="font-family: sans-serif; font-size: xx-small;">[EP-tech] Re: Xapian install on Ubuntu 12.04</span><br /><span style="color: #5f5f5f; font-family: sans-serif; font-size: xx-small;">Sent by: </span><span style="font-family: sans-serif; font-size: xx-small;">eprints-tech-bounces@ecs.soton.ac.uk</span></p>
<hr style="color: #8091a5;" align="left" size="2" width="100%" /><br /><br /><br /><span style="font-family: Verdana; font-size: medium;">Sure thing.. install libxapian, libsearch-xapian-perl (yup that's Search::Xapian) and voila. Then I'd install xapian-tools because some of its utilities are damned useful to debug/map a Xapian DB.</span>
<p><span style="font-family: Verdana; font-size: medium;">Then as a word of caution, I'd say be careful in using the default eprints-xapian indexing (what's shipped with eprints 3.3.x basically) cos it's very basic. Perhaps look up </span><span style="font-family: Verdana; font-size: medium;"><a href="https://github.com/eprints/xapianv2">https://github.com/eprints/xapianv2</a></span><span style="font-family: Verdana; font-size: medium;">&nbsp;to do more advanced stuff such as faceting.</span></p>
<p><span style="font-family: Verdana; font-size: medium;">Seb</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">On 19.12.2014 20:25, Tomasz Neugebauer wrote:</span></p>
<br /><span style="font-family: Verdana; font-size: medium;">We have the following instructions for installing Xapian on Ubuntu 12.04:</span>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">Install Xapian:</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<br /><span style="font-family: Calibri; font-size: small;">wget </span><a href="http://oligarchy.co.uk/xapian/1.2.13/xapian-core-1.2.13.tar.gz"><span style="color: #0000ff; font-family: Calibri; font-size: small;"><span style="text-decoration: underline;">http://oligarchy.co.uk/xapian/1.2.13/xapian-core-1.2.13.tar.gz</span></span></a>
<p><span style="font-family: Calibri; font-size: small;">wget </span><a href="http://oligarchy.co.uk/xapian/1.2.13/xapian-omega-1.2.13.tar.gz"><span style="color: #0000ff; font-family: Calibri; font-size: small;"><span style="text-decoration: underline;">http://oligarchy.co.uk/xapian/1.2.13/xapian-omega-1.2.13.tar.gz</span></span></a></p>
<p><span style="font-family: Calibri; font-size: small;">wget </span><a href="http://oligarchy.co.uk/xapian/1.2.13/xapian-bindings-1.2.13.tar.gz"><span style="color: #0000ff; font-family: Calibri; font-size: small;"><span style="text-decoration: underline;">http://oligarchy.co.uk/xapian/1.2.13/xapian-bindings-1.2.13.tar.gz</span></span></a></p>
<p><span style="font-family: Calibri; font-size: small;">&nbsp;</span></p>
<p><span style="font-family: Calibri; font-size: small;">tar zxvf xapian-core-1.2.13.tar.gz</span></p>
<p><span style="font-family: Calibri; font-size: small;">tar zxvf xapian-omega-1.2.13.tar.gz</span></p>
<p><span style="font-family: Calibri; font-size: small;">tar zxvf xapian-bindings-1.2.13.tar.gz</span></p>
<p><span style="font-family: Calibri; font-size: small;">&nbsp;</span></p>
<p><span style="font-family: Calibri; font-size: small;">cd xapian-core-1.2.13</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo apt-get install uuid-dev </span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo ./configure</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo make</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo make install</span></p>
<p><span style="font-family: Calibri; font-size: small;">&nbsp;</span></p>
<p><span style="font-family: Calibri; font-size: small;">cd ../xapian-omega-1.2.13</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo apt-get install libpcre3-dev</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo ./configure </span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo make</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo make install</span></p>
<p><span style="font-family: Calibri; font-size: small;">&nbsp;</span></p>
<p><span style="font-family: Calibri; font-size: small;">cd ../xapian-bindings-1.2.13</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo ./configure</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo make</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo make install</span></p>
<p><span style="font-family: Calibri; font-size: small;">&nbsp;</span></p>
<p><span style="font-family: Calibri; font-size: small;">sudo cpan Search::Xapian</span></p>
<p><span style="font-family: Calibri; font-size: small;">&nbsp;</span></p>
<br /><span style="font-family: Verdana; font-size: medium;">We were wondering if it is preferable to use the Ubuntu packages instead?</span>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">We found these packages: </span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">libept-dev - High-level library for managing Debian package information</span></p>
<p><span style="font-family: Verdana; font-size: medium;">libxapian-dev - Development files for Xapian search engine library</span></p>
<p><span style="font-family: Verdana; font-size: medium;">libxapian22 - Search engine library</span></p>
<p><span style="font-family: Verdana; font-size: medium;">libxapian22-dbg - Debugging symbols for the Xapian Search engine library</span></p>
<p><span style="font-family: Verdana; font-size: medium;">xapian-doc - Core Xapian documentation</span></p>
<p><span style="font-family: Verdana; font-size: medium;">xapian-examples - Xapian simple example programs</span></p>
<p><span style="font-family: Verdana; font-size: medium;">libsearch-xapian-perl - Perl bindings for the Xapian search library</span></p>
<p><span style="font-family: Verdana; font-size: medium;">xapian-omega - CGI search interface and indexers using Xapian</span></p>
<p><span style="font-family: Verdana; font-size: medium;">xapian-tools - Basic tools for Xapian search engine library</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">libsearch-xapian-perl looks to be the equivalent of CPAN&rsquo;s Search::Xapian?</span></p>
<p><span style="font-family: Verdana; font-size: medium;">Does anyone have any experience with installing xapian on Ubuntu this way?</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">Thanks!</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">Tomasz</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><span style="color: #a6a6a6; font-family: 'Courier New'; font-size: xx-small;">________________________________________________</span></p>
<p><span style="font-family: 'Courier New'; font-size: xx-small;">Tomasz Neugebauer</span></p>
<p><span style="color: #404040; font-family: 'Courier New'; font-size: xx-small;">Digital Projects &amp; Systems Development Librarian </span></p>
<p><a href="mailto:tomasz.neugebauer@concordia.ca"><span style="color: #404040; font-family: 'Courier New'; font-size: xx-small;"><span style="text-decoration: underline;">tomasz.neugebauer@concordia.ca</span></span></a><span style="color: #404040; font-family: 'Courier New'; font-size: xx-small;">&nbsp;</span></p>
<p><span style="color: #404040; font-family: 'Courier New'; font-size: xx-small;">Concordia University Libraries </span></p>
<p><span style="color: #404040; font-family: 'Courier New'; font-size: xx-small;">1400 de Maisonneuve West (LB 341-3)</span></p>
<p><span style="color: #404040; font-family: 'Courier New'; font-size: xx-small;">Tel.: (514) 848-2424 ex. 7738</span></p>
<p><span style="color: #404040; font-family: 'Courier New'; font-size: xx-small;">Montreal, Canada</span></p>
<p><span style="font-family: Verdana; font-size: medium;">&nbsp;</span></p>
<p><br /><span style="font-family: Verdana; font-size: medium;">*** Options: </span><a href="http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech"><span style="color: #0000ff; font-family: Verdana; font-size: medium;"><span style="text-decoration: underline;">http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech</span></span></a><span style="font-family: Verdana; font-size: medium;"><br /> *** Archive: </span><a href="http://www.eprints.org/tech.php/"><span style="color: #0000ff; font-family: Verdana; font-size: medium;"><span style="text-decoration: underline;">http://www.eprints.org/tech.php/</span></span></a><span style="font-family: Verdana; font-size: medium;"><br /> *** EPrints community wiki: </span><a href="http://wiki.eprints.org/"><span style="color: #0000ff; font-family: Verdana; font-size: medium;"><span style="text-decoration: underline;">http://wiki.eprints.org/</span></span></a><span style="font-family: Verdana; font-size: medium;"><br /> *** EPrints developers Forum: </span><a href="http://forum.eprints.org/"><span style="color: #0000ff; font-family: Verdana; font-size: medium;"><span style="text-decoration: underline;">http://forum.eprints.org/</span></span></a><span style="font-family: Verdana; font-size: medium;"><br /></span></p>
</blockquote>
<p>&nbsp;</p>
<div>&nbsp;</div>
</body></html>