[EP-tech] Re: Antwort: Use of truncation in advanced searches

John Salter J.Salter at leeds.ac.uk
Tue Apr 21 10:41:28 BST 2015


Hi Giles,
One thing we did on our platform was the change described here:
https://github.com/eprints/eprints/issues/120 (see also: http://www.eprints.org/tech.php/18677.html).

The description of this issue is around simple searches (when you’re not using Xapian), but I don’t know if it would have any effect on the advanced searches too.

In some circumstances, MySQL takes longer to work out the best way to run a query than it would take just running the query inefficiently. This change makes it try less iterations to find the ‘best’ way to run the query.

Might be worth a try?

Cheers,
John


From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of martin.braendle at id.uzh.ch
Sent: 21 April 2015 10:19
To: eprints-tech at ecs.soton.ac.uk
Subject: [EP-tech] Antwort: Use of truncation in advanced searches


Hi Gilles,

our repo has about 80'000 records and 56% fulltext, so is comparable to yours.

Advanced search of thermograph* in

title: immediate (1-2 seconds)
documents (full text): 20-30 seconds. The mysql daemon goes up to 70-100% CPU load.

Quick search (Xapian):

title:thermograph*  : immediate
thermograph* : immediate

We recommend in our help page (http://www.zora.uzh.ch/help/) that Quick Search should be the tool of choice and only for very precise searches Advanced Search should be used.


From a recent debug session (on another issue) I know that EPrints translates behind the scenes an advanced search query into a series of dozens of complicated SQL statements. It might be that for certain cases these are not optimized.

If it were that simple as

select distinct ei.eprintid from eprint__rindex ei, eprint e  where ei.field='documents' and ei.word like 'thermograph%' and e.eprint_status='archive' and e.eprintid=ei.eprintid;

then that query would be answered in a fraction of a second. But it isn't, and can't be, and EPrints software engineers surely have put a lot of effort into the EPrints database engine part to cover all possible situations.

Best regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Winterthurerstr. 190
CH-8057 Zürich


[Inactive hide details for Gilles Fournié ---21/04/2015 10:22:27---Hi, I have a question about right-hand truncation in advanced]Gilles Fournié ---21/04/2015 10:22:27---Hi, I have a question about right-hand truncation in advanced searches.

Von: Gilles Fournié <gilles.fournie at cirad.fr<mailto:gilles.fournie at cirad.fr>>
An: eprints-tech at ecs.soton.ac.uk<mailto:eprints-tech at ecs.soton.ac.uk>
Datum: 21/04/2015 10:22
Betreff: [EP-tech]  Use of truncation in advanced searches
Gesendet von: eprints-tech-bounces at ecs.soton.ac.uk<mailto:eprints-tech-bounces at ecs.soton.ac.uk>

________________________________



Hi,

I have a question about right-hand truncation in advanced searches.

If we search for (in title for example) :

    thermography

the search runs for 1 to 3 seconds before returning results.

If we extend our search to :

    thermography thermographie

the search time is about the same.

But if we try to use a wildcard :

    thermograph*

the search takes a very long time (counts in minutes) !

Does somebody have experienced such delays ?
Any clues about what we can do to solve this problem ?

(our archive contains ~ 91000 eprints)

Best regards,
GF
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ecs.soton.ac.uk/pipermail/eprints-tech/attachments/20150421/7421a559/attachment-0001.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.gif
Type: image/gif
Size: 105 bytes
Desc: image001.gif
Url : http://mailman.ecs.soton.ac.uk/pipermail/eprints-tech/attachments/20150421/7421a559/attachment-0001.gif 


More information about the Eprints-tech mailing list