[EP-tech] Re: Use of truncation in advanced searches

Hi Gilles,

Our repository has about 80'000 records, 56% of them with full text, so it is comparable to

Advanced search of thermograph* in

title: immediate (1-2 seconds)
documents (full text): 20-30 seconds. The mysql daemon goes up to 70-100%
CPU load.

Quick search (Xapian):

title:thermograph*  : immediate
thermograph* : immediate

On our help page (http://www.zora.uzh.ch/help/) we recommend Quick Search
as the tool of choice, and Advanced Search only for very precise
searches.

From a recent debug session (on another issue) I know that, behind the
scenes, EPrints translates an advanced search query into a series of
dozens of complicated SQL statements. It may be that these are not
optimized for certain cases.

If it were as simple as

select distinct ei.eprintid from eprint__rindex ei, eprint e  where
ei.field='documents' and ei.word like 'thermograph%' and
e.eprint_status='archive' and e.eprintid=ei.eprintid;

then the query would be answered in a fraction of a second. But it isn't,
and can't be, and the EPrints software engineers have surely put a lot of
effort into the database engine part of EPrints to cover all possible
I have a question about right-hand truncation in advanced searches.

If we search (in the title, for example) for:


the search runs for 1 to 3 seconds before returning results.

If we extend our search to:

     thermography thermographie

the search time is about the same.

But if we try to use a wildcard:


the search takes a very long time (several minutes)!

Has anybody experienced such delays?
Any clues about what we can do to solve this problem?

(Our archive contains ~91000 eprints.)

Best regards,
