[EP-tech] Re: Simple search - length of query

Sebastien Francois sf2 at ecs.soton.ac.uk
Wed Mar 5 11:59:43 GMT 2014


Hi John,

Check this: https://github.com/eprints/eprints/issues/120

Without Xapian, the simple search runs a SELECT for each input word in 
each of the fields configured for the simple search, so yes, queries get 
long quickly. I've seen this bring a repository to a halt, with all other 
queries locked until MySQL's maximum number of connections is reached.
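
To give a feel for how fast this grows, here is a rough back-of-the-envelope model in Python. It is not EPrints code: the function name is made up, it assumes exactly one SELECT per (word, field) pair, and it splits on whitespace only, which may not match how the simple search actually tokenises input. The field list is copied from the searchexp in the quoted message below.

```python
# Rough model only, NOT EPrints code.
# Assumption: one SELECT per (word, field) pair, words split on whitespace.

def estimate_selects(query: str, fields: list[str]) -> int:
    """Estimate how many SELECTs a non-Xapian simple search might issue."""
    words = query.split()
    return len(words) * len(fields)

# Fields taken from the searchexp in the quoted message.
FIELDS = ["abstract", "creators_name", "date", "documents", "title"]

# Two words across five fields.
print(estimate_selects("saltmarsh sediments", FIELDS))  # prints 10
```

Under these assumptions the count grows linearly with the number of words, so a pasted full citation multiplies the work by the number of fields for every extra word.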

Xapian searches in a totally different manner and scales up much better, e.g.:
http://vmdev1.eprints.org/cgi/xapian?q=characteristics+of+organic+matter+and+carbonate+in+saltmarsh+sediments+from+south+west+Scotland
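
On the question of restricting the number of words: if moving to Xapian is not possible straight away, a stopgap could be to truncate the input before it reaches the search. The sketch below is only an illustration in Python, not an existing EPrints option; `cap_query` and `MAX_WORDS` are hypothetical names, the limit of 8 is arbitrary, and where such a check would hook into the search code is not covered here.

```python
# Illustration only, NOT an existing EPrints setting or function.
MAX_WORDS = 8  # hypothetical, arbitrary limit

def cap_query(query: str, max_words: int = MAX_WORDS) -> str:
    """Keep at most max_words whitespace-separated words of the query."""
    words = query.split()
    return " ".join(words[:max_words])

# A long pasted citation is cut down to its first eight words.
print(cap_query("Families, Domesticity and Intimacy: Changing Relationships "
                "in Changing Times, in Richardson, D, and Robinson, V."))
```

Silently dropping words changes what the user searched for, so it might be kinder to show a message saying the query was shortened.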

Hope this helps,
Seb.

On 05/03/14 11:52, John Salter wrote:
> Hi,
> Yesterday we had a bit of an issue with our repository when someone pasted a full citation into the simple search box.
> This produced an impressive SQL query that locked things up and made users unhappy...
>
> Is there a way to sanitise what a 'simple' search might try to handle? e.g. would restricting it to a certain number of words be acceptable?
> Would the Xapian search handle a request like the one below any better?
>
> Details below/attached if you're interested!
> Cheers,
> John
>
> GET /cgi/search/simple?full=%E2%80%98Families%2C+Domesticity+and+Intimacy%3A+Changing+Relationships+in+Changing+Times%E2%80%99%2C+in+Richardson%2C+D%2C+and+Robinson%2C+V.+%28eds%29+Introducing+Women%27s+Studies%2C+third+edition.+Basingstoke%3A+Palgrave%2C+2008+pp.+125-143.+&_action_search=Search&_order=bytitle&basic_srchtype=ALL&_satisfyall=ALL
>
> searchexp created in cache table:
> 0|1|-date/creators_name/title|archive|-|full:abstract/creators_name/date/documents/title:ALL:IN:‘Families, Domesticity and Intimacy%3A Changing Relationships in Changing Times’, in Richardson, D, and Robinson, V. (eds) Introducing Women's Studies, third edition. Basingstoke%3A Palgrave, 2008 pp. 125-143. |-|eprint_status:eprint_status:ANY:EQ:archive|metadata_visibility:metadata_visibility:ANY:EQ:show
>
> The SQL generated by search is attached (get ready for this - it's a thing of beauty ;o) - you can see why it took a while to run!
