[EP-tech] Re: Simple search - length of query
Sebastien Francois
sf2 at ecs.soton.ac.uk
Wed Mar 5 11:59:43 GMT 2014
Hi John,
Check this: https://github.com/eprints/eprints/issues/120
Without Xapian, the simple search runs a SELECT for each input word in
each field configured for the simple search, so yes, queries get long
quickly. I've seen this bring a repository to a halt: other queries stay
locked until MySQL's maximum number of connections is reached.
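
To give a rough feel for the growth, here is a minimal Python sketch (not EPrints code). It assumes one SELECT per (word, field) pair and plain whitespace tokenisation, which is a simplification of whatever EPrints really does; the function name is made up for illustration. The field list is the one shown in the searchexp quoted below.

```python
# Rough estimate only: assumes one SELECT per (word, field) pair and
# whitespace tokenisation. Both are assumptions, not EPrints internals.

def estimate_select_count(query, fields):
    """Return the number of (word, field) pairs a query would produce."""
    words = query.split()
    return len(words) * len(fields)


# Fields listed in the searchexp from John's message.
SIMPLE_SEARCH_FIELDS = ["abstract", "creators_name", "date", "documents", "title"]

# The citation that was pasted into the simple search box.
citation = (
    "‘Families, Domesticity and Intimacy: Changing Relationships in "
    "Changing Times’, in Richardson, D, and Robinson, V. (eds) "
    "Introducing Women's Studies, third edition. Basingstoke: Palgrave, "
    "2008 pp. 125-143."
)

print(estimate_select_count(citation, SIMPLE_SEARCH_FIELDS))
```

Every extra word adds one more lookup per field, so a pasted citation costs far more than a typical two or three word search.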
Xapian searches in a totally different manner and scales up. For example:
http://vmdev1.eprints.org/cgi/xapian?q=characteristics+of+organic+matter+and+carbonate+in+saltmarsh+sediments+from+south+west+Scotland
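
If moving to Xapian is not possible straight away, the word limit John asks about could be sketched roughly as below. This is Python for illustration only, not an EPrints API: the function name and the limit of 10 words are hypothetical, and where such a check would hook into the simple search is left open.

```python
# Illustrative sketch only: cap the number of words passed on to the search.
# MAX_WORDS and cap_query_words are hypothetical names, not EPrints settings.

MAX_WORDS = 10


def cap_query_words(query, max_words=MAX_WORDS):
    """Keep at most max_words whitespace-separated words of the query."""
    words = query.split()
    return " ".join(words[:max_words])


print(cap_query_words("characteristics of organic matter and carbonate", 3))
```

Silently dropping words changes what the user searched for, so showing a message when the cap is hit would probably be kinder than truncating quietly.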
Hope this helps,
Seb.
On 05/03/14 11:52, John Salter wrote:
> Hi,
> Yesterday we had a bit of an issue with our repository when someone pasted a full citation into the simple search box.
> This produced an impressive SQL query that locked things up and made users unhappy...
>
> Is there a way to sanitise what a 'simple' search might try to handle? e.g. would restricting it to a certain number of words be acceptable?
> Would the Xapian search handle a request like the one below any better?
>
> Details below/attached if you're interested!
> Cheers,
> John
>
> GET /cgi/search/simple?full=%E2%80%98Families%2C+Domesticity+and+Intimacy%3A+Changing+Relationships+in+Changing+Times%E2%80%99%2C+in+Richardson%2C+D%2C+and+Robinson%2C+V.+%28eds%29+Introducing+Women%27s+Studies%2C+third+edition.+Basingstoke%3A+Palgrave%2C+2008+pp.+125-143.+&_action_search=Search&_order=bytitle&basic_srchtype=ALL&_satisfyall=ALL
>
> searchexp created in cache table:
> 0|1|-date/creators_name/title|archive|-|full:abstract/creators_name/date/documents/title:ALL:IN:?Families, Domesticity and Intimacy%3A Changing Relationships in Changing Times?, in Richardson, D, and Robinson, V. (eds) Introducing Women's Studies, third edition. Basingstoke%3A Palgrave, 2008 pp. 125-143. |-|eprint_status:eprint_status:ANY:EQ:archive|metadata_visibility:metadata_visibility:ANY:EQ:show
>
> The SQL generated by search is attached (get ready for this - it's a thing of beauty ;o) - you can see why it took a while to run!