[EP-tech] Re: Indexing files inside a compressed (zip, rar) document

sf2 sf2 at ecs.soton.ac.uk
Mon Feb 17 15:15:07 GMT 2014


Hi Andras, 

If you're on 3.3, have a look at lib/cfg.d/search_xapian.pl and find
where it does the full-text indexing (look for "Convert"). It might be
easier just to write a zip/tar to "text/plain" Convert plug-in.
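
As a rough illustration of what such a conversion has to do (a
standalone sketch, not EPrints code: a real Convert plug-in would be
written in Perl against the EPrints plug-in API, which is not shown
here), the core step is to walk the archive and concatenate the
text-like members into one plain-text blob for the indexer. The
function name, the extension list and the size cap below are all
assumptions made for the example.

```python
import zipfile

# Assumption for this sketch: members with these extensions are treated as text.
TEXT_EXTENSIONS = (".txt", ".csv", ".tsv", ".md", ".xml", ".json")


def zip_to_text(archive, max_member_bytes=1_000_000):
    """Concatenate the text-like members of a zip archive into one string.

    `archive` is a path or a binary file-like object. Each included member
    contributes its file name followed by its decoded content, so both the
    names and the contents become searchable.
    """
    parts = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            if not info.filename.lower().endswith(TEXT_EXTENSIONS):
                continue  # skip binary formats in this simple sketch
            if info.file_size > max_member_bytes:
                continue  # skip members whose uncompressed size is too large
            data = zf.read(info)
            parts.append(info.filename)
            # Undecodable bytes are replaced rather than raising an error.
            parts.append(data.decode("utf-8", errors="replace"))
    return "\n".join(parts)
```

Binary members (PDF, Office files and so on) would need their own
text extraction before they could be included; this sketch simply
skips them.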

Note that at the moment no consideration is given to the "security"
settings of the documents. It might be something you need to address
when you're indexing research data. 

If you come up with something, feel free to share your work on e.g.
GitHub (github.com/eprints/eprints), thanks! 


On 17.02.2014 14:39, András Micsik wrote: 

> Hi,
> do you have any hint on how to extend the indexer to index the inside of 
> zip/rar/etc archives? Is there any ready solution for this, or do I have to 
> write an indexer plugin? The rationale: the large number of files 
> contain research data, so they are most easily handled as a zip, but it 
> would still be nice to search inside...
> thanks,
