[EP-tech] Fulltext (PDF) index

John Salter J.Salter at leeds.ac.uk
Mon Dec 6 15:23:18 GMT 2021


CAUTION: This e-mail originated outside the University of Southampton.
Hi Mohd,
I would check to see if the indexer is running, and if the task queue has anything in it.

The quickest way to do this is to visit:
https://[your repository URL]/cgi/counter

This should present a text response. Look for 'event_queue' and 'indexer'.

In EPrints, the fulltext indexing jobs are placed in the event_queue.
The 'indexer' works through this queue.

Normally, the 'indexer' should report as 'running', and the event_queue should be close to zero - meaning the indexer is doing what is needed.
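
If you want to automate that check, something like the following might do as a rough sketch. It assumes the /cgi/counter response is made up of 'name: value' lines, which is worth confirming against your own repository's output. The function names and the max_queue threshold are made up for illustration and are not part of EPrints.

```python
from urllib.request import urlopen


def parse_counter(text):
    """Parse 'name: value' lines into a dict (assumed response format)."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def indexer_looks_healthy(values, max_queue=10):
    """True if the indexer reports 'running' and the queue is short.

    max_queue is an arbitrary threshold chosen for this sketch.
    """
    try:
        queued = int(values.get("event_queue", ""))
    except ValueError:
        # Missing or non-numeric event_queue: treat as unhealthy.
        return False
    return values.get("indexer") == "running" and queued <= max_queue


def fetch_counter(base_url):
    """Fetch the /cgi/counter text from the repository at base_url."""
    with urlopen(base_url.rstrip("/") + "/cgi/counter", timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

You would call fetch_counter() with your repository URL, then pass the result through parse_counter() and indexer_looks_healthy().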

If the indexer is either 'stopped' or 'stalled', try running the indexer with one of these parameters:
~/bin/indexer [status | stop | start]
The indexer writes a log to ~/var/indexer.log - if something is causing the indexer to stop, there may be some information in there.
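
To look at just the most recent log entries, a small script along these lines could help. It is only a sketch: it assumes it is run as the EPrints user, so that the home directory is the EPrints install directory, and the tail() helper is my own, not something EPrints provides.

```python
from collections import deque
from pathlib import Path


def tail(path, count=50):
    """Return the last `count` lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    # Assumption: ~ is the EPrints root, so the log is ~/var/indexer.log.
    log_path = Path.home() / "var" / "indexer.log"
    if log_path.exists():
        print("\n".join(tail(log_path)))
    else:
        print(f"No log found at {log_path}")
```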

To see what is actually in the event_queue (rather than just how many items are waiting), in the web interface, go to the Admin menu -> Manage Records -> Tasks.
If there are a lot of items, you can sort the list, or filter on the start time, status etc.

Hopefully that helps!

Cheers,
John

________________________________
From: eprints-tech-bounces at ecs.soton.ac.uk <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of MOHD.IZWAN SALIM via Eprints-tech <eprints-tech at ecs.soton.ac.uk>
Sent: 06 December 2021 04:04
To: EDER Norbert via Eprints-tech <eprints-tech at ecs.soton.ac.uk>
Subject: [EP-tech] Fulltext (PDF) index

Dear EPrints Community

I have just set up a new repository with the latest EPrints version.

However, searching for a word in a PDF (full text) does not return any results.

The PDF has already been OCRed and is searchable.

I have already run ./epadmin erase_fulltext_index repo --verbose

Is there anything else I should look at?

Regards

Mohd Izwan Bin Salim
UiTM Digital Library

