[EP-tech] Re: Question about full text search (Documents in Advanced Search page)

Michael Street mstreet at yorku.ca
Wed Jan 27 18:23:12 GMT 2016


Hi Lizz,

Thanks, yes, I have found the tables on my own and can manually insert 
terms, and it works that way.  I just can't figure out where the 
disconnect is between what the Indexer is seeing and what it is, or 
isn't, inserting into the db.

I will check the video for more hints though, thanks.

--Mike
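
One way to narrow down that disconnect is to count how often the search term appears as a standalone word in the pdftotext output, and then compare that against what is in the index tables. The following is a minimal sketch only: it does not reproduce the EPrints Indexer's own tokenising rules, the helper names (normalise, count_term, pdf_text) are hypothetical, and folding accents (so that 'Böhm' matches 'bohm') is an assumption made for this check, not something confirmed in this thread.

```python
import re
import subprocess
import unicodedata


def normalise(text):
    """Lower-case text and strip diacritics, so 'Böhm' becomes 'bohm'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def count_term(text, term):
    """Count whole-word occurrences of term in text after normalisation."""
    target = normalise(term)
    # Normalise the whole text first so combining marks cannot split a word.
    words = re.findall(r"\w+", normalise(text))
    return sum(1 for word in words if word == target)


def pdf_text(path):
    """Return the text layer of a PDF via the pdftotext command-line tool.

    The '-' argument tells pdftotext to write to standard output.
    """
    result = subprocess.run(
        ["pdftotext", path, "-"], capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling count_term(pdf_text("some-deposit.pdf"), "bohm") on one of the documents that is missing from the results (the file name here is a placeholder) gives a number to hold against the manually inspected index rows. A non-zero count with no matching index entry would point at the indexing step rather than the text conversion.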


On 1/27/2016 11:24 AM, Lizz Jennings wrote:
> Hi Michael,
>
> Have you looked at the database entries for the indexes?  Adam showed which tables to look at (at about 6 minutes in) in the troubleshooting search video:
>
> http://wiki.eprints.org/w/Training_Video:Search_Troubleshooting
>
> That might offer a hint?
>
> Lizz
>
> --
>
> Lizz Jennings BA MSc ACLIP MCLIP (Revalidated 2015)
>
> Technical Data Officer
>
> The Library 4.10, University of Bath, Bath, BA2 7AY UK
>
> Ext. 3570 (External 01225 383570)
>
> E.Jennings at bath.ac.uk
>
> ________________________________________
> From: eprints-tech-bounces at ecs.soton.ac.uk <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Michael Street <mstreet at yorku.ca>
> Sent: 27 January 2016 15:46
> To: eprints-tech at ecs.soton.ac.uk
> Subject: [EP-tech] Re: Question about full text search (Documents in Advanced Search page)
>
> Hi folks,
>
> Is there any other way to actually find out more details about what the
> Indexer is doing?  I've turned on verbose logging and set loglevel 6, but
> I'd like to really know exactly what terms it's found, and what it's
> inserting, if anything, into the database.
>
> Thanks,
> Mike.
>
> On 1/27/2016 9:46 AM, Michael Street wrote:
>> Hi Alan,
>>
>> Thanks but I have tried that.  I've increased the logging verbosity and
>> tried reindexing one of the offending deposits.  Nothing in the logs.
>> To be honest, all I see in the logs is that there are no tasks.
>>
>> Occasionally I see something about documents being locked but the
>> numbers don't match up.  I'm not sure how the numbering system works
>> (for ex. 'document.5917 is locked').  I would assume though, that I
>> would see an error message when reindexing one of the offending
>> deposits.  I don't see anything when reindexing those, so I assume the
>> 'locked' message has nothing to do with it.
>>
>> I will try the Xapian plugin later to see if that makes any difference.
>>
>> --Mike
>>
>>
>> On 1/25/2016 4:20 AM, Alan.Stiles wrote:
>>> Have you tried to reindex one of the missing items to see if it made a difference?  Check the error_log whilst it reindexes in case eprints is having some other issue with opening the pdf (we sometimes have issues with e.g. apostrophes in the filenames).
>>>
>>>
>>> -----Original Message-----
>>> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Michael Street
>>> Sent: 22 January 2016 21:01
>>> To: eprints-tech at ecs.soton.ac.uk
>>> Subject: [EP-tech] Re: Question about full text search (Documents in Advanced Search page)
>>>
>>> Hi again,
>>>
>>> Does anyone have any idea why these documents are not showing up in the search results?
>>>
>>> Any suggestions would really be appreciated.  I'm at a loss as to why it's not returning results that clearly have the search term in the pdf (and the converted text document).
>>>
>>> --Mike Street
>>>
>>> On 1/15/2016 11:05 AM, Michael Street wrote:
>>>> Hi John,
>>>>
>>>> Thanks very much for your response.  Please find my answers below:
>>>>
>>>> 1)  Indexer is running and confirmed to be working.  The documents
>>>> that don't show up are some of the oldest and are available through
>>>> other links.  Newly deposited items also show up in the Views.
>>>>
>>>> 2)  I have tried pdftotext on the system and had no issues with
>>>> converting it.  I also was able to find the search term within the
>>>> document easily.
>>>>
>>>> 3)  I run a cronjob that updates the DB and switches everything to be
>>>> visible, every 15 minutes.  My client does not want anything to be
>>>> hidden, especially previous versions of eprints, so this was the
>>>> easiest way to achieve that, for me.  Also, the eprints in question do
>>>> show up in the Views, which shows they're set to visible.
>>>>
>>>> So if you have any other ideas, I'd really appreciate it.  I'm at a
>>>> loss here.
>>>>
>>>> Thanks,
>>>> Mike.
>>>>
>>>>
>>>> On 1/14/2016 4:35 PM, John Salter wrote:
>>>>> Hi,
>>>>> I'd check that your indexer is running, and that the task queue is processed.
>>>>>
>>>>> I'd also check that the PDFs aren't restricted in some way (maybe see what something like pdftotext returns when run against one of the not-returned PDFs).
>>>>>
>>>>> Also, as was mentioned in a different thread recently, check what the 'metadata visibility' flag for the EPrint is.
>>>>>
>>>>> If none of that gets you anywhere, let us know and we'll put our collective thinking caps on!
>>>>>
>>>>> Cheers,
>>>>> John
>>>>>
>>>>> ________________________________________
>>>>> From: eprints-tech-bounces at ecs.soton.ac.uk
>>>>> <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Michael Street
>>>>> <mstreet at yorku.ca>
>>>>> Sent: 14 January 2016 16:04
>>>>> To: eprints-tech at ecs.soton.ac.uk
>>>>> Subject: [EP-tech] Question about full text search (Documents in Advanced Search page)
>>>>>
>>>>> Hi,
>>>>>
>>>>> I've got some pdfs in the repository that include the phrase 'bohm'
>>>>> many times but the Advanced Search page is only returning 4 out of
>>>>> probably 25+ eprints as hits on the phrase.  I'm using the Documents
>>>>> search box, which I believe is the full-text search box.  Is there
>>>>> something I'm missing?
>>>>>
>>>>> Any help would be appreciated thanks, Mike.
>>>>>
>>>>> *** Options:
>>>>> http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>>>> *** Archive: http://www.eprints.org/tech.php/
>>>>> *** EPrints community wiki: http://wiki.eprints.org/
>>>>> *** EPrints developers Forum: http://forum.eprints.org/
>>> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.
