[EP-tech] Indexing issues

Hilliard, Richard C. rchilliard at mun.ca
Wed Apr 23 14:17:47 BST 2014


Hi All,

We seem to be running into some indexer issues and are wondering whether anyone
else is seeing the same. On a full reindex, all seems to go well for
hundreds of eprints, after which we get piles of pdftotext errors.
After the reindex, content for the eprints named in the errors is (predictably)
not in the index. An example run is at the bottom; any suggestions on what
might be happening would be welcome.

 

Cheers,

Casey

 

 

Example run:

 

eprints@puppy:~/bin$ ./epadmin reindex research_eprints eprint --verbose

 

Starting EPrints Repository.

Connecting to DB ... done.

 

You are about to reindex "eprint" in the research_eprints repository.

This can take some time.

 

Number of records in set: 3980

Continue [y/n] ? es

Exception: Unable to get write lock on
/usr/share/eprints3/archives/research_eprints/var/xapian: already locked

Indexed item: eprint/1

Indexed item: eprint/3

...

Indexed item: eprint/1488

Error 255 from pdftotext command: \/usr\/bin\/pdftotext -enc UTF-8 -layout \/usr\/share\/eprints3\/archives\/research_eprints\/documents\/disk0\/00\/00\/14\/90\/01\/Whelan_Maudie\.pdf \/tmp\/jdeN7U21yV\/Whelan_Maudie\.txt at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/BackCompatibility.pm line 463

        EPrints::Platform::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'TARGET', '/tmp/jdeN7U21yV/Whelan_Maudie.txt', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x66d6300)', 'SOURCE', 'File::Temp=GLOB(0x6d3ebd0)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1994

        EPrints::Repository::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'SOURCE', 'File::Temp=GLOB(0x6d3ebd0)', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x66d6300)', 'TARGET', '/tmp/jdeN7U21yV/Whelan_Maudie.txt') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Convert/PlainText.pm line 141

        EPrints::Plugin::Convert::PlainText::export('EPrints::Plugin::Convert::PlainText=HASH(0x6711160)', 'File::Temp::Dir=HASH(0x66d6300)', 'EPrints::DataObj::Document=HASH(0x67d0430)', 'text/plain') called at (eval 102) line 137

        EPrints::Config::research_eprints::__ANON__('repository', 'EPrints::Repository=HASH(0x37ed110)', 'fields', 'ARRAY(0x67c5e40)', 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6f74250)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1551

        EPrints::Repository::run_trigger('EPrints::Repository=HASH(0x37ed110)', 11, 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6f74250)', 'fields', 'ARRAY(0x67c5e40)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 68

        EPrints::Plugin::Event::Indexer::_index_fields('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6f74250)', 'ARRAY(0x67c5e40)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 37

        EPrints::Plugin::Event::Indexer::index_all('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6f74250)') called at ./epadmin line 1992

        main::__ANON__('EPrints::Repository=HASH(0x37ed110)', 'EPrints::DataSet=HASH(0x3f8d198)', 'EPrints::DataObj::EPrint=HASH(0x6f74250)', undef) called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/List.pm line 664

        EPrints::List::map('EPrints::List=HASH(0x2952a48)', 'CODE(0x358c368)') called at ./epadmin line 1998

        main::reindex('research_eprints', 'eprint') called at ./epadmin line 344

 

Error 255 from pdftotext command: \/usr\/bin\/pdftotext -enc UTF-8 -layout \/usr\/share\/eprints3\/archives\/research_eprints\/documents\/disk0\/00\/00\/14\/90\/03\/Whelan_Maudie\.pdf \/tmp\/jdeN7U21yV\/Whelan_Maudie\.txt at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/BackCompatibility.pm line 463

        EPrints::Platform::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'TARGET', '/tmp/jdeN7U21yV/Whelan_Maudie.txt', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x66d6300)', 'SOURCE', 'File::Temp=GLOB(0x7137318)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1994

        EPrints::Repository::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'SOURCE', 'File::Temp=GLOB(0x7137318)', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x66d6300)', 'TARGET', '/tmp/jdeN7U21yV/Whelan_Maudie.txt') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Convert/PlainText.pm line 141

        EPrints::Plugin::Convert::PlainText::export('EPrints::Plugin::Convert::PlainText=HASH(0x6e40e20)', 'File::Temp::Dir=HASH(0x66d6300)', 'EPrints::DataObj::Document=HASH(0x6c02ec8)', 'text/plain') called at (eval 102) line 137

        EPrints::Config::research_eprints::__ANON__('repository', 'EPrints::Repository=HASH(0x37ed110)', 'fields', 'ARRAY(0x67c5e40)', 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6f74250)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1551

        EPrints::Repository::run_trigger('EPrints::Repository=HASH(0x37ed110)', 11, 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6f74250)', 'fields', 'ARRAY(0x67c5e40)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 68

        EPrints::Plugin::Event::Indexer::_index_fields('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6f74250)', 'ARRAY(0x67c5e40)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 37

        EPrints::Plugin::Event::Indexer::index_all('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6f74250)') called at ./epadmin line 1992

        main::__ANON__('EPrints::Repository=HASH(0x37ed110)', 'EPrints::DataSet=HASH(0x3f8d198)', 'EPrints::DataObj::EPrint=HASH(0x6f74250)', undef) called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/List.pm line 664

        EPrints::List::map('EPrints::List=HASH(0x2952a48)', 'CODE(0x358c368)') called at ./epadmin line 1998

        main::reindex('research_eprints', 'eprint') called at ./epadmin line 344

 

Indexed item: eprint/1503

Error 255 from pdftotext command: \/usr\/bin\/pdftotext -enc UTF-8 -layout \/usr\/share\/eprints3\/archives\/research_eprints\/documents\/disk0\/00\/00\/15\/04\/01\/Emberley\-Burke_Wanda\.pdf \/tmp\/diwwe8oTDs\/Emberley\-Burke_Wanda\.txt at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/BackCompatibility.pm line 463

        EPrints::Platform::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'TARGET', '/tmp/diwwe8oTDs/Emberley-Burke_Wanda.txt', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x6af6f88)', 'SOURCE', 'File::Temp=GLOB(0x70c9260)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1994

        EPrints::Repository::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'SOURCE', 'File::Temp=GLOB(0x70c9260)', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x6af6f88)', 'TARGET', '/tmp/diwwe8oTDs/Emberley-Burke_Wanda.txt') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Convert/PlainText.pm line 141

        EPrints::Plugin::Convert::PlainText::export('EPrints::Plugin::Convert::PlainText=HASH(0x6993008)', 'File::Temp::Dir=HASH(0x6af6f88)', 'EPrints::DataObj::Document=HASH(0x6b9ae38)', 'text/plain') called at (eval 102) line 137

        EPrints::Config::research_eprints::__ANON__('repository', 'EPrints::Repository=HASH(0x37ed110)', 'fields', 'ARRAY(0x65d2750)', 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1551

        EPrints::Repository::run_trigger('EPrints::Repository=HASH(0x37ed110)', 11, 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)', 'fields', 'ARRAY(0x65d2750)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 68

        EPrints::Plugin::Event::Indexer::_index_fields('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)', 'ARRAY(0x65d2750)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 37

        EPrints::Plugin::Event::Indexer::index_all('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)') called at ./epadmin line 1992

        main::__ANON__('EPrints::Repository=HASH(0x37ed110)', 'EPrints::DataSet=HASH(0x3f8d198)', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)', undef) called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/List.pm line 664

        EPrints::List::map('EPrints::List=HASH(0x2952a48)', 'CODE(0x358c368)') called at ./epadmin line 1998

        main::reindex('research_eprints', 'eprint') called at ./epadmin line 344

 

Error 255 from pdftotext command: \/usr\/bin\/pdftotext -enc UTF-8 -layout \/usr\/share\/eprints3\/archives\/research_eprints\/documents\/disk0\/00\/00\/15\/04\/03\/Emberley\-Burke_Wanda\.pdf \/tmp\/diwwe8oTDs\/Emberley\-Burke_Wanda\.txt at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/BackCompatibility.pm line 463

        EPrints::Platform::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'TARGET', '/tmp/diwwe8oTDs/Emberley-Burke_Wanda.txt', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x6af6f88)', 'SOURCE', 'File::Temp=GLOB(0x71b74a8)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1994

        EPrints::Repository::exec('EPrints::Repository=HASH(0x37ed110)', 'pdftotext', 'SOURCE', 'File::Temp=GLOB(0x71b74a8)', 'TARGET_DIR', 'File::Temp::Dir=HASH(0x6af6f88)', 'TARGET', '/tmp/diwwe8oTDs/Emberley-Burke_Wanda.txt') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Convert/PlainText.pm line 141

        EPrints::Plugin::Convert::PlainText::export('EPrints::Plugin::Convert::PlainText=HASH(0x686c748)', 'File::Temp::Dir=HASH(0x6af6f88)', 'EPrints::DataObj::Document=HASH(0x71fefb0)', 'text/plain') called at (eval 102) line 137

        EPrints::Config::research_eprints::__ANON__('repository', 'EPrints::Repository=HASH(0x37ed110)', 'fields', 'ARRAY(0x65d2750)', 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)') called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/Repository.pm line 1551

        EPrints::Repository::run_trigger('EPrints::Repository=HASH(0x37ed110)', 11, 'dataobj', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)', 'fields', 'ARRAY(0x65d2750)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 68

        EPrints::Plugin::Event::Indexer::_index_fields('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)', 'ARRAY(0x65d2750)') called at /usr/share/eprints3/perl_lib/EPrints/Plugin/Event/Indexer.pm line 37

        EPrints::Plugin::Event::Indexer::index_all('EPrints::Plugin::Event::Indexer=HASH(0x64f7660)', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)') called at ./epadmin line 1992

        main::__ANON__('EPrints::Repository=HASH(0x37ed110)', 'EPrints::DataSet=HASH(0x3f8d198)', 'EPrints::DataObj::EPrint=HASH(0x6ed82c8)', undef) called at /mnt/eprintsdrive/eprints3/bin/../perl_lib/EPrints/List.pm line 664

        EPrints::List::map('EPrints::List=HASH(0x2952a48)', 'CODE(0x358c368)') called at ./epadmin line 1998

        main::reindex('research_eprints', 'eprint') called at ./epadmin line 344
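
To narrow down which files are affected, the failing PDF paths can be pulled out of a saved copy of a run like the one above and fed back to pdftotext by hand, so that its own error message is visible. What follows is a rough sketch in Python, not part of EPrints: the names failing_pdfs, rerun and report are made up for illustration, the "-enc UTF-8 -layout" options are copied from the errors above, and the rejoining of mail-wrapped paths is a heuristic that assumes a continuation line starts with a slash.

```python
import re
import subprocess

# One error header as printed in the run above. The escaped command runs
# up to the " at <file> line <n>" suffix that Perl appends to the message.
ERROR_RE = re.compile(
    r"Error (\d+) from pdftotext command:\s*(.*?)\s+at\s+\S+\s+line\s+\d+",
    re.DOTALL,
)

# A token that starts with "/" and ends in ".pdf". It may be split over
# several lines, as long as each continuation line begins with "/".
PDF_RE = re.compile(r"(?<!\S)/\S+?(?:\n/\S+?)*\.pdf(?!\S)")


def failing_pdfs(log_text):
    """Return (exit_code, pdf_path) pairs, one per pdftotext error found."""
    found = []
    for code, command in ERROR_RE.findall(log_text):
        # The log shows the command with backslash escapes; drop them.
        command = re.sub(r"\\(.)", r"\1", command)
        match = PDF_RE.search(command)
        if match:
            # Rejoin a path that was wrapped in the middle.
            found.append((int(code), match.group(0).replace("\n", "")))
    return found


def rerun(pdf_path, pdftotext="/usr/bin/pdftotext"):
    """Run the same conversion by hand; return (exit_code, stderr text)."""
    proc = subprocess.run(
        # "-" as the output file sends the text to stdout, discarded here.
        [pdftotext, "-enc", "UTF-8", "-layout", pdf_path, "-"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stderr.strip()


def report(log_path):
    """Print one line per failing document found in a saved reindex log."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        log_text = handle.read()
    for code, pdf in failing_pdfs(log_text):
        status, stderr = rerun(pdf)
        print(f"{pdf}: indexer saw {code}, by hand {status}: {stderr}")
```

Calling report("reindex.log"), where reindex.log is a hypothetical file holding the captured output, prints each failing document with the exit code and stderr that pdftotext gives when run directly. It is probably best run as the same user as the indexer, so that file permissions match.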

 

 

-----------------------------------------------

Casey Hilliard

System Administrator 

Library Information Technology Services (LITS)

Memorial University of Newfoundland

Ph: (709)864-6267

Ce: (709)699-3041

 
