[EP-tech] Eprints-tech Digest, Vol 91, Issue 43

martin.braendle at id.uzh.ch
Thu Apr 28 06:33:51 BST 2016


Hi Adam

For robot crawling, we have a separate view that creates no variations. Accordingly, robots.txt is configured so that the other views cannot be crawled.
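
A minimal sketch of such a setup, under stated assumptions: the robots.txt rules below are illustrative only, with /view/robots/ standing in for a hypothetical variation-free view and /view/ covering the other browse views such as /view/person/. The small Python check uses the standard library's urllib.robotparser to confirm that rules of this shape would behave as intended.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not a real site's file).
# "/view/robots/" is an invented id for a view without variations;
# "/view/" covers every other browse view, including "/view/person/".
ROBOTS_TXT = """\
User-agent: *
Allow: /view/robots/
Disallow: /view/
"""


def can_crawl(path, agent="ExampleBot"):
    """Return True if the hypothetical rules let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in (
        "/view/robots/",
        "/view/person/Kaiser=3AMario=3A=3A.html",
        "/15081/",
    ):
        print(path, can_crawl(path))
```

Python's parser applies the first matching rule, so the Allow line is listed before the broader Disallow. Crawlers that use longest-match precedence should reach the same result for these paths.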

Best regards

Martin




> On 27.04.2016 at 17:20, Adam Field <Adam.Field at jisc.ac.uk> wrote:
> 
> Deleting old files is good, but not a full solution (you are likely to be crawled by all sorts of robots, so all files will be generated).  It will only be a solution if there are lots of files from an old configuration that were never deleted.
> 
> The --generate menus option won't help either, I'm afraid.  It'll just regenerate the menus, which isn't connected to the problem you're having.
> 
> I've had a peek at the XML of one of your records and made some assumptions about the nature of your repository.  Have you thought of filtering the person_view field so that it only contains institutional authors?  For example, https://eref.uni-bayreuth.de/15081/ contains:
> 
>     <person_view>
>       <item>
>         <name>
>           <family>Bornkamm</family>
>           <given>Joachim</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Brömmelmeyer</family>
>           <given>Christoph</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Brönneke</family>
>           <given>Tobias</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Bultmann</family>
>           <given>Friedrich</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Busch</family>
>           <given>Dörte</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Derleder</family>
>           <given>Peter</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Ernst</family>
>           <given>Stefan</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Hirsch</family>
>           <given>Günter</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Hörmann</family>
>           <given>Günter</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Kohte</family>
>           <given>Wolfhard</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Maier</family>
>           <given>Arne</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Metz</family>
>           <given>Rainer</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Rott</family>
>           <given>Peter</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Schmidt-Kessel</family>
>           <given>Martin</given>
>         </name>
>         <id>martin.schmidt-kessel at uni-bayreuth.de</id>
>         <ubt>yes</ubt>
>       </item>
>       <item>
>         <name>
>           <family>Schwintowski</family>
>           <given>Hans-Peter</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Stadler</family>
>           <given>Astrid</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Tamm</family>
>           <given>Marina</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Tiffe</family>
>           <given>Achim</given>
>         </name>
>       </item>
>       <item>
>         <name>
>           <family>Tonner</family>
>           <given>Klaus</given>
>         </name>
>       </item>
>     </person_view>
> 
> You'd get good utility from your repository if it was only:
> 
>     <person_view>
>       <item>
>         <name>
>           <family>Schmidt-Kessel</family>
>           <given>Martin</given>
>         </name>
>         <id>martin.schmidt-kessel at uni-bayreuth.de</id>
>         <ubt>yes</ubt>
>       </item>
>     </person_view>
> 
> I'm assuming you really only care deeply about people with the ubt flag set to 'yes'.
> 
>  
> Adam Field
> SHERPA services analyst developer
> 
> From: <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Verena Mattes <verena.mattes at ub.uni-bayreuth.de>
> Reply-To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Date: Wednesday, 27 April 2016 14:37
> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Subject: Re: [EP-tech] Eprints-tech Digest, Vol 91, Issue 43
> 
> Hi Martin,
> 
> The deletion of older files is definitely a good idea, I'm looking at
> that now.
> 
> Not sure about the --generate menus option, but I'm going to try it out
> with our Eprints test repository.
> 
> Thanks!
> 
> Verena
> 
> 
> > Date: Wed, 27 Apr 2016 11:18:12 +0200
> > From: martin.braendle at id.uzh.ch
> > Subject: [EP-tech] Antwort: Re: Problems with view generation: EPrints
> > System Error
> > To: eprints-tech at ecs.soton.ac.uk
> > Message-ID:
> > <OF8739E530.67C3E499-ONC1257FA2.0032CB1D-C1257FA2.00331B1A at lotus.uzh.ch>
> >
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> >
> > Hi Verena,
> >
> > Did you try out the --generate menus option of generate_views? This
> > reduces the number of files considerably.
> >
> > Also, in our nightly cron job, we delete all files that are older than 24
> > hours.
> >
> > Best regards,
> >
> > Martin
> >
> > --
> > Dr. Martin Brändle
> > Zentrale Informatik
> > Universität Zürich
> > Stampfenbachstr. 73
> > CH-8006 Zürich
> >
> > mail: martin.braendle at id.uzh.ch
> > phone: +41 44 63 56705
> > fax: +41 44 63 54505
> > http://www.zi.uzh.ch
> >
> >
> >
> > From: Verena Mattes <verena.mattes at ub.uni-bayreuth.de>
> > To: eprints-tech at ecs.soton.ac.uk
> > Date: 27/04/2016 10:54
> > Subject: Re: [EP-tech] Problems with view generation: EPrints System Error
> > Sent by: eprints-tech-bounces at ecs.soton.ac.uk
> >
> >
> >
> > Hi Adam,
> >
> > For now, my colleagues in the IT service centre solved the problem by
> > raising the number of files permitted in one directory on our Netapp,
> > but that is only a temporary solution. Almost 340,000 files in one
> > directory are just too many, so I'll have to find a way to split the
> > person view into groups somehow.
> >
> > Your suggestion of checking the number of variations should be a first
> > step in reducing the number of files. Here's our current view
> > configuration, which I'm going to change by taking away the DEFAULT
> > variation:
> >
> >>         {
> >>                  id => "person",
> >>                  hideempty => 0,
> >>                  allow_null => 0,
> >>                  menus => [
> >>                          {
> >>                                  fields => [ "person_view_name" ],
> >>                                  mode => "sections",
> >>                                  grouping_function => "EPrints::Update::Views::group_by_first_character",
> >>                                  group_range_function => "EPrints::Update::Views::cluster_ranges_40",
> >>                                  new_column_at => [40],
> >>                                  open_first_section => 1,
> >>                          },
> >>                  ],
> >>                  order => "-date;res=year/title/publication/book_title",
> >>                  hideup => 0,
> >>                  nocount => 0,
> >>                  notimestamp => 0,
> >>                  include => 1,
> >>                  variations => [
> >>                          "type",
> >>                          "date;truncate=4,reverse",
> >>                          "DEFAULT",
> >>                  ],
> >>                  citation => "view",            # Views with full-text indicator!
> >>         },
> >
> > Thanks for all your help!
> >
> > Verena
> >
> >>
> >> Message: 1
> >> Date: Tue, 26 Apr 2016 08:35:49 +0000
> >> From: Adam Field <Adam.Field at jisc.ac.uk>
> >> Subject: Re: [EP-tech] Problems with view generation: EPrints System
> >> Error
> >> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> >> Message-ID: <C2A1398C-702E-4C79-B4CD-2BAD5987789F at jisc.ac.uk>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >> On reflection, my previous suggestion wasn't right and won't work exactly as
> >> expected.
> >>
> >> What happens when you run generate_views from the command line?
> >> Just how many files do you have in the directory on the hard disk?
> >>
> >> Can you paste in the browse view configuration?  It may be that turning
> >> off variations will solve this.
> >>
> >> [Jisc]<http://www.jisc.ac.uk/>
> >>
> >> Adam Field
> >> SHERPA services analyst developer
> >>
> >>
> >> From: <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Verena Mattes <verena.mattes at ub.uni-bayreuth.de>
> >> Reply-To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> >> Date: Tuesday, 26 April 2016 07:48
> >> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> >> Subject: Re: [EP-tech] Problems with view generation: EPrints System Error
> >>
> >> Hi Alan and Adam,
> >>
> >> Thanks for your suggestions. We are running EPrints 3.3.11 and the free
> >> space on our hard disk is not a problem. I checked with a colleague and
> >> he confirmed my guess that the problem is caused by the number of files
> >> in the directory. Apparently the maximum number was reached last week and
> >> no further files can be created.
> >> Does anybody have a quick idea on how to divide the files for one view, the
> >> person view, across different directories? Would it be possible to define
> >> different views for groups of letters, e.g. A-D or something like that?
> >>
> >> Thanks!
> >>
> >> Verena
> >>
> >>
> >>
> >> Date: Mon, 25 Apr 2016 10:58:47 +0000
> >> From: Adam Field <Adam.Field at jisc.ac.uk>
> >> Subject: Re: [EP-tech] Problems with view generation: EPrints System
> >> Error
> >> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> >> Message-ID: <3D933692-5E6D-4B33-9499-1DD23E2B4C62 at jisc.ac.uk>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >> What specific version of EPrints are you running?  What is line 1550
> >> of /usr/share/eprints3/perl_lib/EPrints/Update/Views.pm?
> >>
> >> Also, silly question, but how much free space do you have on your hard
> >> disk?
> >>
> >> [Jisc]<http://www.jisc.ac.uk/>
> >>
> >> Adam Field
> >> SHERPA services analyst developer
> >>
> >>
> >> From: <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Verena Mattes <verena.mattes at ub.uni-bayreuth.de>
> >> Reply-To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> >> Date: Monday, 25 April 2016 09:51
> >> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> >> Subject: [EP-tech] Problems with view generation: EPrints System Error
> >>
> >> Hello,
> >>
> >> Since last week, we've had problems with the view generation for our
> >> author/person view. These problems specifically concern the views for
> >> new author names, which are not generated, while the views for already
> >> existing author names are updated.
> >>
> >> For each author name concerned, there is an entry in the Apache error
> >> log:
> >>
> >> Error writing to /usr/share/eprints3/archives/ubt_eref/html/de/view/person/Kaiser=3AMario=3A=3A.export: File too large
> >> ------------------------------------------------------------------
> >>       at /usr/share/eprints3/perl_lib/EPrints/Update/Views.pm line 1550
> >>              EPrints::Update::Views::output_files('EPrints::Repository=HASH(0x7fb0663688d0)', '/usr/share/eprints3/archives/ubt_eref/html/de/view/person/Kai...', 'XML::LibXML::DocumentFragment=SCALAR(0x7fb06a7e5938)', '/usr/share/eprints3/archives/ubt_eref/html/de/view/person/Kai...', 'XML::LibXML::Element=SCALAR(0x7fb06a45f160)', '/usr/share/eprints3/archives/ubt_eref/html/de/view/person/Kai...', 'XML::LibXML::DocumentFragment=SCALAR(0x7fb06a7e5938)', '/usr/share/eprints3/archives/ubt_eref/html/de/view/person/Kai...', 'XML::LibXML::DocumentFragment=SCALAR(0x7fb06a668aa8)', ...) called at /usr/share/eprints3/perl_lib/EPrints/Update/Views.pm line 935
> >>              EPrints::Update::Views::update_view_list('EPrints::Repository=HASH(0x7fb0663688d0)', '/usr/share/eprints3/archives/ubt_eref/html/de/view/person/Kai...', 'de', 'EPrints::Update::Views=HASH(0x7fb06a6863f0)', 'ARRAY(0x7fb064b78c70)') called at /usr/share/eprints3/perl_lib/EPrints/Update/Views.pm line 259
> >>              EPrints::Update::Views::update_view_file('EPrints::Repository=HASH(0x7fb0663688d0)', 'de', '/view/person/Kaiser=3AMario=3A=3A.html', '/view/person/Kaiser=3AMario=3A=3A.html') called at /usr/share/eprints3/perl_lib/EPrints/Apache/Rewrite.pm line 513
> >>              EPrints::Apache::Rewrite::handler('Apache2::RequestRec=SCALAR(0x7fb06a471710)') called at -e line 0
> >>              eval {...} called at -e line 0
> >>
> >> In this specific case, the author's name is linked to only 2 EPrints
> >> entries, so it is unlikely that this file would be particularly large.
> >>
> >> Does anybody have an idea concerning this? I would appreciate any help.
> >>
> >> Thanks!
> >>
> >> Verena