[EP-tech] Re: RFC access log table

Tim Brody tdb2 at ecs.soton.ac.uk
Fri Feb 15 13:53:55 GMT 2013


No, the access table consists of summary page and full-text impressions.

If that same robot requests full texts and its user-agent string doesn't
match a known robot, then it will be counted.
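
As a rough illustration of the user-agent matching above, combined with the
per-month CSV split and robot exclusion discussed in the thread quoted below,
here is a minimal Python sketch. It is not EPrints code: the robot patterns,
the row keys ('datestamp', 'user_agent', 'referent_id') and the function names
are all assumptions made for the example, and the real access table schema
and robot list may differ.

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical robot patterns for illustration only; this is not the
# list that EPrints itself uses.
ROBOT_PATTERNS = [r"bingbot", r"yandex", r"googlebot"]
ROBOT_RE = re.compile("|".join(ROBOT_PATTERNS), re.IGNORECASE)


def is_robot(user_agent):
    """Return True if the user-agent string matches a listed robot pattern."""
    return bool(ROBOT_RE.search(user_agent or ""))


def split_by_month(rows, exclude_robots=True):
    """Group access rows by 'YYYY-MM', optionally dropping robot requests.

    Each row is a dict with assumed keys 'datestamp'
    (formatted 'YYYY-MM-DD HH:MM:SS') and 'user_agent'.
    """
    months = defaultdict(list)
    for row in rows:
        if exclude_robots and is_robot(row.get("user_agent")):
            continue
        # The first seven characters of the datestamp are 'YYYY-MM'.
        months[row["datestamp"][:7]].append(row)
    return dict(months)


def to_csv(rows, fieldnames):
    """Serialise one month's rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A robot that sends a user-agent not covered by the patterns is kept by
split_by_month(), which is the "it will be counted" case described above.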

/Tim.

On Fri, 15 Feb 2013 10:48:48 +0000, John Salter <J.Salter at leeds.ac.uk>
wrote:
> Do OAI-PMH harvests appear in the access table?
> 
> 
> -----Original Message-----
> From: eprints-tech-bounces at ecs.soton.ac.uk
> [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Alan.Stiles
> Sent: 15 February 2013 10:30
> To: 'eprints-tech at ecs.soton.ac.uk'
> Subject: [EP-tech] Re: RFC access log table
> 
> Hi Tim,
> 
> Having a quick look through the access table, it might also be nice if
> there was the option to include / exclude a list of known robots and
> spiders from the csv dumps, and possibly just to strip them from the
> access table outside of the dumps, keeping it to a more manageable size
> without losing 'relevant' information - Bing and Yandex appear to be among
> our worst offenders.
> 
> Alan.
> 
> -----Original Message-----
> From: Tim Brody [mailto:tdb2 at ecs.soton.ac.uk] 
> Sent: 15 February 2013 09:32
> To: eprints-tech at ecs.soton.ac.uk
> Subject: [EP-tech] Re: RFC access log table
> 
> Hi,
> 
> Yes, there is nothing in the core that relies on data in access*.
> IRStats 1 & 2 use access to create their summary data.
> 
> It looks like the best solution is to provide a tool to periodically dump
> historic access data to files, but that it is still useful to keep
> "current" (defined by config) data in the database.
> 
> All the best,
> Tim.
> 
> On Fri, 15 Feb 2013 08:13:52 +0100, Yuri <yurj at alfa.it> wrote:
>> We have a test server which is a clone of the production server. Can I
>> empty those access tables safely to save space? :) Can I run a "DELETE
>> FROM access" without any issue? The same for access__ordervalues_en and
>> all the languages?
>> 
>> Il 15/02/2013 03:13, Mark Gregson ha scritto:
>>> Hi Tim
>>>
>>> Because of the DB backup issues we invested some time a while ago in some
>>> scripts for archiving the access data off to monthly dumps and for
>>> restoring it (if required, say by the need to have IRStats reprocess all
>>> data). These scripts are not actually in production use because I haven't
>>> had time to test them to my satisfaction (sorry Nick!).
>>>
>>> CSV is a more accessible format than a MySQL dump, which may be a
>>> benefit.
>>>
>>> We are using IRStats for statistics which uses the access table but I
>>> guess this will be easily updated with a new parser. We also do some
>>> custom logging to the access table for reporting on outbound link clicks
>>> via IRStats.  This logging is handled via EPrints::Apache::LogHandler.
>>>
>>> Cheers
>>> Mark
>>>
>>>
>>> -----Original Message-----
>>> From: eprints-tech-bounces at ecs.soton.ac.uk
>>> [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Tim Brody
>>> Sent: Thursday, 14 February 2013 8:01 PM
>>> To: eprints-tech at ecs.soton.ac.uk
>>> Subject: [EP-tech] RFC access log table
>>>
>>> Hi All,
>>>
>>> I'm thinking about the access log table and how it can be made
>>> sustainable.
>>>
>>> What I'm suggesting is to write accesses to CSV-formatted log files, one
>>> file per month. What I don't know is whether anyone is relying on the
>>> database table for generating statistics?
>>>
>>> The problem the access log table creates is in backing-up the EPrints
>>> database.
>>>
>>> I'd appreciate any thoughts/comments.
>>>
>>> --
>>> All the best,
>>> Tim
>>>
>>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>> *** Archive: http://www.eprints.org/tech.php/
>>> *** EPrints community wiki: http://wiki.eprints.org/
>> 
> 
> -- 
> All the best,
> Tim.

-- 
All the best,
Tim.

