[EP-tech] EPrints Database entity relationship diagram

Christopher Gutteridge cjg at ecs.soton.ac.uk
Fri Dec 15 10:53:04 GMT 2017


Hi. Long story short the database represents a set of datasets. eg. 
EPrint, Document, User. Each of these has a primary table containing the 
main record and any singular fields, eg title of an eprint. Then there 
is one table for every "multiple" field, eg. subjects. Compound multiple 
fields have one table for each field due to an implementation detail. eg 
creators is really stored as two multiple fields. creators_id and 
creators_name

In addition, each dataset has an ordervalues and an index table. The 
index table contains the words to match in full text searches. The 
ordervalues table contains a cached version of each field flattened into 
a string to use for sorting by that field.

Some eprint fields link to other objects, eg. eprint has a userid of the 
depositing user, document has the eprintid to which the document belongs 
(there's quite a few of these)

Finally there's cache tables which are used so that it doesn't repeat a 
search when you go to page two of the results.

Some of the database features aren't taken advantage of because it would 
tie EPrints to one backend and some sites want to use other SQL databases.

Deep in the admin pages there's this function 
https://eprints.soton.ac.uk/cgi/users/home?screen=DatabaseSchema which 
gives you a general idea of what database tables are. Some tables may be 
created by plugins that EPrints core doesn't know about.


On 14/12/2017 16:45, Alan.Stiles wrote:
> I've just tried creating a model of our instance from the database tool we use.  It took 20 pages just to draw the database tables, though 5 pages of that were probably cache tables.  That's without drawing any links between the tables, or leaving enough space to discern which line goes from where to where...
> We have something like 60 tables that all start eprints_* e.g. creators_name, creators_id, editors_name...  I really don't want to try drawing that ERD alone, never mind including the user table or the REF2014 plugin tables!
>
> I once got asked for similar by our central IT team (possibly "Do you have a list of fields in the database?") Once I started explaining the volume of stuff they were asking about they decided they had a much more specific query to ask!
>
> Alan
>
> -----Original Message-----
> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of David R Newman
> Sent: 14 December 2017 16:10
> To: eprints-tech at ecs.soton.ac.uk
> Subject: Re: [EP-tech] EPrints Database entity relationship diagram
>
> Hi Andrew,
>
> I am not aware of one and they would vary between repositories sufficiently significantly that you would need a tool to produce one rather than having one ER diagram for all repositories.  You may be able to generate something comparable from the XML schema at:
>
> https://eprints.lincoln.ac.uk/cgi/schema
>
> It may also be possible to run a tool against the database directly.
>   Something like http://search.cpan.org/dist/SQL-Translator/script/sqlt-
> diagram will probably do the job.  However, I do not know how well the relationships between tables (i.e. foreign keys) have been defined.  So  you may just see a load of unassociated tables.  As the linking of tables is managed for the Perl-coded EPrints API.
>
> Regards
>
> David Newman
>
> On Thu, 2017-12-14 at 15:48 +0000, Andrew Beeken wrote:
>> Hi all!
>>
>> Is there an entity relationship diagram available for the EPrints
>> database? I’ve been asked by our databases team if I can provide one
>> and I don’t particularly relish the idea of drawing one up myself!
>>
>> Thanks in advance!
>> Andrew
>>
>>
>>
>> The University of Lincoln, located in the heart of the city of
>> Lincoln, has established an international reputation based on high
>> student satisfaction, excellent graduate employment and world-class
>> research.
>>
>> The information in this e-mail and any attachments may be
>> confidential. If you have received this email in error please notify
>> the sender immediately and remove it from your system. Do not disclose
>> the contents to another person or take copies.
>>
>> Email is not secure and may contain viruses. The University of Lincoln
>> makes every effort to ensure email is sent without viruses, but cannot
>> guarantee this and recommends recipients take appropriate precautions.
>>
>> The University may monitor email traffic data and content in
>> accordance with its policies and English law. Further information can
>> be found at: http://www.lincoln.ac.uk/legal.
>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-
>> tech
>> *** Archive: http://www.eprints.org/tech.php/
>> *** EPrints community wiki: http://wiki.eprints.org/
>> *** EPrints developers Forum: http://forum.eprints.org/
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
> *** EPrints developers Forum: http://forum.eprints.org/
> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority in relation to its secondary activity of credit broking.
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
> *** EPrints developers Forum: http://forum.eprints.org/

-- 
Christopher Gutteridge -- http://users.ecs.soton.ac.uk/cjg

University of Southampton Open Data Service: http://data.southampton.ac.uk/
You should read our Web & Data Innovation blog: http://blogs.ecs.soton.ac.uk/webteam/



More information about the Eprints-tech mailing list