[EP-tech] Re: Migrating from D-space (with files)

George Mamalakis mamalos at eng.auth.gr
Fri Jun 19 09:29:34 BST 2015


Hi Tim,

If you could, it would be wonderful. Yesterday I spent the whole day 
trying to understand its database in order to write my custom script 
that exchanges records between the two databases (in python using some 
ORM). To be sure I understood the database design correctly, I enabled 
database logging and imported one record. Through this procedure I saw 
that many tables were affected when inserting a new record (more than 
those I'd initially imagined), so I decided to give DSpace's import 
plugin a try. I thought that I'd write my custom plugin that inherits 
DSpace.pm, readjust the GRAMMAR to fit my DSpace database, and write 
additional callbacks where needed. OK, my perl skills are limited, but I 
understood the existing code by reading it, so I imagined it wouldn't be 
that hard to write a few callbacks to do specific string manipulations in 
perl. My only problem following this approach would still be the file 
part, where I thought I'd follow the directions of this document:

https://ejournals.bc.edu/ojs/index.php/ital/article/download/1861/pdf

which would probably need a few changes on my part to fit my needs. 
This paper describes a two-script importer (and code is supplied), where 
the first script imports only the metadata, the files are copied on the 
system "by hand", and the second script updates the imported eprints to 
"link" with the appropriate files.
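
As a rough illustration of that two-pass idea (this is not the code from the paper, only a sketch under my own assumptions), the step between the two scripts could be a small manifest builder that pairs each imported eprint with the files copied for it. The `id_map` dictionary (DSpace item id to EPrints eprint id, recorded during the metadata pass) and the one-subdirectory-per-item layout under `files_root` are both hypothetical:

```python
from pathlib import Path


def build_link_manifest(id_map, files_root):
    """Pair each imported eprint id with the files copied for its source item.

    id_map:     {source_item_id: eprint_id}, assumed to be recorded while
                importing the metadata (hypothetical structure).
    files_root: directory holding one subdirectory per source item id
                (hypothetical layout for the files copied "by hand").
    Returns (manifest, missing): a list of (eprint_id, filename) pairs and
    a list of source item ids for which no files were found.
    """
    manifest = []
    missing = []
    for item_id, eprint_id in sorted(id_map.items()):
        item_dir = Path(files_root) / str(item_id)
        files = []
        if item_dir.is_dir():
            files = sorted(p for p in item_dir.iterdir() if p.is_file())
        if not files:
            # Report items without files so they can be checked by hand.
            missing.append(item_id)
            continue
        for path in files:
            manifest.append((eprint_id, path.name))
    return manifest, missing
```

The second script would then walk such a manifest and attach each file to its eprint; reporting the `missing` items separately is one way to catch copy mistakes before anything is linked.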

Since your script generates EPrints XML, it means that imported records 
will update all necessary tables, so everything should work like a 
charm! So, if you find it, I'd be obliged if you could send it, and I'll 
try to make the relevant changes to fit my DSpace database.
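
To make sure I understand the CSV-to-XML approach, here is a minimal sketch of how I imagine it (it is not Tim's script). The CSV column names (`metadata_field_id`, `element`, `qualifier`, `item_id`, `text_value`) are assumptions about how the DSpace tables were dumped, `FIELD_MAP` is a hypothetical mapping, and the namespace should be checked against an XML export from a real EPrints repository:

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# EPrints XML namespace as I understand it; verify against a real export.
EP_NS = "http://eprints.org/ep2/data/2.0"

# Hypothetical mapping from DSpace "element.qualifier" to EPrints fields.
FIELD_MAP = {
    "title": "title",
    "description.abstract": "abstract",
    "date.issued": "date",
}


def read_csv(text):
    """Parse CSV text (with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def dspace_csv_to_eprints_xml(fields_csv, values_csv):
    """Join a field-registry dump with a metadata-value dump and emit XML."""
    # field id -> "element" or "element.qualifier"
    names = {}
    for row in read_csv(fields_csv):
        qualifier = row["qualifier"]
        suffix = "." + qualifier if qualifier else ""
        names[row["metadata_field_id"]] = row["element"] + suffix

    # item id -> {eprints field: first value seen}; unmapped fields are skipped
    items = defaultdict(dict)
    for row in read_csv(values_csv):
        ep_field = FIELD_MAP.get(names.get(row["metadata_field_id"], ""))
        if ep_field:
            items[row["item_id"]].setdefault(ep_field, row["text_value"])

    ET.register_namespace("", EP_NS)
    root = ET.Element(f"{{{EP_NS}}}eprints")
    for item_id in sorted(items, key=int):
        eprint = ET.SubElement(root, f"{{{EP_NS}}}eprint")
        for field, value in sorted(items[item_id].items()):
            ET.SubElement(eprint, f"{{{EP_NS}}}{field}").text = value
    return ET.tostring(root, encoding="unicode")
```

A real conversion would also need compound fields such as creators, the eprint type and status, and the document/file entries, all of which this sketch leaves out.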

Thanks again for the help!

On 19/06/2015 02:57 AM, Timothy Miles-Board wrote:
> Hi George,
>
> I've done this a couple of times.
>
> I worked with (CSV) dumps of the DSpace database tables and wrote a script to parse/join them and convert the whole lot to EPrints XML.
>
> If you are still looking for help I could dig the script out and you can see if you can adapt it to your needs.
>
> Regards,
>
> Tim
>
> Timothy Miles-Board
> Web & Repositories Development Specialist, University of London Computer Centre
> 020 7863 1342  |  07742 970 351  | timothy.miles-board at london.ac.uk | @drtjmb
> The University of London is an exempt charity in England and Wales
>
> ________________________________________
> From: eprints-tech-bounces at ecs.soton.ac.uk <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of George Mamalakis <mamalos at eng.auth.gr>
> Sent: 26 May 2015 4:48 PM
> To: eprints-tech at ecs.soton.ac.uk
> Subject: [EP-tech]  Migrating from D-space (with files)
>
> Hi all,
>
> I assumed that this scenario should be very common, but after googling
> it I realised that it's quite hard to find a straightforward answer.
>
> So, the question is as follows:
>
> What are the needed steps in order to migrate a D-space system to eprints?
>
> I see that there is this import module
> (./perl_lib/EPrints/Plugin/Import/DSpace.pm) in eprints, which (at first
> glance) doesn't seem to handle files (maybe I'm wrong). Moreover, as it
> is stated in the plugin, before migrating from D-space to eprints, one
> should subclass it in order to "refine the grammar used". Of course,
> from the admin interface I see that there is a D-space specific import,
> which -if I understood correctly- is using the import plugin just mentioned.
>
> Given these facts, for the metadata I just have to subclass the
> DSpace.pm plugin using the correct grammar? And then, what should I do
> with associated files? Is there a way to merge these two steps in order
> to avoid mistakes?
>
> Thank you all for your time in advance,
>
> George.
>
> --
> George Mamalakis
>
> IT and Security Officer,
> Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
> PhD (Aristotle Univ. of Thessaloniki),
> MSc (Imperial College of London)
>
> School of Electrical and Computer Engineering
> Aristotle University of Thessaloniki
>
> phone number : +30 (2310) 994379
>
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/
> *** EPrints developers Forum: http://forum.eprints.org/


-- 
George Mamalakis

IT and Security Officer,
Electrical and Computer Engineer (Aristotle Univ. of Thessaloniki),
PhD (Aristotle Univ. of Thessaloniki),
MSc (Imperial College of London)

School of Electrical and Computer Engineering
Aristotle University of Thessaloniki

phone number : +30 (2310) 994379




