[EP-tech] General purpose CSV import

martin.braendle at uzh.ch
Fri Sep 4 14:32:24 BST 2020



Dear all,

We quite frequently have to process lists of data (mostly CSV) coming from
various sources to update our repositories. The requirements are usually as
follows:

- there are one or more matching criteria (eprintid, DOI, ISSN,
whatever)
- there are some fields (columns) in the source, in the source's own format
or naming
- there are the fields in the EPrints repo where the data must be filled in

So the task is always the same: you write a script (or a plugin) that
finds the eprint to update by searching on the criteria, then updates
the data or adds a new record, and writes a report (or log). And the next
time you write a similar script all over again because some of the fields
or criteria have changed. In addition, there is the overhead of exchanging
files between the repo software developer and the admins who want to have
the data updated, which we usually do via an issue tracking system.

Why not have a general purpose import plugin that allows the end user (repo
admin, OA monitoring expert, journal manager, you name it) to update data
directly:
- choose the match columns and associate with the match fields of the
repository
- choose the data columns and associate with update fields of the
repository
- choose the action options (update, create, create upon mismatch, ...)
- carry out the action (probably as a detached process)
- inform the user about the status of the process (running, terminated,
failed)
- obtain or download a report for quality control
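
To make the idea a bit more concrete, here is a minimal sketch in Python of
the match/update/report core. It is only an illustration under loud
assumptions: the repository is a plain in-memory list of dicts, not EPrints,
and the names import_rows, match_map, update_map and create_on_mismatch are
hypothetical and not part of any EPrints API. A real plugin would replace the
list scan with a repository search and run as a detached process.

```python
import csv
import io


def import_rows(csv_text, records, match_map, update_map, create_on_mismatch=False):
    """Update `records` (a list of dicts standing in for the repository).

    match_map and update_map map CSV column names to repository field names.
    Returns a report: a list of (data_row_number, status) tuples.
    """
    report = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):
        # Values the record must have to count as a match.
        criteria = {field: row[col] for col, field in match_map.items()}
        # Values to write into the matched (or newly created) record.
        values = {field: row[col] for col, field in update_map.items()}

        hits = [
            rec for rec in records
            if all(str(rec.get(field, "")) == value for field, value in criteria.items())
        ]

        if len(hits) == 1:
            hits[0].update(values)
            report.append((row_no, "updated"))
        elif not hits and create_on_mismatch:
            records.append({**criteria, **values})
            report.append((row_no, "created"))
        elif not hits:
            report.append((row_no, "no match"))
        else:
            # More than one record matched: do not guess, just report it.
            report.append((row_no, "ambiguous"))
    return report
```

The end user would then only supply the two mappings and the action option,
for example match_map = {"DOI": "doi"} and update_map = {"ISSN": "issn"}, and
get the report back for quality control.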

Has anybody already created something similar? Is there interest?

Kind regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich