<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Hi James,</p>
<p>Yes, if you have been on HTTPS for a while and your URIs are
already showing as HTTPS this is not a problem you need to worry
about. <br>
</p>
<p>I think there is an expectation that a URI be resolvable and this
is very much the case when it starts http:// or https://,
which would be described as part of the URL subset of URIs. However,
URNs (Uniform Resource Names) are the other subset of URIs and are
not expected to be resolvable, at least not without specialist
software. It might in part be my own opinion that a URL-type URI
does not have to be resolvable forever (which raises the
question: does it ever need to be resolvable?), as who can
guarantee that a hostname will forever host a website that will
return an appropriate representation for a particular URI?
However, this URI must never be re-used and must perpetually
remain valid as an identifier for the thing for which it was created.</p>
<p>I agree with you that non-technical people do not fully
appreciate the complexities of a hostname change. "You can get
everything to redirect, can't you?" This is true but it does create
a problem for you, as EPrints by default will start referencing items
by a new URI. This will not make the old URI invalid and you
would obviously ensure the old hostname redirects to the new one,
so resolvability would not be an issue either. The problem comes
when someone (or more likely a computer) has the old and new URIs
and asks whether these two identifiers are for the same thing.
A human may be able to make the correct assumptive leap but a
(non-AI-imbued) computer would not be able to make any such leap.
This is the reason I incorporated the uri_url configuration option
in EPrints 3.4.1+. However, this can still cause people to fret
as they ask: "Why is it still using the old hostname (or only HTTP)
for the URI? We need to update that."</p>
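<p>A minimal sketch of that comparison problem, written in Python purely for illustration (EPrints itself is Perl). The two hostnames and the HOST_ALIASES table are invented for this example and are not part of EPrints; the point is only that a program needs some extra, explicitly recorded knowledge before it can treat the two strings as naming the same item.</p>
<pre>
```python
from urllib.parse import urlsplit

# Hypothetical example data: both hostnames are invented for illustration.
OLD_URI = "http://eprints.example.org/id/eprint/123"
NEW_URI = "https://repository.example.org/id/eprint/123"

# Hypothetical, hand-maintained record that the old host was renamed.
HOST_ALIASES = {"repository.example.org": "eprints.example.org"}


def canonical_key(uri):
    """Reduce a URI to (canonical host, path), ignoring the scheme."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    return (HOST_ALIASES.get(host, host), parts.path)


def same_item(uri_a, uri_b):
    """True only if the alias table says both URIs name the same item."""
    return canonical_key(uri_a) == canonical_key(uri_b)


# As identifiers, the two strings are simply different.
print(OLD_URI == NEW_URI)
# With the extra knowledge in HOST_ALIASES they can be linked.
print(same_item(OLD_URI, NEW_URI))
```
</pre>
<p>Without the alias table the second check would also fail, which is roughly the position a third-party harvester is in when it only ever sees the two URIs.</p>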
<p>A DOI service does offer the benefit that you can update what
a DOI points at, giving it longer persistence than an EPrints URI, which,
as we have discussed, can be at the whim of your institution's
comms team. However, even DOIs are still potentially at the whims
of such teams, as I have seen institutions register some
representation of their name as part of the DOI, e.g.
10.12345/UniOfX.6789. However, usually these are sufficiently
tangential that they do not get picked up by "branding".</p>
<p>I don't think there is a one-size-fits-all answer to this
question. In a simple world where a repository is created with
HTTPS to start with and never changes its hostname, EPrints URIs
are perfect for meeting the Plan S PID requirements. If there are
changes needed to move to HTTPS or a new hostname, then the uri_url
configuration option helps manage that situation. However, in
some ways, having a service completely removed from EPrints and
not at the whims of the non-technical allows you to rise above
this and avoid the side-effects of ensuring persistency.<br>
</p>
<p>Regards</p>
<p>David Newman<br>
</p>
<div class="moz-cite-prefix">On 10/05/2021 09:48, James Kerwin
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAKkNZ9CF561u071r-PGOEi+KLKd-4w1P39oX0v+O3w841DGnDw@mail.gmail.com">
<div style="padding-bottom: 10px; padding-top: 5px;">
<div style="padding:12px; border:1px solid #8D3970;
background-color:#F7F9FA; color:#8D3970; font-size:14px;
line-height:22px; font-family: Calibri, Arial, Helvetica,
sans-serif;">
<strong>CAUTION:</strong> This e-mail originated outside the
University of Southampton.
</div>
</div>
<div>
<div dir="ltr">Morning David,
<div><br>
</div>
<div>Thank you for the detailed reply. It's given us a lot to
think over. Hopefully you don't mind that I passed your
email on to my manager to read as he is quite concerned
about the PID side of things. We've discussed this topic
over the past week or so.</div>
<div><br>
</div>
<div>In some ways, this has simplified things a lot. We
essentially have a PID in the form of the URI. We've been
https for longer than I've been here (July 2018) so I think
we're covered with respect to http/https. We could make use
of "uri_url" when we upgrade to 3.4, but that's a whole
other story that's recently been complicated.</div>
<div><br>
</div>
<div>It's the part about a PID not needing to be resolvable
that is proving tricky, which is where we were at the
start. I think we've got it into our minds that if we're
using a URI as an identifier, it should be resolvable. For
example, as a user I would want/expect it to be, since it
looks like a link. I THINK my interpretation should be "a
URI as a PID satisfies the Plan S requirements whether or
not it links to anywhere". After that, entirely dependent
upon us and what we do, we can ensure it remains resolvable
for as long as possible (e.g. maintaining old URLs/redirects
etc. in the event of a repository hostname change). The
bizarre thing is that we aren't considering a hostname
change and I would push against it if we were. It just
appears to have come up as this unnecessary impediment
(although it is useful to consider this sort of thing and I
am the fool that brought it up originally in our team).<br>
<br>
Personally, I'd give everything that comes into the
repository a DOI. We're already set up on the repository to
mint DOIs for our theses when they're moved to the live
archive. It would make my life a lot easier if this happened
to everything that comes in. That can then handle things
such as hostname changes because you can change where the
DOI points to by changing the repository URL on the
provider's (DataCite) webpage.</div>
<div><br>
</div>
<div>The tricky bit is convincing others that we essentially
already meet this particular requirement...</div>
<div><br>
</div>
<div>Thanks again for your advice David, it's been incredibly
helpful.</div>
<div><br>
</div>
<div>James</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Apr 28, 2021 at
10:50 AM David R Newman &lt;<a href="mailto:drn@ecs.soton.ac.uk" moz-do-not-send="true">drn@ecs.soton.ac.uk</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<p>Hi James,</p>
<p>Fortunately (or unfortunately) I have had quite a few
thoughts on the matter. I have done my best to keep
them to the point.<br>
</p>
<p>First, I don't think it is possible to account for the
same item being in multiple repositories. As an
individual institutional repository owner you have no
control over other institutional repositories who may
have shared authors on publications and have the right
to make the same publication available on their
institutional repositories. From my background in the
Semantic Web, I know that trying to determine if two things with
different unique identifiers are actually the same thing
is a near-impossible problem to solve definitively. The
best you can do is ensure the same unique identifier is
not somehow used for two different things and also avoid
creating and using more unique identifiers than are
absolutely necessary.</p>
<p>EPrints has always had a unique identifier in the form
of a URI (e.g. <a href="http://eprints.example.org/id/eprint/123" target="_blank" moz-do-not-send="true">
http://eprints.example.org/id/eprint/123</a>). I
would suggest this is the most appropriate unique
identifier to use as every item in your repository will
have one but not every item will necessarily have a DOI
or similar unique identifier. You could configure your
repository to use a DOI minting service (e.g. data
repositories often use DataCite) but this rather breaks
the rule of not creating more unique identifiers than are
absolutely necessary.
<br>
</p>
<p>One potential problem I have noted with EPrints URIs is
that these were all originally http but if you modify
your HTTPS configuration to ensure HTTPS is used
everywhere, then these URIs will likely also be changed
to https, making them non-persistent, which is another
big no-no. For this reason, early on in EPrints 3.4 I
introduced a configuration property, 'uri_url', so
that you can modify a repository's HTTPS configuration
and, provided this option is set,
keep the URIs as http. In the context of being a
unique identifier, you need to consider the URI as being
a string of characters and if this string of characters
changes, then it is no longer the same unique
identifier, even though it is still describing the same
thing.
<br>
</p>
<p>I think you also identified another potential problem
with the structure of an EPrints URI, which is if there
is a change to the hostname of the repository itself.
Again the uri_url option should allow you to ensure URIs
do not change. Unfortunately, this may lead to
confusion for users who wonder why the hostname for
these URIs is different to the hostname of the
repository. Also, depending on what happens to the old
hostname's DNS registration, these URIs may become
unresolvable. However, there is no requirement for
URIs, as with any unique identifier, to be resolvable.<br>
</p>
<p>If an item has a DOI provided by a journal, an ISBN
provided by a book publisher, etc. then this would
typically be more useful than an institutional
repository's URI, as this would be used in a general
context (i.e. you would expect a DOI or ISBN to appear
in the citation for such an item). However, I think to
provide the best possible coverage there is a need for
both forms of unique identifier: the one from the
original publisher (if that is not the institutional
repository, which would likely be the case for theses,
etc.) and one from the institutional repository. If you
provide export formats that can be ingested by
third-party applications and that include both unique
identifiers, and therefore build a link between the two,
it is possible to build a network of unique
identifiers for a particular item. Then when you get a
journal article that has authors from multiple
institutions, it will be possible to see that a
publication from institution A is the same publication
as from institution B.</p>
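<p>As a rough illustration of how such a network could be assembled, here is a small Python sketch that groups identifiers linked through shared export records. The record list, the hostnames and the DOIs are all invented for the example, and the grouping uses a simple union-find; it is not something EPrints provides.</p>
<pre>
```python
from collections import defaultdict

# Hypothetical export records: each pairs a repository URI with a publisher DOI.
RECORDS = [
    ("http://eprints.a.example.org/id/eprint/123", "doi:10.12345/abc"),
    ("http://eprints.b.example.org/id/eprint/987", "doi:10.12345/abc"),
    ("http://eprints.a.example.org/id/eprint/124", "doi:10.12345/xyz"),
]


def find(parent, item):
    """Follow parent links to the representative of item's group."""
    parent.setdefault(item, item)
    while parent[item] != item:
        parent[item] = parent[parent[item]]  # path halving
        item = parent[item]
    return item


def group_identifiers(records):
    """Group identifiers that are linked, directly or indirectly."""
    parent = {}
    for left, right in records:
        parent[find(parent, left)] = find(parent, right)
    groups = defaultdict(set)
    for item in list(parent):
        groups[find(parent, item)].add(item)
    return list(groups.values())


for group in group_identifiers(RECORDS):
    print(sorted(group))
```
</pre>
<p>Here the records from institutions A and B end up in one group because both cite the same DOI, while the third record stays in a group of its own.</p>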
<p>Regards</p>
<p>David Newman<br>
</p>
<p><br>
</p>
<p>On 28/04/2021 10:02, James Kerwin via Eprints-tech
wrote:<br>
</p>
<blockquote type="cite">
<div>
<div dir="ltr">Hi All,<br>
<div><br>
</div>
<div>For once I have not broken anything, just
looking for opinions and advice.</div>
<div><br>
</div>
<div>As part of Plan S we need to have persistent
identifiers for scholarly publications. I have
read this EPrints wiki:<br>
<br>
<a href="https://wiki.eprints.org/w/Plan_S" target="_blank" moz-do-not-send="true">https://wiki.eprints.org/w/Plan_S</a><br>
</div>
<div><br>
</div>
<div>At Liverpool we aren't 100% sure about this
topic. DOI would be the obvious choice, but there
are some on my team who reasonably point out that
the same item could be in several repositories and
end up having several separate DOIs associated
with it. I'm not sure how much that matters.<br>
<br>
Does anybody have any thoughts on this point? We
spoke with my predecessor, Adam, who was really
helpful. Unconvinced team members have suggested
using
<a href="http://handle.net/" target="_blank" moz-do-not-send="true">
handle.net</a> which I think is overkill and
doesn't necessarily meet the needs of Plan S in
itself.<br>
<br>
Also, the URL/EPrints ID for each item, is this
not a suitable persistent identifier? The wiki
linked above does mention this. There's always the
possibility a repository URL could change in the
future, but I would expect some sort of redirect
to overcome this.</div>
<div><br>
</div>
<div>If there is a more suitable place for this type
of discussion please send me there.<br>
</div>
<div><br>
</div>
<div>Thanks,</div>
<div>James</div>
</div>
</div>
<br>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>