<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Probity is a bit like blockchain, but distributed. (I'm not an
expert on blockchain)</p>
<p>It never caught on, which is a pity, as the idea was sound.</p>
<p>I've some PHP code lying around for making a basic probity
website.<br>
</p>
<br>
<div class="moz-cite-prefix">On 25/08/2017 11:03, John Salter wrote:<br>
</div>
<blockquote type="cite"
cite="mid:EMEW3|d86da7bf1d070d63f11bd388e4e72033y7OBCD14eprints-tech-bounces|ecs.soton.ac.uk|DB6PR0301MB2264FA71F59D177E988E421FC49B0@DB6PR0301MB2264.eurprd03.prod.outlook.com">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
ol
        {margin-bottom:0cm;}
ul
        {margin-bottom:0cm;}
--></style>
<div class="WordSection1">
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">Hi Tomasz,<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">I think
we're looking into similar things at the moment :o)<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">I think
there are similarities between 'fixity' and 'probity' - so
although there isn't integration of fixity, this might be
useful info:<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">EPrints
does support 'probity' files (<a
href="http://www.probity.org/" moz-do-not-send="true">http://www.probity.org/</a>),
which include a hash of the contents.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">I don’t
think these are generated by default, but the
$doc->rehash command should generate them.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">See the
EPrints::Probity module, and the 'rehash' option of
bin/epadmin.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">Running
[EPRINTS_ROOT]/bin/epadmin rehash [ARCHIVEID] [docid] will
generate a file in the owning eprint folder e.g.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">[EPRINTS_ROOT]/archives/[ARCHIVEID]/documents/disk0/00/00/00/01/1.2017-08-25T09=003a55=003a29Z.xsh<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">(for
eprintid = 1 and docid = 1. Note the encoded ':'s (=003a)
in the timestamp in the filename.)<o:p></o:p></span></p>
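<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">A minimal sketch of decoding such a filename, in Python. It assumes, from this one example only, that each escaped character is written as '=' followed by the four hexadecimal digits of its code point; decode_name is a hypothetical helper, not part of EPrints.</span></p>

```python
import re


def decode_name(name):
    """Turn '=XXXX' escapes (assumed: four hex digits) back into characters."""
    return re.sub(
        r"=([0-9A-Fa-f]{4})",
        lambda m: chr(int(m.group(1), 16)),
        name,
    )


# The example filename from above; =003a becomes ':'.
print(decode_name("1.2017-08-25T09=003a55=003a29Z.xsh"))
# prints 1.2017-08-25T09:55:29Z.xsh
```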
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">The file
has the following data:<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&lt;?xml
version="1.0" encoding="UTF-8" ?&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&lt;hashlist
xmlns="http://probity.org/XMLprobity"&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&lt;hash&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;wreo.txt&lt;/name&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;algorithm&gt;MD5&lt;/algorithm&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;17f861744d77c1d9754fd7ab6f403065&lt;/value&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;date&gt;2017-08-25T09:55:45Z&lt;/date&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&lt;/hash&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">&lt;/hashlist&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">You can
create multiple Probity files, but I don't think there's any
way to compare one with another, or to check that the current
checksum is equal to the most recently stored one (which is
the main part of your question).<o:p></o:p></span></p>
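<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">A minimal sketch of that missing comparison, written in Python outside EPrints rather than as an EPrints feature. It assumes the .xsh layout shown above and that the hashed files sit in a directory you pass in; read_hashes, file_digest, check_fixity and the doc_dir parameter are hypothetical names, not EPrints APIs.</span></p>

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace as it appears in the sample Probity file.
NS = {"p": "http://probity.org/XMLprobity"}


def read_hashes(xsh_path):
    """Return (name, algorithm, value) tuples from one Probity .xsh file."""
    root = ET.parse(str(xsh_path)).getroot()
    entries = []
    for node in root.findall("p:hash", NS):
        name = node.findtext("p:name", default="", namespaces=NS).strip()
        algorithm = node.findtext("p:algorithm", default="", namespaces=NS).strip()
        value = node.findtext("p:value", default="", namespaces=NS).strip()
        entries.append((name, algorithm, value))
    return entries


def file_digest(path, algorithm):
    """Hash a file in chunks with the named algorithm (e.g. 'MD5')."""
    digest = hashlib.new(algorithm.lower())
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(xsh_path, doc_dir):
    """Compare each stored hash with the file currently on disk.

    doc_dir is an assumption: the directory that holds the hashed files.
    """
    results = {}
    for name, algorithm, stored in read_hashes(xsh_path):
        target = Path(doc_dir) / name
        if not target.is_file():
            results[name] = "missing"
        elif file_digest(target, algorithm) == stored.lower():
            results[name] = "ok"
        else:
            results[name] = "changed"
    return results
```

<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">Calling check_fixity with the path of the newest .xsh file and the directory of the document's files returns a dictionary that maps each filename to "ok", "changed" or "missing"; a cron job could run it over every document and report anything that is not "ok".</span></p>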
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">Cheers,<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">John<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US">PS I'm also
looking into DROID - as you were at some point. The Bazaar
package needs an update or three…<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span
lang="EN-US"> <a class="moz-txt-link-abbreviated" href="mailto:eprints-tech-bounces@ecs.soton.ac.uk">eprints-tech-bounces@ecs.soton.ac.uk</a>
[<a class="moz-txt-link-freetext" href="mailto:eprints-tech-bounces@ecs.soton.ac.uk">mailto:eprints-tech-bounces@ecs.soton.ac.uk</a>]
<b>On Behalf Of </b>Tomasz Neugebauer<br>
<b>Sent:</b> 24 August 2017 18:35<br>
<b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:eprints-tech@ecs.soton.ac.uk">eprints-tech@ecs.soton.ac.uk</a><br>
<b>Subject:</b> [EP-tech] Fixity Check and EPrints -
Digital Preservation<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span
lang="EN-CA">I believe that EPrints stores a checksum value
for each uploaded file, but as far as I understand, there is
no way to monitor whether the stored checksums still match the
current files, and thus no way of checking for bit rot.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">DSpace has the
following: <a
href="https://wiki.duraspace.org/display/DSDOC6x/Validating+CheckSums+of+Bitstreams"
moz-do-not-send="true">
https://wiki.duraspace.org/display/DSDOC6x/Validating+CheckSums+of+Bitstreams</a><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">A periodic fixity check
is part of the lowest level of support for digital
preservation, i.e., “bit-level”. See these examples of
digital preservation policies, all of which have some
variation on this as a requirement: “regularly audit
checksums to ensure that no files have corrupted or changed
in any way. This practice ensures the ability to provide an
exact copy of original files over time”:<o:p></o:p></span></p>
<p class="MsoListParagraph"
style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><!--[if !supportLists]--><span
style="font-family:Symbol" lang="EN-CA"><span
style="mso-list:Ignore">·<span style="font:7.0pt 'Times New Roman'">
</span></span></span><!--[endif]--><span lang="EN-CA"><a
href="https://www.sfu.ca/content/dam/sfu/archives/DigitalPreservation/FormatPolicyRegistry.pdf"
moz-do-not-send="true">https://www.sfu.ca/content/dam/sfu/archives/DigitalPreservation/FormatPolicyRegistry.pdf</a>
“Regularly perform fixity checks on AIPs”<o:p></o:p></span></p>
<p class="MsoListParagraph"
style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><!--[if !supportLists]--><span
style="font-family:Symbol" lang="EN-CA"><span
style="mso-list:Ignore">·<span style="font:7.0pt 'Times New Roman'">
</span></span></span><!--[endif]--><span lang="EN-CA"><a
href="https://digital.library.yorku.ca/documentation/fixity-procedures"
moz-do-not-send="true">https://digital.library.yorku.ca/documentation/fixity-procedures</a>
“York University Library are committed to maintaining the
integrity of objects in its care. This includes creating
checksums for all archival format objects -- plus associated
datastreams -- ingested into the repository, and regular
fixity checking of those objects”<o:p></o:p></span></p>
<p class="MsoListParagraph"
style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><!--[if !supportLists]--><span
style="font-family:Symbol" lang="EN-CA"><span
style="mso-list:Ignore">·<span style="font:7.0pt 'Times New Roman'">
</span></span></span><!--[endif]--><span lang="EN-CA"><a
href="https://researchworks.lib.washington.edu/policy-preservation.html"
moz-do-not-send="true">https://researchworks.lib.washington.edu/policy-preservation.html</a>
“Maintains the authenticity of the bitstream through
integrity checking”<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">I understand that
EPrints is primarily an open access platform, but I think
that we should be able to provide at least the lowest
“bit-level” digital preservation support with it, and
without a Fixity check, I don’t think we can ensure that no
files are corrupted or changed over time.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><a
href="http://preserv.eprints.org/papers/presmeta/pm-paper-draft.html"
moz-do-not-send="true">Preservation Metadata for
Institutional Repositories</a>, a 2007 report on
EPrints and digital preservation, states
the following about fixity checking: </span><span
style="font-size:12.0pt;font-family:'Times New Roman',serif" lang="EN-CA">“Where is fixity check first
performed? Not within EPrints currently, but a script that
crawls the archive comparing files with checksums is
possible”.
</span><span lang="EN-CA">It is now 10 years later, and I am
wondering whether and how institutions running EPrints are
implementing their fixity checks. Are you using an external
tool like this:
<a href="https://www.avpreserve.com/tools/fixity/"
moz-do-not-send="true">https://www.avpreserve.com/tools/fixity/</a>?
Are you using your own custom script? Did you develop
something that is integrated with the EPrints Admin
interface?</span><span
style="font-size:12.0pt;font-family:'Times New Roman',serif" lang="EN-CA"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><br>
Best wishes,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">Tomasz<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:8.0pt;font-family:'Courier New';color:#A6A6A6;mso-fareast-language:EN-CA"
lang="FR-CA">________________________________________________<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt"><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black"
lang="FR-CA">Tomasz Neugebauer<span style="background:white"><br>
</span>Digital Projects &amp; Systems Development Librarian
/ Bibliothécaire des Projets Numériques &amp; Développement
de Systèmes<span style="background:white"><br>
</span>Library / Bibliothèque<br>
Concordia University / Université Concordia</span><i><span
style="color:black" lang="FR-CA"><o:p></o:p></span></i></p>
<p class="MsoNormal"
style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt"><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black"
lang="FR-CA">Tel. / Tél. 514-848-2424 ext. / poste 7738<br>
Email / courriel: </span><span lang="EN-CA"><a
href="mailto:tomasz.neugebauer@concordia.ca"
moz-do-not-send="true"><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:blue"
lang="FR-CA">tomasz.neugebauer@concordia.ca</span></a></span><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black"
lang="EN-CA">
</span><span style="color:black" lang="FR-CA"><o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt"><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black"
lang="FR-CA">Mailing address / adresse postale: 1455 De
Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8<br>
Street address / adresse municipale: 1400 De Maisonneuve
Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt"><span
lang="EN-CA"><a href="http://library.concordia.ca/"
moz-do-not-send="true"><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:blue"
lang="FR-CA">http://library.concordia.ca</span></a></span><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black;background:white"
lang="FR-CA"><br>
</span><span lang="EN-CA"><a
href="http://www.concordia.ca/faculty/tomasz-neugebauer.html"
moz-do-not-send="true"><span
style="font-size:9.0pt;font-family:'Arial',sans-serif;color:blue"
lang="FR-CA">http://www.concordia.ca/faculty/tomasz-neugebauer.html</span></a>
</span><i><span style="color:black" lang="FR-CA"><o:p></o:p></span></i></p>
<p class="MsoNormal"><span lang="FR-CA"><o:p> </o:p></span></p>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">*** Options: <a class="moz-txt-link-freetext" href="http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech">http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech</a>
*** Archive: <a class="moz-txt-link-freetext" href="http://www.eprints.org/tech.php/">http://www.eprints.org/tech.php/</a>
*** EPrints community wiki: <a class="moz-txt-link-freetext" href="http://wiki.eprints.org/">http://wiki.eprints.org/</a>
*** EPrints developers Forum: <a class="moz-txt-link-freetext" href="http://forum.eprints.org/">http://forum.eprints.org/</a>
</pre>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Christopher Gutteridge -- <a class="moz-txt-link-freetext" href="http://users.ecs.soton.ac.uk/cjg">http://users.ecs.soton.ac.uk/cjg</a>
University of Southampton Open Data Service: <a class="moz-txt-link-freetext" href="http://data.southampton.ac.uk/">http://data.southampton.ac.uk/</a>
You should read our Web &amp; Data Innovation blog: <a class="moz-txt-link-freetext" href="http://blogs.ecs.soton.ac.uk/webteam/">http://blogs.ecs.soton.ac.uk/webteam/</a>
</pre>
</body>
</html>