<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:1089929554;
        mso-list-type:hybrid;
        mso-list-template-ids:-883918726 269025281 269025283 269025285 269025281 269025283 269025285 269025281 269025283 269025285;}
@list l0:level1
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l0:level2
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l0:level3
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Wingdings;}
@list l0:level4
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l0:level5
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l0:level6
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Wingdings;}
@list l0:level7
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l0:level8
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l0:level9
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Wingdings;}
ol
        {margin-bottom:0cm;}
ul
        {margin-bottom:0cm;}
--></style>
</head>
<body lang="EN-GB" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">Hi Tomasz,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">I think we're looking into similar things at the moment :o)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">I think 'fixity' and 'probity' are similar enough that, although EPrints doesn't have fixity checking built in, this might be useful info:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">EPrints does support 'probity' files (<a href="http://www.probity.org/">http://www.probity.org/</a>), which include a hash of the contents.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">I don&#8217;t think these are generated by default, but calling the $doc-&gt;rehash method should generate them.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">See the EPrints::Probity module, and the 'rehash' option of bin/epadmin.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">Running [EPRINTS_ROOT]/bin/epadmin rehash [ARCHIVEID] [docid] will generate a file in the owning eprint's folder, e.g.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">[EPRINTS_ROOT]/archives/[ARCHIVEID]/documents/disk0/00/00/00/01/1.2017-08-25T09=003a55=003a29Z.xsh<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">(for eprintid = 1 and docid = 1. Note the encoded ':'s (=003a) in the timestamp in the filename).<o:p></o:p></span></p>
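<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">If you need to turn such a filename back into a readable timestamp, a minimal Python sketch like the one below might do. It assumes every escape is '=' followed by four hex digits (as in =003a for ':'), which is all the example above shows; decode_escaped_name is a made-up helper name, not part of EPrints.<o:p></o:p></span></p>
<pre>
```python
import re


def decode_escaped_name(name):
    """Replace each =XXXX escape (four hex digits) with the character it encodes.

    Assumption: escapes always have the form '=' plus four hex digits.
    """
    return re.sub(
        r"=([0-9A-Fa-f]{4})",
        lambda match: chr(int(match.group(1), 16)),
        name,
    )


# The probity filename from the example above.
print(decode_escaped_name("1.2017-08-25T09=003a55=003a29Z.xsh"))
# prints 1.2017-08-25T09:55:29Z.xsh
```
</pre>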
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">The file has the following data:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&lt;hashlist xmlns=&quot;http://probity.org/XMLprobity&quot;&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp; &lt;hash&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp; &lt;name&gt;wreo.txt&lt;/name&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp; &lt;algorithm&gt;MD5&lt;/algorithm&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp; &lt;value&gt;17f861744d77c1d9754fd7ab6f403065&lt;/value&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp; &lt;date&gt;2017-08-25T09:55:45Z&lt;/date&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp; &lt;/hash&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&lt;/hashlist&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">You can create multiple Probity files, but I don't think there's any way to compare one with another, or to check that the current checksum equals the most recently stored one (which is the main part of your question).<o:p></o:p></span></p>
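<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">A check like that could be scripted outside EPrints. The following is only a rough Python sketch, not anything EPrints provides: it reads a Probity file in the format shown above, recomputes the MD5 of each named file and reports mismatches. The function names (read_probity, md5_of, check_fixity) are made up for illustration, and where the document files live relative to the .xsh file is an assumption you would need to confirm on your own install, so the directory is passed in as a parameter.<o:p></o:p></span></p>
<pre>
```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace taken from the hashlist element in the example file above.
NS = {"p": "http://probity.org/XMLprobity"}


def read_probity(xsh_path):
    """Return a list of (name, algorithm, value) tuples from a Probity file."""
    root = ET.parse(xsh_path).getroot()
    entries = []
    for hash_el in root.findall("p:hash", NS):
        entries.append((
            hash_el.findtext("p:name", namespaces=NS),
            hash_el.findtext("p:algorithm", namespaces=NS),
            hash_el.findtext("p:value", namespaces=NS),
        ))
    return entries


def md5_of(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(xsh_path, doc_dir):
    """Return the names of files whose current MD5 differs from the stored one.

    Assumption: doc_dir is the directory that holds the files named in the
    Probity file. Missing files and non-MD5 entries are reported as failures.
    """
    failures = []
    for name, algorithm, stored in read_probity(xsh_path):
        if not name or algorithm != "MD5":
            failures.append(name)  # this sketch only handles MD5 entries
            continue
        target = Path(doc_dir) / name
        expected = (stored or "").strip().lower()
        if not target.is_file() or md5_of(target) != expected:
            failures.append(name)
    return failures
```
</pre>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">check_fixity returns an empty list when every stored hash still matches, so a cron job could run it against the most recent .xsh file for each document and send an email whenever the list is non-empty.<o:p></o:p></span></p>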
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">Cheers,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">John<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">PS I'm also looking into DROID - as you were at some point. The Bazaar package needs an update or three&#8230;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span lang="EN-US"> eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
<b>On Behalf Of </b>Tomasz Neugebauer<br>
<b>Sent:</b> 24 August 2017 18:35<br>
<b>To:</b> eprints-tech@ecs.soton.ac.uk<br>
<b>Subject:</b> [EP-tech] Fixity Check and EPrints - Digital Preservation<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span lang="EN-CA">I believe that EPrints stores a checksum value for each uploaded file, but as far as I understand, there is no way to monitor whether the stored checksums still match the current files, and thus no way of checking for bit rot.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">DSpace has the following: <a href="https://wiki.duraspace.org/display/DSDOC6x/Validating&#43;CheckSums&#43;of&#43;Bitstreams">
https://wiki.duraspace.org/display/DSDOC6x/Validating&#43;CheckSums&#43;of&#43;Bitstreams</a><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">A periodic fixity check is part of the lowest level of support for digital preservation, i.e., &#8220;bit-level&#8221;. See the following examples of digital preservation policies, all of which include some variation on this requirement: &#8220;regularly audit checksums to ensure that no files have corrupted or changed in any way. This practice ensures the ability to provide an exact copy of original files over time&#8221;:<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><![if !supportLists]><span lang="EN-CA" style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt &quot;Times New Roman&quot;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]><span lang="EN-CA"><a href="https://www.sfu.ca/content/dam/sfu/archives/DigitalPreservation/FormatPolicyRegistry.pdf">https://www.sfu.ca/content/dam/sfu/archives/DigitalPreservation/FormatPolicyRegistry.pdf</a> &#8220;Regularly perform
 fixity checks on AIPs&#8221;<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><![if !supportLists]><span lang="EN-CA" style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt &quot;Times New Roman&quot;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]><span lang="EN-CA"><a href="https://digital.library.yorku.ca/documentation/fixity-procedures">https://digital.library.yorku.ca/documentation/fixity-procedures</a> &#8220;York University Library are committed to maintaining the integrity
 of objects in its care. This includes creating checksums for all archival format objects -- plus associated datastreams -- ingested into the repository, and regular fixity checking of those objects&#8221;<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><![if !supportLists]><span lang="EN-CA" style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt &quot;Times New Roman&quot;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</span></span></span><![endif]><span lang="EN-CA"><a href="https://researchworks.lib.washington.edu/policy-preservation.html">https://researchworks.lib.washington.edu/policy-preservation.html</a> &#8220;Maintains the authenticity of the bitstream through integrity checking&#8221;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">I understand that EPrints is primarily an open access platform, but I think that we should be able to provide at least the lowest &#8220;bit-level&#8221; digital preservation support with it, and without a Fixity check, I don&#8217;t think
 we can ensure that no files are corrupted or changed over time.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><a href="http://preserv.eprints.org/papers/presmeta/pm-paper-draft.html">Preservation Metadata for Institutional Repositories</a>, a 2007 report on EPrints and digital preservation, states the following about fixity checking: </span><span lang="EN-CA" style="font-size:12.0pt;font-family:&quot;Times New Roman&quot;,serif">&#8220;Where is fixity check first performed? Not within EPrints currently, but a script that crawls the archive comparing files with checksums is possible&#8221;.&nbsp;
</span><span lang="EN-CA">Ten years on, I am wondering whether and how institutions running EPrints are implementing fixity checks. Are you using an external tool such as
<a href="https://www.avpreserve.com/tools/fixity/">https://www.avpreserve.com/tools/fixity/</a>? Are you using your own custom script? Did you develop something that is integrated with the EPrints admin interface?</span><span lang="EN-CA" style="font-size:12.0pt;font-family:&quot;Times New Roman&quot;,serif"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><br>
Best wishes,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">Tomasz<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span lang="FR-CA" style="font-size:8.0pt;font-family:&quot;Courier New&quot;;color:#A6A6A6;mso-fareast-language:EN-CA">________________________________________________<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="FR-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:black">Tomasz Neugebauer<span style="background:white"><br>
</span>Digital Projects &amp; Systems Development Librarian / Bibliothécaire des Projets Numériques &amp; Développement de Systèmes<span style="background:white"><br>
</span>Library / Bibliothèque<br>
Concordia University / Université Concordia</span><i><span lang="FR-CA" style="color:black"><o:p></o:p></span></i></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="FR-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:black">Tel. / Tél. 514-848-2424 ext. / poste 7738<br>
Email / courriel: </span><span lang="EN-CA"><a href="mailto:tomasz.neugebauer@concordia.ca"><span lang="FR-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:blue">tomasz.neugebauer@concordia.ca</span></a></span><span lang="EN-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:black">
</span><span lang="FR-CA" style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="FR-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:black">Mailing address / adresse postale:&nbsp;1455 De Maisonneuve Blvd. W.,&nbsp;LB-540-03, Montreal, Quebec H3G 1M8<br>
Street address / adresse municipale: 1400&nbsp;De Maisonneuve Blvd. W.,&nbsp;LB-540-03, Montreal, Quebec H3G 1M8<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="EN-CA"><a href="http://library.concordia.ca/"><span lang="FR-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:blue">http://library.concordia.ca</span></a></span><span lang="FR-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:black;background:white"><br>
</span><span lang="EN-CA"><a href="http://www.concordia.ca/faculty/tomasz-neugebauer.html"><span lang="FR-CA" style="font-size:9.0pt;font-family:&quot;Arial&quot;,sans-serif;color:blue">http://www.concordia.ca/faculty/tomasz-neugebauer.html</span></a>
</span><i><span lang="FR-CA" style="color:black"><o:p></o:p></span></i></p>
<p class="MsoNormal"><span lang="FR-CA"><o:p>&nbsp;</o:p></span></p>
</div>
</body>
</html>