<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:1089929554;
        mso-list-type:hybrid;
        mso-list-template-ids:-883918726 269025281 269025283 269025285 269025281 269025283 269025285 269025281 269025283 269025285;}
@list l0:level1
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l0:level2
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l0:level3
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Wingdings;}
@list l0:level4
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l0:level5
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l0:level6
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Wingdings;}
@list l0:level7
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l0:level8
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l0:level9
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-18.0pt;
        font-family:Wingdings;}
ol
        {margin-bottom:0cm;}
ul
        {margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">Hi Tomasz,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">I think we're looking into similar things at the moment :o)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">I think there are similarities between 'fixity' and 'probity', so although EPrints has no integrated fixity check, this might be useful info:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">EPrints does support 'probity' files (<a href="http://www.probity.org/">http://www.probity.org/</a>), which include a hash of the contents.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">I don’t think these are generated by default, but calling the $doc->rehash method should generate them.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">See the EPrints::Probity module, and the 'rehash' option of bin/epadmin.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">Running [EPRINTS_ROOT]/bin/epadmin rehash [ARCHIVEID] [docid] will generate a file in the owning eprint folder e.g.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">[EPRINTS_ROOT]/archives/[ARCHIVEID]/documents/disk0/00/00/00/01/1.2017-08-25T09=003a55=003a29Z.xsh<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">(for eprintid = 1 and docid = 1; note the encoded ':'s (=003a) in the timestamp in the filename).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">The file has the following data:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&lt;?xml version="1.0" encoding="UTF-8" ?&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&lt;hashlist xmlns="http://probity.org/XMLprobity"&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&lt;hash&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;wreo.txt&lt;/name&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;algorithm&gt;MD5&lt;/algorithm&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;17f861744d77c1d9754fd7ab6f403065&lt;/value&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&lt;date&gt;2017-08-25T09:55:45Z&lt;/date&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&lt;/hash&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">&lt;/hashlist&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">You can create multiple Probity files, but I don't think there's any way to compare one with another, or to check that a file's current checksum equals the most recently stored one (which is
the main part of your question).<o:p></o:p></span></p>
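<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">As a rough illustration of that missing comparison step, here is a minimal Python sketch that reads a Probity file shaped like the sample above and recomputes each listed file's hash. It is not part of EPrints: the function names (read_probity, file_digest, check_fixity) are hypothetical, and the directory holding the document's files (doc_dir) is left to the caller, since where they sit relative to the .xsh file is an assumption I have not verified.<o:p></o:p></span></p>

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace copied from the sample .xsh file shown above.
PROBITY_NS = {"p": "http://probity.org/XMLprobity"}


def read_probity(xsh_path):
    """Return (name, algorithm, value) tuples for each hash entry in a probity file."""
    root = ET.parse(xsh_path).getroot()
    entries = []
    for entry in root.findall("p:hash", PROBITY_NS):
        name = entry.findtext("p:name", default="", namespaces=PROBITY_NS).strip()
        algorithm = entry.findtext("p:algorithm", default="", namespaces=PROBITY_NS).strip()
        value = entry.findtext("p:value", default="", namespaces=PROBITY_NS).strip()
        entries.append((name, algorithm, value))
    return entries


def file_digest(path, algorithm):
    """Hash a file in chunks; 'MD5' from the probity file maps to hashlib's 'md5'."""
    digest = hashlib.new(algorithm.lower())
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(65536)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(xsh_path, doc_dir):
    """Compare stored hashes with the files in doc_dir (a caller-supplied assumption).

    Returns (name, status) tuples, where status is 'ok', 'mismatch' or 'missing'.
    """
    results = []
    for name, algorithm, stored in read_probity(xsh_path):
        target = Path(doc_dir) / name
        if not target.is_file():
            results.append((name, "missing"))
        elif file_digest(target, algorithm) == stored.lower():
            results.append((name, "ok"))
        else:
            results.append((name, "mismatch"))
    return results
```

<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">Calling check_fixity with the path to the newest .xsh file and the folder containing the document's files returns one (name, status) pair per listed file. A cron job could loop over documents and report anything that is not 'ok', though picking the most recent .xsh file for each document would still need to be worked out.<o:p></o:p></span></p>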
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">Cheers,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">John<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US">PS I'm also looking into DROID - as you were at some point. The Bazaar package needs an update or three…<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span lang="EN-US"> eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk]
<b>On Behalf Of </b>Tomasz Neugebauer<br>
<b>Sent:</b> 24 August 2017 18:35<br>
<b>To:</b> eprints-tech@ecs.soton.ac.uk<br>
<b>Subject:</b> [EP-tech] Fixity Check and EPrints - Digital Preservation<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span lang="EN-CA">I believe that EPrints stores a checksum value for each uploaded file, but as far as I understand, there is no way to monitor whether the checksums still match the current files, and thus no way of
checking for bit rot. <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">DSpace has the following: <a href="https://wiki.duraspace.org/display/DSDOC6x/Validating+CheckSums+of+Bitstreams">
https://wiki.duraspace.org/display/DSDOC6x/Validating+CheckSums+of+Bitstreams</a><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">A periodic fixity check is part of the lowest level of support for digital preservation, i.e., “bit-level”. See some examples of digital preservation policies, all of which have some variation on this as a requirement: “regularly
audit checksums to ensure that no files have corrupted or changed in any way. This practice ensures the ability to provide an exact copy of original files over time”:<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><![if !supportLists]><span lang="EN-CA" style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt 'Times New Roman'">
</span></span></span><![endif]><span lang="EN-CA"><a href="https://www.sfu.ca/content/dam/sfu/archives/DigitalPreservation/FormatPolicyRegistry.pdf">https://www.sfu.ca/content/dam/sfu/archives/DigitalPreservation/FormatPolicyRegistry.pdf</a> “Regularly perform
fixity checks on AIPs”<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><![if !supportLists]><span lang="EN-CA" style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt 'Times New Roman'">
</span></span></span><![endif]><span lang="EN-CA"><a href="https://digital.library.yorku.ca/documentation/fixity-procedures">https://digital.library.yorku.ca/documentation/fixity-procedures</a> “York University Library are committed to maintaining the integrity
of objects in its care. This includes creating checksums for all archival format objects -- plus associated datastreams -- ingested into the repository, and regular fixity checking of those objects”<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-18.0pt;mso-list:l0 level1 lfo2"><![if !supportLists]><span lang="EN-CA" style="font-family:Symbol"><span style="mso-list:Ignore">·<span style="font:7.0pt 'Times New Roman'">
</span></span></span><![endif]><span lang="EN-CA"><a href="https://researchworks.lib.washington.edu/policy-preservation.html">https://researchworks.lib.washington.edu/policy-preservation.html</a> “Maintains the authenticity of the bitstream through integrity
checking”<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">I understand that EPrints is primarily an open access platform, but I think that we should be able to provide at least the lowest “bit-level” digital preservation support with it, and without a Fixity check, I don’t think
we can ensure that no files are corrupted or changed over time.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><a href="http://preserv.eprints.org/papers/presmeta/pm-paper-draft.html">Preservation Metadata for Institutional Repositories</a>, a 2007 report on EPrints and digital preservation, states the following
about fixity checking: </span><span lang="EN-CA" style="font-size:12.0pt;font-family:'Times New Roman',serif">“Where is fixity check first performed? Not within EPrints currently, but a script that crawls the archive comparing files with checksums is possible”.
</span><span lang="EN-CA">Ten years later, I am wondering whether and how institutions running EPrints are implementing their fixity checks. Are you using an external tool like this:
<a href="https://www.avpreserve.com/tools/fixity/">https://www.avpreserve.com/tools/fixity/</a>? Are you using your own custom script? Did you develop something that is integrated with the EPrints Admin interface?</span><span lang="EN-CA" style="font-size:12.0pt;font-family:'Times New Roman',serif"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><br>
Best wishes,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA">Tomasz<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-CA"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="FR-CA" style="font-size:8.0pt;font-family:'Courier New';color:#A6A6A6;mso-fareast-language:EN-CA">________________________________________________<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="FR-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black">Tomasz Neugebauer<span style="background:white"><br>
</span>Digital Projects &amp; Systems Development Librarian / Bibliothécaire des Projets Numériques &amp; Développement de Systèmes<span style="background:white"><br>
</span>Library / Bibliothèque<br>
Concordia University / Université Concordia</span><i><span lang="FR-CA" style="color:black"><o:p></o:p></span></i></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="FR-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black">Tel. / Tél. 514-848-2424 ext. / poste 7738<br>
Email / courriel: </span><span lang="EN-CA"><a href="mailto:tomasz.neugebauer@concordia.ca"><span lang="FR-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:blue">tomasz.neugebauer@concordia.ca</span></a></span><span lang="EN-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black">
</span><span lang="FR-CA" style="color:black"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="FR-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black">Mailing address / adresse postale: 1455 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8<br>
Street address / adresse municipale: 1400 De Maisonneuve Blvd. W., LB-540-03, Montreal, Quebec H3G 1M8<o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:17.85pt">
<span lang="EN-CA"><a href="http://library.concordia.ca/"><span lang="FR-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:blue">http://library.concordia.ca</span></a></span><span lang="FR-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:black;background:white"><br>
</span><span lang="EN-CA"><a href="http://www.concordia.ca/faculty/tomasz-neugebauer.html"><span lang="FR-CA" style="font-size:9.0pt;font-family:'Arial',sans-serif;color:blue">http://www.concordia.ca/faculty/tomasz-neugebauer.html</span></a>
</span><i><span lang="FR-CA" style="color:black"><o:p></o:p></span></i></p>
<p class="MsoNormal"><span lang="FR-CA"><o:p> </o:p></span></p>
</div>
</body>
</html>