<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p><font size="4">Hi Tomasz,</font></p>
    <p><font size="4">There are two ways to work round this issue.&nbsp; One
        has been in EPrints for quite a while; the other I introduced in
        3.4.3 to help deal with this issue retrospectively.</font></p>
    <p><font size="4">1.
        <a class="moz-txt-link-freetext" href="https://wiki.eprints.org/w/Optional_filename_sanitise.pl">https://wiki.eprints.org/w/Optional_filename_sanitise.pl</a> allows
        you to set characters that should be removed before a filename
        is recorded in the database or saved to disk.&nbsp; I have to admit I
        did not know about this until fairly recently, so I have not
        tested how well it works or whether it will solve your problem.&nbsp; If you look
        at /opt/eprints3/lib/cfg.d/optional_filename_sanitise.pl there
        is a function that can be set under
        $c-&gt;{optional_filename_sanitise}.&nbsp; The default (albeit
        commented out) function converts white space, brackets and @
        signs into underscores.&nbsp; You could add a line like the one below to deal
        with apostrophes.</font></p>
    <p><font size="4">$filepath =~ s!\x27!_!g;</font></p>
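    <p><font size="4">As a rough sketch only, the uncommented function with
        that extra line might look something like the following.&nbsp; I am
        assuming here that the function is passed the repository object
        and the file path and returns the sanitised path, and the exact
        substitutions in the shipped default may differ, so please check
        this against the commented-out code in your own copy of
        optional_filename_sanitise.pl before relying on it.</font></p>
    <pre>
# Sketch only: the argument list below is an assumption; compare it with
# the commented-out default in your own optional_filename_sanitise.pl.
$c-&gt;{optional_filename_sanitise} = sub
{
    my( $repo, $filepath ) = @_;   # assumed arguments

    $filepath =~ s!\s!_!g;         # white space to underscore
    $filepath =~ s![()\[\]]!_!g;   # brackets to underscore
    $filepath =~ s!\@!_!g;         # @ signs to underscore
    $filepath =~ s!\x27!_!g;       # apostrophes to underscore (the added line)

    return $filepath;
};
    </pre>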
    <p><font size="4">2. The new functionality I added for 3.4.3
        allows files on disk to be found under the filename
        &lt;fileid&gt;.bin.&nbsp; This lets you fix this sort of issue
        by renaming the file on disk to &lt;fileid&gt;.bin.&nbsp; You
        can also enable it so that future files are automatically saved in
        the format &lt;fileid&gt;.bin by setting:</font></p>
    <p><font size="4"> $c-&gt;{generic_filenames} = 1;</font></p>
    <p><font size="4">I would probably advise against doing this on a
        live repository, especially if you have unusual uploads such as
        uploading multiple files at once through &quot;Upload from URL&quot;.&nbsp; If
        you want to test this on a development repo, then please do, as
        any real-world-ish feedback on this feature would be useful.</font></p>
    <p><font size="4">Regards</font></p>
    <p><font size="4">David Newman<br>
      </font></p>
    <div class="moz-cite-prefix">On 20/02/2022 20:32, Tomasz Neugebauer
      via Eprints-tech wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:YQXPR01MB24073408D07F873ED5D030098B399@YQXPR01MB2407.CANPRD01.PROD.OUTLOOK.COM">
      
      <div>
        <div class="WordSection1">
          <p class="MsoNormal">Good afternoon!<o:p></o:p></p>
          <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          <p class="MsoNormal">I’m trying to troubleshoot an issue with
            exporting out a deposited file that has an apostrophe in the
            filename.<o:p></o:p></p>
          <p class="MsoNormal">This is the issue: <a href="https://github.com/eprintsug/EPrintsArchivematica/issues/40" moz-do-not-send="true">
https://github.com/eprintsug/EPrintsArchivematica/issues/40</a><o:p></o:p></p>
          <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          <p class="MsoNormal">Does EPrints replace apostrophes in
            filenames on disk with <span style="font-size:10.5pt;font-family:&quot;Segoe
              UI&quot;,sans-serif;color:#24292F;background:white">
              =0027?<o:p></o:p></span></p>
          <p class="MsoNormal"><span style="font-size:10.5pt;font-family:&quot;Segoe
              UI&quot;,sans-serif;color:#24292F;background:white">Where
              in the code does that happen?<o:p></o:p></span></p>
          <p class="MsoNormal"><span style="font-size:10.5pt;font-family:&quot;Segoe
              UI&quot;,sans-serif;color:#24292F;background:white">The
              URL of the file has the apostrophe, for example:<o:p></o:p></span></p>
          <p class="MsoNormal"><span style="font-size:10.5pt;font-family:&quot;Segoe
              UI&quot;,sans-serif;color:#24292F;background:white"><a href="https://spectrum.library.concordia.ca/id/eprint/7066/1/Services_techniques_a_l'Universite_Concordia.pdf" moz-do-not-send="true">https://spectrum.library.concordia.ca/id/eprint/7066/1/Services_techniques_a_l'Universite_Concordia.pdf</a><o:p></o:p></span></p>
          <p class="MsoNormal">But unlike other Unicode characters, the
            apostrophe doesn’t make it into the file name on disk, and
            is substituted with =0027.<o:p></o:p></p>
          <p class="MsoNormal">I’m looking for confirmation that this is
            how it is “supposed” to work, and for an understanding of where
            this happens in the code, so that I might ultimately know
            how many OTHER characters are replaced in this way in the
            filename.<o:p></o:p></p>
          <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          <p class="MsoNormal">Tomasz<o:p></o:p></p>
          <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
        </div>
      </div>
    </blockquote>
  </body>
</html>