<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<div style="padding-bottom: 10px; padding-top: 5px;">
<div style="padding:12px; border:1px solid #8D3970; background-color:#F7F9FA; color:#8D3970; font-size:14px; line-height:22px; font-family: Calibri, Arial, Helvetica, sans-serif;">
<strong>CAUTION:</strong> This e-mail originated outside the University of Southampton.
</div>
</div>
<div>
<p><font size="2" face="sans-serif">Dear David</font><br>
<br>
<font size="2" face="sans-serif">Thank you for your support!<br>
</font><br>
<font size="2" face="sans-serif">Kind regards<br>
Jens</font><br>
<br>
<font size="2" face="sans-serif">-- <br>
Jens Witzel<br>
Zentrale Informatik<br>
Universität Zürich<br>
Stampfenbachstrasse 73<br>
CH-8006 Zürich<br>
<br>
mail: jens.witzel@uzh.ch<br>
phone: +41 44 63 56777<br>
<a href="http://www.zi.uzh.ch/">http://www.zi.uzh.ch</a></font><br>
<br>
<br>
<font size="1" color="#5F5F5F" face="sans-serif">From: </font><font size="1" face="sans-serif">"David R Newman" &lt;drn@ecs.soton.ac.uk&gt;</font><br>
<font size="1" color="#5F5F5F" face="sans-serif">To: </font><font size="1" face="sans-serif">eprints-tech@ecs.soton.ac.uk, jens.witzel@uzh.ch</font><br>
<font size="1" color="#5F5F5F" face="sans-serif">Date: </font><font size="1" face="sans-serif">26.07.2021 10:50</font><br>
<font size="1" color="#5F5F5F" face="sans-serif">Subject: </font><font size="1" face="sans-serif">Re: [EP-tech] Crawler ends up with 404, dont know how to handle MIME subtype wildcard</font><br>
</p>
<hr width="100%" size="2" align="left" noshade="" style="color:#8091A5; ">
<br>
<br>
<br>
<font size="3" face="serif">Hi Jens,</font>
<p><font size="3" face="serif">I can replicate the same problem on 3.4 GitHub HEAD [1]. I have created a GitHub issue for this [2] and will investigate.</font>
</p>
<p><font size="3" face="serif">Regards </font> </p>
<p><font size="3" face="serif">David Newman</font> </p>
<p><font size="3" face="serif">[1] </font><a href="https://github.com/eprints/eprints3.4"><font size="3" color="#0000FF" face="serif"><u>https://github.com/eprints/eprints3.4</u></font></a>
</p>
<p><font size="3" face="serif">[2] </font><a href="https://github.com/eprints/eprints3.4/issues/159"><font size="3" color="#0000FF" face="serif"><u>https://github.com/eprints/eprints3.4/issues/159</u></font></a>
</p>
<p><font size="3" face="serif">On 26/07/2021 09:31, jens.witzel--- via Eprints-tech wrote:</font>
</p>
<ul style="padding-left: 36pt; margin-left: 0px">
<p><font size="2" face="sans-serif">Dear all</font><font size="3" face="serif"><br>
</font><font size="2" face="sans-serif"><br>
Unfortunately, one of our partner crawlers reports a 404 error during download. The problem occurs when a wildcard is used as the MIME subtype.</font><font size="3" face="serif"><br>
</font><font size="2" face="sans-serif"><br>
Here is an example from our repository ZORA: let us try to fetch publication no. 143147 with curl:</font><font size="3" face="serif"><br>
</font><font size="2" face="sans-serif"><br>
HTTP status 200 is returned when<br>
- no Accept header is specified: curl -v </font><a href="https://www.zora.uzh.ch/id/eprint/143147/"><font size="2" color="#0000FF" face="sans-serif"><u>https://www.zora.uzh.ch/id/eprint/143147/</u></font></a><font size="2" face="sans-serif"><br>
- an exact MIME type is specified: curl -v -H 'Accept: text/html' </font><a href="https://www.zora.uzh.ch/id/eprint/143147/"><font size="2" color="#0000FF" face="sans-serif"><u>https://www.zora.uzh.ch/id/eprint/143147/</u></font></a><font size="2" face="sans-serif"><br>
- any MIME type is specified: curl -v -H 'Accept: */*' </font><a href="https://www.zora.uzh.ch/id/eprint/143147/"><font size="2" color="#0000FF" face="sans-serif"><u>https://www.zora.uzh.ch/id/eprint/143147/</u></font></a><font size="3" face="serif"><br>
</font><font size="2" face="sans-serif"><br>
HTTP status 404 is returned if the MIME subtype is a wildcard, e.g. 'text/*'.</font><font size="3" face="serif"><br>
</font><font size="2" face="sans-serif"><br>
==> curl -v -H 'Accept: text/*,application/*' </font><a href="https://www.zora.uzh.ch/id/eprint/143147/"><font size="2" color="#0000FF" face="sans-serif"><u>https://www.zora.uzh.ch/id/eprint/143147/</u></font></a><font size="3" face="serif"><br>
</font><font size="2" face="sans-serif"><br>
[...]<br>
< HTTP/1.1 404 Not Found<br>
< Date: Mon, 26 Jul 2021 08:23:04 GMT<br>
< Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips mod_perl/2.0.11 Perl/v5.16.3<br>
< Cache-Control: no-store, no-cache, must-revalidate<br>
< Strict-Transport-Security: max-age=15780000<br>
< Transfer-Encoding: chunked<br>
< Content-Type: text/html; charset=utf-8</font><font size="3" face="serif"><br>
</font><font size="2" face="sans-serif"><br>
The header "Accept: text/*,application/*" should be valid, so we think something is going wrong around CRUD.pm [line 948]:
</font><font size="3" face="serif">elsif( $subtype eq '*' ) {}<br>
</font><font size="2" face="sans-serif"><br>
Is this a bug or is there a workaround? Any help is appreciated.<br>
<br>
Have a nice day<br>
Jens</font><font size="3" face="serif"><br>
<br>
</font><font size="2" face="sans-serif"><br>
-- <br>
Jens Witzel<br>
Zentrale Informatik<br>
Universität Zürich<br>
Stampfenbachstrasse 73<br>
CH-8006 Zürich<br>
<br>
mail: </font><a href="mailto:jens.witzel@uzh.ch"><font size="2" color="#0000FF" face="sans-serif"><u>jens.witzel@uzh.ch</u></font></a><font size="2" face="sans-serif"><br>
phone: +41 44 63 56777</font><font size="2" color="#0000FF" face="sans-serif"><u><br>
</u></font><a href="http://www.zi.uzh.ch/"><font size="2" color="#0000FF" face="sans-serif"><u>http://www.zi.uzh.ch</u></font></a>
</p>
</ul>
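<p>The report above turns on how an Accept media range such as text/* is matched against the type a server can return. The following is a minimal Python sketch of wildcard-aware matching, based on the general media-range rules of HTTP content negotiation. It is not the EPrints implementation (CRUD.pm is Perl), the function names parse_accept and accept_matches are hypothetical, and q-values are ignored for brevity, so an entry with q=0 is not treated as a refusal.</p>
<pre>
```python
def parse_accept(header):
    """Split an Accept header into (type, subtype) pairs, dropping parameters such as q."""
    ranges = []
    for part in header.split(","):
        media_range = part.split(";")[0].strip().lower()
        if "/" not in media_range:
            continue  # skip malformed entries
        main, sub = media_range.split("/", 1)
        ranges.append((main, sub))
    return ranges


def accept_matches(header, offered):
    """Return True if the offered MIME type satisfies any media range in the header."""
    off_main, off_sub = offered.lower().split("/", 1)
    for main, sub in parse_accept(header):
        # Full wildcard: any type is acceptable.
        if main == "*" and sub == "*":
            return True
        # Same main type, and the subtype is either a wildcard or an exact match.
        if main == off_main and sub in ("*", off_sub):
            return True
    return False


if __name__ == "__main__":
    # Headers taken from the curl calls in the report, plus one non-matching case.
    for header in ("text/html", "*/*", "text/*,application/*", "image/*"):
        print(header, accept_matches(header, "text/html"))
```
</pre>
<p>Under these assumptions, "text/*,application/*" matches an offered type of text/html, which is the outcome the report expects from the server instead of a 404.</p>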
<br>
</div>
</body>
</html>