[EP-tech] Re: Base64 decoding in 3.3
Tim Brody
tdb2 at ecs.soton.ac.uk
Wed May 30 15:07:22 BST 2012
Hi,
EPrints assumes a line length of 77 (76 chars + LF).
Using %4 instead will break if the returned data happens to fall over a
line break + 2 chars.
Here is hopefully a comprehensive fix:
http://trac.eprints.org/eprints/changeset/7764
It ignores all whitespace and consumes a multiple of 4 chars from each chunk.
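
For illustration only, here is a minimal Python sketch of the "ignore whitespace, consume a multiple of 4 chars per chunk" idea. It is not the code from the changeset; the class name ChunkedBase64Decoder and its methods are hypothetical.

```python
import base64
import re

_WHITESPACE = re.compile(rb"\s+")


class ChunkedBase64Decoder:
    """Hypothetical helper: decode Base64 that arrives in arbitrary chunks."""

    def __init__(self):
        # Base64 chars left over from earlier chunks that do not yet
        # form a complete group of 4.
        self._pending = b""

    def feed(self, chunk: bytes) -> bytes:
        # Strip all whitespace so the line length of the input is irrelevant.
        data = self._pending + _WHITESPACE.sub(b"", chunk)
        # Decode only a multiple of 4 chars; carry the rest to the next call.
        usable = len(data) - len(data) % 4
        self._pending = data[usable:]
        return base64.b64decode(data[:usable])

    def finish(self) -> None:
        # Anything still pending means the input ended mid-group.
        if self._pending:
            raise ValueError("truncated Base64 input")
```

Each chunk is passed to feed() and the returned bytes are appended to the output; finish() is called once at the end to detect truncated input.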
If you are talking to an EPrints instance that doesn't have this fix you
will need to format your Base64 into 76+LF lines. I would like to say
you should be doing this anyway, but I missed the CR that the spec
defines!:
http://en.wikipedia.org/wiki/Base64#Implementations_and_history
(So it ought to be 76+CR+LF i.e. modulo 78)
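
As a rough sketch of the client-side workaround (formatting Base64 into 76+LF lines before sending it to an unpatched EPrints), assuming a Python client; the function name to_76_lf_lines is hypothetical and nothing here comes from the thread's own code:

```python
import base64


def to_76_lf_lines(data: bytes) -> str:
    """Base64-encode data as lines of at most 76 chars, each ending in LF."""
    encoded = base64.b64encode(data).decode("ascii")
    # Every line except possibly the last is exactly 76 chars + LF = 77.
    return "".join(encoded[i:i + 76] + "\n" for i in range(0, len(encoded), 76))
```

Python's standard base64.encodebytes() produces the same 76+LF layout, so it could be used directly instead.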
/Tim.
On Wed, 2012-05-30 at 14:06 +0100, James Colhoun wrote:
> Hi Tim,
>
>
> I have sent you the files. I have also been able to fix it: within
> File.pm I changed "sub characters" FROM:
>
>
> print $tmpfile MIME::Base64::decode_base64( substr($_,0,length($_) - length($_)%77) );
>
>
> TO
>
>
> print $tmpfile MIME::Base64::decode_base64( substr($_,0,length($_) - length($_)%4) );
>
>
> This seems to stop the chunking from breaking up individual bytes and
> causing the problem. I am still testing this, but it would be great to
> know what you think.
>
>
> Jim
>
>
>
>
> -----eprints-tech-bounces at ecs.soton.ac.uk wrote: -----
> To: eprints-tech at ecs.soton.ac.uk
> From: Tim Brody
> Sent by: eprints-tech-bounces at ecs.soton.ac.uk
> Date: 05/29/2012 03:50PM
> Subject: [EP-tech] Re: Base64 decoding in 3.3
>
> On Tue, 2012-05-29 at 12:18 +0100, James Colhoun wrote:
> > Hi,
> >
> >
> > I am uploading publications via SWORD. Full text files are added to
> > the upload XML and encoded in base64. This worked fine until we
> > upgraded to 3.3. Now we get errors in the log:
> >
> >
> > failed: expected 3151 bytes but actually got 3149 bytes
> >
> >
> > So it seems the decoding of base64 is no longer working correctly.
> > Inside EPrints/DataObj/File.pm the functions end_element, characters
> > and start_element seem to create a tmp file that is corrupt. If I
> > add a write to file inside "sub characters" (see below) the PDF is
> > created correctly, so I know the data is passed in correctly. There
> > seems to be something fundamentally broken with the way the decoding
> > to tmpfile is working. Has anyone seen this or have a fix for it?
> >
> Hi,
>
> I can't replicate this. I did find a bug in XMLFiles for *producing*
> base64 encoded files, fixed by this:
> http://trac.eprints.org/eprints/ticket/4057
>
> This could be an edge case - can you post your XML somewhere or email
> it to me directly (if not too big)?
>
> --
> All the best,
> Tim
>
> *** Options:
> http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> *** Archive: http://www.eprints.org/tech.php/
> *** EPrints community wiki: http://wiki.eprints.org/