[EP-tech] Antwort: Re: validation on upload field
John Salter
J.Salter at leeds.ac.uk
Thu Nov 30 15:43:55 GMT 2017
In GitHub, there is this:
https://github.com/eprints/eprints/blob/3.3/lib/defaultcfg/cfg.d/optional_filename_sanitise.pl
which works alongside an addition to System.pm:
https://github.com/eprints/eprints/blob/69f4c9e581df137b970ce0ab4e08572976162411/perl_lib/EPrints/System.pm#L551-L559
That might be useful to know about?
Cheers,
John
From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of martin.braendle at id.uzh.ch
Sent: 30 November 2017 13:56
To: eprints-tech at ecs.soton.ac.uk
Subject: [EP-tech] Antwort: Re: validation on upload field
Hi Alfredo,
another way, instead of validating, is to transliterate the filenames to ASCII. We extended the sanitise subroutine in perl_lib/EPrints/System.pm like this:
Index: System.pm
===================================================================
--- System.pm (revision 1405)
+++ System.pm (revision 1406)
@@ -25,6 +25,7 @@
use strict;
use File::Copy;
use Digest::MD5;
+use Text::Unidecode;
=item $sys = EPrints::System->new();
@@ -540,6 +541,10 @@
$filepath = Encode::decode_utf8( $filepath )
if !utf8::is_utf8( $filepath );
+ # UZH CHANGE ZORA-542 2016/12/21/mb
+ $filepath = unidecode( $filepath );
+ $filepath =~ s![\x20]!_!g;
+
# control characters + Win32 restricted
$filepath =~ s![\x00-\x0f\x7f<>:"\\|?*]!_!g;
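For readers who want to try the effect of the patch outside EPrints, here is a minimal standalone sketch of the same three steps (decode, transliterate, replace blanks and restricted characters). The names sanitise_filename and to_ascii are hypothetical and not part of EPrints; the fallback used when Text::Unidecode is not installed is an assumption of this sketch and is cruder than unidecode.

```perl
use strict;
use warnings;
use Encode ();
use Unicode::Normalize qw(NFKD);

# Prefer Text::Unidecode (CPAN), as in the patch. The core-only fallback
# below is an assumption of this sketch, not part of the patch.
my $have_unidecode = eval { require Text::Unidecode; 1 };

# Hypothetical helper: transliterate a character string to ASCII.
sub to_ascii {
    my ($s) = @_;
    return Text::Unidecode::unidecode($s) if $have_unidecode;
    $s = NFKD($s);
    $s =~ s/\p{Mn}//g;           # drop combining marks, so "u" + diaeresis becomes "u"
    $s =~ s/[^\x00-\x7f]/_/g;    # anything still non-ASCII becomes "_"
    return $s;
}

# Hypothetical stand-in for the patched part of the sanitise subroutine.
sub sanitise_filename {
    my ($filepath) = @_;
    $filepath = Encode::decode_utf8($filepath)
        if !utf8::is_utf8($filepath);
    $filepath = to_ascii($filepath);
    $filepath =~ s![\x20]!_!g;                      # blanks
    $filepath =~ s![\x00-\x0f\x7f<>:"\\|?*]!_!g;    # control characters + Win32 restricted
    return $filepath;
}

# The argument is given as raw UTF-8 bytes ("Bericht über Zürich.pdf").
print sanitise_filename("Bericht \xc3\xbcber Z\xc3\xbcrich.pdf"), "\n";
```

With this input the sketch prints Bericht_uber_Zurich.pdf, so the stored name no longer depends on how the uploading client encoded it.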
Best regards,
Martin
--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich
mail: martin.braendle at id.uzh.ch
phone: +41 44 63 56705
fax: +41 44 63 54505
http://www.zi.uzh.ch
From: th.lauke at arcor.de
To: eprints-tech at ecs.soton.ac.uk
Date: 30.11.2017 13:52
Subject: Re: [EP-tech] validation on upload field
Sent by: eprints-tech-bounces at ecs.soton.ac.uk
________________________________
Hi Alfredo,
we solved a similar feature request with a document_validate.pl placed either in the repository-specific configuration (i.e. Eprints/archives/repoID/cfg/cfg.d/) or in the server-wide one (i.e. Eprints/site_lib/cfg.d/):
$c->{validate_document} = sub
{
    my( $document, $repository, $for_archive ) = @_;
    my @problems = ();
    my $xml = $repository->xml();
    # default checks
    # :
    # site-specific checks
    # check for a proper filename, i.e. one accepted by the Tivoli backup, which ingests only ASCII filenames without blanks
    # print STDERR "main: ", $document->value( "main" )," escaped: ",URI::Escape::uri_escape_utf8($document->value( "main" ), "^A-Za-z0-9\-\._~\/");
    my $doc_name_uri = URI::Escape::uri_escape_utf8($document->value( "main" ), "^A-Za-z0-9\-\._~\/");
    # the name is acceptable only if escaping leaves it unchanged
    if( $document->value( "main" ) ne $doc_name_uri )
    {
        my $fieldname = $repository->make_element( "span", class=>"ep_problem_field:documents" );
        $fieldname->appendChild( $document->dataset->render_name( $repository ) );
        my $prob = $repository->make_doc_fragment;
        $prob->appendChild( $repository->html_phrase( "validate:bad_filename", fieldname=>$fieldname ) );
        $prob->appendChild( $repository->make_text( $doc_name_uri ) );
        $prob->appendChild( $repository->html_phrase( "validate:original_filename") );
        $prob->appendChild( $repository->make_text( $document->value( "main" ) ) );
        push @problems, $prob;
    }
    return( @problems );
};
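To see which names such a check accepts, the comparison can be tried on its own. This is only a sketch: filename_is_safe is a hypothetical helper, the sample names are made up, and the safe set is written single-quoted here, which should describe the same characters as the double-quoted string in the configuration (unreserved URI characters plus "/"). It needs the URI::Escape module.

```perl
use strict;
use warnings;
use URI::Escape ();

# Characters that are left unescaped: unreserved URI characters plus "/".
my $safe = '^A-Za-z0-9\-._~/';

# Hypothetical helper: true if escaping leaves the name unchanged,
# i.e. the name would pass the validation.
sub filename_is_safe {
    my ($name) = @_;
    return $name eq URI::Escape::uri_escape_utf8( $name, $safe );
}

# Made-up sample names: plain ASCII, one with a blank, one with an umlaut.
for my $name ( "thesis_v2.pdf", "my thesis.pdf", "Z\x{fc}rich.pdf" ) {
    my $escaped = URI::Escape::uri_escape_utf8( $name, $safe );
    printf "%-18s %s\n", $escaped, filename_is_safe($name) ? "ok" : "rejected";
}
```

Only the first name passes; the blank is escaped as %20 and the umlaut as %C3%BC, so both escaped strings differ from the originals and would be reported as problems.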
After defining the two newly introduced phrases with
<epp:phrase id="validate:bad_filename">Please replace non-ASCII characters (e.g. 'äöü') or blanks in the name of uploaded <epc:pin name="fieldname" /> appropriately to simplify future handling!<br/>Following filename prepared for repository<br/></epp:phrase>
<epp:phrase id="validate:original_filename"><br/>is different to original one :(<br/></epp:phrase>
in an appropriate .../lang/en/phrases/... file it should work :-)
Hth
Thomas
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/