[EP-tech] Reply: Re: validation on upload field

martin.braendle at id.uzh.ch martin.braendle at id.uzh.ch
Thu Nov 30 13:55:33 GMT 2017


Hi Alfredo,

Another way, instead of validating, is to transliterate the filenames to
ASCII. We extended the sanitise subroutine in perl_lib/EPrints/System.pm
like this:

Index: System.pm
===================================================================
--- System.pm	(revision 1405)
+++ System.pm	(revision 1406)
@@ -25,6 +25,7 @@
 use strict;
 use File::Copy;
 use Digest::MD5;
+use Text::Unidecode;

 =item $sys = EPrints::System->new();

@@ -540,6 +541,10 @@
 	$filepath = Encode::decode_utf8( $filepath )
 		if !utf8::is_utf8( $filepath );

+	# UZH CHANGE ZORA-542 2016/12/21/mb
+ 	$filepath = unidecode( $filepath );
+	$filepath =~ s![\x20]!_!g;
+
 	# control characters + Win32 restricted
 	$filepath =~ s![\x00-\x0f\x7f<>:"\\|?*]!_!g;
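
To see what the two added lines do outside of EPrints, here is a minimal
standalone sketch. It assumes the CPAN module Text::Unidecode is installed;
the helper name transliterate_filename and the sample filename are made up
for illustration and are not part of System.pm.

```perl
use strict;
use warnings;
use utf8;               # the sample filename below contains non-ASCII literals
use Text::Unidecode;    # CPAN module, exports unidecode()

# Hypothetical helper that mirrors the two lines added by the patch.
sub transliterate_filename
{
	my( $filepath ) = @_;

	$filepath = unidecode( $filepath );    # e.g. "ä" becomes "a"
	$filepath =~ s![\x20]!_!g;             # spaces become underscores

	return $filepath;
}

# The result is plain ASCII, so it is safe to print without an encoding layer.
print transliterate_filename( "Bär Übung 2017.pdf" ), "\n";
```

Note that this approach silently renames the file instead of asking the
depositor to fix it.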



Best regards,

Martin

--
Dr. Martin Brändle
Zentrale Informatik
Universität Zürich
Stampfenbachstr. 73
CH-8006 Zürich

mail: martin.braendle at id.uzh.ch
phone: +41 44 63 56705
fax: +41 44 63 54505
http://www.zi.uzh.ch



From:	th.lauke at arcor.de
To:	eprints-tech at ecs.soton.ac.uk
Date:	30.11.2017 13:52
Subject:	Re: [EP-tech] validation on upload field
Sent by:	eprints-tech-bounces at ecs.soton.ac.uk



Hi Alfredo,

We solved a similar feature request with a document_validate.pl that is
either repository-specific (i.e. in Eprints/archives/repoID/cfg/cfg.d/) or
server-specific (i.e. in Eprints/site_lib/cfg.d/):

$c->{validate_document} = sub
{
        my( $document, $repository, $for_archive ) = @_;

        my @problems = ();

        my $xml = $repository->xml();

        # default checks
# :
        # site-specific checks

        # check for a proper filename, i.e. one accepted by the Tivoli backup,
        # which ingests only ASCII filenames without blanks
        # print STDERR "main: ", $document->value( "main" )," escaped: ",URI::Escape::uri_escape_utf8($document->value( "main" ), "^A-Za-z0-9\-\._~\/");
        my $doc_name_uri = URI::Escape::uri_escape_utf8($document->value( "main" ), "^A-Za-z0-9\-\._~\/");
        if( $document->value( "main" ) ne $doc_name_uri )
        {
                my $fieldname = $repository->make_element( "span", class=>"ep_problem_field:documents" );
                $fieldname->appendChild( $document->dataset->render_name( $repository ) );

                my $prob = $repository->make_doc_fragment;
                $prob->appendChild( $repository->html_phrase( "validate:bad_filename", fieldname=>$fieldname ) );
                $prob->appendChild( $repository->make_text( $doc_name_uri ) );

                $prob->appendChild( $repository->html_phrase( "validate:original_filename") );
                $prob->appendChild( $repository->make_text( $document->value( "main" ) ) );

                push @problems, $prob;
        }


        return( @problems );
};
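
The core of the check can be tried without a running repository. The
following is only a sketch: it assumes URI::Escape is installed, the helper
name filename_is_clean is made up for illustration, and the character class
is written in single quotes here (my choice, not the original's) so that the
backslashes reach URI::Escape unchanged.

```perl
use strict;
use warnings;
use URI::Escape qw( uri_escape_utf8 );

# Everything outside this set gets percent-escaped.
my $unsafe = '^A-Za-z0-9\-\._~\/';

# Hypothetical helper: true if escaping leaves the name unchanged,
# i.e. the name contains only the allowed ASCII characters.
sub filename_is_clean
{
	my( $name ) = @_;

	return $name eq uri_escape_utf8( $name, $unsafe );
}

for my $name ( "report_2017.pdf", "my file.pdf", "B\x{e4}r.pdf" )
{
	# print the escaped form, which is always plain ASCII
	printf "%s: %s\n",
		uri_escape_utf8( $name, $unsafe ),
		filename_is_clean( $name ) ? "ok" : "would be reported";
}
```

Only the first of the three sample names passes; the other two contain a
blank and a non-ASCII character, respectively.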

After defining the two newly introduced phrases with
<epp:phrase id="validate:bad_filename">Please replace non-ASCII characters
(e.g. 'äöü') or blanks in the name of uploaded <epc:pin name="fieldname" />
appropriately to simplify future handling!<br/>Following filename prepared
for repository<br/></epp:phrase>
<epp:phrase id="validate:original_filename"><br/>is different to original
one :(<br/></epp:phrase>
in an appropriate .../lang/en/phrases/... file it should work :-)

Hth
Thomas

*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive: http://www.eprints.org/tech.php/
*** EPrints community wiki: http://wiki.eprints.org/
*** EPrints developers Forum: http://forum.eprints.org/
