[EP-tech] URI::Escape error in URI::_query

Matthew Kerwin matthew.kerwin at qut.edu.au
Fri Apr 13 01:02:53 BST 2012

URI::_query cannot handle wide characters (Unicode code points above 0xFF); it emits the following warning:

Use of uninitialized value within %URI::Escape::escapes in substitution iterator at /usr/local/eprints/perl_lib/URI/_query.pm line 16, <$fh> line 287.

I added a trivial patch to that file, based on the difference between URI::Escape's uri_escape and uri_escape_utf8 functions (see below). My question is: is it reasonable for the query() method to assume that all inputs are octet strings rather than character strings? And if so, how do we police all entry points to ensure data sanity?
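
To make the failure concrete, here is a minimal sketch that uses only core Perl. The %escapes table is a stand-in built the way I understand %URI::Escape::escapes to be built (one "%XX" entry per octet), $safe is a simplified substitute for $URI::uric, and escape_query is a hypothetical helper, not part of URI.

```perl
use strict;
use warnings;

# Stand-in for %URI::Escape::escapes: one "%XX" entry per octet 0x00-0xFF.
my %escapes;
$escapes{ chr $_ } = sprintf( '%%%02X', $_ ) for 0 .. 255;

# Simplified stand-in for $URI::uric (the real class allows more characters).
my $safe = 'A-Za-z0-9_.~=&-';

# Hypothetical helper mirroring the substitution on line 16 of _query.pm.
sub escape_query {
    my ( $q, $encode_first ) = @_;
    utf8::encode($q) if $encode_first;    # the one-line patch
    $q =~ s/([^$safe])/$escapes{$1}/g;
    return $q;
}

my $q = "q=caf\x{e9} \x{263a}";           # U+263A is a wide character

# The table has no entry for a code point above 0xFF, so the lookup
# returns undef and the substitution warns.
print defined $escapes{"\x{263a}"} ? "found\n" : "missing\n";

# With the patch, every character is reduced to UTF-8 octets first,
# so each lookup hits the table.
print escape_query( $q, 1 ), "\n";        # q=caf%C3%A9%20%E2%98%BA
```

Calling escape_query($q, 0) instead reproduces the "uninitialized value" warning and drops the wide character from the result.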

--- perl_lib/URI/_query.pm	(revision 4369)
+++ perl_lib/URI/_query.pm	(working copy)
@@ -13,6 +13,7 @@
 	my $q = shift;
 	$$self = $1;
 	if (defined $q) {
+ 	    utf8::encode($q);
 	    $q =~ s/([^$URI::uric])/$URI::Escape::escapes{$1}/go;
 	    $$self .= "?$q";
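
The reason I ask about octet strings versus character strings is that the patch is only correct if the caller passes characters. A rough sketch of the hazard follows, again in core Perl only; looks_like_characters is a hypothetical helper name, and it catches only the unambiguous case.

```perl
use strict;
use warnings;

# Case 1: the caller passes a character string. Encoding once is correct.
my $chars = "caf\x{e9}";
utf8::encode($chars);
print unpack( 'H*', $chars ), "\n";     # 636166c3a9

# Case 2: the caller already passes UTF-8 octets. Encoding again treats
# each octet as a Latin-1 character and double-encodes it.
my $octets = "caf\xC3\xA9";
utf8::encode($octets);
print unpack( 'H*', $octets ), "\n";    # 636166c383c2a9

# Hypothetical guard for an entry point: a string containing a code point
# above 0xFF cannot be an octet string. Strings limited to 0x00-0xFF stay
# ambiguous, so this detects only the unambiguous case.
sub looks_like_characters {
    my ($s) = @_;
    return $s =~ /[^\x00-\xFF]/ ? 1 : 0;
}
```

A check like this could reject or log wide-character input at each entry point, but it cannot tell Latin-1 characters from raw octets, so the convention still has to be agreed on and documented.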

Matthew Kerwin | Web Developer | TILS | Digital Repository Team | Level 2, I Block, Kelvin Grove | ph 3138 3910 | matthew.kerwin at qut.edu.au | CRICOS No 00213J
