NYTProf Performance Profile   « line view »
For /home/ss5/perl5/perlbrew/perls/perl-5.22.0/bin/benchmarkanything-storage
  Run on Mon Jan 29 16:55:34 2018
Reported on Mon Jan 29 16:57:07 2018

Filename: /home/ss5/perl5/perlbrew/perls/perl-5.22.0/lib/site_perl/5.22.0/URI/Escape.pm
Statements: Executed 19281 statements in 44.4ms
Subroutines
Calls  P  F  Exclusive Time  Inclusive Time  Subroutine
3001   1  1  19.8ms          25.1ms          URI::Escape::uri_escape
1001   1  1  4.92ms          5.44ms          URI::Escape::uri_unescape
3001   1  1  3.46ms          3.46ms          URI::Escape::CORE:regcomp (opcode)
4002   2  1  2.36ms          2.36ms          URI::Escape::CORE:subst (opcode)
1      1  1  7µs             16µs            URI::Escape::BEGIN@140
1      1  1  6µs             8µs             URI::Escape::BEGIN@3
1      1  1  3µs             6µs             URI::Escape::BEGIN@4
1      1  1  2µs             2µs             URI::Escape::BEGIN@146
2      1  1  1µs             1µs             URI::Escape::CORE:qr (opcode)
0      0  0  0s              0s              URI::Escape::_fail_hi
0      0  0  0s              0s              URI::Escape::escape_char
0      0  0  0s              0s              URI::Escape::uri_escape_utf8
Line  Statements  Time on line  Calls  Time in subs  Code
1package URI::Escape;
2
3  2  11µs  2  9µs
# spent 8µs (6+1) within URI::Escape::BEGIN@3 which was called: # once (6µs+1µs) by URI::BEGIN@22 at line 3
use strict;
# spent 8µs making 1 call to URI::Escape::BEGIN@3 # spent 1µs making 1 call to strict::import
4  2  53µs  2  8µs
# spent 6µs (3+2) within URI::Escape::BEGIN@4 which was called: # once (3µs+2µs) by URI::BEGIN@22 at line 4
use warnings;
# spent 6µs making 1 call to URI::Escape::BEGIN@4 # spent 2µs making 1 call to warnings::import
5
6=head1 NAME
7
8URI::Escape - Percent-encode and percent-decode unsafe characters
9
10=head1 SYNOPSIS
11
12 use URI::Escape;
13 $safe = uri_escape("10% is enough\n");
14 $verysafe = uri_escape("foo", "\0-\377");
15 $str = uri_unescape($safe);
16
17=head1 DESCRIPTION
18
19This module provides functions to percent-encode and percent-decode URI strings as
20defined by RFC 3986. Percent-encoding URIs is informally called "URI escaping".
21This is the terminology used by this module, which predates the formalization of the
22terms by the RFC by several years.
23
24A URI consists of a restricted set of characters. The restricted set
25of characters consists of digits, letters, and a few graphic symbols
26chosen from those common to most of the character encodings and input
27facilities available to Internet users. They are made up of the
28"unreserved" and "reserved" character sets as defined in RFC 3986.
29
30 unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
31 reserved = ":" / "/" / "?" / "#" / "[" / "]" / "@"
32 "!" / "$" / "&" / "'" / "(" / ")"
33 / "*" / "+" / "," / ";" / "="
34
35In addition, any byte (octet) can be represented in a URI by an escape
36sequence: a triplet consisting of the character "%" followed by two
37hexadecimal digits. A byte can also be represented directly by a
38character, using the US-ASCII character for that octet.
39
40Some of the characters are I<reserved> for use as delimiters or as
41part of certain URI components. These must be escaped if they are to
42be treated as ordinary data. Read RFC 3986 for further details.
43
44The functions provided (and exported by default) from this module are:
45
46=over 4
47
48=item uri_escape( $string )
49
50=item uri_escape( $string, $unsafe )
51
52Replaces each unsafe character in the $string with the corresponding
53escape sequence and returns the result. The $string argument should
54be a string of bytes. The uri_escape() function will croak if given a
55character with a code above 255. Use uri_escape_utf8() if you know you
56have such chars and/or want chars in the 128 .. 255 range treated as
57UTF-8.
58
59The uri_escape() function takes an optional second argument that
60overrides the set of characters that are to be escaped. The set is
61specified as a string that can be used in a regular expression
62character class (between [ ]). E.g.:
63
64 "\x00-\x1f\x7f-\xff" # all control and hi-bit characters
65 "a-z" # all lower case characters
66 "^A-Za-z" # everything not a letter
67
68The default set of characters to be escaped is all those which are
69I<not> part of the C<unreserved> character class shown above as well
70as the reserved characters. I.e. the default is:
71
72 "^A-Za-z0-9\-\._~"
73
74=item uri_escape_utf8( $string )
75
76=item uri_escape_utf8( $string, $unsafe )
77
78Works like uri_escape(), but will encode chars as UTF-8 before
79escaping them. This makes this function able to deal with characters
80with code above 255 in $string. Note that chars in the 128 .. 255
81range will be escaped differently by this function compared to what
82uri_escape() would. For chars in the 0 .. 127 range there is no
83difference.
84
85Equivalent to:
86
87 utf8::encode($string);
88 my $uri = uri_escape($string);
89
90Note: JavaScript has a function called escape() that produces the
91sequence "%uXXXX" for chars in the 256 .. 65535 range. This function
92has really nothing to do with URI escaping but some folks got confused
93since it "does the right thing" in the 0 .. 255 range. Because of
94this you sometimes see "URIs" with these kinds of escapes. The
95JavaScript encodeURIComponent() function is similar to uri_escape_utf8().
96
97=item uri_unescape($string,...)
98
99Returns a string with each %XX sequence replaced with the actual byte
100(octet).
101
102This does the same as:
103
104 $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
105
106but does not modify the string in-place as this RE would. Using the
107uri_unescape() function instead of the RE might make the code look
108cleaner and is a few characters less to type.
109
110In a simple benchmark test I did, calling the function (instead of
111the inline RE above) was something like 40% slower if a few chars were
112unescaped, and something like 700% slower if none were. If
113you are going to unescape a lot of times it might be a good idea to
114inline the RE.
115
116If the uri_unescape() function is passed multiple strings, then each
117one is returned unescaped.
118
119=back
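A minimal usage sketch for the three exported functions described in the list above, assuming URI::Escape is installed. The input strings are illustrative only; the values in the comments follow from the RFC 3986 default set and the byte-to-escape mapping built in this file.

```perl
use strict;
use warnings;
use URI::Escape;   # exports uri_escape, uri_unescape, uri_escape_utf8 by default

# Default unsafe set: everything outside the RFC 3986 "unreserved" class.
my $safe = uri_escape("10% is enough\n");      # "10%25%20is%20enough%0A"

# Custom unsafe set: escape only lower-case letters.
my $custom = uri_escape("abc-XYZ", "a-z");     # "%61%62%63-XYZ"

# Characters with a code above 255 must go through uri_escape_utf8().
my $utf8 = uri_escape_utf8("\x{263A}");        # "%E2%98%BA"

# In the 128 .. 255 range the two functions produce different output.
my $bytes = uri_escape("\xE9");                # "%E9"
my $chars = uri_escape_utf8("\xE9");           # "%C3%A9"

# uri_unescape() returns a new string and leaves its argument untouched.
my $round = uri_unescape($safe);               # "10% is enough\n"
```

Passing the unsafe set as a plain string (rather than a compiled regex) matches the interface documented above; the module caches one generated substitution sub per distinct set.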
120
121The module can also export the C<%escapes> hash, which contains the
122mapping from all 256 bytes to the corresponding escape codes. Lookup
123in this hash is faster than evaluating C<sprintf("%%%02X", ord($byte))>
124each time.
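As a rough illustration of the C<%escapes> export mentioned above, the sketch below imports the hash explicitly (it is in @EXPORT_OK, not @EXPORT) and compares a lookup with the equivalent sprintf call. The sample strings are assumptions chosen for the example.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape %escapes);   # %escapes must be requested by name

# A table lookup and the sprintf form yield the same escape code.
my $via_hash    = $escapes{"/"};                 # "%2F"
my $via_sprintf = sprintf("%%%02X", ord("/"));   # "%2F"

# Escape every byte of a string, including otherwise safe ones.
my $all = join '', map { $escapes{$_} } split //, "a b";   # "%61%20%62"
```

The table only covers the 256 byte values, so a lookup for a character with a code above 255 returns undef.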
125
126=head1 SEE ALSO
127
128L<URI>
129
130
131=head1 COPYRIGHT
132
133Copyright 1995-2004 Gisle Aas.
134
135This program is free software; you can redistribute it and/or modify
136it under the same terms as Perl itself.
137
138=cut
139
140  3  40µs  3  26µs
# spent 16µs (7+10) within URI::Escape::BEGIN@140 which was called: # once (7µs+10µs) by URI::BEGIN@22 at line 140
use Exporter 5.57 'import';
# spent 16µs making 1 call to URI::Escape::BEGIN@140 # spent 5µs making 1 call to UNIVERSAL::VERSION # spent 4µs making 1 call to Exporter::import
141our %escapes;
142  1  800ns  our @EXPORT = qw(uri_escape uri_unescape uri_escape_utf8);
143  1  300ns  our @EXPORT_OK = qw(%escapes);
144  1  200ns  our $VERSION = "3.31";
145
146  2  284µs  1  2µs
# spent 2µs within URI::Escape::BEGIN@146 which was called: # once (2µs+0s) by URI::BEGIN@22 at line 146
use Carp ();
# spent 2µs making 1 call to URI::Escape::BEGIN@146
147
148# Build a char->hex map
149  1  1µs  for (0..255) {
150  256  165µs   $escapes{chr($_)} = sprintf("%%%02X", $_);
151}
152
153  1  200ns  my %subst; # compiled patterns
154
155  1  8µs  2  1µs  my %Unsafe = (
# spent 1µs making 2 calls to URI::Escape::CORE:qr, avg 650ns/call
156 RFC2732 => qr/[^A-Za-z0-9\-_.!~*'()]/,
157 RFC3986 => qr/[^A-Za-z0-9\-\._~]/,
158);
159
160
# spent 25.1ms (19.8+5.31) within URI::Escape::uri_escape which was called 3001 times, avg 8µs/call: # 3001 times (19.8ms+5.31ms) by Search::Elasticsearch::Role::Client::Direct::_parse_path at line 67 of Search/Elasticsearch/Role/Client/Direct.pm, avg 8µs/call
sub uri_escape {
161  3001  776µs   my($text, $patn) = @_;
162  3001  652µs   return undef unless defined $text;
163  3001  1.07ms   if (defined $patn){
164 unless (exists $subst{$patn}) {
165 # Because we can't compile the regex we fake it with a cached sub
166 (my $tmp = $patn) =~ s,/,\\/,g;
167 eval "\$subst{\$patn} = sub {\$_[0] =~ s/([$tmp])/\$escapes{\$1} || _fail_hi(\$1)/ge; }";
168 Carp::croak("uri_escape: $@") if $@;
169 }
170 &{$subst{$patn}}($text);
171 } else {
172  3001  19.8ms  6002  5.31ms   $text =~ s/($Unsafe{RFC3986})/$escapes{$1} || _fail_hi($1)/ge;
# spent 3.46ms making 3001 calls to URI::Escape::CORE:regcomp, avg 1µs/call # spent 1.84ms making 3001 calls to URI::Escape::CORE:subst, avg 615ns/call
173 }
174  3001  16.1ms   $text;
175}
176
177sub _fail_hi {
178 my $chr = shift;
179 Carp::croak(sprintf "Can't escape \\x{%04X}, try uri_escape_utf8() instead", ord($chr));
180}
181
182sub uri_escape_utf8 {
183 my $text = shift;
184 utf8::encode($text);
185 return uri_escape($text, @_);
186}
187
188
# spent 5.44ms (4.92+515µs) within URI::Escape::uri_unescape which was called 1001 times, avg 5µs/call: # 1001 times (4.92ms+515µs) by URI::_server::host at line 97 of URI/_server.pm, avg 5µs/call
sub uri_unescape {
189 # Note from RFC1630: "Sequences which start with a percent sign
190 # but are not followed by two hexadecimal characters are reserved
191 # for future extension"
192  1001  456µs   my $str = shift;
193  1001  435µs   if (@_ && wantarray) {
194 # not executed for the common case of a single argument
195 my @str = ($str, @_); # need to copy
196 for (@str) {
197 s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
198 }
199 return @str;
200 }
201  1001  2.79ms  1001  515µs   $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg if defined $str;
# spent 515µs making 1001 calls to URI::Escape::CORE:subst, avg 515ns/call
202  1001  1.71ms   $str;
203}
204
205# XXX FIXME escape_char is buggy as it assigns meaning to the string's storage format.
206sub escape_char {
207 # Old versions of utf8::is_utf8() didn't properly handle magical vars (e.g. $1).
208 # The following forces a fetch to occur beforehand.
209 my $dummy = substr($_[0], 0, 0);
210
211 if (utf8::is_utf8($_[0])) {
212 my $s = shift;
213 utf8::encode($s);
214 unshift(@_, $s);
215 }
216
217 return join '', @URI::Escape::escapes{split //, $_[0]};
218}
219
220  1  4µs  1;
 
# spent 1µs within URI::Escape::CORE:qr which was called 2 times, avg 650ns/call: # 2 times (1µs+0s) by URI::BEGIN@22 at line 155, avg 650ns/call
sub URI::Escape::CORE:qr; # opcode
# spent 3.46ms within URI::Escape::CORE:regcomp which was called 3001 times, avg 1µs/call: # 3001 times (3.46ms+0s) by URI::Escape::uri_escape at line 172, avg 1µs/call
sub URI::Escape::CORE:regcomp; # opcode
# spent 2.36ms within URI::Escape::CORE:subst which was called 4002 times, avg 590ns/call: # 3001 times (1.84ms+0s) by URI::Escape::uri_escape at line 172, avg 615ns/call # 1001 times (515µs+0s) by URI::Escape::uri_unescape at line 201, avg 515ns/call
sub URI::Escape::CORE:subst; # opcode
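
The POD above suggests inlining the substitution when uri_unescape() is called many times, and this profile attributes 5.44ms to its 1001 calls. One way to check whether inlining pays off for a given workload is a small comparison with the core Benchmark module. This is only a sketch: the sample input is hypothetical, and the timings will vary by machine and Perl build.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use URI::Escape qw(uri_unescape);

my $input = "host%2Ename%20with%20escapes";   # hypothetical sample input

# Inline form from the POD, applied to a copy so $input stays unchanged.
(my $inline = $input) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

# Both forms must agree before their timings are worth comparing.
my $same = $inline eq uri_unescape($input);

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    function => sub { my $r = uri_unescape($input) },
    inline   => sub { (my $r = $input) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg },
});
```

The comparison measures only the unescape step in isolation; whether the difference matters depends on how much of the total run time the profile attributes to it.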