← Index
NYTProf Performance Profile   « line view »
For /Users/timbo/perl5/perlbrew/perls/perl-5.18.2/bin/perlcritic
  Run on Sat Mar 19 22:12:22 2016
Reported on Sat Mar 19 22:14:13 2016

Filename/Users/timbo/perl5/perlbrew/perls/perl-5.18.2/lib/site_perl/5.18.2/PPIx/Regexp.pm
StatementsExecuted 22 statements in 890µs
Subroutines
Calls P F Exclusive
Time
Inclusive
Time
Subroutine
1111.79ms32.9msPPIx::Regexp::::BEGIN@90PPIx::Regexp::BEGIN@90
11112µs24µsPPIx::Regexp::::BEGIN@85PPIx::Regexp::BEGIN@85
11111µs16µsPPIx::Regexp::::BEGIN@86PPIx::Regexp::BEGIN@86
1118µs29µsPPIx::Regexp::::BEGIN@92PPIx::Regexp::BEGIN@92
1117µs4.17msPPIx::Regexp::::BEGIN@88PPIx::Regexp::BEGIN@88
1117µs26µsPPIx::Regexp::::BEGIN@93PPIx::Regexp::BEGIN@93
1114µs4µsPPIx::Regexp::::BEGIN@91PPIx::Regexp::BEGIN@91
0000s0sPPIx::Regexp::::_cache_sizePPIx::Regexp::_cache_size
0000s0sPPIx::Regexp::::_componentPPIx::Regexp::_component
0000s0sPPIx::Regexp::::can_be_quantifiedPPIx::Regexp::can_be_quantified
0000s0sPPIx::Regexp::::capture_namesPPIx::Regexp::capture_names
0000s0sPPIx::Regexp::::delimitersPPIx::Regexp::delimiters
0000s0sPPIx::Regexp::::errstrPPIx::Regexp::errstr
0000s0sPPIx::Regexp::::failuresPPIx::Regexp::failures
0000s0sPPIx::Regexp::::flush_cachePPIx::Regexp::flush_cache
0000s0sPPIx::Regexp::::max_capture_numberPPIx::Regexp::max_capture_number
0000s0sPPIx::Regexp::::modifierPPIx::Regexp::modifier
0000s0sPPIx::Regexp::::modifier_assertedPPIx::Regexp::modifier_asserted
0000s0sPPIx::Regexp::::newPPIx::Regexp::new
0000s0sPPIx::Regexp::::new_from_cachePPIx::Regexp::new_from_cache
0000s0sPPIx::Regexp::::regular_expressionPPIx::Regexp::regular_expression
0000s0sPPIx::Regexp::::replacementPPIx::Regexp::replacement
0000s0sPPIx::Regexp::::sourcePPIx::Regexp::source
0000s0sPPIx::Regexp::::typePPIx::Regexp::type
Call graph for these subroutines as a Graphviz dot language file.
Line State
ments
Time
on line
Calls Time
in subs
Code
1=head1 NAME
2
3PPIx::Regexp - Represent a regular expression of some sort
4
5=head1 SYNOPSIS
6
7 use PPIx::Regexp;
8 use PPIx::Regexp::Dumper;
9 my $re = PPIx::Regexp->new( 'qr{foo}smx' );
10 PPIx::Regexp::Dumper->new( $re )
11 ->print();
12
13=head1 INHERITANCE
14
15C<PPIx::Regexp> is a L<PPIx::Regexp::Node|PPIx::Regexp::Node>.
16
17C<PPIx::Regexp> has no descendants.
18
19=head1 DESCRIPTION
20
21The purpose of the F<PPIx-Regexp> package is to parse regular
22expressions in a manner similar to the way the L<PPI|PPI> package parses
23Perl. This class forms the root of the parse tree, playing a role
24similar to L<PPI::Document|PPI::Document>.
25
26This package shares with L<PPI|PPI> the property of being round-trip
27safe. That is,
28
29 my $expr = 's/ ( \d+ ) ( \D+ ) /$2$1/smxg';
30 my $re = PPIx::Regexp->new( $expr );
31 print $re->content() eq $expr ? "yes\n" : "no\n"
32
33should print 'yes' for any valid regular expression.
34
35Navigation is similar to that provided by L<PPI|PPI>. That is to say,
36things like C<children>, C<find_first>, C<snext_sibling> and so on all
37work pretty much the same way as in L<PPI|PPI>.
38
39The class hierarchy is also similar to L<PPI|PPI>. Except for some
40utility classes (the dumper, the lexer, and the tokenizer) all classes
41are descended from L<PPIx::Regexp::Element|PPIx::Regexp::Element>, which
42provides basic navigation. Tokens are descended from
43L<PPIx::Regexp::Token|PPIx::Regexp::Token>, which provides content. All
44containers are descended from L<PPIx::Regexp::Node|PPIx::Regexp::Node>,
45which provides for children, and all structure elements are descended
46from L<PPIx::Regexp::Structure|PPIx::Regexp::Structure>, which provides
47beginning and ending delimiters, and a type.
48
49There are two features of L<PPI|PPI> that this package does not provide
50- mutability and operator overloading. There are no plans for serious
51mutability, though something like L<PPI|PPI>'s C<prune> functionality
52might be considered. Similarly there are no plans for operator
53overloading, which appears to the author to represent a performance hit
54for little tangible gain.
55
56=head1 NOTICE
57
58The author will attempt to preserve the documented interface, but if the
59interface needs to change to correct some egregiously bad design or
60implementation decision, then it will change. Any incompatible changes
61will go through a deprecation cycle.
62
63The goal of this package is to parse well-formed regular expressions
64correctly. A secondary goal is not to blow up on ill-formed regular
65expressions. The correct identification and characterization of
66ill-formed regular expressions is B<not> a goal of this package.
67
68This policy attempts to track features in development releases as well
69as public releases. However, features added in a development release and
70then removed before the next production release B<will not> be tracked,
71and any functionality relating to such features B<will be removed>. The
72issue here is the potential re-use (with different semantics) of syntax
73that did not make it into the production release.
74
75=head1 METHODS
76
77This class provides the following public methods. Methods not documented
78here are private, and unsupported in the sense that the author reserves
79the right to change or remove them without notice.
80
81=cut
82
83package PPIx::Regexp;
84
85220µs235µs
# spent 24µs (12+12) within PPIx::Regexp::BEGIN@85 which was called: # once (12µs+12µs) by Perl::Critic::Document::BEGIN@28 at line 85
use strict;
# spent 24µs making 1 call to PPIx::Regexp::BEGIN@85 # spent 12µs making 1 call to strict::import
86223µs221µs
# spent 16µs (11+5) within PPIx::Regexp::BEGIN@86 which was called: # once (11µs+5µs) by Perl::Critic::Document::BEGIN@28 at line 86
use warnings;
# spent 16µs making 1 call to PPIx::Regexp::BEGIN@86 # spent 5µs making 1 call to warnings::import
87
88226µs28.32ms
# spent 4.17ms (7µs+4.16) within PPIx::Regexp::BEGIN@88 which was called: # once (7µs+4.16ms) by Perl::Critic::Document::BEGIN@28 at line 88
use base qw{ PPIx::Regexp::Node };
# spent 4.17ms making 1 call to PPIx::Regexp::BEGIN@88 # spent 4.16ms making 1 call to base::import
89
90298µs132.9ms
# spent 32.9ms (1.79+31.1) within PPIx::Regexp::BEGIN@90 which was called: # once (1.79ms+31.1ms) by Perl::Critic::Document::BEGIN@28 at line 90
use PPIx::Regexp::Lexer ();
# spent 32.9ms making 1 call to PPIx::Regexp::BEGIN@90
91220µs14µs
# spent 4µs within PPIx::Regexp::BEGIN@91 which was called: # once (4µs+0s) by Perl::Critic::Document::BEGIN@28 at line 91
use PPIx::Regexp::Token::Modifier (); # For its modifier manipulations.
# spent 4µs making 1 call to PPIx::Regexp::BEGIN@91
92221µs250µs
# spent 29µs (8+22) within PPIx::Regexp::BEGIN@92 which was called: # once (8µs+22µs) by Perl::Critic::Document::BEGIN@28 at line 92
use PPIx::Regexp::Util qw{ __instance };
# spent 29µs making 1 call to PPIx::Regexp::BEGIN@92 # spent 22µs making 1 call to Exporter::import
932675µs246µs
# spent 26µs (7+20) within PPIx::Regexp::BEGIN@93 which was called: # once (7µs+20µs) by Perl::Critic::Document::BEGIN@28 at line 93
use Scalar::Util qw{ refaddr };
# spent 26µs making 1 call to PPIx::Regexp::BEGIN@93 # spent 20µs making 1 call to Exporter::import
94
951600nsour $VERSION = '0.036';
96
97=head2 new
98
99 my $re = PPIx::Regexp->new('/foo/');
100
101This method instantiates a C<PPIx::Regexp> object from a string, a
102L<PPI::Token::QuoteLike::Regexp|PPI::Token::QuoteLike::Regexp>, a
103L<PPI::Token::Regexp::Match|PPI::Token::Regexp::Match>, or a
104L<PPI::Token::Regexp::Substitute|PPI::Token::Regexp::Substitute>.
105Honestly, any L<PPI::Element|PPI::Element> will do, but only the three
106Regexp classes mentioned previously are likely to do anything useful.
107
108Optionally you can pass one or more name/value pairs after the regular
109expression. The possible options are:
110
111=over
112
113=item default_modifiers array_reference
114
115This option specifies a reference to an array of default modifiers to
116apply to the regular expression being parsed. Each modifier is specified
117as a string. Any actual modifiers found supersede the defaults.
118
119When applying the defaults, C<'?'> and C<'/'> are completely ignored,
120and C<'^'> is ignored unless it occurs at the beginning of the modifier.
121The first dash (C<'-'>) causes subsequent modifiers to be negated.
122
123So, for example, if you wish to produce a C<PPIx::Regexp> object
124representing the regular expression in
125
126 use re '/smx';
127 {
128 no re '/x';
129 m/ foo /;
130 }
131
132you would (after some help from L<PPI|PPI> in finding the relevant
133statements), do something like
134
135 my $re = PPIx::Regexp->new( 'm/ foo /',
136 default_modifiers => [ '/smx', '-/x' ] );
137`
138=item encoding name
139
140This option specifies the encoding of the regular expression. This is
141passed to the tokenizer, which will C<decode> the regular expression
142string before it tokenizes it. For example:
143
144 my $re = PPIx::Regexp->new( '/foo/',
145 encoding => 'iso-8859-1',
146 );
147
148=item trace number
149
150If greater than zero, this option causes trace output from the parse.
151The author reserves the right to change or eliminate this without
152notice.
153
154=back
155
156Passing optional input other than the above is not an error, but neither
157is it supported.
158
159=cut
160
161{
162
1632600ns my $errstr;
164
165 sub new {
166 my ( $class, $content, %args ) = @_;
167 ref $class and $class = ref $class;
168
169 $errstr = undef;
170
171 my $tokenizer = PPIx::Regexp::Tokenizer->new(
172 $content, %args ) or do {
173 $errstr = PPIx::Regexp::Tokenizer->errstr();
174 return;
175 };
176
177 my $lexer = PPIx::Regexp::Lexer->new( $tokenizer, %args );
178 my @nodes = $lexer->lex();
179 my $self = $class->SUPER::_new( @nodes );
180 $self->{source} = $content;
181 $self->{failures} = $lexer->failures();
182 $self->{effective_modifiers} =
183 $tokenizer->__effective_modifiers();
184 return $self;
185 }
186
187 sub errstr {
188 return $errstr;
189 }
190
191}
192
193=head2 new_from_cache
194
195This static method wraps L</new> in a caching mechanism. Only one object
196will be generated for a given L<PPI::Element|PPI::Element>, no matter
197how many times this method is called. Calls after the first for a given
198L<PPI::Element|PPI::ELement> simply return the same C<PPIx::Regexp>
199object.
200
201When the C<PPIx::Regexp> object is returned from cache, the values of
202the optional arguments are ignored.
203
204Calls to this method with the regular expression in a string rather than
205a L<PPI::Element|PPI::Element> will not be cached.
206
207B<Caveat:> This method is provided for code like
208L<Perl::Critic|Perl::Critic> which might instantiate the same object
209multiple times. The cache will persist until L</flush_cache> is called.
210
211=head2 flush_cache
212
213 $re->flush_cache(); # Remove $re from cache
214 PPIx::Regexp->flush_cache(); # Empty the cache
215
216This method flushes the cache used by L</new_from_cache>. If called as a
217static method with no arguments, the entire cache is emptied. Otherwise
218any objects specified are removed from the cache.
219
220=cut
221
222{
223
2242100ns my %cache;
225
2261200ns our $DISABLE_CACHE; # Leave this undocumented, at least for
227 # now.
228
229 sub _cache_size {
230 return scalar keys %cache;
231 }
232
233 sub new_from_cache {
234 my ( $class, $content, %args ) = @_;
235
236 __instance( $content, 'PPI::Element' )
237 or return $class->new( $content, %args );
238
239 $DISABLE_CACHE and return $class->new( $content, %args );
240
241 my $addr = refaddr( $content );
242 exists $cache{$addr} and return $cache{$addr};
243
244 my $self = $class->new( $content, %args )
245 or return;
246
247 $cache{$addr} = $self;
248
249 return $self;
250
251 }
252
253 sub flush_cache {
254 my @args = @_;
255
256 ref $args[0] or shift @args;
257
258 if ( @args ) {
259 foreach my $obj ( @args ) {
260 if ( __instance( $obj, __PACKAGE__ ) &&
261 __instance( ( my $parent = $obj->source() ),
262 'PPI::Element' ) ) {
263 delete $cache{ refaddr( $parent ) };
264 }
265 }
266 } else {
267 %cache = ();
268 }
269 return;
270 }
271
272}
273
274sub can_be_quantified { return; }
275
276
277=head2 capture_names
278
279 foreach my $name ( $re->capture_names() ) {
280 print "Capture name '$name'\n";
281 }
282
283This convenience method returns the capture names found in the regular
284expression.
285
286This method is equivalent to
287
288 $self->regular_expression()->capture_names();
289
290except that if C<< $self->regular_expression() >> returns C<undef>
291(meaning that something went terribly wrong with the parse) this method
292will simply return.
293
294=cut
295
296sub capture_names {
297 my ( $self ) = @_;
298 my $re = $self->regular_expression() or return;
299 return $re->capture_names();
300}
301
302=head2 delimiters
303
304 print join("\t", PPIx::Regexp->new('s/foo/bar/')->delimiters());
305 # prints '// //'
306
307When called in list context, this method returns either one or two
308strings, depending on whether the parsed expression has a replacement
309string. In the case of non-bracketed substitutions, the start delimiter
310of the replacement string is considered to be the same as its finish
311delimiter, as illustrated by the above example.
312
313When called in scalar context, you get the delimiters of the regular
314expression; that is, element 0 of the array that is returned in list
315context.
316
317Optionally, you can pass an index value and the corresponding delimiters
318will be returned; index 0 represents the regular expression's
319delimiters, and index 1 represents the replacement string's delimiters,
320which may be undef. For example,
321
322 print PPIx::Regexp->new('s{foo}<bar>')-delimiters(1);
323 # prints '<>'
324
325If the object was not initialized with a valid regexp of some sort, the
326results of this method are undefined.
327
328=cut
329
330sub delimiters {
331 my ( $self, $inx ) = @_;
332
333 my @rslt;
334 foreach my $method ( qw{ regular_expression replacement } ) {
335 defined ( my $obj = $self->$method() ) or next;
336 push @rslt, $obj->delimiters();
337 }
338
339 defined $inx and return $rslt[$inx];
340 wantarray and return @rslt;
341 defined wantarray and return $rslt[0];
342 return;
343}
344
345=head2 errstr
346
347This static method returns the error string from the most recent attempt
348to instantiate a C<PPIx::Regexp>. It will be C<undef> if the most recent
349attempt succeeded.
350
351=cut
352
353# defined above, just after sub new.
354
355=head2 failures
356
357 print "There were ", $re->failures(), " parse failures\n";
358
359This method returns the number of parse failures. This is a count of the
360number of unknown tokens plus the number of unterminated structures plus
361the number of unmatched right brackets of any sort.
362
363=cut
364
365sub failures {
366 my ( $self ) = @_;
367 return $self->{failures};
368}
369
370=head2 max_capture_number
371
372 print "Highest used capture number ",
373 $re->max_capture_number(), "\n";
374
375This convenience method returns the highest capture number used by the
376regular expression. If there are no captures, the return will be 0.
377
378This method is equivalent to
379
380 $self->regular_expression()->max_capture_number();
381
382except that if C<< $self->regular_expression() >> returns C<undef>
383(meaning that something went terribly wrong with the parse) this method
384will too.
385
386=cut
387
388sub max_capture_number {
389 my ( $self ) = @_;
390 my $re = $self->regular_expression() or return;
391 return $re->max_capture_number();
392}
393
394=head2 modifier
395
396 my $re = PPIx::Regexp->new( 's/(foo)/${1}bar/smx' );
397 print $re->modifier()->content(), "\n";
398 # prints 'smx'.
399
400This method retrieves the modifier of the object. This comes from the
401end of the initializing string or object and will be a
402L<PPIx::Regexp::Token::Modifier|PPIx::Regexp::Token::Modifier>.
403
404B<Note> that this object represents the actual modifiers present on the
405regexp, and does not take into account any that may have been applied by
406default (i.e. via the C<default_modifiers> argument to C<new()>). For
407something that takes account of default modifiers, see
408L<modifier_asserted()|/modifier_asserted>, below.
409
410In the event of a parse failure, there may not be a modifier present, in
411which case nothing is returned.
412
413=cut
414
415sub modifier {
416 my ( $self ) = @_;
417 return $self->_component( 'PPIx::Regexp::Token::Modifier' );
418}
419
420=head2 modifier_asserted
421
422 my $re = PPIx::Regexp->new( '/ . /',
423 default_modifiers => [ 'smx' ] );
424 print $re->modifier_asserted( 'x' ) ? "yes\n" : "no\n";
425 # prints 'yes'.
426
427This method returns true if the given modifier is asserted for the
428regexp, whether explicitly or by the modifiers passed in the
429C<default_modifiers> argument.
430
431=cut
432
433sub modifier_asserted {
434 my ( $self, $modifier ) = @_;
435 return PPIx::Regexp::Token::Modifier::__asserts(
436 $self->{effective_modifiers},
437 $modifier,
438 );
439}
440
441# This is a kluge for both determining whether the object asserts
442# modifiers (hence the 'ductype') and determining whether the given
443# modifier is actually asserted. The signature is the invocant and the
444# modifier name, which must not be undef. The return is a boolean.
4451800ns*__ducktype_modifier_asserted = \&modifier_asserted;
446
447=head2 regular_expression
448
449 my $re = PPIx::Regexp->new( 's/(foo)/${1}bar/smx' );
450 print $re->regular_expression()->content(), "\n";
451 # prints '/(foo)/'.
452
453This method returns that portion of the object which actually represents
454a regular expression.
455
456=cut
457
458sub regular_expression {
459 my ( $self ) = @_;
460 return $self->_component( 'PPIx::Regexp::Structure::Regexp' );
461}
462
463=head2 replacement
464
465 my $re = PPIx::Regexp->new( 's/(foo)/${1}bar/smx' );
466 print $re->replacement()->content(), "\n";
467 # prints '${1}bar/'.
468
469This method returns that portion of the object which represents the
470replacement string. This will be C<undef> unless the regular expression
471actually has a replacement string. Delimiters will be included, but
472there will be no beginning delimiter unless the regular expression was
473bracketed.
474
475=cut
476
477sub replacement {
478 my ( $self ) = @_;
479 return $self->_component( 'PPIx::Regexp::Structure::Replacement' );
480}
481
482=head2 source
483
484 my $source = $re->source();
485
486This method returns the object or string that was used to instantiate
487the object.
488
489=cut
490
491sub source {
492 my ( $self ) = @_;
493 return $self->{source};
494}
495
496=head2 type
497
498 my $re = PPIx::Regexp->new( 's/(foo)/${1}bar/smx' );
499 print $re->type()->content(), "\n";
500 # prints 's'.
501
502This method retrieves the type of the object. This comes from the
503beginning of the initializing string or object, and will be a
504L<PPIx::Regexp::Token::Structure|PPIx::Regexp::Token::Structure>
505whose C<content> is one of 's',
506'm', 'qr', or ''.
507
508=cut
509
510sub type {
511 my ( $self ) = @_;
512 return $self->_component( 'PPIx::Regexp::Token::Structure' );
513}
514
515sub _component {
516 my ( $self, $class ) = @_;
517 foreach my $elem ( $self->children() ) {
518 $elem->isa( $class ) and return $elem;
519 }
520 return;
521}
522
52314µs1;
524
525__END__