| Filename | /Users/timbo/perl5/perlbrew/perls/perl-5.18.2/lib/site_perl/5.18.2/PPI/Tokenizer.pm |
| Statements | Executed 3487328 statements in 3.64s |
| Calls | P | F | Exclusive Time | Inclusive Time | Subroutine |
|---|---|---|---|---|---|
| 149609 | 1 | 1 | 1.30s | 4.58s | PPI::Tokenizer::_process_next_char |
| 26904 | 2 | 1 | 857ms | 6.13s | PPI::Tokenizer::_process_next_line |
| 94513 | 1 | 1 | 602ms | 6.79s | PPI::Tokenizer::get_token |
| 56533 | 14 | 7 | 428ms | 681ms | PPI::Tokenizer::_new_token |
| 20542 | 6 | 4 | 261ms | 305ms | PPI::Tokenizer::_previous_significant_tokens |
| 94513 | 29 | 16 | 218ms | 218ms | PPI::Tokenizer::_finalize_token |
| 27281 | 3 | 2 | 186ms | 246ms | PPI::Tokenizer::_fill_line |
| 144 | 1 | 1 | 162ms | 162ms | PPI::Tokenizer::CORE:subst (opcode) |
| 144 | 1 | 1 | 118ms | 503ms | PPI::Tokenizer::new |
| 27287 | 3 | 2 | 60.1ms | 60.1ms | PPI::Tokenizer::_get_line |
| 1866 | 1 | 1 | 16.1ms | 34.6ms | PPI::Tokenizer::_opcontext |
| 15534 | 1 | 1 | 4.15ms | 4.15ms | PPI::Tokenizer::CORE:match (opcode) |
| 144 | 1 | 1 | 1.34ms | 1.76ms | PPI::Tokenizer::_clean_eof |
| 52 | 2 | 1 | 488µs | 589µs | PPI::Tokenizer::_last_significant_token |
| 1 | 1 | 1 | 135µs | 224µs | PPI::Tokenizer::BEGIN@88 |
| 1 | 1 | 1 | 12µs | 23µs | PPI::Tokenizer::BEGIN@81 |
| 1 | 1 | 1 | 7µs | 35µs | PPI::Tokenizer::BEGIN@82 |
| 1 | 1 | 1 | 6µs | 23µs | PPI::Tokenizer::BEGIN@90 |
| 1 | 1 | 1 | 3µs | 3µs | PPI::Tokenizer::BEGIN@83 |
| 1 | 1 | 1 | 3µs | 3µs | PPI::Tokenizer::BEGIN@84 |
| 1 | 1 | 1 | 3µs | 3µs | PPI::Tokenizer::BEGIN@85 |
| 1 | 1 | 1 | 3µs | 3µs | PPI::Tokenizer::BEGIN@87 |
| 1 | 1 | 1 | 3µs | 3µs | PPI::Tokenizer::BEGIN@86 |
| 1 | 1 | 1 | 3µs | 3µs | PPI::Tokenizer::BEGIN@91 |
| 0 | 0 | 0 | 0s | 0s | PPI::Tokenizer::__ANON__[:211] |
| 0 | 0 | 0 | 0s | 0s | PPI::Tokenizer::_char |
| 0 | 0 | 0 | 0s | 0s | PPI::Tokenizer::_last_token |
| 0 | 0 | 0 | 0s | 0s | PPI::Tokenizer::all_tokens |
| 0 | 0 | 0 | 0s | 0s | PPI::Tokenizer::decrement_cursor |
| 0 | 0 | 0 | 0s | 0s | PPI::Tokenizer::increment_cursor |
| Line | Statements | Time on line | Calls | Time in subs | Code |
|---|---|---|---|---|---|
| 1 | package PPI::Tokenizer; | ||||
| 2 | |||||
| 3 | =pod | ||||
| 4 | |||||
| 5 | =head1 NAME | ||||
| 6 | |||||
| 7 | PPI::Tokenizer - The Perl Document Tokenizer | ||||
| 8 | |||||
| 9 | =head1 SYNOPSIS | ||||
| 10 | |||||
| 11 | # Create a tokenizer for a file, array or string | ||||
| 12 | $Tokenizer = PPI::Tokenizer->new( 'filename.pl' ); | ||||
| 13 | $Tokenizer = PPI::Tokenizer->new( \@lines ); | ||||
| 14 | $Tokenizer = PPI::Tokenizer->new( \$source ); | ||||
| 15 | |||||
| 16 | # Return all the tokens for the document | ||||
| 17 | my $tokens = $Tokenizer->all_tokens; | ||||
| 18 | |||||
| 19 | # Or we can use it as an iterator | ||||
| 20 | while ( my $Token = $Tokenizer->get_token ) { | ||||
| 21 | print "Found token '$Token'\n"; | ||||
| 22 | } | ||||
| 23 | |||||
| 24 | # If we REALLY need to manually nudge the cursor, you | ||||
| 25 | # can do that too (the lexer needs this ability to do rollbacks) | ||||
| 26 | $is_incremented = $Tokenizer->increment_cursor; | ||||
| 27 | $is_decremented = $Tokenizer->decrement_cursor; | ||||
| 28 | |||||
| 29 | =head1 DESCRIPTION | ||||
| 30 | |||||
| 31 | PPI::Tokenizer is the class that provides Tokenizer objects for use in | ||||
| 32 | breaking strings of Perl source code into Tokens. | ||||
| 33 | |||||
| 34 | By the time you are reading this, you probably need to know a little | ||||
| 35 | about the difference between how perl parses Perl "code" and how PPI | ||||
| 36 | parses Perl "documents". | ||||
| 37 | |||||
| 38 | "perl" itself (the interpreter) uses a heavily modified lex specification | ||||
| 39 | to specify its parsing logic, maintains several types of state as it | ||||
| 40 | goes, and incrementally tokenizes, lexes AND EXECUTES at the same time. | ||||
| 41 | |||||
| 42 | In fact, it is provably impossible to use perl's parsing method without | ||||
| 43 | simultaneously executing code. A formal mathematical proof has been | ||||
| 44 | published demonstrating this impossibility. | ||||
| 45 | |||||
| 46 | This is where the truism "Only perl can parse Perl" comes from. | ||||
| 47 | |||||
| 48 | PPI uses a completely different approach by abandoning the (impossible) | ||||
| 49 | ability to parse Perl the same way that the interpreter does, and instead | ||||
| 50 | parsing the source as a document, using a document structure independently | ||||
| 51 | derived from the Perl documentation and approximating the perl interpreter | ||||
| 52 | interpretation as closely as possible. | ||||
| 53 | |||||
| 54 | It was touch and go for a long time whether we could get it close enough, | ||||
| 55 | but in the end it turned out that it could be done. | ||||
| 56 | |||||
| 57 | In this approach, the tokenizer C<PPI::Tokenizer> is implemented separately | ||||
| 58 | from the lexer L<PPI::Lexer>. | ||||
| 59 | |||||
| 60 | The job of C<PPI::Tokenizer> is to take pure source as a string and break it | ||||
| 61 | up into a stream/set of tokens, and contains most of the "black magic" used | ||||
| 62 | in PPI. By comparison, the lexer implements a relatively straightforward | ||||
| 63 | tree structure, and has an implementation that is uncomplicated (compared | ||||
| 64 | to the insanity in the tokenizer at least). | ||||
| 65 | |||||
| 66 | The Tokenizer uses an immense amount of heuristics, guessing and cruft, | ||||
| 67 | supported by a very B<VERY> flexible internal API, but fortunately it was | ||||
| 68 | possible to largely encapsulate the black magic, so there is not a lot that | ||||
| 69 | gets exposed to people using the C<PPI::Tokenizer> itself. | ||||
| 70 | |||||
| 71 | =head1 METHODS | ||||
| 72 | |||||
| 73 | Despite the internal complexity, the Tokenizer itself exposes only a | ||||
| 74 | relatively small number of methods, with most of the work done | ||||
| 75 | in private methods. | ||||
| 76 | |||||
| 77 | =cut | ||||
| 78 | |||||
| 79 | # Make sure everything we need is loaded so | ||||
| 80 | # we don't have to go and load all of PPI. | ||||
| 81 | 2 | 21µs | 2 | 34µs | # spent 23µs (12+11) within PPI::Tokenizer::BEGIN@81, called once (12µs+11µs) by PPI::BEGIN@28 at line 81; 11µs in 1 call to strict::import |
| 82 | 2 | 19µs | 2 | 63µs | # spent 35µs (7+28) within PPI::Tokenizer::BEGIN@82, called once (7µs+28µs) by PPI::BEGIN@28 at line 82; 28µs in 1 call to Exporter::import |
| 83 | 2 | 18µs | 1 | 3µs | # spent 3µs within PPI::Tokenizer::BEGIN@83, called once (3µs+0s) by PPI::BEGIN@28 at line 83 |
| 84 | 2 | 15µs | 1 | 3µs | # spent 3µs within PPI::Tokenizer::BEGIN@84, called once (3µs+0s) by PPI::BEGIN@28 at line 84 |
| 85 | 2 | 14µs | 1 | 3µs | # spent 3µs within PPI::Tokenizer::BEGIN@85, called once (3µs+0s) by PPI::BEGIN@28 at line 85 |
| 86 | 2 | 20µs | 1 | 3µs | # spent 3µs within PPI::Tokenizer::BEGIN@86, called once (3µs+0s) by PPI::BEGIN@28 at line 86 |
| 87 | 2 | 15µs | 1 | 3µs | # spent 3µs within PPI::Tokenizer::BEGIN@87, called once (3µs+0s) by PPI::BEGIN@28 at line 87 |
| 88 | 2 | 79µs | 1 | 224µs | # spent 224µs (135+89) within PPI::Tokenizer::BEGIN@88, called once (135µs+89µs) by PPI::BEGIN@28 at line 88 |
| 89 | |||||
| 90 | 2 | 22µs | 2 | 39µs | # spent 23µs (6+16) within PPI::Tokenizer::BEGIN@90, called once (6µs+16µs) by PPI::BEGIN@28 at line 90; 16µs in 1 call to vars::import |
| 91 | | | | | BEGIN { # spent 3µs within PPI::Tokenizer::BEGIN@91, called once (3µs+0s) by PPI::BEGIN@28 at line 93 |
| 92 | 1 | 4µs | $VERSION = '1.215'; | ||
| 93 | 1 | 1.57ms | 1 | 3µs | } # spent 3µs making 1 call to PPI::Tokenizer::BEGIN@91 |
| 94 | |||||
| - - | |||||
| 99 | ##################################################################### | ||||
| 100 | # Creation and Initialization | ||||
| 101 | |||||
| 102 | =pod | ||||
| 103 | |||||
| 104 | =head2 new $file | \@lines | \$source | ||||
| 105 | |||||
| 106 | The main C<new> constructor creates a new Tokenizer object. These | ||||
| 107 | objects have no configuration parameters, and can only be used once, | ||||
| 108 | to tokenize a single perl source file. | ||||
| 109 | |||||
| 110 | It takes as argument either a normal scalar containing source code, | ||||
| 111 | a reference to a scalar containing source code, or a reference to an | ||||
| 112 | ARRAY containing newline-terminated lines of source code. | ||||
| 113 | |||||
| 114 | Returns a new C<PPI::Tokenizer> object on success, or throws a | ||||
| 115 | L<PPI::Exception> exception on error. | ||||
| 116 | |||||
| 117 | =cut | ||||
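The constructor contract above can be exercised directly; a minimal sketch, assuming PPI is installed (the source strings and variable names are illustrative):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use PPI::Tokenizer;

# Construct from a reference to a string of source code.
my $source      = "my \$x = 1;\nprint \$x;\n";
my $from_string = PPI::Tokenizer->new( \$source );

# Construct from a reference to an array of newline-terminated lines.
my $from_lines  = PPI::Tokenizer->new( [ "my \$y = 2;\n", "print \$y;\n" ] );

# Errors are thrown as PPI::Exception objects, so trap them with eval.
my $bad = eval { PPI::Tokenizer->new(undef) };
warn "construction failed: $@" if $@;
```

Note that each object tokenizes a single document and cannot be reused.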
| 118 | |||||
| 119 | | | | | sub new { # spent 503ms (118+384) within PPI::Tokenizer::new, called 144 times (118ms+384ms) by PPI::Lexer::lex_file at line 159 of PPI/Lexer.pm, avg 3.49ms/call |
| 120 | 144 | 119µs | my $class = ref($_[0]) || $_[0]; | ||
| 121 | |||||
| 122 | # Create the empty tokenizer struct | ||||
| 123 | 144 | 1.61ms | my $self = bless { | ||
| 124 | # Source code | ||||
| 125 | source => undef, | ||||
| 126 | source_bytes => undef, | ||||
| 127 | |||||
| 128 | # Line buffer | ||||
| 129 | line => undef, | ||||
| 130 | line_length => undef, | ||||
| 131 | line_cursor => undef, | ||||
| 132 | line_count => 0, | ||||
| 133 | |||||
| 134 | # Parse state | ||||
| 135 | token => undef, | ||||
| 136 | class => 'PPI::Token::BOM', | ||||
| 137 | zone => 'PPI::Token::Whitespace', | ||||
| 138 | |||||
| 139 | # Output token buffer | ||||
| 140 | tokens => [], | ||||
| 141 | token_cursor => 0, | ||||
| 142 | token_eof => 0, | ||||
| 143 | |||||
| 144 | # Perl 6 blocks | ||||
| 145 | perl6 => [], | ||||
| 146 | }, $class; | ||||
| 147 | |||||
| 148 | 144 | 208µs | if ( ! defined $_[1] ) { | ||
| 149 | # We weren't given anything | ||||
| 150 | PPI::Exception->throw("No source provided to Tokenizer"); | ||||
| 151 | |||||
| 152 | } elsif ( ! ref $_[1] ) { | ||||
| 153 | 144 | 566µs | 144 | 187ms | my $source = PPI::Util::_slurp($_[1]); # spent 187ms making 144 calls to PPI::Util::_slurp, avg 1.30ms/call |
| 154 | 144 | 1.20ms | if ( ref $source ) { | ||
| 155 | # Content returned by reference | ||||
| 156 | $self->{source} = $$source; | ||||
| 157 | } else { | ||||
| 158 | # Errors returned as a string | ||||
| 159 | return( $source ); | ||||
| 160 | } | ||||
| 161 | |||||
| 162 | } elsif ( _SCALAR0($_[1]) ) { | ||||
| 163 | $self->{source} = ${$_[1]}; | ||||
| 164 | |||||
| 165 | } elsif ( _ARRAY0($_[1]) ) { | ||||
| 166 | $self->{source} = join '', map { "$_\n" } @{$_[1]}; | ||||
| 167 | |||||
| 168 | } else { | ||||
| 169 | # We don't support whatever this is | ||||
| 170 | PPI::Exception->throw(ref($_[1]) . " is not supported as a source provider"); | ||||
| 171 | } | ||||
| 172 | |||||
| 173 | # We can't handle a null string | ||||
| 174 | 144 | 289µs | $self->{source_bytes} = length $self->{source}; | ||
| 175 | 144 | 3.62ms | if ( $self->{source_bytes} > 1048576 ) { | ||
| 176 | # Dammit! It's ALWAYS the "Perl" modules larger than a | ||||
| 177 | # meg that seems to blow up the Tokenizer/Lexer. | ||||
| 178 | # Nobody actually writes real programs larger than a meg | ||||
| 179 | # Perl::Tidy (the largest) is only 800k. | ||||
| 180 | # It is always these idiots with massive Data::Dumper | ||||
| 181 | # structs or huge RecDescent parsers. | ||||
| 182 | PPI::Exception::ParserRejection->throw("File is too large"); | ||||
| 183 | |||||
| 184 | } elsif ( $self->{source_bytes} ) { | ||||
| 185 | # Split on local newlines | ||||
| 186 | 144 | 163ms | 144 | 162ms | $self->{source} =~ s/(?:\015{1,2}\012|\015|\012)/\n/g; # spent 162ms making 144 calls to PPI::Tokenizer::CORE:subst, avg 1.12ms/call |
| 187 | 144 | 107ms | $self->{source} = [ split /(?<=\n)/, $self->{source} ]; | ||
| 188 | |||||
| 189 | } else { | ||||
| 190 | $self->{source} = [ ]; | ||||
| 191 | } | ||||
| 192 | |||||
| 193 | ### EVIL | ||||
| 194 | # I'm explaining this earlier than I should so you can understand | ||||
| 195 | # why I'm about to do something that looks very strange. There's | ||||
| 196 | # a problem with the Tokenizer, in that tokens tend to change | ||||
| 197 | # classes as each letter is added, but they don't get allocated | ||||
| 198 | # their definite final class until the "end" of the token, the | ||||
| 199 | # detection of which occurs in about a hundred different places, | ||||
| 200 | # all through various crufty code (that triples the speed). | ||||
| 201 | # | ||||
| 202 | # However, in general, this does not apply to tokens in which a | ||||
| 203 | # whitespace character is valid, such as comments, whitespace and | ||||
| 204 | # big strings. | ||||
| 205 | # | ||||
| 206 | # So what we do is add a space to the end of the source. This | ||||
| 207 | # triggers normal "end of token" functionality for all cases. Then, | ||||
| 208 | # once the tokenizer hits end of file, it examines the last token to | ||||
| 209 | # manually either remove the ' ' token, or chop it off the end of | ||||
| 210 | # a longer one in which the space would be valid. | ||||
| 211 | 15678 | 34.2ms | 15678 | 39.0ms | if ( List::MoreUtils::any { /^__(?:DATA|END)__\s*$/ } @{$self->{source}} ) { # spent 34.9ms in 144 calls to List::MoreUtils::any (avg 242µs/call); 4.15ms in 15534 calls to PPI::Tokenizer::CORE:match (avg 267ns/call) |
| 212 | $self->{source_eof_chop} = ''; | ||||
| 213 | } elsif ( ! defined $self->{source}->[0] ) { | ||||
| 214 | $self->{source_eof_chop} = ''; | ||||
| 215 | } elsif ( $self->{source}->[-1] =~ /\s$/ ) { | ||||
| 216 | $self->{source_eof_chop} = ''; | ||||
| 217 | } else { | ||||
| 218 | $self->{source_eof_chop} = 1; | ||||
| 219 | $self->{source}->[-1] .= ' '; | ||||
| 220 | } | ||||
| 221 | |||||
| 222 | 144 | 765µs | $self; | ||
| 223 | } | ||||
| 224 | |||||
| - - | |||||
| 229 | ##################################################################### | ||||
| 230 | # Main Public Methods | ||||
| 231 | |||||
| 232 | =pod | ||||
| 233 | |||||
| 234 | =head2 get_token | ||||
| 235 | |||||
| 236 | When using the PPI::Tokenizer object as an iterator, the C<get_token> | ||||
| 237 | method is the primary method that is used. It increments the cursor | ||||
| 238 | and returns the next Token in the output array. | ||||
| 239 | |||||
| 240 | The actual parsing of the file is done only as-needed, and a line at | ||||
| 241 | a time. When C<get_token> hits the end of the token array, it will | ||||
| 242 | cause the parser to pull in the next line and parse it, continuing | ||||
| 243 | as needed until there are more tokens on the output array that | ||||
| 244 | get_token can then return. | ||||
| 245 | |||||
| 246 | This means that a number of Tokenizer objects can be created, and | ||||
| 247 | they won't consume significant CPU until you actually begin to pull | ||||
| 248 | tokens from them. | ||||
| 249 | |||||
| 250 | Returns a L<PPI::Token> object on success, C<0> if the Tokenizer has | ||||
| 251 | reached the end of the file, or C<undef> on error. | ||||
| 252 | |||||
| 253 | =cut | ||||
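Since C<get_token> distinguishes C<0> (end of file) from C<undef> (error), a careful iteration loop checks both; a minimal sketch, assuming PPI is installed:

```perl
use strict;
use warnings;
use PPI::Tokenizer;

my $source    = "my \$x = 42;\n";
my $tokenizer = PPI::Tokenizer->new( \$source );

my $token;
while ( $token = $tokenizer->get_token ) {
    printf "%-25s %s\n", ref($token), $token->content;
}

# After the loop: 0 is normal end-of-file, undef signals an error.
die "tokenizer error" unless defined $token;
```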
| 254 | |||||
| 255 | | | | | sub get_token { # spent 6.79s (602ms+6.19s) within PPI::Tokenizer::get_token, called 94513 times (602ms+6.19s) by PPI::Lexer::_get_token at line 1413 of PPI/Lexer.pm, avg 72µs/call |
| 256 | 94513 | 17.5ms | my $self = shift; | ||
| 257 | |||||
| 258 | # Shortcut for EOF | ||||
| 259 | 94513 | 15.6ms | if ( $self->{token_eof} | ||
| 260 | and $self->{token_cursor} > scalar @{$self->{tokens}} | ||||
| 261 | ) { | ||||
| 262 | return 0; | ||||
| 263 | } | ||||
| 264 | |||||
| 265 | # Return the next token if we can | ||||
| 266 | 94513 | 298ms | 82384 | 48.3ms | if ( my $token = $self->{tokens}->[ $self->{token_cursor} ] ) { # spent 48.3ms making 82384 calls to PPI::Util::TRUE, avg 587ns/call |
| 267 | 82384 | 11.9ms | $self->{token_cursor}++; | ||
| 268 | 82384 | 244ms | return $token; | ||
| 269 | } | ||||
| 270 | |||||
| 271 | 12129 | 268µs | my $line_rv; | ||
| 272 | |||||
| 273 | # Catch exceptions and return undef, so that we | ||||
| 274 | # can start to convert code to exception-based code. | ||||
| 275 | 12129 | 4.52ms | my $rv = eval { | ||
| 276 | # No token, we need to get some more | ||||
| 277 | 12129 | 14.1ms | 12129 | 4.32s | while ( $line_rv = $self->_process_next_line ) { # spent 4.32s making 12129 calls to PPI::Tokenizer::_process_next_line, avg 356µs/call |
| 278 | # If there is something in the buffer, return it | ||||
| 279 | # The defined() prevents a ton of calls to PPI::Util::TRUE | ||||
| 280 | 26616 | 31.1ms | 14775 | 1.81s | if ( defined( my $token = $self->{tokens}->[ $self->{token_cursor} ] ) ) { # spent 1.81s making 14775 calls to PPI::Tokenizer::_process_next_line, avg 123µs/call |
| 281 | 11841 | 1.48ms | $self->{token_cursor}++; | ||
| 282 | 11841 | 5.62ms | return $token; | ||
| 283 | } | ||||
| 284 | } | ||||
| 285 | 288 | 56µs | return undef; | ||
| 286 | }; | ||||
| 287 | 12129 | 80.8ms | 11841 | 8.35ms | if ( $@ ) { # spent 8.35ms making 11841 calls to PPI::Util::TRUE, avg 705ns/call |
| 288 | if ( _INSTANCE($@, 'PPI::Exception') ) { | ||||
| 289 | $@->throw; | ||||
| 290 | } else { | ||||
| 291 | my $errstr = $@; | ||||
| 292 | $errstr =~ s/^(.*) at line .+$/$1/; | ||||
| 293 | PPI::Exception->throw( $errstr ); | ||||
| 294 | } | ||||
| 295 | } elsif ( $rv ) { | ||||
| 296 | return $rv; | ||||
| 297 | } | ||||
| 298 | |||||
| 299 | 288 | 63µs | if ( defined $line_rv ) { | ||
| 300 | # End of file, but we can still return things from the buffer | ||||
| 301 | 288 | 181µs | if ( my $token = $self->{tokens}->[ $self->{token_cursor} ] ) { | ||
| 302 | $self->{token_cursor}++; | ||||
| 303 | return $token; | ||||
| 304 | } | ||||
| 305 | |||||
| 306 | # Set our token end of file flag | ||||
| 307 | 288 | 82µs | $self->{token_eof} = 1; | ||
| 308 | 288 | 489µs | return 0; | ||
| 309 | } | ||||
| 310 | |||||
| 311 | # Error, pass it up to our caller | ||||
| 312 | undef; | ||||
| 313 | } | ||||
| 314 | |||||
| 315 | =pod | ||||
| 316 | |||||
| 317 | =head2 all_tokens | ||||
| 318 | |||||
| 319 | When not being used as an iterator, the C<all_tokens> method tells | ||||
| 320 | the Tokenizer to parse the entire file and return all of the tokens | ||||
| 321 | in a single ARRAY reference. | ||||
| 322 | |||||
| 323 | It should be noted that C<all_tokens> does B<NOT> interfere with the | ||||
| 324 | use of the Tokenizer object as an iterator (does not modify the token | ||||
| 325 | cursor) and use of the two different mechanisms can be mixed safely. | ||||
| 326 | |||||
| 327 | Returns a reference to an ARRAY of L<PPI::Token> objects on success | ||||
| 328 | or throws an exception on error. | ||||
| 329 | |||||
| 330 | =cut | ||||
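In the batch style, a sketch of pulling every token at once and filtering to the significant ones (assumes PPI is installed; C<significant> is the standard L<PPI::Element> predicate):

```perl
use strict;
use warnings;
use PPI::Tokenizer;

my $source    = "my \$total = 1 + 2;  # add\n";
my $tokenizer = PPI::Tokenizer->new( \$source );

my $tokens = $tokenizer->all_tokens;    # ARRAY ref of PPI::Token objects

# Skip whitespace and comments; keep the tokens that carry meaning.
my @significant = grep { $_->significant } @$tokens;
print scalar(@significant), " significant of ", scalar(@$tokens), " total\n";
```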
| 331 | |||||
| 332 | sub all_tokens { | ||||
| 333 | my $self = shift; | ||||
| 334 | |||||
| 335 | # Catch exceptions and return undef, so that we | ||||
| 336 | # can start to convert code to exception-based code. | ||||
| 337 | eval { | ||||
| 338 | # Process lines until we get EOF | ||||
| 339 | unless ( $self->{token_eof} ) { | ||||
| 340 | my $rv; | ||||
| 341 | while ( $rv = $self->_process_next_line ) {} | ||||
| 342 | unless ( defined $rv ) { | ||||
| 343 | PPI::Exception->throw("Error while processing source"); | ||||
| 344 | } | ||||
| 345 | |||||
| 346 | # Clean up the end of the tokenizer | ||||
| 347 | $self->_clean_eof; | ||||
| 348 | } | ||||
| 349 | }; | ||||
| 350 | if ( $@ ) { | ||||
| 351 | my $errstr = $@; | ||||
| 352 | $errstr =~ s/^(.*) at line .+$/$1/; | ||||
| 353 | PPI::Exception->throw( $errstr ); | ||||
| 354 | } | ||||
| 355 | |||||
| 356 | # End of file, return a copy of the token array. | ||||
| 357 | return [ @{$self->{tokens}} ]; | ||||
| 358 | } | ||||
| 359 | |||||
| 360 | =pod | ||||
| 361 | |||||
| 362 | =head2 increment_cursor | ||||
| 363 | |||||
| 364 | Although exposed as a public method, C<increment_cursor> is implemented | ||||
| 365 | for expert use only, when writing lexers or other components that work | ||||
| 366 | directly on token streams. | ||||
| 367 | |||||
| 368 | It manually increments the token cursor forward through the file, in effect | ||||
| 369 | "skipping" the next token. | ||||
| 370 | |||||
| 371 | Returns true if the cursor is incremented, C<0> if already at the end of | ||||
| 372 | the file, or C<undef> on error. | ||||
| 373 | |||||
| 374 | =cut | ||||
| 375 | |||||
| 376 | sub increment_cursor { | ||||
| 377 | # Do this via the get_token method, which makes sure there | ||||
| 378 | # is actually a token there to move to. | ||||
| 379 | $_[0]->get_token and 1; | ||||
| 380 | } | ||||
| 381 | |||||
| 382 | =pod | ||||
| 383 | |||||
| 384 | =head2 decrement_cursor | ||||
| 385 | |||||
| 386 | Although exposed as a public method, C<decrement_cursor> is implemented | ||||
| 387 | for expert use only, when writing lexers or other components that work | ||||
| 388 | directly on token streams. | ||||
| 389 | |||||
| 390 | It manually decrements the token cursor backwards through the file, in | ||||
| 391 | effect "rolling back" the token stream. That is its primary purpose: | ||||
| 392 | it exists for components that consume the token stream and need to | ||||
| 393 | implement some sort of "roll back" feature in their use of | ||||
| 394 | the token stream. | ||||
| 395 | |||||
| 396 | Returns true if the cursor is decremented, C<0> if already at the | ||||
| 397 | beginning of the file, or C<undef> on error. | ||||
| 398 | |||||
| 399 | =cut | ||||
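Together the two cursor methods support the lexer-style rollback mentioned in the SYNOPSIS; a minimal sketch, assuming PPI is installed:

```perl
use strict;
use warnings;
use PPI::Tokenizer;

my $source    = "my \$x;\n";
my $tokenizer = PPI::Tokenizer->new( \$source );

# Speculatively consume one token...
my $first = $tokenizer->get_token;

# ...decide we were not ready for it, and roll the stream back.
$tokenizer->decrement_cursor;

# The next get_token returns the very same token object again.
my $again = $tokenizer->get_token;
```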
| 400 | |||||
| 401 | sub decrement_cursor { | ||||
| 402 | my $self = shift; | ||||
| 403 | |||||
| 404 | # Check for the beginning of the file | ||||
| 405 | return 0 unless $self->{token_cursor}; | ||||
| 406 | |||||
| 407 | # Decrement the token cursor | ||||
| 408 | $self->{token_eof} = 0; | ||||
| 409 | --$self->{token_cursor}; | ||||
| 410 | } | ||||
| 411 | |||||
| - - | |||||
| 416 | ##################################################################### | ||||
| 417 | # Working With Source | ||||
| 418 | |||||
| 419 | # Fetches the next line from the input line buffer | ||||
| 420 | # Returns undef at EOF. | ||||
| 421 | | | | | sub _get_line { # spent 60.1ms within PPI::Tokenizer::_get_line, called 27287 times, avg 2µs/call: 27281 times by _fill_line at line 443; 6 times by PPI::Token::HereDoc::__TOKENIZER__on_char at lines 211 and 222 of PPI/Token/HereDoc.pm |
| 422 | 27287 | 3.41ms | my $self = shift; | ||
| 423 | 27287 | 6.10ms | return undef unless $self->{source}; # EOF hit previously | ||
| 424 | |||||
| 425 | # Pull off the next line | ||||
| 426 | 27143 | 15.3ms | my $line = shift @{$self->{source}}; | ||
| 427 | |||||
| 428 | # Flag EOF if we hit it | ||||
| 429 | 27143 | 3.09ms | $self->{source} = undef unless defined $line; | ||
| 430 | |||||
| 431 | # Return the line (or EOF flag) | ||||
| 432 | 27143 | 113ms | return $line; # string or undef | ||
| 433 | } | ||||
| 434 | |||||
| 435 | # Fetches the next line, ready to process | ||||
| 436 | # Returns 1 on success | ||||
| 437 | # Returns 0 on EOF | ||||
| 438 | | | | | sub _fill_line { # spent 246ms (186+60.1) within PPI::Tokenizer::_fill_line, called 27281 times, avg 9µs/call: 26904 times by _process_next_line at line 490; 372 times by _QuoteEngine::_scan_for_brace_character at line 183; 5 times by _QuoteEngine::_scan_for_unescaped_character at line 137 of PPI/Token/_QuoteEngine.pm |
| 439 | 27281 | 3.17ms | my $self = shift; | ||
| 440 | 27281 | 3.02ms | my $inscan = shift; | ||
| 441 | |||||
| 442 | # Get the next line | ||||
| 443 | 27281 | 27.1ms | 27281 | 60.1ms | my $line = $self->_get_line; # spent 60.1ms making 27281 calls to PPI::Tokenizer::_get_line, avg 2µs/call |
| 444 | 27281 | 2.96ms | unless ( defined $line ) { | ||
| 445 | # End of file | ||||
| 446 | 288 | 32µs | unless ( $inscan ) { | ||
| 447 | 288 | 199µs | delete $self->{line}; | ||
| 448 | 288 | 52µs | delete $self->{line_cursor}; | ||
| 449 | 288 | 46µs | delete $self->{line_length}; | ||
| 450 | 288 | 529µs | return 0; | ||
| 451 | } | ||||
| 452 | |||||
| 453 | # In the scan version, just set the cursor to the end | ||||
| 454 | # of the line, and the rest should just cascade out. | ||||
| 455 | $self->{line_cursor} = $self->{line_length}; | ||||
| 456 | return 0; | ||||
| 457 | } | ||||
| 458 | |||||
| 459 | # Populate the appropriate variables | ||||
| 460 | 26993 | 6.62ms | $self->{line} = $line; | ||
| 461 | 26993 | 4.61ms | $self->{line_cursor} = -1; | ||
| 462 | 26993 | 6.80ms | $self->{line_length} = length $line; | ||
| 463 | 26993 | 3.62ms | $self->{line_count}++; | ||
| 464 | |||||
| 465 | 26993 | 68.3ms | 1; | ||
| 466 | } | ||||
| 467 | |||||
| 468 | # Get the current character | ||||
| 469 | sub _char { | ||||
| 470 | my $self = shift; | ||||
| 471 | substr( $self->{line}, $self->{line_cursor}, 1 ); | ||||
| 472 | } | ||||
| 473 | |||||
| - - | |||||
| 478 | #################################################################### | ||||
| 479 | # Per line processing methods | ||||
| 480 | |||||
| 481 | # Processes the next line | ||||
| 482 | # Returns 1 on success completion | ||||
| 483 | # Returns 0 if EOF | ||||
| 484 | # Returns undef on error | ||||
| 485 | sub _process_next_line { | ||||
| 486 | 26904 | 3.78ms | my $self = shift; | ||
| 487 | |||||
| 488 | # Fill the line buffer | ||||
| 489 | 26904 | 903µs | my $rv; | ||
| 490 | 26904 | 23.3ms | 26904 | 243ms | unless ( $rv = $self->_fill_line ) { # spent 243ms making 26904 calls to PPI::Tokenizer::_fill_line, avg 9µs/call |
| 491 | 288 | 38µs | return undef unless defined $rv; | ||
| 492 | |||||
| 493 | # End of file, finalize last token | ||||
| 494 | 288 | 275µs | 288 | 397µs | $self->_finalize_token; # spent 397µs making 288 calls to PPI::Tokenizer::_finalize_token, avg 1µs/call |
| 495 | 288 | 450µs | return 0; | ||
| 496 | } | ||||
| 497 | |||||
| 498 | # Run the __TOKENIZER__on_line_start | ||||
| 499 | 26616 | 39.3ms | 26616 | 354ms | $rv = $self->{class}->__TOKENIZER__on_line_start( $self ); # per-class __TOKENIZER__on_line_start cost: 269ms/14943 calls Whitespace, 65.6ms/9695 Pod, 14.1ms/1834 End, 4.66ms/144 BOM |
| 500 | 26616 | 3.26ms | unless ( $rv ) { | ||
| 501 | # If there are no more source lines, then clean up | ||||
| 502 | 16923 | 9.78ms | 144 | 1.76ms | if ( ref $self->{source} eq 'ARRAY' and ! @{$self->{source}} ) { # spent 1.76ms making 144 calls to PPI::Tokenizer::_clean_eof, avg 12µs/call |
| 503 | $self->_clean_eof; | ||||
| 504 | } | ||||
| 505 | |||||
| 506 | # Defined but false means next line | ||||
| 507 | 16923 | 66.4ms | return 1 if defined $rv; | ||
| 508 | PPI::Exception->throw("Error at line $self->{line_count}"); | ||||
| 509 | } | ||||
| 510 | |||||
| 511 | # If we can't deal with the entire line, process char by char | ||||
| 512 | 9693 | 203ms | 149609 | 4.58s | while ( $rv = $self->_process_next_char ) {} # spent 4.58s making 149609 calls to PPI::Tokenizer::_process_next_char, avg 31µs/call |
| 513 | 9693 | 1.15ms | unless ( defined $rv ) { | ||
| 514 | PPI::Exception->throw("Error at line $self->{line_count}, character $self->{line_cursor}"); | ||||
| 515 | } | ||||
| 516 | |||||
| 517 | # Trigger any action that needs to happen at the end of a line | ||||
| 518 | 9693 | 13.4ms | 9693 | 94.6ms | $self->{class}->__TOKENIZER__on_line_end( $self ); # 94.4ms in 9549 calls to PPI::Token::Whitespace::__TOKENIZER__on_line_end; 224µs in 144 calls to PPI::Token::__TOKENIZER__on_line_end |
| 519 | |||||
| 520 | # If there are no more source lines, then clean up | ||||
| 521 | 9693 | 7.24ms | unless ( ref($self->{source}) eq 'ARRAY' and @{$self->{source}} ) { | ||
| 522 | return $self->_clean_eof; | ||||
| 523 | } | ||||
| 524 | |||||
| 525 | 9693 | 37.6ms | return 1; | ||
| 526 | } | ||||
| 527 | |||||
| - - | |||||
| 532 | ##################################################################### | ||||
| 533 | # Per-character processing methods | ||||
| 534 | |||||
| 535 | # Process on a per-character basis. | ||||
| 536 | # Note that due to the high number of times this gets | ||||
| 537 | # called, it has been fairly heavily in-lined, so the code | ||||
| 538 | # might look a bit ugly and duplicated. | ||||
| 539 | | | | | sub _process_next_char { # spent 4.58s (1.30+3.28) within PPI::Tokenizer::_process_next_char, called 149609 times (1.30s+3.28s) by _process_next_line at line 512, avg 31µs/call |
| 540 | 149609 | 24.1ms | my $self = shift; | ||
| 541 | |||||
| 542 | ### FIXME - This checks for a screwed up condition that triggers | ||||
| 543 | ### several warnings, amongst other things. | ||||
| 544 | 149609 | 48.5ms | if ( ! defined $self->{line_cursor} or ! defined $self->{line_length} ) { | ||
| 545 | # $DB::single = 1; | ||||
| 546 | return undef; | ||||
| 547 | } | ||||
| 548 | |||||
| 549 | # Increment the counter and check for end of line | ||||
| 550 | 149609 | 57.7ms | return 0 if ++$self->{line_cursor} >= $self->{line_length}; | ||
| 551 | |||||
| 552 | # Pass control to the token class | ||||
| 553 | 139916 | 1.69ms | my $result; | ||
| 554 | 139916 | 221ms | 139916 | 2.94s | unless ( $result = $self->{class}->__TOKENIZER__on_char( $self ) ) { # per-class __TOKENIZER__on_char cost: 1.87s/106218 calls Whitespace, 362ms/7754 Symbol, 299ms/10634 Operator, 201ms/8180 Unknown, 90.9ms/1688 _QuoteEngine, 69.1ms/3157 Structure, 38.4ms/1170 Number, 13.3ms/1018 Number::Float, 1.61ms/34 Magic, 654µs/61 Cast, 69µs/2 DashedWord |
| 555 | # undef is error. 0 is "Did stuff ourself, you don't have to do anything" | ||||
| 556 | return defined $result ? 1 : undef; | ||||
| 557 | } | ||||
| 558 | |||||
| 559 | # We will need the value of the current character | ||||
| 560 | 123420 | 54.3ms | my $char = substr( $self->{line}, $self->{line_cursor}, 1 ); | ||
| 561 | 123420 | 15.8ms | if ( $result eq '1' ) { | ||
| 562 | # If __TOKENIZER__on_char returns 1, it is signaling that it thinks that | ||||
| 563 | # the character is part of it. | ||||
| 564 | |||||
| 565 | # Add the character | ||||
| 566 | 12474 | 6.66ms | if ( defined $self->{token} ) { | ||
| 567 | $self->{token}->{content} .= $char; | ||||
| 568 | } else { | ||||
| 569 | defined($self->{token} = $self->{class}->new($char)) or return undef; | ||||
| 570 | } | ||||
| 571 | |||||
| 572 | 12474 | 37.1ms | return 1; | ||
| 573 | } | ||||
| 574 | |||||
| 575 | # We have been provided with the name of a class | ||||
| 576 | 110946 | 85.8ms | 21222 | 254ms | if ( $self->{class} ne "PPI::Token::$result" ) { # spent 254ms making 21222 calls to PPI::Tokenizer::_new_token, avg 12µs/call |
| 577 | # New class | ||||
| 578 | $self->_new_token( $result, $char ); | ||||
| 579 | } elsif ( defined $self->{token} ) { | ||||
| 580 | # Same class as current | ||||
| 581 | $self->{token}->{content} .= $char; | ||||
| 582 | } else { | ||||
| 583 | # Same class, but no current | ||||
| 584 | 37692 | 61.1ms | 37692 | 85.7ms | defined($self->{token} = $self->{class}->new($char)) or return undef; # spent 85.7ms making 37692 calls to PPI::Token::new, avg 2µs/call |
| 585 | } | ||||
| 586 | |||||
| 587 | 110946 | 352ms | 1; | ||
| 588 | } | ||||
| 589 | |||||
| - - | |||||
| 594 | ##################################################################### | ||||
| 595 | # Altering Tokens in Tokenizer | ||||
| 596 | |||||
| 597 | # Finish the end of a token. | ||||
| 598 | # Returns the resulting parse class as a convenience. | ||||
| 599 | # spent 218ms within PPI::Tokenizer::_finalize_token which was called 94513 times, avg 2µs/call:
# 31193 times (67.2ms+0s) by PPI::Tokenizer::_new_token at line 620, avg 2µs/call
# 14291 times (35.5ms+0s) by PPI::Token::Word::__TOKENIZER__commit at line 539 of PPI/Token/Word.pm, avg 2µs/call
# 13365 times (29.4ms+0s) by PPI::Token::Structure::__TOKENIZER__commit at line 76 of PPI/Token/Structure.pm, avg 2µs/call
# 9549 times (20.9ms+0s) by PPI::Token::Whitespace::__TOKENIZER__on_line_end at line 417 of PPI/Token/Whitespace.pm, avg 2µs/call
# 7437 times (16.8ms+0s) by PPI::Token::Operator::__TOKENIZER__on_char at line 112 of PPI/Token/Operator.pm, avg 2µs/call
# 7245 times (21.2ms+0s) by PPI::Token::Symbol::__TOKENIZER__on_char at line 216 of PPI/Token/Symbol.pm, avg 3µs/call
# 3157 times (6.88ms+0s) by PPI::Token::Structure::__TOKENIZER__on_char at line 70 of PPI/Token/Structure.pm, avg 2µs/call
# 2743 times (7.54ms+0s) by PPI::Token::_QuoteEngine::__TOKENIZER__on_char at line 58 of PPI/Token/_QuoteEngine.pm, avg 3µs/call
# 1668 times (3.76ms+0s) by PPI::Token::Whitespace::__TOKENIZER__on_line_start at line 165 of PPI/Token/Whitespace.pm, avg 2µs/call
# 1252 times (2.71ms+0s) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 213 of PPI/Token/Whitespace.pm, avg 2µs/call
# 832 times (2.14ms+0s) by PPI::Token::Number::__TOKENIZER__on_char at line 125 of PPI/Token/Number.pm, avg 3µs/call
# 509 times (1.33ms+0s) by PPI::Token::Symbol::__TOKENIZER__on_char at line 174 of PPI/Token/Symbol.pm, avg 3µs/call
# 288 times (397µs+0s) by PPI::Tokenizer::_process_next_line at line 494, avg 1µs/call
# 148 times (513µs+0s) by PPI::Token::Number::Float::__TOKENIZER__on_char at line 108 of PPI/Token/Number/Float.pm, avg 3µs/call
# 146 times (415µs+0s) by PPI::Token::Pod::__TOKENIZER__on_line_start at line 148 of PPI/Token/Pod.pm, avg 3µs/call
# 144 times (335µs+0s) by PPI::Tokenizer::_clean_eof at line 635, avg 2µs/call
# 144 times (308µs+0s) by PPI::Token::Word::__TOKENIZER__commit at line 458 of PPI/Token/Word.pm, avg 2µs/call
# 144 times (299µs+0s) by PPI::Token::Word::__TOKENIZER__commit at line 441 of PPI/Token/Word.pm, avg 2µs/call
# 85 times (215µs+0s) by PPI::Token::Unknown::__TOKENIZER__on_char at line 179 of PPI/Token/Unknown.pm, avg 3µs/call
# 61 times (125µs+0s) by PPI::Token::Cast::__TOKENIZER__on_char at line 51 of PPI/Token/Cast.pm, avg 2µs/call
# 51 times (105µs+0s) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 261 of PPI/Token/Whitespace.pm, avg 2µs/call
# 30 times (105µs+0s) by PPI::Token::Magic::__TOKENIZER__on_char at line 228 of PPI/Token/Magic.pm, avg 4µs/call
# 22 times (54µs+0s) by PPI::Token::Unknown::__TOKENIZER__on_char at line 216 of PPI/Token/Unknown.pm, avg 2µs/call
# 3 times (8µs+0s) by PPI::Token::ArrayIndex::__TOKENIZER__on_char at line 56 of PPI/Token/ArrayIndex.pm, avg 3µs/call
# 2 times (5µs+0s) by PPI::Token::DashedWord::__TOKENIZER__on_char at line 95 of PPI/Token/DashedWord.pm, avg 2µs/call
# once (2µs+0s) by PPI::Token::Magic::__TOKENIZER__on_char at line 170 of PPI/Token/Magic.pm
# once (2µs+0s) by PPI::Token::Unknown::__TOKENIZER__on_char at line 150 of PPI/Token/Unknown.pm
# once (2µs+0s) by PPI::Token::HereDoc::__TOKENIZER__on_char at line 218 of PPI/Token/HereDoc.pm
# once (2µs+0s) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 316 of PPI/Token/Whitespace.pm | ||||
| 600 | 94513 | 16.2ms | my $self = shift; | ||
| 601 | 94513 | 16.8ms | return $self->{class} unless defined $self->{token}; | ||
| 602 | |||||
| 603 | # Add the token to the token buffer | ||||
| 604 | 94225 | 34.9ms | push @{ $self->{tokens} }, $self->{token}; | ||
| 605 | 94225 | 16.6ms | $self->{token} = undef; | ||
| 606 | |||||
| 607 | # Return the parse class to that of the zone we are in | ||||
| 608 | 94225 | 297ms | $self->{class} = $self->{zone}; | ||
| 609 | } | ||||
| 610 | |||||
| 611 | # Creates a new token and sets it in the tokenizer | ||||
| 612 | # The defined() in here prevent a ton of calls to PPI::Util::TRUE | ||||
| 613 | # spent 681ms (428+253) within PPI::Tokenizer::_new_token which was called 56533 times, avg 12µs/call:
# 21222 times (159ms+94.4ms) by PPI::Tokenizer::_process_next_char at line 576, avg 12µs/call
# 14291 times (103ms+63.5ms) by PPI::Token::Word::__TOKENIZER__commit at line 533 of PPI/Token/Word.pm, avg 12µs/call
# 13365 times (103ms+47.8ms) by PPI::Token::Structure::__TOKENIZER__commit at line 75 of PPI/Token/Structure.pm, avg 11µs/call
# 3724 times (24.9ms+10.0ms) by PPI::Token::Whitespace::__TOKENIZER__on_line_start at line 159 of PPI/Token/Whitespace.pm, avg 9µs/call
# 1668 times (19.7ms+6.52ms) by PPI::Token::Whitespace::__TOKENIZER__on_line_start at line 164 of PPI/Token/Whitespace.pm, avg 16µs/call
# 1055 times (10.0ms+26.4ms) by PPI::Token::Word::__TOKENIZER__commit at line 497 of PPI/Token/Word.pm, avg 35µs/call
# 288 times (1.53ms+796µs) by PPI::Token::End::__TOKENIZER__on_line_start at line 84 of PPI/Token/End.pm, avg 8µs/call
# 242 times (1.71ms+1.08ms) by PPI::Token::Comment::__TOKENIZER__commit at line 93 of PPI/Token/Comment.pm, avg 12µs/call
# 242 times (1.62ms+1.01ms) by PPI::Token::Comment::__TOKENIZER__commit at line 94 of PPI/Token/Comment.pm, avg 11µs/call
# 144 times (1.30ms+760µs) by PPI::Token::Word::__TOKENIZER__commit at line 440 of PPI/Token/Word.pm, avg 14µs/call
# 144 times (1.29ms+646µs) by PPI::Token::End::__TOKENIZER__on_line_start at line 70 of PPI/Token/End.pm, avg 13µs/call
# 144 times (703µs+318µs) by PPI::Token::Word::__TOKENIZER__commit at line 454 of PPI/Token/Word.pm, avg 7µs/call
# 2 times (15µs+9µs) by PPI::Token::Whitespace::__TOKENIZER__on_line_start at line 170 of PPI/Token/Whitespace.pm, avg 12µs/call
# 2 times (14µs+8µs) by PPI::Token::Number::Float::__TOKENIZER__on_char at line 93 of PPI/Token/Number/Float.pm, avg 11µs/call | ||||
| 614 | 56533 | 9.70ms | my $self = shift; | ||
| 615 | # throw PPI::Exception() unless @_; | ||||
| 616 | 56533 | 31.6ms | my $class = substr( $_[0], 0, 12 ) eq 'PPI::Token::' | ||
| 617 | ? shift : 'PPI::Token::' . shift; | ||||
| 618 | |||||
| 619 | # Finalize any existing token | ||||
| 620 | 56533 | 38.5ms | 31193 | 67.2ms | $self->_finalize_token if defined $self->{token}; # spent 67.2ms making 31193 calls to PPI::Tokenizer::_finalize_token, avg 2µs/call |
| 621 | |||||
| 622 | # Create the new token and update the parse class | ||||
| 623 | 56533 | 96.6ms | 56533 | 186ms | defined($self->{token} = $class->new($_[0])) or PPI::Exception->throw; # spent 138ms making 53790 calls to PPI::Token::new, avg 3µs/call
# spent 24.2ms making 1061 calls to PPI::Token::_QuoteEngine::Full::new, avg 23µs/call
# spent 23.6ms making 1682 calls to PPI::Token::_QuoteEngine::Simple::new, avg 14µs/call |
| 624 | 56533 | 11.2ms | $self->{class} = $class; | ||
| 625 | |||||
| 626 | 56533 | 165ms | 1; | ||
| 627 | } | ||||
| 628 | |||||
| 629 | # At the end of the file, we need to clean up the results of the erroneous | ||||
| 630 | # space that we inserted at the beginning of the process. | ||||
| 631 | # spent 1.76ms (1.34+424µs) within PPI::Tokenizer::_clean_eof which was called 144 times, avg 12µs/call:
# 144 times (1.34ms+424µs) by PPI::Tokenizer::_process_next_line at line 502, avg 12µs/call | ||||
| 632 | 144 | 47µs | my $self = shift; | ||
| 633 | |||||
| 634 | # Finish any partially completed token | ||||
| 635 | 144 | 645µs | 288 | 424µs | $self->_finalize_token if $self->{token}; # spent 335µs making 144 calls to PPI::Tokenizer::_finalize_token, avg 2µs/call
# spent 89µs making 144 calls to PPI::Util::TRUE, avg 618ns/call |
| 636 | |||||
| 637 | # Find the last token, and if it has no content, kill it. | ||||
| 638 | # There appears to be some evidence that such "null tokens" are | ||||
| 639 | # somehow getting created accidentally. | ||||
| 640 | 144 | 132µs | my $last_token = $self->{tokens}->[ -1 ]; | ||
| 641 | 144 | 91µs | unless ( length $last_token->{content} ) { | ||
| 642 | pop @{$self->{tokens}}; | ||||
| 643 | } | ||||
| 644 | |||||
| 645 | # Now, if the last character of the last token is a space we added, | ||||
| 646 | # chop it off, deleting the token if there's nothing else left. | ||||
| 647 | 144 | 80µs | if ( $self->{source_eof_chop} ) { | ||
| 648 | $last_token = $self->{tokens}->[ -1 ]; | ||||
| 649 | $last_token->{content} =~ s/ $//; | ||||
| 650 | unless ( length $last_token->{content} ) { | ||||
| 651 | # Popping token | ||||
| 652 | pop @{$self->{tokens}}; | ||||
| 653 | } | ||||
| 654 | |||||
| 655 | # The hack involving adding an extra space is now reversed, and | ||||
| 656 | # now nobody will ever know. The perfect crime! | ||||
| 657 | $self->{source_eof_chop} = ''; | ||||
| 658 | } | ||||
| 659 | |||||
| 660 | 144 | 331µs | 1; | ||
| 661 | } | ||||
| 662 | |||||
| - - | |||||
| 667 | ##################################################################### | ||||
| 668 | # Utility Methods | ||||
| 669 | |||||
| 670 | # Context | ||||
| 671 | sub _last_token { | ||||
| 672 | $_[0]->{tokens}->[-1]; | ||||
| 673 | } | ||||
| 674 | |||||
| 675 | # spent 589µs (488+101) within PPI::Tokenizer::_last_significant_token which was called 52 times, avg 11µs/call:
# 51 times (479µs+99µs) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 265 of PPI/Token/Whitespace.pm, avg 11µs/call
# once (10µs+2µs) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 321 of PPI/Token/Whitespace.pm | ||||
| 676 | 52 | 19µs | my $self = shift; | ||
| 677 | 52 | 41µs | my $cursor = $#{ $self->{tokens} }; | ||
| 678 | 52 | 20µs | while ( $cursor >= 0 ) { | ||
| 679 | 104 | 45µs | my $token = $self->{tokens}->[$cursor--]; | ||
| 680 | 104 | 266µs | 104 | 101µs | return $token if $token->significant; # spent 54µs making 52 calls to PPI::Token::Whitespace::significant, avg 1µs/call
# spent 46µs making 52 calls to PPI::Element::significant, avg 894ns/call |
| 681 | } | ||||
| 682 | |||||
| 683 | # Nothing... | ||||
| 684 | PPI::Token::Whitespace->null; | ||||
| 685 | } | ||||
| 686 | |||||
| 687 | # Get an array ref of previous significant tokens. | ||||
| 688 | # Like _last_significant_token except it gets more than just one token | ||||
| 689 | # Returns array ref on success. | ||||
| 690 | # Returns 0 on not enough tokens | ||||
| 691 | # spent 305ms (261+43.9) within PPI::Tokenizer::_previous_significant_tokens which was called 20542 times, avg 15µs/call:
# 15490 times (172ms+28.5ms) by PPI::Token::Word::__TOKENIZER__commit at line 430 of PPI/Token/Word.pm, avg 13µs/call
# 3157 times (72.8ms+13.4ms) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 222 of PPI/Token/Whitespace.pm, avg 27µs/call
# 1866 times (16.1ms+1.91ms) by PPI::Tokenizer::_opcontext at line 741, avg 10µs/call
# 25 times (469µs+119µs) by PPI::Token::Unknown::__TOKENIZER__is_an_attribute at line 305 of PPI/Token/Unknown.pm, avg 24µs/call
# 2 times (17µs+3µs) by PPI::Token::Unknown::__TOKENIZER__on_char at line 57 of PPI/Token/Unknown.pm, avg 10µs/call
# 2 times (11µs+2µs) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 384 of PPI/Token/Whitespace.pm, avg 6µs/call | ||||
| 692 | 20542 | 4.29ms | my $self = shift; | ||
| 693 | 20542 | 2.60ms | my $count = shift || 1; | ||
| 694 | 20542 | 8.90ms | my $cursor = $#{ $self->{tokens} }; | ||
| 695 | |||||
| 696 | 20542 | 1.91ms | my ($token, @tokens); | ||
| 697 | 20542 | 4.68ms | while ( $cursor >= 0 ) { | ||
| 698 | 42181 | 14.9ms | $token = $self->{tokens}->[$cursor--]; | ||
| 699 | 42181 | 53.6ms | 42181 | 40.9ms | if ( $token->significant ) { # spent 25.1ms making 26762 calls to PPI::Element::significant, avg 940ns/call
# spent 13.8ms making 13592 calls to PPI::Token::Whitespace::significant, avg 1µs/call
# spent 1.88ms making 1824 calls to PPI::Token::Comment::significant, avg 1µs/call
# spent 3µs making 3 calls to PPI::Token::Pod::significant, avg 1µs/call |
| 700 | 26762 | 10.4ms | push @tokens, $token; | ||
| 701 | 26762 | 107ms | return \@tokens if scalar @tokens >= $count; | ||
| 702 | } | ||||
| 703 | } | ||||
| 704 | |||||
| 705 | # Pad with empties | ||||
| 706 | 144 | 424µs | foreach ( 1 .. ($count - scalar @tokens) ) { | ||
| 707 | 144 | 703µs | 144 | 3.03ms | push @tokens, PPI::Token::Whitespace->null; # spent 3.03ms making 144 calls to PPI::Token::Whitespace::null, avg 21µs/call |
| 708 | } | ||||
| 709 | |||||
| 710 | 144 | 466µs | \@tokens; | ||
| 711 | } | ||||
| 712 | |||||
| 713 | 1 | 7µs | my %OBVIOUS_CLASS = ( | ||
| 714 | 'PPI::Token::Symbol' => 'operator', | ||||
| 715 | 'PPI::Token::Magic' => 'operator', | ||||
| 716 | 'PPI::Token::Number' => 'operator', | ||||
| 717 | 'PPI::Token::ArrayIndex' => 'operator', | ||||
| 718 | 'PPI::Token::Quote::Double' => 'operator', | ||||
| 719 | 'PPI::Token::Quote::Interpolate' => 'operator', | ||||
| 720 | 'PPI::Token::Quote::Literal' => 'operator', | ||||
| 721 | 'PPI::Token::Quote::Single' => 'operator', | ||||
| 722 | 'PPI::Token::QuoteLike::Backtick' => 'operator', | ||||
| 723 | 'PPI::Token::QuoteLike::Command' => 'operator', | ||||
| 724 | 'PPI::Token::QuoteLike::Readline' => 'operator', | ||||
| 725 | 'PPI::Token::QuoteLike::Regexp' => 'operator', | ||||
| 726 | 'PPI::Token::QuoteLike::Words' => 'operator', | ||||
| 727 | ); | ||||
| 728 | |||||
| 729 | 1 | 2µs | my %OBVIOUS_CONTENT = ( | ||
| 730 | '(' => 'operand', | ||||
| 731 | '{' => 'operand', | ||||
| 732 | '[' => 'operand', | ||||
| 733 | ';' => 'operand', | ||||
| 734 | '}' => 'operator', | ||||
| 735 | ); | ||||
| 736 | |||||
| 737 | # Try to determine operator/operand context, if possible. | ||||
| 738 | # Returns "operator", "operand", or "" if unknown. | ||||
| 739 | # spent 34.6ms (16.1+18.5) within PPI::Tokenizer::_opcontext which was called 1866 times, avg 19µs/call:
# 1866 times (16.1ms+18.5ms) by PPI::Token::Whitespace::__TOKENIZER__on_char at line 397 of PPI/Token/Whitespace.pm, avg 19µs/call | ||||
| 740 | 1866 | 419µs | my $self = shift; | ||
| 741 | 1866 | 2.31ms | 1866 | 18.0ms | my $tokens = $self->_previous_significant_tokens(1); # spent 18.0ms making 1866 calls to PPI::Tokenizer::_previous_significant_tokens, avg 10µs/call |
| 742 | 1866 | 635µs | my $p0 = $tokens->[0]; | ||
| 743 | 1866 | 905µs | my $c0 = ref $p0; | ||
| 744 | |||||
| 745 | # Map the obvious cases | ||||
| 746 | 1866 | 5.32ms | return $OBVIOUS_CLASS{$c0} if defined $OBVIOUS_CLASS{$c0}; | ||
| 747 | 133 | 334µs | 153 | 247µs | return $OBVIOUS_CONTENT{$p0} if defined $OBVIOUS_CONTENT{$p0}; # spent 247µs making 153 calls to PPI::Token::content, avg 2µs/call |
| 748 | |||||
| 749 | # Most of the time after an operator, we are an operand | ||||
| 750 | 113 | 485µs | 113 | 168µs | return 'operand' if $p0->isa('PPI::Token::Operator'); # spent 168µs making 113 calls to UNIVERSAL::isa, avg 1µs/call |
| 751 | |||||
| 752 | # If there's NOTHING, it's operand | ||||
| 753 | 107 | 149µs | 107 | 140µs | return 'operand' if $p0->content eq ''; # spent 140µs making 107 calls to PPI::Token::content, avg 1µs/call |
| 754 | |||||
| 755 | # Otherwise, we don't know | ||||
| 756 | 107 | 283µs | return '' | ||
| 757 | } | ||||
| 758 | |||||
| 759 | 1 | 6µs | 1; | ||
| 760 | |||||
| 761 | =pod | ||||
| 762 | |||||
| 763 | =head1 NOTES | ||||
| 764 | |||||
| 765 | =head2 How the Tokenizer Works | ||||
| 766 | |||||
| 767 | Understanding the Tokenizer is not for the faint-hearted. It is by far | ||||
| 768 | the most complex and twisty piece of perl I've ever written that is actually | ||||
| 769 | still built properly and isn't a terrible spaghetti-like mess. In fact, you | ||||
| 770 | probably want to skip this section. | ||||
| 771 | |||||
| 772 | But if you really want to understand, well then here goes. | ||||
| 773 | |||||
| 774 | =head2 Source Input and Clean Up | ||||
| 775 | |||||
| 776 | The Tokenizer starts by taking source in a variety of forms, sucking it | ||||
| 777 | all in and merging it into one big string, and doing its own internal line | ||||
| 778 | split, using a "universal line separator" which allows the Tokenizer to | ||||
| 779 | take source for any platform (and even supports a few known types of | ||||
| 780 | broken newlines caused by mixed mac/pc/*nix editor screw-ups). | ||||
| 781 | |||||
| 782 | The resulting array of lines is used to feed the tokenizer, and is also | ||||
| 783 | accessed directly by the heredoc-logic to do the line-oriented part of | ||||
| 784 | here-doc support. | ||||
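The normalize-then-split step described above can be sketched as follows. This is a standalone illustration rather than PPI's exact code, and `split_universal` is an invented name:

```perl
use strict;
use warnings;

# Sketch of the "universal line separator": normalize DOS (\015\012),
# old Mac (\015), Unix (\012) and even broken \015\015\012 newlines,
# then split so each line keeps its trailing newline, ready to feed
# the tokenizer and the heredoc logic.
sub split_universal {
    my $source = shift;
    $source =~ s/(?:\015{1,2}\012|\015|\012)/\n/g;  # normalize all newline styles
    return split /(?<=\n)/, $source;                # keep the newline on each line
}

my @lines = split_universal("one\015\012two\015three\012");
print scalar(@lines), "\n";    # prints 3
```

The zero-width lookbehind in the `split` pattern is what keeps the newline attached to its line instead of discarding it as a separator.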
| 785 | |||||
| 786 | =head2 Doing Things the Old Fashioned Way | ||||
| 787 | |||||
| 788 | Due to the complexity of perl, and after two previously aborted parser | ||||
| 789 | attempts, in the end the tokenizer was fashioned around a line-buffered | ||||
| 790 | character-by-character method. | ||||
| 791 | |||||
| 792 | That is, the Tokenizer pulls and holds a line at a time into a line buffer, | ||||
| 793 | and then iterates a cursor along it. At each cursor position, a method is | ||||
| 794 | called in whatever token class we are currently in, which will examine the | ||||
| 795 | character at the current position, and handle it. | ||||
| 796 | |||||
| 797 | As the handler methods in the various token classes are called, they | ||||
| 798 | build up an output token array for the source code. | ||||
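The cursor-and-handler loop can be reduced to a toy model like the one below. The two class names and their handlers are invented for illustration; the real Tokenizer dispatches to `__TOKENIZER__on_char` on the current `PPI::Token` subclass:

```perl
use strict;
use warnings;

# Toy model of the line-buffered, char-by-char dispatch loop: a cursor
# walks the line, the handler for the current class examines each
# character, and a class change finalizes the token being built.
my %on_char = (
    whitespace => sub { $_[0] =~ /\s/ ? 'whitespace' : 'word' },
    word       => sub { $_[0] =~ /\w/ ? 'word'       : 'whitespace' },
);

sub toy_tokenize {
    my $line  = shift;
    my $class = 'whitespace';               # the "base" class of a document
    my ( $token, @tokens ) = ('');
    for my $cursor ( 0 .. length($line) - 1 ) {
        my $char = substr $line, $cursor, 1;
        my $next = $on_char{$class}->($char);
        if ( $next ne $class and length $token ) {
            push @tokens, [ $class, $token ];   # class change finalizes token
            $token = '';
        }
        ( $class, $token ) = ( $next, $token . $char );
    }
    push @tokens, [ $class, $token ] if length $token;
    return @tokens;
}

my @tokens = toy_tokenize("my x\n");
print scalar(@tokens), "\n";    # prints 4 (word, space, word, newline)
```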
| 799 | |||||
| 800 | Various parts of the Tokenizer use look-ahead, arbitrary-distance | ||||
| 801 | look-behind (although currently the maximum is three significant tokens), | ||||
| 802 | or both, and various other heuristic guesses. | ||||
| 803 | |||||
| 804 | I've been told it is officially termed a I<"backtracking parser | ||||
| 805 | with infinite lookaheads">. | ||||
| 806 | |||||
| 807 | =head2 State Variables | ||||
| 808 | |||||
| 809 | Aside from the current line and the character cursor, the Tokenizer | ||||
| 810 | maintains a number of different state variables. | ||||
| 811 | |||||
| 812 | =over | ||||
| 813 | |||||
| 814 | =item Current Class | ||||
| 815 | |||||
| 816 | The Tokenizer maintains the current token class at all times. Much of the | ||||
| 817 | time this is just going to be the "Whitespace" class, which is what the base | ||||
| 818 | of a document is. As the tokenizer executes the various character handlers, | ||||
| 819 | the class changes a lot as it moves along. In fact, in some instances, | ||||
| 820 | the character handler may not handle the character directly itself, but | ||||
| 821 | rather change the "current class" and then hand off to the character | ||||
| 822 | handler for the new class. | ||||
| 823 | |||||
| 824 | Because of this, and some other things I'll deal with later, the number of | ||||
| 825 | times the character handlers are called does not in fact have a direct | ||||
| 826 | relationship to the number of actual characters in the document. | ||||
| 827 | |||||
| 828 | =item Current Zone | ||||
| 829 | |||||
| 830 | Rather than create a class stack to allow for infinitely nested layers of | ||||
| 831 | classes, the Tokenizer recognises just a single layer. | ||||
| 832 | |||||
| 833 | To put it a different way, in various parts of the file, the Tokenizer will | ||||
| 834 | recognise different "base" or "substrate" classes. When a Token such as a | ||||
| 835 | comment or a number is finalised by the tokenizer, it "falls back" to the | ||||
| 836 | base state. | ||||
| 837 | |||||
| 838 | This allows proper tokenization of special areas such as __DATA__ | ||||
| 839 | and __END__ blocks, which also contain things like comments and POD, | ||||
| 840 | without allowing the creation of any significant Tokens inside these areas. | ||||
| 841 | |||||
| 842 | For the main part of a document we use L<PPI::Token::Whitespace> for this, | ||||
| 843 | with the idea being that code is "floating in a sea of whitespace". | ||||
| 844 | |||||
| 845 | =item Current Token | ||||
| 846 | |||||
| 847 | The final main state variable is the "current token". This is the Token | ||||
| 848 | that is currently being built by the Tokenizer. For certain types, it | ||||
| 849 | can be manipulated and morphed and change class quite a bit while being | ||||
| 850 | assembled, as the Tokenizer's understanding of the token content changes. | ||||
| 851 | |||||
| 852 | When the Tokenizer is confident that it has seen the end of the Token, it | ||||
| 853 | will be "finalized", which adds it to the output token array and resets | ||||
| 854 | the current class to that of the zone that we are currently in. | ||||
| 855 | |||||
| 856 | I should also note at this point that the "current token" variable is | ||||
| 857 | optional. The Tokenizer is capable of knowing what class it is currently | ||||
| 858 | set to, without actually having accumulated any characters in the Token. | ||||
| 859 | |||||
| 860 | =back | ||||
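The interplay of current class, zone, and current token is easiest to see in a miniature of `_finalize_token` (field names mirror the profiled source above; this is a simplified standalone, not the real method):

```perl
use strict;
use warnings;

# Simplified standalone version of _finalize_token: push the completed
# token onto the output buffer, then reset the parse class to that of
# the zone we are currently in ("falling back" to the substrate class).
sub finalize_token {
    my $self = shift;
    return $self->{class} unless defined $self->{token};
    push @{ $self->{tokens} }, $self->{token};  # token is complete
    $self->{token} = undef;
    $self->{class} = $self->{zone};             # fall back to the zone class
}

my $self = {
    zone   => 'PPI::Token::Whitespace',   # base class for the main document
    class  => 'PPI::Token::Number',       # class of the token being built
    token  => { content => '42' },
    tokens => [],
};
finalize_token($self);
print $self->{class}, "\n";    # prints PPI::Token::Whitespace
```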
| 861 | |||||
| 862 | =head2 Making It Faster | ||||
| 863 | |||||
| 864 | As I'm sure you can imagine, calling several different methods for each | ||||
| 865 | character and running regexes and other complex heuristics made the first | ||||
| 866 | fully working version of the tokenizer extremely slow. | ||||
| 867 | |||||
| 868 | During testing, I created a metric to measure parsing speed called | ||||
| 869 | LPGC, or "lines per gigacycle". A gigacycle is simply a billion CPU | ||||
| 870 | cycles on a typical single-core CPU, and so a Tokenizer running at | ||||
| 871 | "1000 lines per gigacycle" should generate around 1200 lines of tokenized | ||||
| 872 | code per second when running on a 1200 MHz processor. | ||||
| 873 | |||||
| 874 | The first working version of the tokenizer ran at only 350 LPGC, so to | ||||
| 875 | tokenize a typical large module such as L<ExtUtils::MakeMaker> took | ||||
| 876 | 10-15 seconds. This sluggishness made it impractical for many uses. | ||||
| 877 | |||||
| 878 | So in the current parser, there are multiple layers of optimisation | ||||
| 879 | very carefully built into the basic design. This has brought the tokenizer | ||||
| 880 | up to a more reasonable 1000 LPGC, at the expense of making the code | ||||
| 881 | quite a bit twistier. | ||||
| 882 | |||||
| 883 | =head2 Making It Faster - Whole Line Classification | ||||
| 884 | |||||
| 885 | The first step in the optimisation process was to add a new handler to | ||||
| 886 | enable several of the more basic classes (whitespace, comments) to be | ||||
| 887 | parsed a line at a time. At the start of each line, a | ||||
| 888 | special optional handler (only supported by a few classes) is called to | ||||
| 889 | check and see if the entire line can be parsed in one go. | ||||
| 890 | |||||
| 891 | This is used mainly to handle things like POD, comments, empty lines, | ||||
| 892 | and a few other minor special cases. | ||||
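A whole-line fast path of this kind might look like the following sketch. The function and its return values are invented; in PPI the per-class `__TOKENIZER__on_line_start` handlers play this role:

```perl
use strict;
use warnings;

# Sketch of a whole-line fast path: before the char-by-char loop starts,
# a few simple classes get a chance to claim the entire line at once.
sub classify_whole_line {
    my $line = shift;
    return 'whitespace' if $line =~ /^\s*$/;   # blank line
    return 'comment'    if $line =~ /^\s*#/;   # full-line comment
    return 'pod'        if $line =~ /^=\w+/;   # POD directive
    return undef;                              # fall through to the char loop
}

print classify_whole_line("# a comment\n"), "\n";    # prints comment
```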
| 893 | |||||
| 894 | =head2 Making It Faster - Inlining | ||||
| 895 | |||||
| 896 | The second stage of the optimisation involved inlining a small | ||||
| 897 | number of critical methods that were repeated an extremely high number | ||||
| 898 | of times. Profiling suggested that there were about 1,000,000 individual | ||||
| 899 | method calls per gigacycle, and by cutting these by two thirds a significant | ||||
| 900 | speed improvement was gained, on the order of about 50%. | ||||
| 901 | |||||
| 902 | You may notice that many methods in the C<PPI::Tokenizer> code look | ||||
| 903 | very nested and longhand. This is primarily due to this inlining. | ||||
| 904 | |||||
| 905 | At around this time, some statistics code that existed in the early | ||||
| 906 | versions of the parser was also removed, as it was determined that | ||||
| 907 | it was consuming around 15% of the CPU for the entire parser, while | ||||
| 908 | making the core more complicated. | ||||
| 909 | |||||
| 910 | A judgment call was made that with the difficulties likely to be | ||||
| 911 | encountered with future planned enhancements, and given the relatively | ||||
| 912 | high cost involved, the statistics features would be removed from the | ||||
| 913 | Tokenizer. | ||||
| 914 | |||||
| 915 | =head2 Making It Faster - Quote Engine | ||||
| 916 | |||||
| 917 | Once inlining had reached diminishing returns, it became obvious from | ||||
| 918 | the profiling results that a huge amount of time was being spent | ||||
| 919 | stepping a char at a time through long, simple and "syntactically boring" | ||||
| 920 | code such as comments and strings. | ||||
| 921 | |||||
| 922 | The existing regex engine was expanded to also encompass quotes and | ||||
| 923 | other quote-like things, and a special abstract base class was added | ||||
| 924 | that provided a number of specialised parsing methods that would "scan | ||||
| 925 | ahead", looking ahead to find the end of a string, and updating | ||||
| 926 | the cursor to leave it in a valid position for the next call. | ||||
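The scan-ahead idea can be sketched like so. This is a simplified stand-in for the Quote Engine's real methods, which are regex-based and handle many more delimiter and escape cases:

```perl
use strict;
use warnings;

# Sketch of a Quote Engine style "scan ahead": instead of stepping one
# character at a time, jump straight to the closing delimiter (honouring
# backslash escapes) and leave the cursor on it for the next handler call.
sub scan_for_close {
    my ( $line, $cursor, $close ) = @_;   # cursor sits just after the open quote
    while ( $cursor < length $line ) {
        my $char = substr $line, $cursor, 1;
        return $cursor if $char eq $close;   # found the unescaped close
        $cursor += $char eq '\\' ? 2 : 1;    # skip over an escaped character
    }
    return undef;                            # string never closed on this line
}

my $pos = scan_for_close(q{'a\'b' . $x}, 1, "'");
print $pos, "\n";    # prints 5
```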
| 927 | |||||
| 928 | This is also the point at which the number of character handler calls began | ||||
| 929 | to greatly differ from the number of characters. But it has been done | ||||
| 930 | in a way that allows the parser to retain the power of the original | ||||
| 931 | version at the critical points, while skipping through the "boring bits" | ||||
| 932 | as needed for additional speed. | ||||
| 933 | |||||
| 934 | The addition of this feature allowed the tokenizer to exceed 1000 LPGC | ||||
| 935 | for the first time. | ||||
| 936 | |||||
| 937 | =head2 Making It Faster - The "Complete" Mechanism | ||||
| 938 | |||||
| 939 | As it became evident that great speed increases were available by using | ||||
| 940 | this "skipping ahead" mechanism, a new handler method was added that | ||||
| 941 | explicitly handles the parsing of an entire token, where the structure | ||||
| 942 | of the token is relatively simple. Tokens such as symbols fit this case, | ||||
| 943 | as once we are past the initial sigil and word char, we know that we | ||||
| 944 | can skip ahead and "complete" the rest of the token much more easily. | ||||
| 945 | |||||
| 946 | A number of these have been added for most or possibly all of the common | ||||
| 947 | cases, with most of these "complete" handlers implemented using regular | ||||
| 948 | expressions. | ||||
| 949 | |||||
| 950 | In fact, so many have been added that at this point, you could arguably | ||||
| 951 | reclassify the tokenizer as a "hybrid regex, char-by-char heuristic | ||||
| 952 | tokenizer". More tokens are now consumed in "complete" methods in a | ||||
| 953 | typical program than are handled by the normal char-by-char methods. | ||||
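A "complete" handler is, at its core, a single anchored regex. The sketch below is hedged: the function name is invented, and the real handlers cover many more edge cases:

```perl
use strict;
use warnings;

# Sketch of a "complete" handler: once we know we are at a symbol's
# sigil, one \G-anchored regex can consume the entire token in a single
# step instead of walking it character by character.
sub complete_symbol {
    my ( $line, $cursor ) = @_;
    pos($line) = $cursor;                            # anchor the match here
    $line =~ /\G([\$\@\%][\w:']+)/ or return undef;  # sigil + identifier
    return $1;                                       # the whole token at once
}

print complete_symbol('my $Foo::bar = 1;', 3), "\n";    # prints $Foo::bar
```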
| 954 | |||||
| 955 | Many of these complete-handlers were implemented during the writing | ||||
| 956 | of the Lexer, and this has allowed the full parser to maintain around | ||||
| 957 | 1000 LPGC despite the increasing weight of the Lexer. | ||||
| 958 | |||||
| 959 | =head2 Making It Faster - Porting To C (In Progress) | ||||
| 960 | |||||
| 961 | While it would be extraordinarily difficult to port all of the Tokenizer | ||||
| 962 | to C, work has started on a L<PPI::XS> "accelerator" package which acts as | ||||
| 963 | a separate and automatically-detected add-on to the main PPI package. | ||||
| 964 | |||||
| 965 | L<PPI::XS> implements faster versions of a variety of functions scattered | ||||
| 966 | over the entire PPI codebase (the Tokenizer core, the Quote Engine, and | ||||
| 967 | various other places), reimplementing them identically in XS/C. | ||||
| 968 | |||||
| 969 | In particular, the skip-ahead methods from the Quote Engine would appear | ||||
| 970 | to be extremely amenable to being done in C, and a number of other | ||||
| 971 | functions could be cherry-picked one at a time and implemented in C. | ||||
| 972 | |||||
| 973 | Each method is heavily tested to ensure that the functionality is | ||||
| 974 | identical, and a versioning mechanism is included to ensure that if a | ||||
| 975 | function gets out of sync, L<PPI::XS> will degrade gracefully and just | ||||
| 976 | not replace that single method. | ||||
| 977 | |||||
| 978 | =head1 TO DO | ||||
| 979 | |||||
| 980 | - Add an option to reset or seek the token stream... | ||||
| 981 | |||||
| 982 | - Implement more Tokenizer functions in L<PPI::XS> | ||||
| 983 | |||||
| 984 | =head1 SUPPORT | ||||
| 985 | |||||
| 986 | See the L<support section|PPI/SUPPORT> in the main module. | ||||
| 987 | |||||
| 988 | =head1 AUTHOR | ||||
| 989 | |||||
| 990 | Adam Kennedy E<lt>adamk@cpan.orgE<gt> | ||||
| 991 | |||||
| 992 | =head1 COPYRIGHT | ||||
| 993 | |||||
| 994 | Copyright 2001 - 2011 Adam Kennedy. | ||||
| 995 | |||||
| 996 | This program is free software; you can redistribute | ||||
| 997 | it and/or modify it under the same terms as Perl itself. | ||||
| 998 | |||||
| 999 | The full text of the license can be found in the | ||||
| 1000 | LICENSE file included with this module. | ||||
| 1001 | |||||
| 1002 | =cut | ||||
# spent 4.15ms within PPI::Tokenizer::CORE:match which was called 15534 times, avg 267ns/call:
# 15534 times (4.15ms+0s) by List::MoreUtils::any at line 211, avg 267ns/call | |||||
# spent 162ms within PPI::Tokenizer::CORE:subst which was called 144 times, avg 1.12ms/call:
# 144 times (162ms+0s) by PPI::Tokenizer::new at line 186, avg 1.12ms/call |