Profile of PPI/Token/HereDoc.pm

Filename	/Users/timbo/perl5/perlbrew/perls/perl-5.18.2/lib/site_perl/5.18.2/PPI/Token/HereDoc.pm
Statements	Executed 39 statements in 966µs

Subroutines
Calls	P	F	Exclusive Time	Inclusive Time	Subroutine
1	1	1	115µs	138µs	PPI::Token::HereDoc::__TOKENIZER__on_char
1	1	1	20µs	43µs	PPI::Token::HereDoc::BEGIN@87
1	1	1	15µs	15µs	PPI::Token::HereDoc::BEGIN@91
1	1	1	10µs	59µs	PPI::Token::HereDoc::BEGIN@90
2	2	1	7µs	7µs	PPI::Token::HereDoc::heredoc
1	1	1	6µs	6µs	PPI::Token::HereDoc::BEGIN@88
3	2	1	5µs	5µs	PPI::Token::HereDoc::CORE:match (opcode)
1	1	1	1µs	1µs	PPI::Token::HereDoc::CORE:subst (opcode)
0	0	0	0s	0s	PPI::Token::HereDoc::terminator
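
The two time columns are related: inclusive time is exclusive time plus the time spent in called subroutines. A minimal sketch that checks this against three rows of the table follows. The hash layout and the helper name `time_in_callees` are made up for illustration and are not part of any NYTProf API.

```perl
use strict;
use warnings;

# Figures copied from the subroutine table, in microseconds.
# The data layout is illustrative only, not an NYTProf structure.
my %subs = (
    '__TOKENIZER__on_char' => { exclusive => 115, inclusive => 138 },
    'BEGIN@87'             => { exclusive => 20,  inclusive => 43 },
    'BEGIN@90'             => { exclusive => 10,  inclusive => 59 },
);

# Time spent in callees is the difference between the two columns.
sub time_in_callees {
    my ($row) = @_;
    return $row->{inclusive} - $row->{exclusive};
}

for my $name ( sort keys %subs ) {
    printf "%-22s %3dus in callees\n", $name, time_in_callees( $subs{$name} );
}
```

The differences should agree with the per-line annotations in the listing, for example the 23µs that BEGIN@87 spends in strict::import.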


Line	Statements	Time on line	Calls	Time in subs	Code
1					package PPI::Token::HereDoc;
2
3					=pod
4
5					=head1 NAME
6
7					PPI::Token::HereDoc - Token class for the here-doc
8
9					=head1 INHERITANCE
10
11					PPI::Token::HereDoc
12					isa PPI::Token
13					isa PPI::Element
14
15					=head1 DESCRIPTION
16
17					Here-docs are incredibly handy when writing Perl, but incredibly tricky
18					when parsing it, primarily because they don't follow the general flow of
19					input.
20
21					They jump ahead and nab lines directly off the input buffer. Whitespace
22					and newlines may not matter in most Perl code, but they matter in here-docs.
23
24					They are also tricky to store as an object. They look sort of like an
25					operator and a string, but they don't act like it. And they have a second
26					section that should be something like a separate token, but isn't because a
27					string can span from above the here-doc content to below it.
28
29					So when parsing, this is what we do.
30
31					Firstly, the PPI::Token::HereDoc object does not represent the C<<< << >>>
32					operator, or the "END_FLAG", or the content, or even the terminator.
33
34					It represents all of them at once.
35
36					The token itself has only the declaration part as its "content".
37
38					# This is what the content of a HereDoc token is
39					<<FOO
40
41					# Or this
42					<<"FOO"
43
44					# Or even this
45					<< 'FOO'
46
47					That is, the "operator", any whitespace separator, and the quoted or bare
48					terminator. So when you call the C<content> method on a HereDoc token, you
49					get '<< "FOO"'.
50
51					As for the content and the terminator, when treated purely in "content" terms
52					they do not exist.
53
54					The content is made available with the C<heredoc> method, and the name of
55					the terminator with the C<terminator> method.
56
57					To make things work in the way you expect, PPI has to play some games
58					when doing line/column location calculation for tokens, and also during
59					the content parsing and generation processes.
60
61					Documents cannot simply be recreated by stitching together the token
62					contents. Serializing them involves a somewhat more expensive procedure,
63					but the extra expense should be negligible unless you are serializing
64					huge numbers of documents.
65
66					Please note that due to the immature nature of PPI in general, we expect
67					C<HereDocs> to be a rich (bad) source of corner-case bugs for quite a while,
68					but for the most part they should more or less DWYM.
69
70					=head2 Comparison to other string types
71
72					Although technically it can be considered a quote, for the time being
73					C<HereDocs> are being treated as a completely separate C<Token> subclass,
74					and will not be found in a search for L<PPI::Token::Quote> or
75					L<PPI::Token::QuoteLike> objects.
76
77					This may change in the future, with it most likely to end up under
78					QuoteLike.
79
80					=head1 METHODS
81
82					Although it has the standard set of C<Token> methods, C<HereDoc> objects
83					have a relatively large number of unique methods all of their own.
84
85					=cut
86
					# spent 43µs (20+23) within PPI::Token::HereDoc::BEGIN@87 which was called:
					#    once (20µs+23µs) by PPI::Token::BEGIN@70 at line 87
87	2	35µs	2	66µs	use strict;
					# spent 43µs making 1 call to PPI::Token::HereDoc::BEGIN@87
					# spent 23µs making 1 call to strict::import
					# spent 6µs within PPI::Token::HereDoc::BEGIN@88 which was called:
					#    once (6µs+0s) by PPI::Token::BEGIN@70 at line 88
88	2	32µs	1	6µs	use PPI::Token ();
					# spent 6µs making 1 call to PPI::Token::HereDoc::BEGIN@88
89
					# spent 59µs (10+49) within PPI::Token::HereDoc::BEGIN@90 which was called:
					#    once (10µs+49µs) by PPI::Token::BEGIN@70 at line 90
90	2	50µs	2	108µs	use vars qw{$VERSION @ISA};
					# spent 59µs making 1 call to PPI::Token::HereDoc::BEGIN@90
					# spent 49µs making 1 call to vars::import
					# spent 15µs within PPI::Token::HereDoc::BEGIN@91 which was called:
					#    once (15µs+0s) by PPI::Token::BEGIN@70 at line 94
91					BEGIN {
92	1	600ns			$VERSION = '1.215';
93	1	22µs			@ISA = 'PPI::Token';
94	1	706µs	1	15µs	}
					# spent 15µs making 1 call to PPI::Token::HereDoc::BEGIN@91
95
- -
100					#####################################################################
101					# PPI::Token::HereDoc Methods
102
103					=pod
104
105					=head2 heredoc
106
107					The C<heredoc> method is the authoritative method for accessing the contents
108					of the C<HereDoc> object.
109
110					It returns the contents of the here-doc as a list of newline-terminated
111					strings. If called in scalar context, it returns the number of lines in
112					the here-doc, B<excluding> the terminator line.
113
114					=cut
115
					# spent 7µs within PPI::Token::HereDoc::heredoc which was called 2 times, avg 3µs/call:
					#    once (4µs+0s) by PPI::Document::serialize at line 464 of PPI/Document.pm
					#    once (3µs+0s) by PPI::Document::index_locations at line 627 of PPI/Document.pm
116					sub heredoc {
117					wantarray
118					? @{shift->{_heredoc}}
119	2	12µs			: scalar @{shift->{_heredoc}};
120					}
121
122					=pod
123
124					=head2 terminator
125
126					The C<terminator> method returns the name of the terminating string for the
127					here-doc.
128
129					Returns the terminating string as an unescaped string (in the rare case
130					the terminator has an escaped quote in it).
131
132					=cut
133
134					sub terminator {
135					shift->{_terminator};
136					}
137
- -
142					#####################################################################
143					# Tokenizer Methods
144
145					# Parse in the entire here-doc in one call
					# spent 138µs (115+23) within PPI::Token::HereDoc::__TOKENIZER__on_char which was called:
					#    once (115µs+23µs) by PPI::Token::Operator::__TOKENIZER__on_char at line 102 of PPI/Token/Operator.pm
146					sub __TOKENIZER__on_char {
147	1	600ns			my $t = $_[1];
148
149					# We are currently located on the first char after the <<
150
151					# Handle the most common form first for simplicity and speed reasons
152					### FIXME - This regex, and this method in general, do not yet allow
153					### for the null here-doc, which terminates at the first
154					### empty line.
155	1	1µs			my $rest_of_line = substr( $t->{line}, $t->{line_cursor} );
156	1	9µs	1	2µs	unless ( $rest_of_line =~ /^( \s* (?: "[^"]*" | '[^']*' | `[^`]*` | \\?\w+ ) )/x ) { # spent 2µs making 1 call to PPI::Token::HereDoc::CORE:match
157					# Degenerate to a left-shift operation
158					$t->{token}->set_class('Operator');
159					return $t->_finalize_token->__TOKENIZER__on_char( $t );
160					}
161
162					# Add the rest of the token, work out what type it is,
163					# and suck in the content until the end.
164	1	500ns			my $token = $t->{token};
165	1	54µs			$token->{content} .= $1;
166	1	1µs			$t->{line_cursor} += length $1;
167
168					# Find the terminator, clean it up and determine
169					# the type of here-doc we are dealing with.
170	1	900ns			my $content = $token->{content};
171	1	8µs	2	3µs	if ( $content =~ /^\<\<(\w+)$/ ) { # spent 3µs making 2 calls to PPI::Token::HereDoc::CORE:match, avg 1µs/call
172					# Bareword
173					$token->{_mode} = 'interpolate';
174					$token->{_terminator} = $1;
175
176					} elsif ( $content =~ /^\<\<\s*\'(.*)\'$/ ) {
177					# ''-quoted literal
178	1	1µs			$token->{_mode} = 'literal';
179	1	1µs			$token->{_terminator} = $1;
180	1	7µs	1	1µs	$token->{_terminator} =~ s/\\'/'/g; # spent 1µs making 1 call to PPI::Token::HereDoc::CORE:subst
181
182					} elsif ( $content =~ /^\<\<\s*\"(.*)\"$/ ) {
183					# ""-quoted, interpolated
184					$token->{_mode} = 'interpolate';
185					$token->{_terminator} = $1;
186					$token->{_terminator} =~ s/\\"/"/g;
187
188					} elsif ( $content =~ /^\<\<\s*\`(.*)\`$/ ) {
189					# ``-quoted command
190					$token->{_mode} = 'command';
191					$token->{_terminator} = $1;
192					$token->{_terminator} =~ s/\\`/`/g;
193
194					} elsif ( $content =~ /^\<\<\\(\w+)$/ ) {
195					# Legacy backslashed bareword
196					$token->{_mode} = 'literal';
197					$token->{_terminator} = $1;
198
199					} else {
200					# WTF?
201					return undef;
202					}
203
204					# Define $line outside of the loop, so that if we encounter the
205					# end of the file, we have access to the last line still.
206	1	500ns			my $line;
207
208					# Suck in the HEREDOC
209	1	1µs			$token->{_heredoc} = [];
210	1	1µs			my $terminator = $token->{_terminator} . "\n";
211	1	2µs	1	3µs	while ( defined($line = $t->_get_line) ) { # spent 3µs making 1 call to PPI::Tokenizer::_get_line
212	6	900ns			if ( $line eq $terminator ) {
213					# Keep the actual termination line for consistency
214					# when we are re-assembling the file
215	1	600ns			$token->{_terminator_line} = $line;
216
217					# The HereDoc is now fully parsed
218	1	6µs	2	5µs	return $t->_finalize_token->__TOKENIZER__on_char( $t ); # spent 2µs making 1 call to PPI::Token::Whitespace::__TOKENIZER__on_char # spent 2µs making 1 call to PPI::Tokenizer::_finalize_token
219					}
220
221					# Add the line
222	5	9µs	5	10µs	push @{$token->{_heredoc}}, $line; # spent 10µs making 5 calls to PPI::Tokenizer::_get_line, avg 2µs/call
223					}
224
225					# End of file.
226					# Error: Didn't reach end of here-doc before end of file.
227					# $line might be undef if we get NO lines.
228					if ( defined $line and $line eq $token->{_terminator} ) {
229					# If the last line matches the terminator
230					# but is missing the newline, we want to allow
231					# it anyway (like perl itself does). In this case
232					# perl would normally throw a warning, but we
233					# ignore that as well.
234					pop @{$token->{_heredoc}};
235					$token->{_terminator_line} = $line;
236					} else {
237					# The HereDoc was not properly terminated.
238					$token->{_terminator_line} = undef;
239
240					# Trim off the trailing whitespace
241					if ( defined $token->{_heredoc}->[-1] and $t->{source_eof_chop} ) {
242					chop $token->{_heredoc}->[-1];
243					$t->{source_eof_chop} = '';
244					}
245					}
246
247					# Set a hint for PPI::Document->serialize so it can
248					# inexpensively repair it if needed when writing back out.
249					$token->{_damaged} = 1;
250
251					# The HereDoc is not fully parsed
252					$t->_finalize_token->__TOKENIZER__on_char( $t );
253					}
254
255	1	3µs			1;
256
257					=pod
258
259					=head1 TO DO
260
261					- Implement PPI::Token::Quote interface compatibility
262
263					- Check CPAN for any use of the null here-doc or here-doc-in-s///e
264
265					- Add support for the null here-doc
266
267					- Add support for here-doc in s///e
268
269					=head1 SUPPORT
270
271					See the L<support section|PPI/SUPPORT> in the main module.
272
273					=head1 AUTHOR
274
275					Adam Kennedy E<lt>adamk@cpan.orgE<gt>
276
277					=head1 COPYRIGHT
278
279					Copyright 2001 - 2011 Adam Kennedy.
280
281					This program is free software; you can redistribute
282					it and/or modify it under the same terms as Perl itself.
283
284					The full text of the license can be found in the
285					LICENSE file included with this module.
286
287					=cut
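
The `heredoc` method in the listing uses `wantarray` to return either the lines or their count. The sketch below shows the same context-sensitive pattern in isolation. `My::HereDoc` is a hypothetical stand-in class, not part of PPI; only the `_heredoc` key and the accessor logic mirror the listing.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; only the heredoc accessor mirrors the listing.
package My::HereDoc;

sub new {
    my ( $class, @lines ) = @_;
    return bless { _heredoc => [@lines] }, $class;
}

# List context: the lines themselves. Scalar context: the line count.
sub heredoc {
    my $self = shift;
    return wantarray ? @{ $self->{_heredoc} } : scalar @{ $self->{_heredoc} };
}

package main;

my $token = My::HereDoc->new( "first\n", "second\n" );
my @lines = $token->heredoc;    # list context
my $count = $token->heredoc;    # scalar context
print "count=$count, first line: $lines[0]";
```

Because the terminator line is stored separately (in `_terminator_line` in the listing), it is counted in neither context.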

					# spent 5µs within PPI::Token::HereDoc::CORE:match which was called 3 times, avg 2µs/call: # 2 times (3µs+0s) by PPI::Token::HereDoc::__TOKENIZER__on_char at line 171, avg 1µs/call # once (2µs+0s) by PPI::Token::HereDoc::__TOKENIZER__on_char at line 156 sub PPI::Token::HereDoc::CORE:match; # opcode
					# spent 1µs within PPI::Token::HereDoc::CORE:subst which was called: # once (1µs+0s) by PPI::Token::HereDoc::__TOKENIZER__on_char at line 180 sub PPI::Token::HereDoc::CORE:subst; # opcode
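
The tokenizer logic in the listing has two steps: classify the declaration (`<<FOO`, `<< 'FOO'`, `<<\FOO`, and so on), then pull lines until the terminator line is reached. The following is a simplified, standalone sketch of those two steps in core Perl. The function names `classify_heredoc` and `read_heredoc_body` are invented for illustration, the sketch works on a plain list of lines rather than a PPI::Tokenizer object, and it does not unescape quotes inside the terminator.

```perl
use strict;
use warnings;

# Classify a here-doc declaration such as <<FOO, << 'FOO' or <<\FOO.
# Returns (mode, terminator), or an empty list if nothing matches.
sub classify_heredoc {
    my ($decl) = @_;
    return ( 'interpolate', $1 ) if $decl =~ /^<<(\w+)$/;         # bareword
    return ( 'literal',     $1 ) if $decl =~ /^<<\s*'(.*)'$/;     # ''-quoted
    return ( 'interpolate', $1 ) if $decl =~ /^<<\s*"(.*)"$/;     # ""-quoted
    return ( 'command',     $1 ) if $decl =~ /^<<\s*`(.*)`$/;     # ``-quoted
    return ( 'literal',     $1 ) if $decl =~ /^<<\\(\w+)$/;       # backslashed
    return;
}

# Collect body lines up to the terminator line, loosely mirroring the
# loop in __TOKENIZER__on_char. Returns (\@body, $terminated).
sub read_heredoc_body {
    my ( $terminator, @lines ) = @_;
    my @body;
    for my $line (@lines) {
        return ( \@body, 1 ) if $line eq "$terminator\n";
        push @body, $line;
    }

    # Input ran out. Accept a final terminator line that lacks its newline.
    if ( @body and $body[-1] eq $terminator ) {
        pop @body;
        return ( \@body, 1 );
    }
    return ( \@body, 0 );
}

my ( $mode, $term ) = classify_heredoc(q{<< 'END'});
my ( $body, $ok ) = read_heredoc_body( $term, "one\n", "two\n", "END\n", "after\n" );
print "$mode $term: ", scalar(@$body), " lines, terminated=$ok\n";
```

One deliberate simplification: when the last line matches the terminator but lacks a newline, the sketch reports the here-doc as terminated, whereas the listing accepts the line and still sets `_damaged` so that serialization can repair it.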