Commit 8eb94c6

jrfnl authored and gsherwood committed
PHP 8.0 | Tokenizer/PHP: stabilize comment tokenization
As described in issue 3002, in PHP 8 a trailing new line is no longer included in a `T_COMMENT` token.

This commit "forward-fills" the PHP 5/7 tokenization of `T_COMMENT` tokens to PHP 8.

Includes extensive unit tests. I'm hoping to have caught everything affected :fingers_crossed:

The initial set of unit tests, `StableCommentWhitespaceTest`, uses Linux line endings (`\n`). The secondary set, `StableCommentWhitespaceWinTest`, uses Windows line endings (`\r\n`) to test that the fix is stable for files using different line endings.

For the tests with Windows line endings, both the test case file and the actual test file have been set up to use Windows line endings for all lines, not just the test data lines, to make the line endings of those files simpler to manage.

The test file has been excluded from the line endings CS check for that reason, and a directive has been added to the `.gitattributes` file to safeguard that the line endings of those files will remain Windows line endings.

Fixes 3002
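The tokenization difference described above can be observed with PHP's built-in `token_get_all()`. The following is a minimal sketch and not part of the commit; the helper name `dumpTokens` is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper (not part of the commit): list token names and contents.
function dumpTokens(string $code): array
{
    $result = [];
    foreach (token_get_all($code) as $token) {
        if (is_array($token) === true) {
            // Array tokens have the shape [id, content, line].
            $result[] = [token_name($token[0]), $token[1]];
        } else {
            // Single-character tokens such as ';' are plain strings.
            $result[] = ['CHAR', $token];
        }
    }

    return $result;
}

foreach (dumpTokens("<?php\n// Comment\necho 'a';\n") as [$name, $content]) {
    echo str_pad($name, 16), json_encode($content), PHP_EOL;
}

// On PHP 7, the T_COMMENT content is "// Comment\n".
// On PHP 8, it is "// Comment", followed by a T_WHITESPACE token holding "\n".
```

Running the same script under both PHP versions shows where the new line ends up, which is the discrepancy this commit smooths over.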
1 parent 40e71ae commit 8eb94c6

7 files changed: +1432 -1 lines changed

.gitattributes

Lines changed: 4 additions & 0 deletions
@@ -12,3 +12,7 @@
 phpunit.xml.dist export-ignore
 php5-testingConfig.ini export-ignore
 php7-testingConfig.ini export-ignore
+
+# Declare files that should always have CRLF line endings on checkout.
+*WinTest.inc text eol=crlf
+*WinTest.php text eol=crlf
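
The `eol=crlf` attribute is meant to keep these files on Windows line endings after checkout. A quick way to confirm that a checked-out file really uses CRLF throughout could look like the sketch below; the helper name `usesOnlyCrlf` is hypothetical and not part of the commit.

```php
<?php
// Hypothetical helper (not part of the commit): returns true when every
// line ending in $contents is CRLF.
function usesOnlyCrlf(string $contents): bool
{
    // A "\n" not preceded by "\r", or a "\r" not followed by "\n",
    // indicates mixed or non-Windows line endings.
    return preg_match('/(?<!\r)\n|\r(?!\n)/', $contents) === 0;
}

var_dump(usesOnlyCrlf("<?php\r\n// Comment\r\n")); // bool(true)
var_dump(usesOnlyCrlf("<?php\n// Comment\r\n"));   // bool(false)
```

In practice the input would come from `file_get_contents()` on one of the `*WinTest` files.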

phpcs.xml.dist

Lines changed: 6 additions & 1 deletion
@@ -145,7 +145,12 @@
 
     <!-- The testing bootstrap file uses string concats to stop IDEs seeing the class aliases -->
     <rule ref="Generic.Strings.UnnecessaryStringConcat">
-        <exclude-pattern>tests/bootstrap.php</exclude-pattern>
+        <exclude-pattern>tests/bootstrap\.php</exclude-pattern>
+    </rule>
+
+    <!-- This test file specifically *needs* Windows line endings for testing purposes. -->
+    <rule ref="Generic.Files.LineEndings.InvalidEOLChar">
+        <exclude-pattern>tests/Core/Tokenizer/StableCommentWhitespaceWinTest\.php</exclude-pattern>
     </rule>
 
 </ruleset>
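
The switch from `bootstrap.php` to `bootstrap\.php` matters because exclude patterns are treated as regular expressions, where an unescaped `.` matches any character. The sketch below illustrates the difference with plain `preg_match()`; the helper `matchesPattern` and the `#` delimiter are assumptions made for the example, not PHP_CodeSniffer's own code.

```php
<?php
// Hypothetical helper (not PHP_CodeSniffer's own code): test a path against
// an exclude pattern by treating the pattern as a regular expression.
function matchesPattern(string $pattern, string $path): bool
{
    // '#' is used as the delimiter because the patterns contain '/'.
    return preg_match('#' . $pattern . '#', $path) === 1;
}

// Unescaped dot: matches any character, so an unrelated path also matches.
var_dump(matchesPattern('tests/bootstrap.php', 'tests/bootstrap-php'));  // bool(true)

// Escaped dot: only a literal '.' matches.
var_dump(matchesPattern('tests/bootstrap\.php', 'tests/bootstrap-php')); // bool(false)
var_dump(matchesPattern('tests/bootstrap\.php', 'tests/bootstrap.php')); // bool(true)
```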

src/Tokenizers/PHP.php

Lines changed: 41 additions & 0 deletions
@@ -568,6 +568,47 @@ protected function tokenize($string)
                 continue;
             }
 
+            /*
+                PHP 8 tokenizes a new line after a slash comment to the next whitespace token.
+            */
+
+            if (PHP_VERSION_ID >= 80000
+                && $tokenIsArray === true
+                && ($token[0] === T_COMMENT && strpos($token[1], '//') === 0)
+                && isset($tokens[($stackPtr + 1)]) === true
+                && is_array($tokens[($stackPtr + 1)]) === true
+                && $tokens[($stackPtr + 1)][0] === T_WHITESPACE
+            ) {
+                $nextToken = $tokens[($stackPtr + 1)];
+
+                // If the next token is a single new line, merge it into the comment token
+                // and set it up to be skipped.
+                if ($nextToken[1] === "\n" || $nextToken[1] === "\r\n" || $nextToken[1] === "\n\r") {
+                    $token[1] .= $nextToken[1];
+                    $tokens[($stackPtr + 1)] = null;
+
+                    if (PHP_CODESNIFFER_VERBOSITY > 1) {
+                        Common::printStatusMessage("* merged newline after comment into comment token $stackPtr", 2);
+                    }
+                } else {
+                    // This may be a whitespace token consisting of multiple new lines.
+                    if (strpos($nextToken[1], "\r\n") === 0) {
+                        $token[1] .= "\r\n";
+                        $tokens[($stackPtr + 1)][1] = substr($nextToken[1], 2);
+                    } else if (strpos($nextToken[1], "\n\r") === 0) {
+                        $token[1] .= "\n\r";
+                        $tokens[($stackPtr + 1)][1] = substr($nextToken[1], 2);
+                    } else if (strpos($nextToken[1], "\n") === 0) {
+                        $token[1] .= "\n";
+                        $tokens[($stackPtr + 1)][1] = substr($nextToken[1], 1);
+                    }
+
+                    if (PHP_CODESNIFFER_VERBOSITY > 1) {
+                        Common::printStatusMessage("* stripped first newline after comment and added it to comment token $stackPtr", 2);
+                    }
+                }//end if
+            }//end if
+
             /*
                 If this is a double quoted string, PHP will tokenize the whole
                 thing which causes problems with the scope map when braces are
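
To make the merge step in the hunk above easier to follow in isolation, here is a standalone sketch that applies the same idea to a plain token array. It is not the commit's code: the function name `mergeNewlineIntoSlashComments` is hypothetical, and instead of checking `PHP_VERSION_ID` it skips comments that already end in a new line, which is an assumption made so the example behaves the same on PHP 7 and PHP 8.

```php
<?php
// Hypothetical standalone version of the merge step (not the commit's code):
// moves the first new line after a "//" comment back into the comment token.
function mergeNewlineIntoSlashComments(array $tokens): array
{
    $merged = [];
    $count  = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        if ($token === null) {
            // This token was fully consumed by a previous merge; skip it.
            continue;
        }

        if (is_array($token) === true
            && $token[0] === T_COMMENT
            && strpos($token[1], '//') === 0
            // Assumption: a comment already ending in a new line needs no fix.
            && rtrim($token[1], "\r\n") === $token[1]
            && isset($tokens[($i + 1)]) === true
            && is_array($tokens[($i + 1)]) === true
            && $tokens[($i + 1)][0] === T_WHITESPACE
            && preg_match('/^(\r\n|\n\r|\n)/', $tokens[($i + 1)][1], $match) === 1
        ) {
            // Append the first new line sequence to the comment.
            $token[1] .= $match[1];

            // Keep whatever whitespace remains; drop the token when nothing is left.
            $rest = substr($tokens[($i + 1)][1], strlen($match[1]));
            if ($rest === '' || $rest === false) {
                $tokens[($i + 1)] = null;
            } else {
                $tokens[($i + 1)][1] = $rest;
            }
        }

        $merged[] = $token;
    }

    return $merged;
}

$tokens = mergeNewlineIntoSlashComments(token_get_all("<?php\n// Comment\n\n\$a = 1;\n"));
var_dump($tokens[1][1] === "// Comment\n"); // bool(true) on PHP 7 and PHP 8
```

The two branches in the commit (whitespace that is exactly one new line versus whitespace that merely starts with one) collapse here into a single regular expression plus a check on the remainder.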
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+<?php
+
+/* testSingleLineSlashComment */
+// Comment
+
+/* testSingleLineSlashCommentTrailing */
+echo 'a'; // Comment
+
+/* testSingleLineSlashAnnotation */
+// phpcs:disable Stnd.Cat
+
+/* testMultiLineSlashComment */
+// Comment1
+// Comment2
+// Comment3
+
+/* testMultiLineSlashCommentWithIndent */
+    // Comment1
+    // Comment2
+    // Comment3
+
+/* testMultiLineSlashCommentWithAnnotationStart */
+// phpcs:ignore Stnd.Cat
+// Comment2
+// Comment3
+
+/* testMultiLineSlashCommentWithAnnotationMiddle */
+// Comment1
+// @phpcs:ignore Stnd.Cat
+// Comment3
+
+/* testMultiLineSlashCommentWithAnnotationEnd */
+// Comment1
+// Comment2
+// phpcs:ignore Stnd.Cat
+
+
+/* testSingleLineStarComment */
+/* Single line star comment */
+
+/* testSingleLineStarCommentTrailing */
+echo 'a'; /* Comment */
+
+/* testSingleLineStarAnnotation */
+/* phpcs:ignore Stnd.Cat */
+
+/* testMultiLineStarComment */
+/* Comment1
+ * Comment2
+ * Comment3 */
+
+/* testMultiLineStarCommentWithIndent */
+    /* Comment1
+     * Comment2
+     * Comment3 */
+
+/* testMultiLineStarCommentWithAnnotationStart */
+/* @phpcs:ignore Stnd.Cat
+ * Comment2
+ * Comment3 */
+
+/* testMultiLineStarCommentWithAnnotationMiddle */
+/* Comment1
+ * phpcs:ignore Stnd.Cat
+ * Comment3 */
+
+/* testMultiLineStarCommentWithAnnotationEnd */
+/* Comment1
+ * Comment2
+ * phpcs:ignore Stnd.Cat */
+
+
+/* testSingleLineDocblockComment */
+/** Comment */
+
+/* testSingleLineDocblockCommentTrailing */
+$prop = 123; /** Comment */
+
+/* testSingleLineDocblockAnnotation */
+/** phpcs:ignore Stnd.Cat.Sniff */
+
+/* testMultiLineDocblockComment */
+/**
+ * Comment1
+ * Comment2
+ *
+ * @tag Comment
+ */
+
+/* testMultiLineDocblockCommentWithIndent */
+    /**
+     * Comment1
+     * Comment2
+     *
+     * @tag Comment
+     */
+
+/* testMultiLineDocblockCommentWithAnnotation */
+/**
+ * Comment
+ *
+ * phpcs:ignore Stnd.Cat
+ * @tag Comment
+ */
+
+/* testMultiLineDocblockCommentWithTagAnnotation */
+/**
+ * Comment
+ *
+ * @phpcs:ignore Stnd.Cat
+ * @tag Comment
+ */
+
+/* testSingleLineSlashCommentNoNewLineAtEnd */
+// Slash ?>
+<?php
+
+/* testCommentAtEndOfFile */
+/* Comment
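
The `testSingleLineSlashCommentNoNewLineAtEnd` case covers a slash comment that is cut short by a PHP closing tag, so there is no new line for the tokenizer fix to merge. A minimal sketch of that behaviour, assuming nothing beyond PHP's built-in `token_get_all()` (the helper name `commentTokenContents` is hypothetical):

```php
<?php
// Hypothetical helper (not part of the commit): collect the contents of
// all T_COMMENT tokens found in $code.
function commentTokenContents(string $code): array
{
    $contents = [];
    foreach (token_get_all($code) as $token) {
        if (is_array($token) === true && $token[0] === T_COMMENT) {
            $contents[] = $token[1];
        }
    }

    return $contents;
}

// A slash comment stops right before the closing tag, and the new line
// after the closing tag belongs to the T_CLOSE_TAG token instead.
$code = "<?php\n// Slash ?>\n<?php\n";
var_dump(commentTokenContents($code) === ['// Slash ']); // bool(true)
```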
