processwire-issues
processwire-issues copied to clipboard
$files->getNamespace() incorrect detection of commented namespace declaration
Short description of the issue
<?php //namespace ProcessWire;
or
<?php
//namespace ProcessWire;
still gets detected as having the ProcessWire namespace
Expected behavior
Commented namespaces should be ignored, just as this version is:
<?php /*namespace ProcessWire;*/
Actual behavior
Method returns ProcessWire namespace.
Optional: Suggestion for a possible fix
Not suggesting this is the right approach to fix, but the old regex from here (https://github.com/processwire/processwire/blob/3cc76cc886a49313b4bfb9a1a904bd88d11b7cb7/wire/core/WireFileTools.php#L1718) does seem to handle this scenario.
Steps to reproduce the issue
- Create file with a commented namespace as described above
- Call
$files->getNamespace()on that file
Thanks @adrianbj I've pushed a fix for this
Thanks @ryancramerdesign - that does seem to solve the problem, but I do wonder about the complexity of this code vs going with token_get_all - I know I have mentioned this to you before, but it really is a cool function.
How about something like this which in my tests seems just as fast, if not faster. I have used token_get_all a lot and find it much cleaner than regexes for these sorts of tasks.
function get_namespace($file) {
$src = file_get_contents($file);
$tokens = token_get_all($src);
$count = count($tokens);
$i = 0;
$namespace = '';
$namespace_ok = false;
while ($i < $count) {
$token = $tokens[$i];
if (is_array($token) && $token[0] === T_NAMESPACE) {
// Found namespace declaration
while (++$i < $count) {
if ($tokens[$i] === ';') {
$namespace_ok = true;
$namespace = trim($namespace);
break;
}
$namespace .= is_array($tokens[$i]) ? $tokens[$i][1] : $tokens[$i];
}
break;
}
$i++;
}
if (!$namespace_ok) {
return '\\';
} else {
return $namespace;
}
}
@adrianbj The code might look long, but that's just covering all of the fallback and unusual cases. 99% of the time, the namespace is identified just in the first few lines, making it nice and fast.
The token method does look nice and simple too, but I think the downside is that it has to load all the tokens for the entire file just to identify something that would most likely be in the first line. Still, if I had to refactor the namespace method later, then it might make sense to replace some of the later fallback conditions with the token method you've described.
@ryancramerdesign - yes of course - sorry, I thought those initial quick checks would still remain before the token code I provided kicked in, like this:
public function get_namespace($file, $fileIsContents = false) {
$namespace = "\\"; // root namespace, if no namespace found
if($fileIsContents) {
$data = trim($file);
} else {
$data = file_get_contents($file);
if($data === false) return $namespace;
$data = trim($data);
}
// if there's no "namespace" keyword in the file, it's not declaring one
$namespacePos = strpos($data, 'namespace');
if($namespacePos === false) return $namespace;
// quick optimization for common ProcessWire namespace usage
if(strpos($data, '<' . '?php namespace ProcessWire;') === 0) return 'ProcessWire';
// if file doesn't start with an opening PHP tag, then it's not going to have a namespace declaration
$phpOpen = strpos($data, '<' . '?');
if($phpOpen !== 0) {
// file does not begin with opening php tag
// note: this fails for PHP files executable on their own (like shell scripts)
return $namespace;
}
$tokens = token_get_all($data);
$count = count($tokens);
$i = 0;
$namespace = '';
$namespace_ok = false;
while ($i < $count) {
$token = $tokens[$i];
if (is_array($token) && $token[0] === T_NAMESPACE) {
// Found namespace declaration
while (++$i < $count) {
if ($tokens[$i] === ';') {
$namespace_ok = true;
$namespace = trim($namespace);
break;
}
$namespace .= is_array($tokens[$i]) ? $tokens[$i][1] : $tokens[$i];
}
break;
}
$i++;
}
if (!$namespace_ok) {
return '\\';
} else {
return $namespace;
}
}