processwire-issues icon indicating copy to clipboard operation
processwire-issues copied to clipboard

$files->getNamespace() incorrect detection of commented namespace declaration

Open adrianbj opened this issue 1 year ago • 4 comments

Short description of the issue

<?php //namespace ProcessWire;

or

<?php 
//namespace ProcessWire;

still gets detected as having the ProcessWire namespace

Expected behavior

Commented namespaces should be ignored, just as this version is:

<?php /*namespace ProcessWire;*/

Actual behavior

Method returns ProcessWire namespace.

Optional: Suggestion for a possible fix

Not suggesting this is the right approach to fix, but the old regex from here (https://github.com/processwire/processwire/blob/3cc76cc886a49313b4bfb9a1a904bd88d11b7cb7/wire/core/WireFileTools.php#L1718) does seem to handle this scenario.

Steps to reproduce the issue

  1. Create file with a commented namespace as described above
  2. Call $files->getNamespace() on that file

adrianbj avatar Feb 10 '24 18:02 adrianbj

Thanks @adrianbj I've pushed a fix for this

ryancramerdesign avatar Feb 21 '24 16:02 ryancramerdesign

Thanks @ryancramerdesign - that does seem to solve the problem, but I do wonder about the complexity of this code vs going with token_get_all - I know I have mentioned this to you before, but it really is a cool function.

How about something like this which in my tests seems just as fast, if not faster. I have used token_get_all a lot and find it much cleaner than regexes for these sorts of tasks.

function get_namespace($file) {
    $src = file_get_contents($file);
	$tokens = token_get_all($src);
	$count = count($tokens);
	$i = 0;
	$namespace = '';
	$namespace_ok = false;
	while ($i < $count) {
		$token = $tokens[$i];
		if (is_array($token) && $token[0] === T_NAMESPACE) {
			// Found namespace declaration
			while (++$i < $count) {
				if ($tokens[$i] === ';') {
					$namespace_ok = true;
					$namespace = trim($namespace);
					break;
				}
				$namespace .= is_array($tokens[$i]) ? $tokens[$i][1] : $tokens[$i];
			}
			break;
		}
		$i++;
	}

	if (!$namespace_ok) {
		return '\\';
	} else {
		return $namespace;
	}
}

adrianbj avatar Feb 21 '24 16:02 adrianbj

@adrianbj The code might look long, but that's just covering all of the fallback and unusual cases. 99% of the time, the namespace is identified just in the first few lines, making it nice and fast.

The token method does look nice and simple too, but I think the downside is that it has to load all the tokens for the entire file just to identify something that would most likely be in the first line. Still, if I had to refactor the namespace method later, then it might make sense to replace some of the later fallback conditions with the token method you've described.

ryancramerdesign avatar Mar 28 '24 15:03 ryancramerdesign

@ryancramerdesign - yes of course - sorry, I thought those initial quick checks would still remain before the token code I provided kicked in, like this:

public function get_namespace($file, $fileIsContents = false) {
    
	$namespace = "\\"; // root namespace, if no namespace found
	
	if($fileIsContents) {
		$data = trim($file);
	} else {
		$data = file_get_contents($file);
		if($data === false) return $namespace;
		$data = trim($data);
	}

	// if there's no "namespace" keyword in the file, it's not declaring one
	$namespacePos = strpos($data, 'namespace');
	if($namespacePos === false) return $namespace;

	// quick optimization for common ProcessWire namespace usage
	if(strpos($data, '<' . '?php namespace ProcessWire;') === 0) return 'ProcessWire';

	// if file doesn't start with an opening PHP tag, then it's not going to have a namespace declaration
	$phpOpen = strpos($data, '<' . '?');
	if($phpOpen !== 0) {
		// file does not begin with opening php tag	
		// note: this fails for PHP files executable on their own (like shell scripts)
		return $namespace;
	}    
    
	$tokens = token_get_all($data);
	$count = count($tokens);
	$i = 0;
	$namespace = '';
	$namespace_ok = false;
	while ($i < $count) {
		$token = $tokens[$i];
		if (is_array($token) && $token[0] === T_NAMESPACE) {
			// Found namespace declaration
			while (++$i < $count) {
				if ($tokens[$i] === ';') {
					$namespace_ok = true;
					$namespace = trim($namespace);
					break;
				}
				$namespace .= is_array($tokens[$i]) ? $tokens[$i][1] : $tokens[$i];
			}
			break;
		}
		$i++;
	}

	if (!$namespace_ok) {
		return '\\';
	} else {
		return $namespace;
	}
}

adrianbj avatar Mar 28 '24 16:03 adrianbj