
PHP namespace removal / mapping and rewriting identifiers

Posted by: admin April 23, 2020

Questions:

I’m attempting to automate the removal of namespaces from a PHP class collection to make them PHP 5.2 compatible. (Shared hosting providers do not fancy rogue PHP 5.3 installations. No idea why. Also the code in question doesn’t use any 5.3 feature additions, just that syntax. Autoconversion seems easier than doing it by hand or reimplementing the codebase.)

For rewriting the *.php scripts I’m basically iterating over the token list produced by the tokenizer. The identifier searching+merging is already complete. But I’m unsure how to accomplish the actual rewriting.
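For readers who have not done the searching+merging step yet, here is a minimal sketch of what it might look like. collect_names() is a hypothetical helper, not code from the question. It glues adjacent T_STRING / T_NS_SEPARATOR tokens together and also accepts the single-token names that PHP 8 emits. It does not filter by context (method names and keywords such as true are collected too) and it ignores the namespace keyword.

```php
<?php
// Hypothetical helper: collect identifier candidates from PHP source code.
function collect_names($source)
{
    // Token types that can be part of a (possibly namespaced) name.
    $nameTokens = array(T_STRING, T_NS_SEPARATOR);
    // PHP 8 tokenizes whole names as one token; add those ids when present.
    foreach (array('T_NAME_QUALIFIED', 'T_NAME_FULLY_QUALIFIED') as $const) {
        if (defined($const)) {
            $nameTokens[] = constant($const);
        }
    }

    $names = array();
    $current = '';
    foreach (token_get_all($source) as $token) {
        if (is_array($token) && in_array($token[0], $nameTokens, true)) {
            $current .= $token[1];      // still inside one name
            continue;
        }
        if ($current !== '') {          // any other token ends the name
            $names[] = $current;
            $current = '';
        }
    }
    if ($current !== '') {
        $names[] = $current;
    }
    return $names;
}
```

Each collected name would then be handed to rewrite() together with the namespace and use list that were active at that point in the file.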

function rewrite($name, $namespace, $use) {

    global $identifiers2;            // list of known/existing classes

    /*
        bounty on missing code here
    */

    return strtr($name, "\\", "_");  // goal: backslash to underscore
}

That function is going to be invoked on each found identifier (whether class, function or const). It will receive some context information to transform a local identifier into an absolute/global $name:

$name =
    rewrite(
        "classfuncconst",      # <-- foreach ($names as $name)
        'current\name\space',  # single quotes: "\n" would become a newline
        array(
           'namespc' => 'use\this\namespc',
           'alias' => 'from\name\too',
           ...
        )
    );

At this stage I’ve already prepared an $identifiers2 list. It contains a list of all known classes, functions and constant names (merged for simplicity here).

$identifiers2 = array(             // Alternative suggestions welcome.
   'name\space\Class' => 'Class',  // - list structure usable for task?
   'other\ns\func1' => 'func1',    // - local name aliases helpful?
   'blip\CONST' => 'CONST',        // - (ignore case-insensitivity)
);

The $name parameter as received by the rewrite() function can be a local (unqualified), \absolute or name\spaced identifier (but just identifiers, no expressions). The $identifiers2 list is crucial for resolving unqualified identifiers, which can refer to things in the current namespace or, if not found there, to global stuff.
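To make the three shapes concrete, a small classifier could look like the sketch below. identifier_kind() and its return labels are hypothetical names chosen for this illustration.

```php
<?php
// Hypothetical helper: tell the three identifier shapes apart.
function identifier_kind($name)
{
    if (substr($name, 0, 1) === '\\') {
        return 'absolute';      // \absolute\Name: leading backslash
    }
    if (strpos($name, '\\') !== false) {
        return 'qualified';     // name\spaced: backslash somewhere inside
    }
    return 'unqualified';       // local name without any backslash
}
```

Only the unqualified case needs the $identifiers2 lookup; the other two can be resolved from the current namespace and the use list alone.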

The various use aliases also have to be taken into account, which adds some complication on top of the namespace resolution and precedence rules.

So, how / in which order would you attempt to convert the variations of class/function names here?

Mental Laziness Bounty.

To make this a less blatant plzsendtehcodez question: an explanatory instruction list or pseudo-code answer would be eligible too. And if another approach would be more suitable for the task, please elaborate on that instead. (But no, upgrading PHP or changing the hoster is not an option.)

I think I’ve figured it out meanwhile, but the question is still open for answers / implementation proposals. (Otherwise the bounty will obviously go to nikic.)

Answers:

In an existing question on migrating namespaces to pseudo-namespaced code I already introduced a conversion tool I have written as part of a larger project. I haven’t maintained this project since then, but as far as I remember the namespace replacements did work. (I may reimplement this project using a proper parser at some point. Working with plain tokens has proven to be quite a tedious task.)

You will find my implementation of namespace -> pseudo-namespace resolution in namespace.php. I based the implementation on the namespace resolution rules, which will probably be of help to you, too.

To make this a less blatant readmycodez answer, here are the basic steps the code performs:

  1. Get the identifier to be resolved and ensure that it is not a class, interface, function or constant declaration (these are resolved in registerClass and registerOther by simply prepending the current namespace with ns separators replaced by underscores).
  2. Determine what type of identifier it is: A class, a function or a constant. (As these need different resolution.)
  3. Make sure we do not resolve the self and parent classes, nor the true, false and null constants.
  4. Resolve aliases (use list):
    1. If the identifier is qualified, get the part before the first namespace separator and check whether an alias with that name exists. If it does, replace the first part with the aliased namespace (now the identifier will be fully qualified). Otherwise prepend the current namespace.
    2. If the identifier is unqualified and the identifier type is class, check whether the identifier is an alias and, if it is, replace it with the aliased class.
  5. If the identifier is fully qualified now, drop the leading namespace separator, replace all other namespace separators with underscores and end this algorithm.
  6. Otherwise:
    1. If we are in the global namespace, no further resolution is required, thus end this algorithm.
    2. If the identifier type is class, prepend the current namespace, replace all NS separators with underscores and end this algorithm.
    3. Otherwise:
      1. If the function / constant is defined globally leave the identifier as is and end this algorithm. (This assumes that no global functions are redefined in a namespace! In my code I don’t make this assumption, thus I insert dynamic resolution code.)
      2. Otherwise prepend the current namespace and replace all namespace separators with underscores. (There seems to be a bug in my code here: I don’t do this even if the assumeGlobal flag is set. Instead I always insert the dynamic dispatch code.)
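
The steps above could be sketched roughly as follows. This is not the code from namespace.php: rewrite_identifier(), its $type argument and the $known list (shaped like $identifiers2 from the question) are hypothetical names for this illustration. The sketch skips declarations (step 1) and case-insensitive alias matching. For unqualified functions and constants it looks up the namespaced name in $known first and falls back to the global name, instead of inserting dynamic dispatch code, so it assumes that $known is complete.

```php
<?php
/**
 * Hypothetical sketch of the resolution steps listed above.
 *
 * @param string $name      identifier as written in the source
 * @param string $namespace current namespace, '' for the global one
 * @param array  $use       alias => aliased name (no leading backslash)
 * @param string $type      'class', 'function' or 'const'
 * @param array  $known     fully qualified name => short name
 */
function rewrite_identifier($name, $namespace, array $use, $type, array $known)
{
    // Step 3: leave special class names and built-in constants alone.
    $lower = strtolower($name);
    if ($type === 'class' && in_array($lower, array('self', 'parent'), true)) {
        return $name;
    }
    if ($type === 'const' && in_array($lower, array('true', 'false', 'null'), true)) {
        return $name;
    }

    // Step 5: fully qualified names only need the separator swap.
    if (substr($name, 0, 1) === '\\') {
        return strtr(substr($name, 1), '\\', '_');
    }

    $parts = explode('\\', $name);
    if (count($parts) > 1) {
        // Step 4.1: qualified name; the first segment may be an alias.
        if (isset($use[$parts[0]])) {
            $parts[0] = $use[$parts[0]];
        } elseif ($namespace !== '') {
            array_unshift($parts, $namespace);
        }
        return strtr(implode('\\', $parts), '\\', '_');
    }

    // Step 4.2: an unqualified class name may itself be an alias.
    if ($type === 'class' && isset($use[$name])) {
        return strtr($use[$name], '\\', '_');
    }

    // Step 6.1: nothing to resolve in the global namespace.
    if ($namespace === '') {
        return $name;
    }

    // Step 6.2: unqualified classes always belong to the current namespace.
    // Step 6.3: functions and constants only do so when the namespaced
    // name is known; otherwise the global name is kept.
    $local = $namespace . '\\' . $name;
    if ($type === 'class' || isset($known[$local])) {
        return strtr($local, '\\', '_');
    }
    return $name;
}
```

With the use list array('alias' => 'from\name\too') and the current namespace current\ns, alias\Sub would become from_name_too_Sub, while an unknown function such as strlen would be left untouched.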

Additional note: Don’t forget that one can also write namespace\some\ns. I resolve these constructs in the NS function (which is also responsible for finding namespace declarations).
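
As an illustration of that note, a leading namespace keyword can be expanded into a fully qualified name before the usual rewriting takes place. expand_namespace_keyword() is a hypothetical helper for this sketch, not the NS function mentioned above.

```php
<?php
// Hypothetical helper: turn "namespace\some\ns" into a fully qualified name.
function expand_namespace_keyword($name, $namespace)
{
    $prefix = 'namespace\\';
    // The keyword is case-insensitive, so compare without regard to case.
    if (strncasecmp($name, $prefix, strlen($prefix)) !== 0) {
        return $name;                   // not namespace-relative, keep as is
    }
    $rest = substr($name, strlen($prefix));
    if ($namespace === '') {
        return '\\' . $rest;            // relative to the global namespace
    }
    return '\\' . $namespace . '\\' . $rest;
}
```

The result starts with a backslash, so it can go through the fully qualified branch (step 5) without consulting the use list.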