Simple library for regular expressions in PHP.

T-Regx

Latest release: 0.36.0. Dependencies: 0.

T-Regx | Regular Expressions library

PHP regular expressions brought up to modern standards.

See documentation at t-regx.com.

  1. Installation
    • Composer
  2. What T-Regx is and isn't
  3. API
    1. For legacy projects - preg::match_all()
    2. For standard projects - pattern()
  4. Documentation
  5. T-Regx fiddle - Try online
  6. Overview
    1. Seamless migration
    2. Automatic delimiters
    3. Prepared patterns
  7. Comparison
    1. Exceptions over warnings/errors
    2. Working with the developer
    3. Written with clean API in mind
    4. Philosophy of Uncle Bob and "Clean Code"
  8. Plans for the future
  9. Sponsors
  10. License

Installation

Installation for PHP 7.1 and later (PHP 8 as well):

composer require rawr/t-regx

T-Regx requires only the mbstring extension. No additional dependencies or extensions are required.
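
To confirm that an environment meets these requirements before installing, a quick check along the following lines may help. This is a minimal sketch in plain PHP; the function name `missingRequirements` is hypothetical and is not part of T-Regx.

```php
<?php
// Hypothetical helper (not part of T-Regx): lists the requirements
// from the Installation section that the current environment does not meet.
function missingRequirements(): array
{
    $missing = [];
    // PHP 7.1 or later is required.
    if (PHP_VERSION_ID < 70100) {
        $missing[] = 'PHP >= 7.1 (found ' . PHP_VERSION . ')';
    }
    // mbstring is the only required extension.
    if (!extension_loaded('mbstring')) {
        $missing[] = 'ext-mbstring';
    }
    return $missing;
}

$missing = missingRequirements();
echo $missing === [] ? "Ready for T-Regx\n" : 'Missing: ' . implode(', ', $missing) . "\n";
```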

What T-Regx is and isn't

  • T-Regx is not a tool for building regular expressions

    We don't want to build the patterns for you. T-Regx is meant to be used with "raw patterns".

    Tip: for a tool to help you build and understand your patterns, consider using PHPVerbalExpressions:

    $regex = new VerbalExpressions();
    $regex->startOfLine()->then("http")->maybe("s")->then("://")->maybe("www.")->endOfLine();
    
  • T-Regx is a regex solution as it should've been made in PHP

    In our humble opinion, T-Regx is a well-crafted, robust, reliable and predictable tool for using regular expressions in modern applications. It eliminates unknowns and complexity, for the sake of concise code that reveals intentions. It uses numerous performant and lightweight checks and operations to ensure each method does exactly what it's supposed to.

    • T-Regx uses preg_match()/preg_replace() as an engine internally, but doesn't leak the horribleness of their design out.
    • T-Regx is to preg_match(), what PDO was to mysql_query().

    Read more, Scroll to "Overview"...
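
As a concrete illustration of the design issues mentioned above, the sketch below uses only PHP's built-in preg_match() (no T-Regx). It shows that the function has three outcomes (1, 0 and false) and that a loose check cannot tell "no match" apart from "error". The helper name `describeOutcome` is made up for this example.

```php
<?php
// Plain PHP, no T-Regx: preg_match() returns 1 (match), 0 (no match) or false (error).
function describeOutcome(string $pattern, string $subject): string
{
    // "@" silences the E_WARNING that PHP emits for a malformed pattern.
    $result = @preg_match($pattern, $subject);
    if ($result === false) {
        return 'error';
    }
    return $result === 1 ? 'match' : 'no match';
}

echo describeOutcome('/ups/', 'yay, ups'), "\n";  // match
echo describeOutcome('/ups/', 'nothing'), "\n";   // no match
echo describeOutcome('/?ups/', 'ups'), "\n";      // error

// A loose check treats "no match" and "error" the same way:
var_dump(!@preg_match('/ups/', 'nothing'));  // bool(true)
var_dump(!@preg_match('/?ups/', 'ups'));     // bool(true)
```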

API

You choose the interface:

  • I choose to keep PHP methods (but protected from errors/warnings):

    Scroll to see - preg::match_all(), preg::replace_callback(), preg::split()

  • I choose the modern regex API:

    Scroll to see - pattern()->test(), pattern()->match(), pattern()->replace()

For legacy projects, we suggest preg::match_all(). For standard projects, we suggest pattern().

  • Legacy API

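    use TRegx\SafeRegex\preg;

    try {
        // '/?ups/' is deliberately malformed: a quantifier with nothing to repeat
        preg::match_all('/?ups/', 'ups', $match, PREG_PATTERN_ORDER);
        echo $match[0][0];
    } catch (\TRegx\Exception\MalformedPatternException $exception) {
        echo "Invalid pattern";
    }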
    
  • Standard T-Regx

    use TRegx\CleanRegex\Pattern;

    $pattern = Pattern::of("ups"); // pattern("ups") also works
    $match = $pattern->match('yay, ups');

    if (!$match->test()) {
      echo "Unmatched subject :/";
    }

    foreach ($match as $detail) {
      $detail->text();    // (string) "ups"
      $detail->offset();  // (int) 5
    }

    $pattern->replace('well, ups')->with('heck'); // (string) "well, heck"
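
The legacy-style call above keeps the shape of preg_match_all() but reports failure through an exception. The general technique can be sketched in plain PHP as follows. This is an illustration only and makes no claim about how T-Regx is implemented; `matchOrThrow` and `PatternFailedException` are hypothetical names.

```php
<?php
// Hypothetical names for illustration; these are not T-Regx classes or functions.
class PatternFailedException extends RuntimeException {}

function matchOrThrow(string $pattern, string $subject): array
{
    error_clear_last();
    // "@" hides the native warning; the failure is reported by exception instead.
    $result = @preg_match_all($pattern, $subject, $matches, PREG_PATTERN_ORDER);
    if ($result === false) {
        $last = error_get_last();
        // Compilation failures leave a warning message; runtime failures only set an error code.
        $reason = $last !== null ? $last['message'] : 'preg_last_error() code ' . preg_last_error();
        throw new PatternFailedException($reason);
    }
    return $matches;
}

try {
    matchOrThrow('/?ups/', 'ups');
} catch (PatternFailedException $exception) {
    echo "Invalid pattern\n";
}

echo matchOrThrow('/ups/', 'yay, ups')[0][0], "\n"; // ups
```

The calling code never has to compare a return value against false, which is the property the legacy API above is built around.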
    

Documentation

Full API documentation is available at t-regx.com. A list of changes is available in ChangeLog.md.

Try it online, in your browser!

Open T-Regx fiddle and start playing around right in your browser. Try now!

Why does T-Regx stand out?

Tip: see the documentation at t-regx.com

  • Seamless migration for legacy projects

    • You can adopt T-Regx's exception-based error handling without changing much of your code. Simply swap preg_match() for preg::match(), and failures will only ever be reported as exceptions. The method won't return null or false, won't issue a warning or a notice, and won't raise a fatal error.
    • Arguments, structure and return types remain the same. Your code will not break.
  • Automatic delimiters for your pattern

    Surrounding slashes or tildes (/pattern/ or ~pattern~) are not compulsory if you use pattern(). The preg::match() and preg::replace() methods still require them, of course, so that you can swap freely between preg::match() and preg_match().

  • Prepared patterns

    Putting user data into a pattern isn't always safe with PCRE (even with preg_quote()), and it isn't very convenient either. T-Regx provides dedicated solutions for building patterns with unsafe user input. Choose Pattern::inject() to simply include user data as literals. Use Pattern::mask() to safely convert user-supplied masks into full-fledged patterns. Use Pattern::template() to construct more complex patterns.

    function makePattern($name): Pattern {
      if ($name === null) {
        return Pattern::of("name[:=]empty");
      }
      return Pattern::inject("name[:=]@", [$name]); // inject $name in place of @
    }
    
    $gibberish = "(my?name)";
    $pattern = makePattern($gibberish);
    
    $pattern->test('name=(my?name)'); // (bool) true
    
  • Exceptions over warnings/errors

    • Unlike PHP's built-in functions, T-Regx doesn't use warnings/notices/errors for unexpected inputs:
      try {
        preg::match_all('/([a3]+[a3]+)+3/', 'aaaaaaaaaaaaaaaaaaaa 3');
      } catch (\TRegx\SafeRegex\Exception\CatastrophicBacktrackingException $exception) {
        // caught
      }
      
    • Detects malformed patterns and throws MalformedPatternException. This is impossible to catch with preg_last_error().
      try {
        preg::match('/?ups/', 'ups');
      } catch (\TRegx\Exception\MalformedPatternException $exception) {
        // caught
      }
      
    • Not every error in PHP can be read from preg_last_error(); T-Regx, however, throws dedicated exceptions for those events.
  • Working with the developer

    • Simple methods
      • T-Regx exposes its functionality through simple methods that return int, string, string[] or bool, none of which is nullable. If you wish to do something with your match or pattern, there's probably a method for that, which does exactly and only that.
    • Handlers and state:
      • T-Regx doesn't touch your error handlers or exception handlers in any way!
      • In fact, T-Regx doesn't touch any global state.
    • Strings:
    • Groups:
      • When using preg::match_all(), we receive an array of arrays of arrays. In contrast, T-Regx returns an array of groups: Group[]. A Group object contains all the information about the group.

      • Group errors:

        • When an invalid group name is used, for example get('!@#'), T-Regx throws \InvalidArgumentException.
        • When you attempt to read a missing group, T-Regx throws NonexistentGroupException.
        • When reading a group that happens not to be matched, T-Regx throws GroupNotMatchedException.
  • Written with clean API

    • Descriptive, simple interface
    • UTF-8 support out-of-the-box
    • No Reflection used, No (...varargs), No (boolean arguments, true), (No flags, 1), [No [nested, [arrays]]]
    • Inconsistencies between PHP versions are eliminated in T-Regx
  • Protects you from fatal errors

    Certain arguments cause fatal errors with the preg_*() functions, which terminate the application and can't be caught. T-Regx predicts whether a given argument would cause a fatal error and throws a catchable exception instead.

  • T-Regx follows the philosophy of Uncle Bob and "Clean Code"

    A function should do one thing, and it should do it well. A function should do exactly what you expect it to do.
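
The prepared-patterns point above notes that preg_quote() alone is not always enough for user data. One well-known reason can be shown in plain PHP (no T-Regx): unless the delimiter is passed as the second argument, preg_quote() leaves it unescaped. The variable names are illustrative.

```php
<?php
// Plain PHP, no T-Regx: preg_quote() escapes the delimiter only when told which one is in use.
$input = 'a/b';

$unsafe = '/^' . preg_quote($input) . '$/';       // "/^a/b$/": the pattern ends early at the second slash
$safe   = '/^' . preg_quote($input, '/') . '$/';  // "/^a\/b$/"

var_dump(@preg_match($unsafe, $input)); // bool(false): "b" is parsed as an unknown modifier
var_dump(preg_match($safe, $input));    // int(1)

// Regex metacharacters in user input are escaped as expected:
echo preg_quote('(my?name)'), "\n";     // \(my\?name\)
```

Forgetting the second argument is easy to do and fails only for inputs that happen to contain the delimiter, which is the kind of trap dedicated pattern builders are meant to remove.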

What's better

[Image: "Ugly api" example] or [Image: "Pretty api" example]

Current work in progress

Current development priorities, regarding release of 1.0:

  • Separate SafeRegex and CleanRegex into two packages, so users can choose what they want #103
  • Add documentation to each T-Regx public method #17 [in progress]
  • Release 1.0
  • Revamp of t-regx.com documentation [in progress]

Sponsors

T-Regx is developed thanks to

JetBrains

License

T-Regx is MIT licensed.