T-Regx
Simple library for regular expressions in PHP.
T-Regx | Regular Expressions library
PHP regular expressions brought up to modern standards.
See documentation at t-regx.com.
- Installation
  - Composer
- What T-Regx is and isn't
- API
  - For legacy projects - preg::match_all()
  - For standard projects - pattern()
- Documentation
- T-Regx fiddle - Try online
- Overview
  - Seamless migration
  - Automatic delimiters
  - Prepared patterns
- Comparison
  - Exceptions over warnings/errors
  - Working with the developer
  - Written with clean API in mind
  - Philosophy of Uncle Bob and "Clean Code"
- Plans for the future
- Sponsors
- License
Installation
Installation for PHP 7.1 and later (PHP 8 as well):
composer require rawr/t-regx
T-Regx only requires the mbstring extension. No additional dependencies or extensions are required.
What T-Regx is and isn't
- T-Regx is not a tool for building regular expressions
  We don't want to build the patterns for you. T-Regx is meant to be used with "raw patterns".
  :bulb: For a tool to help you build and understand your patterns, consider using PHPVerbalExpressions:
      $regex = new VerbalExpressions();
      $regex->startOfLine()
          ->then("http")
          ->maybe("s")
          ->then("://")
          ->maybe("www.")
          ->endOfLine();
- T-Regx is a regex solution as it should've been made in PHP
  In our humble opinion, T-Regx is a well-crafted, robust, reliable and predictable tool for using regular expressions in modern applications. It eliminates unknowns and complexity for the sake of concise code that reveals its intentions. It uses numerous lightweight, performant checks and operations to ensure each method does exactly what it's supposed to.
  - T-Regx uses preg_match()/preg_replace() as an engine internally, but doesn't leak the horribleness of their design out.
  - T-Regx is to preg_match() what PDO was to mysql_query().
  Read more: scroll to "Overview"...
API
You choose the interface:
- I choose to keep PHP methods (but protected from errors/warnings):
  Scroll to see - preg::match_all(), preg::replace_callback(), preg::split()
- I choose the modern regex API:
  Scroll to see - pattern()->test(), pattern()->match(), pattern()->replace()
For legacy projects, we suggest preg::match_all(). For standard projects, we suggest pattern().
- Legacy API
      try {
          preg::match_all('/?ups/', 'ups', $match, PREG_PATTERN_ORDER);
          echo $match[0][0];
      } catch (\TRegx\Exception\MalformedPatternException $exception) {
          echo "Invalid pattern";
      }
- Standard T-Regx
      $pattern = Pattern::of("ups"); // pattern("ups") also works
      $match = $pattern->match('yay, ups');

      // test() is true when the subject matches, so report the opposite case
      if (!$match->test()) {
          echo "Unmatched subject :/";
      }

      foreach ($match as $detail) {
          $detail->text();   // (string) "ups"
          $detail->offset(); // (int) 5
      }

      $pattern->replace('well, ups')->with('heck'); // (string) "well, heck"
Documentation
Full API documentation is available at t-regx.com. List of changes is available in ChangeLog.md.
Try it online, in your browser!
Open T-Regx fiddle and start playing around right in your browser. Try now!
Why T-Regx stands out?
:bulb: See documentation at t-regx.com
- Seamless migration for legacy projects
  - You can use T-Regx exception-based error handling without changing your API much. Simply swap preg_match() for preg::match(), and the method will only ever throw exceptions. It won't return null or false, or issue a warning or a notice. Nor will it raise a fatal error.
  - Arguments, structure and return types remain the same. Your code will not break.
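To make the migration argument concrete, here is a minimal sketch in plain PHP (no T-Regx involved) of the three outcomes that legacy code calling preg_match() has to tell apart. The patterns and subjects are made up for illustration.

```php
<?php
// preg_match() has three possible outcomes that callers must tell apart.
$matched   = preg_match('/ups/', 'yay, ups'); // 1: the pattern matched
$unmatched = preg_match('/ups/', 'yay');      // 0: valid pattern, no match
$failed    = @preg_match('/?ups/', 'ups');    // false: malformed pattern ("@" only hides the warning)

var_dump($matched, $unmatched, $failed);

// A loose check cannot distinguish "no match" from "the call failed".
if (!$failed) {
    echo "Looks like 'no match', but the pattern never ran\n";
}
```

An API that only returns a result or throws removes the need for the strict `=== false` check after every call.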
- Automatic delimiters for your pattern
  Surrounding slashes or tildes (/pattern/ or ~pattern~) are not compulsory if you use pattern(). The methods preg::match()/preg::replace() of course still require them, so that you can swap between preg::match() and preg_match().
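One plausible way to add delimiters automatically is to pick the first candidate character that does not occur in the pattern. The sketch below is plain PHP and only illustrates the idea; the function name `delimited()` and the candidate list are assumptions, and this is not necessarily how T-Regx does it internally.

```php
<?php
// Hypothetical helper: wraps a raw pattern in the first delimiter it doesn't contain.
function delimited(string $pattern): string
{
    foreach (['/', '#', '%', '~', '!', '@'] as $delimiter) {
        if (strpos($pattern, $delimiter) === false) {
            return $delimiter . $pattern . $delimiter;
        }
    }
    throw new InvalidArgumentException("No free delimiter for pattern: $pattern");
}

echo delimited('ups'), "\n";          // /ups/
echo delimited('https?://\w+'), "\n"; // #https?://\w+#

var_dump(preg_match(delimited('https?://\w+'), 'see http://example')); // int(1)
```

Because the chosen delimiter never appears in the pattern, nothing inside the pattern needs to be escaped on its account.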
- Prepared patterns
  Using user data isn't always safe with PCRE (even with preg_quote()), and it isn't that convenient either. T-Regx provides a dedicated solution for building patterns with unsafe user input. Choose Pattern::inject() for simply including user data as literals. Use Pattern::mask() to convert user-supplied masks into full-fledged patterns, safely. Use Pattern::template() for constructing more complex patterns.
      function makePattern($name): Pattern {
          if ($name === null) {
              return Pattern::of("name[:=]empty");
          }
          return Pattern::inject("name[:=]@;", [$name]); // inject $name as @
      }

      $gibberish = "(my?name)";
      $pattern = makePattern($gibberish);

      $pattern->test('name=(my?name)'); // (bool) true
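As an illustration of why preg_quote() alone can fall short, the plain-PHP sketch below shows one well-known pitfall: without its second argument, preg_quote() does not escape the delimiter. The input string is made up for the example.

```php
<?php
$userInput = 'a/b?';

// Without the second argument, preg_quote() escapes "?" but leaves "/" alone.
$naive = '/' . preg_quote($userInput) . '/';      // "/a/b\?/"
// Passing the delimiter makes preg_quote() escape it as well.
$safe  = '/' . preg_quote($userInput, '/') . '/'; // "/a\/b\?/"

// The naive pattern ends at the second "/", and the rest is parsed as modifiers.
var_dump(@preg_match($naive, 'a/b?')); // bool(false)
var_dump(preg_match($safe, 'a/b?'));   // int(1)
```

The quoting has to know which delimiter the final pattern will use, which is easy to get wrong when patterns are assembled by hand.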
- Exceptions over warnings/errors
  - Unlike PHP methods, T-Regx doesn't use warnings/notices/errors for unexpected inputs:
        try {
            preg::match_all('/([a3]+[a3]+)+3/', 'aaaaaaaaaaaaaaaaaaaa 3');
        } catch (\TRegx\SafeRegex\Exception\CatastrophicBacktrackingException $exception) {
            // caught
        }
  - Detects malformed patterns and throws MalformedPatternException. This is impossible to catch with preg_last_error().
        try {
            preg::match('/?ups/', 'ups');
        } catch (\TRegx\Exception\MalformedPatternException $exception) {
            // caught
        }
  - Not every error in PHP can be read from preg_last_error(); however, T-Regx throws dedicated exceptions for those events.
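The general idea of trading return codes for exceptions can be sketched in plain PHP. The helper `matchOrThrow()` below is hypothetical and far simpler than what T-Regx does (it has no dedicated exception classes, for instance); it only shows the shape of the approach.

```php
<?php
// Hypothetical helper, not part of T-Regx: turns a failed preg_match() into an exception.
function matchOrThrow(string $pattern, string $subject): bool
{
    $result = @preg_match($pattern, $subject); // "@" hides the warning; the failure is reported below
    if ($result === false) {
        throw new RuntimeException(
            "preg_match() failed for $pattern (preg_last_error() code: " . preg_last_error() . ")"
        );
    }
    return $result === 1;
}

try {
    matchOrThrow('/?ups/', 'ups');
} catch (RuntimeException $exception) {
    echo "Invalid pattern\n";
}

var_dump(matchOrThrow('/ups/', 'yay, ups')); // bool(true)
```

With this shape the caller gets a plain bool on success and cannot silently ignore a failure.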
- Working with the developer
  - Simple methods
    - T-Regx exposes functionality through simple methods that return int, string, string[] or bool, none of which are nullable. If you wish to do something with your match or pattern, there's probably a method for that, which does exactly and only that.
  - Handlers and state:
    - T-Regx doesn't touch your error handlers or exception handlers in any way.
    - In fact, T-Regx doesn't touch any global state.
  - Strings:
    - Fixes the error with multibyte offsets (UTF-8 safe).
    - Separate methods for positions:
      - offset() - returns the position of a match in characters in UTF-8
      - byteOffset() - returns the position of a match in bytes, regardless of encoding
  - Groups:
    - When using preg::match_all(), we receive an array of arrays of arrays. In contrast, T-Regx returns an array of groups: Group[]. A Group object contains all the information about the group.
    - Group errors:
      - When an invalid group name is used, e.g. get('!@#'), T-Regx throws \InvalidArgumentException.
      - When you attempt to read a missing group, T-Regx throws NonexistentGroupException.
      - When reading a group that happens not to be matched, T-Regx throws GroupNotMatchedException.
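The difference between byte and character offsets, and the nested array shape mentioned above, can both be seen in plain PHP. This is a small sketch using only preg_match_all() and mbstring; the subject string is made up for the example.

```php
<?php
$subject = "\u{20AC} ups"; // "€ ups": the euro sign takes 3 bytes in UTF-8

preg_match_all('/ups/u', $subject, $matches, PREG_OFFSET_CAPTURE);

// Array of arrays of arrays: [group][occurrence] => [text, offset in bytes]
[$text, $byteOffset] = $matches[0][0];

// Convert the byte offset into a character offset (requires mbstring).
$charOffset = mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');

var_dump($text);       // string(3) "ups"
var_dump($byteOffset); // int(4)
var_dump($charOffset); // int(2)
```

Even with the `u` modifier PHP reports offsets in bytes, which is why separate methods for the two kinds of position are useful.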
- Written with clean API
  - Descriptive, simple interface
  - UTF-8 support out-of-the-box
  - No Reflection used, No (...varargs), No (boolean arguments, true), (No flags, 1), [No [nested, [arrays]]]
  - Inconsistencies between PHP versions are eliminated in T-Regx
- Protects you from fatal errors
  Certain arguments cause fatal errors with preg_*() methods, which terminate the application and can't be caught. T-Regx predicts whether a given argument would cause a fatal error, and throws a catchable exception instead.
- T-Regx follows the philosophy of Uncle Bob and "Clean Code"
  A function should do one thing, and it should do it well. A function should do exactly what you expect it to do.
Current work in progress
Current development priorities, regarding release of 1.0:
- Separate SafeRegex and CleanRegex into two packages, so users can choose what they want #103
- Add documentation to each T-Regx public method #17 [in progress]
- Release 1.0
- Revamp of t-regx.com documentation [in progress]
Sponsors
- Andreas Leathley - developing SquirrelPHP
- BarxizePL - Thanks!
T-Regx is developed thanks to
License
T-Regx is MIT licensed.