php-scalar-objects

hello + multibyte encoding support

Open rainbow-alex opened this issue 10 years ago • 3 comments

Hi,

I was introduced to your project via a discussion on reddit.

I am wondering if you are still working on this, since there hasn't been much activity lately. If so, is multibyte encoding something you will support for strings?

thanks,

rainbow-alex, Aug 03 '14 06:08

Hey. The project is split into two parts. In this project we implemented an API in pure PHP; its goal is to define the interface, with unit tests that pin down the expected behaviour.

Then, in this project: https://github.com/rossriley/scalar_objects I've started work on the implementation as a native C extension, with a new native class for each of the scalar types, e.g. SplScalarString, SplScalarArray, etc.

These classes handle the method calls and forward them on to the appropriate native PHP function.

I'm not working on this project right now; it seems more sensible to wait until phpng is merged into master and base it off that. At some point I'll get an RFC together to try to get it into PHP7.

With regard to multibyte: yes, the plan is definitely that all these native methods will be multibyte-aware, e.g.:

$a = "x";
$a->length(); //1
$b = "☂";
$b->length(); //1
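For comparison, here is a minimal sketch of how the same two strings behave with PHP's existing functions. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; it is not the extension's implementation, only an illustration of the byte-versus-character difference the proposed length() is meant to hide.

```php
<?php
// Byte count vs. character count for the two strings in the example above.
// Assumption: this file is UTF-8 encoded and mbstring is available.
$a = "x";
$b = "☂"; // U+2602, encoded as three bytes in UTF-8

// strlen() counts bytes, so the umbrella reports 3.
$byteLengths = [strlen($a), strlen($b)];

// mb_strlen() counts characters in the given encoding, so both report 1.
$charLengths = [mb_strlen($a, 'UTF-8'), mb_strlen($b, 'UTF-8')];

echo implode(',', $byteLengths), "\n"; // 1,3
echo implode(',', $charLengths), "\n"; // 1,1
```

A multibyte-aware length() would presumably return the second pair of values.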

rossriley, Aug 03 '14 15:08

Thanks for the clarification. I like the API you're proposing, but I doubt it will be accepted into the PHP distribution. :( Even though it is backwards compatible, it is a huge break from PHP's idioms. A requirement of >=PHP7 is unfortunate as well... Anyway, I do like what you're doing and look forward to using this! Best of luck!

That's cool about the multibyte support. With PHP7 being a prerequisite, I assume strings will be UTF-8 by default? What might the API look like if I have a string in a different encoding?

rainbow-alex, Aug 03 '14 17:08

No worries, it should be worth a go. Whilst I think there are quite a few internals people who wouldn't accept it, I don't think it's a complete non-starter, and it is based on the initial work done by Nikita, who is a major contributor. So we'll see.

The string API will be chainable so the technique I'd suggest would be something like this...

$string->decode('utf_8')->length();

Or something similar to that.
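To make the chaining idea concrete, here is a rough userland sketch. Everything in it is hypothetical: the class name Str, its constructor, and the exact semantics of decode() are assumptions made for illustration, not the extension's API. It also uses mbstring's encoding names ('UTF-8', 'ISO-8859-1') rather than 'utf_8', and needs PHP 7.4 or later for typed properties.

```php
<?php
// Hypothetical stand-in for a chainable, encoding-aware string object.
// Str, decode() and length() are illustrative names, not the real API.
final class Str
{
    private string $bytes;
    private string $encoding;

    public function __construct(string $bytes, string $encoding = 'UTF-8')
    {
        $this->bytes = $bytes;
        $this->encoding = $encoding;
    }

    // Return a new object that reads the same bytes as $encoding,
    // so further calls can be chained onto the result.
    public function decode(string $encoding): self
    {
        return new self($this->bytes, $encoding);
    }

    // Character count under the current encoding, forwarded to mb_strlen().
    public function length(): int
    {
        return mb_strlen($this->bytes, $this->encoding);
    }
}

$latin1Bytes = "caf\xE9";     // "café" encoded as ISO-8859-1 (4 bytes)
$utf8Bytes   = "caf\xC3\xA9"; // "café" encoded as UTF-8 (5 bytes)

$lengths = [
    (new Str($latin1Bytes))->decode('ISO-8859-1')->length(), // 4
    (new Str($utf8Bytes))->length(),                         // 4
    (new Str($utf8Bytes))->decode('ISO-8859-1')->length(),   // 5, bytes misread as Latin-1
];

echo implode(',', $lengths), "\n"; // 4,4,5
```

The last call shows why the encoding has to be stated somewhere: the same bytes give a different length when read under the wrong encoding.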

rossriley, Aug 03 '14 19:08