Complete implementation of methods on standard types

Open freakboy3742 opened this issue 9 years ago • 39 comments

The builtin types of Python have a number of methods associated with them - for example, dict objects have update(), items(), keys(), pop() and so on. These methods need to be implemented in Java. In many cases, there will be a simple mapping to an analogous function in the underlying Java type; however, sometimes you'll need to write an implementation using the available Java primitives.

Tests are also required; they should go in the module corresponding to the data type in the tests/datatypes directory.

Tests are based on comparing the output of some sample Python code when run through CPython to the output when run through VOC. When the two outputs are identical - and I mean byte-identical, down to the text and punctuation of the error message if appropriate - the test passes.

If you want an example to follow, look at the implementation of clear() on dict:

  • The @org.python.Method annotation is used to expose the clear() method as a symbol that should be available at runtime.
  • All arguments are exposed as org.python.Object, and then cast to the right data type as required.
  • Finally, the method returns a Python None.

freakboy3742 avatar Dec 04 '15 23:12 freakboy3742

Hi there, first timer here, trying to help :smile:

I'm working on the dict builtin type, but I've encountered problems and I need some assistance:

  1. dict functions like pop() and get() can take an additional, optional value to be returned by default, yet the Java functions only have one argument. Is this correct? Could another optional argument be added?
  2. Should I add my tests to DictTests class or create my own?
  3. I did not manage to figure out how to run any test. Do I run setup.py again and use voc as usual?
  4. I tried to run ant java to get the support library as well, but it returned an error.

Thanks in advance!

TalSk avatar Dec 19 '15 10:12 TalSk

Hi Tal! Thanks for offering to help out!

To answer your questions:

  1. Implementing optional parameters is a little odd, but it is definitely possible. The best way to explain how it works is probably to look at an example - the builtin function range(). Range can take 1, 2 or 3 arguments; the @method() declaration defines this by specifying 1 argument in args, and 2 in default_args. At time of invocation, VOC will allow any argument in default_args (or default_kwargs) to be omitted; if the implementation of the method is in Java, then the implementation is responsible for filling in the blanks. In the method body, you check each argument for the value null (a Java null, not the Python org.python.types.NoneType.NONE); a null means "use the default value"; anything else is the value specifically provided for that argument.
  2. Yes - you should add your tests to the existing DictTests.
  3. You can run the tests by running python setup.py test from the root directory of the project. This can take a while, so it isn't very convenient for rapid turnaround testing; I'd suggest taking a look at cricket to make it easier to run single tests.
  4. Can you provide details of the error message you're getting?

freakboy3742 avatar Dec 19 '15 10:12 freakboy3742

Gotcha, thanks for the quick response! Cricket looks amazing, I will use that! :+1: I am unsure where to get the Python support library binaries, and Google was not helpful.

TalSk avatar Dec 19 '15 11:12 TalSk

Try running ant java in the root directory of the project, which contains the build.xml file.

theshahulhameed avatar Dec 19 '15 11:12 theshahulhameed

Alright, everything is working and I'm on to it, thanks for the help! One last question: I'm getting a funny behavior while testing __len__:

self.assertCodeExecution("""
            x = {'a': 1, 'b': 2}
            print(len(x))
            print(x)
            """)

Running the test I get the following error:

AssertionError: "2\n{'b': 2, 'a': 1}\n" != "2\n{'a': 1, 'b': 2}\n"
  2
- {'b': 2, 'a': 1}
+ {'a': 1, 'b': 2}
 : Global context

It seems that the Java output has the order flipped relative to the Python output, though this can be overlooked since built-in dictionaries aren't supposed to preserve item order. The test passes fine with an empty or 1-item dictionary. Is opening an issue required?

TalSk avatar Dec 19 '15 13:12 TalSk

This is one of those things that is going to fall under "feature, not bug" :-)

Python dictionaries are unsorted data structures, so their order isn't guaranteed. This is something that regularly bites the Django test suite - it's trivially easy to write a test that depends on a dictionary being processed in a particular order, but moving from a 32 bit to a 64 bit machine can (and that's "can", not "will") change the order in which the dictionary is stored.

In this case, the default ordering of a Java HashMap isn't the same as the hashing order provided by Python on the same machine. I suppose we could be ultra-pedantic and consider this a discrepancy, and therefore a bug - but the Python core team themselves consider relying on dictionary hashing order to be a bug (see the release notes for 2.7.3).

So, the "fix" here is to modify the test. Some possibilities:

  1. Only use a dictionary of length 0 or 1 (which will have a predictable order)
  2. Rather than checking the output of x, check that x contains 'a' and 'b', but doesn't contain 'c'.
  3. Don't worry about checking the output of x at all - after all, in this particular case, I'm not sure it's checking anything in particular that the len(x) == 2 check isn't doing.
  4. Print a sorted version of the content. Of course, this assumes you have a working implementation of sorted() and items() as well, which is probably getting to the point where the test is too complex.
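
Options 2 and 4 could look roughly like the sample code below, which produces the same output whatever order the dictionary is stored in. It is a sketch only: `order_independent_report` is a hypothetical helper, and option 4 assumes that sorted() and items() already work on the implementation under test.

```python
def order_independent_report(d):
    """Return output lines that do not depend on dictionary ordering."""
    return [
        str(len(d)),             # size check
        str('a' in d),           # option 2: membership checks
        str('b' in d),
        str('c' in d),
        str(sorted(d.items())),  # option 4: sorted view of the content
    ]


x = {'a': 1, 'b': 2}
for line in order_independent_report(x):
    print(line)
```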

freakboy3742 avatar Dec 20 '15 03:12 freakboy3742

So I've managed to implement most of the dict methods (Yes, slow, but only able to work on weekends :cry:). Though I still have some left and I need assistance.

  1. There are a few methods (like __class__, __delattr__, __getattribute__) that appear in dir({}), but not in the Dict.java class. Is this for a reason or should I add them myself?
  2. The function __format__: I think that the format_spec should be defined one time, globally, and then re-used in every __format__ function.
  3. The “rich comparison” methods. When called directly ({}.__lt__(1)) they return "NotImplemented", and when called using {} < 1 they raise the following exception: TypeError: unorderable types: dict() < int(). What should the code do, then?
  4. The __dir__ function has a similar problem with an explicit call using {}.__dir__() and an indirect call using dir({}) returning two different string outputs.
  5. The keys(), values() and items() functions should return a "View" object. After doing some digging I figured that the "dict_keys" (for example) type which is returned using Python3 is actually the KeysView type from the package collections.abc, which needs to be imported (see here). How should I handle that?
  6. The function dict.update optionally takes keyword arguments. How can this be done? I couldn't find any method that uses keyword arguments to use as an example.
  7. If the return type of a method is a fixed type, should I change it from org.python.Object? (Same question goes for the arguments)
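
The differences described in points 3 and 4 can be reproduced in a CPython shell with the short script below. The variable names are illustrative only, and the exact wording of the TypeError depends on the CPython version, so the script prints only the exception type.

```python
# Point 3: calling the rich comparison method directly returns the
# NotImplemented singleton instead of raising.
direct = {}.__lt__(1)
print(direct)

# Using the operator turns that NotImplemented into a TypeError.
operator_error = None
try:
    {} < 1
except TypeError as exc:
    operator_error = exc
    print(type(exc).__name__)

# Point 4: dir() sorts the names, while __dir__() returns them as-is.
explicit_dir = {}.__dir__()
builtin_dir = dir({})
print(builtin_dir == sorted(explicit_dir))
```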

Thanks!

TalSk avatar Dec 24 '15 15:12 TalSk

  1. __class__ is an attribute, not a method; it should be automatically populated in the obj.attrs HashMap on the Java object instance as part of object construction. __delattr__ and __getattribute__ are both methods that should be defined on the base org.python.types.Object definition. In both cases, you're the first person looking closely at getting a type to 100% Python compliance, so it's possible you might have hit a point where some of the internal handling of org.python.types.Type needs to be corrected to ensure that inherited methods are picked up as expected. While I would certainly like to see 100% compliance on the first swing, don't get too bogged down - an implementation that is 95% correct with a couple of notable differences will be 85% better than what is there right now; we can fix the remaining problems in a second pass.
  2. I agree. org.Python probably is the best place for this.
  3. Implement __lt__() as if it were being invoked directly as a function. The code to convert the NotImplemented into a TypeError will go in the implementation of the COMPARE_OP Python opcode.
  4. In this case, the dir() builtin is implemented in org.Python. As with (3), some exception and re-raise handling is required.
  5. I've got stubs for the sys and time modules; it sounds like you'll need to start a stub for the collections module, with an Abc submodule, containing a definition of a KeysView type. The basic definition of KeysView - constructors, toJava etc - should be close enough to org.python.types.Dict that you can use it as a template.
  6. You need to be clear whether it is (a) an argument with a default, or (b) a keyword argument with a default. The general prototype of a Python function is myfunc(a, b=1, *varargs, c=2, **kwargs): a is a required argument; b is an argument with a default; c is a keyword argument with a default. In this case, the signature is dict.update([E,], **F), so:
    @org.python.Method(
        varargs="E",
        kwargs="F"
    )
    org.python.Object update(org.python.Object [] E, java.util.Map<java.lang.String, org.python.Object> F)
  7. If the method is being exposed as a method in Dict, it needs to return org.python.Object, no matter what type the method is returning (even if it's returning None/void). Arguments should always be org.python.Object, org.python.Object [], or java.util.Map<java.lang.String, org.python.Object>, depending on whether it's an argument/keyword argument, a vararg, or a varkwarg.
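
To make the categories from the general prototype concrete, the following plain Python sketch shows where each kind of argument ends up, and how dict.update() accepts both a positional mapping and keyword arguments. It illustrates CPython behavior only; nothing here is VOC code.

```python
def myfunc(a, b=1, *varargs, c=2, **kwargs):
    """Report how each argument was bound."""
    return {"a": a, "b": b, "varargs": varargs, "c": c, "kwargs": kwargs}


# Only the required argument: every default is used.
only_required = myfunc(10)
print(only_required)

# Extra positionals land in varargs; unknown keywords land in kwargs.
everything = myfunc(10, 20, 30, 40, c=50, d=60)
print(everything)

# dict.update() takes an optional positional mapping plus keyword arguments.
d = {"x": 1}
d.update({"y": 2}, z=3)
print(sorted(d.items()))
```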

freakboy3742 avatar Dec 27 '15 02:12 freakboy3742

I was looking at implementing the NotImplemented -> TypeError bit in COMPARE_OP, but just to make sure I'm on the right track - this is where the 'reflected operations' need to be dealt with, right?

Or in other words, should the COMPARE_OP Python opcode be where we check if y is a subclass of x (but not the same class), and whether the operations return NotImplemented?
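
As a reference point for this question, the order in which CPython tries the two operands for x < y can be sketched in plain Python. This describes the language-level behavior only, not VOC's COMPARE_OP implementation; `compare_lt`, `Base` and `Child` are hypothetical names, and the error text follows the message quoted earlier in this thread (newer CPython releases word it differently).

```python
def compare_lt(x, y):
    """Sketch of how `x < y` is dispatched, including reflected calls."""
    tx, ty = type(x), type(y)
    tried_reflected = False

    # If y's type is a proper subclass of x's type, y's reflected
    # method (__gt__) gets the first chance.
    if tx is not ty and issubclass(ty, tx):
        tried_reflected = True
        result = ty.__gt__(y, x)
        if result is not NotImplemented:
            return result

    # Normal case: ask the left operand.
    result = tx.__lt__(x, y)
    if result is not NotImplemented:
        return result

    # Fall back to the right operand's reflected method.
    if not tried_reflected:
        result = ty.__gt__(y, x)
        if result is not NotImplemented:
            return result

    # Both sides declined: the operator form raises TypeError.
    raise TypeError(
        "unorderable types: %s() < %s()" % (tx.__name__, ty.__name__)
    )


class Base:
    def __lt__(self, other):
        return "base"


class Child(Base):
    def __gt__(self, other):
        return "child-reflected"


print(compare_lt(1, 2))
print(compare_lt(Base(), Child()))
```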

cflee avatar Apr 03 '16 06:04 cflee

I would love to begin contributing code; is there anywhere I should start?

WesselBadenhorst avatar Oct 10 '16 07:10 WesselBadenhorst

@wesseljb We've got a Guide for first time contributors on our website; if you've been there, and you've come to this ticket looking for specific advice on a builtin method to implement, I would suggest taking a look at the Str type. There are a bunch of simple methods on String dealing with case manipulation and so on, which should be fairly approachable for a first contribution.

freakboy3742 avatar Oct 10 '16 15:10 freakboy3742

Thank you, I will look at the methods on string.

WesselBadenhorst avatar Oct 10 '16 17:10 WesselBadenhorst

I've been working on bytes() in Batavia (https://github.com/pybee/batavia/pull/221). My next job there will be to correctly implement all the args/kwargs unpacking and error management for user-defined functions, up until https://www.python.org/dev/peps/pep-3102/.

For VOC, I'd like to do the same: first work on the bytes() builtin and the underlying Bytes object implementation (https://github.com/pybee/voc/pull/302) to get the hang of the project, then move on to implement args/kwargs handling.

candeira avatar Nov 12 '16 11:11 candeira

I'm working on case changes and type checks in String but there is a test that's failing due to unsupported characters: 'á'. The error I got was:

Traceback (most recent call last):
  File "..\voc\tests\utils.py", line 379, in assertCodeExecution
    py_out = runAsPython(self.temp_dir, adj_code, extra_code, args=args)
  File "..\voc\tests\utils.py", line 148, in runAsPython
    return out[0].decode('utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 112: invalid start byte

For now I've removed those characters from the test.
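
One plausible cause, assuming the child process wrote its output in a Windows code page rather than UTF-8 (the backslash paths in the traceback suggest Windows), is that 'Á' was emitted as the single byte 0xc1. The sketch below reproduces the same decode error in isolation; cp1252 is an assumed code page, not something the traceback confirms.

```python
# "\u00c1" is the character 'Á' (the upper-case form of 'á').
text = "\u00c1"

# In the assumed legacy code page it is one byte; in UTF-8 it is two.
legacy_bytes = text.encode("cp1252")
utf8_bytes = text.encode("utf-8")
print(legacy_bytes, utf8_bytes)

# Decoding the legacy byte as UTF-8 fails in the same way as the traceback.
decode_error = None
try:
    legacy_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc.reason)
```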

I have also implemented islower() and isupper() (#375). This is my first time contributing so I wanted to check if I was doing this right.

Also, is there a better way to look for unimplemented methods? I've just been searching for "NotImplementedError" in random files.

Thanks in advance.

SarthakSuri avatar Mar 01 '17 23:03 SarthakSuri

Searching for "NotImplementedError" is one approach; the other is to open a CPython shell, check to see what methods are available on a given type, and cross check with the Java implementation to make sure they're all present. Even adding the empty stubs of the functions that should be there would be a huge help.
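
The cross-check against CPython could be scripted along these lines. It is a minimal sketch: `missing_methods` is a hypothetical helper, and the `implemented` set is placeholder data that would have to be filled in by hand from the Java source.

```python
def missing_methods(py_type, implemented):
    """List callable attributes of py_type whose names are not in implemented."""
    return sorted(
        name
        for name in dir(py_type)
        if callable(getattr(py_type, name)) and name not in implemented
    )


# Placeholder: names assumed to exist already in the Java implementation.
implemented = {"clear", "get", "__len__"}

for name in missing_methods(dict, implemented):
    print(name)
```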

freakboy3742 avatar Mar 02 '17 00:03 freakboy3742

I am working on String. I will implement startswith() and improve endswith().

bitdumper1 avatar Mar 11 '17 03:03 bitdumper1

Hi, I am working on String methods. I will implement replace(), strip(), lstrip(), maketrans() and translate().

sipah00 avatar Mar 11 '17 09:03 sipah00

I'll be doing the partition() String method then.

gilmouta avatar Mar 11 '17 15:03 gilmouta

Hello, a first-timer here. I wish to contribute and help. I have some Java and Python knowledge. Where and how could I start?

Zoham avatar Mar 12 '17 15:03 Zoham

@Zoham have you checked the docs about contributing? There is a first timers guide here, with instructions and pointers to next steps: http://pybee.org/contributing/how/first-time/what/voc/

eliasdorneles avatar Mar 12 '17 16:03 eliasdorneles

Can I work on str.casefold?

encore0117 avatar Mar 13 '17 03:03 encore0117

Hey guys, which method implementations are still available that I can reserve and start working on?

askari12 avatar Mar 28 '17 11:03 askari12

I would like to work on this if possible. I have some experience in Java and Python. Thanks, Chandler

chandlerbaggett avatar Aug 02 '17 06:08 chandlerbaggett

Working on bytes.endswith.

ArturGaspar avatar Oct 11 '17 18:10 ArturGaspar

Hey, I'm working on bytes.zfill

joanasouza avatar Oct 15 '17 02:10 joanasouza

This would be my first open source contribution. Can I work on str.rsplit()?

sairajsawant avatar Oct 25 '17 14:10 sairajsawant

@sairajsawant Any method that isn't already implemented, and hasn't been claimed by someone else in the last few days is fair game - so by all means, implement str.rsplit()!

freakboy3742 avatar Oct 25 '17 23:10 freakboy3742

I'm working on str.rsplit() !

sairajsawant avatar Oct 26 '17 02:10 sairajsawant

Is str.rsplit() still taken? If not, I would like to try it.

hwrdch avatar Nov 27 '17 22:11 hwrdch

Is str.rsplit() taken? If yes, allow me to help.

BLaAckRose avatar Feb 13 '18 07:02 BLaAckRose