Cython grammar support

Open ambv opened this issue 7 years ago • 19 comments

We'd like to be able to format .pyx, .pxd, and .pxi files, too.

ambv avatar Jun 15 '18 22:06 ambv

Hi, @pablogsal.

ambv avatar May 07 '19 16:05 ambv

Hello Łukasz!

I have been working on extending black for Cython. Here is my approach: Use a different grammar file, for the Cython grammar. Start with the Python grammar and add Cython grammar rules to it one by one. I have added most "simple" rules to the grammar so far, and I'm adding formatting rules for the blindingly obvious cases.

Here are some issues I faced in my endeavor:

  • As a result of using a different grammar file, there is a symbol mismatch (the same symbol may be represented by a different integer in the Python grammar and the Cython grammar). This requires functions and top-level variables that use type_repr and syms to be parameterized with regard to the grammar's symbols. This does not affect tokens. (It may require encapsulating some of the global state.)
  • Because of the way pgen works right now, string literals in the grammar cannot match NAME in the grammar. In some cases this is not a big issue and can be solved by disabling the keywords; in other cases it is much harder because it conflicts with other rules (as with the new expression: new = 5 vs new vector[int]()). I plan on deferring this issue for later.
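
The symbol mismatch in the first point can be sketched as follows. This is only an illustration under stated assumptions: make_symbols, python_syms, cython_syms and the rule name cdef_stmt are hypothetical and are not taken from black or blib2to3, and this type_repr takes the symbol table as an explicit argument instead of reading a module-level global.

```python
from types import SimpleNamespace


def make_symbols(nonterminals):
    """Number nonterminals in order of appearance, starting at 256.

    Hypothetical helper: it mimics the idea that symbol numbers depend on
    the order of rules in a grammar file.
    """
    return SimpleNamespace(
        **{name: 256 + i for i, name in enumerate(nonterminals)}
    )


# Two toy grammars. Adding one rule in the middle shifts every later number.
python_syms = make_symbols(["file_input", "funcdef", "classdef"])
cython_syms = make_symbols(["file_input", "cdef_stmt", "funcdef", "classdef"])


def type_repr(type_num, syms):
    """Map a symbol number back to its name for the given symbol table."""
    for name, value in vars(syms).items():
        if value == type_num:
            return name
    return str(type_num)


# The same integer means different things depending on the grammar.
print(type_repr(python_syms.funcdef, python_syms))  # funcdef
print(type_repr(python_syms.funcdef, cython_syms))  # cdef_stmt
```

Passing syms explicitly is one way to avoid the global state mentioned above; other designs (for example, a per-grammar object holding both the symbols and the helpers) would work as well.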

Also, I've expanded the tokenizer to accept integer literals ending in an arbitrary number of suffix characters (l, L, u, U), which affects Python parsing as well. (Integer literals with a single l/L suffix were already tokenized, even though they are invalid in Python 3.)
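
A minimal sketch of what such a suffix rule could look like for decimal literals. The pattern and the helper name is_suffixed_int are assumptions made for illustration; this is not the expression used in the actual tokenizer, and it ignores hex, octal and binary literals.

```python
import re

# Hypothetical pattern: a decimal integer followed by any number of
# l, L, u, U suffix characters (e.g. 10, 10L, 10ULL).
DEC_INT_WITH_SUFFIX = re.compile(r"(?:0|[1-9][0-9]*)[lLuU]*")


def is_suffixed_int(text):
    """Return True if the whole string is a decimal int with optional suffixes."""
    return DEC_INT_WITH_SUFFIX.fullmatch(text) is not None


print(is_suffixed_int("10"))     # True
print(is_suffixed_int("10ULL"))  # True
print(is_suffixed_int("10x"))    # False
```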

Once I gather a sizable number of rules, I'll let you know so you can inform me of your preferred formatting rules. I'd love to hear any suggestions/recommendations you have with regard to anything I've implemented so far, or any preference on how to proceed on the issues mentioned above.

grigoriosgiann avatar Jul 30 '19 16:07 grigoriosgiann

[Just as a clarification, @GrigoriosGiann is working with us (Python team in Bloomberg) in the Cython support as part of his internship]

pablogsal avatar Jul 30 '19 18:07 pablogsal

@pablogsal @GrigoriosGiann, do you have something that you would like people to try/provide feedback on?

jakirkham avatar Oct 08 '19 18:10 jakirkham

Is there any news on this? Even if you only have C syntax (no C++) or only .pxd or something, it would be really interesting to publish...

Synss avatar Dec 06 '19 20:12 Synss

Is there any news on this?

Yeah, we will soon do a PR to iterate over :). I was caught in many other responsibilities and could not finish some required steps until now.

pablogsal avatar Dec 06 '19 22:12 pablogsal

error: cannot format filename.pyx: Cannot parse: 11:13: from cpython cimport array
error: cannot format filename.pyx: Cannot parse: 145:6: 	cdef int ml = len(mtype)

I'm getting this on the latest dev version.

RyanHope avatar Mar 05 '20 01:03 RyanHope

Is there any further update on this? I'm running into issues still with formatting Cython. Any tips @pablogsal?

jlucier avatar May 22 '20 17:05 jlucier

Hi all!

I have a small update on this issue, as I have been working on it a bit lately. I rebased the changes made by @pablogsal (https://github.com/pablogsal/black/tree/cython - hope you don't mind me taking it) on top of the current master branch and tried to add the missing Cython features/rules and to fix some existing bugs (you can check out my WIP branch at https://github.com/ArseniiDunaevQC/black/tree/cython-dev). First of all, it can already be used, with limitations, for some basic .pyx files. I would be happy to hear any suggestions/recommendations, as well as preferred formatting rules.

There are still some other issues that hinder further development:

  • As @grigoriosgiann mentioned above, there are difficulties with matching grammar string literals with NAMEs. The hardest things for the moment are Python enumerations (Enum class) and using new as variable name.
  • It would be great to have support for Cython compile-time constants and conditional compilation using the DEF, IF, etc. keywords (see docs), as well as include.
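
To make the new ambiguity from the first point concrete, here is a minimal sketch of one possible context-sensitive heuristic: treat a leading new as a keyword only when the next token is a NAME. This is an assumption for illustration, not what either branch implements; classify_new is a hypothetical helper, and it relies only on Python's standard tokenize module, which splits text into tokens without parsing it.

```python
import io
import tokenize


def classify_new(source):
    """Classify a leading `new` in a one-line snippet as 'keyword' or 'name'.

    Hypothetical heuristic: `new` followed by another NAME is the Cython
    `new` expression; anything else means `new` is an ordinary identifier.
    """
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type in (tokenize.NAME, tokenize.OP, tokenize.NUMBER)
    ]
    first, second = tokens[0], tokens[1]
    if first.string != "new":
        raise ValueError("expected the snippet to start with `new`")
    return "keyword" if second.type == tokenize.NAME else "name"


print(classify_new("new = 5\n"))            # name
print(classify_new("new vector[int]()\n"))  # keyword
```

A one-token lookahead like this does not fit directly into a pgen-style grammar, where string literals are always keywords, which is why the problem is harder than this sketch suggests.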

ArseniiDunaevQC avatar Jul 30 '20 12:07 ArseniiDunaevQC

Hi all!

I have a small update on this issue, as I have been working on it a bit lately. I rebased the changes made by @pablogsal (https://github.com/pablogsal/black/tree/cython - hope you don't mind me taking it)

The actual work in progress that we had does not live there but in https://github.com/bloomberg/black (branch cython_support). If you are going to base your work on top of this, please talk with @grigoriosgiann first.

pablogsal avatar Jul 30 '20 13:07 pablogsal

What is needed to get Black working with Cython? Is there any way to push that integration? :)

TheRealBecks avatar Jun 05 '21 09:06 TheRealBecks

What is needed to get Black working with Cython? Is there any way to push that integration? :)

https://github.com/bloomberg/black/tree/cython_support - Take this, rebase, retest etc. and get a PR up ... We would need pretty comprehensive tests to accept tho - But I am sure we can find people to help with the testing.

It probably needs a pick and pull from the branch rewrite as we've totally refactored out of one __init__.py / single py file to lots of modules. This would be cool to support.

cooperlees avatar Jun 05 '21 16:06 cooperlees

I use cython extensively in my projects so I can do a lot of testing, but I don't have time to dive into someone else's project right now and rewrite code.

RyanHope avatar Jun 05 '21 17:06 RyanHope

That branch is based on augmenting blib2to3 to parse Cython. But that's a problem because we eventually want to replace blib2to3 with a PEG-based parser so that we can keep parsing newer versions of Python.

JelleZijlstra avatar Jun 05 '21 17:06 JelleZijlstra

Hope to see the release soon

kennylajara avatar Jul 04 '21 17:07 kennylajara

So the Cython integration is stuck?

TheRealBecks avatar Jul 21 '21 07:07 TheRealBecks

Any progress on this? I'd be happy to take a look at pushing forward @grigoriosgiann's solution sometime this winter if not.

sabard avatar Oct 28 '22 04:10 sabard

No progress. We have a vague intention to switch to a LibCST parser at some point, but as far as I know there's no concrete work being done at the moment.

JelleZijlstra avatar Oct 28 '22 04:10 JelleZijlstra

For the record: there is at least cython-lint, which provides some formatting standard for Cython: https://pypi.org/project/cython-lint/

MuellerSeb avatar Apr 11 '24 11:04 MuellerSeb