ormolu icon indicating copy to clipboard operation
ormolu copied to clipboard

Challenging the CPP support

Open spl opened this issue 4 years ago • 9 comments

I like the goals of this project! I was hoping to use it, but it seems that I may be challenging the experimental C preprocessor support.

Describe the bug

I use CPP extensively in dlist to support multiple GHC versions. Ormolu fails with parse errors when I try to run it on Data/DList.hs, and I believe that's due to the CPP.

To reproduce

Run these commands:

git clone https://github.com/spl/dlist.git
cd dlist
git checkout -b e405a105ad7b2e6a83034bfa82efd22c4974ce02
ormolu Data/DList.hs

The output from ormolu is:

The GHC parser (in Haddock mode) failed:
  Data/DList.hs:36:3-19
  parse error on input `-- * Construction'

Expected behavior

Of course, I hoped it would work without effort. 😉 That said, I certainly understand that dealing with CPP when formatting code is a difficult task. So, I look at this issue as having one or both of two outcomes:

  1. You give me tips on how to resolve the issues Ormolu has with my code. (Or you tell me it's not possible!)
  2. You use dlist as a test case for your CPP support.

Either way, thanks for putting the effort into this project! 👏

Environment

  • OS name + version: macOS Mojave (10.14.6)

  • Version of the code:

    ormolu 0.1.2.0 UNKNOWN UNKNOWN
    using ghc-lib-parser 8.10.1.20200523
    

Additional context

N/A

spl avatar Jul 01 '20 03:07 spl

You give me tips on how to resolve the issues Ormolu has with my code. (Or you tell me it's not possible!)

CPP support is almost impossible to get right as you mentioned. I am not the maintainer of this project, so that would be @mrkkrp's call to decide on which issues should be fixed. But I can provide some tips and workarounds (until GHC somehow gets nicer conditional compilation support).

Ormolu currently formats CPP by ignoring the whole block, and re-inserting them after the formatting, so we have to make sure that the code has valid syntax without the CPP block. In your case:

...
#if defined(__GLASGOW_HASKELL__) && __GLASGOW_HASKELL__ >= 708
{-# LANGUAGE PatternSynonyms #-}
...
#endif
...
module Data.DList

#if defined(__GLASGOW_HASKELL__) && __GLASGOW_HASKELL__ >= 800
  ( DList(Nil, Cons)
#else
  ( DList
#endif

  -- * Construction
  , fromList
  , toList
  ...

doesn't work, because once you remove the CPP blocks there is three problems:

  • PatternSynonyms extension is now undeclared, so Ormolu can not parse patterns defined on that file. We can pass -o -XPatternSynonyms from command line to fix it, since moving it outside of CPP would break backwards compatibility.
  • The module export list is missing the opening parenthesis; so ( should be outside of CPP.
  • After that, the first list item is , fromLists starts with a comma, which is not allowed, so , there should be moved inside CPP.

So, those three things above should work, but the latter two expose a bug on Ormolu's CPP handling; and to workaround that we have to make it so that the first item in the export list should not be a CPP block.

In the end, below diff works in my case with the current master(bbcb812):

❯ git diff
diff --git a/Data/DList.hs b/Data/DList.hs
index 3cdd0aa..49ddadc 100644
--- a/Data/DList.hs
+++ b/Data/DList.hs
@@ -26,15 +26,16 @@
 -----------------------------------------------------------------------------
 
 module Data.DList
-
+(
+  id,
 #if defined(__GLASGOW_HASKELL__) && __GLASGOW_HASKELL__ >= 800
-  ( DList(Nil, Cons)
+   DList(Nil, Cons),
 #else
-  ( DList
+   DList ,
 #endif
 
   -- * Construction
-  , fromList
+   fromList
   , toList
   , apply

I hope this is useful. I guess a better solution would be to provide compatibility modules which are conditionally exposed via cabal, but that would be a bigger refactor.

utdemir avatar Jul 01 '20 07:07 utdemir

@utdemir Thanks for the thorough walkthrough! Much appreciated.

Ormolu currently formats CPP by ignoring the whole block, and re-inserting them after the formatting, so we have to make sure that the code has valid syntax without the CPP block.

That's helpful to know.

So, those three things above should work, but the latter two expose a bug on Ormolu's CPP handling; and to workaround that we have to make it so that the first item in the export list should not be a CPP block.

This is not a major thing, but I wonder if you can avoid changing the export list order by replacing the added id with DList and put the #if/#else/#endif around the (Nil, Cons). If you don't happen to try this, I might look into it later.

I hope this is useful.

Definitely!

I guess a better solution would be to provide compatibility modules which are conditionally exposed via cabal, but that would be a bigger refactor.

I don't recall considering that approach. Some potential concerns come to mind. Would that help with things like the export list? Would that avoid orphan instances?

spl avatar Jul 01 '20 09:07 spl

I wonder if you can avoid changing the export list order by replacing the added id with DList and put the #if/#else/#endif around the (Nil, Cons).

To answer my own question, it seems this doesn't work:

Input.hs ormolu Input.hs
module Input
  ( Type
#if cond
  (Con1, Con2)
#endif
  , value
  )
where
module Input
  ( Type,
#if cond
  (Con1, Con2)
#endif
    value,
  )
where

Ormolu introduces a comma after Type, which would break the code when cond is true.

spl avatar Jul 01 '20 09:07 spl

Here's an example of another problem I encountered:

Input.hs ormolu Input.hs
instance Class Type where
#if cond
  def = value1
#else
  def = value2
#endif
instance Class Type

#if cond
  def = value1
#else
  def = value2
#endif

spl avatar Jul 02 '20 06:07 spl

Oh, I guess my last comment (https://github.com/tweag/ormolu/issues/633#issuecomment-652820277) is an example of me forgetting the maxim:

Ormolu currently formats CPP by ignoring the whole block, and re-inserting them after the formatting

If I were to write instance Class Type where under each branch of the #if cond, it would unfortunately introduce some redundancy.

Out of curiosity, did you consider other strategies for dealing with CPP?

Lastly, I can still use Ormolu and – since there's not a lot of code in dlist – fix the errors created by formatting. So, this issue is not a major one, as far as it concerns me. Depending on your time and priorities, you should, of course, decide how much focus to give CPP.

spl avatar Jul 02 '20 08:07 spl

The present support for CPP is an experiment and a miracle in a way, because originally there was no intention to support CPP at all. I the minimal support for CPP just to make very simple things work, but the tricky cases like the ones you show in this thread are totally out of scope. It is unlikely that we will spent time on improving CPP support.

P.S. It is "Ormolu", not "Ormulo" ;-)

mrkkrp avatar Jul 03 '20 09:07 mrkkrp

It is unlikely that we will spent time on improving CPP support.

Understood, thanks!

P.S. It is "Ormolu", not "Ormulo" ;-)

Yes, I eventually noticed that my fingers kept typing the latter. But I didn't realize how often I did it on this page until now. 🙄 I've fixed them all to avoid confusing future readers.

spl avatar Jul 03 '20 10:07 spl

After re-reading the readme again (🔁), one thing that occurred to me was to try to disable Ormolu before and after the CPP statements so that Ormolu would ignore them. That is, I started with something like this, expecting the formatting to be fixed:

Test.hs:

module Test where

{- ORMOLU_DISABLE -}
#if MIN_VERSION_base(4,9,0)
{- ORMOLU_ENABLE -}

import Data.Semigroup(Semigroup(..)  )

{- ORMOLU_DISABLE -}
#endif
{- ORMOLU_ENABLE -}

The debug output follows:

$ ormolu --debug Test.hs 
warnings:


parse result:
  comment stream:
Test.hs:3:1-20 Comment False ("{- ORMOLU_DISABLE -}" :| [])
Test.hs:(3,21)-(11,21) Comment False ("{- ORMOLU_DISABLE_START" :| ["-- ORMOLU_CPP_MASK#if MIN_VERSION_base(4,9,0)","-- ORMOLU_CPP_MASK{- ORMOLU_ENABLE -}","-- ORMOLU_CPP_MASK","-- ORMOLU_CPP_MASKimport Data.Semigroup(Semigroup(..)  )","-- ORMOLU_CPP_MASK","-- ORMOLU_CPP_MASK{- ORMOLU_DISABLE -}","-- ORMOLU_CPP_MASK#endif","ORMOLU_DISABLE_END -}"])
Test.hs:11:22-40 Comment True ("{- ORMOLU_ENABLE -}" :| [])


module Test where

{- ORMOLU_DISABLE -}
#if MIN_VERSION_base(4,9,0)
{- ORMOLU_ENABLE -}

import Data.Semigroup(Semigroup(..)  )

{- ORMOLU_DISABLE -}
#endif
{- ORMOLU_ENABLE -}

But, as you can see, the code was not formatted.

Is there any way to make an approach like this work?

spl avatar Jul 08 '20 11:07 spl

Another instance where this has bitten me:

module Test.QuickCheck.Classes.IntegralDomain
  (
#if HAVE_SEMIRINGS
    integralDomainLaws
#endif
  ) where

becomes

module Test.QuickCheck.Classes.IntegralDomain
  (
  )
where

#if HAVE_SEMIRINGS
    integralDomainLaws
#endif

under

ormolu 0.1.2.0 master b95feb345e14f333988b06be0fc6534694d6d0e9
using ghc-lib-parser 8.10.1.20200412

hseg avatar Aug 18 '20 13:08 hseg