commonmark-hs icon indicating copy to clipboard operation
commonmark-hs copied to clipboard

Seemingly aggressive superclass constraints on `IsBlock` and `IsInline`

Open tysonzero opened this issue 3 years ago • 14 comments

Specifically it is not clear to me why Show, Rangeable or HasAttributes are needed to parse markdown. These extra constraints are hard for me to satisfy for the types I want to work with.

tysonzero avatar Sep 28 '20 10:09 tysonzero

Rangeable is needed because we add source position information. If you don't need this, you can define a trivial instance for your type: ranged _ x = x. If you don't control the type, use a newtype wrapper.

HasAttributes is needed because the parser allows attributes to be added to everything. (An extension must be enabled, but because of the way this works the possibility of this needed to be integrated into the core.) Again, easy to define a dummy instance if you don't need it.

Show -- I'm not sure it's needed for parsing, but it has been useful in debugging to have it in there. Again, easy to define a dummy instance.

Yes, I can confirm that tests pass with the following patch:

diff --git a/commonmark/src/Commonmark/SourceMap.hs b/commonmark/src/Commonmark/SourceMap.hs
index 7f45fdb..3294f87 100644
--- a/commonmark/src/Commonmark/SourceMap.hs
+++ b/commonmark/src/Commonmark/SourceMap.hs
@@ -53,20 +53,17 @@ newtype WithSourceMap a =
         WithSourceMap { unWithSourceMap :: State (Maybe Text, SourceMap) a }
         deriving (Functor, Applicative, Monad)
 
-instance (Show a, Semigroup a) => Semigroup (WithSourceMap a) where
+instance Semigroup a => Semigroup (WithSourceMap a) where
   (WithSourceMap x1) <> (WithSourceMap x2) =
     WithSourceMap ((<>) <$> x1 <*> x2)
 
-instance (Show a, Semigroup a, Monoid a) => Monoid (WithSourceMap a) where
+instance (Semigroup a, Monoid a) => Monoid (WithSourceMap a) where
   mempty = WithSourceMap (return mempty)
   mappend = (<>)
 
-instance (Show a, Monoid a) => Show (WithSourceMap a) where
-  show (WithSourceMap x) = show $ evalState x mempty
-
 -- | Extract a parsed value and a source map from a
 -- 'WithSourceMap'.
-runWithSourceMap :: (Show a, Monoid a)
+runWithSourceMap :: Monoid a
                  => WithSourceMap a -> (a, SourceMap)
 runWithSourceMap (WithSourceMap x) = (v, sm)
   where (v, (_,sm)) = runState x (mempty, mempty)
@@ -102,7 +99,7 @@ instance (IsBlock b a, IsInline b, IsInline (WithSourceMap b), Semigroup a)
                addName "referenceLinkDefinition"
   list lt ls items = (list lt ls <$> sequence items) <* addName "list"
 
-instance (Rangeable a, Monoid a, Show a)
+instance (Rangeable a, Monoid a)
          => Rangeable (WithSourceMap a) where
   ranged (SourceRange rs) (WithSourceMap x) =
     WithSourceMap $
diff --git a/commonmark/src/Commonmark/Types.hs b/commonmark/src/Commonmark/Types.hs
index 2ea9f04..c87dfc4 100644
--- a/commonmark/src/Commonmark/Types.hs
+++ b/commonmark/src/Commonmark/Types.hs
@@ -62,7 +62,7 @@ data ListType =
      -- first Text is before, second Text is after enumerator
      deriving (Show, Ord, Eq, Data, Typeable)
 
-class (Monoid a, Show a, Rangeable a, HasAttributes a) => IsInline a where
+class (Monoid a, Rangeable a, HasAttributes a) => IsInline a where
   lineBreak :: a
   softBreak :: a
   str :: Text -> a
@@ -81,7 +81,7 @@ class (Monoid a, Show a, Rangeable a, HasAttributes a) => IsInline a where
   code :: Text -> a
   rawInline :: Format -> Text -> a
 
-class (Monoid b, Show b, Rangeable b, IsInline il, HasAttributes b)
+class (Monoid b, Rangeable b, IsInline il, HasAttributes b)
       => IsBlock il b | b -> il where
   paragraph :: il -> b
   plain :: il -> b

Still, I'm not sure the benefits of removing Show outweigh the costs.

jgm avatar Sep 28 '20 18:09 jgm

It seems like you could always add Show back when debugging right? I don't really see the advantage in having the constraint there when it's not being used.

tysonzero avatar Sep 29 '20 07:09 tysonzero

In principle, yes: but it's not trivial. You'd often need to add a Show constraint to a whole chain of definitions just so you can put a trace in.

jgm avatar Sep 29 '20 17:09 jgm

I figured that most types that represent some kind of textual output would have a Show instance. Is there a good reason why your type doesn't?

jgm avatar Sep 29 '20 17:09 jgm

The type we are using is a virtual dom implemented in GHCJS, so it's full of a ton of IO and ->.

tysonzero avatar Sep 29 '20 23:09 tysonzero

Would it be hard to wrap it in a newtype and define a dummy Show instance for that?

jgm avatar Sep 30 '20 06:09 jgm

Yeah that's what we plan on doing for the time being. Always on the lookout for ways to make our code more clean though!

tysonzero avatar Sep 30 '20 09:09 tysonzero

I just noticed ToPlainText which seems more problematic, as unlike Show it seems like it is actually used in a meaningful way in the parser, so a dummy instance will presumably break the AutoIdentifiers extension.

tysonzero avatar Oct 01 '20 02:10 tysonzero

Oh yes, that's true. There are certain places where we need to downshift some inline content to plain text (e.g. in generating alt tags for images).

jgm avatar Oct 01 '20 16:10 jgm

Would it be possible for the parser to keep track of the parsed plain text, to avoid needing to pull it back out of the result type?

tysonzero avatar Oct 02 '20 01:10 tysonzero

Looking more closely: in core commonmark, toPlainText is only used in SourceMap and in Html. So you shouldn't have a problem there. In commonmark-extensions, as you note, it's used for auto_identifiers. You could always avoid that extension.

jgm avatar Oct 02 '20 05:10 jgm

Yeah for now that extension just won't work. I don't think it's an extension we were really planning on using, although keeping compatibility with gfm would be nice.

tysonzero avatar Oct 02 '20 09:10 tysonzero

I just wanted to chime in and say this issue was very useful for me to read. This clarification helped me understand better and it might be worth mentioning in some documentation if this behavior is kept (requiring Show instance).

To make myself useful, I wouldn't mind making a PR to the README.md or some of the docs for exactly that, myself, if you want.

hyperrealgopher avatar Jan 23 '21 06:01 hyperrealgopher

@tsonzero

Would it be possible for the parser to keep track of the parsed plain text, to avoid needing to pull it back out of the result type?

It's a good thought; I want to keep this open to come back and reconsider this.

@hyperrealgopher

Sure if there is some documentation that would have been useful for you, don't hesitate to propose it.

jgm avatar Jan 23 '21 17:01 jgm