Statiq.Framework Run Tag Helpers on generated Markdown content

I have a feeling this likely isn't possible (guessing that the innermost RenderBody() is getting raw HTML from the document's content), but I wonder whether it is possible to have tag helpers run across the generated/embedded Markdown?

My tag helpers are running fine for all HTML inside .cshtml files but they don't pick up the generated HTML from Markdown.

In this case, I'm trying to auto-add rel="noopener" to all external anchors, and tag helpers seemed the best way to do so without duplicating the effort (e.g. by having a custom Markdown renderer).
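For readers unfamiliar with the approach, here is a minimal sketch of such a tag helper. The class name `ExternalLinkTagHelper` and the rule for what counts as "external" (any absolute http/https URL) are assumptions for illustration; they are not taken from the project discussed here.

```csharp
using System;
using Microsoft.AspNetCore.Razor.TagHelpers;

// Hypothetical tag helper: the name and the "external" rule are assumptions.
// It targets every <a> element that carries an href attribute.
[HtmlTargetElement("a", Attributes = "href")]
public class ExternalLinkTagHelper : TagHelper
{
    public override void Process(TagHelperContext context, TagHelperOutput output)
    {
        // href is not bound to a property, so it is still in the output attributes.
        var href = output.Attributes["href"]?.Value?.ToString();
        if (!IsExternal(href))
        {
            return;
        }

        // SetAttribute replaces any existing rel value on the anchor.
        output.Attributes.SetAttribute("rel", "noopener");
    }

    // Assumption: "external" means an absolute http or https URL.
    // Relative links, fragments, and mailto: links are left untouched.
    public static bool IsExternal(string href) =>
        Uri.TryCreate(href, UriKind.Absolute, out var uri)
        && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
}
```

A tag helper like this only runs for views where it is registered with an `@addTagHelper` directive (typically in `_ViewImports.cshtml`), naming the assembly that contains it.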

One thought I had for this is that Statiq could optionally run the content from Markdown through the Razor engine, though I figure that creates issues with @ needing to be escaped (or maybe it could be auto-escaped to avoid embedding Razor in Markdown embedded in a Razor template).

Oct 31 '21 09:10 Turnerj

> One thought I had for this is if Statiq could optionally run the content from Markdown through the Razor engine

Are you using Statiq Web or Framework? Web actually does exactly this! It's why you can set a Razor based layout and the Markdown files render with it.

> though I figure that creates issues with @ needing to be escaped

This is by far the biggest pain in the ass with Markdown -> Razor rendering. And the problem is you just can't escape the @ because it might be Razor code that you're trying to flow through. Or it might be literal and shouldn't be interpreted as Razor code. You can see in the code documentation for the RenderMarkdown module what the default behavior here is:

Which I guess brings us back to using tag helpers and why they're not working in this situation even though they should. Could it be the lack of a good @addTagHelper directive? What does the Markdown file look like?

Nov 02 '21 01:11 daveaglick

I've spent a few hours trying to debug it and create a minimum example of the issue and I thought I got close - it looked like nested markdown files were the problem. That said, even when individually testing that outside of the site, it still wasn't right - I couldn't consistently recreate the issue.

What seems to be the problem is actually the Razor cache. Changing certain details about the tag helper doesn't seem to affect cached razor. Manually deleting the "cache" folder solved the issue.

Anyway, here is a link to a Git repo where you can reproduce the problem with the following steps:

https://github.com/Turnerj/StatiqRazorMarkdownTagHelperTest

1. Build and preview the site, viewing the output that shows text set by the tag helper.
2. Stop Statiq and modify "TestConst" in the tag helper, rebuild and preview the new const value in the output.
3. Stop Statiq and switch which attribute is commented out on the tag helper, rebuild and preview (the output still appears even though it shouldn't).
4. Stop Statiq and modify "TestConst" in the tag helper again, rebuild and preview the new const value in the output (it still shouldn't appear but it does).
5. Stop Statiq and modify text in the Markdown file, rebuild and preview (the output is gone in the Markdown frame).
6. Stop Statiq and modify "TestConst" in the tag helper again, rebuild and preview the new const in the Razor frame (the output shouldn't be appearing).
7. Stop Statiq, delete the cache folder, rebuild and preview (the output is now gone in the Razor frame).

Nov 03 '21 09:11 Turnerj

The weird thing I find with it is that attributes on the tag helper seem to be cached differently than the tag helper code itself (that's what the parts about changing the constant are meant to prove). Additionally though, it seems that shared assets in an assembly cache (the tag helper) are re-generated for a single page when the content of the page is changed BUT the old shared assets still are used for other pages (that's what the part about changing the markdown file and the last change of the constant are meant to prove).

Nov 03 '21 09:11 Turnerj

What I was hitting in my original description, though, happened because I was changing the attributes of my tag helper, not the pages themselves. My impression that the tag helper wasn't running on the Markdown page was technically correct, but for a different reason: it wasn't running because the Razor cache was stale for that page.

Nov 03 '21 09:11 Turnerj

Fascinating! Thanks for all the research. You know what they say about hard problems in computer science and cache invalidation 😬

> Manually deleting the "cache" folder solved the issue.

Yeah, that definitely suggests it's in the caching algorithm. The cache can be turned off globally, but I'd rather get to the bottom of why it's not invalidating documents that it should.

There are actually two kinds of caching going on. The first happens within a single run of the app and is designed to speed up subsequent executions. This cache compares the content and the metadata of a document, and if either has changed, the document is invalidated and reprocessed. Of course, there's a lot of space in "compare the metadata" because metadata can be nested, contain objects, etc. So this cache is best-effort and can get confused (for example, it won't pick up if some other document or file being used changes in a way that would change the output).

The second form of caching is between runs and is designed to speed up cold-start performance. This is what goes in the cache folder, and it is a much newer form of caching that was added relatively recently. This cache contains a dump of all the Razor assemblies being used so that they can be loaded on the next application run without having to recompile. Again, the cache invalidation is based on a hash of content and metadata plus some timestamps for files from disk.
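To make the idea of "a hash of content and metadata plus some timestamps" concrete, here is a rough sketch of how such a cache key could be built. This is not Statiq's actual implementation: the `DocumentCacheKey` type, its flat string metadata, and the choice of SHA-256 are all assumptions made for illustration (and `Convert.ToHexString` requires .NET 5 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sketch of a document cache key; not Statiq's real cache code.
public static class DocumentCacheKey
{
    public static string Compute(
        string content,
        IReadOnlyDictionary<string, string> metadata,
        DateTime lastWriteTimeUtc)
    {
        var sb = new StringBuilder();
        sb.Append(content).Append('\n');

        // Sort by key so the same metadata always yields the same hash,
        // regardless of the order in which entries were added.
        foreach (var pair in metadata.OrderBy(p => p.Key, StringComparer.Ordinal))
        {
            sb.Append(pair.Key).Append('=').Append(pair.Value).Append('\n');
        }

        // Include the source file timestamp so edits on disk change the key.
        sb.Append(lastWriteTimeUtc.Ticks);

        using var sha = SHA256.Create();
        var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(sb.ToString()));
        return Convert.ToHexString(hash);
    }
}
```

A key built this way changes only when the document's own content, metadata, or file timestamp changes; nothing outside the document feeds into it.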

> Anyway, here is a link to a Git repo where you can reproduce the problem with the following steps

Awesome!! Thanks for the repro with detailed steps. That should be enough to go on so I can track this down. Off the top of my head I'm stumped as to why it's not invalidating this document since you're clearly changing the file on disk.

...

UPDATE: Oh, I see! You're changing the tag helper code, not the Markdown or Razor page content. And since the tag helper is "taking over" an element that the Markdown renderer creates (in this case <h1>), it's applying quietly behind the scenes. This caching problem aside, defining a tag helper to specialize existing elements is a cool idea. But back to the topic! I suspect you'd see the same behavior without the Markdown step at all - if you added an <h1> directly in the Razor file for example.

This is a tricky one, and I can see why it's not invalidating now. The cache invalidation is based on the document itself; it doesn't know anything about external libraries or code. The same problem would occur if the Razor file used any call that was defined in code externally (either inside your app or brought in from a package, etc.). Since it would be way too hard to figure out and track all the code being called by a Razor view (and all the code it calls, etc. recursively) to determine if any of it would produce different results, we need to deal with this holistically.

I'd still prefer to err on the side of invalidating when we don't actually need to rather than not invalidating when we do need to (i.e. your case). The best I can come up with right now, assuming this is even possible, is to store a combined hash of the Statiq application assembly along with all linked assemblies. If any of those change, indicating changes to code inside your app or another linked app, then invalidate the cache. I'll have to research to see if there's a way to get a more stable hash from an assembly (i.e. one that persists between builds). Maybe using a version would be better, though since that wouldn't have helped in this case, I'm not sure.

Anyway - does the explanation make sense as to what's going on? Any thoughts on a direction to help with this and similar cases?

Nov 03 '21 18:11 daveaglick

Looking around for ways to compare assemblies, I stumbled on https://www.codeproject.com/articles/501631/compare-two-dll-files-programmatically-using-hash which is about as complicated as I suspected - the problem is that the assembly is stamped with several things that change on every build so there's no (easy) way to look at two assemblies and determine if they contain the same code/IL. There's a push towards more deterministic builds now (which freezes all of this), but it's a toss-up whether a given assembly uses that feature.

So...we can probably rely on version numbers for external assemblies, but I have no idea right now how we can say "when the code contained in my Statiq app changes, invalidate the cache...but don't invalidate it if I'm just rebuilding the app but the code hasn't changed".

Nov 03 '21 19:11 daveaglick

So this looks like it's going to be a pretty deep rabbit hole. On the one hand, I think it would be a good thing to solve - I can see lots of other cases like this one not even related to tag helpers where code changes and the cache needs to be invalidated but isn't. On the other hand, there are definitely cases where the answer is simply "turn off the cache" because the invalidation logic will never be perfect. Having a hard time deciding if this situation is the former or latter.

Nov 03 '21 19:11 daveaglick

Okay, might have a fairly workable solution after talking it through on Twitter. I think we can generate an aggregated "all the code involved" hash by:

1. Turning on deterministic builds for the Statiq app through the Statiq.App.props file that ships with Statiq.App. It's a little sneaky, but deterministic builds don't hurt anything, and if they're needed for caching to work well, it seems like a valid thing to turn on automatically.
2. Hashing the Assembly.ManifestModule.ModuleVersionId of the host app (which we know is now deterministic) and all referenced assemblies.
   - Package references would have the same module ID because they're the same file.
   - Local references may not have the same ID, so we'd need to document that in this case deterministic builds should be turned on for those as well.
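The two steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was eventually written: the `CodeHash` type is hypothetical, only direct references are walked (not the full transitive closure), and references that fail to load are simply skipped. It also assumes the host app is built with the MSBuild `Deterministic` property set to `true`, so that its module version ID stays the same when the compilation inputs are unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Security.Cryptography;

// Hypothetical helper for an aggregated "all the code involved" hash.
public static class CodeHash
{
    // Combines module version IDs into a single order-independent hash.
    public static string Combine(IEnumerable<Guid> moduleVersionIds)
    {
        var bytes = moduleVersionIds
            .Distinct()
            .OrderBy(id => id)                   // stable order, independent of load order
            .SelectMany(id => id.ToByteArray())
            .ToArray();

        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(bytes));
    }

    // Collects the module version IDs of the host assembly and its direct references.
    public static IEnumerable<Guid> CollectIds(Assembly host)
    {
        yield return host.ManifestModule.ModuleVersionId;

        foreach (var name in host.GetReferencedAssemblies())
        {
            Assembly referenced;
            try
            {
                referenced = Assembly.Load(name);
            }
            catch (Exception)
            {
                // Assumption: a reference that cannot be loaded is ignored.
                continue;
            }

            yield return referenced.ManifestModule.ModuleVersionId;
        }
    }
}
```

A cache could store `CodeHash.Combine(CodeHash.CollectIds(Assembly.GetEntryAssembly()))` next to its entries and discard them all when the stored value differs on the next run.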

Going to start work on this and see how practical it is.

Nov 03 '21 19:11 daveaglick

Thanks for your quick and detailed response!

> This is a tricky one, and I can see why it's not invalidating now. The cache invalidation is based on the document itself; it doesn't know anything about external libraries or code. The same problem would occur if the Razor file used any call that was defined in code externally (either inside your app or brought in from a package, etc.). Since it would be way too hard to figure out and track all the code being called by a Razor view (and all the code it calls, etc. recursively) to determine if any of it would produce different results, we need to deal with this holistically.

So this is one of the things I find the most weird - I would have thought each Razor assembly was linked to the assembly with the Tag Helper. Then changing the Tag Helper would recompile its assembly and thus force recompilation of the assemblies that reference it (similar to what would happen in a normal build with project references). But I guess this is the result of binary compatibility - the dependent assemblies don't need a recompilation unless I changed something about a referenced method signature or perhaps even a parent Razor layout.

From everything else you've said, it sounds like you're taking the best path forward. Happy to help however I can with it though!

Nov 04 '21 04:11 Turnerj
