Building relations between documents before querying

Open mklueh opened this issue 2 years ago • 5 comments

Hello,

I'm the developer of Gridsome Recommender Plugin, a plugin for the http://gridsome.com/ static site generator that automatically builds relationships between your documents using "machine learning". It automates tasks such as tagging posts or linking similar products or articles.

This is also showcased in my blog at https://overflowed.dev/ where you'll see "Other posts" sections that are auto-generated.

Gridsome's approach is pretty nice: it reads all your Markdown files or other content into one content layer, which exposes an API that lets you transform the loaded documents before they are turned into the GraphQL layer.

My plugin hooks into that content layer and does its training and relation-building there. The content can then be queried with GraphQL like this:

query {
  allBlogPost{
    edges{
      node{
        id
        path
        title
        slug
        related{
          id
          path
          title
        }
        tags{
          id
          path
          title
        }
      }
    }
  }
}

It results in a response like this:

{
  "data": {
    "allBlogPost": {
      "edges": [
        {
          "node": {
            "id": "c49d59f3adc970325ddfe964b0704f16",
            "path": "/blog/gridsome-recommender-plugin-one/",
            "title": "This is a post about Gridsome and the Gridsome Recommender Plugin",
            "slug": "gridsome-recommender-plugin-one",
            "related": [
              {
                "id": "97343729018a3b99c94793e0dc98547d",
                "path": "/blog/vue-overflowed-dev-blog/",
                "title": "This is a post about Vue and the overflowed.dev blog created with Gridsome and the Gridsome Recommender Plugin"
              }
            ],
            "tags": [
              {
                "id": "iL9Z22tuPL",
                "path": "/tag/gridsome/",
                "title": "Gridsome"
              },
              {
                "id": "B_gnVHnF6",
                "path": "/tag/plugin/",
                "title": "Plugin"
              }
            ]
          }
        }
      ]
    }
  }
}

The related and tags nodes are appended by the gridsome-recommender-plugin. They reference either other posts or tags from the tag collection, which are Markdown files themselves.

In case someone is interested in more details, I'm leaving this blog post here as well: https://overflowed.dev/blog/building-a-gridsome-plugin-for-related-posts/

Migration to nuxt/content

Although the Gridsome developers have not officially abandoned the project, there is very little activity on GitHub and no clear direction anymore. Some pull requests, including one of mine, have gone unmerged for ages, and Vue 3 is in my opinion far from being supported. Threads like this one give me no hope that the maintainers are even interested in talking to the community: https://github.com/gridsome/gridsome/issues/1632

I'd like to migrate my blog to Nuxt 3 in the future, but I'd really like a solution similar to mine to help with tasks like tagging, categorizing, and finding related posts and products. In a static site environment this is especially time-consuming to do manually when using a CMS like Netlify CMS (as I do), which works directly on your Git repository and requires every change to be persisted as commits and pull requests. That is very slow and allows no bulk operations.

In my opinion, all it takes to make something like my plugin work with nuxt/content is a hook that allows retrieving and editing the entire loaded collection of documents before it is made available to the query API.
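
To illustrate the shape of the hook being asked for, here is a minimal TypeScript sketch. Everything in it (`ContentStore`, `onAllParsed`, `load`, `findById`, `ParsedDoc`) is hypothetical and made up for illustration; none of it is an existing nuxt/content API. The only point is that transforms receive the whole collection once, after parsing and before any query runs.

```typescript
// Hypothetical types and names: this is NOT the nuxt/content API.
type ParsedDoc = { id: string; [key: string]: unknown };
type CollectionTransform = (docs: ParsedDoc[]) => ParsedDoc[];

class ContentStore {
  private transforms: CollectionTransform[] = [];
  private docs: ParsedDoc[] = [];

  // Register a transform that sees every parsed document at once.
  onAllParsed(transform: CollectionTransform): void {
    this.transforms.push(transform);
  }

  // Called once after parsing; runs the transforms in registration order
  // and stores the result that later queries will read from.
  load(parsed: ParsedDoc[]): void {
    this.docs = this.transforms.reduce((acc, t) => t(acc), parsed);
  }

  // Stand-in for the query API: it only ever sees transformed documents.
  findById(id: string): ParsedDoc | undefined {
    return this.docs.find((d) => d.id === id);
  }
}

// Usage: annotate every document with information about the whole collection.
const store = new ContentStore();
store.onAllParsed((docs) => docs.map((d) => ({ ...d, total: docs.length })));
store.load([{ id: "a" }, { id: "b" }]);
```

A per-file hook cannot express the `total` field above, because it never sees more than one document at a time.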

Will this be possible with the rewrite for Nuxt 3, or can it be made possible?

Thanks in advance

mklueh avatar May 11 '22 19:05 mklueh

Hey @mklueh, sorry for the late response. I like the idea of having a hook for retrieving and modifying all contents. It would help with adding new features like your plugin, WikiLinks and ...

WDYT @Atinux @Tahul ?

farnabaz avatar Jun 22 '22 14:06 farnabaz

I think that you can use the content:file:afterParse hook in order to get the metadata & body of the document (https://github.com/nuxt/content/blob/main/src/runtime/server/transformers/index.ts#L34)

You can read more about them in https://content.nuxtjs.org/api/advanced

Atinux avatar Jun 28 '22 12:06 Atinux

@Atinux sorry for the late response. I'm currently looking into this issue again.

I think the problem generally is, I would need all parsed documents, not just a single one.

Is nuxt/content pre-loading and parsing everything at once or is everything loaded and parsed based on each query that is executed?

In the first case, it would be possible if there were a hook such as "content:files.parsed" that provides a collection of all parsed files and their types. In the latter case I don't see a way to make it work.

The essence of the idea is creating relationships between different documents, even between different types.

mklueh avatar Sep 04 '22 18:09 mklueh

Would you mind providing an example of what you would like to achieve @mklueh as a sandbox?

Atinux avatar Sep 05 '22 12:09 Atinux

> Would you mind providing an example of what you would like to achieve @mklueh as a sandbox?

I'm not sure how I should do this, but just imagine something that automatically creates relations between posts, for example based on similarities in the title or text. If you query one post by id, it will contain a list of similar posts.

Therefore, all content must be loaded, analyzed, and extended with references before it is provided by the query function.
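
As a rough illustration of why the whole collection is needed, here is a minimal TypeScript sketch. It uses plain Jaccard token overlap as a stand-in similarity measure, which is an assumption for the example and not how the Gridsome plugin actually works. The `Doc` shape, `addRelated`, and the threshold values are hypothetical as well.

```typescript
// Hypothetical document shape; not a nuxt/content or Gridsome type.
interface Doc {
  id: string;
  title: string;
  body: string;
  related?: string[]; // ids of similar documents, filled in by addRelated
}

// Lowercase, split on non-alphanumerics, and drop tokens shorter than 3 chars.
function tokenize(text: string): Set<string> {
  return new Set(
    text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 2)
  );
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Compares every document with every other one, so it needs the full
// collection up front. Returns copies with a `related` list of ids.
function addRelated(docs: Doc[], threshold = 0.2, max = 3): Doc[] {
  const tokens = docs.map((d) => tokenize(`${d.title} ${d.body}`));
  return docs.map((doc, i) => {
    const related = docs
      .map((other, j) => ({ id: other.id, score: jaccard(tokens[i], tokens[j]) }))
      .filter((c, j) => j !== i && c.score >= threshold)
      .sort((x, y) => y.score - x.score)
      .slice(0, max)
      .map((c) => c.id);
    return { ...doc, related };
  });
}
```

Because each `related` list depends on all other documents, this step cannot run inside a hook that only receives one file at a time.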

This is a demo of my Gridsome plugin, which does exactly that. All references on that page between tags and posts, posts and tags, and posts and posts are auto-generated based on the context of texts and keywords: https://mklueh.github.io/gridsome-plugin-recommender/

mklueh avatar Sep 05 '22 17:09 mklueh

@Atinux Hi, getting back to this as I'm stuck again with transitioning my site from Gridsome to Nuxt.

  1. My posts have tags
  2. Tags are dynamic pages, that should show which posts relate to them

Is there any chance to build this kind of relationship with Nuxt 3 content, without managing references on both sides of the relationship? With Gridsome, it was enough to reference a tag by id from one post, and Gridsome created the reverse reference automatically. From a CMS point of view, this is easy to handle. Without that, it would be a mess.

This was the Gridsome configuration of my website:

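    // gridsome.config.js (excerpt)
    module.exports = {
      plugins: [
        {
          // Tags are Markdown files, one per tag
          use: '@gridsome/source-filesystem',
          options: {
            path: 'content/tags/**/*.md',
            typeName: 'Tag',
          },
        },
        {
          use: '@gridsome/source-filesystem',
          options: {
            path: 'content/posts/**/*.md',
            typeName: 'Post',
            // Resolve these front matter fields to nodes of the given types.
            // The source for the Author type is not shown in this excerpt.
            refs: {
              tags: 'Tag',
              authors: 'Author',
            },
          },
        },
      ],
    }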

It did the following:

  1. Read entities from Markdown files and build collections for posts and tags
  2. Reference tags from posts and vice versa (many-to-many)
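
The reverse side of that relationship can be derived from the one-sided references, as in this minimal TypeScript sketch. The `Post` and `Tag` shapes and the `postsByTag` helper are hypothetical and not part of Gridsome or nuxt/content; the sketch assumes posts reference tags by id, as described above.

```typescript
// Hypothetical shapes: posts reference tags by id, tags know nothing about posts.
interface Tag {
  id: string;
  title: string;
}

interface Post {
  id: string;
  title: string;
  tags: string[]; // tag ids, maintained only on the post side
}

// Build the reverse index (tag id -> post ids) in one pass over all posts.
function postsByTag(posts: Post[], tags: Tag[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  // Start with every known tag so unused tags still get an empty list.
  for (const tag of tags) index.set(tag.id, []);
  for (const post of posts) {
    for (const tagId of post.tags) {
      const bucket = index.get(tagId);
      // References to unknown tag ids are skipped, not created implicitly.
      if (bucket) bucket.push(post.id);
    }
  }
  return index;
}

// Usage: a tag page can look up its posts without the tag file listing them.
const tagIndex = postsByTag(
  [
    { id: "p1", title: "First post", tags: ["gridsome", "plugin"] },
    { id: "p2", title: "Second post", tags: ["gridsome", "unknown-tag"] },
  ],
  [
    { id: "gridsome", title: "Gridsome" },
    { id: "plugin", title: "Plugin" },
    { id: "vue", title: "Vue" },
  ]
);
```

Like the relation-building above, this needs every post loaded before the first tag page is rendered.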

mklueh avatar Oct 29 '22 19:10 mklueh