gatsby-source-s3

Attaching S3 images to another node

Open LpmRaven opened this issue 5 years ago • 19 comments

I just wanted to suggest documenting another way of using this plugin. The site I am building is an e-commerce platform with thousands of images (for all the products). This exposed a major pain point with Gatsby: querying images. For a long time, I had a component that queried all images and matched them up to their respective products (as proposed in this Stack Overflow post). This is highly inefficient and throws warnings about the duration of the query.

An alternative is to attach the imageFile to the product at the data level, rather than at render time.

src/gatsby-api/create-resolvers/index.js

const resolvers = {
    AWSAppSync_Product: {
        imageFile: {
            type: 'File',
            resolve: async (source, args, context, info) => {
                const node = await context.nodeModel.runQuery({
                    query: {
                        filter: {
                            Key: { eq: source.image1 }
                        }
                    },
                    type: 'S3Object',
                    firstOnly: true
                });

                if (node && node.imageFile) return node.imageFile;
            }
        },
    },
}

module.exports = {
    resolvers
}

gatsby-node.js

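// Load the resolver map defined in src/gatsby-api/create-resolvers/index.js
const { resolvers } = require('./src/gatsby-api/create-resolvers');

exports.createResolvers = ({ createResolvers }) => {
    createResolvers(resolvers)
}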

src/components/image/index.js

import React from 'react'
import Img from 'gatsby-image'

export const Image = props => {
  if (props.imageFile && props.imageFile.childImageSharp && props.imageFile.childImageSharp.fluid) {
    return <Img className={props.imgClassName} alt={props.alt} fluid={props.imageFile.childImageSharp.fluid} />;
  }
  // Render nothing when no processed image is available (a component must not return undefined).
  return null;
};

Then use it like:

<Image
  imageFile={product.imageFile}
  alt=""
/>

AWSAppSync_Product is the type of node I am attaching my File to (it can be found in the GraphQL playground on localhost). The resolve function matches the Key of the S3Object against image1 (a string) on the product. This allows me to use the product images directly, without having to run a query inside the image component.

In my opinion, this is a valuable piece of information once you wrap your head around it and it certainly has helped me a lot. Thanks @Js-Brecht.

LpmRaven avatar Jun 19 '20 04:06 LpmRaven

This is a really good idea! I'm actually also seeing the warnings for inefficient queries on a site with component-level filtering, I'll need to try it your way! 🎉

I'd love the plugin to document this, I think it's quite a common question to have when building a Gatsby site, since static queries don't take variables (https://github.com/gatsbyjs/gatsby/issues/10482). And I've mostly seen filtering built into components, like you mentioned.

Do you think this should be in the README? It's not really about the plugin itself, but more about a best practice for dealing with sourced images in Gatsby. We could have a new section, or alternatively some extra docs.

Do you want to draft this or should I @LpmRaven?

robinmetral avatar Jun 19 '20 08:06 robinmetral

Do you think this should be in the README? It's not really about the plugin itself, but more about a best practice for dealing with sourced images in Gatsby. We could have a new section, or alternatively some extra docs.

Do you want to draft this or should I @LpmRaven?

I think it would be good to include somewhere in the README. I would welcome you to draft something, I can assist.

LpmRaven avatar Jul 13 '20 10:07 LpmRaven

Hi @LpmRaven! I'm trying to wrap my head around this and apply it to my use case, and I'd love some help 🙂

If I understand correctly, you're linking your products with their images via this resolver in gatsby-node.js.

What would it look like for things like images in a blog post? Here are some initial thoughts:

  • I imagine that there should be something in the S3Object nodes themselves to link them to the right post, like an id or slug in the Key to match against. For example there would be first-post-001.jpg and first-post-002.jpg in S3, and they would be matched with the first-post.md article.
  • In your case, you're adding a single imageFile to each AWSAppSync_Product. We could probably have an imageFiles array instead, and this would allow us to filter through only the specific post's images in the Image component, rather than all S3 images.
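
A minimal sketch of the key-matching idea from the first bullet, assuming image keys follow a `<slug>-<number>.<extension>` naming convention as in the first-post-001.jpg example. The helper name keysForSlug is hypothetical and not part of the plugin.

```javascript
// Hypothetical helper: pick the S3 keys that belong to one post.
// Assumes keys are named `<slug>-<digits>.<extension>`, e.g. first-post-001.jpg.
function keysForSlug(slug, keys) {
  // Escape regex metacharacters so the slug is matched literally.
  const escaped = slug.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp(`^${escaped}-\\d+\\.[a-z0-9]+$`, 'i');
  return keys.filter((key) => pattern.test(key)).sort();
}

const keys = ['first-post-002.jpg', 'second-post-001.jpg', 'first-post-001.jpg'];
console.log(keysForSlug('first-post', keys));
```

Requiring digits directly after the slug keeps a slug such as first from accidentally picking up the images of first-post.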

Thoughts?

Also, maybe I've missed something, but something else I've done in the past for foreign-key relationships is use the GraphQL @link directive. I wrote about this in an answer on StackOverflow. Could this be an alternative to resolvers?

robinmetral avatar Jul 29 '20 16:07 robinmetral

Hi @robinmetral!

Yes, the resolver is linking the images with their respective products. This could also be done with the schema customisation API but I'm not sure if that would work with multiple images (for the same product/blog post) as the @link directive is a one-to-one relationship (as I understand, correct me if I am wrong), and would have to match the foreign-key exactly? Whereas, with the resolver API you could check if the Key contains first-post.

Yes, the s3Object Key should contain an identifier slug, your example is exactly how I would do it.

I'm actually going to start adding additional images for each product, I like the imageFiles array suggestion. Yes, you could then take the array of imageFiles and use them as you wish, I don't think you even need to filter them, depending on how you use them.
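
For anyone following along, here is a rough, untested sketch of what an imageFiles array resolver could look like. It assumes the product exposes a slug field (hypothetical), that each S3Object carries the localFile___NODE foreign key shown in the plugin source quoted later in this thread, and that nodeModel.getAllNodes is available (it is in Gatsby 2 and 3; newer versions may differ).

```javascript
// Sketch of an `imageFiles` resolver that returns every image whose Key
// contains the product's `slug` (a hypothetical field on the product).
const imageFilesResolvers = {
  AWSAppSync_Product: {
    imageFiles: {
      type: ['File'],
      resolve: async (source, args, context) => {
        if (!source.slug) return [];
        // Assumption: getAllNodes exists on nodeModel (Gatsby 2/3).
        const objects = context.nodeModel.getAllNodes({ type: 'S3Object' });
        return objects
          .filter((node) => node.Key.includes(source.slug) && node.localFile___NODE)
          .map((node) =>
            context.nodeModel.getNodeById({ id: node.localFile___NODE, type: 'File' })
          );
      },
    },
  },
};

module.exports = { imageFilesResolvers };
```

It would be passed to createResolvers in gatsby-node.js in the same way as the single-image resolver map.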

LpmRaven avatar Jul 31 '20 02:07 LpmRaven

Hi @robinmetral 🙂 I was hoping you could clear up my possible misunderstanding of how to use the @link directive.

I'm trying to get a better understanding of where the schema customization API and the createResolvers API are best used. It just so happens I am running into some issues around this. I originally thought these resolvers would work: when logging, they populate the Category and SubCategory types and product.category, but in the GraphQL playground every product.category is null.

const categoryResolvers = {
    AWSAppSync_Product: {
        category: {
            type: 'Category',
            resolve: async (source, args, context, info) => {
                if (source.categorySlug) {
                    return categorys.find(category => category.slug === source.categorySlug)
                }
            }
        },
        subCategory: {
            type: 'SubCategory',
            resolve: async (source, args, context, info) => {
                if (source.subCategorySlug) {
                    return subCategorys.find(subCategory => subCategory.slug === source.subCategorySlug)
                }
            }
        },
    }
};

As an alternative approach, I thought I would try using the link directive. There is a one-to-many relationship between categories and products, one category can apply to many products, would the @link directive work here? I'm creating the categories using sourceNodes and the AWSAppSync_Product is being created by gatsby-source-graphql.

const categorys = [
    {
        formattedName: "Mens",
        slug: "mens",
    },
    {
        formattedName: "Womens",
        slug: "womens",
    },
    {
        formattedName: "Boys",
        slug: "boys",
    },
    {
        formattedName: "Girls",
        slug: "girls",
    },
]

module.exports = {
    categorys
}

const typeDefs = `
type Category implements Node @dontInfer {
    formattedName: String!
    slug: String!
}

type SubCategory implements Node @dontInfer {
    formattedName: String!
    slug: String!
}
`

module.exports = {
    typeDefs
}

const sourceCategoryNodes = ({ actions, createNodeId, createContentDigest }) => {
    categorys.forEach(category => {
        const node = {
            ...category,
            id: createNodeId(`Category-${category.slug}`),
            internal: {
                type: "Category",
                contentDigest: createContentDigest(category),
            },
        }
        actions.createNode(node)
    })
}

module.exports = {
    sourceCategoryNodes,
}

gatsby-node.js

const { typeDefs } = require('./src/gatsby-api/create-schema-customization');
const { sourceCategoryNodes } = require('./src/gatsby-api/source-nodes');

exports.createSchemaCustomization = ({ actions: { createTypes } }) => {
    createTypes(typeDefs);
}

exports.sourceNodes = ({ actions, createNodeId, createContentDigest }) => {
    sourceCategoryNodes({ actions, createNodeId, createContentDigest });
};

How would I extend the existing AWSAppSync_Product type with a category that matches the product's categorySlug using @link?

LpmRaven avatar Aug 11 '20 04:08 LpmRaven

Just to note, I actually got my initial resolver implementation to work. I just needed the categorySlug to be requested alongside the category in the graphql playground; I'm still interested to know how I would do this with @link 😄
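
A small sketch of why this may happen. It assumes (not verified here) that for types stitched in by gatsby-source-graphql, the source passed to a resolver is the remote response object and so only carries the fields selected in the query. resolveCategory is a standalone stand-in for the category resolver above, and the category list is shortened.

```javascript
const categorys = [
  { formattedName: 'Mens', slug: 'mens' },
  { formattedName: 'Womens', slug: 'womens' },
];

// Stand-in for the `category` resolver: it can only match on what `source` holds.
function resolveCategory(source) {
  if (!source.categorySlug) return null;
  return categorys.find((category) => category.slug === source.categorySlug) || null;
}

// Query selected only `category`: `source` has no categorySlug, so nothing matches.
console.log(resolveCategory({}));
// Query selected `categorySlug` as well: the lookup succeeds.
console.log(resolveCategory({ categorySlug: 'mens' }));
```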

LpmRaven avatar Aug 11 '20 04:08 LpmRaven

@LpmRaven hope you don't mind me adding a bit to the conversation.

Theoretically, it looks like the following may work. I say theoretically because I don't have the option of testing this myself at the moment.

With foreign key fields like categorySlug and subCategorySlug, you would likely need something like this:

type Category implements Node @dontInfer {
    formattedName: String!
    slug: String!
}

type SubCategory implements Node @dontInfer {
    formattedName: String!
    slug: String!
}

type AWSAppSync_Product {
  category: Category @link(by: "slug", from: "categorySlug")
  subCategory: SubCategory @link(by: "slug", from: "subCategorySlug")
}

On the other hand, if you changed categorySlug and subCategorySlug to category and subCategory, you should be able to drop the from: parameter. Since @link uses the original node resolver behind the scenes, you shouldn't need to worry about overwriting the field.

You can find the logic for the extensions here. The @link resolver in particular is here. (at the time of writing this)

It basically says "go fetch a Category type node with the by field equal to the from field of the original node". from, by default, is the field that the directive was applied to. It even has support for overriding the type parameter, if you wanted to. Perhaps that would be useful if you had a custom Query level resolver that linked back to a Category node 🤷‍♂️.

It looks to me like it should also have support for collecting a list of categories, too. If category were an array of category slugs, then I imagine something like this would work:

type Category implements Node @dontInfer {
    formattedName: String!
    slug: String!
}

type SubCategory implements Node @dontInfer {
    formattedName: String!
    slug: String!
}

type AWSAppSync_Product {
  category: [Category] @link(by: "slug")
  subCategory: [SubCategory] @link(by: "slug")
}
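
As a conceptual model only (this is not Gatsby's implementation of @link), the lookup described above can be sketched in plain JavaScript. The function name resolveLink and the in-memory nodes array are hypothetical; by is assumed to default to id, and from defaults to the field the directive is applied to.

```javascript
// Conceptual model of @link: find the target node whose `by` field equals
// the value stored in the source node's `from` field.
function resolveLink(nodes, source, fieldName, { by = 'id', from = fieldName } = {}) {
  const value = source[from];
  if (value == null) return null;
  const lookup = (v) => nodes.find((node) => node[by] === v) || null;
  // A list of foreign keys resolves to a list of nodes.
  return Array.isArray(value) ? value.map(lookup) : lookup(value);
}

const categoryNodes = [
  { id: 'c1', slug: 'mens', formattedName: 'Mens' },
  { id: 'c2', slug: 'boys', formattedName: 'Boys' },
];

// Equivalent of: category: Category @link(by: "slug", from: "categorySlug")
console.log(resolveLink(categoryNodes, { categorySlug: 'mens' }, 'category', { by: 'slug', from: 'categorySlug' }));
// Equivalent of: category: [Category] @link(by: "slug")
console.log(resolveLink(categoryNodes, { category: ['mens', 'boys'] }, 'category', { by: 'slug' }));
```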

Js-Brecht avatar Aug 11 '20 20:08 Js-Brecht

Hi @Js-Brecht , your input is always welcome!

Thank you for that explanation, I hopefully can get this working so I can translate it into an example for gatsby-source-s3.

I have attempted to implement it this way but get Error: Schema must contain uniquely named types but contains multiple types named "AWSAppSync_Product". It's possible that gatsby-source-graphql is causing that issue.

LpmRaven avatar Aug 14 '20 02:08 LpmRaven

Hmm, yeah... I'm not sure if type merging will work properly when also using schema stitching. IIRC, merging and stitching happen at different stages, which is one reason that createResolvers is needed when using third-party schemas. This was talked about somewhere, but I can't find where; it was ages ago, so it would take a lot of digging.

Just to make sure, AWSAppSync_Product isn't a Node type, right? Pretty sure that when using third-party schema stitching there is only one root type.

Js-Brecht avatar Aug 14 '20 02:08 Js-Brecht

Here's one comment: https://github.com/gatsbyjs/gatsby/issues/23444#issuecomment-624746337

Still can't find the original discussion

Js-Brecht avatar Aug 14 '20 02:08 Js-Brecht

appSync is the root node type and AWSAppSync_Product is a type within that. From what I have read, I don't think it is possible to extend AWSAppSync_Product whilst using gatsby-source-graphql.

I did manage to get it working with the resolvers, so it's not such a big issue. I guess it's something to note if anyone is trying to use gatsby-source-s3 with other source plugins.

LpmRaven avatar Aug 14 '20 05:08 LpmRaven

Since upgrading to V2, my implementation of resolvers no longer works, as I was using my own version of createRemoteFileNode in my gatsby-node.js for the S3Objects. V2 allows for private S3 buckets, retrieving the file using a signed URL. This breaks my implementation, as the Url no longer exists on the node.

    if (node.internal.type === "S3Object" && node.Key && isImage(node.Key)) {
        try {
            const imageFile = await createRemoteFileNode({
                url: node.Url, // <-- no longer exists
                parentNodeId: node.id,
                cache,
                createNode,
                getCache,
                store,
                auth: {},
                httpHeaders: {},
                createNodeId,
                ext: null,
                name: null,
                reporter
            }).catch((error) => {
                reporter.error(error);
            });

            if (imageFile) {
                node.imageFile = imageFile;
            }

        } catch (error) {
            reporter.error(error);
        }
    } else if (node.Key) {
        reporter.warn(`${node.Key} is not an image`);
    }

At the point at which the resolvers run, the localFile (the foreign key set via node.localFile___NODE = imageFile.id;) has not been resolved. So I am unable to retrieve the localFile and add it to my AWSAppSync_Product. (I get null.)

imageFile: {
            type: 'File',
            resolve: async (source, args, context, info) => {
                if (source.image1 && !source.image1.startsWith('http')) {
                    const node = await context.nodeModel.runQuery({
                        query: {
                            filter: {
                                Key: { eq: source.image1 }
                            }
                        },
                        type: 'S3Object',
                        firstOnly: true
                    });
                    //console.log(node)
                    if (node && node.localFile) return node.localFile; // <--- This is null
                }
            }
        },

The only way I found to resolve it was to modify the plugin and add a line node.imageFile = imageFile;.

export async function onCreateNode({
  node,
  actions: { createNode },
  store,
  cache,
  reporter,
  createNodeId,
}) {
  if (node.internal.type === "S3Object" && node.Key && isImage(node.Key)) {
    try {
      // download image file and save as node
      const imageFile = await createRemoteFileNode({
        url: node.url,
        parentNodeId: node.id,
        store,
        cache,
        reporter,
        createNode,
        createNodeId,
      });

      if (imageFile) {
        // add local image file to s3 object node
        node.localFile___NODE = imageFile.id; // eslint-disable-line @typescript-eslint/naming-convention
        node.imageFile = imageFile; // <-- Added this line only. 
      }
    } catch (error) {
      reporter.error(error);
    }
  }
}

Is there a way around this?

LpmRaven avatar Aug 17 '20 04:08 LpmRaven

A couple things:

  • I notice that Url is capitalized in your first snippet, and in the last snippet it isn't.

  • I'm curious why the remote file node needs to be generated more than once. If you're implementing the image fetch yourself, and the plugin is also fetching it, then you're... kind of... duplicating the work. Headers from the first fetch will be cached, and then If-None-Match will be used with the ETag on the second fetch. This causes a second round-trip to the server, but it'd be quick because the second fetch will return a 304. However, the file node will be created again, and it will overwrite the file node created by the first fetch, so the second createRemoteFileNode will be like a roundabout (and slower) way to query the file node.

  • Your custom resolver shouldn't be running until the entire node store/schema is already generated. Resolvers don't run until at least createPages() (which is why you have a graphql parameter in that endpoint that you can use to query the store). That means that all sourceNodes() and onCreateNode() endpoints should have run already.

Matter of fact, this:

        node.localFile___NODE = imageFile.id; // eslint-disable-line @typescript-eslint/naming-convention
        node.imageFile = imageFile; // <-- Added this line only. 

would be kind of doing the same thing, except you're loading all of the raw imageFile data up onto the S3Object node (which results in the data being duplicated in the store), instead of just linking the id. If node.imageFile = imageFile works, then node.localFile___NODE = imageFile.id should work, too.

Is just the localFile property null in your custom resolver, or is it the entire node returned from runQuery()?

Js-Brecht avatar Aug 17 '20 18:08 Js-Brecht

  • Yes, Url is now url, but it's not really any help now anyway, as it's signed and will have expired.

  • If I'm completely honest, I forgot the custom implementation was still there 😅 But my intention is to remove it and have this work with the plugin's remote file node.

  • I have uncommented the log in the resolver.

        imageFile: {
            type: 'File',
            resolve: async (source, args, context, info) => {
                if (source.image1 && !source.image1.startsWith('http')) {
                    const node = await context.nodeModel.runQuery({
                        query: {
                            filter: {
                                Key: { eq: source.image1 }
                            }
                        },
                        type: 'S3Object',
                        firstOnly: true
                    });

                    console.log(node)

                    if (node && node.localFile) return node.localFile;
                }
            }
        },

The log gives this, which all seems fine.

{
  Key: '3230-blue-1.jpg',
  LastModified: 2020-08-13T07:05:09.000Z,
  ETag: '"3b21b4fd9c2a31b8831b804d279a059e"',
  Size: 59121,
  StorageClass: 'STANDARD',
  Bucket: 'images.mock',
  url: '<signedurl here>',
  id: '26f0732f-6fd6-5759-bdd6-ba2c10d69aaf',
  parent: null,
  children: [],
  internal: {
    type: 'S3Object',
    content: '{"Key":"3230-blue-1.jpg","LastModified":"2020-08-13T07:05:09.000Z","ETag":"\\"3b21b4fd9c2a31b8831b804d279a059e\\"","Size":59121,"StorageClass":"STANDARD","Bucket":"images.mock"}',
    contentDigest: '44830c4f1429ec25f3a94e4123cd81b9',
    counter: 355,
    owner: '@robinmetral/gatsby-source-s3'
  },
  localFile___NODE: '18b56aaa-96f0-5834-b6ce-51859d6e5b58',
  __gatsby_resolved: undefined
}

When I make the graphql queries the s3Object has the localFile, but imageFile is null. Should it be type: 'File',?

{
  s3Object(Key: {eq: "3230-blue-1.jpg"}) {
    Key
    localFile {
      childImageSharp {
        fluid {
          src
        }
      }
    }
  }
  appSync {
    listProducts(globalIdColor: "3230-blue", regionProductTypeCategorySlugSubCategorySlugSizeSupplierSlug: {beginsWith: {productType: "variant", region: gb}}) {
      items {
        globalIdColor
        image1
        imageFile {
          childImageSharp {
            fluid {
              src
            }
          }
        }
      }
    }
  }
}

With the response:

{
  "data": {
    "s3Object": {
      "Key": "3230-blue-1.jpg",
      "localFile": {
        "childImageSharp": {
          "fluid": {
            "src": "/static/3b21b4fd9c2a31b8831b804d279a059e/14b42/3230-blue-1.jpg"
          }
        }
      }
    },
    "appSync": {
      "listProducts": {
        "items": [
          {
            "globalIdColor": "3230-blue",
            "image1": "3230-blue-1.jpg",
            "imageFile": null
          }
        ]
      }
    }
  }
}

LpmRaven avatar Aug 18 '20 01:08 LpmRaven

The nodeModel queries the raw store, as opposed to graphql resolvers. In addition to that, Gatsby does work behind the scenes to link keys using the ___NODE foreign-key convention to their actual store node. So in createResolvers(), because it is querying the raw store and Gatsby hasn't done its magic, you will need to do the linking yourself.

Can you try this and see if it works?

        imageFile: {
            type: 'File',
            resolve: async (source, args, context, info) => {
                if (source.image1 && !source.image1.startsWith('http')) {
                    const node = await context.nodeModel.runQuery({
                        query: {
                            filter: {
                                Key: { eq: source.image1 }
                            }
                        },
                        type: 'S3Object',
                        firstOnly: true
                    });

                    console.log(node)

                    if (!node || !node.localFile___NODE) return null;

                    return await context.nodeModel.getNodeById({
                        id: node.localFile___NODE,
                        type: 'File',
                    });
                }
            }
        },

Js-Brecht avatar Aug 18 '20 02:08 Js-Brecht

It didn't seem to like the destructuring of context, but apart from that, IT WORKS! 🎉 Thanks for that @Js-Brecht !

LpmRaven avatar Aug 18 '20 02:08 LpmRaven

It didn't seem to like the destructuring of context

Oh yeah, probably because those are class members... herp 🤦‍♂️ 😆

EDIT: Fixed the code snippet, for anybody coming through later

Js-Brecht avatar Aug 18 '20 02:08 Js-Brecht

Thank you so much for jumping into this @Js-Brecht, and happy that it works for you @LpmRaven!

I'll leave this open for now until I find some time to write about this way of handling images somewhere. If you feel inspired, feel free to document it!

robinmetral avatar Aug 18 '20 10:08 robinmetral

Thanks for your interest in gatsby-source-s3. This plugin is moving into the Gatsby User Collective and this repo will be archived. Please open an issue in that repository, submit a PR if you'd like to see this implemented, or join us on Discord if you have questions!

moonmeister avatar Jan 31 '22 17:01 moonmeister