launchpad icon indicating copy to clipboard operation
launchpad copied to clipboard

Add boolean option to merge sanity pages into combined files (one per content type)

Open benjaminbojko opened this issue 3 years ago • 10 comments

benjaminbojko avatar Aug 04 '22 19:08 benjaminbojko

Here are my thoughts....

First, Contentful source combines ALL data into 1 file. My thinking is each contentType or customQuery should be its own file (or groups of files with pagination).

do we default to pagination? or to full files? Why are we worried about pagination to begin with? If we have less records than the pageSize, the convo is useless.

Current working process with sanity...

  • Foreach contentType or customQuery, create a ContentResult
  • Foreach page of content, add data file
  • When we are done with that contentType or customQuery, check if we are combining into 1 file.
    • If so, collate ContentResult into 1 data file

For this to work with other sources, I think the big assumption is we are just saving "records" or items in the ContentResult which I think is a big assumption, but one I'm fine with.

pingevt avatar Aug 05 '22 15:08 pingevt

Definitely agreed on the one content type per file making the most sense and that the default should just be whatever the raw API response is. I think the core idea of launchpad simply caching API responses makes the most sense to me, and then that can be augmented by a feature/boolean flag that merges pagination results.

FWIW, launchpad could currently be configured to have one file per content type with Contentful if each content type was treated as its own source, but it's still inconsistent with the other content sources.

In terms of consolidating pages: My main argument for this is that certain apps might not have a reliable method to check for the number of local json files. E.g. a locally running web app wouldn't be able to get a directory listing w/o explicit server support and/or incrementing page numbers until a request fails. I feel pretty strongly about having the option to merge for the sake of usability.

benjaminbojko avatar Aug 10 '22 20:08 benjaminbojko

Agreed with all - I was really trying to understand what the "default" should be but I don't think there is one. It really is an either/or type thing.

I think the pattern I have been playing with for Sanity is a good pattern combining it all... let me try to get something in here to explain soon.

pingevt avatar Aug 11 '22 12:08 pingevt

Sounds good. IMO the default should be paginated jsons per type, with the option to merge pages. Combining all types into one file is a nice-to-have, imo, but we should probably add it just for backwards compatibility.

benjaminbojko avatar Aug 11 '22 15:08 benjaminbojko

Video in slack: https://bluecadet.slack.com/archives/C03RY7MRA4V/p1660233005600819

pingevt avatar Aug 11 '22 15:08 pingevt

  • [x] return array or single content result
  • [x] remove ContentResult::combine();
  • [x] remove "-full"
  • [ ] Make sure tickets to update other sources
  • [x] remove "customQueries"
  • [ ] move npm packages to correct package (I didn't understand and added to the main package)
    • [ ] Looks like Clay may have taken care of this...

pingevt avatar Aug 17 '22 19:08 pingevt

@benjaminbojko Re: return array or single content result

MediaDownloader makes an assumption that it is only being run once per content source with its options.clearOldFilesOnStart and options.clearOldFilesOnSuccess.

Either we should go back to only returning 1 result object per source OR pull the functionality for these options out of the media downloader class

My gut is saying, lets go back to combining everything into 1 Content Result for each source. Its keeping the rest of the pipeline unchanged.

pingevt avatar Aug 23 '22 21:08 pingevt

Ah yeah, the success flag is pretty important. I agreed: If it's not a huge hassle let's go the path of least resistance and return just one result.

There's issue #34 , which might be a good opportunity to revisit this.

benjaminbojko avatar Aug 23 '22 22:08 benjaminbojko

very cool! I kind of reverted what I originally did but made the combine() method in ContentResult static. And renamed it to combineContentResults(). That will just take the array of ContentResult objects.

pingevt avatar Aug 24 '22 12:08 pingevt

Sounds good. I'd love for that method name to be shorter though, ha. Could we mirror the JS API here more? E.g. Array has Array.from() and Array.of(). We could use those names or maybe just say ContentResult.combine(). The ContentResults part feels redundant otherwise (ContentResult.combineContentResults()).

Not sure if you can tell, but I get really hung up on naming, haha. Apologies for that.

benjaminbojko avatar Aug 25 '22 17:08 benjaminbojko