baleen
Change post object in order to avoid duplicate fetch
The goal is to avoid fetching a post again when it is already in Mongo. However, the fact that a post is in Mongo does not tell us whether its HTML was ever fetched.
Hence, the post has three states:
- Not in Mongo.
- In Mongo, but the HTML was not fetched. In this case the content field of the Post object contains the "summary".
- In Mongo, with the HTML saved. In this case the content field of the Post object contains the HTML.
I therefore suggest adding a state field to the Post object, with the following values:
- CREATED / WRANGLED / FETCHED
Also add two methods:
is_in_mongo() and was_fetched()
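A minimal sketch of what the state field and the two methods could look like. The mapping of CREATED / WRANGLED / FETCHED onto the three cases above is my assumption, and the PostState enum and this stand-alone Post dataclass are hypothetical, not the existing baleen model.

```python
from dataclasses import dataclass
from enum import Enum


class PostState(Enum):
    """Hypothetical state values; the mapping to the three cases is assumed."""
    CREATED = "created"    # assumed: not yet saved to Mongo
    WRANGLED = "wrangled"  # assumed: saved, content holds the summary
    FETCHED = "fetched"    # assumed: saved, content holds the HTML


@dataclass
class Post:
    """Stand-in for the real Post document, reduced to the fields used here."""
    url: str
    content: str = ""
    state: PostState = PostState.CREATED

    def is_in_mongo(self) -> bool:
        # Both WRANGLED and FETCHED imply the post has been saved.
        return self.state in (PostState.WRANGLED, PostState.FETCHED)

    def was_fetched(self) -> bool:
        # Only FETCHED means the content field holds the full HTML.
        return self.state is PostState.FETCHED
```

Deriving both answers from a single field means the two methods cannot disagree with each other, which would be possible with two separate boolean flags.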
And the following behavior:
- ! is_in_mongo() -> wrangle the post and fetch the html
- is_in_mongo() && ! was_fetched() -> fetch the html and set the content to the html
- is_in_mongo() && was_fetched() -> get the post from mongo and return it.
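The three cases could be sketched roughly as follows. To keep the example self-contained, a plain dict stands in for the Mongo collection and the state is stored as a string. The names process_post, store, and fetch_html are hypothetical, not existing baleen functions.

```python
from typing import Callable, Dict


def process_post(
    url: str,
    summary: str,
    store: Dict[str, dict],            # stand-in for the Mongo collection
    fetch_html: Callable[[str], str],  # hypothetical HTML fetcher
) -> dict:
    """Return the stored post, fetching its HTML at most once."""
    doc = store.get(url)

    if doc is None:
        # Not in Mongo: wrangle the post (content holds the summary for now).
        doc = {"url": url, "content": summary, "state": "WRANGLED"}
        store[url] = doc

    if doc["state"] != "FETCHED":
        # In Mongo but never fetched: fetch the HTML and replace the summary.
        doc["content"] = fetch_html(url)
        doc["state"] = "FETCHED"

    # In Mongo and already fetched: return the stored post unchanged.
    return doc
```

A newly wrangled post falls through to the second check, so the "not in Mongo" case both wrangles and fetches, as described above.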