openfoodfacts-python

Implement FacetContainer Class

Open Ansh-Sarkar opened this issue 3 years ago • 2 comments

With reference to issue #69 and the changes to how data is retrieved from facets proposed in https://github.com/openfoodfacts/openfoodfacts-python/issues/69#issuecomment-1135774535, a brief summary of the expected changes is as follows:

  • [ ] Implementation of Pagination via FacetContainer Class with respect to the data retrieved from facets.
  • [ ] Implementation of various methods inside FacetContainer Class so as to provide easy and efficient means of accessing, manipulating and moving data.
  • [ ] Implementation of a Python-based filesystem cache module to locally store, and speed up retrieval of, previously fetched facet data.
  • [ ] Extend FacetContainer Class to include support for the above-mentioned cache module.
  • [ ] Implement appropriate and thorough tests for each of the above-mentioned features.
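To make the cache item in the checklist more concrete, here is a minimal sketch of what a filesystem cache could look like. Everything in it is an assumption for illustration only: the names `FileSystemCache` and `cached_fetch`, the JSON-on-disk layout, and the time-to-live parameter are hypothetical and are not part of openfoodfacts-python.

```python
import hashlib
import json
import time
from pathlib import Path


class FileSystemCache:
    """Hypothetical JSON-on-disk cache; one file per key."""

    def __init__(self, directory, ttl_seconds=3600):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.ttl_seconds = ttl_seconds

    def _path(self, key):
        # Hash the key so any string (e.g. a URL) maps to a safe file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def get(self, key):
        """Return the cached value, or None if missing or expired."""
        path = self._path(key)
        if not path.exists():
            return None
        entry = json.loads(path.read_text(encoding="utf-8"))
        if time.time() - entry["stored_at"] > self.ttl_seconds:
            path.unlink()  # drop the stale entry
            return None
        return entry["value"]

    def set(self, key, value):
        """Store a JSON-serialisable value together with a timestamp."""
        entry = {"stored_at": time.time(), "value": value}
        self._path(key).write_text(json.dumps(entry), encoding="utf-8")


def cached_fetch(cache, key, fetch):
    """Return the cached value for key, calling fetch() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = fetch()
        cache.set(key, value)
    return value
```

A FacetContainer could then wrap its page requests in something like `cached_fetch(cache, "brands:page=1", fetch)`, where the key format is again only an assumption.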

Quoting Proposed Changes

Although we could simply raise the limit on the number of records returned, doing so would almost certainly hurt performance and increase waiting times. A suggested way to solve this issue would be to add a set of new functions which handle pagination. We could have 2 different types of functions: get_all_<facet_name>() and get_page_<facet_name>(). The get_all_<facet_name>() function would internally call the get_page_<facet_name>() function repeatedly until all the pages have been fetched, one by one. Since this data can be large, we can create a FacetContainer which stores the entire fetched data while also providing easy and efficient access to functions which help in manipulating and moving the data around.
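A rough sketch of the get_all / get_page idea follows. It is not tied to the real API: `fetch_page` stands in for a hypothetical get_page_<facet_name>() function, its `(page_number, page_size)` signature is assumed, and the loop assumes that a page shorter than `page_size` marks the end of the data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List

Record = Dict[str, object]
# Assumed signature: (page_number, page_size) -> records on that page.
PageFetcher = Callable[[int, int], List[Record]]


@dataclass
class FacetContainer:
    """Hypothetical container that accumulates fetched facet records."""

    records: List[Record] = field(default_factory=list)

    def extend(self, page: List[Record]) -> None:
        self.records.extend(page)

    def __len__(self) -> int:
        return len(self.records)

    def __iter__(self) -> Iterator[Record]:
        return iter(self.records)


def get_all(fetch_page: PageFetcher, page_size: int = 100) -> FacetContainer:
    """Call fetch_page repeatedly until a short (or empty) page is returned."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    container = FacetContainer()
    page_number = 1
    while True:
        page = fetch_page(page_number, page_size)
        container.extend(page)
        if len(page) < page_size:  # assumed end-of-data signal
            break
        page_number += 1
    return container


def fake_fetch_page(page_number: int, page_size: int) -> List[Record]:
    """Stand-in for a real page request: slices 250 in-memory records."""
    data = [{"id": i} for i in range(250)]
    start = (page_number - 1) * page_size
    return data[start:start + page_size]
```

With the stand-in fetcher, `get_all(fake_fetch_page, page_size=100)` makes three calls (100, 100 and 50 records) and returns a container holding all 250 records.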

Combined, these 2 suggestions, if implemented, should be able to solve the following issues:

  • [ ] https://github.com/openfoodfacts/openfoodfacts-python/issues/69 : By dividing the entire available data into pages and also providing control over the number of records which should be returned per page.
  • [ ] https://github.com/openfoodfacts/openfoodfacts-python/issues/56 : The second part of this feature implementation involves the use of the FacetContainer class to implement functions to aid in Data Manipulation and movement. This class can be used to add more precise filters to the data stored inside it thereby acting as a powerful tool for working with records.
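As an illustration of the filtering idea, the sketch below shows one way a FacetContainer could expose filters over the records it holds. The method names `filter` and `where`, and the sample record fields `name` and `products`, are assumptions made for this example and are not taken from the library.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


class FacetContainer:
    """Hypothetical container offering simple filters over facet records."""

    def __init__(self, records: Iterable[Record] = ()):
        self._records: List[Record] = list(records)

    def filter(self, predicate: Callable[[Record], bool]) -> "FacetContainer":
        """Return a new container with the records that satisfy predicate."""
        return FacetContainer(r for r in self._records if predicate(r))

    def where(self, **expected: object) -> "FacetContainer":
        """Return a new container with records whose fields equal expected."""
        return self.filter(
            lambda r: all(r.get(k) == v for k, v in expected.items())
        )

    def to_list(self) -> List[Record]:
        """Return a copy of the stored records."""
        return list(self._records)

    def __len__(self) -> int:
        return len(self._records)


# Sample data with made-up field names and values.
sample = FacetContainer(
    [
        {"name": "brand-a", "products": 120},
        {"name": "brand-b", "products": 8},
        {"name": "brand-c", "products": 45},
    ]
)
popular = sample.filter(lambda r: r["products"] >= 40)
```

Because each filter returns a new container, calls can be chained, for example `sample.filter(...).where(name="brand-a")`, without modifying the original data.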

Originally posted by @Ansh-Sarkar in https://github.com/openfoodfacts/openfoodfacts-python/issues/69#issuecomment-1135774535

Ansh-Sarkar avatar May 26 '22 00:05 Ansh-Sarkar

I have started working on the implementation of the FacetContainer Class. I will open the first PR once it's completed.

Ansh-Sarkar avatar May 26 '22 00:05 Ansh-Sarkar

@Ansh-Sarkar 100% in favor of this.

alexgarel avatar Sep 09 '22 12:09 alexgarel