the-algorithm icon indicating copy to clipboard operation
the-algorithm copied to clipboard

Enhancement: Efficiency Improvement for GFS Scala Intersection Calculator

Open ethanknights opened this issue 1 year ago • 3 comments

Currently, the IntersectionValueCalculator.scala utility performs a redundant comparison to assign the largerArray/smallerArray variables (which are used for binary-search to identify their intersection).

This PR suggests to assign both variables simultaneously, to marginally improve computational efficiency.

PS. The associated missing comment is also removed - happy to replace that etc.

ethanknights avatar Apr 06 '23 16:04 ethanknights

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Apr 06 '23 16:04 CLAassistant

LGTM, legit improvement

SimpleCookie avatar Apr 07 '23 14:04 SimpleCookie

As the author notes, the performance improvement here seems marginal. This is suggested by the fact that this is a simple calculation used to choose an algorithm, which is where the bulk of the computation will presumably happen.

The real danger here is that largerArray and smallerArray get passed to the methods in the wrong order. It looks correct to me, but if it were wrong, there would be no compile time error, whether from strong typing or (based on my cursory search) unit tests.

With that in mind, I think a better approach would be

  1. break the comparison out into a separate method (e.g., smallerArrayFirst(x: ByteBuffer, y: ByteBuffer): (ByteBuffer, ByteBuffer))
  2. call that method from within each of the algorithmic methods
  3. write a unit test to validate that smallerArrayFirst performs as intended
  4. if possible, write tests for each algorithm method to verify they are operating on the small and large arrays as intended

einnocent avatar Apr 07 '23 22:04 einnocent