Adam Derewecki

15 issues by Adam Derewecki

As we're separating algorithmic changes from performance changes, there should be a standardized way to compare two versions purely on performance -- rate,...
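A minimal sketch of what such a measurement could look like, assuming each version can be wrapped in a Python callable that processes one work item. The names `measure`, `work_items`, and `repeats` are hypothetical and do not come from the project; the issue text is cut off, so only wall-clock time and rate are shown.

```python
import time


def measure(fn, work_items, repeats=3):
    """Time fn over all work_items and report the best of several runs.

    Hypothetical helper: fn stands in for one version of the engine.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for item in work_items:
            fn(item)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Rate is items processed per second in the fastest run.
    rate = len(work_items) / best if best > 0 else float("inf")
    return {"seconds": best, "items_per_second": rate}
```

Running `measure` on both versions with the same `work_items` gives two comparable records; taking the best of several runs is one common way to reduce noise from other processes on the machine.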

As we make algorithmic changes, we should make sure that we're not degrading the quality of the recommendations.

The current scoring mechanism has a percentage threshold that must be exceeded by (intersections / set_a_members_length) for the result to be scored. This is really not fair to all sets...
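As an illustration of the ratio described above, here is a small sketch, assuming the sets are plain Python sets and the threshold is a fraction between 0 and 1. The function name `passes_threshold` and the value `0.05` are assumptions for the example, not the engine's actual code or configuration.

```python
def passes_threshold(set_a, set_b, threshold=0.05):
    """Return (passes, ratio) where ratio = |A & B| / |A|.

    Hypothetical helper mirroring intersections / set_a_members_length.
    """
    if not set_a:
        return False, 0.0
    intersections = len(set_a & set_b)
    ratio = intersections / len(set_a)
    # The result is scored only if the ratio exceeds the threshold.
    return ratio > threshold, ratio


# One shared member out of 10 clears a 5% threshold...
small_ok, small_ratio = passes_threshold(set(range(10)), {0})
# ...while 400 shared members out of 10,000 do not.
large_ok, large_ratio = passes_threshold(set(range(10000)), set(range(400)))
```

Because the denominator is the size of set A alone, the same threshold treats small and large sets very differently.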

We can scale pretty easily on any work node (like EC2), so if we can prepare work units and distribute them across machines, we ought to be able to crank...

The test data is great, but makes it hard to test performance at the scale Suggestomatic was intended for. Internally at Causes we have a test set of about 900m...

There are a few algorithms that are supposed to be faster than O(m+n) time for a set intersection. Most of these are in academic papers that take 10 pages to describe...
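For reference, a sketch of the O(m+n) baseline next to one well-known alternative, assuming both sets are stored as sorted lists without duplicates. The second function is a simple binary-search variant that runs in O(m log n) and so only wins when one list is much smaller than the other; it is not necessarily one of the algorithms the issue has in mind, and both function names are hypothetical.

```python
from bisect import bisect_left


def intersect_merge(a, b):
    """Linear merge of two sorted, duplicate-free lists: O(m + n)."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_binary_search(small, large):
    """Binary-search each element of the smaller list in the larger one."""
    out = []
    lo = 0  # everything before lo in `large` is already ruled out
    for x in small:
        lo = bisect_left(large, x, lo)
        if lo == len(large):
            break
        if large[lo] == x:
            out.append(x)
            lo += 1
    return out
```

Both functions return the same result on the same inputs, which makes the merge version a convenient correctness oracle when trying out a faster algorithm.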

Related to ccf4ff3bd207c6f14486578dd95a2a8392f9af3d, there's no sanity check for some of the more basic things the engine is doing. There should be a full test suite instead of just a smoke...

61e59f891f17485c528c19b6736e4d24b8c5aa53 pointed to how fragile the data preparation step really is -- a unittest suite that performs sanity checks on the data is going to be crucial to the continual...