
Adding Batch leap frame and a sample batch tf transformer

Open sushrutikhar opened this issue 5 years ago • 4 comments

Currently, MLeap only has the default leap frame, which applies transformations to the dataset row by row. However, since TensorFlow supports predictions over a batch of requests and is internally optimised for that, we can leverage this in MLeap with a batch leap frame. This increases throughput and decreases latency compared to sequential processing. A BatchTransformer takes Seq[Row] as input and returns the transformed and enriched output as Seq[Row]. A sample BatchTensorflowTransformer is added in this PR.
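To illustrate the row-by-row vs batch contract, here is a minimal sketch using simplified stand-ins; `Row`, `RowTransformer`, `BatchTransformer` and the toy model are all hypothetical names for illustration, not MLeap's actual API:

```scala
object BatchSketch {
  // Simplified stand-in for an MLeap row.
  type Row = Seq[Double]

  // Row-by-row contract: one call per row, as in the default leap frame.
  trait RowTransformer {
    def transform(row: Row): Row
  }

  // Batch contract: the whole batch is handed to the underlying library
  // (e.g. a TensorFlow session) in a single call, so it can vectorise
  // internally instead of being invoked N times.
  trait BatchTransformer {
    def transformBatch(rows: Seq[Row]): Seq[Row]
  }

  // Toy linear model y = 2*x0 + 1; the prediction is appended to each row.
  class ToyLRBatchTransformer extends BatchTransformer {
    override def transformBatch(rows: Seq[Row]): Seq[Row] =
      rows.map(r => r :+ (2.0 * r.head + 1.0))
  }

  def main(args: Array[String]): Unit = {
    val t = new ToyLRBatchTransformer
    println(t.transformBatch(Seq(Seq(1.0), Seq(2.0))))
    // prints List(List(1.0, 3.0), List(2.0, 5.0))
  }
}
```

The latency win comes from collapsing N per-row calls into one call across the batch boundary, which is where TensorFlow's internal batching pays off.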

Here is a comparison of benchmarking numbers (using a Gatling client) between DefaultLeapFrame and BatchLeapFrame, for a simple LR model written in TensorFlow. The throughput gain is almost 2x.

TF-Mleap (without batching)

================================================================================
---- Global Information --------------------------------------------------------
> request count 300000 (OK=300000 KO=0 )
> min response time 0 (OK=0 KO=- )
> max response time 238 (OK=238 KO=- )
> mean response time 7 (OK=7 KO=- )
> std deviation 10 (OK=10 KO=- )
> response time 50th percentile 5 (OK=5 KO=- )
> response time 75th percentile 9 (OK=8 KO=- )
> response time 95th percentile 26 (OK=26 KO=- )
> response time 99th percentile 55 (OK=55 KO=- )
> mean requests/sec 3750 (OK=3750 KO=- )
---- Response Time Distribution ------------------------------------------------
> t < 5 ms 146849 ( 49%)
> 5 ms < t < 20 ms 132300 ( 44%)
> t > 20 ms 20851 ( 7%)
> failed 0 ( 0%)
================================================================================

TF-Mleap with Batching

================================================================================
---- Global Information --------------------------------------------------------
> request count                                     300000 (OK=300000 KO=0     )
> min response time                                      0 (OK=0      KO=-     )
> max response time                                     68 (OK=68     KO=-     )
> mean response time                                     3 (OK=3      KO=-     )
> std deviation                                          2 (OK=2      KO=-     )
> response time 50th percentile                          2 (OK=3      KO=-     )
> response time 75th percentile                          4 (OK=5      KO=-     )
> response time 95th percentile                          8 (OK=8      KO=-     )
> response time 99th percentile                         12 (OK=12     KO=-     )
> mean requests/sec                                7142.857 (OK=7142.857 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 5 ms                                          217808 ( 73%)
> 5 ms < t < 20 ms                                   81691 ( 27%)
> t > 20 ms                                            501 (  0%)
> failed                                                 0 (  0%)
================================================================================

sushrutikhar avatar Nov 19 '19 05:11 sushrutikhar

@hollinwilkins @ancasarb

sushrutikhar avatar Nov 20 '19 04:11 sushrutikhar

hey @ancasarb did you get a chance to have a look at the PR?

sushrutikhar avatar Jan 22 '20 09:01 sushrutikhar

This looks interesting. @sushrutikhar do you think our benchmark analysis in https://github.com/combust/mleap/issues/631 (between xgboost4j and mleap) could be related? The 2x factor might be a common pattern between the two?

lucagiovagnoli avatar Jan 23 '20 01:01 lucagiovagnoli

> This looks interesting. @sushrutikhar do you think our benchmark analysis in #631 (between xgboost4j and mleap) could be related? The 2x factor might be a common pattern between the two?

@lucagiovagnoli the gain we saw is mainly because the default leap frame does not utilise the underlying library's ability to do parallel processing. In your case, it looks like the xgboost library being used has performance issues of its own. However, as an exercise, we could try using the batch leap frame introduced in this PR for xgboost as well and see whether it gives a further performance gain on top of the changes you proposed in your PR.
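The batch interface is library-agnostic, which is why the same idea could apply to xgboost. A hypothetical sketch (`FakeBoosterBatch` and the other names are illustrative stand-ins, not xgboost4j's API) of routing a whole batch into one native call:

```scala
object BatchAdapterSketch {
  // Simplified stand-in for an MLeap row.
  type Row = Seq[Double]

  // Library-agnostic batch contract: one call scores the whole batch.
  trait BatchPredictor {
    def predictBatch(rows: Seq[Row]): Seq[Double]
  }

  // Stand-in for a library with native batch scoring (e.g. a booster
  // that accepts a whole matrix of rows in one native call).
  class FakeBoosterBatch extends BatchPredictor {
    override def predictBatch(rows: Seq[Row]): Seq[Double] =
      rows.map(_.sum) // pretend the batch is scored in one native call
  }

  // A batch leap frame would collect rows and issue a single
  // predictBatch call instead of N single-row calls.
  def score(predictor: BatchPredictor, rows: Seq[Row]): Seq[Double] =
    predictor.predictBatch(rows)

  def main(args: Array[String]): Unit = {
    println(score(new FakeBoosterBatch, Seq(Seq(1.0, 2.0), Seq(3.0))))
    // prints List(3.0, 3.0)
  }
}
```

Whether this helps xgboost as much as TensorFlow depends on how much of the per-row cost is JNI/call overhead versus the model evaluation itself.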

sushrutikhar avatar Feb 19 '20 10:02 sushrutikhar