Benchmarking memory footprint of data structures.
There's a measurement I want to take that I haven't managed to achieve with Bechamel. The end goal is to measure the memory footprint of values using `Obj.reachable_words`.

Suppose that we want to compare the memory footprint of different implementations of semi-persistent lists. Let's say we have a module type `SP_LIST` and different implementations `module A : SP_LIST` and `module B : SP_LIST`.

Then we have a generic test signature with a `make_list` function that allocates a value, let's call it `v`, which is passed to the function `bench`, the basis of the test. I would like to compare the footprint of `v` before and after the call to `bench`.
```ocaml
module type TEST = functor (Impl : SP_LIST) -> sig
  type t

  val make_list : unit -> t
  val bench : int -> t -> unit
end
```
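For context, here is a minimal sketch of what the `SP_LIST` signature and one concrete test instance could look like; the operations and sizes below are purely illustrative, not the actual code being benchmarked:

```ocaml
(* Hypothetical sketch: a semi-persistent list signature and a test that
   matches the TEST functor type above. Names and sizes are illustrative. *)
module type SP_LIST = sig
  type 'a t

  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> 'a t
end

module Push_pop_test (Impl : SP_LIST) = struct
  type t = int Impl.t

  (* The value [v] from the description above: a list of 10_000 elements. *)
  let make_list () =
    let rec go acc i = if i = 0 then acc else go (Impl.push i acc) (i - 1) in
    go Impl.empty 10_000

  (* [bench n v] performs [n] push/pop round trips on [v]. *)
  let bench n l =
    for _ = 1 to n do
      ignore (Sys.opaque_identity (Impl.pop (Impl.push 0 l)))
    done
end
```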
My idea was to set up a test with resources for modules `A` and `B`, and then use the `alloc` and `free` functions to register the values of interest in a table that is read by the measurement module, which calls `Obj.reachable_words` on them.

However, this doesn't work. When the test is set up with multiple allocations, the `alloc` and `free` functions are called exactly once for each run of `n` executions of `bench`, so the measured values keep decreasing as `n` grows: the footprint of execution `n` doesn't add up with that of execution `n+1` of `bench`.
I would like to make a feature request, which I'm willing to implement, to add such a measurement feature, although I don't know which design would fit Bechamel's mindset properly.
The basic idea of bechamel is to execute a given function N times and collect metrics across those N executions. Let's take the time metric and a function `fn`. If we execute `fn` once, we record how long the call took (say 10ms). This assumes that we can read the metric before execution and after execution; all we need to do is a subtraction afterwards. We then assume that if we execute the function twice, we expect to get 20ms. The idea of the benchmark could be described in code this way:
```ocaml
let benchmark fn =
  let metrics = Array.make 3000 0.0 in
  let run = ref 1 in
  for idx = 0 to Array.length metrics - 1 do
    let _N = !run in
    let a = Unix.gettimeofday () in
    for _ = 1 to _N do
      fn ()
    done;
    let b = Unix.gettimeofday () in
    metrics.(idx) <- b -. a;
    run := !run + 1
  done
```
Bechamel then tries to get results that should plot a curve (between our N and our subtractions) corresponding to an affine function `f(N) = T * N + beta`, where T should be the time spent executing our function (in our example, 10ms).
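To make that last step concrete, here is a naive sketch of how the slope `T` could be recovered from the collected `(N, metric)` pairs with an ordinary least-squares fit; bechamel's actual analysis (`Analyze.ols`) is more robust, so this is only an illustration of the idea:

```ocaml
(* Naive ordinary least-squares estimate of T in  metric = T * n + beta,
   given the run counts and the measured differences collected above.
   Illustration only, not bechamel's actual analysis. *)
let slope (runs : float array) (metrics : float array) =
  let n = float_of_int (Array.length runs) in
  let sum = Array.fold_left ( +. ) 0. in
  let sx = sum runs and sy = sum metrics in
  let sxy = ref 0. and sxx = ref 0. in
  Array.iteri
    (fun i x ->
      sxy := !sxy +. (x *. metrics.(i));
      sxx := !sxx +. (x *. x))
    runs;
  ((n *. !sxy) -. (sx *. sy)) /. ((n *. !sxx) -. (sx *. sx))
```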
This is the fundamental idea behind bechamel. There is one thing to note:
The metric is information that necessarily comes from the "system" (time, number of words allocated by OCaml, number of cache misses, etc.). So the metric doesn't come from an OCaml value (as we might want with `Obj.reachable_words`) but from what the system can tell us before and after our function is executed.
`alloc` and `free` were added after the fact (#25) because the benchmarked function may need a resource (like a file descriptor). The aim of these resources is not to participate in the metrics; on the contrary, the aim is to ensure that they have as little influence as possible on the metrics (to obtain "fair" values). Going back to our example code, in the case of multiple resources, they are allocated before the metric values are taken and freed afterwards.
```diff
--- a.ml 2024-03-26 17:27:12.085161748 +0100
+++ b.ml 2024-03-26 17:27:47.475366040 +0100
@@ -3,11 +3,13 @@
   let run = ref 1 in
   for idx = 0 to Array.length metrics - 1 do
     let _N = !run in
+    let resources = Array.init _N (fun _ -> alloc ()) in
     let a = Unix.gettimeofday () in
     for _ = 1 to _N do
       fn ()
     done;
     let b = Unix.gettimeofday () in
+    Array.iter free resources;
     metrics.(idx) <- b -. a;
     run := !run + 1
   done
```
From what I understand, you want to mix and match the `alloc`/`free` API and the metrics. Your example is not as trivial as that. With bechamel, you can see whether a function allocates (with `minor_allocated` and/or `major_allocated`), but it's difficult to know how a value changes depending on how it's used (whether it grows or not). If your value is a global (like a system resource), however, and you'd like to know how it evolves after calling `fn`, there is a way, but I'd really like you to reconsider what seems relevant as a metric first :+1: .
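For reference, requesting these existing allocation metrics looks roughly like this (adapted from memory of bechamel's documented examples, so take the exact `Benchmark` configuration as an approximation; the benchmarked function is a stand-in):

```ocaml
open Bechamel
open Toolkit

(* A stand-in benchmark; only the instance list matters here:
   minor_allocated / major_allocated report how much the function
   allocates per call, not how a given value evolves over time. *)
let test =
  Test.make ~name:"make-list" (Staged.stage (fun () -> List.init 1_000 Fun.id))

let instances = Instance.[ minor_allocated; major_allocated; monotonic_clock ]
let raw_results = Benchmark.all (Benchmark.cfg ()) instances test
```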
@dinosaure thank you for your explanation of how Bechamel takes measurements. I do understand your point about Bechamel being oriented towards measuring system resources rather than the dynamics of values.

Indeed, it is likely that the `alloc`/`free` API is not the right one. However, could we imagine a completely new kind of test of type `unit -> 'a` that returns a value of interest on which we can take measurements, instead of `unit`? I understand that this may lead to very high memory consumption and possibly result in out-of-memory errors.
To illustrate my idea, here is what I have in mind, implemented as a module.
```ocaml
open Bechamel

(* Shared accumulator: [MakeWordsCase] adds the words reached by each call,
   and the [Words] measure reads and resets it. *)
type register = { mutable total : int; mutable clear : bool }

let register = { total = 0; clear = true }

module type TEST = sig
  type t

  val make_list : unit -> t
  val bench : int -> t -> unit
end

module type WORDS_TEST = functor (_ : TEST) -> sig
  include TEST
end

module MakeWordsCase : WORDS_TEST =
functor
  (Test : TEST)
  ->
  struct
    include Test

    (* Record the growth in reachable words caused by one call to [bench]. *)
    let bench n l =
      Sys.opaque_identity
        (let before = Obj.(repr l |> reachable_words) in
         bench n l;
         register.total <-
           register.total + (Obj.(repr l |> reachable_words) - before))
  end

module Words : S.MEASURE = struct
  type witness = unit

  let load () = ()
  let unload () = ()
  let make () = ()

  (* Alternate between a "before" read (0) and an "after" read (the
     accumulated total), so bechamel's subtraction yields the words added
     during the run. *)
  let get () =
    if register.clear then (
      register.clear <- false;
      0.)
    else (
      register.clear <- true;
      let m = register.total in
      register.total <- 0;
      m |> float_of_int)

  let label () = "reachable-words"
  let unit () = "words"
end

module Extension = struct
  type 'w t = 'w Measure.measure

  let words = Measure.register (module Words)
end

module Instance = struct
  let words = Measure.instance (module Words) Extension.words
end
```
This works as expected. The drawback of this approach is that I have to take the reachable-words measurements separately, as the `bench` function of `MakeWordsCase` has some overhead.
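For completeness, the wiring I use for this separate reachable-words run looks roughly as follows; `My_test` is a placeholder for a concrete `TEST` module, and the `Test`/`Benchmark` calls are written from memory, so they may need adjusting:

```ocaml
(* Hypothetical wiring: [My_test] stands for a base TEST module over some
   SP_LIST implementation; the point is only how [Instance.words] is used. *)
module Wrapped = MakeWordsCase (My_test)

let test =
  Test.make ~name:"sp-list"
    (Staged.stage (fun () ->
         let v = Wrapped.make_list () in
         Wrapped.bench 100 v))

(* Run with the custom measure only, separately from the time metrics,
   because of the overhead mentioned above. *)
let instances = [ Instance.words ]
let raw_results = Benchmark.all (Benchmark.cfg ()) instances test
```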