median

erlang experiment in distributed median finding

finding the median of a trillion numbers: http://matpalm.com/median

see the above page for a walkthrough; basic instructions follow


generate a list of (approx) 1000 numbers whose min is 1, max is 100 and median is 40, then spread it across 4 files: numbers.0, numbers.1, numbers.2, numbers.3

bash> ./generate_test_data.rb 1 40 100 1000 > numbers.all
bash> cat numbers.all | ./spread_across_files.rb numbers 4
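
for intuition, here is one way a list with a fixed min, median and max could be built. this is a minimal sketch and not the actual generate_test_data.rb; the function name `generate_numbers` and the uniform sampling on either side of the median are assumptions made for illustration. it always emits an odd number of values, which is one reason a count might only be approximate.

```ruby
# hypothetical sketch, not the repo's generate_test_data.rb
def generate_numbers(min, median, max, count, rng = Random.new)
  raise ArgumentError, "count must be at least 3" if count < 3

  # reserve three slots for min, median and max themselves, then put the
  # same number of values on each side of the median
  half  = (count - 3) / 2
  lower = Array.new(half) { rng.rand(min..median) }
  upper = Array.new(half) { rng.rand(median..max) }

  ([min, median, max] + lower + upper).shuffle(random: rng)
end

numbers = generate_numbers(1, 40, 100, 1000, Random.new(42))
sorted  = numbers.sort
puts "count=#{numbers.size} min=#{sorted.first} " \
     "median=#{sorted[sorted.size / 2]} max=#{sorted.last}"
# prints: count=999 min=1 median=40 max=100
```

because equally many values sit on each side of an explicit 40, the middle element of the sorted list is 40 whatever the random draws are.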


run ruby version

bash> ./median.rb < numbers.all
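
as a baseline, the simplest way to find a median is to sort and take the middle element. the sketch below shows that idea; it is not the contents of median.rb, and averaging the two middle values for an even count is an assumed convention.

```ruby
# hypothetical sketch, not the repo's median.rb
def median(numbers)
  raise ArgumentError, "empty input" if numbers.empty?

  sorted = numbers.sort
  mid    = sorted.size / 2
  if sorted.size.odd?
    sorted[mid]
  else
    # assumed convention: average the two middle values
    (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

puts median([5, 1, 4, 2, 3])  # prints 3
puts median([1, 2, 3, 4])     # prints 2.5
```

to feed it whitespace separated numbers from standard input, something like `median($stdin.read.split.map(&:to_i))` would do. sorting needs the whole list in memory on one machine, which is what the distributed versions try to avoid.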


run erlang single process version

bash> erl -noshell -run median from_file numbers.all


run erlang multiple process version (all processes on local machine) using the full list implementation

bash> erl -noshell -sname bob -run controller init worker numbers.[0-3]
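
the walkthrough page describes what controller and worker really do. as a rough illustration of how a median can be found when no single process holds every number, here is a generic distributed selection sketch in ruby (a quickselect variant), with plain arrays standing in for workers. the names `distributed_select` and `distributed_median` are hypothetical, and this is not claimed to be the repo's algorithm.

```ruby
# hypothetical sketch: each inner array plays the role of one worker's file
# (numbers.0 .. numbers.3); only counts and a pivot cross between "processes"
def distributed_select(partitions, rank)
  total = partitions.sum(&:size)
  raise ArgumentError, "rank out of range" unless (0...total).cover?(rank)

  parts = partitions
  loop do
    # the "controller" picks a pivot from any worker that still has data
    pivot = parts.reject(&:empty?).sample.sample

    # each "worker" reports how many of its values are below / equal to it
    less  = parts.sum { |part| part.count { |x| x < pivot } }
    equal = parts.sum { |part| part.count { |x| x == pivot } }

    if rank < less
      parts = parts.map { |part| part.select { |x| x < pivot } }
    elsif rank < less + equal
      return pivot
    else
      # discard everything up to and including the pivot, adjust the rank
      rank -= less + equal
      parts = parts.map { |part| part.select { |x| x > pivot } }
    end
  end
end

# for an even total this returns the upper of the two middle values
def distributed_median(partitions)
  distributed_select(partitions, partitions.sum(&:size) / 2)
end

puts distributed_median([[7, 1, 9], [4, 4], [10, 2, 8], [3]])  # prints 4
```

every round removes at least the pivot, so the loop always terminates, and each worker only ever scans its own partition.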


run erlang multiple process version (all processes on local machine) using the list frequency implementation

bash> erl -noshell -run generate_binary_dicts main numbers.all numbers.dict 4
bash> erl -noshell -sname bob -run controller init worker_freq numbers.dict.[0-3]
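
judging only by the names generate_binary_dicts and worker_freq, this variant appears to work on value-to-count dictionaries rather than raw lists; that reading is an assumption. the ruby sketch below shows why such a representation helps: per-file frequency tables can be merged and the median read off the cumulative counts without keeping every number around. all function names here are hypothetical.

```ruby
# hypothetical sketch, not the repo's erlang code

# count how often each value occurs in one chunk of numbers
def frequency_table(numbers)
  numbers.each_with_object(Hash.new(0)) { |n, table| table[n] += 1 }
end

# combine per-chunk tables by adding up the counts for each value
def merge_tables(tables)
  tables.each_with_object(Hash.new(0)) do |table, merged|
    table.each { |value, count| merged[value] += count }
  end
end

# walk the values in order until the cumulative count passes the middle rank;
# for an even total this returns the upper of the two middle values
def median_from_frequencies(freq)
  total = freq.values.sum
  raise ArgumentError, "empty input" if total.zero?

  target = total / 2
  seen   = 0
  freq.keys.sort.each do |value|
    seen += freq[value]
    return value if seen > target
  end
end

chunks = [[1, 40, 40], [40, 100], [2, 99]]
merged = merge_tables(chunks.map { |chunk| frequency_table(chunk) })
puts median_from_frequencies(merged)  # prints 40
```

when the data has few distinct values (1 to 100 in the test data above), each table stays tiny however many numbers it summarises.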