[WIP] Dask Integration

Open lukasheinrich opened this issue 7 years ago • 6 comments

Description

Addresses #259. This was really a drop-in replacement (except for tensorlib.ones(), where one needed to add chunks).

simple logpdf evaluation works

import pyhf
import pyhf.tensor

import pyhf.tensor.dask_backend
db = pyhf.tensor.dask_backend.dask_backend()
pyhf.set_backend(db)

import pyhf.simplemodels
pdf = pyhf.simplemodels.hepdata_like(signal_data=[7.], bkg_data=[50.], bkg_uncerts=[7.])


testdata = pdf.expected_data(pdf.config.suggested_init())

v = pdf.logpdf(pdf.config.suggested_init(), testdata)

print(v.compute())
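The note about `chunks` and the final `.compute()` call can be illustrated with plain `dask.array`, independent of pyhf. This is a minimal sketch, not code from this PR: the array shape, the chunk size, and the variable names are assumptions chosen for illustration.

```python
import dask.array as da

# Explicit chunking: 4 elements split into two chunks of 2 elements each.
ones = da.ones((4,), chunks=2)

# Building the expression is lazy: this only records tasks in a graph.
lazy_total = (ones * 3.0).sum()

# compute() executes the task graph and returns a concrete value.
total = lazy_total.compute()
print(ones.chunks, total)
```

Because the result stays lazy until `compute()` is called, the recorded graph can be inspected first; Dask collections offer a `visualize()` method for this, which needs the optional graphviz dependency.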

Checklist Before Requesting Approver

  • [ ] Tests are passing
  • [ ] "WIP" removed from the title of the pull request

lukasheinrich avatar Sep 13 '18 18:09 lukasheinrich

just making @kratsg @matthewfeickert aware.. not for review yet.. nice thing is we get easy visualization of the computational graph

screenshot

lukasheinrich avatar Sep 13 '18 18:09 lukasheinrich

interestingly some of the basic tensorlib tests fail

>       assert np.std(values) < 1e-6
E       assert 0.2802208533607308 < 1e-06
E        +  where 0.2802208533607308 = <function std at 0x1155a41e0>([-16.948276294321396, -16.948274612426758, -16.94827651977539, -16.948274612426758, -17.648827643136507])
E        +    where <function std at 0x1155a41e0> = np.std

a) good thing we have tests :) b) probably good to add more tests for each of the tensorlib methods so that we know better which one is failing
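As a rough sketch of how a test could report which value disagrees, rather than only that the spread is too large, the snippet below reuses the five numbers from the assertion output above. The helper `find_outliers` and its tolerance are hypothetical and not part of the pyhf test suite; the thread also does not say which backend produced which entry.

```python
import statistics

# logpdf values quoted in the failing assertion, in the order reported.
values = [
    -16.948276294321396,
    -16.948274612426758,
    -16.94827651977539,
    -16.948274612426758,
    -17.648827643136507,
]


def find_outliers(values, tol=1e-5):
    """Return indices of values further than tol from the median (hypothetical helper)."""
    reference = statistics.median(values)
    return [i for i, v in enumerate(values) if abs(v - reference) > tol]


# Population standard deviation, the same convention as np.std's default.
spread = statistics.pstdev(values)
outliers = find_outliers(values)
print(spread, outliers)
```

The tolerance here is deliberately looser than the `1e-6` used in the test, since the four agreeing values already differ from each other at roughly that level.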

edit: the reason is that in the tests we compare poisson-from-normal values, not the real Poisson; I forgot to set the flag in the dask backend
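To see why this matters, here is a small standard-library sketch comparing an exact Poisson log-probability with a normal approximation that uses mean `lam` and standard deviation `sqrt(lam)`. The function names and the inputs `n=3`, `lam=5` are assumptions for illustration only; this does not reproduce the numbers in the failing test or the backend's actual flag handling.

```python
import math


def poisson_logpdf(n, lam):
    # Exact Poisson log-probability: n*log(lam) - lam - log(n!)
    return n * math.log(lam) - lam - math.lgamma(n + 1.0)


def normal_logpdf(x, mu, sigma):
    # Log-density of a normal distribution with mean mu and width sigma.
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def poisson_from_normal_logpdf(n, lam):
    # Normal approximation to the Poisson: mean lam, standard deviation sqrt(lam).
    return normal_logpdf(n, lam, math.sqrt(lam))


exact = poisson_logpdf(3.0, 5.0)
approx = poisson_from_normal_logpdf(3.0, 5.0)
print(exact, approx, abs(exact - approx))
```

The two values differ by far more than `1e-6`, so a backend evaluating one form while the others evaluate the other form would fail a cross-backend agreement test.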

lukasheinrich avatar Sep 13 '18 19:09 lukasheinrich

Coverage decreased (-0.2%) to 97.278% when pulling d8e2a663e776d15ce5058bd23f9165258b160fca on tensor/dask into b143ffd7c7d874c5144a3ca299aad325afc52d45 on master.

coveralls avatar Sep 13 '18 22:09 coveralls

@lukasheinrich Can you verify that the updated backend + optimizer table is correct?

matthewfeickert avatar Sep 14 '18 00:09 matthewfeickert

I'm going to rebase this against origin/master to bring in the work from PR #262

matthewfeickert avatar Sep 15 '18 15:09 matthewfeickert

From Issue #259:

we definitely need to add the new methods in #262. Apart from that we might want to understand Dask a bit better

@lukasheinrich abs, zeros, concatenate, and einsum are copied in now. As you point out, at the moment we are just using Dask as a NumPy clone for the most part. Are there explicit things that you wanted to try to explore for this PR or should we first do more research on Dask's capabilities and then come back to this?
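For reference, the four operations mentioned above have direct `dask.array` counterparts that mirror the NumPy calls. The snippet is a sketch using Dask directly, not the backend code in this PR; the input values and chunk sizes are made up for illustration.

```python
import dask.array as da
import numpy as np

# Wrap a small NumPy array as a single-chunk Dask array.
x = da.from_array(np.array([-1.0, 2.0, -3.0]), chunks=3)

abs_x = da.absolute(x)                   # elementwise absolute value
zeros = da.zeros((2,), chunks=2)         # lazily defined array of zeros
joined = da.concatenate([abs_x, zeros])  # 1-D concatenation, shape (5,)
dot = da.einsum("i,i->", abs_x, abs_x)   # sum of squares via einsum

print(joined.compute(), dot.compute())
```

Each call mirrors its NumPy counterpart and only adds chunking and deferred execution, which matches the observation that Dask is being used mostly as a NumPy clone here.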

matthewfeickert avatar Sep 15 '18 16:09 matthewfeickert