
Writing equivalent program


Hi, I am attempting to translate some programs from NumPyro to Bean Machine, and I am not sure how to translate the two programs below. They are written in NumPyro.

The first:

import numpy as np
import jax
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC

data = dict()
data['T'] = 40
data['y'] = np.array([-1.0907,-2.4951,-2.0792,-3.3462,-3.2191,-3.8484,-4.1459,-2.1709,-2.7299,-2.3542,-3.0580,-2.8536,-2.9261,-2.6771,-2.9773,-4.7511,-3.2940,-4.1163,-4.0809,-3.3474,-3.5929,-1.7049,-3.5229,-4.1394,-4.0499,-3.7171,-2.6830,-3.2192,-1.7822,-1.7139,-3.1528,-3.2609,-2.9224,-2.4019,-3.1697,-4.0054,-2.3263,-1.7483,-1.0572,-3.4454])
data['x'] = np.array([0.0531,-0.4272,0.5755,-1.0550,-0.0014,0.3624,-0.9064,1.3946,-1.4005,-0.8725,-0.4256,-0.1929,0.6113,0.2235,-0.3359,-0.7862,1.0467,-1.3558,-0.4803,-0.4828,-0.4481,1.4168,-1.1272,-1.5252,-0.2323,-0.4525,-0.2356,-0.7895,1.6194,0.7358,-1.0254,0.9361,1.1059,0.8332,0.1137,-1.1564,1.8551,1.3096,0.8222,-1.4380])

def model():
    alpha = numpyro.sample("alpha", dist.Uniform(-5, 5))
    beta = numpyro.sample("beta", dist.Uniform(-5, 5))
    lag = numpyro.sample("lag", dist.Uniform(0, 1))
    for t in range(1, data['T']):
        with numpyro.plate("size", np.size(data['x'])):
            numpyro.sample("obs_{}".format(t), dist.Normal(alpha + beta * data['x'][t] + lag * data['y'][t - 1], 0.5), obs=data['y'][t])


mcmc = MCMC(
    numpyro.infer.NUTS(model, step_size=1.0, adapt_step_size=True, adapt_mass_matrix=True,
                       target_accept_prob=0.8, max_tree_depth=10, find_heuristic_step_size=False),
    num_samples=1000, num_warmup=1000, num_chains=1,
)
mcmc.run(jax.random.PRNGKey(1445666613))
params = mcmc.get_samples(group_by_chain=True)
mcmc.print_summary()

The second:

import random
import numpy as np
import jax
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC

random.seed(1970170530)
np.random.seed(1444983722)

data = dict()
data['Nobs'] = 10
data['y'] = np.array([9, 8, 6, 2, 2, 3, 5, 1, 4, 7])
data['SubjIdx'] = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
data['Nsubj'] = 10

def model():
    omega = numpyro.sample("omega", dist.Gamma(2.0, 3.0))
    kappa = numpyro.sample("kappa", dist.Beta(7.0, 3.0))
    theta = numpyro.sample("theta", dist.Beta(1.0 * np.ones([data['Nsubj']]), 1.0 * np.ones([data['Nsubj']])))
    for obs in range(1, data['Nobs']):
        with numpyro.plate("size", np.size(data['Nsubj'])):
            numpyro.sample("obs_{}".format(obs), dist.Binomial(10, theta[data['SubjIdx'][obs]]), obs=data['y'][obs])
mcmc = MCMC(
    numpyro.infer.NUTS(model, step_size=1.0, adapt_step_size=True, adapt_mass_matrix=True,
                       target_accept_prob=0.8, max_tree_depth=10, find_heuristic_step_size=False),
    num_samples=1000, num_warmup=1000, num_chains=1,
)
mcmc.run(jax.random.PRNGKey(115211254))
params = mcmc.get_samples(group_by_chain=True)
mcmc.print_summary()

I am mainly unsure how to translate the loops around the observe statements in the above programs into Bean Machine. Any suggestions would be appreciated!

adamisetty avatar Apr 06 '22 04:04 adamisetty

Hi @adamisetty, a general formula for converting numpyro's imperative syntax to Bean Machine's declarative one would be roughly as follows:

  • convert numpyro.sample functions into bm.random_variable functions
  • convert the observe statements into an observation dict

I don't quite follow the examples, though: in the second one, omega and kappa are not connected to the rest of the model/observations. The plate syntax denotes conditional independence and enables vectorization, which we can ignore in this case. As a nit, you can also vectorize your numpyro code above (remove the for loop and pass in the entire tensor at once). So something like the following (untested):

import beanmachine.ppl as bm
import torch
import torch.distributions as dist

@bm.random_variable
def omega():
    return dist.Gamma(2., 3.)

@bm.random_variable
def kappa():
    return dist.Beta(7., 3.)

@bm.random_variable
def theta():
    ones = torch.ones(data['Nsubj'])
    return dist.Beta(ones, ones)

@bm.random_variable
def obs():
    # we will manually vectorize the observations;
    # SubjIdx is 1-based, so shift to 0-based before indexing theta
    idx = torch.tensor(data['SubjIdx']) - 1
    return dist.Binomial(10, theta()[idx])

observations = {obs(): torch.tensor(data['y']).float()}  # pass in the whole data at once
queries = [omega(), kappa(), theta()]
samples = bm.GlobalNoUTurnSampler().infer(
    queries, observations, num_samples=1000, num_chains=1, num_adaptive_samples=1000
)
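On the vectorization nit: the per-observation loop only gathers theta by subject index one element at a time, and the identical gather works on the whole index array at once. Note that SubjIdx in the data is 1-based, so indexing a length-Nsubj tensor needs a -1 shift (theta[SubjIdx[obs]] would go out of bounds at subject 10). A minimal numpy check of that equivalence, using a stand-in theta since the real one is sampled:

```python
import numpy as np

Nsubj = 10
SubjIdx = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # 1-based subject ids from the question
theta = np.linspace(0.1, 0.9, Nsubj)                 # stand-in for the sampled theta vector

# what the per-observation loop gathers, one element at a time (0-based shift applied)
looped = np.array([theta[SubjIdx[i] - 1] for i in range(len(SubjIdx))])
# the vectorized form: index with the whole array at once
vectorized = theta[SubjIdx - 1]

assert np.array_equal(looped, vectorized)
```

The same shift applies whether you vectorize in numpyro or in Bean Machine.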

The first example is a time series problem; you can take a look at the HMM tutorial to see how to write an autoregressive model.

jpchen avatar Apr 08 '22 06:04 jpchen