DNest4 High number of calls of copy constructor/assigment operator for ModelType

High number of calls of copy constructor/assigment operator for ModelType

Open qacwnfq opened this issue 3 years ago • 1 comments

Hey,

I was just benchmarking DNest4 for some of my more memory intensive/complex models and I noticed, that the copy constructor and assigment operators were called quite often. This leads to some low performance in my use cases, which is why I tried to identify the code, where it happens.

I think this is the place in SamplerImpl.h:

	LikelihoodType& logl = log_likelihoods[which];

	// Do the proposal for the particle
	ModelType proposal = particle;
	double log_H = proposal.perturb(rng);

	// Prevent unnecessary exponentiation of a large number
	if(log_H > 0.0)
		log_H = 0.0;

    if(rng.rand() <= exp(log_H))
    {
    	LikelihoodType logl_proposal(proposal.log_likelihood(), logl.get_tiebreaker());

    	// Do the proposal for the tiebreaker
    	log_H += logl_proposal.perturb(rng);

	    // Accept?
	    if(level.get_log_likelihood() < logl_proposal)
	    {
		    particle = proposal;
		    logl = logl_proposal;
		    level.increment_accepts(1);
	    }

There are two lines here which impact performance: The first is ModelType propsal=particle; and the second is particle = proposal;

I believe these copies can be replaced by something more efficient for complex models, because we actually only have to track the proposal and log(H) and then decide if the ModelType should set state to the value of proposal or if it should keep the state it already had. In this case we wouldn't need to copy the whole model every time we try to update a particle, we would only change the internal state, which can probably even be done by std::move or std::swap since we discard the rejected proposals in this snippet.

I can make a PR for this, but I'm unsure how to deal with the changes in the interface for ModelType.

One solution which would not break existing code (but uses some more complex template code) is to check at compile time if the ModelType has the required members to circumvent copying and otherwise to sample like it is sampling now. This is something we've used in our convex polytope sampling library, see for example https://github.com/modsim/hops/blob/main/include/hops/MarkovChain/MarkovChainAdapter.hpp.

Any thoughts on this?

Nov 14 '21 14:11 qacwnfq

I don't think you can do the Metropolis algorithm without making a copy, because you need to backup the original state in case the proposal is rejected. I think you're right that the assignment that happens upon acceptance could be avoided by some sort of swap. However I failed to get an implementation that avoids the assignment operator here (I thought I understood std::move and so on, but I guess I don't). Feel free to make a PR if you can get that part working.

On Mon, Nov 15, 2021 at 3:33 AM Johann Fredrik Jadebeck < @.***> wrote:

Hey,

I was just benchmarking DNest4 for some of my more memory intensive/compex models and I noticed, that the copy constructor and assigment operators were called quite often. This leads to some low performance in my use cases, which is why I tried to identify the code, where it happens.

I think this is the place in SamplerImpl.h:

LikelihoodType& logl = log_likelihoods[which];

// Do the proposal for the particle ModelType proposal = particle; double log_H = proposal.perturb(rng);

// Prevent unnecessary exponentiation of a large number if(log_H > 0.0) log_H = 0.0;
if(rng.rand() <= exp(log_H))
{
	LikelihoodType logl_proposal(proposal.log_likelihood(),
  										logl.get_tiebreaker());

	// Do the proposal for the tiebreaker
	log_H += logl_proposal.perturb(rng);

  // Accept?
  if(level.get_log_likelihood() < logl_proposal)
  {
      particle = proposal;
      logl = logl_proposal;
      level.increment_accepts(1);
  }
There are two lines here which impact performance: The first is ModelType propsal=particle; and the second is particle = proposal;

I believe these copies can be replaced by something more efficient for complex models, because we actually only have to track the proposal and log(H) and then decide if the ModelType should set state to the value of proposal or if it should keep the state it already had. In this case we wouldn't need to copy the whole model every time we try to update a particle, we would only change the internal state, which can probably even be done by std::move or std::swap since we discard the rejected proposals in this snippet.

I can make a PR for this, but I'm unsure how to deal with the changes in the interface for ModelType.

One solution which would not break existing code (but uses some more complex template code) is to check at compile time if the ModelType has the required members to circumvent copying and otherwise to sample like it is sampling now. This is something we've used in our convex polytope sampling library, see for example https://github.com/modsim/hops/blob/main/include/hops/MarkovChain/MarkovChainAdapter.hpp .

Any thoughts on this?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/eggplantbren/DNest4/issues/39, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAMBKOXRNMU55R7J2ZUG73DUL7CFJANCNFSM5H75NJJA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

-- Dr Brendon J. Brewer Phone: +64 27 5001336 https://www.brendonbrewer.com @.***:3 https://keybase.io/brendonbrewer

Nov 14 '21 19:11 eggplantbren

DNest4 DNest4 copied to clipboard

High number of calls of copy constructor/assigment operator for ModelType

DNest4
DNest4 copied to clipboard