PhysiCell icon indicating copy to clipboard operation
PhysiCell copied to clipboard

Sporadic error when rules sharing cell type and behavior

Open drbergman opened this issue 1 year ago • 13 comments

Sometimes, when two rules have the same target cell type and the same target behavior, a segmentation fault occurs when building the dictionary of rules for that cell type. This behavior has been observed for rules version 2, but may also occur when using other versions. These remain untested.

The error does not always occur when two such rules exist. In fact, in the two instances this error has manifested in the community, both were resolved with the same approach: move the offending rule up in the rules list. That is, when the segmentation fault occurs, it is immediately after the problematic rule is displayed in the console and so the user can identify which it is. Simply move this rule up the list (the top of the rules list will do) and the problem appears to go away.

Please post here if you have also ran into this issue. If the solution above does not work for you, please post here and reach out; we will look into it asap.

drbergman avatar Dec 30 '23 18:12 drbergman

Have spent a lot of time looking into this. Bottom line, we need to get some of Paul's time to discuss the current implementation of rules. I have some thoughts.

One minimal test case that demonstrates this issue is:

  • create just one cell type, ctypeA
  • provide the following rules (not intended to be biologically meaningful)

ctypeA,volume,decreases,migration speed,100.,200.,4,0 ctypeA,pressure,increases,cycle entry,1,5,4,0 ctypeA,pressure,increases,migration speed,1,5,4,0

The key point is there are two rules with the same behavior, migration speed, with different signals. But another rule separates them.

rheiland avatar Jan 05 '24 13:01 rheiland

See PR #207 for a potential fix.

drbergman avatar Jan 08 '24 05:01 drbergman

One bug that caused this problem was assuming the address to a std::vector's element was fixed. However, a vector dynamically allocates memory - it's not fixed. You can write a simple test program to see this. Therefore, this line:

             Hypothesis_Rule* pHR = &(rules.back());

is error-prone. In 1.13.1, it's at https://github.com/MathCancer/PhysiCell/blob/master/core/PhysiCell_rules.cpp#L781C3-L781C42

There are a few different solutions, one of which is Daniel's (above), i.e., using new to allocate memory, then changing all references to pointers. Another might be to use the reserve feature for vectors. Another would be to use std::list (but then for all related params of rules). A discussion of the design and future plans for how rules may be used would be good, e.g., do we foresee the use of dynamic (run-time) rules or will they always be defined one-time at startup?

rheiland avatar Jan 12 '24 13:01 rheiland

Would there be a performance hit for allowing them to change throughout a simulation? Obviously changing them will take some time, but will even just allowing them to change cause a performance hit?

drbergman avatar Jan 12 '24 13:01 drbergman

We won't really know until we do some profiling, hopefully on meaningful models. My gut feeling is it would be a relatively small performance hit.

rheiland avatar Jan 12 '24 14:01 rheiland

A "fix" which is minimal in the amount of code that needs changed is to reserve the size of rules, i.e., fix the max # of rules that are possible. Needs more testing. Also doesn't address concerns above about future possible plans for rules. In this constructor, make the following, single edit:

Hypothesis_Ruleset::Hypothesis_Ruleset()
{
	cell_type = "none"; 
	pCell_Definition = NULL; 

	// rules.resize(0);         // comment out the resize
	rules.reserve(1000);   // prevent dynamic allocation to fix memory address bug (max # rules=1000?)
	rules_map.clear(); 

	return; 
}

rheiland avatar Jan 13 '24 18:01 rheiland

It is satisfying to have such a minimal change address this issue. However, an artificial cap is...artificial. If the already-written pointer implementation gets around that while using exactly the minimal necessary memory (even if dynamically allocated when appending new rules), I would lean towards that. Backwards compatibility for those who have written custom code using rules strikes me as quite niche, especially considering the member functions that allow access to the rules via rules_map obviating the need to use rules directly. If I'm missing other pros to this fix or cons to my own, please point out 😄

drbergman avatar Jan 29 '24 20:01 drbergman

Totally agree with your summary. I do think that your proper fix will probably need even more testing than you've already done, maybe. I only suggested this 1-line edit workaround as an option for someone (I had a specific user in mind) who was willing to make a minimal change to C++ in order to get past the original error.

rheiland avatar Feb 08 '24 13:02 rheiland

Hi, all!

So I’ve looked a bit at the PR. From what I can see, Daniel’s fix works by--instead of declaring a new HR "in place", which can go out of scope and collide with future HRs, it declares a new HR in global memory (Hypothesis_Rule* pHR = new Hypothesis_Rule), so it doesn't go out of scope after exiting the function.

This prevents the "multiple HRs could accidentally occupy same memory space" issue because they uniquely persist, but it doesn't fully connect them back to the containing vector of rules until his later fix.

It's elegant! I think​ it works! But like Randy, I am a little concerned about needing to track it all the way through and be very careful on testing. I think this is a solution to grow towards.

Randy's solution has simplicity on its side, with a high probability of not breaking anything, at a cost of memory. I think we ought to do a bit of performance testing. However, since rulesets are stored in cell types and not individual cells (for now!), the memory cost ought to be pretty reasonable.

The other issue is that we now have a built-in limit that could trip the user up. Given that we can look at performance cost, we could see what the impact is of reserving 1000 rules per cell type. Recall that you only need one HR per behavior, which is relatively bounded. We could examine the expected number of behaviors for 100 cell types and 100 diffusing substrates.

In fact, we could do better. Instead of fixing 1000, we could (1) make this number a settable parameter, (2) calculate that parameter by paring the XML, based on number of cell types and number of diffusible substrates. In fact, even easier: look at the length of the behaviors dictionary, (3) multiplying by a safety factor, and (4) allocating that number of rules, rather than 1000.

This preserves the simplicity and safety, while taking the guesswork (and user work!) out of how many to pre-allocate.

What do you think?

MathCancer avatar Feb 08 '24 18:02 MathCancer

Ping @rheiland and @drbergman .

I view this as one of the most important things we can fix right now. What are your thoughts? Thank you!

MathCancer avatar Feb 10 '24 20:02 MathCancer

I can appreciate the extra precautions @MathCancer took in implementing the reserve method for addressing this. However, it still feels like an artificial solution. I don't really worry that it would come back to bite us...but I don't really know.

If we can be convinced that my solution works, then I'd prefer that. In case it helps build confidence in this fix, here's my logic in how I made changes:

  1. Hypothesis_Ruleset has a member rules, which we want to change from std::vector< Hypothesis_Rule> rules to std::vector< Hypothesis_Rule* > rules.
  2. As a member, rules must be written in the code whenever it is accessed
  3. Search entire PhysiCell repo for rules and replace instances of rules[INDEX].SOMETHING with rules[INDEX]->SOMETHING where INDEX is an integer index and SOMETHING is a member of Hypothesis_Rule.

The case-sensitive, match whole word search for rules (after excluding Makefile*,*.md,.git) returned only 95 results across 9 files, so it's pretty manageable to check manually. I just double-checked again and I didn't find any missed instances.

drbergman avatar Feb 11 '24 01:02 drbergman

First, I was the one who introduced the reserve hack, not Paul. My opinion is that we go with the proper fix of Daniel's. Yes, it's more code editing, but it's not a hack. I also believe he's done a thorough job of tracking down all edits that need to be made. We need to include more unit/functional tests involving rules (simple to more complex) that will help determine if things are working correctly. Obviously those tests should include the models+rules that exhibited the bug in the first place. And one of the important lessons learned from this Issue is that C++ std::vector elements have dynamic memory addresses and we should not store the address of an element in any PhysiCell data structure.

rheiland avatar Feb 12 '24 15:02 rheiland

Yes, @rheiland did that, and I regard it more highly than a hack, for the record. I think it has elegance in its simplicity and robustness.

If you are both convinced on the more extensive rewrite, I'm happy to go that way.

MathCancer avatar Feb 12 '24 20:02 MathCancer