fluffos
fluffos copied to clipboard
Extend checkmemory into a full garbage collector
I am not sure which algorithm would be good perhaps one of the copy sweep collectors to prevent fragmentation. Is the list of usages in checkmemory pretty comprehensive now? If it is I think it should be reasonably straightforward to modify the alloc behavior of array,mapping etc (svalue_t) to support full garbage collection. Perhaps try the simple mark/sweep first then modify it to something a bit better (but perhaps slightly more difficult to implement though not substantially so).
check_memory (Or it is actually the debugmalloc.cc ) is essentially a arena allocator that puts metadata before all allocated blocks. it also does reference counting (which is implemented manually,, then compare calculated reference with actual blocks allocated.
And, most importantly, debugmalloc doesn't really count LPC objects allocated on stack I think
The stack objects I think should be relatively easy to add to the root set since the stack in the driver is for better or worse currently a global variable.
So, sounds like you can start by adding a "mark" efun that loop through all svalues. We can talk about sweep part when that is working.
I actually haven't thought about this in a little while. I think the main problem is that there are a lot of cases where the driver uses svalues in ways different than what a simple garbage collector expects. What I mean by this is that is at certain times the driver keeps svalues for internal book keeping outside what is generated by lpc code. To cope with this one can enlarge the root set but it is a bit difficult to be certain the root set is covering everything so that one does not inadvertently delete used values.
This is a sketch of some code I have been working on that I need to start testing. It basically is a mixin class that will be inherited by svalue_t, array_t,mapping_t etc with specialized virtual functions for determining the set of pointers that point to the next group of items for the mark phase.
`class garbage_collectable { public: static void* operator new(std::size_t size) throw(std::bad_alloc); virtual std::list<garbage_collectable*> follow_set() = 0; void setMark(bool mark); bool getMark() const; private: bool mark_; };
class garbage_collector { public: static void init(std::function<std::list<garbage_collectable*>()> fun) { rootSetFunction_ = fun; }
static void record_new_pointer(void* p);
static void collect(); // mark and sweep
private: static std::function<std::list<garbage_collectable*>()> rootSetFunction_; //replace with a function or functor. static std::list<garbage_collectable*> allObjects_; static bool lastMark_; };
void garbage_collectable::setMark(bool mark) { mark_ = mark; }
bool garbage_collectable::getMark() const { return mark_; }
void* garbage_collectable::operator new(std::size_t size) throw (std::bad_alloc) { auto ptr = ::operator new(size); garbage_collector::record_new_pointer(ptr); return ptr; }
void garbage_collector::record_new_pointer(void* p) { allObjects_.push_back( static_cast<garbage_collectable*>(p) ); }
void garbage_collector::collect() { std::queue<garbage_collectable*> workQueue{ rootSetFunction_() }; auto mark = !lastMark;
while (!workQueue.empty())
{
auto element{ workQueue.front() };
workQueue.pop();
element->setMark(mark);
auto list = element->follow_set();
copy(list.begin(), list.end(), back_inserter(list));
}
allObjects_.remove_if([mark = mark](auto element) { element->getMark() == !mark; });
lastMark = !lastMark;
} `
#include
#include
class garbage_collectable { public: static void* operator new(std::size_t size) throw(std::bad_alloc); virtual std::list<garbage_collectable*> follow_set() = 0; void setMark(bool mark); bool getMark() const; virtual ~garbage_collectable(); private: bool mark_; };
class garbage_collector { public: static void init(std::function<std::list<garbage_collectable*>()> fun) { rootSetFunction_ = fun; }
static void record_new_pointer(void* p);
static void collect(); // mark and sweep
private: static std::function<std::list<garbage_collectable*>()> rootSetFunction_; //replace with a function or functor. static std::list<garbage_collectable*> allObjects_; static bool lastMark_; };
void garbage_collectable::setMark(bool mark) { mark_ = mark; }
bool garbage_collectable::getMark() const { return mark_; }
void* garbage_collectable::operator new(std::size_t size) throw (std::bad_alloc) { auto ptr = ::operator new(size); garbage_collector::record_new_pointer(ptr); return ptr; }
void garbage_collector::record_new_pointer(void* p) { allObjects_.push_back( static_cast<garbage_collectable*>(p) ); }
void garbage_collector::collect() { std::list<garbage_collectable*> workQueue = rootSetFunction_(); auto mark = !lastMark_;
while (!workQueue.empty())
{
auto element = workQueue.front();
workQueue.pop_front();
element->setMark(mark);
auto list = element->follow_set();
copy(list.begin(), list.end(), back_inserter(workQueue));
}
// this needs to delete before removing... loop through and delete and set to nullptr first?
for (auto& elem : allObjects_) {
if (elem->getMark() == !mark) {
delete elem;
elem = nullptr;
}
}
allObjects_.remove_if([mark = mark](auto element) { return element == nullptr; });
lastMark_ = !lastMark_;
}
std::function<std::list<garbage_collectable*>()> garbage_collector::rootSetFunction_ = nullptr; std::list<garbage_collectable*> garbage_collector::allObjects_ = {}; bool garbage_collector::lastMark_ = false;
Problem is this is actually C++14 code ;-)
I think need to test and debug it now then implement a rootsetfunction that encapsulates the various sources of variables in the checkmemory code.
Sorry have been looking at other things for the last week. I just got back to looking at this. I think for the most part testing the mixin code and making sure it works with all the types in vm/internal/base is the key. The idea would be to replace all the malloc/calloc calls with new calls in the allocate functions in the respective interfaces then make sure that the global link list is updating correctly. Step after that would probably be to look through the check memory code to compute the root set- and perhaps add the otable objects to the root set as well.
Actually this might be a bit harder than i initially thought because of the fact that svalues are not allocated on the heap and I am not sure at present where to find the code that allocates svalues. if they are only stack allocated it isn't difficult but if there are other uses of svalue_t it might not quite work.
I am actually feel I am close to figuring this out. Will potentially post a patch for this sometime this week.
What is the new apply_ret_value for?
Progress report down to one grep file for array_t. seems like the gc code root set determination as expected depends on which packages are included since the parser maintains a bunch of static global variables. there might be more files that do this? how do i check/guard if a certain package is included?
Check the #define
from local_options...