Implement seq_file iterator to improve performance under load

Open reberhardt7 opened this issue 4 years ago • 0 comments

Cplayground gets info from the kernel by reading /proc/cplayground. That file is generated by the cplayground kernel module, which iterates over the processes and "prints" to a buffer managed by the seq_file interface. seq_file allocates a buffer, and each "print" appends to it. If the buffer runs out of space, subsequent prints are discarded until our code finishes running; seq_file then discards the too-small buffer, allocates a new one double the size, and reruns our code, which prints all of the process info again from scratch.

When there are a lot of processes running or a lot of open files, this creates severe performance issues. It's not uncommon to see seq_file run our code 5+ times before the buffer is finally large enough for the output to fit.
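To make the cost concrete, here is a small userspace model of the retry behavior described above. It is not kernel code: model_buf, model_puts, show_all, and passes_needed are hypothetical names, the 17-byte record format is made up, and the 4096-byte starting size is an assumption (one typical page).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for a seq_file buffer (hypothetical, not the kernel
 * struct): once a print does not fit, it and all later prints are dropped. */
struct model_buf {
    char  *data;
    size_t size;
    size_t used;
    int    overflowed;
};

static void model_puts(struct model_buf *b, const char *s)
{
    size_t len = strlen(s);

    if (b->overflowed || b->used + len > b->size) {
        b->overflowed = 1;      /* discard this and every later print */
        return;
    }
    memcpy(b->data + b->used, s, len);
    b->used += len;
}

/* Stand-in for a one-shot show function that prints every process.
 * The record format is invented; each line is 17 bytes for 4-digit pids. */
static void show_all(struct model_buf *b, int nprocs)
{
    char line[64];

    for (int i = 0; i < nprocs; i++) {
        snprintf(line, sizeof(line), "pid %d state R\n", 1000 + i);
        model_puts(b, line);
    }
}

/* Rerun show_all with a buffer twice as large after every overflow.
 * Returns the number of full passes needed, or -1 if malloc fails. */
static int passes_needed(int nprocs, size_t initial_size)
{
    size_t size = initial_size;
    int passes = 0;

    assert(initial_size > 0);
    for (;;) {
        struct model_buf b = { malloc(size), size, 0, 0 };

        if (!b.data)
            return -1;
        show_all(&b, nprocs);
        passes++;
        free(b.data);
        if (!b.overflowed)
            return passes;
        size *= 2;              /* throw the output away and start over */
    }
}
```

In this model, passes_needed(2000, 4096) returns 5: about 34 KB of output overflows the 4, 8, 16, and 32 KB buffers, and every failed pass regenerates all 2000 records.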

A better approach is to implement the seq_file iterator interface (the start, next, stop, and show callbacks in struct seq_operations). Instead of having one function to generate the entire file (ct_seq_show in our existing code), we should do the following:

  • When the cplayground file gets opened, we should call our get_containerized_processes function to populate a list of processes. We can store this list in the file's private data. This example from the seq_file documentation looks promising (mystruct, foo, bar, and baz are the documentation's placeholders):
      static int ct_open(struct inode *inode, struct file *file)
      {
      	struct mystruct *p =
      		__seq_open_private(file, &ct_seq_ops, sizeof(*p));
    
      	if (!p)
      		return -ENOMEM;
    
      	p->foo = bar; /* initialize my stuff */
      		...
      	p->baz = true;
    
      	return 0;
      }
    
  • We should use the iterator interface to iterate over the processes in the process list. That way, if the buffer overflows, we don't need to restart writing the entire file from scratch; seq_file can copy out the output that was successfully generated up to that point, then resume from the process whose output overflowed.
  • We'll need to free the list of processes when the file is closed.
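The steps above can be sketched as a userspace model (again, not kernel code). It assumes the process list has already been built at open time, and every name in it (proc_list, model_start, model_next, model_show, model_read, drain) is hypothetical. The point is only to show why resuming from a position is cheaper than restarting: a record that does not fit is rolled back and shown again on the next read, but earlier records are never regenerated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* What open() would stash in the file's private data (hypothetical). */
struct proc_list {
    const int *pids;
    size_t     count;
    size_t     show_calls;  /* instrumentation for this model only */
};

/* start: return the record at *pos, or NULL at the end of the list. */
static const int *model_start(struct proc_list *l, size_t *pos)
{
    return *pos < l->count ? &l->pids[*pos] : NULL;
}

/* next: advance the position and return the following record. */
static const int *model_next(struct proc_list *l, size_t *pos)
{
    ++*pos;
    return model_start(l, pos);
}

/* show: format one record into out; returns the bytes the record needs. */
static size_t model_show(struct proc_list *l, const int *pid,
                         char *out, size_t room)
{
    l->show_calls++;
    return (size_t)snprintf(out, room, "pid %d state R\n", *pid);
}

/* Model of one read(): emit whole records until one does not fit.
 * *pos is left at the record that did not fit, so the next read resumes
 * there. (This model does not grow the buffer for a single record that is
 * larger than the whole buffer.) */
static size_t model_read(struct proc_list *l, size_t *pos,
                         char *buf, size_t size)
{
    size_t used = 0;
    const int *v = model_start(l, pos);

    while (v) {
        size_t need = model_show(l, v, buf + used, size - used);

        if (need >= size - used)    /* truncated: drop the partial record */
            break;
        used += need;
        v = model_next(l, pos);
    }
    return used;
}

/* Drain the whole "file" with repeated reads; returns total bytes read. */
static size_t drain(struct proc_list *l, size_t bufsize)
{
    char buf[256];
    size_t pos = 0, total = 0, n;

    assert(bufsize <= sizeof(buf));
    while ((n = model_read(l, &pos, buf, bufsize)) > 0)
        total += n;
    return total;
}
```

With ten 17-byte records and a 64-byte buffer, drain returns 170 and show_calls ends at 13: three records fit per read, and the one that overflowed is shown once more at the start of the next read. In the kernel module, the list would also have to be freed in the release handler, which this model leaves out.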

We should also confirm that performance improves. I'm pretty sure it should, based on my understanding of seq_file, but if I am wrong and seq_file still restarts file generation from scratch on overflow, then this change is not helpful and only adds complexity.

reberhardt7 • Apr 16 '20 20:04