volatility3 icon indicating copy to clipboard operation
volatility3 copied to clipboard

Unify Linux plugins representation of pids

Open eve-mem opened this issue 2 years ago • 22 comments

Is your feature request related to a problem? Please describe. The "pid" as shown by different plugins is not consistent. Either displaying the pid or tgid value from the tast_struct.

For example the pslist plugin uses tgid: https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/pslist.py#L83

While malfind will report pid: https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/malfind.py#L83

And in fact if you wanted to use a pid filter for plugins, that is actually using pid rather than tgid: https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/plugins/linux/pslist.py#L63

Describe the solution you'd like Picking one and sticking with it across the plugins. I'm happy to make the changes in a PR when a choice is made.

I think the biggest question is what would actually be preferred?

tgid is the most technically correct "pid" to display, but using that value and calling it pid in plugin outputs might be confusing to some? It'll catch people out when writing plugins or working with volshell.

However if all changes to pid are swapped to tgid and now none of the plugins report a pid result. That will be confusing to people that doesn't yet know what tgid is. They might want to find a process by a pid they have from logs etc, but it's now not obvious how to do that in vol.

I'm not sure personally what the right answer is, although I think I lean to changing all plugins to use the tgid value but reporting it as pid. We're using "pid" to mean "unique id for a process to use between plugins and different sources of data" and tgid does exactly that. It's what ps displays after all. It's just slightly confusing as the task_struct has a field called pid that might seem more natural to use.

We could go to the extreme of adding a get_pid() function to the task_stuct in the Linux extensions to keep it all central?

This is probably quite a minor point, but I'd personally love to see consistency across the plugins.

I'd be really interested in your thoughts. Thank you.

eve-mem avatar Jul 06 '23 10:07 eve-mem

I don't know enough about the structures to know the difference between a tgid and a pid. I'm happy to defer to @atcuno decision on this front?

ikelos avatar Jul 09 '23 22:07 ikelos

There was a suggestion at one point of potentially abstracting the concept of a process from any operating system into a single class in some way, but I dunno if anyone has the energy or desire to dig into all the details of doing that (or whether it would be a very useful thing to do)?

ikelos avatar Jul 09 '23 22:07 ikelos

I've been thinking about a generic way of handling processes (maybe worth discussion in a different "issue"?)

It's interesting. I think could work, but you'd need to be able to get to the actual process objects for the different OSes quickly or it could get annoying.

Really it could be a nice set of convenience features and it could be expanded over time, something to abstract away the way different OSes do things. I would think at a minimum you'd want something like:

  • unique id per process "pid"
  • parent pid
  • the default way of walking through all the processes (probably with a way to change it via parameters per OS to walk different lists etc)
  • start time
  • easy access to (or way to create) a layer to access the process private memory (if it exists)

That list could then get expanded over time. Arguments and environment variables also feel like they might be easy enough to add.

Things like file handles etc might be difficult to add, but we could stick with the current per OS way of dealing with that for now?

So would a minimal process object like that be useful enough to be worth doing? It would make things like yara, strings, and any other generic analysis plugins easier to make. It doesn't help with OS specific plugins, but I guess that's how they are already, so probably not making things worse?

I think at the moment I lean towards thinking it might be useful to do, but what do others think?

eve-mem avatar Jul 18 '23 06:07 eve-mem

I believe we already have a semi-consistent way of adding process layers. It might be we just define an asbtract interface, and make sure the _EPROCESS and whatever the linux equivalent is, both support it. Then we've have named methods (get_pid, get_ppid, get_starttime, etc) all of which would respond in a uniform manner.

I'm a little worried about chaining/listing processes, I'd like to be able to extend that later, but I'm not sure it should be part of the first iteration.

I agree I don't think it'll make things worse. Whether it makes things better or just adds an extra layer, I'm not sure. It should be transparent to existing plugins, because they'd continue to use the native objects, it's just that those native objects would also begin inheritting from the abstract process class. I figure we mock it up so people can see what it helps with and what it doesn't and then figure out if it's worthwhile (so invest a bit of time, but don't go overboard, it might still get thrown away). How does that sound?

ikelos avatar Jul 23 '23 20:07 ikelos

I don't know enough about the structures to know the difference between a tgid and a pid. I'm happy to defer to @atcuno decision on this front?

Hey! sorry for getting into this conversation :) ... my two cents: I know it's confusing, I've been there. This is particularly true in Linux, where the terminology holds distinct interpretations based on whether it's in user-space or kernel-space. The term PID typically refers to its representation in user-space. Within the kernel it's referred to as TGID. And, to finish screwing it up, the user-space TID corresponds to the kernel name of PID.

User-Space Kernel
PID task_struct.tgid
PPID task_struct.parent.tgid
TID task_struct.pid

Then, what from user-space is called a 'process' has TGID == PID. And a user 'thread' or 'lightweight process' TGID != PID.

All the above is explained in the getpid() syscall comment here.

Having said that, IMHO I wouldn't recommend changing the names in the plugins to their user-space ones. In my opinion, doing so could lead to more confusion among developers and potentially introduce more errors in the code. I would use the kernel names in our source code.

On the other hand, I like the idea of methods responding in a uniform manner for all the OSes. Maybe, it would be good to have a look at a cross-platform process libraries such as Python's psutils and see what they have done there. Although, I'm pretty sure they didn't use the kernel names :)

gcmoreira avatar Aug 14 '23 16:08 gcmoreira

So perhaps we deviate from the specific nomenclature for the OS, but just go with "what people would expect" for the uniform interface for all OSes?

ikelos avatar Aug 16 '23 19:08 ikelos

@gcmoreira yeah thats exactly what I mean.

The linux pslist plugin returns thr vaule from tgid, but calls it PID for the user. I think that's probably a good thing.

The issue I mean is that other plugins don't do this.

For example this scenario:

as a user I run pslist, find some process I'm interested in and note the PID. I then try the lsof plugin with a pid filter for that pid, but I get no results.

That's not because there isn't any, but because the pid filter function uses pid rather than tgid.

So if they don't match it causes confusion. It seems like a good idea if all plugins return "pid" in the same way.

eve-mem avatar Aug 16 '23 19:08 eve-mem

@ikelos I missed your comment somehow, but I think that is a good idea. What people would expect is exactly what should happen - with consistency across the plugins.

eve-mem avatar Aug 16 '23 19:08 eve-mem

Hey! We need to make a decision on this. It seems you agree with going for 'what people would expect', so let's see what @atcuno thinks about it and if it's ok proceed with that approach.

To prevent our code from becoming overly confusing, I would recommend, as a good practice, being more explicit when naming variables i.e. user_pid = task.tgid instead of just i.e. pid = task.tgid. And then yield from the individual variables like: yield user_pid, user_tid, user_ppid instead of yield task.tgid, task.pid, task.parent.tgid.

Regarding the PID filtering, @eve-mem your example above doesn't make sense to me. Filtering by task.pid actually would never miss any task. Even, if the user confused PID with TID. See the example below. Anyway I do understand what you mean there and agree we have the same issue as with the column names here. Currently, the PID in the --pid argument is meaning filtering by task.pid. And we don't have a way to only filter by task.tgid. However, having only one way/argument to filter by numeric ID, I still think that filtering by user_pid == task.pid is the best way to do it. Let me explain the two alternatives with an example:

USER_PID(task.tgid) USER_TID(task.pid) Name Comment
100 100 process_name1 Thread group leader 1
100 101 process_name1 1st thread of USER_PID 100
100 102 process_name1 2nd thread of USER_PID 100
103 103 process_name2 Thread group leader 2
  1. Using user_pid == task.pid , the user wants to filter:
  • USER_PID=100 and uses --pid=100: It will return just the first row
  • USER_TID=100 and uses --pid=100: It will return just the first row
  • USER_TID=101 and uses --pid=101: It will return just the second row
  • USER_TID=102 and uses --pid=102: It will return just the third row
  1. Using user_pid == task.tgid, the user wants to filter:
  • USER_PID=100 and uses --pid=100: It will return the first three rows.
  • USER_TID=100 and uses --pid=100: It will return the first three rows.
  • USER_TID=101 and uses --pid=101: It will return nothing; it has to use --pid=100 and visually filter which of the three tasks has TID =101
  • USER_TID=102 and uses --pid=102: It will return nothing; it has to use --pid=100 and visually filter which of the three tasks has TID =102

Alternatively, if we add an extra argument we can have:

  • --pid (filtering by task.tgid) and --tid (filtering by task.pid) or ...
  • --pid (filtering by task.pid) and --tgid (filtering by task.tgid).

gcmoreira avatar Dec 12 '23 08:12 gcmoreira