phoenix_live_dashboard icon indicating copy to clipboard operation
phoenix_live_dashboard copied to clipboard

Filtered process list

Open bokner opened this issue 1 year ago • 33 comments

Allow custom implementation of process lists. The motivation is to be able to inspect specific groups of processes rather than pulling the whole process list. For instance, we may want to look at the processes from the specific registry, supervised by a specific supervisor, etc.

bokner avatar Mar 02 '23 18:03 bokner

Hi @bokner! Thanks for the PR. I think we should simplify this a bit. I think we could have a process_filter and an input. The input defines some text that is sent to the filter. If the filter is enabled, then we call the filter to return a list of processes with the input value.

So the goal is to only filter processes but not customize rendering. WDYT?

josevalim avatar Mar 02 '23 18:03 josevalim

Hi @josevalim, Thank you for the quick response. I'm not sure I follow. For purposes of our project, we do want to customize the rendering of the process info page. Specifically, we want to pull some specific data from the registry and show it along with some process_info data (removing some pieces that are not relevant, such as registered_name).

I also realized that more work needs to be done to sync up my code with the latest changes. I started out forking v0.7.1, and I see that things changed quite a bit.

bokner avatar Mar 02 '23 19:03 bokner

I see. So in this case, I don't think it is a filter, is it? You want to argument the information. Perhaps we should have a ProcessInfo behaviour but it has to return a keyword list of data, instead of rendering on its own.

josevalim avatar Mar 02 '23 19:03 josevalim

But it is a filter :-) Perhaps, the example of implementation would explain the intention better(sorry, I should have thought about it!):

defmodule ProcessFilter.Demo do
  @behaviour Phoenix.LiveDashboard.ProcessFilter
  def list() do
    ["Registered", "All"]
  end

  def filter("Registered") do
    Process.registered() |> Enum.map(fn name -> %{pid: Process.whereis(name), name_or_initial_call: name, extra: "extra_param"} end)
  end

  def filter("All") do
    Process.list()
  end



  def render_process_info(assigns, filter) when filter in ["Registered"] do
    render(Map.put(assigns, :filter, filter))
  end

  def render_process_info(_assigns, _filter) do
    nil
  end

  use Phoenix.LiveDashboard.Web, :live_component
  def render(assigns) do
    ~H"""
    <div class="tabular-info">
      <%= if @alive do %>
        <table class="table table-hover tabular-info-table">
          <tbody>
            <tr><td class="border-top-0">Filtered by</td><td class="border-top-0"><pre><%= @filter %></pre></td></tr>
            <tr><td class="border-top-0">Registered name</td><td class="border-top-0"><pre><%= @registered_name %></pre></td></tr>
            <tr><td>Total heap size</td><td><pre><%= @total_heap_size %></pre></td></tr>
            <tr><td>Heap size</td><td><pre><%= @heap_size %></pre></td></tr>
            <tr><td>Stack size</td><td><pre><%= @stack_size %></pre></td></tr>
            <tr><td>Reductions</td><td><pre><%= @reductions %></pre></td></tr>
           <tr><td>Current stacktrace</td><td><pre><%= @current_stacktrace %></pre></td></tr>
          </tbody>
        </table>

        <%= if @page.allow_destructive_actions do %>
          <div class="modal-footer">
            <button class="btn btn-danger" phx-target={@myself} phx-click="kill">Kill process</button>
          </div>
        <% end %>
      <% else %>
        <div class="tabular-info-not-exists mt-1 mb-3">Process is not alive or does not exist.</div>
      <% end %>
    </div>
    """
  end
end

So here we have two options for filtering process list, and the process info view for registered processes is customized.

bokner avatar Mar 02 '23 19:03 bokner

Got it. So I think it is good but instead of render_process_info we should call it extra_process_info and return a keyword list, WDYT?

josevalim avatar Mar 02 '23 19:03 josevalim

Got it. So I think it is good but instead of render_process_info we should call it extra_process_info and return a keyword list, WDYT?

I like the idea of not dealing with rendering outside of the ProcessInfoComponent, but I'm not sure how do we customize the rendering then? For instance, for some processes, we want to remove registered name and/or some other things, add application-specific links (not just the table rows) etc. Perhaps, we could just provide a list of elements for Phoenix.LiveDashboard.PageBuilder.label_value_list?

bokner avatar Mar 02 '23 20:03 bokner

So perhaps we may need to return structured data, such as map with key, value and an optional link.

josevalim avatar Mar 02 '23 20:03 josevalim

Actually, let’s call it extra_info only, without process, because we can use the same behavior to extend other pages later. :)

josevalim avatar Mar 02 '23 21:03 josevalim

I think, in addition to returning the content as structured data for the component to render it, we might also want to provide a way to render the content info page directly by the implementation module. The reasoning is that the markup for the info page could be quite complex (as it is for our particular use case), i.e., not limited by the table, basic links, etc.

Then the corresponding callback would be

@callback info_content(assigns :: Socket.assigns(), filter_name :: String.t() | nil) ::
              Phoenix.LiveView.Rendered.t() | [map()] | nil

An example of an implementation that returns the list of maps:

  def info_content(assigns, filter) do
    info_list(assigns)
  end

  defp info_list(assigns) do
    [
      %{label: "Registered name", value: assigns.registered_name},
      %{label: "Total heap size", value: assigns.total_heap_size},
      ## some other pieces
    ]
  end

An example of an implementation that returns the template:

  def info_content(assigns, filter) do
    info_template(assigns)
  end

  defp info_template(assigns) do
    ~H"""
    <table class="table table-hover tabular-info-table">
      <tbody>
        <tr><td class="border-top-0">Registered name</td><td class="border-top-0"><pre><%= @registered_name %></pre></td></tr>
        <tr><td>Total heap size</td><td><pre><%= @total_heap_size %></pre></td></tr>
      </tbody>
    </table>
  """
  end

Both variations are currently handled by Phoenix.LiveDashboard.PageFilter. What do you think?

bokner avatar Mar 06 '23 18:03 bokner

Unfortunately for us that is a no-go because we don't want people to assume it is being rendered in a certain way or in a certain part. We need to keep everything driven by Elixir data.

josevalim avatar Mar 06 '23 21:03 josevalim

Do you have a screenshot of what you want to render right now? And which kind of rich data would be necessary to achieve it?

josevalim avatar Mar 06 '23 21:03 josevalim

I'm not 100% sure if I like the filter idea. We already have a good abstraction for this: row_fetcher. Instead of modifying the current behaviour to add more complexity, we could change the row_fetcher "in runtime using a select input" to a custom one.

Thank you for the feedback @alexcastano. I'm not sure I completely understand your idea. We would have to introduce some customization either way. Currently, the modification in the original code is limited by the addition of filter parameter to fetch_processes call. Then fetch_processes is using a custom implementation of process list. What would be your suggested strategy here?

bokner avatar Mar 07 '23 15:03 bokner

@bokner @alexcastano one of the complications here is that we need to fetch the processes from another node and, in order to fetch processes from another node, we need to run code in said node.

So customizing it via the row_fetcher is limited or even the router is tricky, because we can pass a module that we are not sure exists in the node we are fetching the information from.

In fact, @bokner, the filter itself and its possible values must come from the remote node. You should read from the application environment in a single place and that must be inside the remote node callback. So I guess our callback has to be:

@callback filter(name) :: {current_filter | nil, [filter],%{label: binary(), value: binary()}}

In this sense, I am fine with it being the application environment, ultimately the node we are connecting to is the one which has to decide which filter functionality it provides (and it has to pick one).

josevalim avatar Mar 07 '23 16:03 josevalim

@bokner @alexcastano one of the complications here is that we need to fetch the processes from another node and, in order to fetch processes from another node, we need to run code in said node.

Yes, and that's what fetch_processes/2 does, making rpc calls to retrieve the (filtered) process list on remote node.

So customizing it via the row_fetcher is limited or even the router is tricky, because we can pass a module that we are not sure exists in the node we are fetching the information from.

We can modify Phoenix.LiveDashboard.SystemInfo.processes_callback/6 (which is being remotely called by fetch_processes/2), so it can fall back to default behaviour (i.e., full process list) if the filter module is not on the node.

bokner avatar Mar 07 '23 18:03 bokner

We can modify Phoenix.LiveDashboard.SystemInfo.processes_callback/6 (which is being remotely called by fetch_processes/2), so it can fall back to default behaviour (i.e., full process list) if the filter module is not on the node.

The problem is that they can differ altogether, so it is better to always ask the node and don't have to guess or compare. The contract I proposed above should do it!

josevalim avatar Mar 07 '23 19:03 josevalim

@bokner @alexcastano one of the complications here is that we need to fetch the processes from another node and, in order to fetch processes from another node, we need to run code in said node.

So customizing it via the row_fetcher is limited or even the router is tricky, because we can pass a module that we are not sure exists in the node we are fetching the information from.

I understand this point. However, I don't understand why we can assume that LiveDashboard.SystemInfo is in other nodes, but we cannot assume the same for other modules.

Furthermore, if you have a heterogeneous cluster, it is the user's responsibility to provide modules that exist and can be executed nicely in all types of nodes. If not, can we always capture the error and show a friendly error?

alexcastano avatar Mar 08 '23 00:03 alexcastano

I understand this point. However, I don't understand why we can assume that LiveDashboard.SystemInfo is in other nodes, but we cannot assume the same for other modules. Furthermore, if you have a heterogeneous cluster, it is the user's responsibility to provide modules that exist and can be executed nicely in all types of nodes.

I agree with @alexcastano , I think it's fairly unusual for the cluster nodes to have different sets of modules (unless there is a release upgrade or redeployment which usually does not last for long).

bokner avatar Mar 08 '23 14:03 bokner

So I guess our callback has to be:

@callback filter(name) :: {current_filter | nil, [filter],%{label: binary(), value: binary()}}

I'm not sure we are on the same page :-) I added the description of callbacks for the current implementation with the working example and the description of the flow here, which I should have done from the very beginning :-)

https://github.com/phoenixframework/phoenix_live_dashboard/pull/411/files#diff-d70287c0ec9596c0e2e8a5061d182a21940c2407ce650f7719719bff11a0c3b8

bokner avatar Mar 08 '23 15:03 bokner

I understand this point. However, I don't understand why we can assume that LiveDashboard.SystemInfo is in other nodes, but we cannot assume the same for other modules.

Because we load SystemInfo into other nodes as soon as we connect and we carefully write the code in SystemInfo so it doesn't invoke any other module!

Furthermore, if you have a heterogeneous cluster, it is the user's responsibility to provide modules that exist and can be executed nicely in all types of nodes. If not, can we always capture the error and show a friendly error?

It is an intentional feature of the dashboard to connect to heterogeneous nodes. So I think we should do our best to design features that work well on heterogeneous nodes.

josevalim avatar Mar 08 '23 17:03 josevalim

@bokner we may need two different behaviours. One is the filter, it will look like this:

      def filter(:default), do: filter("All")

      def filter("All") do
        {"All", list(), Process.list()}
      end

      def filter("Phoenix") do
        {"Phoenix", list(), Process.list() |> ...}
      end

The filter will not be sent from the view. It will be read directly from the application environment of the connected node. The first time we call it, we will pass :default as the key.

Therefore, first please send a PR that only adds the filters, and we will work on merging it. The next step is to figure out the info parts, but we will do that next. :)

josevalim avatar Mar 08 '23 18:03 josevalim

Btw, I understand there are concerns about heterogeneous nodes but breaking them is a deal-breaker to me. Process listing is basic functionality of the dashboard and we should not break it. If we can't make it work on heteregenous setup, then I believe we should not accept this change. FWIW, Observer is also 100% heteregenous.

josevalim avatar Mar 08 '23 18:03 josevalim

@bokner we may need two different behaviours. One is the filter, it will look like this:

      def filter(:default), do: filter("All")

      def filter("All") do
        {"All", list(), Process.list()}
      end

      def filter("Phoenix") do
        {"Phoenix", list(), Process.list() |> ...}
      end

The filter will not be sent from the view. It will be read directly from the application environment of the connected node. The first time we call it, we will pass :default as the key.

Therefore, first please send a PR that only adds the filters, and we will work on merging it. The next step is to figure out the info parts, but we will do that next. :)

Thanks, this makes sense.

bokner avatar Mar 08 '23 18:03 bokner

I understand this point. However, I don't understand why we can assume that LiveDashboard.SystemInfo is in other nodes, but we cannot assume the same for other modules.

Because we load SystemInfo into other nodes as soon as we connect and we carefully write the code in SystemInfo so it doesn't invoke any other module!

I read now the code and you're absolutely right. I didn't notice this feature before. It is a brilliant solution 💎 Thanks for the explanation.

Furthermore, if you have a heterogeneous cluster, it is the user's responsibility to provide modules that exist and can be executed nicely in all types of nodes. If not, can we always capture the error and show a friendly error?

It is an intentional feature of the dashboard to connect to heterogeneous nodes. So I think we should do our best to design features that work well on heterogeneous nodes.

Yeah, we agree 100%. I just thought that dashboard would fail if it wasn't loaded in the remote node.

alexcastano avatar Mar 08 '23 21:03 alexcastano

@alexcastano

We updated the name from filter to active_filter, so we have to update it here too.

In the code, filter is used as the name of the dropdown, whereas active_filter and filter_list are part of socket.assigns. In particular, active_filter could be updated directly as a result of ProcessFilter module being hot-swapped (or some condition that is coded into the implementation as the polling happens), and not by the user action from UI. So I think it's a good idea to have different names for the form field and the session (i.e., socket.assigns) variable.

bokner avatar Apr 24 '23 12:04 bokner

You have to implement the All filter. In the same way, the user must do it too. However, we could automatically add All as the first option, which will set active_filter = nil. Here, we can check if active_filter, do: filter_mod.filter(active_filter), else: Process.list(). What do you think?

I would argue that if the implementation has "XXX" in the filter list, it probably does not make sense not to have a corresponding filter("XXX"). Also, in our particular case, we do not want to have an "All" option for all users, just to avoid accidentally polling millions of processes that we have.

bokner avatar Apr 24 '23 13:04 bokner

@alexcastano

We updated the name from filter to active_filter, so we have to update it here too.

In the code, filter is used as the name of the dropdown, whereas active_filter and filter_list are part of socket.assigns. In particular, active_filter could be updated directly as a result of ProcessFilter module being hot-swapped (or some condition that is coded into the implementation as the polling happens), and not by the user action from UI. So I think it's a good idea to have different names for the form field and the session (i.e., socket.assigns) variable.

Sorry, but I don't get it. I don't see any @filter variable used in the template, either socket.assigns.filter in the rest of the code. If this is true, it doesn't make sense to have filter in the specs, in the Map.put_new and the attr. On the other hand, active_filter is used and it could make sense to be modified "externally", for example, using update function.

Maybe are you confused because socket.assigns.table_params.filter is very similar to socket.assigns.filter? I'm just trying to understand :)

alexcastano avatar Apr 24 '23 20:04 alexcastano

You have to implement the All filter. In the same way, the user must do it too. However, we could automatically add All as the first option, which will set active_filter = nil. Here, we can check if active_filter, do: filter_mod.filter(active_filter), else: Process.list(). What do you think?

I would argue that if the implementation has "XXX" in the filter list, it probably does not make sense not to have a corresponding filter("XXX"). Also, in our particular case, we do not want to have an "All" option for all users, just to avoid accidentally polling millions of processes that we have.

I don't think it is a problem to have always the "All" option. AFAIK the :observer works in a similar way. Because we have a limit to don't showing every process it should be fine. On the other hand, I was thinking more in a malicious user sending wrong tokens than the developer forgetting to add the clause.

alexcastano avatar Apr 24 '23 21:04 alexcastano

I don't think it is a problem to have always the "All" option. AFAIK the :observer works in a similar way.

It is a problem for us, and that's also the reason why we avoid using observer :-)

Edit:

Because we have a limit to don't showing every process it should be fine.

The issue is not so much with how many rows are shown, but rather with server-side polling a potentially huge number of processes and retrieving process info for each of them.

bokner avatar Apr 24 '23 21:04 bokner

Sorry, but I don't get it. I don't see any @filter variable used in the template, either socket.assigns.filter in the rest of the code. If this is true, it doesn't make sense to have filter in the specs, in the Map.put_new and the attr.

You are right, there is none :-) I will try to convey my understanding of the workflow. I have set up a small application for the test, and the description of the workflow mostly comes from debugging it.

  • filter is the name of the form field https://github.com/phoenixframework/phoenix_live_dashboard/blob/fe86d97d99c4025133240dea4d3b576872984183/lib/phoenix/live_dashboard/components/table_component.ex#L213

  • The value of this field could be set in 2 ways

    • By the user selecting the value from the dropdown. This is handled by https://github.com/phoenixframework/phoenix_live_dashboard/blob/fe86d97d99c4025133240dea4d3b576872984183/lib/phoenix/live_dashboard/components/table_component.ex#L284-L288
    • By setting/updating @active_filter. This will have an effect in https://github.com/phoenixframework/phoenix_live_dashboard/blob/fe86d97d99c4025133240dea4d3b576872984183/lib/phoenix/live_dashboard/components/table_component.ex#L214

    This will usually happen just one time upon initialization of the table component: https://github.com/phoenixframework/phoenix_live_dashboard/blob/fe86d97d99c4025133240dea4d3b576872984183/lib/phoenix/live_dashboard/components/table_component.ex#L112
    , and from there on the value of the filter field would be controlled by the UI, unless for some reason (hot-swapping? or something else depending on the implementation of the process filter?) the value of @active_filter changes. If that's the case, the value of filter will be reset to the value of @active_filter again. Edited: forgot to mention the case of switching between cluster nodes - this could also change @active_filter.

So the filter behaves differently from other table parameters in that it could be changed both by the backend and UI. Hence we have @active_filter as a bridge between the backend and UI.

Otherwise, the filter field is just another table parameter, so it needs to be handled in the same manner as other fields (i.e., limit, search, 'hint`...).

Please let me know if you see any problems with this description or the way to improve the code.

bokner avatar Apr 24 '23 23:04 bokner

https://github.com/phoenixframework/phoenix_live_dashboard/pull/411#discussion_r1174410382 This should be fixed too.

Okay, maybe you could help me to fix it.

https://github.com/phoenixframework/phoenix_live_dashboard/pull/411/files/d146580eb01bee2bd1ce6c8b5ad7659d298d7aeb#diff-1d9200c19d147e3e691759ff70e93ff5287aa2cc8b475ee88ebfa04383acf159R96

This line is here because I wasn't able to make the condition in the form https://github.com/phoenixframework/phoenix_live_dashboard/blob/fe86d97d99c4025133240dea4d3b576872984183/lib/phoenix/live_dashboard/components/table_component.ex#L208

to work without assigning active_filter. I'd expect :if not to fail if there is no assignment, but apparently, it's not the case. Please let me know if you see a way around it :-)

bokner avatar Apr 25 '23 01:04 bokner