Custom hooks & Cachex extension
I'm building a custom hook to ensure our caches are persisted easily, without warming the entire cache from a large dataset up front. I also want to avoid littering the codebase with complex Cachex.fetch calls just to handle misses.
Here's my setup:
defmodule Myapp.Cache.PersistentCacheSpec do
  import Cachex.Spec

  alias MyApp.Cache.PostgresPersistenceHook

  @default_ttl to_timeout(day: 1)

  def child_spec(opts) when is_list(opts) do
    name = Keyword.fetch!(opts, :name)
    ttl = Keyword.get(opts, :ttl, @default_ttl)

    cache_options = [
      hooks: [
        hook(module: PostgresPersistenceHook, args: [cache: name])
      ],
      # TTL handling is configured via the expiration record
      expiration:
        expiration(
          default: ttl,
          interval: to_timeout(second: 1),
          lazy: true
        )
    ]

    %{
      id: name,
      start: {Cachex, :start_link, [name, cache_options]},
      type: :supervisor
    }
  end
end
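For context, a spec like this is started through the usual `{module, opts}` child tuple, which resolves to the `child_spec/1` above (cache names and TTLs here are just examples):

```elixir
# In the application supervisor; names/TTLs are illustrative only.
children = [
  {Myapp.Cache.PersistentCacheSpec, name: :session_cache},
  {Myapp.Cache.PersistentCacheSpec, name: :settings_cache, ttl: to_timeout(hour: 2)}
]

Supervisor.start_link(children, strategy: :one_for_one)
```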
The hook:
defmodule MyApp.Cache.PostgresPersistenceHook do
  use Cachex.Hook

  import Ecto.Query, warn: false

  require Logger

  @doc """
  Specify which actions this hook should listen to.
  """
  def actions do
    [:put, :update, :del, :clear, :expire]
  end

  @doc """
  Initialize the hook - store the cache name for use in operations.
  """
  def init(opts) do
    cache_name = Keyword.get(opts, :cache, :unknown_cache)
    {:ok, %{cache_name: cache_name}}
  end

  @doc """
  Handle cache notifications and persist relevant operations to PostgreSQL.
  """
  def handle_notify({action, args}, _result, state) do
    case action do
      # Handle writes - persist to database
      :put -> persist_to_db(action, args, state)
      :update -> persist_to_db(action, args, state)
      # Handle removals
      :del -> remove_from_db(action, args, state)
      :clear -> clear_namespace_from_db(action, args, state)
      :expire -> expire_from_db(action, args, state)
      # Ignore other actions
      _ -> :ok
    end

    {:ok, state}
  end

  # ... private impl omitted for brevity
end
So far, this works wonderfully. The main issue is that I really want to be able to extend Cachex.get/2 to handle cache-misses with a fallback.
I realize this isn't really the intent of hooks.
I've solved my internal issue by creating a wrapper module MyApp.PersistentCache, which has get/2:
def get(cache_name, key) when is_atom(cache_name) do
  case Cachex.fetch(cache_name, key, &fetch_from_db(cache_name, &1)) do
    {:ok, value} -> {:ok, value}
    {:commit, value} -> {:ok, value}
    {:ignore, nil} -> {:ok, nil}
    error -> error
  end
end
This ensures things are reactively warmed, but I would love to just be able to call Cachex.get without the indirection.
Am I missing something simple that could solve my use-case? Is this completely out of scope for Cachex?
Hi @chevinbrown!
I think I might be missing something. Why can't you just use Cachex.fetch/4 and avoid the wrapper? Why does it specifically have to be Cachex.get/3?
@whitfin thanks for the quick reply.
We have a variety of use-cases, but let's assume:
We have a bunch of persisted caches (and no warmer).
Every Cachex.fetch call will have to implement a database lookup as a fallback to ensure a cache miss (after restart) is handled. I want that to happen automatically, without warming the entire cache from Postgres. Almost like a cache warmer, or an implicit fallback for that particular child_spec.
Again, this may just be anti-caching... but the implementation above is really nice: if there's a cache miss, we pull from the persisted store without any developer having to know where the persisted store comes from. Cache misses check the db, and then set the value in Cachex if found.
So... in my case, a wrapper is probably the right call; I'm just very close to having a really nice API that uses Cachex hooks to handle the persisted state. I suspect a cache warmer would be the correct solution here, I just want to avoid that for other reasons right now.
Based on what you've said, Cachex.fetch/4 is the thing to use - it's exactly for what you're describing!
Every Cachex.fetch will have to implement a database-lookup as fallback to ensure the cache-miss (after restart) is handled.
Right, but this is exactly what you want, right? Just to confirm, fetch/4 will not do a lookup if the value is in the cache already (i.e. will act as get/3).
I'm not super clear on exactly what you're trying to improve in what you already have, it looks fine to me -- maybe I'm missing something, though!
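To illustrate the fetch/4 semantics in question (cache and fallback names here are hypothetical):

```elixir
# The fallback only runs on a miss; on a hit, fetch behaves like get.
fallback = fn key -> {:commit, "db-value-for-" <> key} end

Cachex.fetch(:my_cache, "k1", fallback)
# miss: fallback ran, value committed to the cache -> {:commit, "db-value-for-k1"}

Cachex.fetch(:my_cache, "k1", fallback)
# hit: fallback skipped, acts as get/3 -> {:ok, "db-value-for-k1"}
```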
@whitfin yes, but with one augmentation: fetch/4 would require implementing the fallback at every callsite. I'd like to abstract that away so devs don't have to implement the fallback logic, because the fallback (for this "spec") is to always check the db in a consistent way, per the wrapper example above.
Maybe another way to think about it would be pre-call hooks, or a passive warmer. Devs are still free to implement fetch/4 in any way they want, but under the PersistentCacheSpec I'd like to be able to specify some kind of generic/default fallback, or a hook that runs some logic before the :get call/cast, so we can augment the Cachex.get function. This would allow us to build adapters/processes around different specs as we need.
Again, if caches are warm, everything works as expected, but if there are cache misses, we would be able to use various adapters/specs to warm the cache via Postgres, api calls, or anything else (without having to litter verbose fallbacks)! This seems like a great feature IMO and would make Cachex more extensible.
I hope this is clear!
Maybe I need to reach for something like this instead? https://hexdocs.pm/nebulex_adapters_ecto/readme.html
yes, but with one augmentation. fetch/4 would require implementing the fallback at every callsite.
This would make sense, but you're already using a function reference and a wrapper. How would Cachex provide something helpful in this situation?
If you're using a function reference for fetch, as above, then your concern about re-implementation does not exist -- you just link to one specific handler as the fallback. If you want a wrapper around that to hide it completely, then sure, but this is just standard Elixir.
Maybe another way to think about it would be pre-call hooks
These exist already; you can maybe try using them to see if they'll help you do what you need! If you add the following to a hook, it'll run prior to the call:
def type, do: :pre
This simply defaults to :post in most cases, so that your hook runs after the fact.
This seems like a great feature IMO and would make Cachex more extensible.
Maybe the easiest way to help me understand is to simply show me (in pseudocode, if necessary) what you want the Cachex interaction to look like? I really do not follow the description given to this point, but would like to help if possible 😅.
Maybe I need to reach for something like this instead? https://hexdocs.pm/nebulex_adapters_ecto/readme.html
No clue! I have never used it, but since it seems to be focused on Ecto and you are using Ecto... maybe?
That being said I think what you want is likely already possible here, it's just a case of straightening it out!
I totally missed the type callback!
Here's the pseudocode that works. The type: :pre and async?: false are key.
Now, I can configure the spec:
hooks: [
  hook(module: PostgresPersistencePreHook, args: [cache: name]),
  hook(module: PostgresPersistenceHook, args: [cache: name])
],
defmodule MyApp.Cache.PostgresPersistencePreHook do
  @moduledoc """
  A Cachex pre-hook that loads values from PostgreSQL before cache operations.

  This hook runs before cache actions to populate the cache from the database.
  It is designed to work alongside PostgresPersistenceHook (post-hook) to
  provide full persistence functionality.
  """

  use Cachex.Hook

  import Ecto.Query, warn: false

  alias MyApp.Cache.Line
  alias MyApp.Repo

  @doc """
  Specify that this is a pre-hook (runs before cache actions).
  """
  def type, do: :pre

  @doc """
  Run synchronously, so the DB fallback completes before the :get proceeds.
  """
  def async?, do: false

  @doc """
  Specify which actions this hook should listen to.
  """
  def actions do
    [:get]
  end

  @doc """
  Initialize the hook - store the cache name for use in operations.
  """
  def init(opts) do
    cache_name = Keyword.get(opts, :cache, :unknown_cache)
    {:ok, %{cache_name: cache_name}}
  end

  @doc """
  Handle cache notifications before the action occurs.

  For :get operations, intercept and use fetch with a DB fallback.
  """
  def handle_notify({action, args}, _result, state) do
    case action do
      :get -> handle_get_with_fallback(args, state)
      # Ignore other actions
      _ -> :ok
    end

    {:ok, state}
  end

  # Private functions

  defp handle_get_with_fallback([key, _opts], state) do
    cache_name = Map.get(state, :cache_name, :MyApp_cache)

    try do
      Cachex.fetch(cache_name, key, &db_fallback(cache_name, &1))
      :ok
    rescue
      _ -> :ok
    end
  end

  # db_fallback/2 (the actual DB lookup) omitted for brevity
end
This allows me to use Cachex.get & Cachex.put completely transparently with db-persistence. Now, the pre-hook will automatically fallback to the db as a sort of reactive warmer.
Thank you for the prompt response and an awesome library! Feel free to close!
Ah, okay, that totally makes sense -- thank you for your patience @chevinbrown!
Yeah, the documentation seems a little lacking on the type/0 pattern, I'm not sure when that happened but I can improve that going forward.
As long as it all works for you and does what you need, I'll close this - but please do comment or re-open it if you need anything further!
@whitfin I won't cast aspersions on the docs, it totally makes sense now, I just overlooked that initially!
oh! one last thing. (We can keep closed)
There's no hook currently for when keys are expired.
So, a similar issue as above: I have persisted records in the db. If there's an expiry/eviction policy set for the persistent cache, I have no notification when a key is expired, and therefore am unable to automatically delete db entries when the key expires from Cachex.
Internally, this is probably easy to solve with postgres-triggers with knowledge of the cache-item's ttl.
Last question!
Yeah, this is a known thing I am still trying to figure out. The issue is that we purge directly via an ETS query, so it's not actually possible to get a list of evicted keys; and even if we could, the list could be gigantic. There are a couple of ways you can do this, depending on how much consistency you require:
Using inspect/3:
The simplest way to do this would be a synchronous :pre hook listening on :purge events (which are triggered by TTL enforcement). In this hook you can use Cachex.inspect(cache, {:expired, :keys}) to get a list of keys which will be removed by the :purge operation and do whatever you want with them.
The downside here is that there is a small gap of time between the receipt of this event and the actual purge, in which some keys could have expired in the meantime. Whether this matters for you or not, I'm unsure.
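A minimal sketch of that first approach, following the same hook conventions as earlier in this thread (`remove_from_db/2` is a hypothetical persistence helper):

```elixir
defmodule MyApp.Cache.PurgePreHook do
  # Sketch only: a synchronous pre-hook on :purge that snapshots the
  # expired keyset just before Cachex removes it.
  use Cachex.Hook

  def type, do: :pre
  def async?, do: false
  def actions, do: [:purge]

  def init(opts), do: {:ok, %{cache_name: Keyword.fetch!(opts, :cache)}}

  def handle_notify({:purge, _args}, _result, state) do
    # Keys the imminent purge will remove; note the small window in
    # which additional keys may expire after this snapshot is taken.
    {:ok, expired} = Cachex.inspect(state.cache_name, {:expired, :keys})
    Enum.each(expired, &remove_from_db(state.cache_name, &1))
    {:ok, state}
  end

  # Hypothetical: delete the corresponding rows from the persisted store.
  defp remove_from_db(_cache, _key), do: :ok
end
```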
Using stream/4:
More time consuming but more reliable would be to use Cachex.stream/4 to collect the full list of keys before and after the purge, and then diff them to determine which were removed by the purge. This is way more "work", but guaranteed to be consistent.
This is a rough example, but it should look something like this:
query = Cachex.Query.build(output: :key)

keyset1 = my_cache |> Cachex.stream!(query) |> MapSet.new()
# purge happens
keyset2 = my_cache |> Cachex.stream!(query) |> MapSet.new()

removed = MapSet.difference(keyset1, keyset2)
This is definitely not ideal, but as of yet I haven't figured out a good abstraction to do this in a better way. If someone has e.g. thousands of keys expiring per second, the hook mailbox would explode, so for now this is left to the user to work out.
As an aside, if you are calling cache actions from inside a hook, you can use notify: false on every cache function to disable that call from also triggering a hook. This can help with some performance and avoid accidentally recursive cases within hooks.
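In code, that's just an extra option on each call (cache name illustrative):

```elixir
# Neither call below will notify hooks, so a hook performing cache
# writes can't accidentally re-trigger itself.
Cachex.put(:my_cache, "key", "value", notify: false)
Cachex.del(:my_cache, "key", notify: false)
```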
@whitfin great, that's what I was seeing, so good confirmation. For now, I'll ignore and use some better internal db-mechanisms to handle that!
@whitfin this has been working well for us, save one issue.
Working as expected:
iex(1)> Cachex.put(:test_cache, "test_key", %{some: %{nested: :persisted}})
{:ok, true}
[debug] QUERY OK source="cache_lines" db=3.4ms queue=0.5ms idle=352.7ms
INSERT INTO "rune"."cache_lines" AS c0 ("value","ttl","strategy","key","namespace","inserted_at","updated_at") VALUES ($1,$2,$3,$4,$5,$6,$7) ON CONFLICT ("namespace","key") DO UPDATE SET "value" = $8, "ttl" = $9 RETURNING "id" [%{"some" => %{"nested" => "persisted"}}, 7200, :lru, "test_key", "cachex::test_cache", ~N[2025-07-22 13:25:31], ~N[2025-07-22 13:25:31], %{"some" => %{"nested" => "persisted"}}, 7200]
↳ Rune.Cache.PostgresPersistenceHook.persist_normalized_value_to_database/2, at: lib/rune/cache/postgres_persistence_hook.ex:90
[debug] Successfully persisted cache key "test_key" to database
iex(2)> Cachex.get(:test_cache, "test_key")
{:ok, %{some: %{nested: :persisted}}}
But after a restart, the cachex-hook will hit the db to hydrate:
iex(1)> Cachex.get(:test_cache, "test_key")
[debug] QUERY OK db=3.9ms queue=0.5ms idle=45.2ms
SELECT * FROM rune.read_cache_line($1, $2) ["cachex::test_cache", "test_key"]
{:ok, %{"some" => %{"nested" => "persisted"}}}
Can I use a pre-hook to intercept :put to ensure the value is json-serialized?
I have things working with a pre-hook on :put by modifying :ets directly, but that feels yucky, and of course calling Cachex.put within the handle_notify is recursive...
Those are both equivalent, right? It's just that within your app you are using atom keys for maps, and when you read back from the database you are using binary keys. Either you should swap to using binary keys in all places (probably the better option), or you should have your fetching logic parse as atom keyed maps.
FWIW you can use notify: false on any cache call to skip the notification system, if it turns out you need to - but likely that's going to venture down an awkward path, like you say.
@whitfin TY for the prompt reply, that's helpful, but I'm still stuck. Let's say I do want to ensure we have consistent binary keys. I'm going in circles trying to get my hooks to properly serialize/json-encode->decode as a pre-put step. I tried using a pre/non-async hook to catch handle_notify(:put, ...), then in that function call use Cachex.put(cache, key, value, notify: false)... but that doesn't seem to be working.
Am I doing something obviously wrong? Can you give me some pseudo code?
Hi @chevinbrown! No problem, I'm sure we can figure it out.
This is really hard to answer without actually looking at the code, but it sounds like we're going down a bad path. I'm also not entirely sure we're on the same page...
Your serialization is fine, because it's getting into the database properly. The deserialization is also coming back fine; the only issue is that the keys are binaries rather than atoms when they're loaded from the database, right?
If that's the case, then the issue is just in whatever fetch_from_db/2 is doing. That should be pulling from the database and parsing using atom keys, instead of the binary keys (or change your main application logic to use binary keys).
While this is obviously nothing to do with Cachex, happy to look over anything. My gut instinct is that you are too focused on hooks and pre/post steps; just go straight to the source (fetch_from_db/2) and change that to do what you want.
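As a sketch of what I mean (helper names hypothetical, and `String.to_existing_atom/1` assumes the atoms are already defined by your schemas):

```elixir
# Hypothetical fallback: read_from_db/2 stands in for the real DB lookup.
defp fetch_from_db(cache_name, key) do
  case read_from_db(cache_name, key) do
    nil -> {:ignore, nil}
    value -> {:commit, atomize_keys(value)}
  end
end

# Recursively convert binary map keys back to existing atoms;
# non-map terms pass through untouched.
defp atomize_keys(%{} = map) do
  Map.new(map, fn {k, v} -> {String.to_existing_atom(k), atomize_keys(v)} end)
end

defp atomize_keys(other), do: other
```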
@whitfin yeah, this may be yucky. I realize my last comms were a bit confusing. I should just be using persist_to_db to validate/alter/stringify things beforehand...
Here's kinda what I was thinking: use a pre-hook to intercept the :put and maybe validate/stringify the keys, so we don't have to do funky things like parsing get results back to term keys:
def handle_notify({:put, [key, _value, _opts]}, _result, state) do
  cache_name = Map.get(state, :cache_name, :rune_cache)

  # Attempt to replace the value being written (doesn't work as hoped)
  Cachex.put(cache_name, key, %{override: true}, notify: false)

  {:ok, state}
end
I assumed that would override the cache value, but apparently not?
Here's what I ended up doing: instead of storing things as jsonb, I converted the db column to binary and used a custom Ecto type:
defmodule MyApp.Cache.TermType do
  # Module name illustrative; the important part is the load/dump pair.
  use Ecto.Type

  def type, do: :binary

  def cast(term), do: {:ok, term}

  def load(binary) when is_binary(binary) do
    term = :erlang.binary_to_term(binary)
    {:ok, term}
  rescue
    ArgumentError -> :error
  end

  def load(nil), do: {:ok, nil}
  def load(_), do: :error

  def dump(nil), do: {:ok, nil}

  def dump(term) do
    binary = :erlang.term_to_binary(term)
    {:ok, binary}
  rescue
    ArgumentError -> :error
  end
end
This way, no matter what value we use, it'll be stored and retrieved in a consistent format without having to care that it's encodable!
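For completeness, the custom type then slots into the schema like any other Ecto type (schema and field names here are illustrative, assuming the type module is called MyApp.Cache.TermType):

```elixir
# Assumes the value column was migrated to :binary (bytea in Postgres).
schema "cache_lines" do
  field :key, :string
  field :namespace, :string
  field :value, MyApp.Cache.TermType
  timestamps()
end
```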