content gets overridden on elfeed-search-fetch

Open d3rped opened this issue 5 years ago • 3 comments

I've added a function to elfeed-new-entry-hook that adds some metadata and overrides the default content. If I run elfeed-search-fetch (press G) some time later, it resets the content of all existing entries to its original state (though, as expected, the things I've set in meta stay the same).

I'm not sure if this is intended behavior.

If so what would be the best way to get around this?

d3rped avatar May 08 '20 16:05 d3rped

As a general rule, information provided by the feed is kept in sync with the feed. This includes title, date, and content. That's why if you want a custom title displayed in elfeed-search, you set the metadata :title slot. This is local — i.e. not supplied by the feed itself — so it doesn't get synced / overridden during updates. It's useful because elfeed-search knows to use it instead of the original title when it's non-nil.
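For illustration, here is a minimal sketch of the :title approach, assuming elfeed is installed and loaded. The function name my/elfeed-set-local-title and the "[local] " prefix are invented for the example; elfeed-meta, elfeed-entry-title and elfeed-new-entry-hook are elfeed's own.

```elisp
(require 'elfeed)

(defun my/elfeed-set-local-title (entry)
  "Store a local display title for ENTRY in its metadata.
The feed-supplied title slot is left untouched, so a later fetch
has nothing of ours to overwrite."
  (setf (elfeed-meta entry :title)
        (concat "[local] " (elfeed-entry-title entry))))

;; Run the function on every entry as it is first added to the database.
(add-hook 'elfeed-new-entry-hook #'my/elfeed-set-local-title)
```

Because the value lives in the metadata plist rather than in the title slot, it should survive elfeed-search-fetch in the way described above.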

The same could be extended to content via a :content metadata slot. For this to work completely, it would require support from elfeed-search (to choose it instead of the original content) and from the database garbage collector (to treat :content, if not all metadata slots, as GC roots).

There's nothing stopping you from using a :content metadata slot now except for those two reasons.
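A sketch of how display code might choose between the two, assuming the local content is kept in a :content metadata slot as suggested. my/elfeed-effective-content is a hypothetical helper, not part of elfeed; it only uses the existing elfeed-meta and elfeed-entry-content accessors.

```elisp
(require 'elfeed)

(defun my/elfeed-effective-content (entry)
  "Return ENTRY's local :content metadata if set, else its feed content.
The result may be a string or an elfeed-ref, so callers that need
the text itself should pass it through `elfeed-deref'."
  (or (elfeed-meta entry :content)
      (elfeed-entry-content entry)))
```

This mirrors the way the :title slot takes precedence over the original title when it is non-nil. It does nothing about garbage collection, which remains the second caveat.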

skeeto avatar May 08 '20 18:05 skeeto

Thanks for the quick answer, especially pointing me towards the garbage collector has likely saved me a lot of time/headache.

Would it be sensible to make the GC process check the whole entry for refs by default? (I suppose "indirect" references like images would still be collected/removed that way)
Not sure how much of a performance impact that would be on bigger feeds. (though I suppose it shouldn't make much of a difference)
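To make the "check the whole entry" idea concrete, here is a rough sketch that collects every elfeed-ref stored directly as a metadata value. my/elfeed-meta-refs is a hypothetical name, and the sketch assumes refs are never nested inside lists or other structures within the plist; those would be missed.

```elisp
(require 'cl-lib)
(require 'elfeed)

(defun my/elfeed-meta-refs (entry)
  "Return a list of the elfeed-refs stored as values in ENTRY's metadata.
Only top-level plist values are examined."
  (cl-loop for (_key value) on (elfeed-entry-meta entry) by #'cddr
           when (elfeed-ref-p value)
           collect value))
```

The cost is one short plist walk per entry, on top of the database visit that the garbage collector already performs.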

Other than that I consider this issue solved.

d3rped avatar May 09 '20 04:05 d3rped

I ran into this issue as well, and after considering a few solutions, I think I found one that could work. The idea is that elfeed would expose a list of functions (technically a hook) that are each passed an entry and must return an elfeed-ref or a list of elfeed-refs, which are counted as being in use by elfeed-db-gc.

Here's an example of what I mean:

(defvar elfeed-db-gc-trace-functions nil
  "A list of functions called to determine which elfeed-refs are reachable.
Each function is called with an elfeed-entry and should return a
single elfeed-ref or a list of them, which will be considered in
use by `elfeed-db-gc'.")

(defun elfeed-db-gc (&optional stats-p)
  "Clean up unused content from the content database.
If STATS-P is non-nil, return the space cleared in bytes."
  (elfeed-db-gc-empty-feeds)
  (let* ((data (expand-file-name "data" elfeed-db-directory))
         (dirs (directory-files data t "^[0-9a-z]\\{2\\}$"))
         (ids (cl-mapcan (lambda (d) (directory-files d nil nil t)) dirs))
         (table (make-hash-table :test 'equal)))
    (dolist (id ids)
      (setf (gethash id table) nil))
    (with-elfeed-db-visit (entry _)
      (let ((content (elfeed-entry-content entry))) ;this is left hardcoded in
        (when (elfeed-ref-p content)
          (setf (gethash (elfeed-ref-id content) table) t)))
      ;; NEW STUFF
      (dolist (trace-function elfeed-db-gc-trace-functions)
        (let ((refs (funcall trace-function entry)))
          (unless (listp refs)
            (setf refs (list refs)))
          (dolist (ref refs)
            (when (elfeed-ref-p ref)
              (setf (gethash (elfeed-ref-id ref) table) t))))))
    ;; END OF NEW STUFF
    (cl-loop for id hash-keys of table using (hash-value used)
             for used-p = (or used (member id '("." "..")))
             when (and (not used-p) stats-p)
             sum (let* ((ref (elfeed-ref--create :id id))
                        (file (elfeed-ref--file ref)))
                   (* 1.0 (nth 7 (file-attributes file))))
             unless used-p
             do (elfeed-ref-delete (elfeed-ref--create :id id))
             finally (cl-loop for dir in dirs
                              when (elfeed-directory-empty-p dir)
                              do (delete-directory dir)))))

I think it's worth leaving elfeed-entry-content hardcoded as a GC root (as opposed to making it the default value of elfeed-db-gc-trace-functions), so that if elfeed-db-gc-trace-functions gets set to nil by accident the user won't lose all their old entry contents (like I did while testing this out). If they really want to delete all the old contents, they can just run this:

(with-elfeed-db-visit (entry _)
  (setf (elfeed-entry-content entry) nil))

(elfeed-db-gc)  ;; garbage collect everything

which I found in this neat blog post.

As for the other obstacle (choosing the custom content for display), elfeed-show-refresh-function already exists, and it has been more than customizable enough for my needs.
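A trace function for the hook proposed above might look like the following sketch. It assumes the custom content is stored as an elfeed-ref in a :content metadata slot. elfeed-db-gc-trace-functions is the variable proposed in this thread, not something elfeed defines, so the sketch declares it; my/elfeed-trace-meta-content is likewise a made-up name.

```elisp
(require 'elfeed)

;; Proposed in this thread; not defined by elfeed itself.
(defvar elfeed-db-gc-trace-functions nil
  "Functions that report extra elfeed-refs to keep during GC.")

(defun my/elfeed-trace-meta-content (entry)
  "Return the elfeed-ref in ENTRY's :content metadata slot, or nil.
Non-ref values (for example a plain string) are ignored."
  (let ((value (elfeed-meta entry :content)))
    (when (elfeed-ref-p value)
      value)))

(add-hook 'elfeed-db-gc-trace-functions #'my/elfeed-trace-meta-content)
```

Returning nil for entries without a local ref is harmless with the modified elfeed-db-gc, since nil is treated as an empty list there.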

zabe40 avatar Feb 04 '22 23:02 zabe40