
[Feature request] Network-located feeds.txt

Open StayPirate opened this issue 2 years ago • 9 comments

I'd like to be able to load the feeds.txt file from the network. I think a command-line option -f/--feeds could be used for that.

Once implemented, it could look something like: rss2email cron -f https://raw.githubusercontent.com/StayPirate/rss2email/master/custom_config/feeds.txt.

Feedback?

StayPirate avatar May 02 '22 07:05 StayPirate

That strikes me as an odd thing to support within the tool itself.

I could imagine a --config or --file argument to specify the name of a file to load instead of the default, but I don't think that supporting a URL would make sense for such a thing.

The obvious approach would be:

 #!/bin/sh
 # ~/bin/my-rss2email
 wget  -O ~/.rss.conf http://server.example.com/path/to/config.txt
 exec rss2email --config ~/.rss.conf

skx avatar May 02 '22 07:05 skx

I see your point; maybe my PoV is quite different from the most common deployment of the rss2email tool. My idea is to deploy it within a container, and that container will automatically update itself every time a new version is released. Hence, my only work would be to maintain the feeds.txt file (and, occasionally, custom templates). That's why I'd find such a feature very handy: it would allow me to maintain feeds.txt almost anywhere, even in a dedicated GitHub repository.

StayPirate avatar May 02 '22 08:05 StayPirate

@skx is that a no-go for you? Or are you still considering it?

StayPirate avatar May 03 '22 15:05 StayPirate

That's a no-go from me, I'm afraid.

I could imagine a couple of different ways you could do what you want. As I said before, one is using a wrapper that runs curl/wget before the command is launched.

Now that you've said you're building a container, you could do it in the container build itself: if you locally copied the Dockerfile and used that as the source for your image, you could add the wget there at build-time.

Something like:

$ git diff
diff --git a/Dockerfile b/Dockerfile
index 102f9e7..8678954 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -46,8 +46,15 @@ RUN ls -ltr /go/bin
 ###########################################################################
 FROM alpine
 
+RUN apk update && apk add --no-cache wget
+
+RUN mkdir ~/.rss2email
+RUN wget -O ~/.rss2email/feeds.txt http://remote.source.here/feeds.txt
+
 # Create a working directory
 WORKDIR /app
 
 # Copy the binary.
 COPY --from=builder /go/bin/rss2email /app/
+
+

That would get you what you want:

  • An image that can be rebuilt whenever there is a new release.
  • Each time it is built it gets the latest configuration file, from a remote source.

I realize it's not quite as good as native support, but I think it's not a terrible suggestion :)

skx avatar May 03 '22 16:05 skx

Maybe I failed to explain my idea. The goal is to run rss2email in daemon mode and have it automatically fetch the feeds.txt file from the internet at each iteration (like every 15m), not just at the time it starts. That will allow me to push updates for the feeds.txt file to GitHub, and my running instance of rss2email will automatically use them on its next round (within the next 15m) without me needing to do anything.

If you still don't want to add this feature, I'll try to implement it myself in a fork. For now I'll refrain from adopting the suggested workarounds of fetching the feeds.txt file outside rss2email (like a sh-wrapper or image rebuild), since those won't allow me to use the daemon mode.

What are the differences between the cron and daemon modes, other than keeping the process running and waiting a $SLEEP amount of time between iterations?

StayPirate avatar May 04 '22 12:05 StayPirate

Now I understand, thanks for your patience.

As for the difference between cron/daemon - none, other than the sleep & retry behaviour. (Cron came from the identical command in the original r2e script, daemon was added when I moved everything I deploy to running in containers, as that made more sense.)

skx avatar May 04 '22 13:05 skx

As for the difference between cron/daemon - none, other than the sleep & retry behaviour. (Cron came from the identical command in the original r2e script, daemon was added when I moved everything I deploy to running in containers, as that made more sense.)

Gotcha, thanks.

As you know I'm not a golang developer, so it will take some time for me to implement this, but it might end up being a good exercise. Could you just let me know how you would implement it? Like which files I should put the code in - some guidelines to follow, so I can focus on learning and implementing it in Go.

Just to recap: I'd like to run rss2email daemon --feed https://raw.githubusercontent.com/StayPirate/rss2email/master/custom_config/feeds.txt ${TO_EMAIL} and expect the running instance to fetch the file at each feeds-execution (when ProcessFeeds() is called). In case the file cannot be fetched (404, 500, etc.) it should write a warning to stderr and then try again after $SLEEP seconds, but it should not terminate the process.

StayPirate avatar May 04 '22 15:05 StayPirate

I'm happy to try to help, and if this isn't sufficient please do feel free to ask for clarification.

So taking things from the top:

  • The application's entry-point is func main(), which lives in .. main.go!
  • There we instantiate a bunch of structures for the various sub-commands.
    • The details of those don't really matter too much.
    • But ultimately one of the instantiated objects is given control, via its Execute method.
    • So if you ran rss2email daemon .. the Execute method of cmd_daemon.go would get executed.

The cron and daemon commands look very very similar, both of them start by creating a Processor object - which lives in processor/processor.go.

In the cron case the processor gets created once, used, then thrown away, and the application terminates. In the daemon case the processor gets created once, used, and then a sleep happens - before another new processor gets created.

In processor.go you'll see a config-file is created in the ProcessFeeds function. So I'd suggest that you'd probably want to make your changes there:

  • The configfile object lives in configfile/configfile.go
    • It's a bit indirect due to legacy reasons, and will get simplified in the 3.x release as per #86.
    • Ultimately though the Parse method is called which reads a file and does the appropriate thing.

What I'd probably do is create a method in configfile.go which looked like this:

func (c *ConfigFile) LoadFromURL(url string) error {
	// Download the remote URL to a temporary file, then
	// update the local 'path' field to point to that file.
	return nil // placeholder - see the steps below
}

What that would do is:

  • Generate a temporary file.
  • Download the remote URL.
  • Save the contents to that temporary file.
  • Then set c.path to that temporary file's name.

The end result is that the path saved there would be used by the Parse method, and there'd be no need to mess with any other logic.
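A hedged sketch of how those steps might fit together - the ConfigFile struct here is a stand-in with only a path field, and the feed URL served in main is made up purely for the demo:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
)

// ConfigFile is a stand-in for the real configfile.ConfigFile struct;
// only the path field matters for this sketch.
type ConfigFile struct {
	path string
}

// LoadFromURL downloads the remote feed list to a temporary file and
// points the config-reader at that file, so Parse works unchanged.
func (c *ConfigFile) LoadFromURL(url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	// Treat anything other than 200 as a failure the caller can log and retry.
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("fetching %s: unexpected status %s", url, resp.Status)
	}

	tmp, err := os.CreateTemp("", "feeds-*.txt")
	if err != nil {
		return err
	}
	defer tmp.Close()

	if _, err := io.Copy(tmp, resp.Body); err != nil {
		return err
	}

	// Future calls to Parse will read this local copy.
	c.path = tmp.Name()
	return nil
}

func main() {
	// A throwaway local server standing in for the remote feeds.txt URL.
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "https://blog.example.com/index.rss")
	}))
	defer ts.Close()

	c := &ConfigFile{}
	if err := c.LoadFromURL(ts.URL); err != nil {
		panic(err)
	}
	data, _ := os.ReadFile(c.path)
	fmt.Print(string(data)) // prints: https://blog.example.com/index.rss
	os.Remove(c.path)       // without this the temp file would leak
}
```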

In processor.go you could then change:

	// Get the configuration-file
	conf := configfile.New()

	// Upgrade it if necessary
	conf.Upgrade()

	// Now do the parsing
	entries, err := conf.Parse()

To read something like:

	// Get the configuration-file
	conf := configfile.New()

	// Check for a remote URL
	remote := os.Getenv("REMOTE_URL")
	if remote != "" {
		err := conf.LoadFromURL(remote)
		if err != nil {
			// error handling
		}
	}

	// Now do the parsing
	entries, err := conf.Parse()

That should do the right thing:

  • Every time the processor is used to process feeds
  • It fetches the remote contents.
  • It writes to a local path
  • It updates the config-reader to read that local path - not the global config
  • And everything else remains unchanged.

I suspect this would mean you'd leak a temporary file every fifteen minutes, but .. otherwise things should be reasonably self-contained.

You could add a new member to the configfile object to store the temporary path, and ensure it was cleaned up.

skx avatar May 04 '22 16:05 skx

Wow, that's a lot of information! I'll take this as an opportunity to learn a little bit of golang and better understand the rss2email code-base. Thank you very much for the time you spent here.

StayPirate avatar May 04 '22 19:05 StayPirate