nikola

Support posts without source file

Open remram44 opened this issue 7 years ago • 19 comments

#1708 added a PostScanner plugin that allows pluggable post sources, noting that "the only currently existing plugin of this kind reads them from disk".

However, all info is read from Post.source_path across Nikola, so it seems that having a file backing each post is the only way to go.

It would be nice to be able to create a Post object with metadata & source and feed them to Nikola rather than autogenerating files (and in fact, if that is what you have to do, no PostScanner is required).

remram44 avatar Feb 13 '17 20:02 remram44

Well, that would require some work on the Post class. There's already an issue for that: #1709.

You can of course also create your own subclass of Post and do in there whatever you want, or provide a "mockup" which behaves as Nikola expects it. It is just more complicated without the refactoring in #1708.

felixfontein avatar Feb 13 '17 22:02 felixfontein

The problem is that a lot of code outside of the Post class opens the source_path file directly.

remram44 avatar Feb 13 '17 22:02 remram44

Actually, most of the time source_path is only used to have something to sort posts by, or to show in error/status messages for the user of Nikola.

There is only one place (outside Post) which assumes that source_path is actually a real file on disk: namely sources.py, the plugin which tries to copy the source files to the output when COPY_SOURCES is True. So as long as you don't have COPY_SOURCES enabled and put something sensible into source_path that identifies the generated post, there's no need for source_path to actually point to an existing file.

felixfontein avatar Feb 13 '17 22:02 felixfontein

It is definitely doable 👍

I'll take a crack at this myself if I can find the time.

remram44 avatar Feb 13 '17 23:02 remram44

Perhaps one should consider adding a member to the Post object strictly for sorting, defaulting to the path with the native post scanner, and adjusting copy source to accept None as a valid post path (and therefore neither copy the source nor try to link to it). It's a bigger diff, but a more flexible one.

tritium21 avatar Feb 16 '17 11:02 tritium21

It would also be good if copy_sources didn't have to use some logic to guess the correct source paths for translated posts. Currently it uses post.source_path together with utils.get_translation_candidate. It would probably be best if there were a function in Post which returns the source path for a given language, or None if there is no source (copy_sources could then ignore it).

Actually, I just noticed that there is already such a function: translated_source_path. I'll make a PR to use it (and to allow return value None).
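To make the idea concrete, here is a minimal sketch of a copy step that skips posts whose translated_source_path returns None. It is not Nikola's actual code: SketchPost and copy_sources_sketch are hypothetical names, and the real Post class and sources plugin are more involved.

```python
import shutil
from pathlib import Path


class SketchPost:
    """Hypothetical stand-in for a Post; NOT Nikola's real class."""

    def __init__(self, source_path, translated_files=None):
        # source_path only identifies the post; it need not exist on disk.
        self.source_path = source_path
        # Maps language code -> real file path, for posts that have files.
        self._translated_files = translated_files or {}

    def translated_source_path(self, lang):
        # Return a real path for this language, or None if there is no file.
        return self._translated_files.get(lang)


def copy_sources_sketch(posts, languages, output_dir):
    """Copy every source file that exists; ignore posts without one."""
    copied = []
    for post in posts:
        for lang in languages:
            src = post.translated_source_path(lang)
            if src is None:
                continue  # no backing file for this language, nothing to copy
            dest = Path(output_dir) / Path(src).name
            shutil.copyfile(src, dest)
            copied.append(str(dest))
    return copied
```

With this shape, a post generated from a feed simply contributes nothing to the copy step, and no guessing of translation file names is needed.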

felixfontein avatar Feb 16 '17 18:02 felixfontein

Having (overridable) Post methods that return file-like objects, instead of externally deriving filenames and opening the file, would go a long way towards extensibility!

remram44 avatar Feb 16 '17 19:02 remram44

The remaining uses of source_path are for:

  1. Final tie-breaking when sorting posts (see Nikola.sort_posts_chronologically);
  2. Allowing links via source path (see Nikola.filename_path);
  3. For various error/warning messages where it will be used to identify posts.

All in all, for these usages, source_path just has to contain something sensible which allows the user to see who generated this post. It definitely shouldn't be None :)
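As a rough illustration of the tie-breaking use, the sketch below sorts posts by date and falls back to source_path when dates are equal. It is a simplified stand-in, not the body of Nikola.sort_posts_chronologically; the function name and the feed-style identifiers are made up for the example.

```python
from datetime import datetime
from types import SimpleNamespace


def sort_posts_sketch(posts):
    """Newest first; source_path breaks ties between posts with equal dates."""
    # source_path is compared as a plain string, so it must not be None.
    return sorted(posts, key=lambda p: (p.date, p.source_path), reverse=True)


same_day = datetime(2017, 2, 13)
posts = [
    SimpleNamespace(date=same_day, source_path="feed://news#1"),
    SimpleNamespace(date=datetime(2017, 2, 16), source_path="posts/x.rst"),
    SimpleNamespace(date=same_day, source_path="feed://news#2"),
]
ordered = [p.source_path for p in sort_posts_sketch(posts)]
```

Any stable, unique string works as the identifier here, which is why a None value would be a problem: Python cannot order None against a string.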

felixfontein avatar Feb 16 '17 19:02 felixfontein

@remram44: I disagree. Using file-like objects is not really making anything simpler. You can always override Post.compile to use the compiler's compile_string interface, if supported. (Some compilers need files on disk and won't work with anything else.)
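A sketch of what such an override might look like, under loud assumptions: StubCompiler, FilePostSketch and InMemoryPostSketch are hypothetical, and the real Post.compile and compile_string signatures in Nikola take more arguments and behave differently (check the compiler plugin you use).

```python
class StubCompiler:
    """Hypothetical compiler with a compile_string-style method."""

    def compile_string(self, data):
        # Stand-in for real markup compilation.
        return "<p>{}</p>".format(data)


class FilePostSketch:
    """Simplified file-backed post; NOT Nikola's real Post."""

    def __init__(self, source_path, compiler):
        self.source_path = source_path
        self.compiler = compiler

    def compile(self, lang):
        # Default behaviour: read the source from disk, then compile it.
        with open(self.source_path, encoding="utf-8") as f:
            return self.compiler.compile_string(f.read())


class InMemoryPostSketch(FilePostSketch):
    """Post whose source lives in memory instead of in a file."""

    def __init__(self, source_path, compiler, text):
        super().__init__(source_path, compiler)
        self._text = text

    def compile(self, lang):
        # Override: hand the in-memory text straight to the compiler.
        return self.compiler.compile_string(self._text)


post = InMemoryPostSketch("feed://news#1", StubCompiler(), "hello")
html = post.compile("en")
```

This only works for compilers that can compile from a string; as noted above, some need files on disk.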

felixfontein avatar Feb 16 '17 19:02 felixfontein

There's compiling, there's reading the metadata, there's copying the source... Being able to read from wherever the Post lives would be better than assuming it's a source file, or making do without it when it's None.

My personal use case is to create a bunch of articles from a single news feed. I have the metadata, I have the source, but writing down temporary source files seems unnecessary.

remram44 avatar Feb 16 '17 19:02 remram44

You have to consider that Nikola is based on doit, which uses files (on disk) and "up-to-date objects" (usually hashes of dicts) to determine what to rebuild. If you want to work with file-like objects, you somehow need to provide something for doit to be able to do its work.
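For a post with no file, the "hash of a dict" idea could look roughly like the sketch below. It deliberately uses only the standard library rather than doit's own up-to-date machinery, and the function names are hypothetical.

```python
import hashlib
import json


def post_digest(metadata, source_text):
    """Hash the in-memory metadata and source into one stable string."""
    # sort_keys makes the digest independent of dict insertion order.
    payload = json.dumps({"meta": metadata, "source": source_text}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_rebuild(previous_digest, metadata, source_text):
    """True when the post changed since the digest was last recorded."""
    return previous_digest != post_digest(metadata, source_text)


old = post_digest({"title": "News", "slug": "news-1"}, "First version")
unchanged = needs_rebuild(old, {"slug": "news-1", "title": "News"}, "First version")
changed = needs_rebuild(old, {"slug": "news-1", "title": "News"}, "Second version")
```

A build system would store the digest between runs and rebuild the post only when it differs, which is the role a file's content or timestamp plays for file-backed posts.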

You can of course refactor many helper functions to allow file-like objects to be passed (instead of file names), but rewriting Nikola to work with file-like objects will not be a good idea.

felixfontein avatar Feb 16 '17 19:02 felixfontein

Oh I'm fine with the Post being linked to some on-disk file, so doit can do its work, so long as Nikola doesn't go to it directly for any of its operations. In my case, all those posts could link to that single news feed, or something like that.

remram44 avatar Feb 16 '17 19:02 remram44

I tried to create an abstract BasePost class. Hopefully this will help with implementing a non-file-based Post-like object. It could well be that I moved too little/too much stuff; if you notice something, tell me (or create a PR on that branch or change it directly if you have commit rights).

felixfontein avatar Feb 16 '17 20:02 felixfontein

It seems to me that there are two separate issues here:

  1. Post has a hard-coded dependency on "sources" being located on a mounted file system.
  2. The only existing PostScanner scans a local file system.

To me, 1 is a bug, and 2 is an enhancement.

Could this Issue be rewritten to cover one of those cases and a new Issue created for the other?

gwax avatar Jun 12 '17 01:06 gwax

I have already worked around this since February and I no longer have time to write a patch for this myself. I agree with your assessment of the situation, and I think working towards fixing those would be good 👍

remram44 avatar Jun 12 '17 02:06 remram44

Ok, first major issue (as mentioned above):

Compilers open the post's source path (or its translated source path, same thing), like this:

[image]

But that's ok, we know how to fix that. Adding an issue for it: #3384

Getting this done is going to be a very slow, long process, but it's a start.

ralsina avatar Apr 14 '20 20:04 ralsina

So, now the Post object has a source() method it can use to get the source instead of opening the file. Next task: make all the internals use that: #3395
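The direction could be sketched like this: internals ask the post for its text rather than opening source_path themselves. The classes below are simplified, hypothetical stand-ins, and the source() signature shown is an assumption rather than a copy of Nikola's.

```python
class PostSketch:
    """Simplified file-backed post; NOT Nikola's real Post."""

    def __init__(self, source_path):
        self.source_path = source_path

    def source(self, lang=None):
        # Default: the source text comes from the file on disk.
        with open(self.source_path, encoding="utf-8") as f:
            return f.read()


class FeedPostSketch(PostSketch):
    """Post generated from a feed entry, with no file behind it."""

    def __init__(self, source_path, text):
        super().__init__(source_path)
        self._text = text

    def source(self, lang=None):
        return self._text


def word_count(post):
    # Internal helper that never touches post.source_path directly.
    return len(post.source().split())


count = word_count(FeedPostSketch("feed://news#1", "three short words"))
```

Code written against source() works the same for file-backed and generated posts, so only the Post subclass needs to know where the text lives.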

ralsina avatar Apr 22 '20 13:04 ralsina

Just noting that I'm interested in this feature. I need to generate some boilerplate pages, and they are currently produced by an external script. Being able to create file-less Post objects with fully programmatically generated source from a plugin would be very convenient.

Xeverous avatar May 18 '20 13:05 Xeverous

@Xeverous I am interested too, it just takes a while since I can only work on it every once in a while.

In the meantime ... well, you can do what I currently do: generate files, then delete them and regenerate them :-)

ralsina avatar May 18 '20 14:05 ralsina