summarizer

Some questions and suggestions

Open Fleshgrinder opened this issue 11 years ago • 4 comments

Hi Reetesh!

I saw your mail on the nginx mailing list and thought that your project might be of interest for some of my applications. I'm not at all familiar with OTS, although I read the explanations on the GitHub page and on the original project page (the links you included in your mail).

My first question is: how well would it work for generating meta descriptions for web pages, considering that the input text might range from a few sentences to a lot of text? Is it possible to limit the output to roughly 150 to 160 characters (it's not a problem if a sentence isn't totally complete, as you might know)?

If you think that this is possible, or you don't know and I should test it myself, I have a suggestion that would make your CLI more usable within a web application (at least for me).

Most text in an application is read from the database. There isn't really a text file, and creating one would produce unnecessary IO. Would it be possible to extend the CLI executable to read from stdin and write the summarized text to stdout?

Thanks!

Fleshgrinder, Jan 27 '14 16:01

Hi Richard,

Sorry for the delay. I was away on vacation and did not keep up with GitHub.

About limiting the output: I guess if you know the size of your text, you can do a simple calculation of the ratio and feed it to the CLI/nginx module. For example, let's say your text is 3000 bytes and you want only 150 characters: the ratio is (150/3000)*100, i.e. 5%. However, it is definitely more universal to make the CLI/nginx module accept either the ratio or a word limit (and in the latter case derive the ratio). I will add that.
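For illustration, the ratio calculation above could be sketched in Python roughly like this. The function names `summary_ratio` and `clip` are made up for this example and are not part of the CLI or the nginx module; the hard cap in `clip` is an extra assumption, added because a ratio only gives an approximate output length.

```python
def summary_ratio(text: str, target_chars: int = 150) -> float:
    """Return the ratio, in percent, that should yield roughly target_chars.

    Hypothetical helper: the input size is measured in bytes, as in the
    3000-byte example, and the result is clamped to 100 percent.
    """
    size = len(text.encode("utf-8"))
    if size == 0:
        return 100.0
    return min(100.0, target_chars * 100 / size)


def clip(summary: str, limit: int = 160) -> str:
    """Hard-cap a summary at limit characters (assumption, not a CLI feature).

    Cuts at the last space inside the limit when there is one, so the
    result does not end in the middle of a word.
    """
    if len(summary) <= limit:
        return summary
    cut = summary[:limit]
    head, sep, _ = cut.rpartition(" ")
    return head if sep else cut
```

With a 3000-byte text and a 150-character target, `summary_ratio` gives 5.0, which matches the calculation above. Texts shorter than the target simply get a ratio of 100.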

About meta descriptions: I am not sure if you have any more questions here apart from the size of the summary output.

About input format: yes, in fact an application reading stdin is a more universal way of designing a service. I have that in mind; it's just that I developed a version that is more suitable for my own use (not the CLI but the nginx module listening on a socket), where keeping a huge number of articles in a database and getting any caching advantage is not possible due to limited RAM. I will work on that.
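As a rough sketch of how a web application might use such a stdin/stdout interface once it exists, the following Python snippet pipes text into a child process and reads the result back, without touching the file system. The function name `summarize_via_stdin` is hypothetical, and since the real CLI's name and options are not fixed here, the command is passed in by the caller. `ECHO_COMMAND` is only a stand-in that copies stdin to stdout so the sketch can be tried out.

```python
import subprocess
import sys
from typing import Sequence

# Stand-in for the real summarizer: a tiny child process that copies stdin
# to stdout. Replace it with the actual command once stdin support exists.
ECHO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]


def summarize_via_stdin(
    text: str, command: Sequence[str], timeout: float = 10.0
) -> str:
    """Send text to the command's stdin and return what it writes to stdout.

    Raises subprocess.CalledProcessError if the command exits with a
    non-zero status, and subprocess.TimeoutExpired if it takes too long.
    """
    result = subprocess.run(
        list(command),
        input=text,
        capture_output=True,
        encoding="utf-8",
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

A call such as `summarize_via_stdin(article_text, ECHO_COMMAND)` just returns the input unchanged; with a real summarizer command in its place, the return value would be the summary.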

Let me know if I clarified all your questions.

Please feel free to fork and contribute in case you are running out of time and you are OK with C. I should be able to get back to this and work on it in about a week.

Thanks, Reetesh

reeteshranjan, Jan 30 '14 08:01

Thanks for your reply. No need for an apology, time is not an issue on my side.

Thanks for clarifying how to limit it to an arbitrary number of characters, that's exactly what I was looking for. Now I just have to test how meaningful the summaries really are.

Great, the stdin/stdout feature is all I'd need to use it in my project.

Yes, that answers all my questions. Take your time, as I said, no hurry on my side.

Greetz

Fleshgrinder, Jan 31 '14 08:01

Hi,

It's painful that I have not been able to do this yet. This is just to acknowledge that it has never been out of my mind, but I have been consumed with getting my own product's first beta release out. Most of my open source components on GitHub were developed as part of this product.

Ever since I left my job to do my own stuff, I have been in a fix to get to some place soon and have had such embarrassments. Thanks for your patience in case you are still there.

Reetesh

reeteshranjan, Mar 16 '14 05:03

Hey, as I said, no rush on my side. Take your time.

Fleshgrinder, Mar 17 '14 07:03