spring-integration

FeedEntryMessageSource fails if entry date fields are null [INT-1810]

Open spring-operator opened this issue 14 years ago • 9 comments

David Turanski opened INT-1810 and commented

feed:inbound-channel-adapter outputs nothing with the attached feed content from http://feeds.feedburner.com/NF-NewestInstantTitles (NetFlix). The entries are not added because they do not include a published date or entry date. I'm a newb when it comes to RSS, but the bug appears to be in FeedEntryMessageSource.populateEntryList().


Affects: 2.0.3

Attachments:

Feb 20 '11 13:02 spring-operator

Oleg Zhurakousky commented

I am changing this to an improvement, as it was discussed during the initial implementation and we decided to live with it for now. The real issue is how we distinguish an entry that was already read from a new or updated one.

Feb 22 '11 04:02 spring-operator

Oleg Zhurakousky commented

Well, this one is tough. Need to think about it some more.

The core of the issue is that the MetadataStore works by remembering only the latest entry retrieved, based on its creation date.

if (entryDate != null && entryDate.getTime() > this.lastTime) {
   // save the entry
}

The lastTime value comes from the MetadataStore. Disabling it would cause another issue: retrieval of the same data on each poll (DUPLICATES), which is what the MetadataStore was supposed to solve.
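A minimal sketch of that filter, to show why dateless entries never get through. The `DateFilterSketch` and `Entry` names are hypothetical stand-ins, not the actual FeedEntryMessageSource code; only the comparison against lastTime is taken from the snippet above.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class DateFilterSketch {

    // Hypothetical stand-in for a feed entry; not the ROME SyndEntry type.
    static final class Entry {
        final String title;
        final Date date; // may be null for "dateless" feeds

        Entry(String title, Date date) {
            this.title = title;
            this.date = date;
        }
    }

    // Keeps only entries strictly newer than lastTime, mirroring the check above.
    static List<Entry> newerThan(List<Entry> entries, long lastTime) {
        List<Entry> accepted = new ArrayList<>();
        for (Entry entry : entries) {
            Date entryDate = entry.date;
            if (entryDate != null && entryDate.getTime() > lastTime) {
                accepted.add(entry);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("dated", new Date(2000L)));
        entries.add(new Entry("dateless", null));
        // The entry with a null date fails the first half of the condition
        // and is dropped without any error.
        for (Entry accepted : newerThan(entries, 1000L)) {
            System.out.println(accepted.title);
        }
    }
}
```

With a feed where every entry has a null date, the accepted list is always empty, which matches the "outputs nothing" behaviour reported in the issue.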

Mar 01 '11 12:03 spring-operator

David Turanski commented

It's not as bad as that. AFAICT an RSS feed will return HTTP status 304 Not Modified with an empty message body unless the content has changed. I observed this w/ TCP Monitor on a couple of feed URLs I tested. If the feed is updated, you get all entries, not just the deltas. Maybe provide an option to disable the filter?
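For illustration, a rough sketch of what a conditional GET could look like with plain java.net. This is not how the feed adapter fetches content; the `ConditionalFeedFetch` class and its method names are hypothetical, and it assumes the feed server honours the standard If-None-Match / If-Modified-Since validators and serves UTF-8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class ConditionalFeedFetch {

    // Builds the validators for a conditional GET; either argument may be
    // null on the first poll, when nothing has been remembered yet.
    static Map<String, String> conditionalHeaders(String etag, String lastModified) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (etag != null) {
            headers.put("If-None-Match", etag);
        }
        if (lastModified != null) {
            headers.put("If-Modified-Since", lastModified);
        }
        return headers;
    }

    // Returns the feed body, or null when the server answers 304 Not Modified.
    static String fetchIfChanged(URL feedUrl, String etag, String lastModified) {
        try {
            HttpURLConnection connection = (HttpURLConnection) feedUrl.openConnection();
            conditionalHeaders(etag, lastModified).forEach(connection::setRequestProperty);
            if (connection.getResponseCode() == HttpURLConnection.HTTP_NOT_MODIFIED) {
                return null; // nothing changed since the last poll
            }
            try (InputStream body = connection.getInputStream()) {
                // Assumption: the feed is UTF-8 encoded.
                return new String(body.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Shows the request headers a second poll would send (no network access).
        System.out.println(conditionalHeaders("\"abc\"", "Tue, 01 Mar 2011 12:00:00 GMT"));
    }
}
```

A caller would keep the ETag and Last-Modified response headers from the previous poll and pass them back in. A null result means the whole feed can be skipped, but a non-null result still contains every entry, so it does not by itself tell old entries from new ones.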

Mar 02 '11 04:03 spring-operator

Oleg Zhurakousky commented

I have to see how we are calling it and whether I have access to the status, since when I tried the URL you provided here, it gave me the same results on every poll. Also, if that works it will only be a partial solution, since as soon as an update happens we'll have to deal with everything again.

Mar 02 '11 05:03 spring-operator

Oleg Zhurakousky commented

Moving it to 2.1. Need to do a bit more thinking. It's not as simple as it sounds, given how we track what's been read. Most likely we need enhancements to the MetadataStore strategy.

Mar 09 '11 14:03 spring-operator

Oleg Zhurakousky commented

Moving it to 2.2. Not sure how we can address this with the current infrastructure.

Sep 30 '11 11:09 spring-operator

Artem Bilan commented

Well, I'd say the best solution here would be a SyndEntryDateStrategy:

public interface SyndEntryDateStrategy {

     Date entryDate(SyndEntry entry, SyndFeed feed);

}

This would give the end user full control over those "dateless" feeds.
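As an illustration of how such a strategy might be used, here is a sketch of one possible implementation. The nested `SyndEntry` and `SyndFeed` interfaces are minimal stand-ins for the ROME types, reduced to the getters used here so the example compiles on its own; the fallback order and the `DateStrategySketch`, `FALLBACK` and `entryOf` names are assumptions, not part of the proposal.

```java
import java.util.Date;

class DateStrategySketch {

    // Minimal stand-ins for the ROME types, reduced to the getters used here.
    interface SyndEntry {
        Date getPublishedDate();
        Date getUpdatedDate();
    }

    interface SyndFeed {
        Date getPublishedDate();
    }

    // The strategy interface as proposed above.
    interface SyndEntryDateStrategy {
        Date entryDate(SyndEntry entry, SyndFeed feed);
    }

    // Hypothetical fallback chain: entry published date, then entry updated
    // date, then the feed-level published date (which may itself be null).
    static final SyndEntryDateStrategy FALLBACK = (entry, feed) -> {
        if (entry.getPublishedDate() != null) {
            return entry.getPublishedDate();
        }
        if (entry.getUpdatedDate() != null) {
            return entry.getUpdatedDate();
        }
        return feed.getPublishedDate();
    };

    // Small factory so the sketch can be exercised without a real feed.
    static SyndEntry entryOf(Date published, Date updated) {
        return new SyndEntry() {
            @Override
            public Date getPublishedDate() {
                return published;
            }

            @Override
            public Date getUpdatedDate() {
                return updated;
            }
        };
    }

    public static void main(String[] args) {
        SyndFeed feed = () -> new Date(3000L);
        // A dateless entry falls back to the feed-level date.
        System.out.println(FALLBACK.entryDate(entryOf(null, null), feed).getTime());
    }
}
```

Falling back to the feed-level date means every dateless entry in one poll gets the same timestamp, so a strict "newer than lastTime" check would accept them all once and then none until the feed date changes. Whether that is acceptable depends on the feed, which is the point of leaving the choice to the user.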

Dec 17 '15 01:12 spring-operator

Artem Bilan commented

Similar problem: http://stackoverflow.com/questions/36859724/how-to-parse-rss-feeds-with-spring-integration-when-pubdate-not-available

Apr 26 '16 13:04 spring-operator

Artem Bilan commented

One more use-case: https://stackoverflow.com/questions/44435815/spring-integration-feed-inbound-channel-adapter-duplicate-entries

Jun 08 '17 17:06 spring-operator