[Bug]: CBC Radio podcast RSS feeds fail due to User-Agent string

Open Bigmack3000 opened this issue 1 year ago • 13 comments

What happened?

I'm trying to get metadata for the podcast "Someone Knows Something", but ABS can't load the RSS feed:

https://www.cbc.ca/podcasting/includes/sks.xml

ABS matches the podcast, but it can't load individual episodes from the link.

What did you expect to happen?

For the episode information to load.

Steps to reproduce the issue

  1. Match the podcast.
  2. Try to load the episode list.

Audiobookshelf version

2.12.3

How are you running audiobookshelf?

Docker

What OS is your Audiobookshelf server hosted from?

Other (list in "Additional Notes" box)

If the issue is being seen in the UI, what browsers are you seeing the problem on?

None

Logs

No response

Additional Notes

No response

Bigmack3000 avatar Aug 21 '24 16:08 Bigmack3000

Can you provide any logs? You will want to enable Debug logs in the server settings.

nichwall avatar Aug 21 '24 16:08 nichwall

Sure thing. Logs are attached here: 2024-08-21.txt

Also, I believe this is the relevant part:

2024-08-21 17:22:05.759 DEBUG [podcastUtils] getPodcastFeed for "https://www.cbc.ca/podcasting/includes/sks.xml"
2024-08-21 17:22:17.770 ERROR [podcastUtils] getPodcastFeed Error AxiosError: timeout of 12000ms exceeded

Bigmack3000 avatar Aug 21 '24 17:08 Bigmack3000

I tested this feed as well as other CBC podcasts listed at https://www.cbc.ca/radio/podcasts

If I replace our User-Agent string with the default axios one, the requests go through, so they are rejecting our User-Agent string. I also tested some other User-Agent strings and couldn't figure out what criteria they are using. Some work, some don't.

This is ours: audiobookshelf (+https://audiobookshelf.org; like iTMS)

It follows the best practices described here: https://developers.whatismybrowser.com/learn/browser-detection/user-agents/user-agent-best-practices

So I'm not sure what we would do about that. I would think it is an issue on their end.
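For anyone who wants to repeat the comparison, here is a minimal sketch that builds the same request config seen in the logs in this thread (Accept header, 12000 ms timeout, arraybuffer response) with a swappable User-Agent. The helper name `buildFeedRequestConfig` is hypothetical and is not taken from the ABS codebase.

```javascript
// User-Agent string quoted in this thread.
const ABS_USER_AGENT = 'audiobookshelf (+https://audiobookshelf.org; like iTMS)'

// Hypothetical helper: returns an axios-style request config for a feed URL.
// The timeout, responseType and Accept values mirror the config printed in the logs.
function buildFeedRequestConfig(feedUrl, userAgent = ABS_USER_AGENT) {
  return {
    url: feedUrl,
    method: 'get',
    timeout: 12000,
    responseType: 'arraybuffer',
    headers: {
      Accept: 'application/rss+xml, application/xhtml+xml, application/xml, */*;q=0.8',
      'User-Agent': userAgent
    }
  }
}

// Usage (requires axios and network access, so it is left commented out):
// const axios = require('axios')
// axios(buildFeedRequestConfig('https://www.cbc.ca/podcasting/includes/sks.xml', 'some-other-agent'))
//   .then((res) => console.log(res.status))
//   .catch((err) => console.log(err.code, err.message))
```

Running it once with the default and once with another string should show whether the server's response depends on the header alone.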

advplyr avatar Aug 21 '24 21:08 advplyr

Interesting. Thanks for looking into it. I guess I'll manually adjust that one.

Bigmack3000 avatar Aug 22 '24 00:08 Bigmack3000

@Bigmack3000 I am curious how you solved this one. I too am looking to add some CBC podcasts. How did you manually adjust?

fastrack20 avatar Sep 13 '24 13:09 fastrack20

Ha, I have not solved this one yet. But I had used iTunes as my original podcast library, so I already had all of the episodes for some of these shows downloaded with metadata. I imported those into ABS and manually adjusted all the air dates. Still hoping they are able to figure out an official way of importing these shows.

Bigmack3000 avatar Sep 27 '24 21:09 Bigmack3000

Would it be outlandish to request configuration that would allow User-Agent override?
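To make the request concrete, a rough sketch of what such an override could look like, assuming it were read from an environment variable. The variable name `PODCAST_USER_AGENT` and the function `resolveUserAgent` are made up for illustration; ABS does not necessarily expose either.

```javascript
// Default string quoted earlier in this thread.
const DEFAULT_USER_AGENT = 'audiobookshelf (+https://audiobookshelf.org; like iTMS)'

// Hypothetical: pick the User-Agent from an environment-style object,
// falling back to the default when the override is empty or unsafe.
function resolveUserAgent(env = process.env) {
  const override = (env.PODCAST_USER_AGENT || '').trim()
  // Reject values containing line breaks so the header cannot be split.
  if (!override || /[\r\n]/.test(override)) return DEFAULT_USER_AGENT
  return override
}
```

With Docker this would amount to passing one extra environment variable to the container, and leaving it unset keeps the current behaviour.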

I've forked it for now to get my CBC fix.

If anyone else wants to use it:

ghcr.io/davegallant/audiobookshelf:2.14.0-cbc-fix

davegallant avatar Oct 12 '24 21:10 davegallant

So this fork fixed getting metadata for CBC podcasts? Is this something ABS can implement in the main app?

Bigmack3000 avatar Oct 13 '24 20:10 Bigmack3000

It can be added to the main ABS, but the current User-Agent string was introduced because other podcast servers were blocking ABS when it did not send an application-specific User-Agent string. It seems to be hit or miss depending on the specific podcast.
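Since no single string seems to work everywhere, one option would be to try several in order. This is only a sketch of that idea, not how ABS behaves: `fetchWithUserAgentFallback` and the injected `fetchFn(feedUrl, userAgent)` are hypothetical names, and `fetchFn` stands in for whatever performs the actual HTTP request.

```javascript
// Hypothetical: try each User-Agent in turn and return the first success.
// fetchFn must return a promise that rejects on failure (e.g. a timeout).
async function fetchWithUserAgentFallback(feedUrl, userAgents, fetchFn) {
  let lastError = new Error('No User-Agent strings were provided')
  for (const userAgent of userAgents) {
    try {
      const data = await fetchFn(feedUrl, userAgent)
      return { userAgent, data }
    } catch (error) {
      // Remember the failure and move on to the next candidate.
      lastError = error
    }
  }
  throw lastError
}
```

The trade-off is latency: with a 12000 ms timeout per attempt, every rejected string adds up to 12 seconds before the next one is tried.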

nichwall avatar Oct 13 '24 23:10 nichwall

@nichwall What do you think of allowing a User-Agent like this?

This could also serve as a way to harden configuration for more privacy-oriented people.

davegallant avatar Oct 14 '24 14:10 davegallant

I've also opened an issue on https://cbchelp.cbc.ca/ to see if they can potentially remedy the situation on their back end.

davegallant avatar Oct 16 '24 16:10 davegallant

I seem to have issues downloading many podcast feeds.

Examples of some that are working:
https://podcasts.files.bbci.co.uk/p09by3yy.rss
http://feeds.feedburner.com/StoneClearingWithRichardHerring
https://feeds.buzzsprout.com/2084361.rss

Examples of some that are throwing an error:
https://feeds.acast.com/public/shows/b19ac1f5-6adf-4c8b-aa1a-2af2160f99e4
https://access.acast.com/rss/62a222737c02140013aa4c03/
https://feed.podbean.com/IFLScienceTheBigQuestions/feed.xml

I'm running 2.15.0, using the Docker container ghcr.io/advplyr/audiobookshelf:latest. I have also spun up another Docker container on another server to try to rule something out.

Does this look similar to your issue?

The error is pretty impenetrable for me: [podcastUtils] getPodcastFeed Error AxiosError: Call to 64:ff9b::9765:8334 is blocked. at TLSSocket.<anonymous> (/node_modules/ssrf-req-filter/lib/index.js:38:29) at TLSSocket.emit (node:events:519:28) at GetAddrInfoReqWrap.emitLookup [as callback] (node:net:1439:14) at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:132:8) { config: { transitional: { silentJSONParsing: true, forcedJSONParsing: true, clarifyTimeoutError: false }, adapter: [Function: httpAdapter], transformRequest: [ [Function: transformRequest] ], transformResponse: [ [Function: transformResponse] ], timeout: 12000, xsrfCookieName: 'XSRF-TOKEN', xsrfHeaderName: 'X-XSRF-TOKEN', maxContentLength: -1, maxBodyLength: -1, env: { FormData: [Function] }, validateStatus: [Function: validateStatus], headers: { Accept: 'application/rss+xml, application/xhtml+xml, application/xml, */*;q=0.8', 'User-Agent': 'audiobookshelf (+https://audiobookshelf.org; like iTMS)' }, url: 'https://feed.podbean.com/IFLScienceTheBigQuestions/feed.xml', method: 'get', responseType: 'arraybuffer', httpAgent: Agent { _events: [Object: null prototype], _eventsCount: 2, _maxListeners: undefined, defaultPort: 443, protocol: 'https:', options: [Object: null prototype], requests: [Object: null prototype] {}, sockets: [Object: null prototype] {}, freeSockets: [Object: null prototype] {}, keepAliveMsecs: 1000, keepAlive: false, maxSockets: Infinity, maxFreeSockets: 256, scheduling: 'lifo', maxTotalSockets: Infinity, totalSocketCount: 0, maxCachedSessions: 100, _sessionCache: [Object], createConnection: [Function (anonymous)], [Symbol(shapeMode)]: false, [Symbol(kCapture)]: false, [Symbol(active)]: true }, httpsAgent: Agent { _events: [Object: null prototype], _eventsCount: 2, _maxListeners: undefined, defaultPort: 443, protocol: 'https:', options: [Object: null prototype], requests: [Object: null prototype] {}, sockets: [Object: null prototype], freeSockets: [Object: null prototype] {}, 
keepAliveMsecs: 1000, keepAlive: false, maxSockets: Infinity, maxFreeSockets: 256, scheduling: 'lifo', maxTotalSockets: Infinity, totalSocketCount: 1, maxCachedSessions: 100, _sessionCache: [Object], createConnection: [Function (anonymous)], [Symbol(shapeMode)]: false, [Symbol(kCapture)]: false, [Symbol(active)]: true }, data: undefined }, request: <ref *1> Writable { _events: { close: undefined, error: [Function: handleRequestError], prefinish: undefined, finish: undefined, drain: undefined, response: [Function: handleResponse], socket: [Array], timeout: undefined, abort: undefined }, _writableState: WritableState { highWaterMark: 16384, length: 0, corked: 0, onwrite: [Function: bound onwrite], writelen: 0, bufferedIndex: 0, pendingcb: 0, [Symbol(kState)]: 17580812, [Symbol(kBufferedValue)]: null }, _maxListeners: undefined, _options: { maxRedirects: 21, maxBodyLength: 10485760, protocol: 'https:', path: '/IFLScienceTheBigQuestions/feed.xml', method: 'GET', headers: [Object], agent: [Agent], agents: [Object], auth: undefined, hostname: 'feed.podbean.com', port: null, nativeProtocols: [Object], pathname: '/IFLScienceTheBigQuestions/feed.xml' }, _ended: true, _ending: true, _redirectCount: 0, _redirects: [], _requestBodyLength: 0, _requestBodyBuffers: [], _eventsCount: 3, _onNativeResponse: [Function (anonymous)], _currentRequest: ClientRequest { _events: [Object: null prototype], _eventsCount: 7, _maxListeners: undefined, outputData: [], outputSize: 0, writable: true, destroyed: false, _last: true, chunkedEncoding: false, shouldKeepAlive: false, maxRequestsOnConnectionReached: false, _defaultKeepAlive: true, useChunkedEncodingByDefault: false, sendDate: false, _removedConnection: false, _removedContLen: false, _removedTE: false, strictContentLength: false, _contentLength: 0, _hasBody: true, _trailer: '', finished: true, _headerSent: true, _closed: false, socket: [TLSSocket], _header: 'GET /IFLScienceTheBigQuestions/feed.xml HTTP/1.1\r\n' + 'Accept: 
application/rss+xml, application/xhtml+xml, application/xml, */*;q=0.8\r\n' + 'User-Agent: audiobookshelf (+https://audiobookshelf.org; like iTMS)\r\n' + 'Host: feed.podbean.com\r\n' + 'Connection: close\r\n' + '\r\n', _keepAliveTimeout: 0, _onPendingData: [Function: nop], agent: [Agent], socketPath: undefined, method: 'GET', maxHeaderSize: undefined, insecureHTTPParser: undefined, joinDuplicateHeaders: undefined, path: '/IFLScienceTheBigQuestions/feed.xml', _ended: false, res: null, aborted: false, timeoutCb: null, upgradeOrConnect: false, parser: null, maxHeadersCount: null, reusedSocket: false, host: 'feed.podbean.com', protocol: 'https:', _redirectable: [Circular *1], [Symbol(shapeMode)]: false, [Symbol(kCapture)]: false, [Symbol(kBytesWritten)]: 0, [Symbol(kNeedDrain)]: false, [Symbol(corked)]: 0, [Symbol(kOutHeaders)]: [Object: null prototype], [Symbol(errored)]: null, [Symbol(kHighWaterMark)]: 16384, [Symbol(kRejectNonStandardBodyWrites)]: false, [Symbol(kUniqueHeaders)]: null }, _currentUrl: 'https://feed.podbean.com/IFLScienceTheBigQuestions/feed.xml', _timeout: null, [Symbol(shapeMode)]: true, [Symbol(kCapture)]: false } }
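One possible reading of the blocked address, offered as an assumption rather than a confirmed diagnosis: `64:ff9b::/96` is the NAT64 well-known prefix (RFC 6052), so `64:ff9b::9765:8334` carries an IPv4 address in its last 32 bits. The stack trace shows ssrf-req-filter rejecting the resolved address, which would be consistent with the hostname resolving to a NAT64-synthesized address that the filter does not treat as an ordinary public one. The helper `nat64ToIPv4` below is hypothetical and only decodes the embedded IPv4 address.

```javascript
// Hypothetical helper: extract the IPv4 address embedded in an address
// that uses the NAT64 well-known prefix 64:ff9b::/96 in compressed form.
// Returns null for anything else.
function nat64ToIPv4(address) {
  const prefix = '64:ff9b::'
  const lower = String(address).toLowerCase()
  if (!lower.startsWith(prefix)) return null

  const groups = lower.slice(prefix.length).split(':')
  if (groups.length > 2 || !groups.every((g) => /^[0-9a-f]{1,4}$/.test(g))) return null

  // A single trailing group means the upper 16 bits are zero.
  const [high, low] = groups.length === 2
    ? groups.map((g) => parseInt(g, 16))
    : [0, parseInt(groups[0], 16)]

  return [high >> 8, high & 0xff, low >> 8, low & 0xff].join('.')
}

console.log(nat64ToIPv4('64:ff9b::9765:8334')) // prints "151.101.131.52"
```

If that reading is right, the failure depends on the network the container runs on (DNS64/NAT64) rather than on the feed itself, which would make it a different problem from the CBC timeout.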

banigithub-2 avatar Oct 18 '24 08:10 banigithub-2

No, it's a timeout for CBC. These are the logs: [podcastUtils] getPodcastFeed Error [AxiosError: timeout of 12000ms exceeded] { code: 'ECONNABORTED', config: { transitional: { silentJSONParsing: true, forcedJSONParsing: true, clarifyTimeoutError: false }, adapter: [Function: httpAdapter], transformRequest: [ [Function: transformRequest] ], transformResponse: [ [Function: transformResponse] ], timeout: 12000, xsrfCookieName: 'XSRF-TOKEN', xsrfHeaderName: 'X-XSRF-TOKEN', maxContentLength: -1, maxBodyLength: -1, env: { FormData: [Function] }, validateStatus: [Function: validateStatus], headers: { Accept: 'application/rss+xml, application/xhtml+xml, application/xml, */*;q=0.8', 'User-Agent': 'audiobookshelf (+https://audiobookshelf.org; like iTMS)' }, url: 'https://www.cbc.ca/podcasting/includes/spark.xml', method: 'get', responseType: 'arraybuffer', httpAgent: Agent { _events: [Object: null prototype], _eventsCount: 2, _maxListeners: undefined, defaultPort: 443, protocol: 'https:', options: [Object: null prototype], requests: [Object: null prototype] {}, sockets: [Object: null prototype] {}, freeSockets: [Object: null prototype] {}, keepAliveMsecs: 1000, keepAlive: false, maxSockets: Infinity, maxFreeSockets: 256, scheduling: 'lifo', maxTotalSockets: Infinity, totalSocketCount: 0, maxCachedSessions: 100, _sessionCache: [Object], createConnection: [Function (anonymous)], [Symbol(shapeMode)]: false, [Symbol(kCapture)]: false, [Symbol(active)]: true }, httpsAgent: Agent { _events: [Object: null prototype], _eventsCount: 2, _maxListeners: undefined, defaultPort: 443, protocol: 'https:', options: [Object: null prototype], requests: [Object: null prototype] {}, sockets: [Object: null prototype], freeSockets: [Object: null prototype] {}, keepAliveMsecs: 1000, keepAlive: false, maxSockets: Infinity, maxFreeSockets: 256, scheduling: 'lifo', maxTotalSockets: Infinity, totalSocketCount: 1, maxCachedSessions: 100, _sessionCache: [Object], createConnection: [Function (anonymous)], 
[Symbol(shapeMode)]: false, [Symbol(kCapture)]: false, [Symbol(active)]: true }, data: undefined }, request: <ref *1> Writable { _events: { close: undefined, error: [Function: handleRequestError], prefinish: undefined, finish: undefined, drain: undefined, response: [Function: handleResponse], socket: [Array], timeout: undefined, abort: undefined }, _writableState: WritableState { highWaterMark: 16384, length: 0, corked: 0, onwrite: [Function: bound onwrite], writelen: 0, bufferedIndex: 0, pendingcb: 0, [Symbol(kState)]: 17580812, [Symbol(kBufferedValue)]: null }, _maxListeners: undefined, _options: { maxRedirects: 21, maxBodyLength: 10485760, protocol: 'https:', path: '/podcasting/includes/spark.xml', method: 'GET', headers: [Object], agent: [Agent], agents: [Object], auth: undefined, hostname: 'www.cbc.ca', port: null, nativeProtocols: [Object], pathname: '/podcasting/includes/spark.xml' }, _ended: true, _ending: true, _redirectCount: 0, _redirects: [], _requestBodyLength: 0, _requestBodyBuffers: [], _eventsCount: 3, _onNativeResponse: [Function (anonymous)], _currentRequest: ClientRequest { _events: [Object: null prototype], _eventsCount: 2, _maxListeners: undefined, outputData: [], outputSize: 0, writable: true, destroyed: true, _last: true, chunkedEncoding: false, shouldKeepAlive: false, maxRequestsOnConnectionReached: false, _defaultKeepAlive: true, useChunkedEncodingByDefault: false, sendDate: false, _removedConnection: false, _removedContLen: false, _removedTE: false, strictContentLength: false, _contentLength: 0, _hasBody: true, _trailer: '', finished: true, _headerSent: true, _closed: false, socket: [TLSSocket], _header: 'GET /podcasting/includes/spark.xml HTTP/1.1\r\n' + 'Accept: application/rss+xml, application/xhtml+xml, application/xml, */*;q=0.8\r\n' + 'User-Agent: audiobookshelf (+https://audiobookshelf.org; like iTMS)\r\n' + 'Host: www.cbc.ca\r\n' + 'Connection: close\r\n' + '\r\n', _keepAliveTimeout: 0, _onPendingData: [Function: nop], agent: 
[Agent], socketPath: undefined, method: 'GET', maxHeaderSize: undefined, insecureHTTPParser: undefined, joinDuplicateHeaders: undefined, path: '/podcasting/includes/spark.xml', _ended: false, res: null, aborted: true, timeoutCb: null, upgradeOrConnect: false, parser: [HTTPParser], maxHeadersCount: null, reusedSocket: false, host: 'www.cbc.ca', protocol: 'https:', _redirectable: [Circular *1], [Symbol(shapeMode)]: false, [Symbol(kCapture)]: false, [Symbol(kBytesWritten)]: 0, [Symbol(kNeedDrain)]: false, [Symbol(corked)]: 0, [Symbol(kOutHeaders)]: [Object: null prototype], [Symbol(errored)]: null, [Symbol(kHighWaterMark)]: 16384, [Symbol(kRejectNonStandardBodyWrites)]: false, [Symbol(kUniqueHeaders)]: null, [Symbol(kError)]: undefined }, _currentUrl: 'https://www.cbc.ca/podcasting/includes/spark.xml', _timeout: null, [Symbol(shapeMode)]: true, [Symbol(kCapture)]: false } }

davegallant avatar Oct 19 '24 00:10 davegallant

Fixed in v2.17.0.

github-actions[bot] avatar Nov 17 '24 22:11 github-actions[bot]