Not Well-Formed Each attribute must be unique within an element

Open divansantana opened this issue 8 years ago • 8 comments

Trying to get the rss working with gitlab.

Testing on gitlab.com with a free account.

Browse to: https://gitlab.com/dashboard/activity

From there one can obtain the link to the rss feed. Mine is in this format (private token changed obviously):

https://gitlab.com/dashboard/projects.atom?rss_token=ajsasjkdh_asjk

I then added this to elfeed and tried to sync the feed.

My elfeed can actually sync this feed without issues, though my gitlab.com feed is not very busy and there's not much there.

Doing the same thing with our internal GitLab system, I get the error below.

How do I go about debugging this further? I already have elfeed-log-level set to debug.

Would be nice to get this working with GitLab.

[2017-12-08 11:13:44] [error]: https://git.example.com/dashboard/projects.atom?rss_token=mytokenverylongabcde : (error XML: (Not Well-Formed) Each attribute must be unique within an element)

Thanks for your awesome software!

divansantana avatar Dec 08 '17 09:12 divansantana

The two most likely possibilities are:

  • GitLab is serving malformed XML
  • GitLab isn't actually serving a feed, but a "please log in" HTML page

Try visiting the page in your browser, or use curl to download the feed and inspect it manually to make sure it's actually a feed. Assuming you're using the curl backend, you can get the exact arguments Elfeed is using with elfeed-curl--args:

(elfeed-curl--args URL nil)
;; => ("--http1.1" "--compressed" ...)

That's in case it's something specific to the way Elfeed fetches feeds. It's based on your configuration and your system's version of curl.
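As a rough way to tell the two possibilities apart once the response has been saved, the check below could be run on the downloaded text outside Emacs. It is only a sketch: `classify_payload` is a hypothetical helper name, it uses Python's standard-library parser rather than the one Elfeed uses, and so a payload it accepts may still be rejected by Elfeed.

```python
import xml.etree.ElementTree as ET


def classify_payload(text):
    """Guess what kind of document a downloaded feed URL actually returned."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # Typical for an HTML login page (unclosed tags) or a broken feed.
        return f"not well-formed XML: {err}"
    # ElementTree spells namespaced tags as "{uri}local"; keep only "local".
    local = root.tag.rsplit("}", 1)[-1].lower()
    if local == "feed":
        return "atom feed"
    if local in ("rss", "rdf"):
        return "rss feed"
    if local == "html":
        return "html page"
    return f"unknown root <{local}>"


print(classify_payload('<feed xmlns="http://www.w3.org/2005/Atom"/>'))
print(classify_payload("<html><body>Sign in</body></html>"))
```

If the result is an HTML page or a parse error near the top of the document, the token or login is the more likely culprit; an Atom root points back at the feed contents.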

skeeto avatar Dec 08 '17 17:12 skeeto

It seems that the problem may be with the diff highlighting that comes back in the output. It looks like some attributes have a namespace as well, i.e. lang="yaml" xml:lang="yaml" which seems to be what trips up the parser.

I'm not sure whether the problem is simply that the namespace gets stripped, so it ends up as lang="yaml" lang="yaml", which is invalid. I've tried to do some quick debugging, but my elisp is not very great.
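To illustrate the suspicion (not to confirm what Elfeed's parser really does), here is a small sketch outside Emacs that lists attributes which would collide if prefixes were dropped. `prefix_collisions` and the sample element are made up for the example; it assumes the saved feed is otherwise well-formed so that Python's standard-library parser accepts it.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def prefix_collisions(xml_text):
    """Return (tag, local_name) pairs that clash once attribute prefixes are dropped."""
    collisions = []
    for element in ET.fromstring(xml_text).iter():
        # Namespaced names look like "{uri}local"; keep only the local part.
        local_names = Counter(name.rsplit("}", 1)[-1] for name in element.attrib)
        tag = element.tag.rsplit("}", 1)[-1]
        collisions.extend((tag, name) for name, n in local_names.items() if n > 1)
    return collisions


# Hypothetical element shaped like the highlighted diff output described above.
sample = '<pre lang="yaml" xml:lang="yaml">key: value</pre>'
print(prefix_collisions(sample))  # [('pre', 'lang')]
```

A namespace-aware parser treats lang and xml:lang as two different attributes, so the sample is valid XML; the clash only appears if the prefix is discarded before the uniqueness check.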

ci avatar Sep 22 '20 22:09 ci

@divansantana I have the same issue with our internal gitlab. Have you found any solution?

alishir avatar Nov 11 '21 13:11 alishir

@alishir I don't use gitlab these days

divansantana avatar Nov 15 '21 10:11 divansantana

I've just run into this with my gitlab activity feed. The W3C feed validator reports numerous issues, but I can't spot the non-unique attributes. There is an upstream bug against gitlab:

https://gitlab.com/gitlab-org/gitlab/-/issues/336579

but as that is unlikely to get fixed anytime soon I wonder if there is a potential workaround within elfeed? Is the XML failure in core Emacs code or part of elfeed itself? Could it be made more forgiving and just drop elements that are poorly formed?

stsquad avatar Feb 11 '22 17:02 stsquad

I should note if I load the feed into a buffer and run:

ELISP> (xml-parse-region (point-min) (point-max) nil t t)

it seems to generate a reasonable sexp without any complaint.

stsquad avatar Feb 11 '22 17:02 stsquad

I have the same problem with Gitlab feeds. Did anyone find a workaround, if not a fix?

It works sometimes, but unfortunately I haven't captured a specific example of an entry that breaks the parser.

fredfortier avatar Mar 23 '23 16:03 fredfortier

Here's a (poor) workaround that seems to work for the Gitlab activity feed: https://gitlab.com/gitlab-org/gitlab/-/issues/336579#note_1326248925

Is there a hook available to run such replacement before parsing the XML?
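For reference, the kind of replacement I mean looks roughly like the sketch below, written outside Emacs. It is not necessarily what the linked note does, and it says nothing about whether Elfeed has such a hook; `strip_xml_lang` is a hypothetical name, and it assumes the duplicate comes from xml:lang sitting next to a plain lang attribute, as suggested earlier in this thread.

```python
import re

# Matches a whitespace-prefixed xml:lang="..." or xml:lang='...' attribute.
XML_LANG_ATTR = re.compile(r"""\s+xml:lang\s*=\s*(?:"[^"]*"|'[^']*')""")


def strip_xml_lang(xml_text):
    """Drop every xml:lang attribute so only the unprefixed lang remains."""
    # Caveat: a plain regex also rewrites matching text outside of tags.
    return XML_LANG_ATTR.sub("", xml_text)


print(strip_xml_lang('<pre lang="yaml" xml:lang="yaml">key: value</pre>'))
# <pre lang="yaml">key: value</pre>
```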

fredfortier avatar Mar 23 '23 19:03 fredfortier