author: Hiltjo Posthuma <hiltjo@codemadness.org> 2020-10-31 19:51:17 +0100
committer: Hiltjo Posthuma <hiltjo@codemadness.org> 2020-10-31 19:57:13 +0100
commit: 134a1ac3372fe1eae6bc5c6acd12666c17e82696
tree: d2baf5fa3b11b075e16f2000b0558577f5e603c4 /sfeed_web.c
parent: 6a7229149f03a54d7d63241c4cbc1c83aa9831f0
sfeed_web: improve parsing a <link> if it has no type attribute

This happens because the previous link type is not reset when a <link> tag
starts again, but it is reset when a type attribute starts.

Found on the Spanish newspaper site: elpais.com

Input:

    <link rel="alternate" href="https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada" type="application/rss+xml" title="RSS de la portada de El País"/>
    <link rel="canonical" href="https://elpais.com"/>

Would print (second line is incorrect):

    https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada application/rss+xml
    https://elpais.com/ application/rss+xml

Now prints:

    https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada application/rss+xml

Fix: also reset the link type at the start of a <link> tag in this case
(for <base href /> it is still not wanted).

Diffstat (limited to 'sfeed_web.c')
-rw-r--r-- sfeed_web.c | 1
1 files changed, 1 insertions, 0 deletions
diff --git a/sfeed_web.c b/sfeed_web.c
index e0ab874..6d547a7 100644
--- a/sfeed_web.c
+++ b/sfeed_web.c
@@ -32,6 +32,7 @@ xmltagstart(XMLParser *p, const char *t, size_t tl)
 	} else if (!strcasecmp(t, "link")) {
 		islinktag = 1;
 		linkhref[0] = '\0';
+		linktype[0] = '\0';
 	}
 }
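
The stale-state bug described in the commit message can be reproduced in isolation with a minimal sketch. This is not the code from sfeed_web.c: the handlers `tagstart`, `attr`, `tagend` and the driver `simulate` are hypothetical stand-ins, and the rule "report a link only when both href and type are non-empty" is an assumption made for illustration. The `reset_type` parameter switches between the old behaviour (0) and the fixed behaviour (1).

```c
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Hypothetical, simplified parser state; names mirror the diff above. */
static char linkhref[256];
static char linktype[64];
static int islinktag;

/* Tag start: reset_type = 0 models the old code, 1 models the fix. */
static void
tagstart(const char *t, int reset_type)
{
	islinktag = 0;
	if (!strcasecmp(t, "link")) {
		islinktag = 1;
		linkhref[0] = '\0';
		if (reset_type)
			linktype[0] = '\0'; /* the line added by this commit */
	}
}

/* Attribute value: only href and type of a <link> are stored. */
static void
attr(const char *name, const char *value)
{
	if (!islinktag)
		return;
	if (!strcasecmp(name, "href"))
		snprintf(linkhref, sizeof(linkhref), "%s", value);
	else if (!strcasecmp(name, "type"))
		snprintf(linktype, sizeof(linktype), "%s", value);
}

/* Tag end: returns 1 if the link would be reported (assumed rule:
 * both href and type are set), else 0. */
static int
tagend(void)
{
	int reported = islinktag && linkhref[0] && linktype[0];

	islinktag = 0;
	return reported;
}

/* Feed the two <link> tags from the commit message through the
 * handlers and count how many links would be reported. */
static int
simulate(int reset_type)
{
	int n = 0;

	linkhref[0] = '\0';
	linktype[0] = '\0';

	tagstart("link", reset_type);
	attr("rel", "alternate");
	attr("href", "https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada");
	attr("type", "application/rss+xml");
	n += tagend();

	tagstart("link", reset_type);
	attr("rel", "canonical");
	attr("href", "https://elpais.com");
	n += tagend(); /* no type attribute on this tag */

	return n;
}
```

Under these assumptions, `simulate(0)` counts two links because the canonical link inherits the type left over from the previous tag, while `simulate(1)` counts only the feed link.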