sfeed.git - Suckless rss Feed reader with my configs

Age	Commit message	Author
2021-01-01	README: tested on MIPS32 (big-endian)	Hiltjo Posthuma

2021-01-01	LICENSE: bump year	Hiltjo Posthuma

2021-01-01	sfeed_update: if baseurl is empty then use the path from the feed by default	Hiltjo Posthuma
	Feeds should contain absolute urls, but if it does not have it then this makes it more convenient to configure such feeds.
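The fallback described above can be sketched as follows. This is a minimal illustration in Python, not the shell logic of sfeed_update; the names `absolute_link`, `baseurl` and `feedurl` are hypothetical and chosen only for this example.

```python
from urllib.parse import urljoin


def absolute_link(link, baseurl, feedurl):
    """Return an absolute URL for an item link.

    baseurl: explicitly configured base URL (may be empty).
    feedurl: URL the feed was fetched from, used when baseurl is empty.
    """
    # urljoin() returns absolute links unchanged and resolves relative ones.
    return urljoin(baseurl or feedurl, link)
```

With an empty `baseurl`, a relative link such as `/post/1` is resolved against the feed URL, so a feed that omits absolute URLs needs no extra configuration.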
2020-11-09	bump version to 0.9.20	Hiltjo Posthuma

2020-11-01	sfeed_xmlenc: be more paranoid in printing encoding names	Hiltjo Posthuma
	sfeed_xmlenc is used automatically in sfeed_update for detecting the encoding. In particular, do not allow slashes anymore either, for example "//IGNORE" and "//TRANSLIT", which are normally allowed. Some iconv implementations might allow other funky names or even pathnames too, so disallow those. See also the notes about the "frommap" for the "-f" option: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/iconv.html Also includes some minor parsing improvements.
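A whitelist filter of the kind described above might look like this. It is a sketch under assumptions: the exact set of characters sfeed_xmlenc accepts is not stated in this log, so `ALLOWED` is illustrative, and `sanitize_encoding_name` is a hypothetical name.

```python
import string

# Assumption: ASCII letters, digits and a few punctuation characters are
# enough for common encoding names. Slashes are deliberately excluded.
ALLOWED = set(string.ascii_letters + string.digits + ".:-_")


def sanitize_encoding_name(name):
    """Drop every character outside the whitelist and lowercase the rest."""
    return "".join(ch.lower() for ch in name if ch in ALLOWED)
```

Because slashes are dropped, a suffix such as "//TRANSLIT" cannot reach iconv as an option and a value that looks like a pathname cannot be passed through as one.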
2020-10-31	sfeed_web: improve parsing a <link> if it has no type attribute	Hiltjo Posthuma
	This happens because the previous link type is not reset when a <link> tag starts again, but it is reset when a type attribute starts. Found on the Spanish newspaper site elpais.com.
	Input:
		<link rel="alternate" href="https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada" type="application/rss+xml" title="RSS de la portada de El País"/>
		<link rel="canonical" href="https://elpais.com"/>
	Would print (second line is incorrect):
		https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada application/rss+xml
		https://elpais.com/ application/rss+xml
	Now prints:
		https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada application/rss+xml
	Fix: reset it also at the start of a <link> tag in this case (for <base href /> it is still not wanted).
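The state reset can be illustrated with a small parser. This is a sketch using Python's `html.parser`, not the sfeed_web code; `FeedLinkFinder` and `FEED_TYPES` are hypothetical names, and the list of feed MIME types is an assumption.

```python
from html.parser import HTMLParser

# Assumption: the MIME types treated as feeds in this sketch.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (href, type) pairs from <link> tags that point to feeds."""

    def __init__(self):
        super().__init__()
        # State that outlives a single tag, as in a streaming parser.
        self.linktype = ""
        self.href = ""
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Reset the per-link state when a <link> tag starts,
        # not only when a type attribute is seen.
        self.linktype = ""
        self.href = ""
        for name, value in attrs:
            if name == "type":
                self.linktype = (value or "").lower()
            elif name == "href":
                self.href = value or ""
        if self.linktype in FEED_TYPES:
            self.found.append((self.href, self.linktype))
```

Without the two reset lines, the canonical link in the input above would inherit the type of the preceding feed link and be reported as a feed.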
2020-10-24	bump version to 0.9.19	Hiltjo Posthuma

2020-10-22	sfeed_web: whoops, fix bug mentioned in the previous commit	Hiltjo Posthuma
	(ascii.jp)
2020-10-22	sfeed_web: attribute parsing improvements, improve man page	Hiltjo Posthuma
	Fix attribute parsing and now decode entities. The following now works (from helsinkitimes.fi):
		<base href="https://www.helsinkitimes.fi/" />
		<link href="/?format=feed&type=rss" rel="alternate" type="application/rss+xml" title="RSS 2.0" />
		<link href="/?format=feed&type=atom" rel="alternate" type="application/atom+xml" title="Atom 1.0" />
	Properly associate attributes with the actual tag. This now parses properly (from ascii.jp):
		<link rel="apple-touch-icon-precomposed" href="/img/apple-touch-icon.png" />
		<link rel="alternate" type="application/rss+xml" />
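The combination of a <base href>, entity decoding and per-tag attributes can be sketched as follows. This is an illustration with Python's `html.parser` and `urljoin`, not the way sfeed_web joins URLs; `BaseAwareLinkFinder` is a hypothetical name, and skipping links without an href is a simplification of this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseAwareLinkFinder(HTMLParser):
    """Collect typed <link> tags and resolve their href against <base href>."""

    def __init__(self, pageurl=""):
        super().__init__()
        self.base = pageurl
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Attributes arrive per tag, so they cannot leak between tags.
        # HTMLParser has already decoded entities such as &amp; in values.
        attrs = dict(attrs)
        if tag == "base" and attrs.get("href"):
            self.base = urljoin(self.base, attrs["href"])
        elif tag == "link" and attrs.get("href") and attrs.get("type"):
            self.links.append((urljoin(self.base, attrs["href"]), attrs["type"]))
```

Feeding it the helsinkitimes.fi markup yields absolute feed URLs, and the ascii.jp icon link is no longer paired with the type of the following tag.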
2020-10-22	Do not change the referenced matched tag data (from gettag()).	Hiltjo Posthuma
	Fixes a regression introduced in the refactor in commit e43b7a48b08a6bbcb4e730e80395b3257681b33e. The data is now copied by value. This structure is small and no performance regression has been seen. The regression occurred because the tag ID was modified, which made subsequently parsed tags of this type behave strangely:
		ctx.tag->id = RSSTagGuidPermalinkTrue;
	Input data to reproduce:
		<rss>
		<channel>
		<item>
		<guid isPermaLink="false">https://def/</guid>
		</item>
		<item>
		<guid>https://abc/</guid>
		</item>
		</channel>
		</rss>
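The aliasing problem can be shown in miniature. This is a sketch in Python rather than the C code; `Tag`, `TAGS` and `start_tag` are hypothetical, and treating a missing isPermaLink attribute as "true" follows the RSS 2.0 default rather than anything stated in this log.

```python
from dataclasses import dataclass, replace


@dataclass
class Tag:
    name: str
    id: str


# Shared lookup table: its entries must stay unchanged between tags.
TAGS = {"guid": Tag("guid", "RSSTagGuid")}


def gettag(name):
    return TAGS.get(name)


def start_tag(name, attrs):
    found = gettag(name)
    if found is None:
        return None
    # Copy by value so the table entry itself is never modified.
    tag = replace(found)
    if tag.id == "RSSTagGuid" and attrs.get("isPermaLink", "true").lower() == "true":
        tag.id = "RSSTagGuidPermalinkTrue"
    return tag
```

If `start_tag` modified `found` directly, the first permalink <guid> would change the table entry and every later <guid> would be treated as a permalink too.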
2020-10-21	README: filter example, filter Google Analytics utm_* parameters	Hiltjo Posthuma
	https://support.google.com/analytics/answer/1033867?hl=nl
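The idea of such a filter can be sketched as follows. The README example itself is a shell/awk filter; this Python version is only an illustration, `strip_tracking` is a hypothetical name, and the fbclid parameter is included because a later commit in this log strips it as well.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking(url):
    """Remove utm_* and fbclid query parameters from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_") and key != "fbclid"
    ]
    # Note: urlencode() may normalize the escaping of the remaining values.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Applied to the link field of each item, this keeps links stable so the same article is not seen as new when only its tracking parameters change.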
2020-10-21	sfeed_web: reset feedlink buffer	Hiltjo Posthuma
	Noticed strange output on the site ascii.jp. The site HTML contained:
		<link rel="apple-touch-icon-precomposed" href="/img/apple-touch-icon.png" />
		<link rel="alternate" type="application/rss+xml" />
	This would print:
		"/img/apple-touch-icon.png application/rss+xml"
	Now it prints:
		" application/rss+xml"
2020-10-18	README: improve etag example with escaping of the filename	Hiltjo Posthuma
	Use the same base filename as the feed file, because sfeed_update replaces '/' in names with '_':
		filename="$(printf '%s' "$1" | tr '/' '_')"
	This fixes the example for fetching feeds with names containing '/'. Reported by __20h__, thanks!
2020-10-18	README: add example to support ETag caching	Hiltjo Posthuma

2020-10-18	xml.c: initialize i = 0	Hiltjo Posthuma
	Forgot it in the cleanup commit 37afcf334fa1ba0b668bde08e8fcaaa9fd7dfa0d
2020-10-16	README.xml: reference examples, ANSI compatible, mention original parser	Hiltjo Posthuma

2020-10-16	README: fix unescaped character in regex in awk in filter example	Hiltjo Posthuma
	Found by testing using mawk.
2020-10-12	add a comment about the intended date priority	Hiltjo Posthuma

2020-10-12	Revert "RSS: give Dublin Core <dc:date> higher priority over <pubDate>"	Hiltjo Posthuma
	This reverts commit a1516cb7869a0dd99ebaacf846ad4161f2b9b9a2.
2020-10-12	README: filter example: strip Facebook fbclid parameter	Hiltjo Posthuma

2020-10-12	simplify time parsing	Hiltjo Posthuma

2020-10-12	remove unneeded check for NUL terminator	Hiltjo Posthuma

2020-10-12	RSS: give Dublin Core <dc:date> higher priority over <pubDate>	Hiltjo Posthuma
	This way dc:date could be the updated time of the item. For Atom there is <published> and <updated> with the same logic.
2020-10-12	parse categories, add multiple field values support (for categories)	Hiltjo Posthuma
	Fields with multiple values are separated by '|'. In the future multiple enclosure support might be added. The category tags are now parsed. This feature is useful for filtering and categorizing. Parsing of nested tags such as <author><name> has been improved. This code has been refactored. RSS <guid> isPermaLink is now also handled differently: a permalink with "true" (link) is preferred over the ID. In practice, multiple <guid> tags in an item do not occur.
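Reading the multi-value field from the TSV output could look like this. It is a sketch: the field order in `FIELDS` is an assumption based on the sfeed(5) format with category as the last field, and `parse_line` is a hypothetical helper.

```python
# Assumption: field order of one TAB-separated line, category last.
FIELDS = (
    "timestamp", "title", "link", "content", "content_type",
    "id", "author", "enclosure", "category",
)


def parse_line(line):
    """Split one TSV line into a dict and expand the category field."""
    values = line.rstrip("\n").split("\t")
    # Pad lines that have fewer fields so every key is present.
    values += [""] * (len(FIELDS) - len(values))
    item = dict(zip(FIELDS, values))
    # Multiple values are separated by '|'; empty values are dropped.
    item["categories"] = [c for c in item["category"].split("|") if c]
    return item
```

A filter can then test `"news" in item["categories"]` instead of matching against the raw field.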
2020-10-09	xml: remove unused code for sfeed	Hiltjo Posthuma

2020-10-09	fix counting due to uninitialized variable when the time could not be parsed	Hiltjo Posthuma
	Since commit 276d5789fd91d1cbe84b7baee736dea28b1e04c0 if the time is empty or could not be parsed then it is shown/aligned as a blank space instead of being skipped. An oversight in this change was that items should be counted and set in `isnew`. This commit fixes the uninitialized variable and possible miscounting.
2020-10-09	xml.h: minor comment rewording	Hiltjo Posthuma

2020-10-09	sfeed: parse day with max 2 digits (instead of 4)	Hiltjo Posthuma

2020-10-09	sfeed: support the ISO8601 time format without separators	Hiltjo Posthuma
	For example "19720229T132245Z" is now supported.
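Accepting both spellings of the timestamp can be sketched as below. This uses Python's `strptime` for illustration and is not the hand-written parser in sfeed; `parse_iso8601_utc` and `FORMATS` are hypothetical names, and only the "Z" (UTC) suffix is handled.

```python
from datetime import datetime, timezone

# With separators first, then the compact form without separators.
FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y%m%dT%H%M%SZ")


def parse_iso8601_utc(text):
    """Return an aware UTC datetime, or None if no format matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    return None
```

Both "1972-02-29T13:22:45Z" and "19720229T132245Z" then parse to the same point in time.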
2020-10-09	README: tested with cproc and sdcc on Z80 emulator, for fun	Hiltjo Posthuma
	cproc:
		cproc: https://github.com/michaelforney/cproc
		qbe: https://c9x.me/compile/
	z80 (sfeed base program):
		fuzix: http://www.fuzix.org/
		RC2014 emulator: https://github.com/EtchedPixels/RC2014
		sdcc: http://sdcc.sourceforge.net/
2020-10-09	man pages: tweak alignment of lists	Hiltjo Posthuma

2020-10-09	xml.c: remove buffering of comment data, which is unused anyway	Hiltjo Posthuma

2020-10-09	xml.h: add underscore for #ifdef guard	Hiltjo Posthuma
	This is the common style.
2020-10-09	XML cdata callback: handle CDATA as data	Hiltjo Posthuma
	This improves handling CDATA for example in Atom feeds with: <author><email><![CDATA[abc]]><name><![CDATA[[person]]></name></author>
2020-07-06	bump version to 0.9.18	Hiltjo Posthuma

2020-07-05	sfeed_atom: minor simplification, gmtime_r is not needed here	Hiltjo Posthuma

2020-07-05	README: reference sfeed_curses	Hiltjo Posthuma

2020-07-05	README: improvements	Hiltjo Posthuma
	- Add an example to optimize bandwidth use with the curl -z option.
	- Add a note about CDNs blocking based on the User-Agent (based on a question mailed to me).
	- Add a script to convert existing newsboat items to the sfeed(5) TSV format.
2020-07-05	format tools: don't skip items with a missing/invalid timestamp field	Hiltjo Posthuma
	Handle it appropriately in the context of each format tool. Output the item but keep it blanked. NOTE: maybe in sfeed_twtxt it should use the current time instead?
2020-07-05	sfeed_mbox: don't ignore items with a missing/invalid timestamp	Hiltjo Posthuma
	The Date header is mandatory. Use the current time if it is missing/invalid.
2020-07-05	sfeed_atom: the updated field is mandatory: use the current time...	Hiltjo Posthuma
	... if it is missing/invalid.
2020-07-05	sfeed_atom: fix timezone, output if timestamp is set	Hiltjo Posthuma
	Timezone should be GMT (as intended), do not convert to localtime.
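Formatting the timestamp in GMT, together with the current-time fallback from the commits listed just above this one, can be sketched as follows. This is an illustration in Python, not the sfeed_atom code; `atom_updated` and its `now` parameter are hypothetical.

```python
import time


def atom_updated(field, now=None):
    """Format a UNIX timestamp field as an Atom date in GMT.

    A missing or invalid field falls back to the current time, or to
    `now` when given, because the updated element is mandatory.
    """
    try:
        timestamp = int(field)
    except ValueError:
        timestamp = int(time.time()) if now is None else now
    # gmtime(), not localtime(): the trailing "Z" promises GMT.
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(timestamp))
```

Using `time.localtime()` here would still print a "Z" suffix but shift the value by the local UTC offset, which is the kind of mismatch this commit fixes.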
2020-06-25	README: small tweaks and a filter example improvement	Hiltjo Posthuma
	This is a "quick&dirty" regex to block some of the typical 1px width or height tracking pixels.
2020-06-21	sfeed_html/sfeed_frames: simplify struct feed allocation	Hiltjo Posthuma
	There's no need for a dynamic struct feed **. The required size is known (argc). Just allocate it in one go.
2020-06-21	Makefile: tiny compatibility improvement for tar -cf	Hiltjo Posthuma

2020-06-10	Makefile: pedantic change: use ar -rc instead of ar rc	Hiltjo Posthuma

2020-06-04	sfeed.{1,5}: clarify the timestamp field a bit	Hiltjo Posthuma
	In particular for RSS feeds where a pubDate is optional.
2020-06-04	sfeed_atom: make the output more conformant	Hiltjo Posthuma
	- Set mandatory entry tags: id, updated.
	- Change entry published (optional tag) to updated (mandatory).
	- Add <feed> tags: author name, id, updated, title.
	Thanks lich for the feedback and testing.
2020-06-01	fix typo	Hiltjo Posthuma

2020-05-28	sfeed: simplify/optimize checking end tags while inside a RSS/Atom tag	Hiltjo Posthuma
	Instead of doing a binary search, set a pointer to the expected end tag. This makes more sense and is also a minor optimization. No behavioural change intended.
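The idea can be sketched as follows: look the tag up once when it opens, remember it, and compare end tags against the remembered name. This is a Python illustration, not the C code in sfeed; `KNOWN_TAGS`, `gettag` and `TagTracker` are hypothetical, and the tag list is an arbitrary sample.

```python
import bisect

# Assumption: a small sorted sample of tag names of interest.
KNOWN_TAGS = sorted(["author", "category", "guid", "link", "pubdate", "title"])


def gettag(name):
    """Binary search in the sorted table; only needed for start tags."""
    i = bisect.bisect_left(KNOWN_TAGS, name)
    if i < len(KNOWN_TAGS) and KNOWN_TAGS[i] == name:
        return KNOWN_TAGS[i]
    return None


class TagTracker:
    def __init__(self):
        self.current = None  # tag whose end is expected, if any

    def start(self, name):
        if self.current is None:
            self.current = gettag(name)

    def end(self, name):
        # No lookup here: compare directly with the expected end tag.
        if self.current is not None and name == self.current:
            self.current = None
```

End tags of nested or unknown elements are ignored with a single comparison, and the tracked tag is closed only by its own end tag.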