author | Hiltjo Posthuma <hiltjo@codemadness.org> | 2016-04-10 19:51:18 +0200 |
---|---|---|
committer | Hiltjo Posthuma <hiltjo@codemadness.org> | 2016-04-10 19:51:18 +0200 |
commit | eb6fe6f11a14afc82cd0039d88759d6c1c524d2f (patch) | |
tree | 641ae3fc5f338e359addb33619f5ee98686e1575 | |
parent | 969ec64ef3195e00ae597e49a39e804bb6ce6464 (diff) |
improve documentation, add sfeed(5) for the file format
separate sfeed(5) page for just the feed file format.
-rw-r--r-- | Makefile | 1 |
-rw-r--r-- | sfeed.1 | 12 |
-rw-r--r-- | sfeed.5 | 50 |
3 files changed, 57 insertions, 6 deletions
--- a/Makefile
+++ b/Makefile
@@ -35,6 +35,7 @@ LIB = ${LIBUTIL} ${LIBXML}
 MAN1 = ${BIN:=.1}\
 	${SCRIPTS:=.1}
 MAN5 = \
+	sfeed.5\
 	sfeedrc.5
 DOC = \
 	CHANGELOG\
--- a/sfeed.1
+++ b/sfeed.1
@@ -17,12 +17,12 @@ recommended to always have absolute urls in your feeds.
 .Sh TAB-SEPARATED FORMAT FIELDS
 The items are saved in a TSV-like format.
 .Pp
-The fields: title, id, author are not allowed to have newlines and TABs. All
-whitespace is replaced by a single space character. Control characters are
-removed.
+The fields: title, id, author are not allowed to have newlines and TABs, all
+whitespace characters are replaced by a single space character. Control
+characters are removed.
 .Pp
 The content field can contain newlines and is escaped. TABs, newlines and '\\'
-are escaped with '\\', so: '\\n', '\\t', and '\\\\'. Other whitespace
+are escaped with '\\', so it becomes: '\\t', '\\n' and '\\\\'. Other whitespace
 characters except space are removed. Control characters are removed.
 .Pp
 The order and format of the fields are:
@@ -30,7 +30,7 @@ The order and format of the fields are:
 .It item timestamp
 UNIX timestamp in UTC+0, empty on parse failure.
 .It item title
-Title text, HTML in titles is treated as plain-text (on purpose).
+Title text, HTML in titles is treated as plain-text.
 .It item link
 Absolute url, unsafe characters are encoded.
 .It item content
@@ -55,4 +55,4 @@ Item author.
 .Sh CAVEATS
 if a timezone is not supported (non-RFC-822) the UNIX timestamp is interpreted
 as UTC+0.
-HTML in titles is treated as plain-text (on purpose).
+HTML in titles is treated as plain-text.
--- /dev/null
+++ b/sfeed.5
@@ -0,0 +1,50 @@
+.Dd April 10, 2016
+.Dt SFEED 5
+.Os
+.Sh NAME
+.Nm sfeed
+.Nd feed format
+.Sh SYNOPSIS
+.Nm
+.Sh DESCRIPTION
+.Xr sfeed 1
+reads RSS or Atom feed data (XML) from stdin. It writes the feed data in a
+TAB-separated format to stdout.
+.Sh TAB-SEPARATED FORMAT FIELDS
+The fields: title, id, author are not allowed to have newlines and TABs, all
+whitespace characters are replaced by a single space character. Control
+characters are removed.
+.Pp
+The content field can contain newlines and is escaped. TABs, newlines and '\\'
+are escaped with '\\', so it becomes: '\\t', '\\n' and '\\\\'. Other whitespace
+characters except space are removed. Control characters are removed.
+.Pp
+The order and format of the fields are:
+.Bl -tag -width 17n
+.It item timestamp
+UNIX timestamp in UTC+0, empty on parse failure.
+.It item title
+Title text, HTML in titles is treated as plain-text.
+.It item link
+Absolute url, unsafe characters are encoded.
+.It item content
+Newlines and TABs are escaped. Control characters are removed. See the
+.Sx TAB-SEPARATED FORMAT FIELDS
+text.
+.It item content\-type
+"html" or "plain".
+.It item id
+RSS item GUID or Atom id.
+.It item author
+Item author.
+.It feed type
+"rss" or "atom".
+.El
+.Sh SEE ALSO
+.Xr sfeed 1
+.Sh AUTHORS
+.An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
+.Sh CAVEATS
+if a timezone is not supported (non-RFC-822) the UNIX timestamp is interpreted
+as UTC+0.
+HTML in titles is treated as plain-text.
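
To illustrate the format the new sfeed.5 page describes, here is a minimal Python sketch of a reader for a single line of sfeed output. It is not part of this commit or of sfeed itself: the names FIELDS, unescape_content and parse_line are hypothetical, and the handling of unknown escape sequences and of lines with fewer than eight fields is an assumption, since the man page does not specify either case.

```python
# Field order as listed in the "order and format of the fields" section.
FIELDS = ("timestamp", "title", "link", "content",
          "content_type", "id", "author", "feed_type")


def unescape_content(s):
    # Reverse the content escaping: backslash-t, backslash-n and a
    # doubled backslash become TAB, newline and a single backslash.
    out = []
    i = 0
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s):
            nxt = s[i + 1]
            if nxt == "t":
                out.append("\t")
            elif nxt == "n":
                out.append("\n")
            elif nxt == "\\":
                out.append("\\")
            else:
                # Unknown escape: keep both characters unchanged.
                # Assumption: the man page does not define this case.
                out.append(c + nxt)
            i += 2
        else:
            out.append(c)
            i += 1
    return "".join(out)


def parse_line(line):
    # Split one TAB-separated line into a dict keyed by FIELDS.
    parts = line.rstrip("\n").split("\t")
    # Assumption: pad short lines so every key is always present.
    parts += [""] * (len(FIELDS) - len(parts))
    item = dict(zip(FIELDS, parts))
    item["content"] = unescape_content(item["content"])
    # The timestamp field is empty on parse failure.
    item["timestamp"] = int(item["timestamp"]) if item["timestamp"] else None
    return item
```

Because title, id and author never contain TABs or newlines and the content field has them escaped, splitting on TAB first and unescaping only the content field afterwards should be enough to recover each item.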