From 6b9a891452a00c176022a995334a33696d85303a Mon Sep 17 00:00:00 2001
From: Hiltjo Posthuma
Date: Sat, 21 May 2016 14:09:54 +0200
Subject: improve wording in documentation

link to sfeed(5) in README to avoid having to duplicate documentation text.
---
 README         | 35 +++++------------------------------
 sfeed.1        | 30 ++++++++++++++----------------
 sfeed.5        | 32 +++++++++++++++++---------------
 sfeed_frames.1 |  2 +-
 4 files changed, 37 insertions(+), 62 deletions(-)

diff --git a/README b/README
index 8a1b900..da8cddc 100644
--- a/README
+++ b/README
@@ -87,7 +87,7 @@ Platforms tested
 
 - Linux (glibc+gcc, musl-gcc, clang).
 - NetBSD
-- OpenBSD
+- OpenBSD: (gcc, pcc).
 - Windows (cygwin gcc, mingw).
 
 
@@ -132,36 +132,11 @@ feedname - TAB-separated format containing all items per feed. The
 feedname.new - Temporary file used by sfeed_update(1) to merge items.
 
 
-TAB-separated format fields
----------------------------
+File format
+-----------
 
-The items are saved in a TSV-like format.
-
-The fields: title, id, author are not allowed to have newlines and TABs, all
-whitespace characters are replaced by a space character. Control characters are
-removed.
-
-The content field can contain newlines and TABS and are escaped. TABs, newlines
-and '\' are escaped with '\', so it becomes: '\t', '\n' and '\\'. Other
-whitespace characters except space are removed. Control characters are removed.
-
-The order and format of the fields are:
-
-item UNIX timestamp - UNIX timestamp (UTC+0), empty on parse failure.
-item title - Title text, HTML in titles is treated as
- plain-text.
-item link - Absolute url, unsafe characters are encoded.
-item content - Newlines and TABs are escaped. Control characters
- are removed. See the "TAB-separated format fields"
- text.
-item contenttype - "html" or "plain".
-item id - RSS item GUID or Atom id.
-item author - Item author.
-
-CAVEATS:
-- if a timezone is not supported (non-RFC-822) the UNIX timestamp is
- interpreted as UTC+0.
-- HTML in titles is not supported on purpose.
+man 5 sfeed
+man 1 sfeed
 
 
 Usage and examples
diff --git a/sfeed.1 b/sfeed.1
index dc0d336..3784694 100644
--- a/sfeed.1
+++ b/sfeed.1
@@ -25,32 +25,30 @@ The content field can contain newlines and is escaped. TABs, newlines and '\\'
 are escaped with '\\', so it becomes: '\\t', '\\n' and '\\\\'. Other whitespace
 characters except space are removed. Control characters are removed.
 .Pp
-The order and format of the fields are:
+The order and content of the fields are:
 .Bl -tag -width 17n
-.It item timestamp
+.It timestamp
 UNIX timestamp in UTC+0, empty on parse failure.
-.It item title
-Title text, HTML in titles is treated as plain-text.
-.It item link
+.It title
+Title text, HTML code in titles is ignored and is treated as plain-text.
+.It link
 Absolute url, unsafe characters are encoded.
-.It item content
-Newlines and TABs are escaped. Control characters are removed. See the
-.Sx TAB-SEPARATED FORMAT FIELDS
-text.
-.It item content\-type
+.It content
+Content, can have plain-text or HTML code depending on the content\-type field.
+.It content\-type
 "html" or "plain".
-.It item id
+.It id
 RSS item GUID or Atom id.
-.It item author
+.It author
 Item author.
 .El
 .Sh SEE ALSO
 .Xr sfeed_plain 1 ,
-.Xr sfeed_update 1 ,
-.Xr sh 1
+.Xr sfeed 5
 .Sh AUTHORS
 .An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
 .Sh CAVEATS
-if a timezone is not supported (non-RFC-822) the UNIX timestamp is interpreted
-as UTC+0.
+If a timezone is not in the RFC-822 or RFC-3332 format it is not supported and
+the UNIX timestamp is interpreted as UTC+0.
+.Pp
 HTML in titles is treated as plain-text.
diff --git a/sfeed.5 b/sfeed.5
index 16b4bb1..17dc58a 100644
--- a/sfeed.5
+++ b/sfeed.5
@@ -11,6 +11,8 @@ reads RSS or Atom feed data (XML) from stdin.
 It writes the feed data in a TAB-separated format to stdout.
 .Sh TAB-SEPARATED FORMAT FIELDS
+The items are saved in a TSV-like format.
+.Pp
 The fields: title, id, author are not allowed to have newlines and TABs, all
 whitespace characters are replaced by a single space character. Control
 characters are removed.
@@ -19,30 +21,30 @@ The content field can contain newlines and is escaped. TABs, newlines and '\\'
 are escaped with '\\', so it becomes: '\\t', '\\n' and '\\\\'. Other whitespace
 characters except space are removed. Control characters are removed.
 .Pp
-The order and format of the fields are:
+The order and content of the fields are:
 .Bl -tag -width 17n
-.It item timestamp
+.It timestamp
 UNIX timestamp in UTC+0, empty on parse failure.
-.It item title
-Title text, HTML in titles is treated as plain-text.
-.It item link
+.It title
+Title text, HTML code in titles is ignored and is treated as plain-text.
+.It link
 Absolute url, unsafe characters are encoded.
-.It item content
-Newlines and TABs are escaped. Control characters are removed. See the
-.Sx TAB-SEPARATED FORMAT FIELDS
-text.
-.It item content\-type
+.It content
+Content, can have plain-text or HTML code depending on the content\-type field.
+.It content\-type
 "html" or "plain".
-.It item id
+.It id
 RSS item GUID or Atom id.
-.It item author
+.It author
 Item author.
 .El
 .Sh SEE ALSO
-.Xr sfeed 1
+.Xr sfeed 1 ,
+.Xr sfeed_plain 1
 .Sh AUTHORS
 .An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
 .Sh CAVEATS
-if a timezone is not supported (non-RFC-822) the UNIX timestamp is interpreted
-as UTC+0.
+If a timezone is not in the RFC-822 or RFC-3332 format it is not supported and
+the UNIX timestamp is interpreted as UTC+0.
+.Pp
 HTML in titles is treated as plain-text.
diff --git a/sfeed_frames.1 b/sfeed_frames.1
index 96118ad..b37866b 100644
--- a/sfeed_frames.1
+++ b/sfeed_frames.1
@@ -42,7 +42,7 @@ The maximum length of the path is PATH_MAX or filesystem-specific (truncated).
 .Sh AUTHORS
 .An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
 .Sh SECURITY CONSIDERATIONS
-Each item file contain the item content formatted as HTML, if the feed data
+Each item content file contains the content formatted as HTML, if the feed data
 contains HTML like Javascripts, tracking cookies, custom styles and such these
 will also be displayed. Due to the crazy nature of "the web" these things are
 complex to filter. Some security and privacy can be gained by using an
-- 
cgit v1.2.3
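
Note for readers of the format described in the sfeed(5) hunks above: the
following is a minimal sketch, not part of sfeed, of how one line of the
TAB-separated output might be read back. It assumes the field order given in
the patched man page (timestamp, title, link, content, content-type, id,
author) and the documented escapes '\t', '\n' and '\\' in the content field.
The names FIELDS, unescape and parse_line and the sample item are hypothetical,
and keeping unknown escape sequences verbatim is an assumption of this sketch.

```python
# Field order as listed in the patched sfeed(5) text.
FIELDS = ("timestamp", "title", "link", "content", "content_type", "id", "author")

# Escapes documented for the content field: \t, \n and \\.
ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}


def unescape(text):
    """Reverse the documented escaping of the content field."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            # Assumption: unknown sequences are kept as-is (backslash included).
            out.append(ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_line(line):
    """Split one TAB-separated item line into a dict keyed by field name."""
    parts = line.rstrip("\n").split("\t")
    parts += [""] * (len(FIELDS) - len(parts))  # tolerate short lines
    item = dict(zip(FIELDS, parts))
    item["content"] = unescape(item["content"])
    # The timestamp field is empty on parse failure, per the man page.
    item["timestamp"] = int(item["timestamp"]) if item["timestamp"] else None
    return item


if __name__ == "__main__":
    # Made-up sample item, for illustration only.
    sample = "1463832594\tHello\thttps://example.org/a\tone\\ntwo\thtml\tguid-1\tAlice\n"
    print(parse_line(sample))
```

Only the content field is unescaped here, since the man page states that
title, id and author never contain TABs or newlines.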