Age | Commit message | Author |
|
|
|
This works on sfeed(5) feed output since they are already sorted.
|
|
This URL printing behaviour was changed recently in commit
f305b032bc19b4e81c0dd6c0398370028ea910ca
|
|
Make it const char *.
|
|
|
|
This fixes a warning on Linux glibc:
/usr/include/features.h:187:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
187 | # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
| ^~~~~~~
Tested on Void GNU/Linux glibc with gcc. Also tested on various other platforms
for regressions, namely: OpenBSD, NetBSD, FreeBSD, Void GNU/Linux musl.
|
|
The "\s" escape sequence is non-POSIX and GNU awk gives a warning:
gawk: cmd. line:69: warning: escape sequence `\s' treated as plain `s'
BSD awk does not give this warning and supports it.
Use the POSIX [[:space:]] character class instead.
References:
- https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
The table in the section "Regular Expressions".
- https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05
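A minimal sketch of the portable form, assuming an awk that implements POSIX character classes. The helper name trim_ws is hypothetical and is not part of sfeed:

```shell
#!/bin/sh
# Hypothetical helper (not from sfeed): strip leading and trailing
# whitespace from each input line.
# Uses the POSIX [[:space:]] character class instead of the non-POSIX "\s".
trim_ws() {
	awk '{
		sub(/^[[:space:]]+/, "");
		sub(/[[:space:]]+$/, "");
		print;
	}'
}

printf '  hello world \t\n' | trim_ws
```

Writing the class out this way avoids relying on how a particular awk treats the "\s" escape.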
|
|
|
|
Most commonly used compilers (gcc, clang) optimize this away though.
|
|
|
|
These are BSD functions.
- HaikuOS now compiles without having to use libbsd.
- Tested on SerenityOS (for fun), which doesn't have these functions (yet).
With a small change to support wcwidth() sfeed works on SerenityOS.
|
|
|
|
- Do not show stderr of readlink.
- Show the reference to the example sfeedrc (like sfeed_update).
- Make the error message a bit shorter.
- Fix showing the path if it does not exist, for example:
$ sfeed_opml_export "a"
readlink: a: No such file or directory
Configuration file "" does not exist or is not readable.
Now shows:
$ sfeed_opml_export "a"
Configuration file "a" cannot be read.
See sfeedrc.example for an example.
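The behaviour described above could be sketched roughly as follows. The function name check_config is hypothetical and this is not the actual script code; it only illustrates hiding the readlink error and reporting the name that was passed in:

```shell
#!/bin/sh
# Hypothetical sketch (not the real sfeed_opml_export code): resolve a
# configuration path and print a short error if it cannot be read.
check_config() {
	config="$1"
	# Hide readlink's own error output; an empty result means it failed.
	path=$(readlink -f "${config}" 2>/dev/null)
	if [ -z "${path}" ] || [ ! -r "${path}" ]; then
		# Report the name the user passed, not the (possibly empty)
		# resolved path.
		printf 'Configuration file "%s" cannot be read.\n' "${config}" >&2
		printf 'See sfeedrc.example for an example.\n' >&2
		return 1
	fi
	printf '%s\n' "${path}"
}
```

Printing the original argument keeps the message useful even when path resolution itself fails.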
|
|
This title format now matches the one used by sfeed_curses. It shows the count
at the far left, which makes it more readable imho. It also works better when
the titlebar is small.
|
|
When sfeed_update was called without a parameter, it used the default path. If
this path did not exist it would incorrectly print:
Configuration file "" does not exist or is not readable.
See sfeedrc.example for an example.
Make the error message a bit shorter too.
This was a partial regression of commit df74ba274c4ea5d9b7388c33500ba601ed0c991d
|
|
|
|
|
|
|
|
Input to reproduce:
<entry>
<link href="https://codemadness.org/a" href="https://codemadness.org/b"/>
</entry>
Old value:
"https://codemadness.org/ahttps://codemadness.org/b"
New value:
"https://codemadness.org/b"
same with RSS <enclosure url="" />
|
|
This standard was a draft used around 2005-2006.
Instead of the fields "published" and "updated" it used "issued" (mandatory
field) and "modified" (optional). Add support for them, but still prefer the
Atom 1.0 fields and creation dates first.
I don't know of any real-life examples that still use this though.
Some references:
- http://rakaz.nl/2005/07/moving-from-atom-03-to-10.html
- https://www.dokuwiki.org/syndication (rss_type "atom" parameter value).
- https://support.google.com/merchants/answer/160598?hl=en
|
|
... if there is no content.
|
|
getchar_unlocked is part of POSIX and should be supported by most platforms. On
all tested platforms it has a performance benefit, sometimes smallish (<12%),
sometimes large (~40%).
|
|
Since version 2.22 (2020-12-21) newsboat stores the content MIME type of a
field, so allow exporting it.
The older entries are empty and will be exported as "html" (even though they
might have been plain-text).
... also add the (empty) category field.
|
|
|
|
|
|
Reference:
https://www.w3.org/2003/01/xhtml-mimetype/
|
|
This fix is very important *ahem*.
|
|
|
|
|
|
This is useful so the script can be included, call main and then have
additional post-main functionality.
|
|
Work around it by setting the empty "middle" fields to some value. The last
field can be empty.
Some feeds were using the wrong base URL if the `baseurl` field was empty but
the encoding field was set: the encoding field was used instead.
I only noticed now that some feeds were failing, because the base URL is
validated since commit f305b032bc19b4e81c0dd6c0398370028ea910ca and this
returns a non-zero exit status.
This doesn't happen with GNU xargs, busybox or toybox xargs.
Affected (at least): OpenBSD, NetBSD, FreeBSD and DragonFlyBSD xargs, which
share similar code.
Simple way to reproduce the difference:
printf 'a\0\0c\0' | xargs -0 echo
Prints "a c" on *BSD (the empty field is dropped).
Prints "a  c" on GNU xargs and some other implementations (the empty field is
kept, hence two spaces).
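A minimal sketch of the workaround idea, assuming a placeholder value of "-". The function name emit_fields and the placeholder are hypothetical, and for simplicity this fills every empty field, including the last one:

```shell
#!/bin/sh
# Hypothetical sketch (not the real sfeed_update code): emit NUL-separated
# fields for "xargs -0", replacing empty fields with a placeholder so that
# implementations which drop empty fields still see the same number of
# arguments. The placeholder "-" is an assumption.
emit_fields() {
	for field in "$@"; do
		printf '%s\0' "${field:--}"
	done
}

# Three fields with an empty middle one; tr makes the NUL bytes visible.
emit_fields "a" "" "c" | tr '\0' '\n'
```

With every field non-empty, the argument positions seen by the command no longer depend on the xargs implementation.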
|
|
Follow-up from a rushed commit:
commit 58555779d123be68c0acf9ea898931d656ec6d63
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Sun Feb 28 13:33:21 2021 +0100
sfeed_update: simplify, use feedurl directly
This also makes it possible to use URLs without an authority component as a
baseurl, like "magnet:" URLs.
|
|
No functional difference because the base URI host is copied beforehand.
|
|
The shellscript is optional, but reference it in the documentation.
|
|
This also makes it possible to use URLs without an authority component as a
baseurl, like "magnet:" URLs.
|
|
Removed or rewrote the functions absuri, parseuri and encodeuri() (for
percent-encoding).
The functions are now split up, each with its own purpose:
- uri_format: format struct uri into a string.
- uri_hasscheme: quick check if a string is absolute or not.
- uri_makeabs: make a URI absolute using a base uri and the original URI.
- uri_parse: parse a string into a struct uri.
The following URLs are better parsed:
- URLs with extra "/"'s in the path prepended are kept as is, no "/" is added
either for empty paths.
- URLs like "http://codemadness.org" are not changed to
"http://codemadness.org/" anymore (paths are kept as is, unless they are
non-empty and do not start with "/").
- Paths are not percent-encoded anymore.
- URLs with userinfo field (username, password) are parsed.
like: ftp://user:password@[2001:db8::7]:2121/rfc/rfc1808.txt
- URIs without an authority component, like mailto:some@email.org, magnet URIs
and ISBN URIs/urn like urn:isbn:0-395-36341-1, are allowed and parsed correctly.
- Both local (file:///) and non-local (file://) are supported.
- Specifying a base URL with a port will now only use it when the relative URL
has no host and port set and follows RFC3986 5.2.2 more closely.
- Parsing numeric port: parse as signed long and check <= 0, empty port is
allowed.
- Parsing URIs containing query, fragment, but no path separator (/) will now
parse the component properly.
For sfeed:
- Parse the baseURI only once (no need to do it every time for making absolute
URIs).
- If a link/enclosure is absolute already or if there is no base URL specified
then just print the link directly. There have also been other small performance
improvements related to handling URIs.
References:
- https://tools.ietf.org/html/rfc3986
- Section "5.2.2. Transform References" has also been helpful.
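As an illustration of the kind of quick check that uri_hasscheme is described as doing, here is a hedged shell sketch. It is not the C code from sfeed; the function name has_scheme is hypothetical, and the rule is the scheme syntax from RFC 3986 (a letter followed by letters, digits, "+", "-" or ".", then ":"):

```shell
#!/bin/sh
# Hypothetical sketch in the spirit of uri_hasscheme; not the sfeed C code.
# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":".
has_scheme() {
	case "$1" in
	[A-Za-z]*:*) ;;       # starts with a letter and contains a colon
	*) return 1 ;;
	esac
	scheme=${1%%:*}       # text before the first colon
	case "${scheme}" in
	*[!A-Za-z0-9+.-]*) return 1 ;;  # invalid character in the scheme
	esac
	return 0
}

for uri in "https://codemadness.org/" "mailto:some@email.org" \
    "urn:isbn:0-395-36341-1" "/relative/path" "a/b:c"; do
	if has_scheme "${uri}"; then
		printf 'absolute: %s\n' "${uri}"
	else
		printf 'relative: %s\n' "${uri}"
	fi
done
```

A link that passes such a check can be printed directly, while anything else would need to be resolved against the base URI.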
|
|
Combine E-Tags and If-Modified-Since in one section. Also mention the curl
--compressed option, typically used for gzip decompression.
Note that E-Tags were broken in curl <7.73 due to a bug with "weak" e-tags.
https://github.com/curl/curl/issues/5610
From a question/feedback by e-mail from Hadrien Lacour, thanks.
|
|
|
|
The commit that introduced the regression was:
commit 33c50db302957bca2a850ac8d0b960d05ee0520e
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Mon Oct 12 18:55:35 2020 +0200
simplify time parsing
Noticed on an RSS feed with the following date:
<pubDate>2021-02-03 05:13:03</pubDate>
This format is non-standard, but sfeed should support this.
A standard format would be (for Atom): 2021-02-03T05:13:03Z
Partially revert it.
|
|
Kind of a non-issue, but if there's an sfeedrc with no feeds then xargs will
still be executed and give an error. The xargs -r option (GNU extension) fixes
this:
From the OpenBSD xargs(1) man page:
"-r Do not run the command if there are no arguments. Normally the
command is executed at least once even if there are no arguments."
Reproducible with the sfeedrc:
feeds() {
true
}
|
|
|
|
|
|
This code uses the non-portable xargs -P option to more efficiently process
feeds in parallel.
|
|
This adds a main() function. When the environment variable
$SFEED_UPDATE_INCLUDE is set then it will not execute the main handler. The
other functions are included and can be reused. This is also useful for
unit-testing.
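The pattern could look roughly like the sketch below. The function bodies are placeholders, not the real sfeed_update code, and treating any non-empty value as "set" is an assumption:

```shell
#!/bin/sh
# Hypothetical sketch of the include-guard pattern; the function bodies are
# placeholders and not the real sfeed_update code.
log() {
	printf '[%s] %s\n' "$1" "$2"
}

main() {
	log "main" "updating feeds"
}

# Run main only when the script is executed directly. A script that sets
# SFEED_UPDATE_INCLUDE before sourcing this file only gets the functions.
# Assumption: any non-empty value counts as "set".
[ -n "${SFEED_UPDATE_INCLUDE:-}" ] || main "$@"
```

A wrapper script could then set the variable, source the file with ".", override or reuse individual functions, and call main itself when it is ready.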
|
|
handler
This is useful to be able to reuse the code (together with using sfeed_update
as an included script, coming in the next commit).
|
|
basesiteurl
Move it closer before it is used.
|
|
"(FAIL CONVERT)" -> "(FAIL PARSE)". Convert may be too similar to text encoding
conversion.
|
|
This can be useful for writing connector scripts more cleanly.
The input does not necessarily even have to be in the sfeed(5) format.
|
|
|
|
... and do not show stderr of readlink.
|