author | Hiltjo Posthuma <hiltjo@codemadness.org> | 2021-01-08 19:08:59 +0100 |
---|---|---|
committer | Hiltjo Posthuma <hiltjo@codemadness.org> | 2021-01-08 19:33:29 +0100 |
commit | 04b832539cd5b5392c56ef238ec9b42b689de3ae (patch) | |
tree | f934b2769023a800ee0befe7608bdf70642fd39f /sfeed_plain.1 | |
parent | c7e3ec5f37738c43b3918cba6977fa51631a23af (diff) |
util.c: printutf8pad(): improve padded printing and printing invalid unicode characters
This affects sfeed_plain.
- Use the Unicode replacement character (codepoint 0xfffd) when a codepoint is
  invalid and continue printing the rest of the characters.
- When a codepoint is invalid, reset the internal state of mbtowc(3). From the
  OpenBSD man page:
  "If a call to mbtowc() resulted in an undefined internal state, mbtowc()
  must be called with s set to NULL to reset the internal state before it
  can safely be used again."
- Optimize for the common ASCII case and use a macro to print the character
  instead of a wasteful fwrite() function call. With 250k lines (about 350MB)
  this improves printing performance from 1.7s to 1.0s on my laptop. On
  another system it improved by about 25%. Tested with clang and gcc; the
  worst case (non-ASCII) was also tested and showed no penalty.
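The three points above can be illustrated with a minimal sketch. This is not the actual printutf8pad() from util.c: the function name print_lossy() and its return value are hypothetical, and padding and column-width handling are left out. It only shows an ASCII fast path via putc(), the U+FFFD substitution, and the mbtowc(3) state reset, assuming the caller has already selected an LC_CTYPE locale.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not sfeed's printutf8pad(): write `s` to `fp` and
 * replace every undecodable byte with U+FFFD. Returns the number of bytes
 * that were replaced. Decoding follows the current LC_CTYPE locale. */
size_t
print_lossy(FILE *fp, const char *s)
{
	wchar_t wc;
	size_t i, slen, bad = 0;
	int rl;

	assert(fp != NULL && s != NULL);

	slen = strlen(s);
	for (i = 0; i < slen; ) {
		if ((unsigned char)s[i] < 0x80) {
			/* fast path: plain ASCII, putc() may be a macro */
			putc(s[i], fp);
			i++;
			continue;
		}
		rl = mbtowc(&wc, s + i, slen - i);
		if (rl < 1) {
			/* invalid sequence: reset the conversion state,
			 * emit U+FFFD encoded as UTF-8, resume at next byte */
			mbtowc(NULL, NULL, 0);
			fputs("\xef\xbf\xbd", fp);
			bad++;
			i++;
			continue;
		}
		/* valid multibyte character: copy its bytes unchanged */
		fwrite(s + i, 1, (size_t)rl, fp);
		i += (size_t)rl;
	}
	return bad;
}
```

A caller would typically run setlocale(LC_CTYPE, "") once at startup and then call print_lossy(stdout, field). Under a UTF-8 locale, the input "abc\xc3 def" is written as "abc", the replacement character, then " def", and the function returns 1.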
To test:
    printf '0\tabc\xc3 def' | sfeed_plain
Before:
    1970-01-01 01:00 abc
After:
    1970-01-01 01:00 abc� def