wcwidth() is a POSIX function that returns the number of terminal cells
that a wide character occupies. The glibc implementation cannot be used
because it depends on the current locale, and we _always_ need UTF-8
semantics no matter what the locale is.
This implementation is provided by Markus Kuhn and is equivalent to
xterm's behavior.
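
As a rough illustration of what a locale-independent width lookup looks
like, here is a minimal sketch. It is not the imported implementation: the
names (sketch_wcwidth, in_table) are hypothetical and the two range tables
are small samples chosen for the example; a complete table is far longer.

```c
#include <stddef.h>
#include <stdint.h>

/* Closed interval of code points [first, last]. */
struct interval {
	uint32_t first;
	uint32_t last;
};

/* Sample of zero-width ranges, sorted ascending (incomplete on purpose). */
static const struct interval zero_width[] = {
	{ 0x0300, 0x036F },	/* combining diacritical marks */
	{ 0x200B, 0x200F },	/* zero-width space, joiners, direction marks */
	{ 0xFE20, 0xFE23 },	/* combining half marks */
};

/* Sample of double-width ranges, sorted ascending (incomplete on purpose). */
static const struct interval wide[] = {
	{ 0x1100, 0x115F },	/* Hangul Jamo initial consonants */
	{ 0x4E00, 0x9FFF },	/* CJK unified ideographs */
	{ 0xAC00, 0xD7A3 },	/* Hangul syllables */
	{ 0xFF00, 0xFF60 },	/* fullwidth forms */
};

/* Binary search for ucs in a sorted table of non-overlapping intervals. */
static int in_table(uint32_t ucs, const struct interval *table, size_t len)
{
	size_t lo = 0, hi = len;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ucs > table[mid].last)
			lo = mid + 1;
		else if (ucs < table[mid].first)
			hi = mid;
		else
			return 1;
	}

	return 0;
}

/* Returns 0, 1 or 2 cells, or -1 for control characters. No locale used. */
int sketch_wcwidth(uint32_t ucs)
{
	if (ucs == 0)
		return 0;
	if (ucs < 32 || (ucs >= 0x7F && ucs < 0xA0))
		return -1;
	if (in_table(ucs, zero_width, sizeof(zero_width) / sizeof(*zero_width)))
		return 0;
	if (in_table(ucs, wide, sizeof(wide) / sizeof(*wide)))
		return 2;

	return 1;
}
```

Because the lookup only depends on the code point, the result is the same
regardless of what setlocale() was called with.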
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
We must never encode invalid UTF-8. Otherwise, we would rely on other
applications to do error recovery.
Note that this is not a syntactic change but a semantic fix, as the
Unicode standard defines several codepoints which are invalid or which
must never be used in UTF-8.
See the Unicode standard if you're interested in these codepoint ranges.
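
A minimal sketch of such a check, assuming an encoder that returns the
number of bytes written and 0 for rejected input. The function name is
hypothetical, and treating noncharacters as "never encode" is an
assumption made for this example rather than a statement about which
ranges the actual fix rejects.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one code point as UTF-8 into out (at least 4 bytes).
 * Returns the number of bytes written, or 0 if cp is rejected.
 */
size_t sketch_ucs4_to_utf8(uint32_t cp, char *out)
{
	/* UTF-16 surrogates are not Unicode scalar values */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	/* nothing exists beyond U+10FFFF */
	if (cp > 0x10FFFF)
		return 0;
	/*
	 * Noncharacters: U+FDD0..U+FDEF and the last two code points of
	 * every plane (assumption: this sketch refuses to emit them).
	 */
	if ((cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE)
		return 0;

	if (cp < 0x80) {
		out[0] = (char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (char)(0xC0 | (cp >> 6));
		out[1] = (char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (char)(0xE0 | (cp >> 12));
		out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (char)(0x80 | (cp & 0x3F));
		return 3;
	}

	out[0] = (char)(0xF0 | (cp >> 18));
	out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (char)(0x80 | (cp & 0x3F));
	return 4;
}
```

Returning 0 lets the caller decide whether to skip the codepoint or to
substitute a replacement character.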
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
The array type used to be from glib, which did that check automatically.
We now have to check explicitly that we do not access it out of bounds.
This fixes a nasty resizing-bug of TSM.
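
The kind of explicit check meant here can be sketched as follows. The
struct and accessor are hypothetical stand-ins, not the array type used in
the code base.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal array: a buffer plus its element count. */
struct sketch_array {
	uint32_t *data;
	size_t len;
};

/*
 * Return a pointer to element idx, or NULL if the array is missing or
 * idx is past the end. Callers must handle the NULL case.
 */
uint32_t *sketch_array_get(struct sketch_array *arr, size_t idx)
{
	if (!arr || idx >= arr->len)
		return NULL;

	return &arr->data[idx];
}
```

After a resize shrinks the array, a stale index then yields NULL instead
of reading or writing past the buffer.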
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
We created a new table every time, which led to huge memory leaks. We now
reuse the default table if it is available.
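
The intended behavior can be sketched like this, assuming that a NULL
table argument means "use the default table". All names and the struct
layout are hypothetical; this sketch is also not thread-safe.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a symbol table. */
struct sketch_table {
	unsigned int next_id;
};

/* Shared default table, created on first use only. */
static struct sketch_table *default_table;

static struct sketch_table *sketch_table_new(void)
{
	struct sketch_table *tbl = calloc(1, sizeof(*tbl));

	if (tbl)
		tbl->next_id = 1;
	return tbl;
}

/*
 * Map a caller-supplied table to the table that is actually used.
 * NULL selects the default table, which is allocated once and then
 * reused instead of being recreated on every call.
 * May return NULL if the first allocation fails.
 */
struct sketch_table *sketch_table_resolve(struct sketch_table *tbl)
{
	if (tbl)
		return tbl;

	if (!default_table)
		default_table = sketch_table_new();

	return default_table;
}
```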
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
This is no longer used. You should first retrieve the UCS4 string and then
use the UCS4 to U8 conversion helpers instead.
All users have already been converted so we can remove this helper safely.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
This small helper allocates a string big enough to hold the whole u8
string. This should be used for short and temporary strings only! It
allocates way too much memory for bigger or long-living strings.
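
A minimal sketch of the worst-case sizing idea: every codepoint needs at
most 4 bytes in UTF-8, so 4 * len bytes always suffice. The function
names, the NUL terminator and the len_out parameter are assumptions made
for this example, and the tiny encoder does no validation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimal encoder for this sketch; assumes cp is a valid scalar value. */
static size_t encode_one(uint32_t cp, char *out)
{
	if (cp < 0x80) {
		out[0] = (char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (char)(0xC0 | (cp >> 6));
		out[1] = (char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (char)(0xE0 | (cp >> 12));
		out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (char)(0x80 | (cp & 0x3F));
		return 3;
	}
	out[0] = (char)(0xF0 | (cp >> 18));
	out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (char)(0x80 | (cp & 0x3F));
	return 4;
}

/*
 * Convert len codepoints into a newly allocated, NUL-terminated UTF-8
 * string. Returns NULL on allocation failure; the caller frees the result.
 * If len_out is non-NULL it receives the byte length without the NUL.
 */
char *sketch_ucs4_to_utf8_alloc(const uint32_t *ucs4, size_t len,
				size_t *len_out)
{
	char *buf;
	size_t i, pos = 0;

	/* worst case is 4 bytes per codepoint, plus 1 for the NUL */
	if (len > (SIZE_MAX - 1) / 4)
		return NULL;

	buf = malloc(len * 4 + 1);
	if (!buf)
		return NULL;

	for (i = 0; i < len; ++i)
		pos += encode_one(ucs4[i], &buf[pos]);
	buf[pos] = 0;

	if (len_out)
		*len_out = pos;
	return buf;
}
```

For pure ASCII input this reserves four times the memory that is actually
needed, which is why the helper only suits short-lived strings.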
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
We never checked the memory helpers for errors because they used to be
from glib. However, with our own helpers we need to check for errors to be
sure.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Add three helpers to create and manage symbol tables. Also fix the
internal default table to use them.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
This helper function may be useful to other external code and allows us to
always return UCS4 strings. Other code can then use this helper to convert
it into UTF8.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
We should avoid any global state in shared libraries. As the TSM code is
becoming a shared library, we definitely need contexts for symbol tables.
However, we don't want to fix up all code now, so passing NULL as the
table selects a default table instead.
This can be fixed later but is OK for now.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
The logging layer is not a dependency of TSM so we cannot use it. It is
also not needed anymore, as the unicode layer is working pretty well.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>