14 Commits

Author SHA1 Message Date
David Herrmann
bc40e1ae53 tsm: unicode: add wcwidth() implementation
wcwidth() is a POSIX function that returns the number of cells that a
wide character occupies. The glibc function cannot be used as it depends
on the locale, and we _always_ need UTF8 no matter what the locale is.

This implementation is provided by Markus Kuhn and is equivalent to
xterm's behavior.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-12-10 15:36:04 +01:00
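
To illustrate the locale-independent lookup described in the wcwidth() commit above, here is a minimal sketch of a width function that relies only on codepoint ranges. It is not the code from this commit: the names `cell_width` and `in_table` are hypothetical, and both interval tables are deliberately tiny subsets chosen for illustration, so most zero-width and wide ranges are missing.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive codepoint range. */
struct interval {
	uint32_t first;
	uint32_t last;
};

/* Binary search in a sorted, non-overlapping interval table. */
static int in_table(uint32_t ucs, const struct interval *t, size_t n)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ucs > t[mid].last)
			lo = mid + 1;
		else if (ucs < t[mid].first)
			hi = mid;
		else
			return 1;
	}

	return 0;
}

/*
 * Hypothetical helper: number of cells for one codepoint, independent of
 * the process locale. Returns -1 for control characters.
 */
static int cell_width(uint32_t ucs)
{
	/* Illustrative subset only: combining diacritical marks. */
	static const struct interval zero[] = {
		{ 0x0300, 0x036F },
	};
	/* Illustrative subset only: Hangul Jamo, Hangul syllables,
	 * fullwidth forms. */
	static const struct interval wide[] = {
		{ 0x1100, 0x115F },
		{ 0xAC00, 0xD7A3 },
		{ 0xFF00, 0xFF60 },
	};

	if (ucs == 0)
		return 0;
	if (ucs < 0x20 || (ucs >= 0x7F && ucs < 0xA0))
		return -1;
	if (in_table(ucs, zero, sizeof(zero) / sizeof(zero[0])))
		return 0;
	if (in_table(ucs, wide, sizeof(wide) / sizeof(wide[0])))
		return 2;

	return 1;
}
```

Because the result depends only on the tables, a call such as `cell_width(0xAC00)` gives the same answer under any locale setting, which is the property the commit message asks for.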
David Herrmann
17a56a24f2 tsm: unicode: do not encode invalid UTF8
We must under all conditions avoid encoding invalid UTF8. Otherwise, we
would rely on other applications to do error-recovery.
Unfortunately, this is no syntactical change but a semantic fix as the
Unicode standard defines several codepoints which are invalid or which
must never be used in UTF8.
See the Unicode standard if you're interested in these codepoint ranges.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-30 17:59:36 +02:00
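
The idea in the commit above, refusing to emit bytes for codepoints that have no valid UTF8 form, can be sketched as follows. This is an assumption-laden illustration, not the TSM function: `encode_utf8` is a hypothetical name, and it rejects only UTF-16 surrogates and values above U+10FFFF. The commit message does not list which ranges the real code rejects, so the actual set may be larger.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one UCS4 codepoint as UTF-8.
 * Returns the number of bytes written (1 to 4), or 0 if the codepoint
 * is rejected and nothing was written.
 */
static size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
	/* UTF-16 surrogates are not Unicode scalar values. */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	/* Values beyond the Unicode codespace. */
	if (cp > 0x10FFFF)
		return 0;

	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}

	out[0] = (unsigned char)(0xF0 | (cp >> 18));
	out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (unsigned char)(0x80 | (cp & 0x3F));
	return 4;
}
```

Returning 0 lets the caller skip or replace the codepoint itself instead of passing malformed bytes on to other applications.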
David Herrmann
fd3816ec4d tsm: unicode: fix accessing symbol-array out of bounds
The array type used to be from glib which did that check automatically. We
now have to check explicitly that we do not access it out-of-bounds.

This fixes a nasty resizing-bug of TSM.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-27 12:13:18 +02:00
David Herrmann
090ead3bd6 tsm: unicode: fix not recreating the default-table all the time
We actually created a new table every time, which led to huge memory
leaks. We now use the default table if it is available.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-27 12:11:52 +02:00
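
The pattern behind the fix above, creating the default table once and reusing it afterwards, might look roughly like this. All names here (`struct sym_table`, `default_table`, `resolve_table`) are hypothetical stand-ins and not the TSM identifiers. The sketch also assumes single-threaded use, as it does no locking.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a symbol table; the real layout differs. */
struct sym_table {
	size_t next_id;
};

/* Lazily created fallback used when the caller passes no table. */
static struct sym_table *default_table;

/*
 * Return the table to operate on: the caller's table if given,
 * otherwise the shared default, which is allocated at most once.
 * Returns NULL only if that one-time allocation fails.
 */
static struct sym_table *resolve_table(struct sym_table *tbl)
{
	if (tbl)
		return tbl;

	if (!default_table)
		default_table = calloc(1, sizeof(*default_table));

	return default_table;
}
```

The leak described in the commit corresponds to allocating a fresh table on every call instead of checking whether one already exists.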
David Herrmann
f919b3ad81 tsm: fix header comments
Remove "kmscon" from any TSM headers and replace it by "tsm".

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 14:50:02 +02:00
David Herrmann
32fec77420 tsm: unicode: remove tsm_symbol_get_u8()
This is no longer used. You should first retrieve the UCS4 string and then
use the UCS4 to U8 conversion helpers instead.

All users have already been converted so we can remove this helper safely.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 14:12:28 +02:00
David Herrmann
57928f5df0 tsm: unicode: add helper for ucs4->u8 string conversions
This small helper allocates a string big enough to hold the whole u8
string. This should be used for short and temporary strings only! It
allocates way too much memory for bigger or long-lived strings.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 13:41:17 +02:00
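
The trade-off mentioned in the commit above (one worst-case allocation instead of a measuring pass) can be sketched like this. It is an illustration under assumptions, not the helper added by the commit: `ucs4_to_utf8_alloc` and `put_utf8` are hypothetical names, the buffer is sized at 4 bytes per codepoint plus a terminator, and codepoints with no valid encoding are silently skipped.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical encoder: writes 1 to 4 bytes, or returns 0 for
 * surrogates and values above U+10FFFF. */
static size_t put_utf8(uint32_t cp, char *out)
{
	if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
		return 0;
	if (cp < 0x80) {
		out[0] = (char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (char)(0xC0 | (cp >> 6));
		out[1] = (char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (char)(0xE0 | (cp >> 12));
		out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (char)(0x80 | (cp & 0x3F));
		return 3;
	}
	out[0] = (char)(0xF0 | (cp >> 18));
	out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
	out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
	out[3] = (char)(0x80 | (cp & 0x3F));
	return 4;
}

/*
 * Hypothetical helper: convert len UCS4 codepoints into a newly
 * allocated, zero-terminated UTF-8 string. The caller frees the result.
 * The buffer always holds 4 * len + 1 bytes, even for pure ASCII input.
 */
static char *ucs4_to_utf8_alloc(const uint32_t *ucs4, size_t len,
				size_t *len_out)
{
	char *buf;
	size_t i, pos = 0;

	/* Guard the size computation against overflow. */
	if (len > (SIZE_MAX - 1) / 4)
		return NULL;

	buf = malloc(4 * len + 1);
	if (!buf)
		return NULL;

	for (i = 0; i < len; ++i)
		pos += put_utf8(ucs4[i], buf + pos);

	buf[pos] = '\0';
	if (len_out)
		*len_out = pos;

	return buf;
}
```

For ASCII-only input this reserves roughly four times the memory actually used, which is why such a helper suits short, temporary strings only.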
David Herrmann
c15604b6b7 tsm: unicode: fix error-path in tsm_symbol_append()
We never checked the memory helpers for errors because they used to be
from glib. However, with our own helpers we need to check for errors to be
sure.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 12:17:41 +02:00
David Herrmann
372948af53 tsm: unicode: add symbol-table helpers
Add three helpers to create and manage symbol-tables. Also fix internal
default-table to use them.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 12:03:17 +02:00
David Herrmann
da51fe1589 tsm: unicode: merge tsm_symbol_get() into table__get()
We do not do locking in tsm anymore so we can remove the table__get()
helper.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 11:48:02 +02:00
David Herrmann
4af046d39d tsm: unicode: export tsm_ucs4_to_utf8()
This helper function may be useful to other external code and allows us to
always return UCS4 strings. Other code can then use this helper to convert
it into UTF8.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 11:39:08 +02:00
David Herrmann
d5a0c9644c tsm: unicode: add symbol-table contexts
We should avoid any global state in shared libraries. As the TSM code is
becoming a shared library, we definitely need contexts for symbol tables.
However, we don't want to fix up all code now, so passing NULL as the
table selects a default table instead.

This can be fixed later but is ok for now.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 11:30:38 +02:00
David Herrmann
52d2eb09b1 tsm: unicode: remove log_* calls
The logging-layer is not a dependency of TSM so we cannot use it. It is
also not needed anymore, as the unicode-layer is working pretty well.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 11:04:50 +02:00
David Herrmann
cca90781c0 tsm: move unicode.[ch] to tsm_unicode.[ch]
All TSM-related files will get the tsm_* prefix, so move the unicode
headers and sources.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-09-18 10:54:06 +02:00