[olug] OT? C programming question.

Luke-Jr luke at dashjr.org
Thu Sep 2 06:00:37 UTC 2010


On Wednesday, September 01, 2010 04:37:51 pm Rob Townley wrote:
> Since this may be an international application, he may want to verify
> that he is not outputting unicode characters to something expecting
> single byte characters.

C's char strings aren't Unicode-aware; they are just byte arrays. The language 
won't ever do any kind of magic UTF-8 encoding or decoding on them.
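
To make the "just byte arrays" point concrete, here is a minimal sketch. The 
function name 'utf8_code_points' is my own invention, and it assumes the input 
is already valid UTF-8. It counts characters by skipping continuation bytes, 
which is something you have to do yourself because strlen() only counts bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* strlen, used for comparison in the note below */

/* Count code points in a UTF-8 string by skipping continuation bytes
   (bytes of the form 10xxxxxx). Assumes s is valid UTF-8; hypothetical
   helper, not part of any standard library. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For "caf\xC3\xA9" (the word "café" encoded as UTF-8), strlen() returns 5 while 
utf8_code_points() returns 4.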

Trivia: UTF-8 is not limited to 8-bit and 16-bit sequences. Many characters 
take up 24 bits (three bytes), and some take 32 bits (four bytes).
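
As a rough illustration of those sizes, the sketch below returns how many 
bytes UTF-8 needs for a given code point, following the ranges in RFC 3629. 
The name 'utf8_encoded_length' is hypothetical, and for brevity it does not 
reject the surrogate range U+D800..U+DFFF.

```c
#include <assert.h>

/* Number of bytes UTF-8 uses for code point cp, or 0 if cp is beyond
   U+10FFFF. Hypothetical helper; surrogates are not rejected here. */
int utf8_encoded_length(unsigned long cp)
{
    if (cp < 0x80UL)
        return 1;   /* 7 bits: plain ASCII */
    if (cp < 0x800UL)
        return 2;   /* up to 11 bits */
    if (cp < 0x10000UL)
        return 3;   /* up to 16 bits */
    if (cp <= 0x10FFFFUL)
        return 4;   /* up to 21 bits */
    return 0;       /* not a valid Unicode code point */
}
```

So 'A' (U+0041) takes 1 byte, U+00E9 takes 2, the euro sign U+20AC takes 3, 
and U+1F600 takes 4.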

BUT UTF-8 *is* 100% backward compatible with ASCII. Any ASCII string is valid 
in UTF-8 with the same characters. Including tilde, of course. Also, there is 
no such thing as "8-bit ASCII". ASCII is strictly 7-bit.
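
If the concern is whether output is safe for something that expects plain 
ASCII, the check is simply that no byte has its high bit set. A minimal 
sketch, with 'is_pure_ascii' being a made-up name:

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if every byte of s is 7-bit ASCII (0x00..0x7F), else 0.
   Hypothetical helper for illustration. */
int is_pure_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80)
            return 0;
    }
    return 1;
}
```

A string that passes this check is byte-for-byte identical in ASCII and in 
UTF-8, tilde included.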

The information returned by 'nl_langinfo' is entirely irrelevant unless your C 
program explicitly queries the locale (for example, by calling it or the 
related 'localeconv' function).
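
For completeness, this is roughly what "explicitly" looks like on a POSIX 
system. It is only a sketch: 'environment_codeset' is a name I made up, and 
the string you get back depends on the platform and on the user's 
environment.

```c
#include <assert.h>
#include <langinfo.h>   /* POSIX: nl_langinfo, CODESET */
#include <locale.h>
#include <stddef.h>

/* Return the character encoding name of the user's locale.
   Hypothetical helper; requires a POSIX system. */
const char *environment_codeset(void)
{
    const char *name;

    /* Adopt the locale named by the environment. Without this call
       (or if it fails) the program stays in the default "C" locale. */
    setlocale(LC_ALL, "");

    name = nl_langinfo(CODESET);
    assert(name != NULL);
    return name;
}
```

A program that never makes calls like these is not affected by the locale's 
codeset at all; it just passes bytes through.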


