mirror of
https://github.com/Gator96100/ProxSpace.git
synced 2025-01-09 20:33:34 -08:00
132 lines
6.4 KiB
HTML
132 lines
6.4 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html401/loose.dtd">
|
|
<html>
|
|
<!-- Created on October, 16 2022 by texi2html 1.78a -->
|
|
<!--
|
|
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
|
|
Karl Berry <karl@freefriends.org>
|
|
Olaf Bachmann <obachman@mathematik.uni-kl.de>
|
|
and many others.
|
|
Maintained by: Many creative people.
|
|
Send bugs and suggestions to <texi2html-bug@nongnu.org>
|
|
|
|
-->
|
|
<head>
|
|
<title>GNU libunistring: B. The char32_t problem</title>
|
|
|
|
<meta name="description" content="GNU libunistring: B. The char32_t problem">
|
|
<meta name="keywords" content="GNU libunistring: B. The char32_t problem">
|
|
<meta name="resource-type" content="document">
|
|
<meta name="distribution" content="global">
|
|
<meta name="Generator" content="texi2html 1.78a">
|
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
<style type="text/css">
|
|
<!--
|
|
a.summary-letter {text-decoration: none}
|
|
pre.display {font-family: serif}
|
|
pre.format {font-family: serif}
|
|
pre.menu-comment {font-family: serif}
|
|
pre.menu-preformatted {font-family: serif}
|
|
pre.smalldisplay {font-family: serif; font-size: smaller}
|
|
pre.smallexample {font-size: smaller}
|
|
pre.smallformat {font-family: serif; font-size: smaller}
|
|
pre.smalllisp {font-size: smaller}
|
|
span.roman {font-family:serif; font-weight:normal;}
|
|
span.sansserif {font-family:sans-serif; font-weight:normal;}
|
|
ul.toc {list-style: none}
|
|
-->
|
|
</style>
|
|
|
|
|
|
</head>
|
|
|
|
<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
|
|
|
|
<table cellpadding="1" cellspacing="1" border="0">
|
|
<tr><td valign="middle" align="left">[<a href="libunistring_18.html#SEC81" title="Beginning of this chapter or previous chapter"> << </a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_20.html#SEC83" title="Next chapter"> >> </a>]</td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_21.html#SEC92" title="Index">Index</a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_abt.html#SEC_About" title="About (help)"> ? </a>]</td>
|
|
</tr></table>
|
|
|
|
<hr size="2">
|
|
<a name="The-char32_005ft-problem"></a>
|
|
<a name="SEC82"></a>
|
|
<h1 class="appendix"> <a href="libunistring_toc.html#TOC82">B. The <code>char32_t</code> problem</a> </h1>
|
|
|
|
<p>In response to the <code>wchar_t</code> mess described in the previous section,
|
|
ISO C 11 introduces two new types: <code>char32_t</code> and <code>char16_t</code>.
|
|
</p>
|
|
<p><code>char32_t</code> is a type like <code>wchar_t</code>, with the added guarantee that it
|
|
is 32 bits wide. So, it is a type that is appropriate for encoding a Unicode
|
|
character. It is meant to resolve the problems of the 16-bit wide
|
|
<code>wchar_t</code> on AIX and Windows platforms, and allow a saner programming model
|
|
for wide character strings across all platforms.
|
|
</p>
|
|
<p><code>char16_t</code> is a type like <code>wchar_t</code>, with the added guarantee that it
|
|
is 16 bits wide. It is meant to allow porting programs that use the broken wide
|
|
character strings programming model from Windows to all platforms. Of course,
|
|
no one needs this.
|
|
</p>
|
|
<p>These types are accompanied with a syntax for defining wide string literals with
|
|
these element types: <code>u"..."</code> and <code>U"..."</code>.
|
|
</p>
|
|
<p>So far, so good. What the ISO C designers forgot, is to provide standardized C
|
|
library functions that operate on these wide character strings. They
|
|
standardized only the most basic functions, <code>mbrtoc32</code> and <code>c32rtomb</code>,
|
|
which are analogous to <code>mbrtowc</code> and <code>wcrtomb</code>, respectively. For the
|
|
rest, GNU gnulib <a href="https://www.gnu.org/software/gnulib/">https://www.gnu.org/software/gnulib/</a> provides the
|
|
functions:
|
|
</p><ul>
|
|
<li>
|
|
Functions for converting an entire string: <code>mbstoc32s</code> – like
|
|
<code>mbstowcs</code>, <code>c32stombs</code> – like <code>wcstombs</code>.
|
|
</li><li>
|
|
Functions for testing the properties of a 32-bit wide character:
|
|
<code>c32isalnum</code>, <code>c32isalpha</code>, etc. – like <code>iswalnum</code>,
|
|
<code>iswalpha</code>, etc.
|
|
</li></ul>
|
|
|
|
<p>Still, this API has two problems:
|
|
</p><ul>
|
|
<li>
|
|
The <code>char32_t</code> encoding is locale dependent and undocumented. This means,
|
|
if you want to know any property of a <code>char32_t</code> character, other than the
|
|
properties defined by <code><wctype.h></code> – such as whether it's a dash, currency
|
|
symbol, paragraph separator, or similar –, you have to convert it to
|
|
<code>char *</code> encoding first, by use of the function <code>c32tomb</code>.
|
|
</li><li>
|
|
Even on platforms where <code>wchar_t</code> is 32 bits wide, the <code>char32_t</code>
|
|
encoding may be different from the <code>wchar_t</code> encoding.
|
|
</li></ul>
|
|
|
|
<hr size="6">
|
|
<table cellpadding="1" cellspacing="1" border="0">
|
|
<tr><td valign="middle" align="left">[<a href="libunistring_18.html#SEC81" title="Beginning of this chapter or previous chapter"> << </a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_20.html#SEC83" title="Next chapter"> >> </a>]</td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left"> </td>
|
|
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_21.html#SEC92" title="Index">Index</a>]</td>
|
|
<td valign="middle" align="left">[<a href="libunistring_abt.html#SEC_About" title="About (help)"> ? </a>]</td>
|
|
</tr></table>
|
|
<p>
|
|
<font size="-1">
|
|
This document was generated by <em>Bruno Haible</em> on <em>October, 16 2022</em> using <a href="https://www.nongnu.org/texi2html/"><em>texi2html 1.78a</em></a>.
|
|
</font>
|
|
<br>
|
|
|
|
</p>
|
|
</body>
|
|
</html>
|