I am building a string library that supports both ASCII and UTF-8.
I created two typedefs, t_ascii and t_utf8. ASCII is safe to read as UTF-8, but UTF-8 is not safe to read as ASCII.
Is there any way to get a warning on an implicit conversion from t_utf8 to t_ascii, but not on an implicit conversion from t_ascii to t_utf8?

Ideally, I would like the following warnings to be issued (and only these):

#include <stdint.h>

typedef char           t_ascii;
typedef uint_least8_t  t_utf8;

int main()
{
    t_ascii const* asciistr = "Hello world"; // Ok
    t_utf8 const*   utf8str = "你好世界";    // Ok

    asciistr = utf8str; // Warning: utf8 to ascii is not safe
    utf8str = asciistr; // Ok: ascii to utf8 is safe

    t_ascii asciichar = 'A';
    t_utf8   utf8char = 'B';

    asciichar = utf8char; // Warning: utf8 to ascii is not safe
    utf8char = asciichar; // Ok: ascii to utf8 is safe
}

Currently, when building with -Wall (and even with -funsigned-char), I get the following warnings:

gcc main.c -Wall -Wextra                          
main.c: In function ‘main’:
main.c:10:35: warning: pointer targets in initialization of ‘const t_utf8 *’ {aka ‘const unsigned char *’} from ‘char *’ differ in signedness [-Wpointer-sign]
   10 |         t_utf8 const*   utf8str = "你好世界";    // Ok
      |                                   ^~~~~~~~~~
main.c:12:18: warning: pointer targets in assignment from ‘const t_utf8 *’ {aka ‘const unsigned char *’} to ‘const t_ascii *’ {aka ‘const char *’} differ in signedness [-Wpointer-sign]
   12 |         asciistr = utf8str; // Warning: utf8 to ascii is not safe
      |                  ^
main.c:16:17: warning: pointer targets in assignment from ‘const t_ascii *’ {aka ‘const char *’} to ‘const t_utf8 *’ {aka ‘const unsigned char *’} differ in signedness [-Wpointer-sign]
   16 |         utf8str = asciistr; // Ok: ascii to utf8 is safe
      |                 ^

Recommended answer

Compile with -Wall. Always compile with -Wall.

<user>@squall:~/src/p1$ gcc -Wall -c test2.c
test2.c: In function ‘main’:
test2.c:9:31: warning: pointer targets in initialization of ‘const t_utf8 *’ {aka ‘const signed char *’} from ‘char *’ differ in signedness [-Wpointer-sign]
    9 |     t_utf8  const*  utf8str = "你好世界";
      |                               ^~~~~~~~~~~~~~
test2.c:11:13: warning: pointer targets in assignment from ‘const t_ascii *’ {aka ‘const char *’} to ‘const t_utf8 *’ {aka ‘const signed char *’} differ in signedness [-Wpointer-sign]
   11 |     utf8str = asciistr; // Ok: ascii to utf8 is safe
      |             ^
test2.c:12:14: warning: pointer targets in assignment from ‘const t_utf8 *’ {aka ‘const signed char *’} to ‘const t_ascii *’ {aka ‘const char *’} differ in signedness [-Wpointer-sign]
   12 |     asciistr = utf8str; // Should issue warning: utf8 to ascii is not safe
      |              ^

You want the conversion from t_ascii to t_utf8 to be safe, but it is not: the signedness differs.

The warning is not saying that valid UTF-8 is sometimes not valid ASCII; the compiler knows nothing about that. The warning is about signedness.

If you want an unsigned char, compile with -funsigned-char. But then neither warning will be issued.

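(By the way, if you think the type int_least8_t can hold a multibyte character, that is, the complete UTF-8 encoding of a code point, it cannot. Within one compilation unit, every int_least8_t, and therefore every t_utf8, has exactly the same size.)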
