C: OpenMP offloading error when linking with gcc for nvptx-none: unresolved symbol _fputwc_r

Posted on April 7

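I am trying to compile a simple test problem for NVIDIA GPUs using OpenMP offloading. I am using GCC with the nvptx-none target. I installed the gcc+nvptx package with Spack (I also compiled GCC 13 with nvptx-tools myself; the result is the same). During linking, I get the following error: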

unresolved symbol _fputwc_r
collect2: error: ld returned 1 exit status
mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /path/to/spack/opt/spack/linux-centos8-x86_64_v3/gcc-13.0.0/gcc-12.2.0-6olbpwbs53cquwnpsvrmuxprmaofwjtk/libexec/gcc/x86_64-pc-linux-gnu/12.2.0//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed

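Compiling with -fno-stack-protector, as suggested in other posts, does not alleviate the problem. -fno-lto makes the link succeed, but then offloading no longer works. Different optimization flags make no difference.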

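The ld being used appears to be the system installation. The Spack installation provides another ld in spack/linux-centos8-x86_64_v3/gcc-13.0.0/gcc-12.2.0-6olbpwbs53cquwnpsvrmuxprmaofwjtk/nvptx-none, but Spack does not normally add it to the PATH. I assume there is a good reason for that, because including it leads to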

as: unrecognized option '--64'
nvptx-as: missing .version directive at start of file '/tmp/cc9YfveM.s'

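Is this a linker problem, or something else? The problem occurs only when a parallel for loop is actually included; a bare #pragma omp target does not trigger it. The device itself is recognized, and according to OpenMP the code inside this pragma runs on the device, as long as it contains no parallel region. Adding a parallel region produces the error above.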

Additional information: the system is Rocky Linux release 8.7 (Green Obsidian). The test program I am running is based on an OpenMP test program. Its full code is:

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
void saxpy(float a, float* x, float* y, int sz) {
#pragma omp target teams distribute parallel for simd \
   num_teams(3) map(to:x[0:sz]) map(tofrom:y[0:sz])
   for (int i = 0; i < sz; i++) {
      if (omp_is_initial_device()) {
         printf("Running on host\n");    
      } else {
         int nthreads= omp_get_num_threads();
         int nteams= omp_get_num_teams(); 
         printf("Running on device with %d teams (fixed) in total and %d threads in each team\n",nteams,nthreads);
      }
      fprintf(stdout, "Thread %d %i\n", omp_get_thread_num(), i );
      y[i] = a * x[i] + y[i];
   }
}
int main(int argc, char** argv) {
   float a = 2.0;
   int sz = 16;
   float *x = calloc( sz, sizeof *x );
   float *y = calloc( sz, sizeof *y );
   //Set values
   int num_devices = omp_get_num_devices();
   printf("Number of available devices %d\n", num_devices);
   saxpy( a, x, y, sz );
   return 0;
}

I tried compiling it with the following command

gcc -O0 -fopenmp -foffload=nvptx-none -o mintest mintest.c

or with the flags mentioned above.
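The name _fputwc_r looks like one of the reentrant stdio helpers from newlib, the C library that GCC uses for nvptx-none, so one plausible but unconfirmed suspect is the printf/fprintf calls inside the offloaded loop. A minimal sketch for testing that assumption is a variant of the kernel with no stdio calls in the target region. The function name saxpy_no_stdio is hypothetical and not part of the original program; results would be printed on the host after the call returns.

```c
#include <stdio.h>

/* Hypothetical variant of saxpy() from the test program: same pragma,
 * but no printf/fprintf inside the offloaded loop, so that no device-side
 * stdio symbols should be needed at link time (assumption to be tested). */
void saxpy_no_stdio(float a, const float *x, float *y, int sz) {
#pragma omp target teams distribute parallel for simd \
    num_teams(3) map(to:x[0:sz]) map(tofrom:y[0:sz])
    for (int i = 0; i < sz; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Host-side helper: print the results after the target region has finished. */
void print_result(const float *y, int sz) {
    for (int i = 0; i < sz; i++) {
        printf("y[%d] = %f\n", i, y[i]);
    }
}
```

If this variant links with the same gcc command line while the original does not, that would point at device-side stdio rather than at the linker itself. If it still fails, the cause lies elsewhere.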

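#include <stdio.h>
#include <omp.h>

/* Report whether a target region actually executes on the device. */
void test_on_gpu(void) {
   int on_device = 0;
#pragma omp target teams map(from:on_device)
   {
#pragma omp parallel
      {
#pragma omp master
         {
            /* Only the master thread of team 0 writes the flag. */
            if (0 == omp_get_team_num()) {
               on_device = !omp_is_initial_device();
            }
         }
      }
   }
   printf("on GPU: %s\n", on_device ? "yes" : "no");
}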
