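As part of an operating systems course, I am practicing multithreaded programming and wrote a very basic program that computes the sum of the integers from 1 to n. However, it runs faster on my laptop (Intel i5-1035G7) than on my desktop (Ryzen 5 5600G), which should be considerably more powerful. The serial version (simply adding 1 through n without any threads) does run faster on the desktop, so I am really confused about why the multithreaded version does not. Can anyone explain why? Any help would be greatly appreciated.
Here is the basic structure of my program: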
struct threadData
{
    unsigned long long *tempResult;
    long long start;
    long long end;
};

void *calculateTotal(void *arg)
{
    struct threadData *structPtr = (struct threadData *)arg;
    for (long long i = structPtr->start; i <= structPtr->end; i++)
    {
        *(structPtr->tempResult) += i;
    }
    pthread_exit(0);
}

int main(int argc, char *argv[])
{
    ...
    int numThreads = atoi(argv[1]);
    long long input = atoll(argv[2]);
    struct threadData threadDataArray[numThreads];
    unsigned long long result = 0;
    unsigned long long threadResult[numThreads];
    ...
    pthread_t threads[numThreads];
    for (int i = 0; i < numThreads; i++)
    {
        threadDataArray[i].start = i * input / (numThreads) + 1;
        threadDataArray[i].end = (i + 1) * input / (numThreads);
        threadDataArray[i].tempResult = &threadResult[i];
        int threadCreate = pthread_create(&threads[i], NULL, calculateTotal, (void *)&threadDataArray[i]);
        ...
    }
    // wait for all threads to join
    for (int i = 0; i < numThreads; i++)
    {
        int threadJoin = pthread_join(threads[i], NULL);
        ...
    }
    // calculate total
    for (int i = 0; i < numThreads; i++)
    {
        result += threadResult[i];
    }
    printf("(running in parallel)\nsum is: %llu\n", result);
}
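For comparison, here is a minimal sketch of a variant that may be worth timing on both machines. The loop above adds into *tempResult on every iteration, and the threadResult slots sit next to each other in memory; the sketch instead accumulates in a local variable and stores the partial sum once when the thread finishes. Whether this accounts for the timing gap is an assumption, not something confirmed here. The names rangeTask, sumRange, and parallelSum are hypothetical and do not appear in the original program.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical counterpart of threadData: the partial sum is stored by value. */
struct rangeTask
{
    long long start;
    long long end;
    unsigned long long partial; /* written once, when the thread finishes */
};

static void *sumRange(void *arg)
{
    struct rangeTask *task = (struct rangeTask *)arg;
    unsigned long long local = 0; /* private to this thread */
    for (long long i = task->start; i <= task->end; i++)
    {
        local += (unsigned long long)i;
    }
    task->partial = local; /* single write to shared memory */
    return NULL;
}

/* Hypothetical helper: returns 1 + 2 + ... + n (mod 2^64), or 0 if setup fails. */
unsigned long long parallelSum(int numThreads, long long n)
{
    if (numThreads < 1 || n < 1)
        return 0;

    pthread_t *threads = malloc(numThreads * sizeof *threads);
    struct rangeTask *tasks = malloc(numThreads * sizeof *tasks);
    if (!threads || !tasks)
    {
        free(threads);
        free(tasks);
        return 0;
    }

    int started = 0;
    for (int i = 0; i < numThreads; i++)
    {
        /* Same range split as in the original main(). */
        tasks[i].start = (long long)i * n / numThreads + 1;
        tasks[i].end = (long long)(i + 1) * n / numThreads;
        tasks[i].partial = 0;
        if (pthread_create(&threads[i], NULL, sumRange, &tasks[i]) != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually created. */
    unsigned long long result = 0;
    for (int i = 0; i < started; i++)
    {
        pthread_join(threads[i], NULL);
        result += tasks[i].partial;
    }

    free(threads);
    free(tasks);
    return started == numThreads ? result : 0;
}
```

Calling parallelSum(10, 4897582469) would mirror the command line used in the runs. Compiler optimization flags were not stated for those runs, so the same flags should be used on both machines when comparing.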
Results on my laptop:
$ time ./ex2parallel 10 4897582469
(running in parallel)
sum is: 11993157022776859215
real 0m9.955s
user 1m0.628s
sys 0m0.010s
Results on my desktop:
$ time ./ex2parallel 10 4897582469
(running in parallel)
sum is: 11993157022776859215
real 0m31.003s
user 4m1.410s
sys 0m0.011s
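As a sanity check on the printed sum, the closed form n(n+1)/2 can be evaluated directly. This is a minimal sketch; closedFormSum is a hypothetical helper, not part of the original program. It halves the even factor before multiplying so that the intermediate product stays within 64 bits for this input.

```c
/* Hypothetical helper: 1 + 2 + ... + n via n(n+1)/2, computed modulo 2^64. */
unsigned long long closedFormSum(unsigned long long n)
{
    /* Exactly one of n and n + 1 is even; divide that one by 2 first. */
    if (n % 2 == 0)
        return (n / 2) * (n + 1);
    return n * ((n + 1) / 2);
}
```

For n = 4897582469 this returns 11993157022776859215, the same value both machines print, so the two runs differ only in timing and not in the result.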
I ran both programs on both my laptop and my desktop, and I expected the Ryzen desktop to perform much better than the laptop.