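#include <ti/omp/omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <c6x.h>

/* Dummy workload: 100000 increments per call. */
void test(void)
{
    int a = 0;
    for (int j = 0; j < 100000; j++)
        a++;
}

int main(void)
{
    int i;
    unsigned long long t1, t2;

    printf("starting running\n");

    /* Writing TSCL starts the free-running time-stamp counter. */
    TSCL = 0;
    TSCH = 0;
    t1 = _itoll(TSCH, TSCL);

    /* Split the 100 calls to test() across the OpenMP threads. */
    #pragma omp parallel for private(i)
    for (i = 0; i < 100; i++) {
        test();
    }

    t2 = _itoll(TSCH, TSCL);

    /* Elapsed cycles divided by 10000; %llu matches unsigned long long. */
    printf("time is: %llu\n", (t2 - t1) / 10000);
    return 0;
}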
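Question 1: When I run this demo, it takes 0.21 s on one core, 0.17 s on four cores, and 0.14 s on eight cores. In principle the run time should drop in proportion to the number of cores. What extra overhead is causing this result?
Question 2: When I run my own program, it takes 0.8 s on one core, 0.9 s on four cores, and 0.3 s on eight cores. Why is the four-core run even slower than the single-core run? Is it because using multiple cores introduces extra overhead, or is there some other reason?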
yuanchao gan:
Could someone from TI please help answer this? Thanks!
Wesley He:
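Reply to yuanchao gan:
Hello,
Has the problem been solved? The issue has been passed on to the relevant experts, and we will let you know as soon as there is an update.
Thanks.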
Wesley He:
Reply to yuanchao gan:
Hello,
The execution time of an OpenMP program depends on the ratio of parallel to serial work, as well as on the inter-processor communication overhead of running multiple threads. Users therefore need to balance the degree of parallelism against the multi-threading overhead to achieve optimal performance. TI OpenMP Runtime 2.2 implements the OpenMP 3.0 API specification, so the resources at openmp.org should be helpful when parallelizing a customer's proprietary algorithm: www.openmp.org/…/ .
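To put a rough number on the serial-versus-parallel balance described above, the sketch below applies Amdahl's law and the Karp-Flatt metric to measured run times. It is a minimal illustration in plain C, not part of the TI OpenMP runtime. The function names `amdahl_speedup`, `serial_fraction`, and `report` are hypothetical helpers introduced here. The estimated fraction lumps genuinely serial code together with threading and communication overhead, so it is only an indication.

```c
#include <stdio.h>

/* Amdahl's law: predicted speedup on p cores when a fraction `serial`
 * (between 0 and 1) of the single-core run time cannot be parallelized.
 * Hypothetical helper, not a TI or OpenMP API. */
double amdahl_speedup(double serial, int p)
{
    return 1.0 / (serial + (1.0 - serial) / (double)p);
}

/* Karp-Flatt metric: the serial fraction implied by a measured
 * single-core time t1 and a measured p-core time tp (p > 1).
 * The result includes threading overhead, so it can exceed 1.0
 * when the parallel run is slower than the single-core run. */
double serial_fraction(double t1, double tp, int p)
{
    double speedup = t1 / tp;
    return (1.0 / speedup - 1.0 / (double)p) / (1.0 - 1.0 / (double)p);
}

/* Print the measured speedup and the implied serial fraction. */
void report(double t1, double tp, int p)
{
    printf("%d cores: speedup %.2f, implied serial fraction %.2f\n",
           p, t1 / tp, serial_fraction(t1, tp, p));
}
```

With the demo timings from the question, `serial_fraction(0.21, 0.17, 4)` comes out at about 0.75 and `serial_fraction(0.21, 0.14, 8)` at about 0.62, which suggests that most of the measured time is not scaling with the core count. For the second program, `serial_fraction(0.8, 0.9, 4)` is above 1, meaning the four-core run added more overhead than the parallel work saved.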
yuanchao gan:
Reply to Wesley He:
The problem has not been solved yet.