c++ - OpenMP double for loop array with stored results


I've spent time going through other posts but still can't get this simple program to go.

    #include<iostream>
    #include<cmath>
    #include<omp.h>
    using namespace std;

    int main()
    {
        int threadnum=4; //want manual control
        int steps=100000, cumulative=0, counter;
        int a,b,c;
        float dum1, dum2, dum3;
        float pos[10000][3] = {0};
        float non=0;
        //rng declared here

        #pragma omp parallel private(dum1,dum2,dum3,counter,a,b,c) reduction (+: non, cumulative) num_threads(threadnum)
        {
            for(int dummy=0;dummy<(10000/threadnum);dummy++)
            {
                dum1=0,dum2=0,dum3=0;
                a=0,b=0,c=0;
                for(counter=0;counter<steps;counter++)
                {
                    dum1 = somefunct1()+rand();
                    dum2 = somefunct2()+rand();
                    dum3 = somefunct3(dum1, dum2, ...);
                    a += somefunct4(dum1,dum2,dum3, ...);
                    b += somefunct5(dum1,dum2,dum3, ...);
                    c += somefunct6(dum1,dum2,dum3, ...);
                    cumulative++; //count number of loops executed
                }
                pos[dummy][0] = a; //saves results of second loop to array
                pos[dummy][1] = b;
                pos[dummy][2] = c;
                non += pos[dummy][0]; //holds summed values
            }
        }
    }

I've cut the program down to fit here. Often when I make changes, and I've tried a lot of them, the inner loop does not execute the correct number of times, and cumulative ends up equal to 32,532,849 instead of 1 billion (10000 outer iterations x 100000 steps). Scaling is about 2x for the code above, and it should be higher.

I want the code to break up the first 10000-iteration loop so that each thread runs a number of iterations in parallel (dynamic scheduling would be nice) and saves the results of each iteration of the second loop to the results array. The second loop is composed of dependent steps and cannot be broken up. The order of the 'dummy' iterations does not matter (pos[345] can be switched with pos[3456] as long as all 3 indices are switched), but I will have to modify it later so that the order does matter.
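A minimal sketch of the split described above, not the actual program: the outer loop is shared among threads with a dynamic schedule, the inner loop stays sequential, and each outer iteration writes only its own row of the results. The name `run_walks`, the `WalkResult` struct, the seed `12345u` and the arithmetic inside the inner loop are all placeholders standing in for the somefunct*() chain. Seeding one engine per outer iteration is an assumption chosen so that each row does not depend on which thread happened to run it.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct WalkResult {
    std::vector<float> pos;   // row-major, 3 floats per outer iteration
    long long cumulative;     // total inner-loop iterations executed
    double non;               // sum of the first component of every row
};

// Hypothetical stand-in for the program in the question.
WalkResult run_walks(int walks, int steps) {
    assert(walks >= 0 && steps >= 0);
    WalkResult r{std::vector<float>(static_cast<std::size_t>(walks) * 3, 0.0f), 0, 0.0};
    float* pos = r.pos.data();
    long long cumulative = 0;   // reduction needs plain variables, not struct members
    double non = 0.0;

    // The directive divides the full range [0, walks) among the threads;
    // schedule(dynamic) hands out iterations as threads become free.
    #pragma omp parallel for schedule(dynamic) reduction(+ : cumulative, non)
    for (int w = 0; w < walks; ++w) {
        // Variables declared inside the loop body are private to the
        // thread executing this iteration, so no private(...) list is needed.
        std::mt19937 rng(12345u + static_cast<unsigned>(w));  // assumed seed scheme
        std::uniform_real_distribution<float> uni(0.0f, 1.0f);
        float a = 0.0f, b = 0.0f, c = 0.0f;

        for (int s = 0; s < steps; ++s) {   // dependent steps, kept sequential
            float d1 = uni(rng);
            float d2 = uni(rng);
            float d3 = d1 * d2;             // placeholder arithmetic
            a += d1;
            b += d2;
            c += d3;
            ++cumulative;                   // counts inner iterations
        }

        pos[3 * w + 0] = a;                 // each iteration owns its own row
        pos[3 * w + 1] = b;
        pos[3 * w + 2] = c;
        non += a;
    }

    r.cumulative = cumulative;
    r.non = non;
    return r;
}
```

Called as `run_walks(10000, 100000)`, the `cumulative` field should come out as exactly walks x steps whatever the thread count. If the file is compiled without OpenMP enabled, the pragma is ignored and the same code runs serially.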

The numerous variables and initializations in the inner loop are confusing me terribly. There are a lot of random calls and functions/math functions in the inner loop - is there overhead here that is causing the problem? I'm using GNU 4.9.2 on Windows.

Any help is appreciated.

Edit: fixed. I moved the RNG declaration inside the first loop. I now get 3.75x scaling going to 4 threads and 5.72x scaling on 8 threads (hyperthreads). Not perfect, but I'll take it. I still think there is an issue with thread locking and syncing.

    ......
    float non=0;

    #pragma omp parallel private(dum1,dum2,dum3,counter,a,b,c) reduction (+: non, cumulative) num_threads(threadnum)
    {
        //rng declared
        #pragma omp for
        for(int dummy=0;dummy<(10000/threadnum);dummy++)
        {
    ....
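For illustration, here is a small self-contained sketch of the layout in the edit: a random engine declared inside the parallel region, so each thread owns one, followed by a work-shared loop. The function name `run_with_thread_rng`, the seed `1000u + tid` and the use of `std::mt19937` in place of the original RNG are assumptions. In this sketch the loop bound is the full count rather than the count divided by `threadnum`, because `#pragma omp for` already divides the iterations among the threads.

```cpp
#include <cassert>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: fills sums[dummy] with the sum of `steps` random
// draws and returns the total number of inner iterations executed.
long long run_with_thread_rng(int outer, int steps, int threadnum,
                              std::vector<float>& sums) {
    assert(outer >= 0 && steps >= 0 && threadnum > 0);
    sums.assign(outer, 0.0f);
    float* out = sums.data();
    long long cumulative = 0;

    #pragma omp parallel num_threads(threadnum) reduction(+ : cumulative)
    {
        int tid = 0;
        #ifdef _OPENMP
        tid = omp_get_thread_num();
        #endif
        // Declared inside the parallel region: one engine per thread,
        // so no generator state is shared between threads.
        std::mt19937 rng(1000u + static_cast<unsigned>(tid));  // assumed seed
        std::uniform_real_distribution<float> uni(0.0f, 1.0f);

        // Full range here; the directive splits it across the team.
        #pragma omp for schedule(dynamic)
        for (int dummy = 0; dummy < outer; ++dummy) {
            float a = 0.0f;
            for (int counter = 0; counter < steps; ++counter) {
                a += uni(rng);      // placeholder for the somefunct*() chain
                ++cumulative;
            }
            out[dummy] = a;         // each iteration writes its own slot
        }
    }
    return cumulative;
}
```

With a per-thread engine and a dynamic schedule, the value stored for a given `dummy` depends on which thread picked it up, so the individual rows are not reproducible from run to run. That is acceptable while the order of iterations does not matter, but it would probably need rethinking once it does.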

