I'm working on a faster version of electron_fluid.c. However, I'm running into an error:
Code: Select all
$ ./ef
Density integral result = 0.274490
absolute error = 0.000000
opening data files and reading them in.
i = 1
gsl: qag.c:261: ERROR: could not integrate function
Default GSL error handler invoked.
Aborted
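One way to see exactly why QAG is bailing out, instead of letting GSL's default handler call abort(): switch the handler off and check the return status yourself. Here's a minimal sketch, using a smooth stand-in integrand since the real one is in electron_fluid.c:
Code: Select all
/* Minimal sketch: run QAG with the abort-on-error handler switched
 * off and report the status code instead.  The integrand here is a
 * smooth stand-in; the real one is in electron_fluid.c.
 * Build: gcc -o qag_debug qag_debug.c -lm -lgsl */
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_integration.h>

static double f(double x, void *params)
{
    (void)params;
    return exp(-x * x);         /* stand-in integrand */
}

int main(void)
{
    gsl_integration_workspace *w = gsl_integration_workspace_alloc(1000);
    gsl_function F = { &f, NULL };
    double result, abserr;

    gsl_set_error_handler_off();    /* return error codes, don't abort() */

    int status = gsl_integration_qag(&F, 0.0, 1.0, 1e-8, 1e-8, 1000,
                                     GSL_INTEG_GAUSS31, w,
                                     &result, &abserr);
    if (status)
        fprintf(stderr, "qag failed: %s\n", gsl_strerror(status));
    else
        printf("result = %.6f, abserr = %g\n", result, abserr);

    gsl_integration_workspace_free(w);
    return status;
}
The status string will at least say whether it's a roundoff problem, a bad integrand, or a subdivision limit, which narrows down where to look.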
So in the meantime, until I figure that out, I've parallelized potential.c as a starting point. Tell it how many CPUs to use, and the speedup is roughly linear in the number of CPUs you've got.
I've taken the quick 'n' dirty approach first. I changed line 270,
for( i=0; i<MAXSTEPS+1; i++)
to take i from the command line and write a "slice" of potential.dat; there's a sketch of the change after the timing runs below. This means I can throw make at it, like this:
Code: Select all
$ time make
gcc -o gen_potential100_dat -D MAXSTEPS=100 potential.c -lm -lgsl
0 of 100
1 of 100
...
99 of 100
100 of 100
real 0m30.158s
user 0m28.910s
sys 0m0.648s
Code: Select all
$ time make -j2
gcc -o gen_potential100_dat -D MAXSTEPS=100 potential.c -lm -lgsl
0 of 100
6 of 100
1 of 100
7 of 100
...
98 of 100
99 of 100
100 of 100
real 0m18.278s
user 0m29.358s
sys 0m0.692s
With MAXSTEPS=400 I get a similar improvement.
Code: Select all
$ time make
...
real 28m40.125s
user 28m11.022s
sys 0m7.932s
$ time make -j2
...
real 14m20.629s
user 28m14.206s
sys 0m8.437s
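For reference, here's roughly what the line-270 change looks like. This is a sketch, not the actual diff: the elided part is whatever the original loop body does for one value of i, and the slice file name is my guess at the scheme (the Makefile then has to concatenate the slices back into potential.dat).
Code: Select all
/* Sketch of the slice change: instead of looping i over
 * 0..MAXSTEPS, take one value of i from the command line and
 * write only that slice.  Compile with -D MAXSTEPS=... as in
 * the Makefile. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <slice index 0..%d>\n", argv[0], MAXSTEPS);
        return 1;
    }

    int i = atoi(argv[1]);
    if (i < 0 || i > MAXSTEPS) {
        fprintf(stderr, "slice index out of range\n");
        return 1;
    }

    char fname[64];
    snprintf(fname, sizeof fname, "potential_%04d.dat", i);
    FILE *out = fopen(fname, "w");
    if (!out) {
        perror(fname);
        return 1;
    }

    /* ... body of the original line-270 loop goes here, writing
     * row i to `out` instead of appending to potential.dat ... */

    fclose(out);
    printf("%d of %d\n", i, MAXSTEPS);
    return 0;
}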
I've verified that the output files are an exact match of the original potential.c's. Use -j2 to use 2 CPUs, -j4 for 4 CPUs, etc. Download the files here:
http://polywell.nfshost.com/ef_par_v01.zip.
Note: I moved MAXSTEPS to the Makefile. It's set to 400, but if you want to reduce it, be sure things are recompiled after editing the Makefile.
I still want to do a pthread version of potential.c, so that it's completely self-contained and doesn't need an involved Makefile. I'd rather spend my time making electron_fluid.c faster, though.
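For the record, the pthread version could be as simple as the sketch below: N workers, each taking every Nth value of i, joined at the end. compute_slice() is a hypothetical stand-in for the body of the line-270 loop; the real thing would have to write per-slice output (or lock a shared file) to stay thread-safe.
Code: Select all
/* Sketch of a self-contained pthread version: N worker threads,
 * each handling every Nth value of i.  compute_slice() is a
 * hypothetical stand-in for the line-270 loop body.
 * Build: gcc -D MAXSTEPS=400 -o potential_mt potential_mt.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static int nthreads;

static void compute_slice(int i)
{
    /* stand-in for the real per-i work */
    printf("%d of %d\n", i, MAXSTEPS);
}

static void *worker(void *arg)
{
    for (int i = (int)(long)arg; i <= MAXSTEPS; i += nthreads)
        compute_slice(i);
    return NULL;
}

int main(int argc, char *argv[])
{
    nthreads = (argc > 1) ? atoi(argv[1]) : 2;   /* CPU count from command line */
    if (nthreads < 1)
        nthreads = 1;

    pthread_t *tids = malloc(nthreads * sizeof *tids);

    for (long t = 0; t < nthreads; t++)
        pthread_create(&tids[t], NULL, worker, (void *)t);
    for (int t = 0; t < nthreads; t++)
        pthread_join(tids[t], NULL);

    free(tids);
    return 0;
}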
Edit: added the MAXSTEPS=400 timings. The Makefile has a limit of 16 processors, but it's easily expandable.