icarus wrote: Then there is the huge problem of validating the code output with real world data or how else do you know if you are not just generating meaningless pretty, colourful graphics? (cartoon engineering).
A.k.a. climate modeling!
Hypothesis on Electron and Ion Behavior Inside the Polywell.

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
happyjack27 wrote: i believe the main gist of what's being said is that with the right computer code, we could do a mixed-particle 3d simulation of a polywell configuration on our home computers.
icarus wrote: Probably wrong, and definitely ill-informed. GPU can give massive speed up over multicore and parallelised CPUs BUT only for specific problems (like graphics or graphics-like).
i resent that. especially the ill-informed part. i am a professional computer program and as such i am well aware of the difficulties. i am well aware that gpus are only good for data-parallel problems. this is, obviously, exactly such a problem. and there already exist very good gpu codes for doing very similar problems (n-body problems). although it would not be trivial to modify them, that was precisely the intent of the original GRAPE code - to be flexible such that it could be easily adapted to similar problems, such as particle physics.
Super computing is problem specific; it depends on the type of equations, BCs, ICs, etc., what type of hardware will work 'best' (fastest, most accurate).
Also, "the right computer code" is a massive understatement, since you must develop, tailor and tune that code for the hardware you are running on, and this may take years or even a decade to achieve for a real, physical solution. Then there is the huge problem of validating the code output with real-world data, or how else do you know if you are not just generating meaningless pretty, colourful graphics? (cartoon engineering).
as regards validating the data, that is called testing and debugging. far from being unique to this, it is part of the development cycle of any computer program, and it is not nearly as difficult as you seem to think; it is not a "huge" problem. the process, in the final stage, is precisely as you describe: take things you know what the result should be, run it through your program, and see if it comes up with the correct solution. not really all that complicated. though in actuality one tests smaller chunks of the code before testing the whole thing. this results in a faster development cycle as well as a more reliable product.
etc. etc.
i don't mean to be presumptuous, but if you want to argue points on this subject with me, bear in mind that i am not a layman. i know what i'm talking about. and starting off by calling me "ill-informed" is presumptuous and, in this case, completely wrong. not to mention that it's probably not the best way to start off _any_ conversation.
I thought the main problem with GPU codes were random errors in quality of the cards. For fast frame games an odd rogue pixel is not a problem, but for simulations it could be. Perhaps it's dependent on consumer versus professional workstation priced cards.
happyjack27 wrote: i am a professional computer program
btw, you pass the Turing Test very well!
In theory there is no difference between theory and practice, but in practice there is.

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
BenTC wrote: I thought the main problem with GPU codes were random errors in quality of the cards. For fast frame games an odd rogue pixel is not a problem, but for simulations it could be. Perhaps it's dependent on consumer versus professional workstation priced cards.
happyjack27 wrote: i am a professional computer program
btw, you pass the Turing Test very well!
doh! apparently not well enough, for now you know my secret.
(the irony is that i intentionally introduced that spelling/grammar error to try to look more human!)
if there were random errors in the quality of cards those would be considered "defective" and thrown out. at the scale and size they're making cards at, one can only presume that they have a relatively high error rate on the lithography. i presume that's why they disable at least 1 SM on anything but the top-of-the-line models - because it's either that or they have to throw the chip out.
if there are "errors" it might be that the native sin/cos/etc. functions  the "transcendental" functions  are designed for speed over accuracy. but if that's really a concern then you can turn off "fast math", and that will give you slower versions with full IEEEcompliant accuracy.
but i think they weren't used so much for some science problems because they only did single-precision floating point before; with the new Fermi cards they now have double-precision capability. on the professional cards (Teslas), double-precision performance is 1/2 that of single. on the consumer-market cards (GeForce), it's 1/8. (if you design a circuit wherein double precision is worse than 1/2 of single, you're doing something wrong, because you can always reuse a double-precision circuit as 2 single-precision circuits with minimal additional circuitry.)
anyways,
I propose modifying the sapporo code:
http://modesta.science.uva.nl/Software/src/sapporo.html
to use, instead of the gravitational force, the Lorentz force:
http://en.wikipedia.org/wiki/Lorentz_force
where E and B are calculated using the Biot-Savart law for a point charge at constant velocity:
http://en.wikipedia.org/wiki/Biot%E2%80 ... t_velocity
furthermore, i think i simply might do it.
then there remains the problem of adding the static E and B fields, and pumping in new particles.
i'm thinking i might just approximate the static E and B fields with charged particles. i.e. every iteration i'd put a large set of randomly distributed, heavily charged particles moving around each electromagnet, i.e. approximating the current/voltage with a collection of high-voltage point charges. i figure a magrid does a poor job of approximating a polyhedron, so what could be all that bad about approximating a magrid, esp. if it's a high-resolution approximation.
though that's just my idea. probably won't do it. but i think i'm going to take a crack at writing the code for the force calculation as described above, and then maybe integrating that into the sapporo code linked to above.
question regarding that, then: would using the non-relativistic equations introduce too much inaccuracy?

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
here's my first attempt at writing code to calculate the electromagnetic force of a particle p on a particle p0. forgive me, it may be a little hard to read 'cause i did some basic algebraic optimization as i wrote it. notice that for the B-field generated by p, i'm using its velocity RELATIVE TO p0 instead of its absolute velocity. that seemed to make theoretical sense.
anyways, if anybody sees anything wrong with my math and/or physics, please let me know.
(edited 11/11/2010 3pm to make the code a little cleaner by way of #defines)
Code: Select all
#define C 299792458
#define PI 3.14159265358979323846264338327950288419716939937510
#define MAGNETIC_CONSTANT (4.0*PI*0.0000001)
#define ELECTRIC_CONSTANT (1.0/(MAGNETIC_CONSTANT*C*C))
#define ECONST (1.0/(4.0*PI*ELECTRIC_CONSTANT))
#define BCONST (MAGNETIC_CONSTANT/(4.0*PI))
#define RC2 (1.0/(C*C))
#define dot(a,b) (a.x*b.x + a.y*b.y + a.z*b.z)
#define crossx(a,b) (a.y*b.z - a.z*b.y)
#define crossy(a,b) (a.z*b.x - a.x*b.z)
#define crossz(a,b) (a.x*b.y - a.y*b.x)

typedef float _float;

struct vector {
    _float x,y,z,scale;
};

struct particle {
    vector p;
    _float charge;
    vector v;
    _float mass;
};

__device__ vector calc_em_force(particle p0, particle p) {
    //distance
    vector pdiff;
    pdiff.x = p0.p.x - p.p.x;
    pdiff.y = p0.p.y - p.p.y;
    pdiff.z = p0.p.z - p.p.z;
    _float p2 = dot(pdiff,pdiff);
    pdiff.scale = rsqrtf(p2);
    _float scale = pdiff.scale*pdiff.scale*pdiff.scale; // = 1/|r|^3
    //velocity of p relative to p0 (used for the B-field below,
    //so it must be computed in the non-relativistic path too)
    vector vdiff;
    vdiff.x = p.v.x - p0.v.x;
    vdiff.y = p.v.y - p0.v.y;
    vdiff.z = p.v.z - p0.v.z;
    //calculate E-field
    _float estrength = p.charge * ECONST * scale;
#ifdef RELATIVISTIC
    _float v2 = dot(vdiff,vdiff);
    vdiff.scale = rsqrtf(v2);
    _float pdotv = dot(pdiff,vdiff)*pdiff.scale*vdiff.scale;
    _float st2 = 1 - pdotv*pdotv; //=sin^2(arccos(pdotv))
    _float v2rc2 = v2*RC2;
    _float rcd = 1 - v2rc2*st2;
    _float relativistic_correction = (1 - v2rc2)*rsqrtf(rcd*rcd*rcd);
    estrength *= relativistic_correction;
#endif
    vector e;
    e.x = estrength * pdiff.x;
    e.y = estrength * pdiff.y;
    e.z = estrength * pdiff.z;
    //calculate B-field: B = (v x E)/c^2
    vector b;
    b.x = crossx(vdiff,e)*RC2;
    b.y = crossy(vdiff,e)*RC2;
    b.z = crossz(vdiff,e)*RC2;
    //calc Lorentz force: F = q*(E + v x B)
    vector f;
    f.x = p0.charge*(e.x + crossx(p0.v,b));
    f.y = p0.charge*(e.y + crossy(p0.v,b));
    f.z = p0.charge*(e.z + crossz(p0.v,b));
    return f;
}
Last edited by happyjack27 on Thu Nov 11, 2010 9:37 pm, edited 3 times in total.
A couple of points that may or may not be relevant.
There was a thread (~1 yr ago?) that discussed the computing power to do a full up particle code for the Polywell. It was mentioned that a super computer with perhaps 10^15 to 10^18 flops was still vastly inadequate. I recall the mention of ~ 10^30 flops would be needed to get results without waiting for years or decades.
But, what I don't know is how much simplification, grouping could be done and still achieve reasonable results.
The precision I thought was needed because the differential between the electrons and ions was only ~ 1 ppm. The processing speed was needed because to do a full simulation you need to calculate the interaction of one charged particle with all of the other particles, fields, etc. So if you had 10^10 particles, you would need on the order of 10^20 calculations to process one interactive generation. What is uncertain (read as "I don't have the faintest idea") is how much you can dumb the system down before the results become unreasonably imprecise. Could you get useful results considering only 10^6 particles, or 10^4 particles, or clumps of particles each consisting of 10^3 particles, etc. etc.? Then there are other computational tricks, approximations that might greatly speed up the process, but can they, or have they already, been proved as valid methods?
As far as converting a program that utilizes electrical forces and gravitational forces. How would you convert the gravity elements into magnetic? One is an accelerative force, the other is not.
Dan Tibbets
To error is human... and I'm very human.

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
there are algorithms that give good approximations in better than n^2 time. the main thing, from what i understand, is that as you go further away things get less significant, so you can take that into account and you don't really have to match up every particle with every other particle. this is what i believe "fast n-body" codes do. so they get something like n*log^2(n) time, which is much much better.
i'm planning on using such a "fast n-body" algorithm here. all i have to do is write the part that calculates the force of particle a on particle b. i will then fit that into the fast n-body code i found (sapporo) with a little surgery. i figure, since all the forces are still 1/r^2 forces, the assumptions made in the approximation (namely, that one) are still met.
i'm sure there are plenty of other approximations. finite-volume methods. voxel octrees would be pretty cool. suffice it to say estimating computation time by assuming that the algorithm is O(N^2) may seem rational at first, but, as you implied, it's a bit naive.
i decided on going with an n-body method because then i can be sure that the physics are very correct, even if it doesn't scale very well in comparison to, say, a particle-in-cell approach.
in any case i'm not planning on doing a full-up simulation. i just want to do a mixed-particle simulation that can give some impression of well formation, annealing, etc. cool thing is, 'cause it's a simulation, i can vary the scale and the mag-field strength astronomically.
i'm planning on using such a "fast nbody" algorithm here. all i have to do is write the part that calculates the force of particle a on particle b. i will then fit that in to the fast nbody problem code i found (sapporo) with a little surgery. i figure, since all the forces are still r^2 forces, the assumptions made in the approximation (namely, that one) are still met.
i'm sure there are plenty other approximations. finitevolume methods. voxel octrees would be pretty cool. suffice it to say estimating computation time by assuming that the algorithm is O(N^2) may seem rational at first, but, as you implied, it's a bit naive.
i decided on going with an nbody method because then i can be sure that the physics and are very correct, even if it doesn't scale very well in comparison to say a particleincell approach.
anycase i'm not planning on doing a full up. i just want to do a mixedparticle that can give some impression of well formation, annealing, etc. cool thing is 'cause it's a simulation i can vary the scale and the magfield strength astronomically.

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
here's some notes on a much more scalable algorithm, though probably more difficult to program, and not as good an approximation.
it is a spatial-resolution-limited approximation rather than a particle-count-limited approximation.
then i have a few notes on an SVO extension. the SVO extension is a dynamic multi-resolution version that locally adjusts resolution to correlate with the density of information in that area.
it would be more accurate and probably a little faster too (per amt. of accuracy), but would be more difficult to program. in any case one would certainly write the non-SVO version first.

idea: grid(voxel)based approximation:
"layers":
e-field (1 component) (aka divergence)
v-field (3 component) (aka gradient)
b-field (3 component) (aka curl)
per "species" (mass/charge combination) particle layers:
velocity(3 component)
thermalization(3 component)
density
1. calculate e,v, and b fields from particle velocities and densities
2. calculate and apply particle layers adjustments based on fields, densities, etc.

any component can be viewed as a stereoscopic projection (i.e. break out the 3dglasses) onto x,y, or z plane.
(the points are accumulated as opacities via an exponential moving average, between time step)
additional (derived) views:
*charge (sum of density*charge)
*KE (mag of velocity)
*temperature (mag of thermal)

dynamic sparse voxel octree (SVO) extension:
svo depth change algorithm (conditional constrained):
var = spatial variation (aka 2nd moment) of the e-field within an octet + spatial variation (aka 2nd moment) of the b-field within an octet
if var above a threshold: increase resolution
if var below a threshold: decrease resolution
if increasing resolution, also increase neighbor's resolution to be at most 1 less.
if decreasing resolution, mark as "relax".
if all neighbors are either the same or less resolution, or are marked as "relax", then decrease resolution.

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
so i'm doing the math to find the mag field at a point due to a solid cylinder of current, and by way of the Biot-Savart law, i've got:
over the region: 0<z<h, 0 < x^2+y^2 < c^2
where sx = sin of theta_x, cx = cos of theta_x, tx = x-translation, etc. i.e. they are the components of rotation and translation of the cylinder relative to the point source.
my plan was to get it to this point and then do the actual calculus to find the integral, and then i'd have an analytic formula to find the force contribution of each segment of the magrid and then i could just add those together.
however, as you can see, that's easier said than done. this is what wolfram's online integrator gives for the first integration (over x). (sorry it looks like you'll have to copy and paste the link)
needless to say, i'm not exactly looking forward to putting in the boundaries and integrating that two more times. so if anybody could help me out here, i'd appreciate it.
Code: Select all
B strength = triple integral of
dx*dy*dz
/
[(y*sx*sy+z*cx*sy+x*cy+tx)^2 + (y*cx-z*sx+ty)^2 + (y*sx*cy+z*cx*cy-x*sy+tz)^2]

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
never mind. i'm going to model it as a collection of straight line segments instead, per http://cnx.org/content/m31103/latest/ .
i'm thinking i'll use that to precalculate a 4-component 3d vector field for the static e and b fields. i'll store it as a 3d "texture" of 4-vectors so i can use the gpu's texture filtering units to get trilinear interpolation for free. i'm thinking 256x256x256. that comes out to a total of 256MB of texture memory (using floats for the datatype) and a resolution like so: http://www.mare.ee/indrek/ephi/bfs/ (but with interpolation, i.e. picture that antialiased.)
this way you can have as complex a magrid as you want and it won't affect the speed of the simulation at all.
happyjack:
1st:
happyjack27 wrote: i don't mean to be presumptuous
2nd:
happyjack27 wrote: i believe the main gist of what's being said is that with the right computer code, we could do a mixed-particle 3d simulation of a polywell configuration on our home computers.
Ok, so let's say you are not presumptuous, thus you can do what you said at the outset: a full 3D mixed-particle validated simulation on your home computer (presumably using your own code that you are writing for some gee-whiz GPU CUDA platform or some such).
I'll be waiting for some code outputs to prove that you're not being presumptuous.
PS: do you have any experience numerically modelling physical systems?

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
icarus wrote: Ok, so let's say you are not presumptuous, thus you can do what you said at the outset: a full 3D mixed-particle validated simulation on your home computer (presumably using your own code that you are writing for some gee-whiz GPU CUDA platform or some such).
I'll be waiting for some code outputs to prove that you're not being presumptuous.
PS: do you have any experience numerically modelling physical systems?
that's kind of a convolution of what i said, but fair enough.
i'm going to do an all-pairs n-body simulation per this paper. i just have to swap out the kernel with the code i wrote above (you can check it for correctness if you like (please do)) and do a little stitching. then of course i have to add the static field between iterations; i mentioned how i plan on doing this above.
bear in mind i never said i'm going to be doing billions of particles or anything like that. i never said you could do a full-scale simulation (e.g. every individual particle). that would just be ridiculous.
i'm just writing a mixed-particle n-body simulation. probably somewhere around 1k-32k particles, since the time steps are going to have to be pretty small.
you can read in the paper mention of some fast n-body methods that would scale better. but i'm just going to start with the basics.
oh, and no, i don't have any experience numerically modeling physical systems. just neural nets and stuff like that. and i did a vector engine when i was little. thing is, if i had done it before it wouldn't be nearly as much fun!

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
here's a video sample of the n-body code running on, i believe, an 8800 GTX. i believe the simulation shows 16k particles.
http://www.youtube.com/watch?v=HUGjUvjtwS8
my kernel will take about 3 times as many FLOPS to do the E and B fields than just the G field (and a lot more in relativistic mode), and the timestep will be smaller, of course. so picture that slowed down a bit. though my card is a few generations newer, so maybe not so much. (and yes, that's in real time)
Last edited by happyjack27 on Fri Nov 12, 2010 8:50 pm, edited 1 time in total.
happyjack:
happyjack27 wrote: oh, and no, i don't have any experience numerically modeling physical systems.
Ok, so my "ill-informed" comment was not far off the mark.
By all means go ahead and try to model whatever you like; I respect anyone who 'gives it a go'. Your model reduction approach seems interesting, but I haven't looked into it enough to know of its validity. Word of warning: your math might need an upgrade to understand what you are attempting.

 Posts: 1435
 Joined: Wed Jul 14, 2010 5:27 pm
your "illinformed" comment was regarding things i said about programming, etc. which were far off the mark. also, experience does not equate to information so even if it was on the relevant subject that still would not lend logical support to your statement.
but enough about that. thanks for your support. i'm good up to calc4. i was an ace in math. got a 5/5 on the a.p. calc test (bc). so i'm not too worried. just hope i don't get my signs flipped.