What's the big (64-bit) deal, anyway?

Discuss how polywell fusion works; share theoretical questions and answers.

Moderators: tonybarry, MSimon

scareduck
Posts: 552
Joined: Wed Oct 17, 2007 5:03 am

What's the big (64-bit) deal, anyway?

Post by scareduck »

drmike has made some interesting pictures in the "Virtual Polywell" thread:

viewtopic.php?t=203&start=0

This got me thinking about something that Bussard mentioned in his video; it's on page 7 of the PDF transcript:
The device is almost electrically neutral. The departure from neutrality to create a 100 kV well is only one part in a million, when you have a density of 10^12 cm^-3. The departure from neutrality is so small that we found current computer codes and computers available to us were incapable of analyzing it because of the numeric noise in the calculations by a factor of a thousand.
When he was first looking into this, around 1992-1994, IEEE-754 had only been finalized in 1985

http://en.wikipedia.org/wiki/IEEE_float ... t_standard

with general adoption happening fairly rapidly thereafter. Assuming he was referring to the double (64-bit) type, that means his dynamic range was around 4.5E12 (the factor of a thousand is 10^3; a 52-bit significand gives 2^52 =~ 4.5E15; dividing the two out, you end up with roughly 4.5E12).
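Here's a quick way to see the squeeze (a sketch in C; the one-in-a-million figure is taken from the quote above, everything else is just machine arithmetic):

[code]
/* Sketch: how much precision is left for a quantity whose interesting part
 * is only 1 part in 1e6 of the total.  Assumes an IEEE-754 double with a
 * 53-bit significand (hidden bit included). */
#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void)
{
    double total     = 1.0;      /* bulk (neutral) part, normalized      */
    double departure = 1.0e-6;   /* one-in-a-million departure (Bussard) */

    printf("DBL_EPSILON         = %g\n", DBL_EPSILON);   /* ~2.2e-16 */

    /* Recover the departure the "bad" way: subtract two nearly equal
     * numbers.  Each operand carries rounding error of order DBL_EPSILON,
     * so the relative error of the small difference is ~1e6 times worse. */
    double diff = (total + departure) - total;
    printf("recovered departure = %.17g\n", diff);
    printf("relative error      = %g\n", fabs(diff - departure) / departure);

    /* Bits of precision left for the departure itself */
    printf("useful bits ~ 53 - log2(1e6) = %g\n", 53.0 - log2(1.0e6));
    return 0;
}
[/code]

The last line works out to about 33 bits left for the interesting part.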

It seems to me there are three main problems with doing a credible simulation of a polywell device:

1) Number of particles involved, assuming you were doing particle-by-particle simulation.
2) Numeric underflow because of the dynamic range. This was essentially the problem Dr. B. was complaining of.
3) Memory size. Even if you use volume estimates or some other proxy (please be gentle -- it's been years since I took numerical analysis, and years further back since I took physics), it seems like getting good values would require a ton of memory. Problem (2) is compounded by problem (1).

Some of these problems are solvable by recent -- very recent -- developments in the hardware world.

First, from a cursory look at gcc's command-line switches and Intel's documentation, it seems that 96-bit floating-point numbers (long doubles, in C parlance) are standard on Pentium-class machines; on the newer 64-bit machines, 128-bit floats appear to be available.
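For what it's worth, float.h will tell you what a given gcc/CPU combination actually provides; a minimal check (nothing Polywell-specific here):

[code]
/* Minimal check of the floating-point types the compiler actually provides.
 * With gcc on 32-bit x86, long double is typically the x87 extended type
 * (64 significand bits) stored in 96 bits; on x86-64 it is stored in 128
 * bits but still carries only 64 significand bits. */
#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("float:       %zu bytes, %d significand bits\n",
           sizeof(float), FLT_MANT_DIG);
    printf("double:      %zu bytes, %d significand bits\n",
           sizeof(double), DBL_MANT_DIG);
    printf("long double: %zu bytes, %d significand bits\n",
           sizeof(long double), LDBL_MANT_DIG);
    printf("LDBL_EPSILON = %Lg\n", LDBL_EPSILON);
    return 0;
}
[/code]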

Second, the x86 architecture has grown to a 64-bit architecture. This means if you can get a big enough box, you ought to be fine. In my experience, the real problem is finding a machine that can support enough DIMM slots to get you to however much memory you need, understanding that the denser the DIMM, the more you'll pay.

Third, multi-core machines are coming down in price and becoming de rigueur for certain applications. (In my day job we never specify anything with less than two dual-core CPUs.)

Assuming you were spec'ing out a machine to do this kind of analysis, how big would you want it to be? How many CPUs, how much RAM?

MSimon
Posts: 14334
Joined: Mon Jul 16, 2007 7:37 pm
Location: Rockford, Illinois
Contact:

Post by MSimon »

Scareduck,

I think 80 bits (probably 128) is required. When you do calculations in sequence the bit errors propagate. It is not a matter of dynamic range. It is a matter of truncation error propagation.

First off, you give up 20 bits for the 1-in-1E6 factor. The significand in a 64-bit float is 53 bits (IIRC), so you are down to 33 bits of useful info.

Unless you are doing something like a Fibonacci series, which is self-healing, more bits are better.
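A toy illustration of the propagation (not a plasma code, just accumulating a step that isn't exact in binary; the iteration count is arbitrary):

[code]
/* Toy demo of truncation-error propagation: accumulate a small step many
 * times in single, double and extended precision and compare against the
 * target n*dt.  The error grows with the number of iterations. */
#include <stdio.h>

int main(void)
{
    const long   n  = 10000000L;   /* 1e7 iterations, arbitrary   */
    const double dt = 0.1;         /* 0.1 is not exact in binary  */

    float       sf = 0.0f;
    double      sd = 0.0;
    long double sl = 0.0L;

    for (long i = 0; i < n; i++) {
        sf += (float)dt;
        sd += dt;
        sl += (long double)dt;
    }

    long double target = (long double)n * (long double)dt;
    printf("float       error = %Lg\n", (long double)sf - target);
    printf("double      error = %Lg\n", (long double)sd - target);
    printf("long double error = %Lg\n", sl - target);
    return 0;
}
[/code]

Same arithmetic, three word sizes, wildly different accumulated error.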

As to how big a machine? I defer to Indrek and Dr. Mike on that.
Engineering is the art of making what you want from what you can get at a profit.

JD
Posts: 42
Joined: Tue Aug 28, 2007 1:16 am
Location: Fairbanks Alaska

Post by JD »

Okay, I took a few comp sci classes some years ago, had time on my hands after retirement and wanted to finish up a degree. You just explained the situation in one clear sentence...

"When you do calculations in sequence the bit errors propagate. It is not a matter of dynamic range. It is a matter of truncation error propagation."

that the tenured guy giving the class I attended took something like 15 minutes of rambling to vaguely explain. :lol:

drmike
Posts: 825
Joined: Sat Jul 14, 2007 11:54 pm
Contact:

Post by drmike »

Scareduck,

You are right on the money. I'm using a 64-bit CPU with 1 GB of RAM, way bigger than anything available in the 1990s. And I'm looking at using 250 MB of RAM for a 400-step block of data in 3D space for the E field, and the same amount for the B field. The accuracy is OK, but for what Bussard is talking about I'd work with a fluid approximation rather than individual particles (or clumps of particles).
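Roughly, the memory arithmetic for an N^3 grid looks like this (a sketch; the 400^3 grid, single precision, and one value per point are assumptions for illustration):

[code]
/* Back-of-envelope memory estimate for an N^3 field grid.
 * Assumptions for illustration only: N = 400, and either one
 * single-precision value per point or a double-precision 3-vector. */
#include <stdio.h>

int main(void)
{
    const double n      = 400.0;          /* grid points per axis (assumed) */
    const double points = n * n * n;      /* 6.4e7 points                   */

    double mb_float_scalar  = points * 4.0       / (1024.0 * 1024.0);
    double mb_double_vector = points * 8.0 * 3.0 / (1024.0 * 1024.0);

    printf("400^3, single-precision scalar field  : %.0f MB\n", mb_float_scalar);
    printf("400^3, double-precision 3-vector field: %.0f MB\n", mb_double_vector);
    return 0;
}
[/code]

One single-precision value per point lands right around the 250 MB figure; double-precision 3-vectors would push each field toward 1.5 GB.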

I am torn between using pure "brute force" and "pure theory". Both have advantages. Brute force (lots of RAM and computational accuracy) gives a pretty good idea of how reality might work; pure theory gives you an idea of what parameters are important. Somewhere in between is where I can work with my computer: I need to pick a model which makes certain assumptions, run with it as a theory, and then compute with what few parameters it leaves. It's standard physics to compute the difference and throw away the bulk properties. That way your model is only working with the small numbers and you have full computational accuracy.
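Here is the "compute the difference and throw away the bulk" idea in miniature (a sketch with made-up numbers, not the actual model):

[code]
/* Sketch of why working directly with the departure from neutrality helps.
 * The density comes from the Bussard quote upthread; everything else is
 * illustrative. */
#include <stdio.h>
#include <float.h>

int main(void)
{
    double n_bulk = 1.0e12;           /* quasi-neutral background, cm^-3  */
    double delta  = 1.0e-6 * n_bulk;  /* departure from neutrality, cm^-3 */

    /* If you store n_i and n_e separately, every rounded operation on them
     * carries an absolute noise floor of about DBL_EPSILON * n_bulk ...   */
    double noise_per_op = DBL_EPSILON * n_bulk;

    /* ... which, relative to the quantity you actually care about (delta),
     * is a million times worse, and it compounds over the timesteps.      */
    printf("noise per operation on the bulk : %g cm^-3\n", noise_per_op);
    printf("departure we care about         : %g cm^-3\n", delta);
    printf("relative noise on delta         : %g per operation\n",
           noise_per_op / delta);

    /* Carry delta as its own variable and it gets the full 53-bit
     * significand: relative noise ~ DBL_EPSILON per operation instead.    */
    printf("relative noise if delta is carried directly: %g\n", DBL_EPSILON);
    return 0;
}
[/code]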

The problem is it is still just a model, and when things would hit a shock wave (for example) in the real world, the model won't see it. Brute force calculation will.

A reasonable model of brute force would not take that much with today's computers. 8 to 16 GB of RAM with a quad core that has 128-bit floating point built in would do a lot. Get 20 of those in a cluster and you could include the twist in the coils and plasma sheath problems at the electron sources too.

Bussard saw the problem as having several zones and he also describes start up conditions. Modeling all that would be useful. Simple experiments to test the models would go a long way in figuring out what is really important, and what parts of the models need work. Those models can then be fed to engineering design crews to help look at what can be changed to optimize for power production. That's where real computational power would be needed and something like the Condor project might be handy.

Sigh, this is way too much fun!!!
:D

scareduck
Posts: 552
Joined: Wed Oct 17, 2007 5:03 am

Post by scareduck »

drmike wrote:You are right on the money. I'm using a 64-bit CPU with 1 GB of RAM, way bigger than anything available in the 1990s. And I'm looking at using 250 MB of RAM for a 400-step block of data in 3D space for the E field, and the same amount for the B field. The accuracy is OK, but for what Bussard is talking about I'd work with a fluid approximation rather than individual particles (or clumps of particles).
It turns out that I was a bit too hasty in my assertion that the newer Intel CPUs support 128-bit floats. It's only 80-bit according to the SDM:

http://www.intel.com/design/processor/m ... 253666.pdf

IEEE-754r, which does define a 128-bit (quad) float, is finally in balloting now. I'm guessing that means we will see hardware support around 2010-2015. (Some compilers may implement 96- or 128-bit floats, but I assume those come with the overhead of software emulation.)
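gcc's __float128 (with libquadmath for printing) is one such software type, if your gcc version ships it; a minimal sketch under that assumption, with the emulation overhead being exactly the catch:

[code]
/* Sketch: software 128-bit floats via gcc's __float128, assuming your gcc
 * provides it and libquadmath.  Build with: gcc quad.c -lquadmath */
#include <stdio.h>
#include <quadmath.h>

int main(void)
{
    __float128 total     = 1.0Q;
    __float128 departure = 1.0e-6Q;
    __float128 diff      = (total + departure) - total;

    char buf[128];
    quadmath_snprintf(buf, sizeof buf, "%.30Qg", diff);
    printf("recovered departure (quad) : %s\n", buf);

    quadmath_snprintf(buf, sizeof buf, "%.5Qg", FLT128_EPSILON);
    printf("FLT128_EPSILON             : %s  (113-bit significand)\n", buf);
    return 0;
}
[/code]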
A reasonable model of brute force would not take that much with today's computers. 8 to 16 GB of RAM with a quad core that has 128-bit floating point built in would do a lot. Get 20 of those in a cluster and you could include the twist in the coils and plasma sheath problems at the electron sources too.
This more or less describes the kinds of machines I have at my disposal in my day job. I'm not sure about the quantity, though. Not that I'm suggesting an illicit use for these, but it would be useful from the perspective of getting a rough order of magnitude for the cost of the machines, as we buy them by the pallet and get a healthy discount (I don't write hardware purchase orders...). Dell's list price for the PowerEdge Rack 900 starts at just under $9k, but filled out with quad-core processors and 24 GB of RAM, you're looking at a $20k machine.
That's where real computational power would be needed and something like the Condor project might be handy.
Heh. Reading the docs on Condor, it appears to be as much an interesting social experiment as a job-scheduling mechanism. You'd still have to manage the piecewise distribution of the simulation, which is a non-trivial task. (We use Sun's SGE under Linux for this, which basically assumes a dedicated farm environment.) Some of the problems he encountered sound an awful lot like the famous tragedy of the commons.
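For the piecewise distribution itself, the usual starting point is a plain domain decomposition, whatever ends up doing the scheduling (Condor, SGE, or raw MPI). A bare-bones sketch; the grid size and slab layout are assumptions, not anybody's actual code:

[code]
/* Bare-bones domain decomposition sketch: split an N^3 grid into slabs
 * along z, one slab per MPI rank.  No physics and no halo exchange, just
 * the bookkeeping a distributed field solver starts from.
 * Build with an MPI wrapper, e.g.: mpicc slabs.c -o slabs */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 400   /* grid points per axis (assumed, matching the thread) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Each rank owns a contiguous range of z-planes. */
    int base = N / nprocs, rem = N % nprocs;
    int nz   = base + (rank < rem ? 1 : 0);
    int z0   = rank * base + (rank < rem ? rank : rem);

    /* Local storage for one scalar field on this rank's slab. */
    double *field = malloc((size_t)nz * N * N * sizeof *field);
    if (field == NULL) {
        fprintf(stderr, "rank %d: out of memory\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    printf("rank %d of %d owns z-planes [%d, %d): %.1f MB local\n",
           rank, nprocs, z0, z0 + nz,
           (double)nz * N * N * sizeof *field / (1024.0 * 1024.0));

    free(field);
    MPI_Finalize();
    return 0;
}
[/code]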

MSimon
Posts: 14334
Joined: Mon Jul 16, 2007 7:37 pm
Location: Rockford, Illinois
Contact:

Post by MSimon »

I think 80-bit floats have been common since the AMD 9511 chip. A very nice piece of work. They used Chebyshev approximations for the trig functions.

I believe the 80-bit representation was only internal, but my memory is hazy. I designed them into a board around '81, I think.

http://en.wikipedia.org/wiki/IEEE_float ... t_standard

Oh yeah, the 9511 was a stack oriented machine. Heh.

Here is some background on the 8087, Intel's FPU:

http://en.wikipedia.org/wiki/Intel_8087
Engineering is the art of making what you want from what you can get at a profit.

scareduck
Posts: 552
Joined: Wed Oct 17, 2007 5:03 am

Post by scareduck »

AMD says their latest processor has a 128-bit floating-point internal data path, but they don't say whether they support the IEEE-754r quad precision format:

http://www.amd.com/us-en/assets/content ... /44109.pdf

I must say that AMD's website is pretty badly organized.

dch24
Posts: 142
Joined: Sat Oct 27, 2007 10:43 pm

Post by dch24 »

scareduck wrote:AMD says their latest processor has a 128-bit floating-point internal data path
That might refer to loads and stores, and not to the precision (e.g. IEEE-754r). Here's a quote from AMD's web site:
128-bit SSE floating-point capabilities enable each processor to simultaneously execute up to four floating-point operations per core (four times the floating-point computations of Second-Generation AMD Opteron processors) to significantly improve performance on compute-intensive floating-point applications.
Also, AMD published an erratum correcting what some people were claiming was 128-bit floating-point precision. It's not. :-(
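You can see the distinction right in the intrinsics: the 128 bits are four packed single-precision lanes processed at once, not one wider number. A minimal sketch with plain SSE (nothing AMD-specific):

[code]
/* Sketch: "128-bit" SSE means four 32-bit floats per instruction,
 * not a 128-bit-precision number. */
#include <stdio.h>
#include <xmmintrin.h>   /* SSE intrinsics */

int main(void)
{
    float a[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float b[4] = { 0.5f, 0.5f, 0.5f, 0.5f };
    float c[4];

    __m128 va = _mm_loadu_ps(a);     /* load 4 singles into one register */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vc = _mm_add_ps(va, vb);  /* one instruction, four additions  */
    _mm_storeu_ps(c, vc);

    printf("%g %g %g %g\n", c[0], c[1], c[2], c[3]);
    printf("each lane is still only %d bits wide\n", (int)(8 * sizeof(float)));
    return 0;
}
[/code]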
Last edited by dch24 on Fri Jan 18, 2008 8:31 pm, edited 1 time in total.

Stefan
Posts: 24
Joined: Mon Jul 09, 2007 9:49 am

Post by Stefan »

If we are talking 3D particle simulations, machine precision really isn't an issue.
It would take an extremely ridiculous number of particles to calculate field values with 30-bit precision.
The problem is that it also takes a pretty big number of particles for merely acceptable precision, due to the factor of 1E3 to 1E6 from the plasma being quasi-neutral. I think this was the main problem Dr. Bussard's team had.
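Rough numbers behind that, assuming the relative statistical noise of a sampled quantity scales like 1/sqrt(N) (a scaling sketch only, not a real particle-in-cell estimate):

[code]
/* Shot-noise style estimate: if relative fluctuation ~ 1/sqrt(N), how many
 * particles does a given relative precision require?  Scaling assumption
 * only; real particle-in-cell error budgets are more involved. */
#include <stdio.h>

int main(void)
{
    const double targets[] = { 1.0e-3,     /* "merely acceptable"      */
                               1.0e-6,     /* the quasi-neutral factor */
                               9.31e-10 }; /* ~2^-30, "30 bit"         */
    const char  *labels[]  = { "1e-3", "1e-6", "2^-30" };

    for (int i = 0; i < 3; i++) {
        double n = 1.0 / (targets[i] * targets[i]);  /* N ~ 1/precision^2 */
        printf("relative precision %-5s -> N ~ %.1e particles\n",
               labels[i], n);
    }
    return 0;
}
[/code]

The last case lands around 1e18 particles, which is the "extremely ridiculous" end of the scale.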

scareduck
Posts: 552
Joined: Wed Oct 17, 2007 5:03 am

Post by scareduck »

Stefan wrote:If we are talking 3D particle simulations, machine precision really isn't an issue.
It would take an extremely ridiculous number of particles to calculate field values with 30-bit precision.
Do you mean 30-decimal place precision?
The problem is that it also takes a pretty big number of particles for merely acceptable precision, due to the factor of 1E3 to 1E6 from the plasma being quasi-neutral. I think this was the main problem Dr. Bussard's team had.
Memory is not exactly cheap but it's less expensive than CPUs.

I still tend to think that floating-point precision is an issue, if only because of the point MSimon made above about cumulative errors.

scareduck
Posts: 552
Joined: Wed Oct 17, 2007 5:03 am

Post by scareduck »

UltraSPARC supports 128-bit quad floats:

http://opensparc-t1.sunsource.net/specs ... -P-EXT.pdf

Stefan
Posts: 24
Joined: Mon Jul 09, 2007 9:49 am

Post by Stefan »

I wrote 30-bit precision because MSimon estimated 33 bits as the precision a double would allow us to get. That is about 10 decimal places.

Apart from the memory, tracking more particles requires more CPU power too, and then there's also the issue of memory bandwidth.

The errors from having 'few' particles in the model will also accumulate, and compared to them the floating-point errors are tiny.
Using floats with far more precision than the numerical method can deliver anyway would be a waste of bandwidth, memory, and (depending on the CPU) computational power.

TallDave
Posts: 3140
Joined: Wed Jul 25, 2007 7:12 pm
Contact:

Post by TallDave »

I wonder what Nebel's team is doing in this vein.

MSimon
Posts: 14334
Joined: Mon Jul 16, 2007 7:37 pm
Location: Rockford, Illinois
Contact:

Post by MSimon »

Stefan wrote:The errors from having 'few' particles in the model will also accumulate, and compared to them the floating-point errors are tiny.
Depends on the number of iterations. Lots of iterations require longer words.

It would be nice if we had machines that could increase precision a few bits at a time to see what was really required.

Unfortunately precision only goes up in chunks.

It would be nice if we could compare 64 bit and 128 bit floats.
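In software you actually can turn the precision knob a few bits at a time: GNU MPFR lets you set the significand width per variable. A minimal sketch, assuming MPFR is installed (slow, but fine for a "how many bits do I really need" experiment):

[code]
/* Sketch: vary the significand width with GNU MPFR and watch where a
 * result stops changing.  Build with: gcc prec.c -lmpfr -lgmp */
#include <stdio.h>
#include <mpfr.h>

int main(void)
{
    for (int bits = 24; bits <= 128; bits += 8) {
        mpfr_t total, departure, diff;
        mpfr_init2(total, bits);        /* significand width in bits */
        mpfr_init2(departure, bits);
        mpfr_init2(diff, bits);

        mpfr_set_d(total, 1.0, MPFR_RNDN);
        mpfr_set_d(departure, 1.0e-6, MPFR_RNDN);
        mpfr_add(diff, total, departure, MPFR_RNDN);  /* diff = 1 + 1e-6 */
        mpfr_sub(diff, diff, total, MPFR_RNDN);       /* diff = diff - 1 */

        mpfr_printf("%3d bits: recovered departure = %.20Rg\n", bits, diff);
        mpfr_clears(total, departure, diff, (mpfr_ptr) 0);
    }
    return 0;
}
[/code]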

It would be nice if we could rent a "farm" from Sun.

I still think it would be good to set up a SETI@home-style network among ourselves to handle the problem, if it is feasible. It may not be, due to memory requirements.
Engineering is the art of making what you want from what you can get at a profit.

scareduck
Posts: 552
Joined: Wed Oct 17, 2007 5:03 am

Post by scareduck »

Setting up a distributed grid would be interesting. I have a feeling the most immediate problem wouldn't be memory space; it would be the OS. I run Linux and MacOS at home, and I suspect most people run Windows.
