Skynet is coming.

Discuss life, the universe, and everything with other members of this site. Get to know your fellow polywell enthusiasts.

Moderators: tonybarry, MSimon

hanelyp
Posts: 2261
Joined: Fri Oct 26, 2007 8:50 pm

Post by hanelyp »

As an example of what modern CPUs can do, consider the pseudo assembly code:

Code: Select all

load A from X
inc A
store A to X
load A from Y
dec A
store A to Y
The instruction decoder can note that the second half of that code doesn't depend on the results of the first half, allocate a separate internal register to play the role of A for each half, and execute the second half in parallel with the first. Instead of executing code [1,2,3,4,5,6], code is executed [(1,4),(2,5),(3,6)], out of the order presented.

Apply this to more sets of potentially parallel instructions with more execution units, and a substantial speed increase relative to clock speed is possible over the old 80486 tech.

ScottL
Posts: 1122
Joined: Thu Jun 02, 2011 11:26 pm

Post by ScottL »

hanelyp wrote:As an example of what modern CPUs can do, consider the pseudo assembly code:

Code: Select all

load A from X
inc A
store A to X
load A from Y
dec A
store A to Y
The instruction decoder can note that the second half of that code doesn't depend on the results of the first half, allocate a separate internal register to play the role of A for each half, and execute the second half in parallel with the first. Instead of executing code [1,2,3,4,5,6], code is executed [(1,4),(2,5),(3,6)], out of the order presented.

Applying this to more sets of potentially parallel instructions with more execution units and a substantial speed increase relative to clock speed is possible over the old 80486 tech.
A good summary. I agree completely. There are arguments, however, about how much power programmers need to leverage. Each industry requires a different level of control. Most coding is done with complete disregard for how it will be converted to assembly or machine code. Most of us are more worried about compatibility and leave that level of control to the OS; then again, most coding now is web-related.

Luzr
Posts: 269
Joined: Sun Nov 22, 2009 8:23 pm

Post by Luzr »

ScottL wrote:
Luzr wrote:
ScottL wrote: If they are linear yes, otherwise no.
I don't think you read my comment right. Your link reiterates exactly what I said. To avoid execution out of order, linear code is "queued" accordingly, otherwise it can be combined in a single cycle. The article states this and I agree 100%.
The question was whether they are visible in programming model. You cannot answer this "yes" (not even conditionally) if you understand the issue.

Plus, there is no "avoiding" out-of-order execution. Out-of-order execution is the desired effect: it works around memory (and cache) latencies and the momentary availability of execution units.
Last edited by Luzr on Mon Feb 20, 2012 11:00 am, edited 2 times in total.

Luzr
Posts: 269
Joined: Sun Nov 22, 2009 8:23 pm

Post by Luzr »

A good summary. I agree completely. There are arguments however; about how much power programmers need to leverage. Each industry requires a different level of control. Most coding is done with a complete disregard of how it will be converted to assembly or machine code. Most of us are more worried about compatibility and leave that level of control to the OS, then again most coding now is web-related.
I think you still show a good degree of "not understanding". Modern CPU OOO architecture actually works independently of the programming model; nothing is required from the programmer to exploit it. You are simply getting much more performance through smarter silicon that achieves a higher level of parallelism (not 'multi-core' parallelism, but instruction-level parallelism), without increasing clock speed and without changing the code.

Luzr
Posts: 269
Joined: Sun Nov 22, 2009 8:23 pm

Post by Luzr »

hanelyp wrote:As an example of what modern CPUs can do, consider the pseudo assembly code:

Code: Select all

load A from X
inc A
store A to X
load A from Y
dec A
store A to Y
The instruction decoder can note that the second half of that code doesn't depend on the results of the first half, allocate a separate internal register to play the role of A for each half,
Actually, I think the better explanation is to say that it allocates a separate internal register for the second 'A' (register renaming), and THIS makes the second half independent of the first half.

But this is just the 'superscalar' level. With OOO, it is also possible that if Y is in cache and X is not, the second half gets executed 40 CPU cycles before the first half (and in between, other instructions get executed).

krenshala
Posts: 914
Joined: Wed Jul 16, 2008 4:20 pm
Location: Austin, TX, NorAm, Sol III

Post by krenshala »

Luzr wrote:
A good summary. I agree completely. There are arguments however; about how much power programmers need to leverage. Each industry requires a different level of control. Most coding is done with a complete disregard of how it will be converted to assembly or machine code. Most of us are more worried about compatibility and leave that level of control to the OS, then again most coding now is web-related.
I think you still show a good degree of "not understanding". Modern CPU OOO architecture actually works independent of programming model, nothing is required from programmer to exploit it. You are simply getting much more performance through smarter silicon that achieves higher level of parallelism (not 'multi-core' parallelism, but instruction level parallelism), without increasing clock speed and without changing the code.
ScottL never said the programmer was required to do anything special to exploit OOO. However, he did imply that a good programmer can do things to streamline the software so things run more efficiently. Why is it you can't seem to understand this?

DeltaV
Posts: 2245
Joined: Mon Oct 12, 2009 5:05 am

Post by DeltaV »

Lab-grown meat is first step to artificial hamburger


... or living-tissue camouflage for time-jumping Terminators.
Dr Steele, who is also a molecular biologist, said he was also concerned that unhealthily high levels of antibiotics and antifungal chemicals would be needed to stop the synthetic meat from rotting.
Not a problem. Terminator fighting frames have lots of internal room for antibiotic/antifungal ampules.

hanelyp
Posts: 2261
Joined: Fri Oct 26, 2007 8:50 pm

Post by hanelyp »

krenshala wrote:ScottL never said the programmer was required to do anything special to exploit OOO. However, he did imply that a good programmer can do things to streamline the software so things run more efficiently. Why it is you can't seem to understand this?
Such programmer tricks depend on knowing the specific hardware the software will run on. Introduce the next generation CPU and the older code won't have the tricks needed to get the best out of the new hardware. The genius of the current high end processors is that they can apply the improved performance to older code.

At one time it was common for high performance RISC processors to have lots of registers, allowing lots of software level optimization. But if an upgraded model had more registers to support more execution units, old code couldn't take advantage. The new hardware would likely even need a revamped instruction set.

ScottL
Posts: 1122
Joined: Thu Jun 02, 2011 11:26 pm

Post by ScottL »

hanelyp wrote:
krenshala wrote:ScottL never said the programmer was required to do anything special to exploit OOO. However, he did imply that a good programmer can do things to streamline the software so things run more efficiently. Why it is you can't seem to understand this?
Such programmer tricks depend on knowing the specific hardware the software will run on. Introduce the next generation CPU and the older code won't have the tricks needed to get the best out of the new hardware. The genius of the current high end processors is that they can apply the improved performance to older code.

At one time it was common for high performance RISC processors to have lots of registers, allowing lots of software level optimization. But if an upgraded model had more registers to support more execution units, old code couldn't take advantage. The new hardware would likely even need a revamped instruction set.
Outside of frameworks and paradigm choices, coding has not changed in the last 15 years. This has nothing to do with modern CPU vs 20 year old CPUs. This is more OS dependent, not processor.

Edit: Clarification, syntax hasn't changed and the compilation from high level code to low level code hasn't changed that much. Speaking from experience writing compilers, linker-loaders, and assemblers. You're still only optimising for x86 or x64, you tell the compiler which you intend to use, x86 being the same since like the P1 or P2 if not earlier.

Luzr
Posts: 269
Joined: Sun Nov 22, 2009 8:23 pm

Post by Luzr »

ScottL wrote: Edit: Clarification, syntax hasn't changed and the compilation from high level code to low level code hasn't changed that much. Speaking from experience writing compilers, linker-loaders, and assemblers. You're still only optimising for x86 or x64, you tell the compiler which you intend to use, x86 being the same since like the P1 or P2 if not earlier.
That's true. But it has little to do with the original debate... :)

ScottL
Posts: 1122
Joined: Thu Jun 02, 2011 11:26 pm

Post by ScottL »

Luzr wrote:
ScottL wrote: Edit: Clarification, syntax hasn't changed and the compilation from high level code to low level code hasn't changed that much. Speaking from experience writing compilers, linker-loaders, and assemblers. You're still only optimising for x86 or x64, you tell the compiler which you intend to use, x86 being the same since like the P1 or P2 if not earlier.
That's true. But has a little to do with original debate... :)

Part of original debate:
Luzr wrote:Not sure what you mean, but I am quite sure that if you as programmer are unable to put those cores to work where possible, you should perhaps consider another career.
So now you agree with me and not your own previous statement? Programmers aren't coding for cores, they're coding for the OS. There are exceptions, but they are really rare, Battlefield 3 being the only one off the top of my head. So unless you're an OS writer, which most are not, knowing how to "put those cores to work" is largely pointless.

That's all on this specific subset of the original conversation.

Diogenes
Posts: 6968
Joined: Mon Jun 15, 2009 3:33 pm

Post by Diogenes »

Here's a victory for OUR SIDE. (Against the Machines. (and the animal rights crackpots))


Animal rights group says drone shot down

He said the animal rights group decided to send the drone up anyway.

"Seconds after it hit the air, numerous shots rang out," Hindi said in the release. "As an act of revenge for us shutting down the pigeon slaughter, they had shot down our copter."

http://thetandd.com/animal-rights-group ... z1n3Xmwx3h

:)
‘What all the wise men promised has not happened, and what all the damned fools said would happen has come to pass.’
— Lord Melbourne —

Diogenes
Posts: 6968
Joined: Mon Jun 15, 2009 3:33 pm

Post by Diogenes »

Will Artificial Intelligences Find Humans Enjoyable?

(Yes, they'll find us very tasty.)

But why should AIs find humans individually valuable or valuable as a group? Once AIs are smarter and can rewrite their own software why should they want human companionship? What will we be able to say to them that will seem remotely interesting? What types of observations or proposals will the smarter humans be able to make that cause AIs to appreciate having us around? What will we be able to do for them that they won't be able to do better for themselves?

http://www.futurepundit.com/archives/008521.html

I for one welcome our new Machine overlords.

:)
‘What all the wise men promised has not happened, and what all the damned fools said would happen has come to pass.’
— Lord Melbourne —

Luzr
Posts: 269
Joined: Sun Nov 22, 2009 8:23 pm

Post by Luzr »

ScottL wrote:
Luzr wrote:
ScottL wrote: Part of original debate: So now you agree with me and not your own previous statement?
That's all on this specific subset of the original conversation.
No. I guess you are mixing two things. There are two directions in increasing the performance:

- increasing performance per core
- increasing number of cores

I think we both agree that you need to adjust your code if you are going from single-core code to multicore (and IME, the only real distinction is one core vs. multiple; it is usually easy to write the code so that it adapts to any number of cores available, as long as the model is symmetric).

Anyway, what you have stated that I do not agree with is this:
each core contains the capability of a single P4 x.x GHz chip. The core is the idea that your single processor was busy so pass work to be done to another core. Furthermore, the law isn't about performance over price. The law, which is 30 years old, states that processing power would double every 18-24 months. From 2002 to 2012, 10 years, we should've seen a 5x speed increase. Assuming a Quad core at 2.2Ghz per core, 8.8 GHz, assuming your mentioned 2GHz after 2 years should be 4GHz, after 2 more should be 8GHz, after 2 more should be 16GHz, 2 more 32GHz, and finally 2 more for 64GHz
or this
This is the number of instructions per cycle, not the measure of cycles. Think of it as 2 hoses, the same size and generally the same through-put. You're pouring water through that hose, but in the "new quad core hose" you have a tech that is able to shrink the water molecules before entering the hose. Well of course you can pump more molecules through. The hose is still the same size and stuff in general still flows through it the same speed.
and especially this:
You have a tightened instruction set. Modify any P4 to handle the same instruction set and you'll get the same amount of instructions handled per cycle.
So if you think that IPC improvements are (only) the result of a "tightened instruction set", you simply do not understand the issue.

(That is not to say that the ISA is not being extended and that this extension does not lead to increased performance. But 3x more IPC works for the SAME code on a SINGLE core by increasing its processing width.)

ScottL
Posts: 1122
Joined: Thu Jun 02, 2011 11:26 pm

Post by ScottL »

I think we both agree that you need to adjust your code if you are going from single-core code to multicore (and IME, the only real distinction is one core vs. multiple; it is usually easy to write the code so that it adapts to any number of cores available, as long as the model is symmetric).
I disagree. It's not important to the majority of programmers, who reside in the space of application development or web development. I'd estimate that 99 out of 100 programmers reside in this space. There is nothing in the higher-level programming paradigms that gives a flying rat's ass about the number of cores. Obviously, this is not the case for OS architects at, say, Microsoft or the open source community contributing to various flavors of Linux/Unix, but it is true for most programmers.


As for the other discussions, what do you feel are the major differences between a single core processor and a single core of a dual core processor without mentioning caching?
