@vikpaw: Regarding Hyper-Threading; yes, it does make it appear as though there's another core to hand out, but it isn't real and will quite often take CPU resources away from another process.
For those that aren't sure what it is, it was designed to stop the old P4 from being so slow. A pipeline in a CPU is made up of all the stages a chunk of program goes through when it gets processed, and each stage takes one CPU clock cycle. Generally speaking, the longer a pipeline is, the simpler each stage can be, so the CPU clock can run faster; the catch is that each instruction takes more clock cycles to get from one end of the pipeline to the other (in theory the higher clock speed more than makes up for this). A shorter pipeline limits the maximum clock speed earlier, but gets each instruction through in fewer cycles. Clear as mud?! This is one reason why the earliest P4s were slower than the latest P3s (which had a shorter pipeline): the clock speed of those early P4s wasn't high enough to overcome the penalty of the longer pipeline.
The P4 had a very long pipeline because Intel wanted it to get to silly speeds (4GHz was the initial promise), but they soon discovered that not everything being sent down the pipeline was using every stage, so quite a lot of the pipeline was going unused during each clock cycle. This led them to introduce Hyper-Threading (HT). It 'fooled' the OS into thinking there were two CPU cores in the system, so the OS would allocate two chunks of program at the same time. The HT system in the CPU would then slot one chunk in alongside the other so that more of the pipeline was being used per clock cycle, but this quite often slowed down both processes. In a desktop scenario HT worked quite well; we were more interested in being able to do more at once and didn't notice the latency impact HT was having. In a server environment, latency matters much more, so in the P4 days it was often quicker to run a single-core server CPU with HT off than with it on.
AMD have always had shorter pipelines and never needed HT. This is why the Athlon64 was so much quicker than the P4 even though it ran at lower clock speeds (fewer GHz). This led to the shameful period of Intel's marketing department trying to fool us into thinking more GHz was always better. Yeah right!
Today's CPUs all have shorter and more efficient pipelines than the old P4. In simplistic terms, because the pipeline is shorter, there's much less impact if some of the pipeline goes unused (the P3 and AMD way of thinking, if you like) and the latency impact of trying to cram two chunks down one pipeline is relatively higher. That and we also have 2 or more real cores to play with, not fake ones.
The performance of hyper-threaded environments varies. Conservative testing has shown 10 to 20 percent gains for SQL Server workloads, but the application patterns have a significant effect. You might find that some applications do not receive an increase in performance by taking advantage of hyper-threading. If the physical processors are already saturated, using logical* processors can actually reduce the workload achieved.
If you look at the SIMS CPU usage as described in my previous post, you'll see that a single person using SIMS can max out one CPU core, rather than have the load spread over several cores, ergo, in my opinion, the last sentence of the quote comes into effect.
EDIT: * a logical processor is an HT CPU core.
EDIT2: What I didn't mention above is that when you have HT enabled, not only is there contention in the pipeline, but also in the L1, L2 and L3 caches. One thread (chunk of work) may throw out the other thread's L1-3 data in favour of its own, which will in turn get thrown out by the other thread, meaning lots of return trips to main RAM (which is much slower than the caches).
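If you want to see the 'fake' cores for yourself, on Linux you can compare the number of logical processors the OS schedules on with the number of real cores underneath them. A rough sketch, assuming the usual /proc/cpuinfo layout (it falls back to the logical count on systems without topology info):

```python
import os

def count_cpus(cpuinfo_path="/proc/cpuinfo"):
    """Count logical processors and physical cores on a Linux box."""
    logical = 0
    cores = set()          # distinct (physical id, core id) pairs = real cores
    physical_id = core_id = None
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                key, _, value = line.partition(":")
                key = key.strip()
                if key == "processor":
                    logical += 1          # one entry per logical CPU
                elif key == "physical id":
                    physical_id = value.strip()
                elif key == "core id":
                    core_id = value.strip()
                    cores.add((physical_id, core_id))
    except OSError:
        # Not Linux (or /proc unavailable): all we can see is the logical count.
        n = os.cpu_count() or 1
        return n, n
    # Some kernels/VMs omit topology fields; fall back to the logical count.
    physical = len(cores) or logical
    return logical, physical

logical, physical = count_cpus()
print(f"{logical} logical processors, {physical} physical cores")
# With HT enabled you'd expect logical == 2 * physical.
```

On an HT box this reports twice as many logical processors as real cores, which is exactly the 'extra core to hand out' illusion described above.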
Last edited by NorthernSands; 26th June 2011 at 08:18 AM.
Reason: Clarity & extra info
One thing I would mention is that I thought hyper-threading was useful, certainly on a VM host, as the hypervisor manages it.
I would have thought so too. VMware and Microsoft both recommend leaving HT enabled since their hypervisors can distinguish between physical and logical processors. The best thing to do would be to test both configurations.
Performance Best Practices for VMware vSphere 4.1
Hyper-threading technology (recent versions of which are called symmetric multithreading, or SMT) allows a single physical processor core to behave like two logical processors, essentially allowing two independent threads to run simultaneously. Unlike having twice as many processor cores—that can roughly double performance—hyper-threading can provide anywhere from a slight to a significant increase in system performance by keeping the processor pipeline busier.
If the hardware and BIOS support hyper-threading, ESX automatically makes use of it. For the best performance we recommend that you enable hyper-threading.
An ESX system enabled for hyper-threading will behave almost exactly like a system without it. Logical processors on the same core have adjacent CPU numbers, so that CPUs 0 and 1 are on the first core, CPUs 2 and 3 are on the second core, and so on.
ESX systems manage processor time intelligently to guarantee that load is spread smoothly across all physical cores in the system. If there is no work for a logical processor it is put into a special halted state that frees its execution resources and allows the virtual machine running on the other logical processor on the same core to use the full execution resources of the core. (Source)
Q. Does Hyper-Threading affect Hyper-V?
A. The new four-core Intel Core i7 processor enables hyper-threading, which splits each processor core into two virtual cores to (potentially) improve performance.
The concern with Hyper-V and hyper-threading is that you assign a number of processor cores to each virtual machine (VM). Imagine that you assign one processor each to two guest VMs from the Hyper-V management console, thinking that each is going to use a separate core. What if the hypervisor assigns each of the VMs to the same physical core, with each getting a virtual core? You'd potentially get lousy performance and three physical cores not doing much, where you'd have liked each VM to get its own physical core.
Fortunately, this isn't the case. Microsoft has done a lot of work around Hyper-Threading and Hyper-V. Essentially, while Hyper-Threading will aid performance sometimes, it will never hurt performance, so Hyper-Threading should be enabled. (Source)
By default vSphere will prefer physical cores to logical cores, although you can change this behaviour if required.
vSphere will prefer to spread virtual CPUs across NUMA nodes (option one above) to gain the benefit of more physical cores. But if you are running an application where memory throughput is more important than processor speed, you should consider testing a change vSphere’s default behavior. You can do this by setting the ESX 4.1 advanced parameter NUMA.preferHT to 1. This will configure the scheduler to prefer consolidating threads on logical processors on a single NUMA instead of using more physical cores across multiple nodes. (Source)
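For reference, the setting that quote mentions looks something like this on a classic ESX 4.x host. Treat the exact syntax as an assumption from memory and check the documentation for your version:

```
# Host-wide advanced option, from the service console:
esxcfg-advcfg -s 1 /Numa/PreferHT

# Or via the vSphere Client: Advanced Settings -> NUMA -> NUMA.preferHT = 1
```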
Last edited by Arthur; 26th June 2011 at 12:07 PM.
I would have thought so too. VMware and Microsoft both recommend leaving HT enabled since their hypervisors can distinguish between physical and logical processors. The best thing to do would probably be to test both configurations.
I would agree, test both scenarios. I would still argue that when a CPU core is being hammered, HT is a bad thing (a bit simplistic, but that's the idea). SIMS using SQL is an example of this. However, Arthur is correct in saying that modern hypervisors utilise HT far more effectively than before. You can set certain vhosts to not allow more than one thread to run on a single physical core (effectively disabling HT on that core). (Source, then look at the sub-topic.)
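For what it's worth, outside a hypervisor you can get a similar one-thread-per-physical-core effect on a Linux box by pinning a process to just the first HT sibling of each core. A sketch, not a recommendation; the sysfs paths are an assumption about your kernel, and it degrades gracefully where topology info is missing:

```python
import os

def first_sibling_cpus():
    """Pick one logical CPU per physical core using Linux sysfs topology."""
    try:
        cpus = sorted(os.sched_getaffinity(0))  # Linux-only call
    except AttributeError:
        cpus = list(range(os.cpu_count() or 1))
    chosen, seen_cores = set(), set()
    for cpu in cpus:
        path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
        try:
            with open(path) as f:
                siblings = f.read().strip()   # e.g. "0,4" - same for both HT twins
        except OSError:
            chosen.add(cpu)                   # no topology info: keep the CPU as-is
            continue
        if siblings not in seen_cores:
            seen_cores.add(siblings)          # first sibling of this core wins
            chosen.add(cpu)
    return chosen

cpus = first_sibling_cpus()
if hasattr(os, "sched_setaffinity"):
    os.sched_setaffinity(0, cpus)  # this process now avoids HT siblings
print(f"running on {len(cpus)} of {os.cpu_count()} logical CPUs")
```

With HT on, the chosen set is half the logical CPUs, so a cache-hungry workload like that SQL example never shares a core's pipeline and L1/L2 with a sibling thread.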
My opinion is still that a core running SIMS SQL should not be shared.