It's often been suggested that the CPU-based implementation of NVIDIA's PhysX API is severely lacking, giving GPU-accelerated physics a significant advantage for reasons other than the additional processing horsepower available on modern graphics boards. "But where's the proof?" I hear you shout. Right here in the linked article.
Overall, the results are somewhat surprising. In each case, the PhysX libraries are executing with an IPC>1, which is pretty good performance. But at the same time, there is a disturbingly large amount of x87 code used in the PhysX libraries, and no SSE floating point code. Moreover, PhysX code is automatically multi-threaded on Nvidia GPUs by the PhysX and device drivers, whereas there is no automatic multi-threading for CPUs.
Read the full article at Real World Technologies.