I'm not really active here, but I feel obligated to add to this for the benefit of the community. As stated, an AMD FX-8350 has 8 cores: 4 modules with 2 integer cores on each. The two cores in a module share the L2 cache, the fetch/decode front end and, most importantly, a single floating-point unit (FPU), unlike Intel cores of the same era, which each have their own. That sharing is why FX cores drop in productivity: when both cores in a module are busy they compete for the shared hardware, and this creates the bottleneck. The theory is that an FX chip is 100% efficient up to 50% load, but each extra thread from 50-100% load only delivers about HALF a core's worth of work. That assumes you can max out 1 core on each module first, which is what gets you to 50%. This effectively makes an FX-8350 worth about 6 cores of power (4 full cores plus 4 half cores), give or take.
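To put numbers on that rule of thumb, here is a minimal Python sketch. The function name `effective_cores` and the 0.5 yield for the second core in a module are just this post's rough theory written out, not measured figures, and it assumes the scheduler fills one core per module before doubling up.

```python
def effective_cores(active_threads, modules=4, second_core_yield=0.5):
    """Rough 'core equivalents' under the rule of thumb above.

    Assumption (not a benchmark): the first thread on each module runs at
    full speed, and a second thread on the same module adds only
    `second_core_yield` of a core because it shares the module's resources.
    """
    if not 0 <= active_threads <= 2 * modules:
        raise ValueError("thread count must fit on the chip")
    first_per_module = min(active_threads, modules)      # threads with a module to themselves
    second_per_module = active_threads - first_per_module  # threads sharing a module
    return first_per_module + second_per_module * second_core_yield


if __name__ == "__main__":
    for threads in range(1, 9):
        print(threads, "threads ->", effective_cores(threads), "core equivalents")
```

With all 8 threads loaded this comes out to 4 + 4 x 0.5 = 6, which is where the "about 6 cores" figure comes from. Real workloads that barely touch the FPU will do better than this, and FPU-heavy ones can do worse.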
What hurts the FX cores even more is that they have a noticeably deeper pipeline than Intel (more stages, I don't have the exact counts handy) and considerably lower IPC (instructions per clock). Think of the pipeline like an assembly line. Instructions have to go down it stage by stage to be executed, and at every branch the CPU has to guess which way the code will go. Often it guesses wrong, and then everything already in the pipeline has to be thrown out and the line refilled from the correct path. The deeper the pipeline, the more cycles each wrong guess costs, and that drags IPC down further. So the combination of a DEEPER pipeline and far lower IPC is the reason why a dual-core Pentium G860 can execute a quad-threaded task more efficiently than an FX-4100. The architecture improvements in Piledriver increased IPC, and along with an unlocked multiplier (which lets you push the clock speed higher), they can offset some of this deficiency. Piledriver also improved cache latency, especially at L2, which helps a bit for gaming. This is the more difficult side of the explanation.
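If it helps, the pipeline argument can be sketched with the usual back-of-the-envelope model: every wrong guess costs roughly a pipeline's worth of cycles, and overall speed is IPC times clock. Everything below is illustrative Python. The function names and every number (base IPC, branch fraction, miss rate, flush penalties, clocks) are made-up assumptions for the example, not real figures for any AMD or Intel chip.

```python
def effective_ipc(base_ipc, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """IPC after paying for pipeline flushes.

    Simple textbook-style model (an assumption, not a simulator):
    cycles per instruction = ideal CPI + average flush cycles per instruction.
    """
    base_cpi = 1.0 / base_ipc
    stall_cpi = branch_fraction * mispredict_rate * flush_penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


def instructions_per_second(ipc, clock_ghz):
    """Throughput is simply IPC times clock speed."""
    return ipc * clock_ghz * 1e9


if __name__ == "__main__":
    # Hypothetical workload: 20% of instructions are branches, 5% guessed wrong.
    shallow = effective_ipc(2.0, 0.20, 0.05, flush_penalty_cycles=12)
    deep = effective_ipc(2.0, 0.20, 0.05, flush_penalty_cycles=20)
    print("shallow pipeline IPC:", round(shallow, 3))
    print("deep pipeline IPC:   ", round(deep, 3))

    # Clock the deep design would need to match the shallow one at 3.0 GHz.
    needed_ghz = 3.0 * shallow / deep
    print("clock needed to catch up (GHz):", round(needed_ghz, 2))
```

The point of the last line is the one made above about the unlocked multiplier: a design with lower IPC can claw some of the gap back with clock speed, but it has to run faster just to break even.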