Author: petergo
Intel Xeon 5570: Smashing SAP records

petergo posted on 2009-1-31 09:32:41
We have emphasized it more than once: the Nehalem architecture is all about regaining the performance crown in servers and HPC; desktop and mobile use were sometimes a bonus, sometimes an afterthought. Today it becomes almost painfully obvious. Just read Anand's thoughts about the Core i7:


    "The Core i7's general purpose performance is solid, you're looking at a 5 - 10% increase in general application performance at the same clock speeds as Penryn"

and now look at the graph below.


Intel has apparently allowed HP and Fujitsu-Siemens to break the NDA on the Xeon 5570 processor for PR reasons, as both companies have published SAP numbers on a dual Xeon 5570. The Xeon 5570 is based on the same architecture as the Core i7. It is a 2.93 GHz quad-core CPU with four 256 KB L2 caches and one huge shared 8 MB L3.

The SAP numbers are absolutely astonishing, as Intel's dual-socket machine is able to outperform quad-socket Opteron machines. Based on the scaling of Barcelona, we speculate that a quad Shanghai at 2.7 GHz would reach the performance of the dual Xeon 5570 without Hyper-Threading (HT). The new Xeon 5570 outperforms the "old" 5450 by 119%!
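
To put the percentages quoted in this post on a common footing, the arithmetic can be sketched as below. This is only an illustration: the helper names are made up for this example, and the only inputs are figures already stated in the text (119%, 80%, and a two-socket machine matching or beating four-socket machines). No benchmark scores are assumed.

```python
def speedup_factor(percent_faster: float) -> float:
    """Convert 'X% faster' into a throughput multiplier (hypothetical helper)."""
    return 1.0 + percent_faster / 100.0


def min_per_socket_advantage(sockets_a: int, sockets_b: int) -> float:
    """Lower bound on the per-socket throughput ratio when a system with
    sockets_a sockets at least matches one with sockets_b sockets."""
    return sockets_b / sockets_a


if __name__ == "__main__":
    # Figures taken from the post itself, not from any result sheet.
    print(f"119% faster -> {speedup_factor(119):.2f}x the throughput")
    print(f"80% faster  -> {speedup_factor(80):.2f}x the throughput")
    print(f"2 sockets matching 4 -> at least "
          f"{min_per_socket_advantage(2, 4):.1f}x per socket")
```

In other words, "119% faster" means a bit more than double the throughput, and a dual-socket box that keeps up with a quad-socket one needs at least twice the throughput per socket.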

These numbers are so high that we checked and checked again. The database used is the same (SQL Server 2005), so unless there is some incredible tuning parameter that HP and Fujitsu-Siemens have discovered and that we have yet to hear about, that is not it.

At this point we have no idea how it is possible that a 3 GHz Nehalem outperforms the latest Opteron by a margin of 80% or more. But we can give it a try. In a previous server-oriented article, we summed up a rough profile of SAP S&D (Sales and Distribution):

• Very parallel resulting in excellent scaling
• Low to medium IPC, mostly due to “branchy” code
• Not really limited by memory bandwidth
• Likes large caches
• Sensitive to Sync (“cache coherency”) latency

One of the biggest bottlenecks for Intel has been the sync latency. It is possible that once the "sync" bottleneck was removed, the Intel architecture was able to show its real integer-crunching power thanks to out-of-order loads (memory disambiguation) and better branch prediction. Those are two areas where the Opteron architecture is still weak.

The slightly lower latency of Nehalem's L3 cache helps too. This kind of software also makes the buffers fill up due to the long dependency chains. Those out-of-order (OOO) buffers have been enlarged, and the dependency chains have effectively been shortened by a very low-latency L2 cache and a relatively fast L3.
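
The reasoning about dependency chains can be made concrete with a toy model. The sketch below is not a simulator and says nothing about real Nehalem or Opteron timings: the function names, the latencies, and the window sizes are placeholder assumptions chosen only to show the shape of the effect. Dependent loads serialize, so a lower load latency shortens the chain, and a larger OOO window lets more independent chains overlap.

```python
def chain_cycles(n_loads: int, load_latency: int) -> int:
    """Cycles to finish a chain in which each load needs the previous result.
    Dependent loads cannot overlap, so their latencies simply add up."""
    return n_loads * load_latency


def overlapped_cycles(n_chains: int, n_loads: int,
                      load_latency: int, window: int) -> int:
    """Cycles for several independent chains when the OOO buffers can keep
    `window` chains in flight at once (toy model, ignores everything else)."""
    batches = -(-n_chains // window)  # ceiling division
    return batches * chain_cycles(n_loads, load_latency)


if __name__ == "__main__":
    # All numbers below are illustrative placeholders, not measurements.
    print(chain_cycles(100, 15))             # slower cache: longer chain
    print(chain_cycles(100, 10))             # faster cache: shorter chain
    print(overlapped_cycles(8, 100, 10, 4))  # small window: two batches
    print(overlapped_cycles(8, 100, 10, 8))  # larger window: one batch
```

In this model, cutting the load latency shortens every chain in proportion, while enlarging the window reduces how many batches the independent chains need. Those are the two effects the paragraph above attributes to the faster caches and the larger buffers.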

Still, we are absolutely amazed that the difference is this large. We would have expected Nehalem to outperform Shanghai by lower margins. Although we are still a bit skeptical that the difference is this large ("too good to be true" syndrome), we do not see how you could artificially inflate an SAP benchmark. It sure is not as easy as with SPECjbb or SPECfp/int.

[Attached image: 17917.png]