[olug] OT-CPU Benchmarks

Sun Feb 22 21:33:23 UTC 2004

>> dthacker9 at cox.net said:
>> > I'm looking for a benchmark that would allow me to make
>> > a meaningful comparison of the processing power of the SPARC
>> > chips in this 5 year old server vs today's Intel/AMD/PPC
>> > chips.  Has anyone found a benchmark that they would
>> > consider accurate for this type of comparison?

On Thursday 19 February 2004 08:46 pm, Daniel Linder wrote:
> Doing what?

Dave Thacker said:
> A large scale forecasting/optimization task in batch mode.
[snip...]
> Nope, CPU is the bottleneck, and I want to compare what today's Intel/AMD
> stuff can do versus 5 year old SPARC's

Ok, that's a start.  What is your current system (RAM, CPU, data set size,
etc.)?  A Sun 220 is vastly different from an E10K! :)

Is your app multi-threaded, or does it otherwise scale well with multiple
CPUs?  What is the working-set size of the data in question?  (Tens of
megabytes or hundreds of gigabytes?  Constantly changing, or only
daily/weekly updates?)

I'm hazarding a guess, but it sounds like your app is not so much
high-level math as it is just large datasets that get massaged.  If that
is the case, then the newer AMD Opteron or Athlon64 with their wider data
buses would help immensely.  I'm not a Sparc guy, but I believe the
average 5 year old Sparc CPU (Sparc IIIi at best?) had a FSB of
133-150MHz.  From the Sun web site regarding the Sparc IIIi, it has a peak
theoretical memory bandwidth of 4.28 GB/sec.  The AMD Athlon64 marketing
information slick I have in front of me claims a 9.6 GB/sec rate...
I can see that the newer AMD with the higher FSB and DDR could get to the
2x multiplier, but I am sure market-speak and market-fog cloud this
comparison.
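
As a quick sanity check on that "2x" figure, here is the arithmetic as a
minimal Python sketch.  Both numbers are the vendor peak figures quoted
above, not measurements, and the constant and function names are just mine
for illustration.

```python
# Peak theoretical memory bandwidth as quoted by the vendors (GB/sec).
# These are marketing numbers, not measured throughput.
SPARC_IIII_PEAK_GB_S = 4.28
ATHLON64_PEAK_GB_S = 9.6


def bandwidth_ratio(new_gb_s, old_gb_s):
    """Return how many times larger the new peak figure is than the old one."""
    return new_gb_s / old_gb_s


ratio = bandwidth_ratio(ATHLON64_PEAK_GB_S, SPARC_IIII_PEAK_GB_S)
print(f"Quoted peak bandwidth ratio: {ratio:.2f}x")
```

So on paper the claim works out to a bit over 2x, which says nothing yet
about what a real workload would see.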

With the AMD64's integrated memory controller and 128-bit data path running
at the CPU core frequency, you should be able to pump through some large
datasets if the CPU and MB are running at the highest FSB available with
RAM to match.
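
If you want a rough feel for how fast a given box can actually stream data
through RAM, something like the Python sketch below would do as a first
look.  It is only a crude stand-in for a real memory benchmark such as
STREAM: it times a bulk copy of one buffer, the function name and default
sizes are my own assumptions, and interpreter overhead and caching mean
the result will not line up with the vendor peak numbers.

```python
import time


def copy_bandwidth_gb_per_sec(size_bytes=32 * 1024 * 1024, repeats=5):
    """Estimate copy throughput by timing a bulk copy of a large buffer.

    Counts both the bytes read and the bytes written, and keeps the best
    (shortest) of several runs to reduce noise from other processes.
    """
    # Fill the source with non-zero data so the pages are really allocated.
    src = bytearray(b"\xa5" * size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one pass reading src, one pass writing dst
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    return (2 * size_bytes) / best / 1e9


if __name__ == "__main__":
    print(f"approx. copy bandwidth: {copy_bandwidth_gb_per_sec():.2f} GB/sec")
```

Running the same script on the old Sparc box and on a candidate AMD/Intel
box would at least give two numbers measured the same way, provided the
buffer is much larger than the CPU caches.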

Can your application run across multiple CPUs or systems for enhanced
"clustering" performance?  If it is scalable across multiple systems (a la
grid or clustering) then you'll probably see the largest performance boost
using that scheme.  Multiple smaller, single-CPU systems might give more
bang for the buck if the dataset can be replicated across each node with
minimal cross-communication between the systems.  Each system would have
access to its own local (i.e. private) RAM and database, so contention on
that level is avoided.
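
To make the "minimal cross-communication" idea concrete, here is a small
Python sketch of the pattern, using local worker processes as stand-ins
for cluster nodes.  Everything in it is hypothetical: forecast_slice() is a
placeholder for your real batch job, and I am assuming the work can be cut
into independent slices whose partial results are cheap to combine.

```python
from concurrent.futures import ProcessPoolExecutor


def partition(items, n_nodes):
    """Split items into n_nodes contiguous slices of nearly equal size."""
    base, extra = divmod(len(items), n_nodes)
    slices, start = [], 0
    for i in range(n_nodes):
        end = start + base + (1 if i < extra else 0)
        slices.append(items[start:end])
        start = end
    return slices


def forecast_slice(rows):
    """Hypothetical placeholder for the per-node work on one slice."""
    return sum(x * x for x in rows)


def run_partitioned(items, n_nodes):
    """Hand each worker its own slice; only small partial results come back."""
    parts = partition(items, n_nodes)
    with ProcessPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(forecast_slice, parts))
    return sum(partials)
```

Something like run_partitioned(list(range(1_000_000)), 4), called from
under an if __name__ == "__main__": guard, would exercise it.  The point is
that the workers never talk to each other, which is the property that lets
a pile of cheap single-CPU boxes scale.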

If the app or the budget doesn't allow for the cluster solution, a dual-CPU
motherboard would probably also give a speed-up if the app itself is SMP
aware.  The drawback to that solution is that the two CPUs might always
be in contention for the total bandwidth to the RAM and/or the database --
you shift the bottleneck from the CPUs to the SCSI controller.
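
One way to put rough numbers on that is Amdahl's law, which is a standard
textbook model and not something I have measured for your app.  The sketch
below assumes you can estimate what fraction of the run time truly runs in
parallel; time spent waiting on shared RAM or the disk controller behaves
like the serial part.  The function name and sample fractions are mine.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Ideal speed-up on n_cpus when only parallel_fraction of the run scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


if __name__ == "__main__":
    # Sample fractions are illustrative guesses, not data from a real app.
    for p in (0.5, 0.9, 1.0):
        print(f"parallel fraction {p:.0%}: "
              f"{amdahl_speedup(p, 2):.2f}x on 2 CPUs")
```

Even with 90% of the work running in parallel, the second CPU buys about
1.8x rather than 2x, and any contention pushes it lower.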

I am sure the group would love to hear more about this both in the
planning and in the final outcome...  Having said so much above I'd love
to have it proven correct, but then we only learn when we make mistakes...
:)

Dan

-- 
Daniel Linder