Cell Processor Powerful
It looks like the Cell processor is quite a powerful one, and it could power the PS3 to the top. IBM’s Alex Chow gave a presentation about the Cell at the Power.org conference in Barcelona.
For the large FFT test, an algorithm designed by David H. Bailey in 1986 for communication between supercomputer banks was used as the starting point, since the model fit especially well with what the IBM team was trying to accomplish. Tasked with working through a 16.7 million (2^24) element data set filling 128 MB of memory, the program designed by Dr. Chow and his team worked by breaking the data set down into ‘mega-blocks,’ allowing the SPEs to process 2^13 (256×32) points per iteration. The end result was that, for the large FFT test, the Cell working with its 8 SPEs was able to achieve a rating of ~42 GFLOPS to the Xeon’s half a GFLOPS. Dr. Chow went on to state that in the proper environments, Cell should be able to approach its theoretical performance limit.
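To make the idea of splitting one huge FFT into many small ones concrete, here is a minimal Python sketch of the "four-step" decomposition usually associated with Bailey's work. Whether this is the exact variant the IBM team started from is an assumption, and the function names (`dft`, `four_step_fft`) and the tiny block sizes are made up for illustration; this is not IBM's code. The point is that steps 1 and 4 consist of many independent small transforms, which is the kind of work that could be handed out to separate SPEs.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT, used only as a small building block."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def four_step_fft(x, n1, n2):
    """DFT of length n1*n2 built from small transforms of length n1 and n2."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")
    # Step 1: view x as an n1 x n2 matrix (row-major) and run one
    # length-n1 DFT per column. The n2 transforms are independent.
    cols = [dft([x[j1 * n2 + j2] for j1 in range(n1)]) for j2 in range(n2)]
    # Step 2: multiply entry (k1, j2) by the twiddle factor w^(j2*k1),
    # where w = exp(-2*pi*i / n).
    for j2 in range(n2):
        for k1 in range(n1):
            cols[j2][k1] *= cmath.exp(-2j * cmath.pi * j2 * k1 / n)
    # Step 3: transpose, so each row k1 holds its values over j2.
    rows = [[cols[j2][k1] for j2 in range(n2)] for k1 in range(n1)]
    # Step 4: one length-n2 DFT per row. The n1 transforms are independent.
    rows = [dft(r) for r in rows]
    # Entry (k1, k2) of the result is output element k1 + n1*k2.
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[k1 + n1 * k2] = rows[k1][k2]
    return out
```

As a sanity check on the quoted figures: if each element is a single-precision complex number (8 bytes, an assumption), 2^24 elements occupy 2^27 bytes, which is 128 MB, and 2^24 points handled 2^13 at a time comes to 2^11 = 2048 blocks.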
Now, this is for highly optimized code that suits the Cell well, but the numbers are still impressive. How easy will it be to program the Cell, though?
Lastly, a member of the audience asked how difficult Chow felt it would be to program for Cell, something that has been discussed extensively around the Internet, especially as it applies to Sony’s upcoming PlayStation 3. The answer Dr. Chow gave was in the form of an analogy, stating that when programming for Cell, it might be best to approach it in the same fashion as one might approach dividing a workload between nine separate workstations, with each of the workstations in the analogy representing an SPE (plus the PPE). What he was trying to emphasize was that inasmuch as a task could be threaded or separated to run on different CPUs, so too could it be divided to run on the SPEs. The audience member felt his question was misunderstood though, and revised it to ask specifically what tools IBM might be able to provide to programmers interested in programming for Cell but feeling daunted by the task. During the course of Chow’s response he revealed that a compiler is currently in the prototyping stages at IBM that would dynamically chop up code and thread it such that it would automatically make use of the SPEs. One has to wonder just how efficient such a compiler can be, but it will come as a much-welcomed tool for developers should the project come to fruition.
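The workstation analogy can be sketched in ordinary Python: one coordinator (playing the PPE's role) cuts the data into eight chunks, hands each to a worker (standing in for an SPE), and combines the partial results. Everything here is a hypothetical illustration, including the names `split`, `worker` and `run` and the sum-of-squares kernel. Python threads are nothing like real SPEs with their own local memory, so this shows only the shape of the division of labour, not how Cell code is actually written.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 8  # stands in for the eight SPEs; the caller plays the PPE

def split(data, parts):
    """Split data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def worker(chunk):
    """Hypothetical per-worker kernel: sum of squares of one chunk."""
    return sum(v * v for v in chunk)

def run(data):
    """Coordinator: divide the work, dispatch it, combine the partial results."""
    chunks = split(data, NUM_WORKERS)
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        partials = list(pool.map(worker, chunks))
    return sum(partials)
```

The part done by hand here, deciding how to cut up the work and how to stitch the results back together, is the step the prototype compiler Chow mentioned would presumably try to automate.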