Modern Marvels: Computers ?? no graphics supers

10 Mar 2007

Randy Dawson wrote:
...
...
 Lost, sadly, was the machine between then and now, the Graphics
 Supercomputer.  In an effort to add computational speed to graphics and
 scientific visualization, two vendors went head to head on this problem,
 Ardent and Stellar.
 If you were around at the time, and saw one of these, I would love to
 hear from you.  The performance was truly spectacular.  I had a chance
 to use one for a couple of years, and it still comes pretty close to
 current GPU tech in graphics performance.  With a pipelined vector
 processor and a compiler to unroll loops it was WOW.  Today's GHz
 processors cannot beat a vector machine in computation; Titan had a
 16 MHz 1K floating point vector ALU.
Randy, no doubt you know a lot more about the Ardent/Stellar/Stardent 
stuff than I do.  I was aware of it and I once got hold of the design spec 
for the TOE processor (the 4x4 pixel "stamper").  However, I think you 
aren't aware of how sophisticated today's GPUs are.

16 MHz * 1K flops per clock = 16 GFLOPS.  A single top end GPU is more 
like 500 GFLOPS (single precision only, though).  Today's GPUs have 
myriad pixel formats, including ARGB with an FP32 for each component.  
Pixel shaders are highly programmable.  A single GPU can have > 80 GB/sec 
of bandwidth to DRAM (not cache).
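
To spell out that arithmetic, here is a minimal sketch in Python.  It 
assumes "1K" means 1024 floating point operations completed per clock, 
which is only my reading of the claim, not a confirmed Titan spec.  The 
function name is made up for illustration.

```python
def peak_flops(clock_hz, flops_per_cycle):
    """Peak throughput = clock rate times operations completed per cycle."""
    return clock_hz * flops_per_cycle

# Assumption: "16 MHz 1K floating point vector ALU" read as 1024 flops/clock.
claimed_titan = peak_flops(16e6, 1024)

# Figure quoted in this message for a single top end GPU (single precision).
gpu = 500e9

print(f"claimed Titan peak: {claimed_titan / 1e9:.3f} GFLOPS")
print(f"GPU / claimed Titan: {gpu / claimed_titan:.1f}x")
```

So even taking the 1K claim at face value, the GPU comes out roughly 
30 times ahead on peak rate.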

The TOE processor was a fixed point design with limited precision.  
There is no comparison.  I wish I still had the spec to make a more 
concrete comparison.

A Google search turned up this quote:

With the Dore' rendering package [Borden89], each processor is capable 
of rendering a maximum of 20,000 smoothly shaded small polygons/seconds.

Today's GPUs can render thousands of times more triangles per second, 
antialiased, with multiple, high quality texture maps and arbitrary 
blending.

Another Google search

	http://www.ece.cmu.edu/~ece548/handouts/17v_perf.pdf

says that the Titan 1 had a 125 ns clock period and two FPUs, for 16 
MFLOP/s peak.  Perhaps you recall 1K FPUs, but maybe it was a 1K vector 
register length.  The same PDF (written by Philip Koopman) says that 
even with four processors, and with a large (1000x1000) array size, the 
Titan 1 peaked at 15.7 MFLOP/s.  It attributes this to the fact that the 
aggregate bus bandwidth of the Titan was 256 MB/sec.  By rewriting the 
LINPACK code to block the data appropriately, they got it up to 46 MFLOP/s.
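
As a rough check on those numbers, here is a small Python sketch that 
derives the peak rate from the clock period and compares it with the GPU 
figures quoted earlier in this message.  It assumes one floating point 
result per FPU per clock, and it treats 1 GB as 1000 MB.  Both are my 
simplifications, and the function name is made up for illustration.

```python
def peak_mflops(clock_period_ns, fpus):
    """Peak MFLOP/s, assuming one result per FPU per clock (an assumption)."""
    clock_mhz = 1000.0 / clock_period_ns  # 125 ns period -> 8 MHz
    return clock_mhz * fpus

titan_peak = peak_mflops(125, 2)   # matches the 16 MFLOP/s peak in the PDF
titan_blocked_linpack = 46.0       # MFLOP/s, blocked LINPACK result

gpu_mflops = 500e3                 # 500 GFLOPS, single precision
gpu_bw_mb = 80e3                   # 80 GB/sec, taking 1 GB = 1000 MB
titan_bw_mb = 256.0                # aggregate bus bandwidth, MB/sec

print(f"Titan 1 peak: {titan_peak:.1f} MFLOP/s")
print(f"GPU vs Titan peak: {gpu_mflops / titan_peak:,.0f}x")
print(f"GPU vs blocked LINPACK: {gpu_mflops / titan_blocked_linpack:,.0f}x")
print(f"GPU vs Titan bandwidth: {gpu_bw_mb / titan_bw_mb:.1f}x")
```

This compares a peak GPU number against both a peak and a measured Titan 
number, so take the ratios as order-of-magnitude only.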

So, overall, I think there is no comparison.  The rose-colored glasses 
of time have fooled you.

...
...
  Are there any graphics guys on the list? 
Yes, from the hw end of things.
