I'm not nearly so concerned about the implementation details. Frankly, the
FPGA vendors are all heading off in the wrong direction for implementation
of those "old" processors and their peripherals. They give you 10 times the
I/O's you need and only half the routing resources. I'd much rather look at
a plcc44 housing a 2500 CLB FPGA rather than a 500-pin FPGA housing what
they claim is a 400K-gate equivalent. What's more, I'd rather see a 2
million gate "sea" of gates than a few dozen CLB's or macrocells, provided
there were yards and yards of interconnection resources. That's not where
they're headed. They want you to buy 32 ram bits with which to build a
single nand gate.
The PLD vendors aren't any better . . . their devices have always had too
many inputs and not nearly enough buried resources for my taste. If I have
to "do something" to a couple of inputs based on what a couple more do, then
they work OK, but if I have to do a bunch of well-defined things based on
what one input does, and generate one output based on a complex sequence of
processes, always the same, however, then I have no choice other than the
Scenix SX, which is a microcontroller. PALs and PLDs have never had the
right input/output pin ratio, nor have they often had sufficient internally
buried registers. Crying about it won't fix it, though.
Dick
-----Original Message-----
From: Hans B Pufal <hansp(a)digiweb.com>
To: Discussion re-collecting of classic computers
<classiccmp(a)u.washington.edu>
Date: Saturday, August 28, 1999 12:13 PM
Subject: Re: FPGAs and PDP-11's
>Richard Erlacher wrote:
>
>> I've taken a good hard look at implementing the 6500 core in XILINX and
>> find that performance, which is VERY much of interest, is impacted most by
>> ALU design.
>
>No-one has mentioned the free IP project at <http://www.free-ip.com/>
>which has a VHDL implementation of a 6502 now available. No idea of
>performance on this, I have just begun to dabble in this area.
>
>I too bemoan the fact that the full configuration specs are not available
>for the FPGA's.
>A few years ago I was working for a company that had a Xilinx part
>monitoring a processor bus. We wanted to dynamically reconfigure the
>FPGA so that we could change the bus pattern it triggered on - no joy
>though getting the necessary info.
>
>I see implementing old processors in FPGA's as a way of preserving
>the design of those processors. Yes, we would all prefer to have an
>original, but practically speaking that is not possible.
>
>For some uses, a modern re-implementation or an emulator is better than
>nothing at all.
>
>Regards
>
>_---_--__-_-_----__-_----_-__-__-_-___--_-__--___-__----__--_--__-___-
>Hans B Pufal Comprehensive Computer Catalogue
><mailto:hansp@digiweb.com> <http://digiweb.com/~hansp/ccc>
please see embedded comments below.
Dick
-----Original Message-----
From: Clint Wolff (VAX collector) <vaxman(a)oldy.crwolff.com>
To: Discussion re-collecting of classic computers
<classiccmp(a)u.washington.edu>
Date: Saturday, August 28, 1999 3:00 PM
Subject: Re: PDP era and a question
>
>
>
>On Sat, 28 Aug 1999, Richard Erlacher wrote:
>
>> please see my embedded comments below.
>>
>> Dick
>>
<snip>
>I saw a blurb about that several years ago in one of the trade rags.
>Basically, the part was sector based (not their name for it). You could
>reload a portion of the FPGA while the rest continued to operate. The
>example that was given was loading different image processing algorithms
>into the chip while the rest of the chip continued to pull in and output
>the video stream.
>
I've seen writing about this, but not authoritative writing. I don't
consider marketing departments capable of authoritative writing, by the way.
>
>> One thing I find shameful about the FPGA makers is that they have all this
>> secrecy about one aspect or another of THEIR intellectual property as
>> pertains to their parts, yet they do absolutely nothing to protect YOUR IP
>> as it sits in a completely visible medium. If they would at least provide a
>> feature to allow you to flash in a persistent encryption circuit not
>> detectable from the outside but permanently associated with a given design
>> . . .
>
>Publishing what each bit in the bitstream did would get your competitor
>half way to having a schematic of your design.
>
>clint
>
That's not as much a problem as allowing him to dupe your board and the
contents of your configuration EEPROM. (Plenty of PC market boards have just
the one major ASIC and a large, price-sensitive market which a $1 lower price
will take over.) He can then buy the same part from XILINX or whoever
supplies your parts, build the boards down the very street in TAIPEI from
where yours are made, and sell your work to the public, documentation and
all, leaving you with a market saturated with counterfeits of your product
and a HUGE support burden to pay for with your non-profits.
>
Dick
Well, the 650x is a VERY thrifty architecture. It has no memory-to-memory
operations, nor does it have any operations involving more than one register
at a time. Additionally, if one chooses to implement it in the way the
original manufacturers did, the ALU serves not only to execute the
instruction set but also to operate on the PC and SP. This
saves LOTS of resources in the construction of the associated counter chains.
That's not to say it's easy to implement this architecture in an efficient
way, though.
You have to look at another aspect of FPGA's however, and that's the
combined effect of routing and resource utilization. The ALTERA folks may
claim to have implemented this architecture in only 7% of the resources of
the part, but at what cost? In general, a substantial portion of the
resources available in a device, in terms, for example, of raw gate count,
is lost in the implementation of a design. In each logic cell or logic
block, there are resources which the marketing department proudly counts and
advertises, yet which, once a part of the logic cell is used, are gone
forever and unusable. The routing is another factor which plays a big role
in the way FPGA's work out. Allocating a given routing resource in a
certain way can effectively render other logic resources unusable because of
lack of interconnection resources with which to do that. Consequently,
routing in a manner essential to a given level of performance for some of
the device resources can render other resources unreachable for any
practical purpose.
The marketing guys don't consider this when publishing their full-color
glossy brochures, though. If they go to work, they'll say, well, this
nand gate is only 6% of a CLB, even though the entire CLB is used up, say,
and that pipeline register used to synchronize these functions is only 12% .
. . when in reality as much as 50% of the array may be consumed by such a
design, and the remaining "half" may be very difficult to utilize beyond
15%.
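
The gap between the advertised figure and what the array really gives up can be put in numbers. What follows is only a rough sketch: the function name `utilization`, the 4-cell array, and the assumption that each small function burns a whole logic cell are all invented for illustration; only the 6% and 12% figures come from the paragraph above.

```python
import math

def utilization(cell_fractions, total_cells):
    """Return (advertised, actual) utilization as fractions of the array.

    cell_fractions: for each placed function, the share of one logic cell
    that the function nominally needs (0.06 means 6% of a cell).
    Assumption: each function lands in its own cell and the remainder of
    that cell cannot be reused.
    """
    advertised = sum(cell_fractions) / total_cells
    actual = sum(math.ceil(f) for f in cell_fractions) / total_cells
    return advertised, actual

# Hypothetical 4-cell array holding a NAND gate (6% of a cell) and a
# pipeline register (12% of a cell).
advertised, actual = utilization([0.06, 0.12], total_cells=4)
print(f"advertised {advertised:.1%}, actually consumed {actual:.1%}")
```

Real parts can sometimes pack unrelated logic into one cell, so the "actual" figure here is a worst case under the stated assumption, not a measurement.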
I've taken a good hard look at implementing the 6500 core in XILINX and find
that performance, which is VERY much of interest, is impacted most by ALU
design. Now, the Virtex CLB allows a single CLB to function as a two-bit
full-adder. If one wants the best performance/resource allocation tradeoff,
I'm nearly convinced that the best way might be to design it with a 2-bit
ALU slice because the resource consumption is small yet a
2-bit registered implementation of an 8-bit ALU would be just as fast as an
8-bit implementation because of the carry delay from stage to stage. It
appears to me that the rate-determining step, then, becomes how fast a clock
can be routed through the array. In the case of the 2-bit slice, it doesn't
have to propagate very far to get the job done. With an 8-bit
implementation, there's a lot more routing delay, and at least four times as
much delay per cycle in order to allow the carry to settle. Since the ALU
is used more than once per machine cycle . . . (see where all this leads?)
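
One way to picture the 2-bit slice idea is a behavioral model in which a single 2-bit adder is reused on four successive clocks, with the carry held in a register between steps. This is a sketch of one reading of the scheme, not a description of any actual Virtex design; the function name `add8_with_2bit_slice` is hypothetical, and the model says nothing about real timing.

```python
def add8_with_2bit_slice(a, b, carry_in=0):
    """Add two 8-bit values with one 2-bit adder used over four steps.

    Each loop iteration stands for one clock: the slice adds one pair of
    bits from each operand plus the registered carry from the prior step.
    Returns (8-bit result, carry out).
    """
    result = 0
    carry = carry_in
    for step in range(4):               # low-order pair first
        a2 = (a >> (2 * step)) & 0b11   # two bits of operand a
        b2 = (b >> (2 * step)) & 0b11   # two bits of operand b
        total = a2 + b2 + carry         # at most 3 + 3 + 1 = 7
        result |= (total & 0b11) << (2 * step)
        carry = total >> 2              # carry register for the next step
    return result, carry

print(add8_with_2bit_slice(0x12, 0x34))
print(add8_with_2bit_slice(0xFF, 0x01))
```

The carry only ever has to ripple across two bits within a step, which is the property the argument above leans on.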
Dick
-----Original Message-----
From: Alex Knight <aknight(a)mindspring.com>
To: Discussion re-collecting of classic computers
<classiccmp(a)u.washington.edu>
Date: Friday, August 27, 1999 9:58 AM
Subject: Re: FPGAs and PDP-11's
>Hi,
>
>Another data point w.r.t. implementing microprocessors in FPGAs
>involves the 6502: When Altera was initially rolling out their 10K
>family of FPGAs, one of their marketing charts shows how they
>built a 6502 processor inside a 10K50 device using only 7% of
>the FPGA resources.
>
>Regards,
>Alex Knight
>Calculator History & Technology Web Page
>http://aknight.home.mindspring.com/calc.htm
>
>At 06:05 PM 8/26/99 -0700, Chuck wrote:
>
>>I did a preliminary "floor plan" for the PDP-8 and it used just under 1/3
>>of the 4010 (or 75% of a 4005 given the routing issues, which leaves enough
>>to do an M8660 serial port.)
>>
>>--Chuck
>>
>>
FOR SALE:
Approx. 18 pounds of software/manuals, consisting of two packages; all sells
for one money. Best offer over $24 takes. Deadline for offers is
September 11, 1999.
* Retix Open Server 400 for UNIX MH-4410 ISC/SCO, Ver. 1.41
* Retix SMTP Gateway to X.400, Ver. 2.01
BOTH are provided on dual-format hi-density floppies. Software disk packages
have been opened, but appear to show little usage.
Sys. Requirements:
------------------
* SCO UNIX Sys. V/386, Rel. 3.2, Ver. 2.0 and 4.0
* Interactive UNIX Sys. V/386, Rel. 3.2, Ver. 2.2 and 3.0
and 386 cpu, 4 or 8 Mb RAM, 100 Mb disk space, including OS; hi-density floppy
Ships from Laurel, Maryland 20707
USA only, please.
=============================================================
please see comments embedded below.
Dick
-----Original Message-----
From: Pete Turnbull <pete(a)dunnington.u-net.com>
To: Discussion re-collecting of classic computers
<classiccmp(a)u.washington.edu>
Date: Friday, August 27, 1999 4:41 PM
Subject: Re: FPGAs and PDP-11's
>On Aug 27, 20:46, Hans Franke wrote:
>> Subject: Re: FPGAs and PDP-11's
>> > Well, the 650x is a VERY thrifty architecture. It has no memory-to-memory
>> > operations, nor does it have any operations involving more than one
>> > register at a time.
>>
>> TXA ? (Don't kill me :)
>
>And the indexed instructions such as ADC (nn,X), of course, and TSX, etc.
>
This is a case like the TXA, etc, which is a simple transfer from one
register to another with no ALU operation.
>
>> > much delay per cycle in order to allow the carry to settle. Since the ALU
>> > is used more than once per machine cycle . . . (see where all this leads?)
>>
>> More than once ?
>> Maybe I'm just blind, but I can't see more than one ALU op per cycle.
>
Well, on each cycle it flows the PCL through the ALU, adding zero with
carry. The indexing operations and stack pointer op's also do arithmetic on
the ABL and SP. Likewise, the INC and DEC instructions flow data from the
register block to the register block through the ALU. Still, there are no
register operations which require access to more than one register's
contents at a time. The critical issue being that the registers can simply
be implemented in a RAM. In fact, it appears that the RAM block might best
be implemented in an inverting RAM like the 74189 (actually a 16x4, but two
would work) because the arithmetic unit might work quite well as a simple
adder/subtractor, with a multiplexer as the shifter unit. The fact that
this RAM has separate inputs and outputs makes the TTL model very simple.
>
>Some of the indexed instructions do. Once to add the offset, and once for
>the operation requested, eg ADC (nn),Y.
>
The indexing operations involve arithmetic on memory address operands rather
than on register contents. The instruction contains the absolute address or
a pointer to it, and an index register contains an offset. Arithmetic is
done on the address components and only on one element in the register set.
Either one or two address bytes are part of the instruction, depending on
the mode, and the index register contains the offset to be added to the low
address byte either from the instruction or from the table to which a zero
page pointer directs it and 16-bit arithmetic is done on that using only one
byte from the register set. These indexed instructions using indirection
take as many as 6 (7 if a page boundary is crossed) cycles. The arithmetic
can always be done using the ALU, however.
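
The address arithmetic described above for a mode like ADC (nn),Y can be sketched with nothing wider than an 8-bit add. This is an illustrative model only: the names `alu_add` and `indirect_indexed_address` are made up here, memory is a plain dictionary, and the zero-page wrap on the pointer's high-byte fetch is included as an assumption about the part's behavior.

```python
def alu_add(a, b, carry_in):
    """8-bit add; returns (result, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def indirect_indexed_address(memory, zp_pointer, y):
    """Effective address for a (nn),Y style access.

    The instruction supplies a zero-page pointer; the table entry it
    points at holds the base address. Only one register byte (Y) takes
    part: it is added to the low address byte, and any carry is then
    added to the high byte in a second pass.
    Returns (16-bit address, page_crossed).
    """
    base_lo = memory[zp_pointer]
    base_hi = memory[(zp_pointer + 1) & 0xFF]   # assumed to wrap in page zero
    lo, carry = alu_add(base_lo, y, 0)          # first ALU pass
    hi, _ = alu_add(base_hi, 0, carry)          # second pass only matters on carry
    return (hi << 8) | lo, bool(carry)

memory = {0x10: 0xF0, 0x11: 0x20}               # pointer at $10 holds $20F0
print(indirect_indexed_address(memory, 0x10, 0x05))
print(indirect_indexed_address(memory, 0x10, 0x20))
```

The second call carries out of the low byte, which is the page-crossing case that costs the extra cycle.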
>--
>
>Pete Peter Turnbull
> Dept. of Computer Science
> University of York
On Aug 27, 20:46, Hans Franke wrote:
> Subject: Re: FPGAs and PDP-11's
> > Well, the 650x is a VERY thrifty architecture. It has no memory-to-memory
> > operations, nor does it have any operations involving more than one
> > register at a time.
>
> TXA ? (Don't kill me :)
And the indexed instructions such as ADC (nn,X), of course, and TSX, etc.
> > much delay per cycle in order to allow the carry to settle. Since the ALU
> > is used more than once per machine cycle . . . (see where all this leads?)
>
> More than once ?
> Maybe I'm just blind, but I can't see more than one ALU op per cycle.
Some of the indexed instructions do. Once to add the offset, and once for
the operation requested, eg ADC (nn),Y.
--
Pete Peter Turnbull
Dept. of Computer Science
University of York
You're quite right, but I actually meant that there aren't any instructions
which operate on more than one register at a time using the ALU with more
than one register for inputs. If you consider the instructions which do use
the ALU, you can see that a single register set, implemented as a RAM block
would allow you to transfer from the register RAM outputs through the ALU
and back into the registers in a single operation. That's what makes this
architecture so thrifty, as it means that you can send the PCL through the
ALU, adding a zero with carry set, and back to PCL, setting a carry flag if
that's applicable and if carry's true, then adding zero with carry to PCH
again storing the result in the source register.
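
The PC update just described can be modelled as a register RAM with one read path into the ALU and one write path back. This is a sketch under assumptions, not the original chip's logic: the location numbering, `alu_add`, and `increment_pc` are all hypothetical names chosen for the example.

```python
# Register file modelled as a small RAM: one location per register.
A, X, Y, SP, PCL, PCH = range(6)

def alu_add(a, b, carry_in):
    """8-bit add; returns (result, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def increment_pc(regs):
    """Advance the 16-bit PC using only 8-bit passes through the ALU.

    Each pass reads one register location, adds zero plus a carry, and
    writes the result back to the same location. Returns the number of
    ALU passes used.
    """
    regs[PCL], carry = alu_add(regs[PCL], 0, 1)
    passes = 1
    if carry:  # low byte wrapped, so the high byte needs its own pass
        regs[PCH], _ = alu_add(regs[PCH], 0, carry)
        passes += 1
    return passes

regs = [0] * 6
regs[PCL], regs[PCH] = 0xFF, 0x12
passes = increment_pc(regs)
print(hex(regs[PCH] << 8 | regs[PCL]), passes)
```

No pass ever needs two register locations as operands at once, which is what lets a single-port RAM stand in for the register set.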
In reality there are several operations which use the register set as both
source and destination, but none which use TWO registers as operands and
then use the registers as a destination as well. What that allows is that
you use a ram location as PCH, one as PCL, one as SP, and one as each of the
registers, X, Y, and A. Because of the way the thing works, the logic paths
are simple and straightforward to steer via a single data bus from the ALU
back to the register inputs. That explains why an extra cycle is
needed whenever addressing crosses a page boundary.
If you constrain your thinking to the logic components which were available
back in the mid '70's, e.g. 74181, 74189 (for the register set), and
consider what was on the data bus when a "float" was encountered during a
read, namely the PCH, you begin to see the rudiments of this processor's
internal architecture. Moreover, if you think of the "pipelining" used by
the 650x in terms, not of synchronous pipelining as commonly used today, but
of pipelining the control structure so that the data flow could be managed
not with edge-triggered flip-flops but with gated latches, a la 7475, then
you see how the timing was developed.
The ALU was always a path for data from the registers to the registers'
input bus. The data bus output latch was, of course taking inputs from this
as well, and the output data, coincidentally followed the rising edge of the
phase-2 clock by about the same amount of time as the valid addresses
followed the falling edge. Since register-to-register operations had to
flow through the ALU, and since the registers had a common input path, only
one register could be targeted at a time. Since the register set is a RAM,
you couldn't do it any other way. If separate registers had been used, the
number of multiplexers would have made the chip much larger.
The operations on the accumulator which required either immediate data or
data from memory were served by an impending operand register which was
loaded from the last memory fetch prior to the execution of the operation.
This action took a cycle, but didn't involve the data bus, so the
processor could fetch the next opcode at the same time, knowing that the
impending operand register was not involved in that fetch: the one
register which would be unaffected by an opcode fetch was the IOR.
Dick
-----Original Message-----
From: Hans Franke <Hans.Franke(a)mch20.sbs.de>
To: Discussion re-collecting of classic computers
<classiccmp(a)u.washington.edu>
Date: Friday, August 27, 1999 12:45 PM
Subject: Re: FPGAs and PDP-11's
>> Well, the 650x is a VERY thrifty architecture. It has no memory-to-memory
>> operations, nor does it have any operations involving more than one
>> register at a time.
>
>TXA ? (Don't kill me :)
>
>[...using 'only' one ALU...]
>
>Not uncommon back then and very efficient. I still believe the 65xx
>is one of the best - the instruction set is well defined to get
>the maximum out of a minimal hardware. You can see the function
>blocks click just by looking at the instructions.
>
>> [... about resources]
>
>Exactly, that's the main problem with most %used numbers.
>
>
>> I've taken a good hard look at implementing the 6500 core in XILINX and
>> find that performance, which is VERY much of interest, is impacted most by
>> ALU design. Now, the Virtex CLB allows a single CLB to function as a
>> two-bit full-adder. If one wants the best performance/resource allocation
>> tradeoff, I'm nearly convinced that the best way might be to design it with
>> a 2-bit ALU slice because the resource consumption is small yet the delay
>> for a 2-bit registered implementation of an 8-bit ALU would be just as fast
>> as an 8-bit implementation because of the carry delay from stage to stage.
>> It appears to me that the rate-determining step, then, becomes how fast a
>> clock can be routed through the array. In the case of the 2-bit slice, it
>> doesn't have to propagate very far to get the job done.
>
>Well, after all, any serious attempt to bring a 6502 into a FPGA
>will be about speed - and saving resources might not be the
>primary goal.
>
>> With an 8-bit implementation, there's a lot more routing delay, and at
>> least four times as much delay per cycle in order to allow the carry to
>> settle. Since the ALU is used more than once per machine cycle . . . (see
>> where all this leads?)
>
>More than once ?
>Maybe I'm just blind, but I can't see more than one ALU op per cycle.
>
>Gruss
>H.
>
>--
>Stimm gegen SPAM: http://www.politik-digital.de/spam/de/
>Vote against SPAM: http://www.politik-digital.de/spam/en/
>Votez contre le SPAM: http://www.politik-digital.de/spam/fr/
>Ich denke, also bin ich, also gut
>HRK
My serial number project is going slowly. I only received two responses (thanks, Charlie and Joe) and so my sample set has a woeful three data points: 13352 (Charlie's), 13513 (mine, sold 1/77) and 14213 (Joe's, sold 9/77). Perhaps a rate of about 1000 units per year in 1977, but too little info to tell. If this rate is correct, I would suspect that the numbering began around 12000 or so.
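
For anyone who wants to check the arithmetic behind that estimate, here is a small sketch using only the two dated serial numbers above. The helper names `units_per_year` and `months_back_to` are invented for the example, the 12000 starting serial is just the guess from the paragraph, and sale dates are treated as if they tracked build dates, which they may not.

```python
def units_per_year(serial_a, month_a, serial_b, month_b):
    """Average yearly rate between two (serial, sale month) points.

    Months are counted within 1977 (January = 1).
    """
    return (serial_b - serial_a) * 12 / (month_b - month_a)

def months_back_to(serial_start, serial_known, rate_per_year):
    """Months before the known unit that serial_start would have shipped,
    assuming the same constant rate."""
    return (serial_known - serial_start) * 12 / rate_per_year

rate = units_per_year(13513, 1, 14213, 9)    # sold 1/77 and 9/77
lead = months_back_to(12000, 13513, rate)    # 12000 is only a guess
print(rate, round(lead, 1))
```

Two points fix a straight line and nothing more, so every additional dated serial number would tighten this considerably.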
C'mon everybody, have a look at the back of your 5100 (the number is engraved into the back of the case, usually preceded by a "10-") and keep those numbers coming.
Thanks.