Chris Rodie wrote:
Tom,
Assembly source tends to contain directives that are
assembler-specific. The source could also be invoking macros that are
contained in a specific assembler's libraries. Are there no comments
at the top of the sources that say "assemble with ..."? Unless you
are able and willing to modify such assembler-specific references in
the source for other tools, you need to use the tools that John Wilson
used. So if there are no clues in the sources, ask John.
In general, good (and free) tools for assembling and linking x86 code
for DOS or Windows include:
NASM assembler
http://sourceforge.net/projects/nasm
MINGW compiler (for linking Windows dlls and exes)
http://www.mingw.org/
DJGPP (for linking DOS protected mode programs)
http://www.delorie.com/djgpp/
Regards,
C. Rodie
Tom Peters wrote:
At 03:52 PM 11/27/2007 -0500, you wrote:
Can anyone help me with a simple link to an assembler / linker for a
Pentium III / Pentium 4?
I am attempting to use the EMEM.DLL (Emulated PDP-11 Memory)
under Ersatz-11, but I am basically a dummy when it comes to using
x86 code, especially finding a pair of suitable assembler / linker
programs.
I downloaded the Watcom programs, wasm.exe and wlink.exe, but I am
not able to assemble the original file, EMEM.ASM, at this point.
In case anyone is interested, the EMEM.DLL under E11 provides access
to PC RAM via emulated PDP-11 hardware registers. The current version
which I have been using allows up to 8 MBytes, but I want to increase
that to about 600 MBytes so that I can write a faster sieve program
for Prime Numbers which looks like it runs in a PDP-11. If I can get
the test version to run fast enough on a Pentium III, I will try it on
a Pentium 4 with 4 GBytes of memory and see if it is possible to sieve
the primes up to 10**18 (essentially a 64 bit sieve program) in a
reasonable time (i.e. less than 1 year!).
http://www.grc.com/smgassembly.htm
Contains some asm resources as well as a list of 8 or 9 other links
at the bottom of the page.
Jerome Fine replies:
I have contacted John and he mentioned TASM (from Borland).
At one point, I probably had that assembler on an AT, but
that was so long ago that the AT has since died, and the
assembler probably did not support 386 instructions in any case.
As far as I understand the code for EMEM.ASM, it is all very
simple 386 type instructions without any macros. There are
only 271 lines of code in total with over 100 lines of comments
and less than 140 lines of actual code - the rest being directives
and subroutine entry points. It is a relatively small program when
I consider that it can make more than a GigaByte of Pentium memory
available, addressed by either byte or 16-bit word.
Since the EMEM.ASM file is only 7,148 bytes and the EMEM.DLL
file is only 1060 bytes, I doubt that John would consider
the EMEM concept anything other than a shell for how to
write a DLL for E11. But I don't wish to make the file
available in a public manner.
What I should be able to do is to take advantage of the
considerable CPU power of the Pentium 4 while still
writing most of the code for the PDP-11. I also understand
that the program can never execute on a real DEC PDP-11,
mostly because of how long it would take to do anything
useful if I wrote PDP-11 code for the inner loops that will
be done at very high speed on the Pentium 4.
What may be able to really speed things up is if I am able to
effectively use the 2 MegaByte L2 cache on the newer Pentium 4
CPUs with a 1066 front side bus (whatever that means). While
the work space in EMEM memory will likely be a bit map of
about 404 MegaBytes, the concept I hope to use may be
able to split that into portions that fit into the cache.
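As a rough check on the splitting idea, the number of portions can be
worked out in a few lines. This is only a sketch in Python (chosen for
illustration, not the language of EMEM.ASM). It assumes 1 MegaByte =
2**20 bytes, and the function name portions_needed is hypothetical.
The 1 MegaByte portion size is also an assumption, picked to leave
half of a 2 MegaByte cache free for everything else.

```python
MB = 1024 * 1024  # assumption: 1 MegaByte = 2**20 bytes


def portions_needed(bitmap_bytes, portion_bytes):
    """Number of cache-sized portions needed to cover the bit map.

    Uses ceiling division so that a final, partly filled portion
    is still counted.
    """
    return -(-bitmap_bytes // portion_bytes)


if __name__ == "__main__":
    bitmap = 404 * MB  # bit map size mentioned in the message
    # Whole 2 MB cache per portion, or half of it (assumed choices).
    print(portions_needed(bitmap, 2 * MB))  # 202
    print(portions_needed(bitmap, 1 * MB))  # 404
```

So each pass over the whole bit map would be replaced by a few hundred
passes over cache-sized pieces.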
What I don't know is whether the overhead of repeating each
inner loop will be worthwhile. There are about 50 million
inner loops. Each inner loop must be executed between
about 10 times and about 10 million times. (Maybe now you
understand why I estimate a year to execute!) However,
when the inner loop is executed fewer than 1000 times (over
VERY widely scattered points in that 404 MegaByte bit map, or
only about twice per portion which fits into the cache), it
seems doubtful there would be sufficient improvement. That
seems to leave a small fraction of the inner loops, only
about 100,000 of them, which end up using most of the total
time in any case. That is why splitting the work space into
portions which temporarily fit into the cache may be so very
useful.
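The idea of working through the sieve one cache-sized portion at a
time is usually called a segmented sieve. The following is a minimal
sketch of that general technique in Python, not the method used in
EMEM.ASM or in the program described here. The function names and the
default segment_size are hypothetical, and it uses one byte per flag
rather than a true bit map to keep the code short. Python will not
show the cache benefit itself; the point is only the loop structure,
where every small prime is applied to one segment before moving on to
the next.

```python
from math import isqrt


def small_primes(limit):
    """Plain sieve of Eratosthenes: list of all primes <= limit."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if flags[p]:
            # Strike p*p, p*p + p, ... in a single slice assignment.
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, f in enumerate(flags) if f]


def count_primes_segmented(n, segment_size=32768):
    """Count primes <= n, sieving one fixed-size segment at a time.

    segment_size is an assumed value; in a compiled or assembly
    version it would be chosen so one segment fits in the cache.
    """
    if n < 2:
        return 0
    base = small_primes(isqrt(n))  # primes needed to sieve up to n
    count = 0
    for low in range(2, n + 1, segment_size):
        high = min(low + segment_size - 1, n)
        seg = bytearray([1]) * (high - low + 1)  # flags for low..high
        for p in base:
            if p * p > high:
                break  # larger primes cannot strike in this segment
            # First multiple of p inside the segment, never below p*p,
            # so p itself is not struck.
            start = max(p * p, ((low + p - 1) // p) * p)
            seg[start - low :: p] = bytearray(len(range(start, high + 1, p)))
        count += sum(seg)
    return count


if __name__ == "__main__":
    print(count_primes_segmented(1_000_000))
```

The cost of this arrangement is the overhead asked about above: every
base prime has its starting point recomputed once per segment, which
only pays off for primes that strike many times within one segment.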
Can anyone comment on how much of a speed improvement there
might be by modifying the code so that data is accessed via
cache rather than from memory?
I will locate the assemblers that you suggest and try them!
THANK YOU!!!
Sincerely yours,
Jerome Fine
--
If you attempted to send a reply and the original e-mail
address has been discontinued due to a high volume of junk
e-mail, then the semi-permanent e-mail address can be
obtained by replacing the four characters preceding the
'at' with the four digits of the current year.