Chuck Guzis wrote:
> Date: Sat, 26 Jan 2008 17:21:56 -0500
> From: Sean Conner <spc at conman.org>
>
>> I came across this bit of code from http://www.hackersdelight.org/
>> to divide by 10:
>
> I think that goes back to my original suggestion of effectively
> multiplying by a scaled reciprocal approximation of 0.1. The 6 in
> the last computation statement appears to be some sort of rounding
> factor.
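
As a rough illustration of the scaled-reciprocal idea (not code from this thread): multiply by a constant close to 2^k/10, then shift right by k. The constants below are my own picks for a 16-bit dividend, and the name div10_recip is hypothetical.

```python
# Sketch of division by 10 via a scaled reciprocal of 0.1.
# RECIP and SHIFT are illustrative choices, not constants from the thread:
# 0xCCCD / 2**19 is about 0.10000038, slightly above 0.1.
RECIP = 0xCCCD
SHIFT = 19


def div10_recip(x):
    """Return x // 10 for 0 <= x <= 0xFFFF using one multiply and one shift."""
    return (x * RECIP) >> SHIFT


# Exhaustive check over the whole 16-bit range.
assert all(div10_recip(x) == x // 10 for x in range(0x10000))
```

The multiply needs a 32-bit intermediate, which is the expensive part on a Z80. Shift-and-add forms trade range for a cheaper sequence of 16-bit adds.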
Well, I was trying another variation on Chuck's earlier suggestion to see
if it might do better than Dwight's fast routine. I whittled it down to
exactly the same size as Dwight's in instruction count (29) and byte
count (32), but Dwight's is still faster by 11 cycles.

Dwight's routine implements:

    q = ( 51*x + (51*x/256 + 16) ) / 512
    r = x - (q*4 + q)*2

This routine implements:

    q = ( 2*x + x + ( (2*x + x)*4 + x )*4/256 )*8 / 256

which is a variation on

    q = x/16 + x/32 + x/256 + x/512 + x/2048

with the remainder computed the same way as r above.
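
To make the arithmetic concrete, here is a small Python model of this routine's formula. It is a sketch: the 16-bit masking mimics HL wrapping around, the truncating shifts mimic taking the high byte, and the names div10_shift_add and COEFF are mine. It also checks that the series form adds up to 205/2048, a little above 1/10.

```python
from fractions import Fraction

MASK16 = 0xFFFF  # HL is a 16-bit register pair, so results wrap at 65536

# x/16 + x/32 + x/256 + x/512 + x/2048 as a single coefficient.
COEFF = sum(Fraction(1, d) for d in (16, 32, 256, 512, 2048))
assert COEFF == Fraction(205, 2048)  # about 0.100098


def div10_shift_add(x):
    """Model q = ((3x + 52x/256) * 8) / 256 and r = x - 10q in 16 bits."""
    x3 = (3 * x) & MASK16              # 2x + x
    t = ((x3 * 4 + x) * 4) & MASK16    # (3x*4 + x)*4 = 52x
    t >>= 8                            # / 256: keep only the high byte
    q = (((t + x3) * 8) & MASK16) >> 8
    r = (x - (q * 4 + q) * 2) & MASK16
    return q, r


print(div10_shift_add(1028))  # (102, 8)
```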

It fails a little earlier than Dwight's as well, giving its first wrong
result at x = 1029 rather than 1210.
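
A brute-force search is a quick way to confirm where a variant like this first goes wrong. The sketch below models only the shift-and-add formula given for this routine (Dwight's is not modelled), assuming 16-bit wraparound. It compares that against an exact multiply by 205/2048. The helper names are hypothetical.

```python
def q_shift_add(x):
    """Quotient from q = ((3x + 52x/256) * 8) / 256 with 16-bit wraparound."""
    x3 = (3 * x) & 0xFFFF
    t = ((52 * x) & 0xFFFF) >> 8
    return (((x3 + t) * 8) & 0xFFFF) >> 8


def q_exact_205(x):
    """Same coefficient, 205/2048, with no intermediate truncation."""
    return (205 * x) >> 11


def first_failure(f, limit=0x10000):
    """Return the smallest x below limit where f(x) != x // 10, else None."""
    for x in range(limit):
        if f(x) != x // 10:
            return x
    return None


print(first_failure(q_shift_add))  # 1029
print(first_failure(q_exact_205))  # 1029
```

Both searches stop at 1029, which suggests the limit comes from the coefficient 205/2048 sitting slightly above 1/10 rather than from the truncation in the intermediate steps.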

I also tried a coded version of the hackersdelight routine. It appears to
be valid over a far larger range, in part because it only does right
shifts (no scaling up).
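
For comparison, here is a Python sketch of a right-shift-only divide by 10 in the style of the Hacker's Delight routine, trimmed to a 16-bit dividend. It is a reconstruction, not the code from the site or the Z80 version mentioned above, so treat the exact steps as an assumption. The + 6 in the last line is the sort of rounding term Chuck mentions.

```python
def div10_shifts(n):
    """Divide 0 <= n <= 0xFFFF by 10 using only right shifts and adds for q."""
    q = (n >> 1) + (n >> 2)      # 0.75 n
    q += q >> 4                  # 0.796875 n
    q += q >> 8                  # about 0.79999 n, still below 0.8 n
    q >>= 3                      # about n / 10, never too large
    r = n - (((q << 2) + q) << 1)  # remainder estimate: n - 10q
    # q may be one too small, leaving r >= 10; (r + 6) >> 4 adds the missing 1.
    return q + ((r + 6) >> 4)


# Exhaustive check over the whole 16-bit range.
assert all(div10_shifts(n) == n // 10 for n in range(0x10000))
```

Because the estimate is built from n/2 + n/4 and never exceeds 0.8 n, no intermediate value overflows 16 bits, which fits the wider valid range seen here.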
----------------------------------------------------
Segment: MAIN
base=$0000 end=$001F bytes=32
machine z80

0000  54     Div10F  LD   D,H        ; de = hl = dividend
0001  5D             LD   E,L
0002  29             ADD  HL,HL      ; 2x
0003  19             ADD  HL,DE      ; + x
0004  44             LD   B,H        ; bc = 3x
0005  4D             LD   C,L
0006  29             ADD  HL,HL      ; * 4
0007  29             ADD  HL,HL
0008  19             ADD  HL,DE      ; + x
0009  29             ADD  HL,HL      ; * 4
000A  29             ADD  HL,HL
000B  6C             LD   L,H        ; / 256
000C  26 00          LD   H,0
000E  09             ADD  HL,BC      ; + 3x
000F  29             ADD  HL,HL      ; * 8
0010  29             ADD  HL,HL
0011  29             ADD  HL,HL
0012  6C             LD   L,H        ; / 256
0013  26 00          LD   H,0
0015  7D             LD   A,L        ; a = quotient
0016  44             LD   B,H
0017  4F             LD   C,A        ; bc = a
0018  29             ADD  HL,HL      ; rem = x - 10q
0019  29             ADD  HL,HL
001A  09             ADD  HL,BC
001B  29             ADD  HL,HL
001C  EB             EX   DE,HL
001D  ED 52          SBC  HL,DE      ; l = remainder
001F  C9             RET
----------------------------------------------------
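
As a cross-check on the size figures, the listing can be tallied by instruction form. This is a sketch: the T-state values are the standard documented Z80 timings with no wait states, the per-form counts are read off the listing above, and the cycle total is my own tally rather than a figure from the thread.

```python
# (T-states, bytes) for each instruction form used in the listing.
COST = {
    "LD r,r'":   (4, 1),
    "LD r,n":    (7, 2),
    "ADD HL,ss": (11, 1),
    "EX DE,HL":  (4, 1),
    "SBC HL,ss": (15, 2),
    "RET":       (10, 1),
}

# How many times each form appears in Div10F.
COUNT = {
    "LD r,r'": 9,
    "LD r,n": 2,
    "ADD HL,ss": 15,
    "EX DE,HL": 1,
    "SBC HL,ss": 1,
    "RET": 1,
}

instructions = sum(COUNT.values())
size = sum(n * COST[form][1] for form, n in COUNT.items())
tstates = sum(n * COST[form][0] for form, n in COUNT.items())

print(instructions, size, tstates)  # 29 32 244
```

The 29 instructions and 32 bytes agree with the figures quoted earlier. The 244 T-states include the final RET.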