dwight elvey wrote:
Hi
Pete Turnbull sent some suggestions, and here is what we have
now. It trades a little speed for size, but the inner loop is faster.
As I stated, faster is good but smaller is better. Any more help
would be great.
Next try at code:
div10:
xor a
ld de,#-640d ; largest power of 2 times 10 less than 800d
ld b,#7 ; only seven loops needed to finish divide 800/10
divloop:
call div10s ; divide steps
djnz divloop
add hl,hl ; one more shift leaves the full remainder in H (L = 0)
ret
div10s:
add a,a ; 2*a
add hl,de ; trial subtract
jr c,div10s1 ; carry means trial passed
sbc hl,de ; undo previous add, carry is clear
dec a ; to nullify the following inc
div10s1:
inc a ; add to quotient
add hl,hl ; shift number to reuse same constant, 640 decimal
ret
This is 23 bytes and looks good.
(I expect you may have noticed this already.) The subroutine (div10s) is
only called from one place now, so you could inline it to save another
4 bytes (the 3-byte call and the 1-byte ret).
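
Something like the following is what I have in mind. It is an untested
sketch that keeps the labels and assembler syntax from the listing above
and simply moves the body of div10s into the loop. By my count it comes
to 19 bytes. I am assuming the same calling convention as before: the
dividend (below 800) in HL on entry, the quotient in A and the remainder
in H on return.

```
div10:
xor a ; quotient accumulates in A
ld de,#-640d ; -(10 * 64)
ld b,#7 ; seven quotient bits cover 0..79
divloop:
add a,a ; 2*a, make room for the next quotient bit
add hl,de ; trial subtract
jr c,div10s1 ; carry means trial passed
sbc hl,de ; undo previous add, carry is clear
dec a ; to nullify the following inc
div10s1:
inc a ; add to quotient
add hl,hl ; shift number to reuse same constant, 640 decimal
djnz divloop ; djnz leaves the flags alone
add hl,hl ; final shift, remainder ends up in H
ret
```

The label div10s1 is kept only so the listing lines up with the earlier
one; any local label would do.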