On Thu, 13 Jun 2024, Henry Bent via cctalk wrote:
I always assumed that was because the latency of multiply, let alone divide,
was far too many cycles for anyone to plausibly schedule "useful"
instructions into. Wasn't r4000 divide latency over 60 cycles?
The MIPS R4000 manual
https://groups.csail.mit.edu/cag/raw/documents/R4400_Uman_book_Ed2.pdf
says that double precision divide is 36 cycles, and double precision square
root is 112 (!).
Note that these figures are for floating-point arithmetic.
What's interesting about that is that GCC's model of the R4000 says that
divide is 69 cycles; I'm not sure of the reason for the discrepancy.
https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/mips/4000.md;h…
And this is for integer arithmetic; that's the reason.
Maciej