OT? Upper limits of FSB
Guy Sotomayor Jr
ggs at shiresoft.com
Tue Jan 8 16:51:51 CST 2019
Some architectures (I’m thinking of the latest Intel CPUs) have a small loop cache
whose aim is to hold a loop entirely within that cache. That cache runs at the full
speed of instruction fetch/execute (actually I think it holds the already-decoded uOps),
i.e. you can’t go any faster. It avoids both the L1 cache access penalty and the
instruction decode time.
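As a minimal sketch (not from the original exchange), here is a C comparison of a tight
summation loop against a hand-unrolled version. Whether the tight or unrolled loop wins
depends on the microarchitecture (loop/uOp cache size, decode bandwidth, branch handling),
which is exactly the point being discussed; the timings are only illustrative, and you'd
want to compile at a low optimization level (e.g. -O1) so the compiler doesn't rewrite
the loops itself.

#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define N 100000000ULL   /* iteration count, divisible by 4 for the unrolled loop */

int main(void)
{
    volatile uint64_t sink;   /* keeps the compiler from discarding the results */
    uint64_t s;
    clock_t t0, t1;

    /* Tight loop: body is small enough to stay resident in a loop/uOp cache. */
    s = 0;
    t0 = clock();
    for (uint64_t i = 0; i < N; i++)
        s += i;
    t1 = clock();
    sink = s;
    printf("tight loop:    %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

    /* Unrolled x4: fewer branch instructions executed, but a larger loop body. */
    s = 0;
    t0 = clock();
    for (uint64_t i = 0; i < N; i += 4) {
        s += i;
        s += i + 1;
        s += i + 2;
        s += i + 3;
    }
    t1 = clock();
    sink = s;
    printf("unrolled loop: %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

    (void)sink;
    return 0;
}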
TTFN - Guy
> On Jan 8, 2019, at 2:43 PM, Chuck Guzis via cctalk <cctalk at classiccmp.org> wrote:
>
> On 1/8/19 1:23 PM, Tapley, Mark via cctalk wrote:
>
>> Why so (why surprising, I mean)? Understood an unrolled loop executes
>> faster...
>
> That can't always be true, can it?
>
> I'm thinking of an architecture where the instruction cache is slow to
> fill and multiple overlapping operations are involved and branch
> prediction assumes a branch taken. I'd say it was very close in that case.
>
> --Chuck
>