Linus Torvalds writes: (Summary)
Yeah, it's only true on the very latest uarchs, and even there it's not perfect for small copies.
On the older machines that are relevant for 32-bit code, it's often tens of cycles just for the ucode overhead, I think, and "rep movsb" actually does things literally a byte at a time.
Linus
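
To illustrate the point about small copies, here is a minimal sketch in C of a copy routine that only uses "rep movsb" above a size cutoff and falls back to a plain byte loop below it. This is not code from the thread. The names copy_bytes, copy_rep_movsb, copy_small_loop and the SMALL_COPY_THRESHOLD value of 64 are all hypothetical, the cutoff would have to be measured per microarchitecture, and the inline asm assumes GCC or Clang on x86 (other targets fall back to memcpy so the sketch still builds).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff, not taken from the thread: the right value
 * depends on the microarchitecture and has to be measured. */
#define SMALL_COPY_THRESHOLD 64

/* Copy n bytes with "rep movsb" on x86 (GCC/Clang extended asm).
 * On other targets, fall back to memcpy so the sketch still builds.
 * Relies on the ABI guarantee that the direction flag is clear. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
    return ret;
}

/* Plain byte loop: no microcode startup cost to pay on short copies. */
static void *copy_small_loop(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Dispatch on size: short copies avoid "rep movsb" entirely. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n < SMALL_COPY_THRESHOLD)
        return copy_small_loop(dst, src, n);
    return copy_rep_movsb(dst, src, n);
}
```

Calling copy_bytes(dst, src, 5) takes the loop path, while copy_bytes(dst, src, 200) takes the "rep movsb" path. On the older 32-bit machines described above, where "rep movsb" really does move one byte at a time, a size cutoff alone would not be enough and the large-copy path would need a different implementation.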
[...]
is probably only true on modern CPUs.