As an academic exercise, I have attached two assembler alternatives as text files.

MEMSET.txt uses STC/MVC to set 256 bytes at a time, then STC and a variable-length MVC to set the remainder of fewer than 256 bytes, with optimizations so that nothing unneeded is set if N is zero or an integer multiple of 256.

MEMSET16.txt uses 2 more registers than MEMSET.txt but replaces the length=255 MVC instruction with multiple STore operations (loop 16 times, storing 4 bytes at a time 4 times on each pass, for 256 bytes in total).

MVC was notoriously slow on some real-iron IBM hardware models. I am not sure which technique would have been faster on any real 360-era iron, but there could be differences between the Hercules MVC and STore implementations that may make the STore solution faster (or not).

These are untested, so I could have errors in my coding. Corrections and improvements welcome.

A C solution doing the same sort of thing as MEMSET16.txt, using casts to int and taking advantage of the C compiler's optimization and code generation, could be even faster. Something along these lines, doing 16 bytes at a time (could obviously do 32 or 64 each loop as well, but 16 gives you the picture). Note that the word stores assume the target tolerates fullword stores at whatever alignment s has:

void *memset(void *s, int c, size_t n)
{
    size_t x;
    /* replicate the fill byte into all 4 bytes of a word;
       unsigned arithmetic avoids overflow when shifting by 24 */
    unsigned int b = (unsigned char)c;
    unsigned int cccc = b + (b << 8) + (b << 16) + (b << 24);

    /* whole 16-byte chunks: loop while a full chunk remains */
    for (x = 0; n - x >= 16; x += 16)
    {
        *((unsigned int *)((char *)s + x     )) = cccc;
        *((unsigned int *)((char *)s + x +  4)) = cccc;
        *((unsigned int *)((char *)s + x +  8)) = cccc;
        *((unsigned int *)((char *)s + x + 12)) = cccc;
    }

    /* remaining 0 to 15 bytes, one at a time */
    for (; x < n; x++)
    {
        *((char *)s + x) = (unsigned char)c;
    }

    return (s);
}

HTH

Peter

From: [email protected] <[email protected]> On Behalf Of adriansutherland67
Sent: Monday, April 13, 2020 1:33 PM
To: [email protected]
Subject: [h390-vm] memset help

Folks

void *memset(void *s, int c, size_t n)
{
    size_t x = 0;

    for (x = 0; x < n; x++)
    {
        *((char *)s + x) = (unsigned char)c;
    }

    return (s);
}
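
For comparison with the byte-at-a-time loop in the original question, here is a minimal sketch of the 16-bytes-at-a-time idea from the reply, written to avoid any alignment assumptions. It swaps the int casts for a constant-size memcpy of a 16-byte pattern, which a compiler may turn into wide stores (that is an assumption about the compiler, not something tested on Hercules). The name memset16_sketch is hypothetical, chosen only so it does not clash with the library memset, and the sketch assumes memcpy is already available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name; a sketch, not a drop-in library replacement. */
void *memset16_sketch(void *s, int c, size_t n)
{
    unsigned char *p = (unsigned char *)s;
    unsigned char byte = (unsigned char)c;  /* memset uses only the low byte */
    unsigned char chunk[16];
    size_t x;

    assert(s != NULL || n == 0);

    /* Build one 16-byte pattern of the fill byte. */
    for (x = 0; x < sizeof chunk; x++)
        chunk[x] = byte;

    /* Whole 16-byte chunks. x never exceeds n, so n - x cannot wrap. */
    for (x = 0; n - x >= sizeof chunk; x += sizeof chunk)
        memcpy(p + x, chunk, sizeof chunk);

    /* Remaining 0 to 15 bytes, one at a time. */
    for (; x < n; x++)
        p[x] = byte;

    return s;
}
```

Calling memset16_sketch(buf + 1, 0xAB, 37) on a deliberately misaligned pointer should fill exactly 37 bytes, the same as memset would. Whether this beats the plain loop or the MVC-based assembler depends on the compiler and on how Hercules emulates the resulting instructions, so it would need measuring.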