Re: memset help

 

Mark,

No doubt ... but MVCL/E are still millicode instructions. MVC is a hardware instruction.

Joe

On Wed, Apr 15, 2020 at 4:28 PM Mark L. Gaubatz via <mgaubatz=[email protected]> wrote:

Joe:

$MVCL was written many, many years ago. At the time, it was faster on certain machines. Need to re-validate this claim.

Also, at the time the macro was written, the cachelines were shorter. On several machines, MVCL performed additional checks whenever crossing a cacheline. On others, it was at page boundaries. Consequently, on some machines, $MVCL was faster, on others, slower. It entirely depends on the underlying firmware and hardware implementing the two instructions.

So please remember, some of the “official” text when referring to performance is long outdated.

Mark

From: [email protected] <[email protected]> On Behalf Of Joe Monk
Sent: Wednesday, April 15, 2020 11:32 AM
To: [email protected]
Subject: Re: [h390-vm] memset help

Even IBM admits that multiple MVCs are faster than MVCL...

"Use $MVCL to generate a MVC (move character) instruction when you need to move more than 256 bytes of storage. Use this macro instruction in high performance areas because multiple MVCs (as created by this macro) are faster than using an MVCL instruction."

Joe

On Wed, Apr 15, 2020 at 1:11 PM Peter Coghlan <groups@...> wrote:

Drew Derbyshire wrote:
> On 4/14/20 11:32 PM, Peter Coghlan wrote:
> > I wonder could this have been the COBOL compiler abusing MVCL instructions
> > in situations where they were not the appropriate instructions to use?
> >
> > Perhaps instructions such as MVCL would be expected to be "hot spots" because
> > they can deliver a relatively large amount of work for a single instruction?
> > Or is it that implementations of this instruction were sometimes poorer than
> > they ought to be and they were really not delivering bang for buck?
> I was told back in the 1980s that for performance reasons MUSIC moved
> 4096 bytes of data via a series of MVC commands in place of one MVCL.
>

Drew,

This is very interesting given our recent discussion on the matter.

(By the way, I replied to your recent "Inquiring minds" email but I fear you
may not have seen my reply as Google tends to route anything I send these days
into their recipients' spam folders or otherwise quarantine it. Apparently
Google has regarded me as a notorious source of spam or something for some time now.
I also sent emails to other Google mail users, both on this list and not, and
these also seem to have disappeared into black holes...)

Regards,
Peter Coghlan

> --
> Drew Derbyshire
>
> "All right, Mr. DeMille, I'm ready for my close-up." -- "Sunset Blvd.,"
>
>



Re: memset help

 

Joe:

$MVCL was written many, many years ago. At the time, it was faster on certain machines. Need to re-validate this claim.

Also, at the time the macro was written, the cachelines were shorter. On several machines, MVCL performed additional checks whenever crossing a cacheline. On others, it was at page boundaries. Consequently, on some machines, $MVCL was faster, on others, slower. It entirely depends on the underlying firmware and hardware implementing the two instructions.

So please remember, some of the “official” text when referring to performance is long outdated.

Mark

From: [email protected] <[email protected]> On Behalf Of Joe Monk
Sent: Wednesday, April 15, 2020 11:32 AM
To: [email protected]
Subject: Re: [h390-vm] memset help

Even IBM admits that multiple MVCs are faster than MVCL...

"Use $MVCL to generate a MVC (move character) instruction when you need to move more than 256 bytes of storage. Use this macro instruction in high performance areas because multiple MVCs (as created by this macro) are faster than using an MVCL instruction."

Joe

On Wed, Apr 15, 2020 at 1:11 PM Peter Coghlan <groups@...> wrote:

Drew Derbyshire wrote:
> On 4/14/20 11:32 PM, Peter Coghlan wrote:
> > I wonder could this have been the COBOL compiler abusing MVCL instructions
> > in situations where they were not the appropriate instructions to use?
> >
> > Perhaps instructions such as MVCL would be expected to be "hot spots" because
> > they can deliver a relatively large amount of work for a single instruction?
> > Or is it that implementations of this instruction were sometimes poorer than
> > they ought to be and they were really not delivering bang for buck?
> I was told back in the 1980s that for performance reasons MUSIC moved
> 4096 bytes of data via a series of MVC commands in place of one MVCL.
>

Drew,

This is very interesting given our recent discussion on the matter.

(By the way, I replied to your recent "Inquiring minds" email but I fear you
may not have seen my reply as Google tends to route anything I send these days
into their recipients' spam folders or otherwise quarantine it. Apparently
Google has regarded me as a notorious source of spam or something for some time now.
I also sent emails to other Google mail users, both on this list and not, and
these also seem to have disappeared into black holes...)

Regards,
Peter Coghlan

> --
> Drew Derbyshire
>
> "All right, Mr. DeMille, I'm ready for my close-up." -- "Sunset Blvd.,"
>
>



Re: REXX Interpreter immediate commands

 

Bob ... Noted

Am I right in assuming that the concept of processes receiving signals / interrupts, and of registering signal handlers is just alien to VM/370 and can be ignored?

A


Re: OT, mail issues, was Re: [h390-vm] memset help

 

Does your email server implement SPF, DKIM and DMARC?

A


Re: GCCLIB - Double Branch on calls to RESLIB library - needed?

 

On Tue, Apr 14, 2020 at 02:51 PM, Joe Monk wrote:
LA R15,0(R15) would add 0 to the value of R15. I don't think that's what you want.
Perhaps I mean just L? Anyway I am guessing from silence that the double branch is as good as it gets!


OT, mail issues, was Re: [h390-vm] memset help

 

On 4/15/20 2:07 PM, Peter Coghlan wrote:
(By the way, I replied to your recent "Inquiring minds" email but I fear you
may not have seen my reply as Google tends to route anything I send these days
into their recipients' spam folders or otherwise quarantine it. Apparently
Google has regarded me as a notorious source of spam or something for some time now.
I also sent emails to other Google mail users, both on this list and not, and
these also seem to have disappeared into black holes...)
I am noticing this to an increasing degree as well. My assumption is
that Google, having spent years pushing "free" email services while
making money on the back end by using the information they mine from it,
are now trying to push yet more people to use gmail accounts.

My mail server predates the very existence of Google as a company by
many years, and has only been used as a spam relay once, a decade ago,
for not quite a day, until I noticed that one of my hundred-or-so users'
passwords had been cracked. The only reason I can see for email
originating from my server to be considered spam by Google is that it
comes from a non-gmail server.

This is just my assumption; I have no evidence to suggest that this is
why this is happening, but it is the only explanation that I've been
able to come up with for this widespread and rapidly growing problem.

This is what happens when people take the lazy or cheap route and get
a "free" email account from a for-profit corporation. Google
occasionally does good things for society, but they are not a charity.
As many have said, and almost nobody actually pays attention to: If you
receive value from a corporation for free, you are the product.

-Dave

--
Dave McGuire, AK4HZ
New Kensington, PA


Re: memset help

 

-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Harold
Grovesteen
Sent: 15 April 2020 20:15
To: [email protected]
Subject: Re: [h390-vm] memset help

On Wed, 2020-04-15 at 14:25 +0100, Dave Wade wrote:


-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Harold
Grovesteen
Sent: 15 April 2020 14:15
To: [email protected]
Subject: Re: [h390-vm] memset help

On Wed, 2020-04-15 at 13:24 +0100, Steven Fosdick wrote:


I did wonder about the possibility of setting up gcc as a
cross-compiler but that doesn't seem trivial to do.

Steve.
Yes. I did accomplish this and it is documented with scripts in the
SATK. It used GNU as as the assembler for stand alone, aka bare
metal, coding on Hercules. After literally years of work on that,
it just did not work well enough and I changed to a new toolset that
is part of the project.

However, the key difference between GCC on VM and other operating
systems supported by Hercules and GCC as used on Linux is the output
format. GCC typically generates ELF object module files. The GCC
on VM generates mainframe object modules. Huge difference and a
fundamental reason this GCC is used with the operating systems that
run on Hercules.

With my pedants' hat on, it actually generates "normal" Assembler that
is assembled by the XF assembler on VM or MVS.
That is why we get "normal" object files which can be loaded with the
VM loader or the OS Linkage Editor.
I have tried feeding the assembler into Assembler G with poor results.
I haven't tried Assembler H...
.... it's pretty easy to produce a GCC that compiles 370 code on
Windows/Linux. After all that’s how I built the first GCCCMS.
Getting all the assembler to CMS to compile it was the fun part..
No doubt.
Yes, the generated assembler is the path to the object. And the object
dictates where the program can run.

BTW, can CMS compile GCC on VM/370? There were issues with that on
MVS.
Originally it could on VM/370, but not on SP or later because you lose memory. I don't know how Paul builds it these days, he may use the 380 mods..
There were some modules that needed different optimization, but I can't remember which ones they were.
Because CMS is so small you get more usable memory than on MVS.

Dave





Harold Grovesteen

Dave







Re: memset help

 

On Wed, 2020-04-15 at 14:25 +0100, Dave Wade wrote:


-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Harold
Grovesteen
Sent: 15 April 2020 14:15
To: [email protected]
Subject: Re: [h390-vm] memset help

On Wed, 2020-04-15 at 13:24 +0100, Steven Fosdick wrote:


I did wonder about the possibility of setting up gcc as a
cross-compiler but that doesn't seem trivial to do.

Steve.
Yes. I did accomplish this and it is documented with scripts in
the SATK. It used GNU as as the assembler for stand alone, aka bare
metal, coding on Hercules. After literally years of work on that,
it just did not work well enough and I changed to a new toolset that
is part of the project.

However, the key difference between GCC on VM and other operating
systems supported by Hercules and GCC as used on Linux is the output
format. GCC typically generates ELF object module files. The GCC
on VM generates mainframe object modules. Huge difference and a
fundamental reason this GCC is used with the operating systems that
run on Hercules.

With my pedants' hat on, it actually generates "normal" Assembler
that is assembled by the XF assembler on VM or MVS.
That is why we get "normal" object files which can be loaded with the
VM loader or the OS Linkage Editor.
I have tried feeding the assembler into Assembler G with poor results.
I haven't tried Assembler H...
.... it's pretty easy to produce a GCC that compiles 370 code on
Windows/Linux. After all that’s how I built the first GCCCMS.
Getting all the assembler to CMS to compile it was the fun part..
No doubt.
Yes, the generated assembler is the path to the object. And the object
dictates where the program can run.

BTW, can CMS compile GCC on VM/370? There were issues with that on
MVS.



Harold Grovesteen

Dave







Re: memset help

 

On Wed, 2020-04-15 at 07:25 -0700, adriansutherland67 wrote:
On Wed, Apr 15, 2020 at 08:02 AM, adriansutherland67 wrote:
it takes me about 5 mins to write a single line of S/370 assembler
not counting debugging!
I do indeed loathe assembler - especially as I use the "infinite
number of monkeys" method - however I have managed to detect the
stack running out of space. Big day!

A
That is how we all learned. Getting slapped a few times helps with the
memory. There is no way to learn assembler, like any language, without
doing it.

When I started, application programs were frequently written in
assembler. One shop I worked at had a large CICS system with all
applications in assembler.

Harold


Re: memset help

 

Even IBM admits that multiple MVCs are faster than MVCL...

"Use $MVCL to generate a MVC (move character) instruction when you need to move more than 256 bytes of storage. Use this macro instruction in high performance areas because multiple MVCs (as created by this macro) are faster than using an MVCL instruction."
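
For readers who think in C rather than assembler, the idea in the quoted macro description can be sketched roughly as below. This is only an illustration under stated assumptions: `move_in_chunks` and `MVC_MAX` are hypothetical names, each `memcpy` call merely stands in for one MVC, and the sketch says nothing about which approach is faster on any given machine.

```c
#include <stddef.h>
#include <string.h>

/* Largest length a single MVC can move (its length field is 8 bits). */
#define MVC_MAX 256

/*
 * Hypothetical helper: copy len bytes as a series of moves of at most
 * MVC_MAX bytes each, the way a macro might expand into several MVCs.
 * Each memcpy call stands in for one MVC. Returns the number of moves.
 * Source and destination must not overlap.
 */
static size_t move_in_chunks(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t moves = 0;

    while (len > 0) {
        size_t n = len < MVC_MAX ? len : MVC_MAX;
        memcpy(d, s, n);   /* one MVC-sized move */
        d += n;
        s += n;
        len -= n;
        moves++;
    }
    return moves;
}
```

With this model, a 4096-byte move takes 16 moves and a 600-byte move takes 3 (256 + 256 + 88).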



Joe

On Wed, Apr 15, 2020 at 1:11 PM Peter Coghlan <groups@...> wrote:
Drew Derbyshire wrote:
> On 4/14/20 11:32 PM, Peter Coghlan wrote:
> > I wonder could this have been the COBOL compiler abusing MVCL instructions
> > in situations where they were not the appropriate instructions to use?
> >
> > Perhaps instructions such as MVCL would be expected to be "hot spots" because
> > they can deliver a relatively large amount of work for a single instruction?
> > Or is it that implementations of this instruction were sometimes poorer than
> > they ought to be and they were really not delivering bang for buck?
> I was told back in the 1980s that for performance reasons MUSIC moved
> 4096 bytes of data via a series of MVC commands in place of one MVCL.
>

Drew,

This is very interesting given our recent discussion on the matter.

(By the way, I replied to your recent "Inquiring minds" email but I fear you
may not have seen my reply as Google tends to route anything I send these days
into their recipients' spam folders or otherwise quarantine it. Apparently
Google has regarded me as a notorious source of spam or something for some time now.
I also sent emails to other Google mail users, both on this list and not, and
these also seem to have disappeared into black holes...)

Regards,
Peter Coghlan

> --
> Drew Derbyshire
>
> "All right, Mr. DeMille, I'm ready for my close-up." -- "Sunset Blvd.,"
>
>




Re: memset help

 

Drew Derbyshire wrote:
On 4/14/20 11:32 PM, Peter Coghlan wrote:
I wonder could this have been the COBOL compiler abusing MVCL instructions
in situations where they were not the appropriate instructions to use?

Perhaps instructions such as MVCL would be expected to be "hot spots" because
they can deliver a relatively large amount of work for a single instruction?
Or is it that implementations of this instruction were sometimes poorer than
they ought to be and they were really not delivering bang for buck?
I was told back in the 1980s that for performance reasons MUSIC moved
4096 bytes of data via a series of MVC commands in place of one MVCL.
Drew,

This is very interesting given our recent discussion on the matter.

(By the way, I replied to your recent "Inquiring minds" email but I fear you
may not have seen my reply as Google tends to route anything I send these days
into their recipients' spam folders or otherwise quarantine it. Apparently
Google has regarded me as a notorious source of spam or something for some time now.
I also sent emails to other Google mail users, both on this list and not, and
these also seem to have disappeared into black holes...)

Regards,
Peter Coghlan

--
Drew Derbyshire

"All right, Mr. DeMille, I'm ready for my close-up." -- "Sunset Blvd.,"


Re: memset help

 

On 4/14/20 11:32 PM, Peter Coghlan wrote:
I wonder could this have been the COBOL compiler abusing MVCL instructions
in situations where they were not the appropriate instructions to use?

Perhaps instructions such as MVCL would be expected to be "hot spots" because
they can deliver a relatively large amount of work for a single instruction?
Or is it that implementations of this instruction were sometimes poorer than
they ought to be and they were really not delivering bang for buck?
I was told back in the 1980s that for performance reasons MUSIC moved 4096 bytes of data via a series of MVC commands in place of one MVCL.

--
Drew Derbyshire

"All right, Mr. DeMille, I'm ready for my close-up." -- "Sunset Blvd.,"


Re: memset help

 

On Wed, Apr 15, 2020 at 08:02 AM, adriansutherland67 wrote:
it takes me about 5 mins to write a single line of S/370 assembler not counting debugging!
I do indeed loathe assembler - especially as I use the "infinite number of monkeys" method - however I have managed to detect the stack running out of space. Big day!

A


Re: memset help

 

On Wed, Apr 15, 2020 at 01:42 PM, pjfarley3 wrote:
Apologies for the mistake.
No drama!


Re: memset help

 

Adrian,

I made a coding error in all three of the MEMSET texts that I sent: in all of them I have this assembler instruction to declare register 15 as the code base register:

         USING 15,*          USE R15 AS BASE REG

Wrong order of the operands. Please change this to the following in all three versions:

         USING *,15          USE R15 AS BASE REG

The syntax is “ USING BASEADDR,BASEREG[,BASEREG]* “

Apologies for the mistake.

Peter

From: [email protected] <[email protected]> On Behalf Of adriansutherland67
Sent: Wednesday, April 15, 2020 3:02 AM
To: [email protected]
Subject: Re: [h390-vm] memset help

All interesting ... and I will try out each candidate and report back. It will be tested only on Hercules so in one sense not a fair test. On the other hand we could argue that Hercules is S/370 done well ... meaning I agree with a comment that if a CPU manufacturer defines a bulk move command they should implement it fast!

One thing, modern compilers generally produce faster code than a human assembler programmer. This is because of both front end optimisations (e.g. reassigning / calculating values that have not changed), and backend optimisation based on loop unrolling, and instruction reordering based on CPU pipelines etc.

This is why LLVM is becoming the one toolchain to rule them all ... even IBM works with them to ensure mainframe CPU internals are fully leveraged.

A


Re: memset help

 

-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Harold
Grovesteen
Sent: 15 April 2020 14:15
To: [email protected]
Subject: Re: [h390-vm] memset help

On Wed, 2020-04-15 at 13:24 +0100, Steven Fosdick wrote:

I did wonder about the possibility of setting up gcc as a
cross-compiler but that doesn't seem trivial to do.

Steve.
Yes. I did accomplish this and it is documented with scripts in the SATK. It
used GNU as as the assembler for stand alone, aka bare metal, coding on
Hercules. After literally years of work on that, it just did not work well
enough and I changed to a new toolset that is part of the project.

However, the key difference between GCC on VM and other operating
systems supported by Hercules and GCC as used on Linux is the output
format. GCC typically generates ELF object module files. The GCC on VM
generates mainframe object modules. Huge difference and a fundamental
reason this GCC is used with the operating systems that run on Hercules.

With my pedants' hat on, it actually generates "normal" Assembler that is assembled by the XF assembler on VM or MVS.
That is why we get "normal" object files which can be loaded with the VM loader or the OS Linkage Editor.
I have tried feeding the assembler into Assembler G with poor results.
I haven't tried Assembler H...
.... it's pretty easy to produce a GCC that compiles 370 code on Windows/Linux. After all that’s how I built the first GCCCMS.
Getting all the assembler to CMS to compile it was the fun part..


Harold Grovesteen

Dave



Re: memset help

 

On Wed, 2020-04-15 at 13:24 +0100, Steven Fosdick wrote:
I did wonder about the possibility of setting up gcc as a
cross-compiler but that doesn't seem trivial to do.

Steve.
Yes. I did accomplish this and it is documented with scripts in the
SATK. It used GNU as as the assembler for stand alone, aka bare metal,
coding on Hercules. After literally years of work on that, it just did
not work well enough and I changed to a new toolset that is part of the
project.

However, the key difference between GCC on VM and other operating
systems supported by Hercules and GCC as used on Linux is the output
format. GCC typically generates ELF object module files. The GCC on
VM generates mainframe object modules. Huge difference and a
fundamental reason this GCC is used with the operating systems that run
on Hercules.

Harold Grovesteen


Re: memset help

 

On Wed, 2020-04-15 at 00:02 -0700, adriansutherland67 wrote:
One thing, modern compilers generally produce faster code than a
human assembler programmer. This is because of both front end
optimisations (e.g. reassigning / calculating values that have not
changed), and backend optimisation based on loop unrolling, and
instruction reordering based on CPU pipelines etc.
I remind everyone working with GCC on VM/370 that it is a port of the
old i370 architecture version of GCC. The modern optimizations are
likely to be quite limited. This GCC is not the same version of GCC
that is used today.

The GCC group had decided to remove i370 from the product because of
its lack of use or development. It was rescued and modified to run on
the various mainframe operating systems.

Whether this is a consideration or not, there are other versions of
this GCC that are essentially the same. The GCCs that run on other
operating systems are built from the same source code. I do not know enough
of the inner workings to know where the operating system dependent code
exists (which is different for each) and what is common.

If it is important that this altered version of GCC on VM/370 is source
compatible, care needs to be taken as to what is altered.

Just a heads up, but I think this should be understood.

Harold Grovesteen


Re: memset help

 

Hi Steven,

No, not that I know of. But it is good to remember, because assumptions that it will end during a scheduler slice can be false.
That is, in z/OS, where, a long time ago, I came across errors based on that assumption, some of them my own. I don’t know about VM.

René.


On 15 Apr 2020, at 14:24, Steven Fosdick <stevenfosdick@...> wrote:

On Tue, 14 Apr 2020 at 23:14, rvjansen@... <rvjansen@...> wrote:

The other thing to remember about mvcl is that it is interruptible.

Is that part of the performance issue? I mean indirectly, of course.


Re: memset help

 

On Tue, 14 Apr 2020 at 23:14, rvjansen@... <rvjansen@...> wrote:

The other thing to remember about mvcl is that it is interruptible.
Is that part of the performance issue? I mean indirectly, of course.
I don't really know the 370 architecture but I have come across a
similar move instruction, LDIR on the Z80 that is rather slow. That's
a relative term because it's still faster than writing the loop
yourself.

In the case of LDIR there is a non-repeating version (LDI) which loads
the value whose address is in register HL, stores it to the address in
register DE, increments HL and DE and decrements BC. The repeating
version works by adding a final step of testing BC and, if that is not
zero, it forces the program counter back to the address of the LDIR
instruction. That means the LDIR instruction is now re-fetched,
re-decoded and re-executed and the process repeats for each byte moved
until BC becomes zero.

It is interruptible and the interrupt is serviced just before the LDIR
instruction is re-fetched so it would push the values for BC, DE, HL
from halfway through the move, service the interrupt, then pop those
values and carry on where it left off.
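
The LDI/LDIR behaviour described above can be modelled in a few lines of C. This is a rough sketch, not an emulator: the `struct z80` layout and the function names are made up for illustration, flags and timing are ignored, and the return value simply counts how many times the instruction would have been fetched and executed.

```c
#include <stdint.h>

/* Minimal model of the Z80 registers and memory used by LDI/LDIR. */
struct z80 {
    uint16_t hl, de, bc;     /* source, destination, byte count */
    uint8_t  mem[65536];     /* flat 64 KiB address space */
};

/* LDI: copy one byte from (HL) to (DE), then HL++, DE++, BC--. */
static void ldi(struct z80 *cpu)
{
    cpu->mem[cpu->de] = cpu->mem[cpu->hl];
    cpu->hl++;
    cpu->de++;
    cpu->bc--;
}

/*
 * LDIR modelled as described above: one LDI step, then the instruction
 * is executed again while BC is not zero. Returns the number of times
 * the instruction was executed. An interrupt could be serviced between
 * passes, because HL, DE and BC always hold the progress so far.
 */
static unsigned long ldir(struct z80 *cpu)
{
    unsigned long fetches = 0;

    do {
        fetches++;           /* each pass stands for a re-fetch and re-decode */
        ldi(cpu);
    } while (cpu->bc != 0);
    return fetches;
}
```

Moving three bytes therefore costs three fetch/decode/execute passes in this model, which is the overhead the description above points at.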

Back to memset on 370, it's great to have an efficient implementation
in the library but having the compiler inline it would make it even
faster. Apart from removing the call overhead the compiler may know
the length already, i.e. it may be a constant expression, and can thus
avoid tests and having two loops, one for 256 bytes and one for the
remainder. It should also know if the data are aligned. I know gcc
can and does do this, at least on X86 - here's an example, first the
C:

#include <string.h>

extern void do_something(char *x);

int main(int argc, char *argv[])
{
    char x[45];
    memset(x, 0, sizeof(x));
    do_something(x);
    return 0;
}

I have deliberately declared an external function to receive the
string so the compiler does not detect dead code and remove it
altogether. Here's the result of compiling to assembler:

        .file "memstst.c"
        .text
        .section .text.startup,"ax",@progbits
        .p2align 4
        .globl main
        .type main, @function
main:
.LFB0:
        .cfi_startproc
        subq $72, %rsp
        .cfi_def_cfa_offset 80
        pxor %xmm0, %xmm0
        movq %fs:40, %rax
        movq %rax, 56(%rsp)
        xorl %eax, %eax
        movq %rsp, %rdi
        movl $0, 40(%rsp)
        movq $0, 32(%rsp)
        movb $0, 44(%rsp)
        movaps %xmm0, (%rsp)
        movaps %xmm0, 16(%rsp)
        call do_something@PLT
        movq 56(%rsp), %rax
        xorq %fs:40, %rax
        jne .L5
        xorl %eax, %eax
        addq $72, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
.L5:
        .cfi_restore_state
        call __stack_chk_fail@PLT
        .cfi_endproc
.LFE0:
        .size main, .-main
        .ident "GCC: (Arch Linux 9.3.0-1) 9.3.0"
        .section .note.GNU-stack,"",@progbits

Note the absence of a call to memset. So the core of this is using a
zeroed 16 byte wide register:

movaps %xmm0, (%rsp)
movaps %xmm0, 16(%rsp)

for the lower part of the space and a selection of other lengths for
the remainder:

movl $0, 40(%rsp)
movq $0, 32(%rsp)
movb $0, 44(%rsp)
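
The arithmetic behind those five stores (16 + 16 + 8 + 4 + 1 = 45) can be reproduced with a small widest-first split. This is just a sketch of the counting, with the hypothetical name `plan_stores`; it is not a description of how gcc actually chooses its expansion, which depends on target, alignment and tuning options.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split len bytes into stores of 16, 8, 4, 2 and 1
 * bytes, widest first, and record each store's width in widths[].
 * Returns the number of stores (0 for len 0, or if widths[] is too small).
 */
static size_t plan_stores(size_t len, size_t widths[], size_t max)
{
    static const size_t sizes[] = { 16, 8, 4, 2, 1 };
    size_t count = 0;

    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
        while (len >= sizes[i]) {
            if (count == max)
                return 0;            /* caller's array is too small */
            widths[count++] = sizes[i];
            len -= sizes[i];
        }
    }
    return count;
}
```

For a length of 45 this yields five stores of widths 16, 16, 8, 4 and 1, matching the two movaps, one movq, one movl and one movb in the listing.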

I did wonder about the possibility of setting up gcc as a
cross-compiler but that doesn't seem trivial to do.

Steve.