If you have problems during the execution of MRCC, please attach the output with an adequate description of your case, as well as the following:

the way mrcc was invoked
the way build.mrcc was invoked
the output of build.mrcc
compiler version (for example: ifort -V, gfortran -v)
blas/lapack versions
gcc and glibc versions

This information really helps us during troubleshooting.

OpenMP performance for CCSDTQ, etc.

TiborGY
Junior Member

3 years 11 months ago #1253 by TiborGY

Replied by TiborGY on topic OpenMP performance for CCSDTQ, etc.

Dear all,

For what it's worth, on systems with plenty of RAM and a fast NVMe SSD, parallel scaling does not appear to be limited by disk access speed, but by the amount of time spent outside parallel regions (or inside critical sections, which is serial either way).

Running a fairly large CCSDT calculation on 12 OMP threads, I observed the CPU usage alternating between 1 core and 12 cores.
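As a rough illustration of why time spent in serial phases matters so much, Amdahl's law bounds the achievable speedup by the fraction of the runtime that cannot be parallelized. The sketch below is only an estimate under that idealized model; the function name `amdahl_speedup` and the serial fractions in the loop are hypothetical values chosen for illustration, not measurements from MRCC.

```python
def amdahl_speedup(serial_fraction, threads):
    """Ideal speedup when `serial_fraction` of the single-thread runtime stays serial."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Amdahl's law: the serial part is unchanged, the rest is divided across threads.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


if __name__ == "__main__":
    # Hypothetical serial fractions, not measured values.
    for s in (0.01, 0.05, 0.10, 0.25):
        print(f"serial fraction {s:.2f}: speedup on 12 threads = {amdahl_speedup(s, 12):.2f}")
```

Even a modest serial fraction caps the benefit of 12 threads well below 12x, which would be consistent with CPU usage dropping back to a single core for noticeable stretches.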

I have not looked at the code yet, but as a first guess, finding a way to parallelize the remaining serial sections should improve parallel scaling, even if heavy I/O is involved. Modern SSDs not only tolerate parallel I/O; many can reach their advertised peak throughput only when servicing multiple I/O requests in parallel (queue depth > 1).
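To make the queue-depth point concrete, here is a minimal sketch of issuing several read requests at once instead of one after another. It is not MRCC's I/O code (MRCC is not written in Python); the helper names `read_chunk` and `read_file_parallel`, the chunk size, and the worker count are all assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_chunk(path, offset, size):
    # Each task opens its own handle so seeks in one thread cannot disturb another.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def read_file_parallel(path, chunk_size=1 << 20, workers=4):
    """Read a file as independent chunks submitted concurrently to a thread pool."""
    total = os.path.getsize(path)
    offsets = range(0, total, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the chunks can be joined directly.
        chunks = pool.map(lambda off: read_chunk(path, off, chunk_size), offsets)
        return b"".join(chunks)
```

With `workers` greater than 1, several reads can be outstanding at the same time, which is the access pattern that lets many SSDs approach their rated throughput. Whether it helps in practice depends on the drive, the file system, and how much of the file is already in the page cache.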

Of course, as always, this is easier said than done.

Last edit: 3 years 11 months ago by TiborGY.


kipeters
Topic Author
Senior Member

3 years 10 months ago #1256 by kipeters

Replied by kipeters on topic OpenMP performance for CCSDTQ, etc.

Just an update to this thread: it seems the issues I've been seeing with OpenMP performance are due to the interface with Molpro. Invoking mrcc by itself shows the expected thread activity. No fix yet, but now I know where to look.
