If you have problems during the execution of MRCC, please attach the output with an adequate description of your case, as well as the following:
- the way mrcc was invoked
- the way build.mrcc was invoked
- the output of build.mrcc
- compiler version (for example: ifort -V, gfortran -v)
- blas/lapack versions
- as well as gcc and glibc versions
This information really helps us during troubleshooting
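The version information in the checklist above can be gathered in one go. The following is only a minimal sketch: the `ifort -V` and `gfortran -v` commands come from the checklist, while `gcc --version`, `ldd --version` (for glibc), and the helper name `collect_versions` are assumptions made for illustration. BLAS/LAPACK versions are not covered, since how to query them depends on the library used.

```python
import shutil
import subprocess

# ifort/gfortran flags are taken from the checklist above;
# the gcc and ldd (glibc) commands are assumptions for a typical Linux system.
VERSION_COMMANDS = {
    "ifort": ["ifort", "-V"],
    "gfortran": ["gfortran", "-v"],
    "gcc": ["gcc", "--version"],
    "glibc": ["ldd", "--version"],
}


def collect_versions(commands=None):
    """Run each version command that exists on PATH and return its raw output."""
    commands = VERSION_COMMANDS if commands is None else commands
    report = {}
    for label, argv in commands.items():
        if shutil.which(argv[0]) is None:
            report[label] = "not found on PATH"
            continue
        try:
            result = subprocess.run(
                argv,
                capture_output=True,
                text=True,
                timeout=30,
                stdin=subprocess.DEVNULL,
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            report[label] = f"failed: {exc}"
            continue
        # Some compilers print their version banner to stderr, so keep both streams.
        report[label] = (result.stdout + result.stderr).strip()
    return report


if __name__ == "__main__":
    for label, text in collect_versions().items():
        print(f"== {label} ==\n{text}\n")
```

The printed report can be pasted into a post together with the MRCC output and the way mrcc and build.mrcc were invoked.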
poor performance of OpenMP version
9 years 3 weeks ago #213
by kipeters
poor performance of OpenMP version was created by kipeters
I've just downloaded the most recent version of MRCC and built it with the Intel compiler (v.14) and the -pOMP option. Previously I had built an older version with the PGI compiler and had been using that for some time. Running from the Molpro interface (but setting the value of OMP_NUM_THREADS externally and using serial Molpro), the Intel-compiled version shows a significant slowdown when the number of threads is greater than 1 in CCSDT calculations (simple test). This is completely contrary to my previous PGI version, where I did get some speedup. I've used both the default Intel build (uses mkl=parallel) and one where I've inserted openmp throughout and then linked to mkl as in Molpro (this is similar to what I had done before with the PGI compiler).
Since my CPUs are all Intel SandyBridge processors, I had expected a strong advantage from using the Intel compiler.
Thanks in advance, -Kirk
9 years 3 weeks ago #214
by kipeters
Replied by kipeters on topic poor performance of OpenMP version
A quick addition to this post. Tonight I also successfully got the newest version of MRCC to build using the PGF compiler (v.12.10) in conjunction with gcc, also using -pOMP. In that case I see reasonable scaling as I increase the number of threads, just as in my older 2012 version of MRCC.
-Kirk
9 years 3 weeks ago #215
by kallay
Replied by kallay on topic poor performance of OpenMP version
Dear Kirk,
That is strange. I have just run a quick test, H2O2 CCSDT/cc-pVTZ, and I got a speedup of about two on a quad-core Intel i7-4700MQ CPU. I use Intel compiler version 14.0.2.144.
Are you sure that the OMP_NUM_THREADS and MKL_NUM_THREADS environment variables are set correctly?
Best regards,
Mihaly Kallay
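A quick way to check what the two variables mentioned above are set to is to print them from the same shell or job script that launches the calculation. The snippet below is a minimal sketch; the helper name `thread_settings` is made up for illustration, and it only reports the values without judging whether they are appropriate.

```python
import os

# The two environment variables named in the post above.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS")


def thread_settings(env=None):
    """Return the current value of each thread variable, or None if it is unset."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in THREAD_VARS}


if __name__ == "__main__":
    for name, value in thread_settings().items():
        shown = value if value is not None else "(not set)"
        print(f"{name} = {shown}")
```

Running it in the environment used for the MRCC job shows whether a variable is missing or has an unexpected value.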
9 years 2 weeks ago #216
by kipeters
Replied by kipeters on topic poor performance of OpenMP version
Dear Mihaly,
Actually, I've never set MKL_NUM_THREADS, only the OMP one (I thought the latter took precedence). What should the MKL environment variable be set to?
best, -Kirk
9 years 2 weeks ago #217
by kipeters
Replied by kipeters on topic poor performance of OpenMP version
Dear Mihaly,
So if I set MKL_NUM_THREADS=1, a 4-core calculation is 2x slower than a single-core one (as specified by using OMP_NUM_THREADS). Obviously I have something set incorrectly.
-Kirk
9 years 2 weeks ago #218
by kipeters
Replied by kipeters on topic poor performance of OpenMP version
Dear Mihaly,
I think everything is actually fine after all. It seems that Molpro prints incorrect CPU times. I just happened to look at the wall times it prints, and these show the correct scaling: not great, but a factor of 2 speed-up when going from 1 core to 4.
Sorry for stirring this up for nothing.
-Kirk
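The difference between the two timers can be illustrated with a little arithmetic. This is a sketch with hypothetical timings, chosen only to mirror the factor of 2 reported above; it also assumes that the reported CPU time is summed over all threads, which is a common convention but is not confirmed for Molpro in this thread.

```python
def speedup(t_one_thread, t_n_threads):
    """Single-thread time divided by multi-thread time (above 1 means faster)."""
    return t_one_thread / t_n_threads


def parallel_efficiency(t_one_thread, t_n_threads, n_threads):
    """Speedup divided by the number of threads (1.0 would be ideal scaling)."""
    return speedup(t_one_thread, t_n_threads) / n_threads


# Hypothetical wall times in seconds for 1 and 4 threads.
wall_1, wall_4 = 100.0, 50.0

# Assumption: CPU time is accumulated over all threads, and every thread is
# busy for the whole run, so 4 threads report 4 times the wall time.
cpu_1, cpu_4 = wall_1, 4 * wall_4

print("wall-time speedup:", speedup(wall_1, wall_4))
print("parallel efficiency:", parallel_efficiency(wall_1, wall_4, 4))
print("apparent 'speedup' from CPU times:", speedup(cpu_1, cpu_4))
```

Under these assumptions the wall times give a speedup of 2 on 4 threads, while comparing CPU times makes the same run look twice as slow, so wall time is the quantity to compare when judging OpenMP scaling.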