If you have problems during the execution of MRCC, please attach the output with an adequate description of your case as well as the following:
- the way mrcc was invoked
- the way build.mrcc was invoked
- the output of build.mrcc
- compiler version (for example: ifort -V, gfortran -v)
- blas/lapack versions
- as well as gcc and glibc versions
This information really helps us during troubleshooting
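A short script can gather most of this in one go; the binary names (dmrcc, integ) and the compilers probed below are assumptions, so adjust them to your installation:

```shell
#!/bin/sh
# Collect the information requested above into versions.txt, ready to
# attach to a forum post. The binary names (dmrcc, integ) and the
# compilers probed here are assumptions; adjust them to your setup.
{
  echo "== MRCC binaries on PATH =="
  which dmrcc integ 2>/dev/null || echo "(not found)"
  echo "== Fortran compiler =="
  command -v ifort >/dev/null 2>&1 && ifort -V 2>&1 | head -n 1
  command -v gfortran >/dev/null 2>&1 && gfortran --version | head -n 1
  echo "== gcc and glibc versions =="
  gcc --version 2>/dev/null | head -n 1
  ldd --version 2>/dev/null | head -n 1
} > versions.txt
cat versions.txt
```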
Problems requesting large memory for CCSDT(Q)
8 years 2 months ago #288
by Nike
Replied by Nike on topic Problems requesting large memory for CCSDT(Q)
Dear Mihaly,
Thank you very much for the prompt reply.
Code:
which integ
gives:
Code:
~/MRCC/integ
But your first suggestion is probably what happened.
There is something strange going on:
I started using the stand-alone MRCC because with the MOLPRO interface, the job ran for 30 days and the end of the output file is still the same:
Code:
MRCC Input:
4 1 0 0 0 0 0 1 0 1 1 1 0 0 0 7 0 0 0.000 0 67200
ex.lev,nsing,ntrip, rest,method,dens,conver,symm, diag, CS ,spatial, HF, ndoub,nacto,nactv, tol, maxex, sacc, freq, dboc, mem
2 2 2 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
MRCC Input end
Variable memory released
I also opened the file /mrcc_9011/iface, and it just says:
Code:
#property method sym st mul value CPU(sec) Wall(sec)
but no entries filled in after 30 days.
In CFOUR I got a bit further:
Code:
OpenMP parallel version is running.
Number of CPUs: 24
Allocation of 64533.8 Mbytes of memory...
followed by the iterations, but it took 5 days of wall time for the first iteration.
Maybe it was slow because MRCC really wanted 1TB of RAM:
Code:
Memory requirements /Mbyte/:
Minimal Optimal
Real*8: 1059696.3996 1059696.3996
But since I only allocated 98GB in the CFOUR ZMAT, it kept doing many iterations of goldstone/xmrcc reporting less and less RAM requirements until it got to 64GB, which ran successfully.
Maybe I could have tried to allocate 700GB in CFOUR, but I decided to just try with the stand-alone MRCC binaries. What I got was amazing:
At the FIRST iteration of xmrcc, it only needs 23GB (in CFOUR it needed 1TB and then kept doing iterations of goldstone and xmrcc to bring it down to 64GB):
Code:
Memory requirements /Mbyte/:
Minimal Optimal
Real*8: 23390.7539 23390.7539
The iterations are taking only 17 minutes now!
So with CFOUR it first requires 1TB of RAM, then 64GB of RAM, and each iteration takes 5 days.
With standalone MRCC it first requires 23GB of RAM, only 1 execution of goldstone and xmrcc, no memory reduction, and only 17 minutes per iteration.
Does using CFOUR/MOLPRO really have that much overhead?
8 years 2 months ago #289
by kallay
Replied by kallay on topic Problems requesting large memory for CCSDT(Q)
Dear Nike,
Did you use a different number of CPU cores in the calculations? If so, that explains the problem. Unfortunately, the iterative CC part requires a couple of big arrays in as many copies as there are CPU cores, so you should also increase the memory when increasing the number of cores.
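As a hypothetical illustration of this scaling (the shared/per-core split and the GB figures below are invented for the example, not MRCC's actual allocation scheme):

```shell
# Toy model: total memory = shared part + one copy of the big
# work arrays per CPU core. The 10 GB / 2 GB figures are invented
# for illustration; MRCC's real arrays are sized differently.
base_gb=10
per_core_gb=2
for cores in 1 8 24; do
    echo "$cores cores -> $((base_gb + per_core_gb * cores)) GB"
done
```

So a job that fits comfortably on 1 core can need several times more memory on 24 cores, even though the chemistry is identical.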
Best regards,
Mihaly Kallay
8 years 2 months ago #290
by Nike
Replied by Nike on topic Problems requesting large memory for CCSDT(Q)
Dear Mihaly,
Thank you very much for your reply.
When using stand-alone MRCC, after just 1 iteration of goldstone and xmrcc, the output says:
Code:
OpenMP parallel version is running.
Number of CPUs: 24
Allocation of 23390.8 Mbytes of memory...
When using MRCC inside CFOUR, goldstone first wants 270,564 MB, but then after several iterations of goldstone and xmrcc, the main code starts and says:
Code:
OpenMP parallel version is running.
Number of CPUs: 24
Allocation of 64533.8 Mbytes of memory...
That's more RAM than stand-alone MRCC, but same number of cores.
I know the stand-alone MRCC case is indeed using 24 cores because I see:
Code:
CPU time [min]: 33155.161
Wall time [min]: 1614.613
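The CPU-to-wall ratio from those timings gives the effective core count (a rough gauge, since I/O and serial sections lower it):

```shell
# Effective parallelism from the timings above: CPU time / wall time.
# ~20.5 effective cores out of 24 threads, i.e. ~85% parallel efficiency.
awk 'BEGIN { cpu = 33155.161; wall = 1614.613
             printf "effective cores: %.1f of 24\n", cpu / wall }'
```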
I have attached the files MRCCinCFOUR.txt and MRCCstandAlone.txt.
Is there any temporary solution to the memory allocation problem?
We keep getting the error below:
Code:
Executing integ...
Allocation of 700.0 Gbytes of memory...
Fatal error in which dmrcc > mrccjunk1.
Program will stop.
Fatal error in
echo " ************************ "`date +"%F %T"`" *************************".
Program will stop.
echo " ************************ "`date +"%F %T"`" *************************".
Program will stop.
echo " ************************ "`date +"%F %T"`" *************************".
Program will stop.
echo " ************************ "`date +"%F %T"`" *************************".
Program will stop.
where the last 2 lines repeat several thousand times before we have the chance to kill the job.
We find that by allocating much less memory than the amount on the node, we can get MRCC to work, but this is after a lot of trial and error, and production of these huge output files.
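For what it's worth, that trial-and-error search could be scripted as a simple halving loop. Here try_run is a stub standing in for an actual MRCC invocation (e.g. setting the mem keyword and checking the output for "Fatal error"), so this is a sketch, not an official workflow:

```shell
#!/bin/sh
# Sketch: halve the requested memory until a run survives allocation.
# try_run is a stub; replace it with your real MRCC invocation
# (set mem=${1}GB in the input, run dmrcc, succeed if no "Fatal error").
try_run () {
    [ "$1" -le 64 ]   # stub: pretend allocations up to 64 GB succeed
}

mem_gb=700            # start from the node's nominal memory
while [ "$mem_gb" -ge 16 ]; do
    if try_run "$mem_gb"; then
        echo "run succeeded with mem=${mem_gb}GB"
        break
    fi
    mem_gb=$((mem_gb / 2))
done
```

With the stub above, the loop tries 700, 350, 175, 87, and stops at 43 GB; in practice each probe would be a (cheap, quickly failing) allocation attempt rather than a full run.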
With best wishes,
Nike Dattani
8 years 2 months ago #294
by kallay
Replied by kallay on topic Problems requesting large memory for CCSDT(Q)
Dear Nike,
Please note that in the standalone MRCC run you used the cc-pVDZ basis with 42 orbitals, while in the Cfour-MRCC run you had aug-cc-pVDZ with 69 orbitals; thus the memory requirement is higher for the latter calculation.
Concerning the memory allocation problem on the big machine, we have no idea. We can allocate up to 128 GB without any problem (unfortunately, this is the largest machine we have here). What compiler did you use for the compilation of mrcc, or did you use the binary version?
Best regards,
Mihaly Kallay