Feature Request: Binary integrals!
5 years 2 months ago #740
by Nike
Feature Request: Binary integrals! was created by Nike
In this output file you can see that it took 60 days to read in the fort.55 file (2.4 TB in size), and each CCSD iteration is taking about 1 day:
github.com/HPQC-LABS/AI_ENERGIES/blob/ma...9z_MRCC.txt?raw=true
CFOUR stores the MOABCD file in binary and it's only 415 GB, and the entire CCSD(T) calculation (12 CCSD iterations plus the perturbative correction!) finished within 1 day: github.com/HPQC-LABS/AI_ENERGIES/blob/ma...2O/9z_CFOUR_xvcc.txt
So each MRCC iteration is taking as long as CFOUR's entire 12 iterations plus (T) in CCSD(T).
At least the energies are the same:
CFOUR: -76.36465872
MRCC: -76.36465869 (still converging).
In the past I suggested using Parallel-HDF5 (an MPI library that allows the I/O to be done with hundreds of cores rather than 1): www.mrcc.hu/index.php/forum/general-ques...ng-of-data-with-hdf5
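To illustrate what I mean (this is only a sketch, not MRCC code; it assumes h5py built against Parallel-HDF5 and mpi4py available), every MPI rank can write or read its own slice of one shared integral file collectively:

from mpi4py import MPI
import numpy as np
import h5py

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nprocs = comm.Get_size()

n_per_rank = 1_000_000              # illustrative chunk of integrals per rank
local = np.random.rand(n_per_rank)  # stand-in for this rank's integrals

# All ranks open the same file; the 'mpio' driver requires Parallel-HDF5.
with h5py.File("integrals.h5", "w", driver="mpio", comm=comm) as f:
    dset = f.create_dataset("moabcd", (nprocs * n_per_rank,), dtype="f8")
    # Each rank writes (or later reads) its own contiguous slice in parallel.
    dset[rank * n_per_rank:(rank + 1) * n_per_rank] = local

Run with e.g. mpirun -np 256 python write_integrals.py; reading works the same way with the file opened in "r" mode.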
However, even just allowing MRCC to read a binary MOABCD file (or a binary version of fort.55) would be enough to cut roughly 60 days from the reading and sorting, and the CCSD calculation would be sped up by at least a factor of 12 in this case.
I figure that even with density fitting or Cholesky decomposition, eventually people will want to do calculations with enough integrals to make a binary format worth it. In my case density fitting didn't make sense because the basis was cc-pV9Z, and fitting errors might defeat the purpose of getting within a micro-Hartree of the CBS limit (and I don't have an auxiliary basis set such as cc-pV9Z-JKfit).
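Just to make the point concrete (a toy sketch only; the record layout below is my own assumption, not the actual fort.55 or MOABCD format): storing a value plus four indices as formatted text costs roughly 45-50 bytes per integral and has to be parsed character by character, whereas a raw binary record is about 16 bytes and can be read back in one bulk read with no parsing at all:

import numpy as np

n = 100_000                                    # toy number of integrals
vals = np.random.rand(n)
idx = np.random.randint(1, 200, size=(n, 4))   # four orbital indices per integral

# Formatted text, fort.55 style: one line per integral (value + 4 indices).
with open("ints.txt", "w") as f:
    for v, (p, q, r, s) in zip(vals, idx):
        f.write(f"{v:24.16E}{p:5d}{q:5d}{r:5d}{s:5d}\n")

# Raw binary: an 8-byte double plus four 2-byte indices per record.
rec = np.zeros(n, dtype=[("val", "f8"), ("pqrs", "i2", (4,))])
rec["val"], rec["pqrs"] = vals, idx
rec.tofile("ints.bin")

# Reading the binary file back is a single bulk read, no text parsing.
back = np.fromfile("ints.bin", dtype=rec.dtype)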
With best wishes!
Nike
5 years 2 months ago #748
by nagypeter
Replied by nagypeter (MRCC developer) on topic Feature Request: Binary integrals!
Dear Nike,
There is a very efficient, hand-optimized CCSD(T) implementation in MRCC via the ccsd executable, which indeed uses binary integral files. The mrcc executable, which you invoke via the interface, should only be used for higher-order CC methods.
For methods up to CCSD(T) you should use MRCC standalone: run dmrcc with an MINP file whose input keywords include:
calc=ccsd(t)
ccprog=ccsd
dfbasis_cor=none
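For example, a minimal MINP along these lines could look like the following (the basis and mem values are only placeholders for your calculation, and the geometry goes in via the geom keyword in the format described in the manual):

calc=ccsd(t)
ccprog=ccsd
dfbasis_cor=none
basis=cc-pV9Z
mem=100GB
geom=xyz
(your geometry here, in the format described in the MRCC manual)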
Let us know if that solves your issue.
As you mention, DF could significantly decrease the I/O demand, which we also take advantage of within the ccsd executable. I think it might also be worth a try to set up a proper fitting basis for your calculations, since it might be possible to reach micro-Hartree accuracy even for such large basis sets. Let me know if you need help with that.
Best wishes,
Peter