Feature Request: Binary integrals!

2 months 2 weeks ago #740 by Nike
In this output file you can see that it took 60 days to read in the fort.55 file (2.4 TB in size), and each CCSD iteration is taking about one day:

github.com/HPQC-LABS/AI_ENERGIES/blob/ma...9z_MRCC.txt?raw=true

CFOUR stores the MOABCD file in binary; it is only 415 GB, and the entire CCSD(T) calculation (12 iterations plus the perturbative correction!) finished within one day: github.com/HPQC-LABS/AI_ENERGIES/blob/ma...2O/9z_CFOUR_xvcc.txt

So each MRCC iteration is taking as long as CFOUR's entire 12 iterations plus (T) in CCSD(T).

At least the energies are the same:
CFOUR: -76.36465872
MRCC: -76.36465869 (still converging).

In the past I suggested using Parallel-HDF5 (an MPI library that allows the I/O to be done with hundreds of cores rather than 1): www.mrcc.hu/index.php/forum/general-ques...ng-of-data-with-hdf5

However, even just allowing MRCC to read a binary MOABCD file (or a binary version of fort.55) would be enough to cut the reading and sorting time by roughly 60 days, and the CCSD calculation would be sped up by at least a factor of 12 in this case.

I figure that even with density fitting or Cholesky decomposition, eventually people will want to do calculations with enough integrals to make a binary format worthwhile. In my case density fitting didn't make sense: the basis was cc-pV9Z, density fitting might defeat the purpose of getting within a micro-Hartree of the CBS limit, and I don't have an auxiliary basis set such as cc-pV9Z-JKfit.
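To illustrate why a binary integral file is so much smaller and faster to parse, here is a small self-contained Python sketch (not MRCC or CFOUR code; the record layout is only a rough imitation of a fort.55/FCIDUMP line) comparing the on-disk size of the same integral list stored as formatted text versus packed binary:

```python
import os
import random
import struct
import tempfile

random.seed(0)
n = 100_000  # number of two-electron integrals in this toy comparison

# Each record mimics a fort.55-style line: one value plus four orbital indices.
records = [(random.uniform(-1, 1), i % 50 + 1, i % 40 + 1, i % 30 + 1, i % 20 + 1)
           for i in range(n)]

tmp = tempfile.mkdtemp()
text_path = os.path.join(tmp, "integrals.txt")
bin_path = os.path.join(tmp, "integrals.bin")

# Formatted text: a wide scientific-notation field plus four index fields.
with open(text_path, "w") as f:
    for v, p, q, r, s in records:
        f.write(f"{v:28.20E}{p:4d}{q:4d}{r:4d}{s:4d}\n")

# Binary: one 64-bit float plus four 32-bit ints per record (24 bytes each).
with open(bin_path, "wb") as f:
    for rec in records:
        f.write(struct.pack("<d4i", *rec))

text_size = os.path.getsize(text_path)
bin_size = os.path.getsize(bin_path)
print(f"text: {text_size} bytes, binary: {bin_size} bytes, "
      f"ratio: {text_size / bin_size:.1f}x")
```

The text version also has to be parsed float-by-float on read-in, while the binary records can be read back in bulk, which is where most of the 60-day difference would come from.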

With best wishes!
Nike


2 months 1 week ago #748 by nagypeter
Dear Nike,

There is a very efficient, hand-optimized CCSD(T) implementation in MRCC via the ccsd executable, which indeed uses binary integral files. The mrcc executable, which you invoke via the interface, should only be used for higher-order CC methods.
For methods up to CCSD(T) you should run MRCC standalone: invoke dmrcc with a MINP input file that includes the keywords:

calc=ccsd(t)
ccprog=ccsd
dfbasis_cor=none
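For reference, a minimal MINP sketch along these lines might look as follows; only the three keywords above come from this thread, while the basis, memory, and geometry entries are illustrative placeholders whose exact syntax should be checked against the MRCC manual:

```
calc=ccsd(t)
ccprog=ccsd
dfbasis_cor=none
basis=cc-pVDZ
mem=16GB

geom=xyz
3

O  0.000000  0.000000  0.117300
H  0.000000  0.757200 -0.469200
H  0.000000 -0.757200 -0.469200
```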

Let us know if that solves your issue.

As you mention, DF could significantly decrease the I/O demand, which we also take advantage of within the ccsd executable. I think it might also be worth trying to set up a proper fitting basis for your calculations, since it might be possible to reach micro-Hartree accuracy even for such large basis sets. Let me know if you need help with that.

Best wishes,
Peter
