If you have problems during the execution of MRCC, please attach the output with an adequate description of your case as well as the following:
  • the way mrcc was invoked
  • the way build.mrcc was invoked
  • the output of build.mrcc
  • compiler version (for example: ifort -V, gfortran -v)
  • blas/lapack versions
  • gcc and glibc versions

This information really helps us during troubleshooting :)

Slow AO to MO transformation

6 years 4 months ago #616 by Nike
Slow AO to MO transformation was created by Nike
Greetings!
In post 531 I complained about CFOUR finishing the AO-MO transformation in 2 hours while MRCC took 6 days, even with ABCDTYPE=STANDARD (which means we are not using the AO-based algorithm with better scaling):

nike wrote: 3) I think what we're seeing here is the difference between I/O cost vs CPU cost. A routine heavy in I/O might take much longer than a routine dominated by CPU even when the I/O routine has a much better formal scaling. In the case below, the AO-MO transformation took more time than the entire 30 iterations of CCSDT.

Here you can see that:

30 iterations of CCSDT were done after 6352.998 minutes,
then the 3 spin cases for (Q) were done after 26496.881 minutes.

Here you can see that 6 days (8480 minutes) were spent in ovirt.f:

ovirt.f begins at: 2017-10-15 22:21:06
ovirt.f ends at: 2017-10-21 20:23:26

The first of these files suggests that CFOUR finished xvtran in 3346.44 seconds and xintprc in 2649.99 seconds (with ABCDTYPE=STANDARD, not ABCDTYPE=AOBASIS), so maybe you are right that the CCSDT is the bottleneck and not the AO-MO transformation. However, the 6 days spent in ovirt.f, compared to the 1.3 days it took for mrcc to converge CCSDT to micro-Hartree precision, suggest that ovirt.f is the major bottleneck here. I thought this could be improved by using an AO-based algorithm, but the fact that CFOUR finishes xvtran and xintprc in just a couple of hours (instead of 6 days) even with ABCDTYPE=STANDARD makes me think there is something in ovirt.f that can be sped up substantially without even switching to an AO-based algorithm. Do you have any idea what might be going on here?
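For anyone who wants to check the comparison, here is a minimal sketch that redoes the arithmetic from the numbers quoted above. The variable names are mine, and it assumes the whole interval between the two ovirt.f timestamps was spent in the transformation. Since the timestamp difference is a wall-clock figure, it need not match the 8480 minutes quoted exactly.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Start and end timestamps quoted above for ovirt.f (MRCC).
ovirt_start = datetime.strptime("2017-10-15 22:21:06", FMT)
ovirt_end = datetime.strptime("2017-10-21 20:23:26", FMT)

# Assumption: the full wall-clock interval belongs to ovirt.f.
ovirt_seconds = (ovirt_end - ovirt_start).total_seconds()

# CFOUR timings quoted above (xvtran + xintprc), in seconds.
cfour_seconds = 3346.44 + 2649.99

# How many times longer MRCC's ovirt.f ran than the two CFOUR steps.
ratio = ovirt_seconds / cfour_seconds

print(f"ovirt.f: {ovirt_seconds / 60:.1f} min = {ovirt_seconds / 86400:.2f} days")
print(f"CFOUR xvtran + xintprc: {cfour_seconds / 3600:.2f} h")
print(f"ovirt.f / CFOUR: {ratio:.0f}x")
```

This compares wall-clock time on the MRCC side with the timings CFOUR reports, so the ratio is only a rough indication of the gap.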


I received this reply from Kallay:

kallay wrote: Dear Nike,
different ways are only identical if the SCF is well converged.
3) We will look at this problem.


I wonder if any progress has been made on this?

The reason I ask is that I am optimizing the exponents of some huge basis sets, and when every AO-MO transformation takes 6 days, the optimization takes several months just to do 15 iterations of BFGS or Nelder-Mead. We have to do this many times for many atoms, so when it takes several months to optimize one set of exponents, the AO-MO transformation is really what is stopping us from getting the project done.

Up to 6Z it was OK to optimize the basis sets in MOLPRO, where the integrals are faster, but MOLPRO cannot do k-functions and so cannot do 7Z. Dalton can only do closed-shell, and Gaussian is slow. MOLCAS and Psi4 can do k-functions but not CISD for big basis sets. The only solution seems to be to use Psi4 for the integrals and MRCC for the CISD, but after more than a year it still hasn't been possible to get a working version of Psi4 that supports 7Z: forum.psicode.org/t/building-with-high-am/936/15

I understand if the slow AO-MO transformation is not a priority for the MRCC developers, but I wonder if I could help in any way to get the speed to match CFOUR and MOLPRO? I am also in Europe on sabbatical until August 14th and can visit Hungary if programming support would be helpful.

With best wishes!
Nike
