If you run into trouble, it is always a good habit to report the following information:
  • the way build.mrcc was invoked
  • the output of build.mrcc
  • the compiler version (for example: ifort -V, gfortran -v)
  • the BLAS/LAPACK versions
  • the gcc and glibc versions

as well as the values of the relevant environment variables, such as OMP_NUM_THREADS.

This information helps us a lot when figuring out what is going on with your compilation :)
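
For example, a quick way to collect most of this information in one file (a rough sketch, assuming bash and that the listed tools are on your PATH; the file name build_report.txt is just an example) is:
Code:
{
  echo "== compiler versions =="
  ifort -V 2>&1
  gfortran -v 2>&1
  echo "== gcc and glibc versions =="
  gcc --version
  ldd --version | head -n 1
  echo "== relevant environment variables =="
  env | grep -E 'OMP_NUM_THREADS|LD_LIBRARY_PATH'
} > build_report.txt 2>&1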

MPI version of mrcc

  • hajgato (Topic Author, New Member)
9 years 11 months ago #103 by hajgato
MPI version of mrcc was created by hajgato
Dear All,

I am trying to compile an MPI version of mrcc. Can anybody report which compiler/MPI combination works?

I have tried so far:
Intel 13.1.3 (2013.5.192) and IMPI 4.1.3 (-i64 -32)
Intel 13.1.3 (2013.5.192) and OpenMPI 1.6.5 (-i64)
GNU 4.8.3 and OpenMPI 1.8.1 (-i64)
GNU 4.4.7 and OpenMPI 1.6.5 (-i64)
GNU 4.4.7 and OpenMPI 1.4.5 (-i64)

Sequential versions work, except for GNU 4.8.3.

If the MPI version works, I always get the following:

**********************************************************************
MPI parallel version is running.
Number of CPUs: 2
**********************************************************************

Then two individual jobs run, writing to the same file.

Any suggestions?

Faithfully yours,

Balazs


  • kallay (Administrator, Mihaly Kallay)
9 years 11 months ago #104 by kallay
Replied by kallay on topic MPI version of mrcc
Dear Balazs,
You are probably running the MPI job on the same machine. This does not work because the two processes use the same files. Currently, MPI execution is only possible if each process runs on a separate node with a separate file system.
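
For illustration, a minimal sketch of such a setup (node1 and node2 are placeholder host names; the hostfile syntax is OpenMPI's, and /tmp is assumed to be a node-local disk):
Code:
# one slot per node, so each MPI process lands on its own node
cat > hosts <<END
node1 slots=1
node2 slots=1
END
# every node needs its own copy of the mrcc working directory on a local, non-shared disk
mpirun -hostfile hosts -np 2 -wdir /tmp/$USER/mrcc mrcc.mpi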

Best regards,
Mihaly Kallay


  • hajgato (Topic Author, New Member)
9 years 11 months ago - 9 years 11 months ago #105 by hajgato
Replied by hajgato on topic MPI version of mrcc
Dear Mihaly,

I think I have to be more specific. What I did: I compiled mrcc twice, once sequential and once MPI (in different directories). Then I moved the MPI version of the mrcc executable into the directory of the sequential version under the name mrcc.mpi.
Then I made an "mrcc" script in the sequential version's directory as follows:
mrcc:
Code:
# number of MPI processes = number of hosts listed in the host file
NP=`grep -c nic /tmp/hajgato/hosts`
# copy the working directory to every remote node (the first host already has it)
for i in `tail -n +2 /tmp/hajgato/hosts`
do
  scp -r /tmp/hajgato/mrcc $i:/tmp/hajgato
done
mpirun -hostfile /tmp/hajgato/hosts -wdir /tmp/hajgato/mrcc -np $NP mrcc.mpi
The script to run my calculation was:
submit:
Code:
cd /tmp/hajgato
# write the host file (nic151 and nic152 are the nodes allocated to the job)
cat <<END >/tmp/hajgato/hosts
nic151
nic152
END
NP=`grep -c nic /tmp/hajgato/hosts`
mkdir mrcc
cd mrcc
cp /u/hajgato/calc/freija/Q_dz/$1 MINP
cp /u/hajgato/src/mrcc.14.07.10.mpi/GENBAS .
dmrcc > /u/hajgato/calc/freija/Q_dz/${1%minp}mrccout
cd /tmp/hajgato
# clean up the scratch directory on every node
mpirun -hostfile /tmp/hajgato/hosts rm -rf /tmp/hajgato/mrcc
(The /tmp/ directory is the local disk of the node.)
I have checked that only one copy was running on each of the nodes nic151 and nic152.
The relevant part of the output is:
output:
Code:
**********************************************************************
MPI parallel version is running.
Number of CPUs: 2
**********************************************************************
CCSDT(Q) calculation
Allocation of 668.1 Mbytes of memory...
Number of spinorbitals: 114
Number of alpha electrons: 13
Number of beta electrons: 13
Spin multiplicity: 1
z-component of spin: 0.0
Spatial symmetry: 1
Convergence criterion: 1.0E-06
Construction of occupation graphs...
**********************************************************************
CCSDT(Q) calculation
Allocation of 668.1 Mbytes of memory...
Number of spinorbitals: 114
Number of alpha electrons: 13
Number of beta electrons: 13
Spin multiplicity: 1
z-component of spin: 0.0
Spatial symmetry: 1
Convergence criterion: 1.0E-06
Construction of occupation graphs...
Number of 0-fold excitations: 1
Number of 1-fold excitations: 602
Number of 2-fold excitations: 237774
Number of 3-fold excitations: 45993540
Total number of determinants: 46231917
Calculation of coupling coefficients...
Number of 0-fold excitations: 1
Number of 1-fold excitations: 602
Number of 2-fold excitations: 237774
Number of 3-fold excitations: 45993540
Total number of determinants: 46231917
Calculation of coupling coefficients...
Initial cluster amplitudes are generated.
Initial cluster amplitudes are generated.
Length of intermediate file (Mbytes): 817.6
Length of intermediate file (Mbytes): 817.6
Reading integral list from unit 55...
Reading integral list from unit 55...
Sorting integrals...
Sorting integrals...
Energy of reference determinant [au]: -336.788506858264
Calculation of MP denominators...
Energy of reference determinant [au]: -336.788506858264
Calculation of MP denominators...
Starting CC iteration...
======================================================================
Starting CC iteration...
======================================================================
Norm of residual vector: 1.96621997
CPU time [min]: 3.183 Wall time [min]: 3.159
Iteration 1 CC energy: -337.47958067 Energy decrease: 0.69107381
======================================================================
Norm of residual vector: 1.96621997
CPU time [min]: 3.190 Wall time [min]: 3.164
Iteration 1 CC energy: -337.47958067 Energy decrease: 0.69107381
======================================================================
2 total processes killed (some possibly by mpirun during cleanup)
(I killed the job.)
It looks like after writing the number of CPUs, the MPI processes do not communicate with each other anymore.
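
As a quick sanity check (a sketch assuming OpenMPI's mpirun and the same host file as above), printing the host name once per rank shows whether the two processes really land on different nodes:
Code:
# if both lines show the same host, the ranks share one machine and one file system
mpirun -hostfile /tmp/hajgato/hosts -np $NP hostname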

Sincerely,

Balazs
Last edit: 9 years 11 months ago by hajgato. Reason: unnecessary extra word


  • hajgato (Topic Author, New Member)
9 years 11 months ago #106 by hajgato
Replied by hajgato on topic MPI version of mrcc
Well, I completely messed up; I think message #103 should be reverted to the original version (if that is possible).


  • kallay (Administrator, Mihaly Kallay)
9 years 11 months ago #107 by kallay
Replied by kallay on topic MPI version of mrcc
Dear Balazs,
I would first try the Intel + OpenMPI combination, then GNU + OpenMPI. Those should work, though we have not tested the exact versions you have. IMPI has never been tested.

Best regards,
Mihaly Kallay


  • hajgato (Topic Author, New Member)
9 years 11 months ago #108 by hajgato
Replied by hajgato on topic MPI version of mrcc
Dear Mihaly,

GNU 4.4.7 + OpenMPI 1.6.5 works; the output is just repeated once per MPI task.
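
If the duplicated output is a nuisance, one possible workaround (an untested sketch; --output-filename is OpenMPI's option and its exact behaviour may differ between OpenMPI versions) is to redirect each rank's output to its own file:
Code:
# each rank writes to its own mrcc.out.* file instead of interleaving in one stream
mpirun -hostfile /tmp/hajgato/hosts -np $NP --output-filename mrcc.out mrcc.mpi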

Sincerely,

Balazs

