TABLE OF CONTENTS
ABINIT/energy [ Functions ]
NAME
energy
FUNCTION
Compute electronic energy terms energies%e_eigenvalues, ek and enl from arbitrary (orthonormal) provided wf, ehart, enxc, and eei from provided density and potential, energies%e_eigenvalues=Sum of the eigenvalues - Band energy (Hartree) energies%e_zeeman=Zeeman spin energy from applied magnetic field -m.B ek=kinetic energy, ehart=Hartree electron-electron energy, enxc,enxcdc=exchange-correlation energies, eei=local pseudopotential energy, enl=nonlocal pseudopotential energy Also, compute new density from provided wfs, after the evaluation of ehart, enxc, and eei. WARNING XG180913 : At present, Fock energy not computed ! NOTE that this routine is callned in m_scfcv_core only when nstep == 0
INPUTS
[add_tfw]=flag controling the addition of Weiszacker gradient correction to Thomas-Fermi kin energy cg(2,mpw*nspinor*mband*mkmem*nsppol)=<G|Cnk>=Fourier coefficients of wavefunction dtset <type(dataset_type)>=all input variables for this dataset | mband=maximum number of bands | mgfft=maximum size of 1D FFTs | mkmem=number of k points treated by this node. | mpw=maximum dimension for number of planewaves | natom=number of atoms in unit cell | nfft=(effective) number of FFT grid points (for this processor) | nkpt=number of k points | nspden=number of spin-density components | nsppol=1 for unpolarized, 2 for polarized | nspinor=number of spinorial components | nsym=number of symmetry elements in space group (at least 1) | occopt=option for occupancies | tsmear=smearing energy or temperature (if metal) eigen(mband*nkpt*nsppol)=array for holding eigenvalues (hartree) gsqcut=G^2 cutoff from gsqcut=ecut/(2 Pi^2) indsym(4,nsym,natom)=indirect indexing array for atom labels irrzon(nfft**(1-1/nsym),2,(nspden/nsppol)-3*(nspden/4))=irreducible zone data kg(3,mpw*mkmem)=work array for coordinates of G vectors in basis mcg=size of wave-functions array (cg) =mpw*nspinor*mband*mkmem*nsppol mpi_enreg=information about MPI parallelization my_natom=number of atoms treated by current processor nfftf= -PAW ONLY- number of FFT grid points for the fine grid (nfftf=nfft for norm-conserving potential runs) ngfftf(18)= -PAW ONLY- contain all needed information about 3D FFT for the fine grid (ngfftf=ngfft for norm-conserving potential runs) nhatgr(nfft,nspden,3*nhatgrdim)= -PAW only- cartesian gradients of compensation density nhatgrdim= -PAW only- 0 if nhatgr array is not used ; 1 otherwise npwarr(nkpt)=number of planewaves at each k point, and boundary n3xccc=dimension of the xccc3d array (0 or nfftf). occ(mband*nkpt*nsppol)=occupation numbers of bands (usually 2) at each k point optene=option for the computation of total energy (direct scheme or double-counting scheme) paw_ij(natom*usepaw) <type(paw_ij_type)>=paw arrays given on (i,j) channels pawang <type(pawang_type)>=paw angular mesh and related data pawfgr(natom) <type(pawfgr_type)>=fine grid parameters and related data pawfgrtab(natom) <type(pawfgrtab_type)>=atomic data given on fine rectangular grid pawtab(ntypat*usepaw) <type(pawtab_type)>=paw tabulated starting data phnons(2,nfft**(1-1/nsym),(nspden/nsppol)-3*(nspden/4))=nonsymmorphic translation phases ph1d(2,3*(2*mgfft+1)*natom)=phase information related to structure factor psps <type(pseudopotential_type)>=variables related to pseudopotentials | mpsang= 1+maximum angular momentum for nonlocal pseudopotentials | ntypat=number of types of atoms in cell rprimd(3,3)=dimensional real space primitive translations (bohr) symrec(3,3,nsym)=symmetry operations in reciprocal space usexcnhat= -PAW only- flag controling use of compensation density in Vxc vpsp(nfftf)=local pseudopotential in real space (hartree) wfs <type(wvl_projector_type)>=wavefunctions information for wavelets. wvl <type(wvl_internal_type)>=wavelets internal data xccc3d(n3xccc)=3D core electron density for XC core correction (bohr^-3) xred(3,natom)=reduced coordinates of atoms (dimensionless) ylm(mpw*mkmem,mpsang*mpsang*useylm)= real spherical harmonics for each G and k point
OUTPUT
compch_fft=-PAW only- compensation charge inside spheres computed over fine fft grid etotal=total energy (hartree): - computed by direct scheme if optene=0 or 2 - computed by double-counting scheme if optene=1 or 3 resid(mband*nkpt*nsppol)=residuals for each band over all k points (hartree^2) strsxc(6)=exchange-correlation contribution to stress tensor vhartr(nfftf)=work space to hold Hartree potential in real space (hartree) vtrial(nfftf,nspden)=total local potential (hartree) vxc(nfftf,nspden)=work space to hold Vxc(r) in real space (hartree) [vxctau(nfft,nspden,4*usekden)]=(only for meta-GGA): derivative of XC energy density with respect to kinetic energy density (depsxcdtau). The arrays vxctau contains also the gradient of vxctau (gvxctau) in vxctau(:,:,2:4)
SIDE EFFECTS
electronpositron <type(electronpositron_type)>=quantities for the electron-positron annihilation (optional argument) energies <type(energies_type)>=all part of total energy. | entropy(IN)=entropy due to the occupation number smearing (if metal) | e_chempot(IN)=energy from spatially varying chemical potential (hartree) | e_ewald(IN)=Ewald energy (hartree) | e_vdw_dftd(IN)=VdW DFT-D energy | e_corepsp(IN)=psp core-core energy | e_paw(IN)=PAW spherical part energy | e_pawdc(IN)=PAW spherical part double-counting energy | e_eigenvalues(OUT)=Sum of the eigenvalues - Band energy (Hartree) | e_hartree(OUT)=Hartree part of total energy (hartree units) | e_kinetic(OUT)=kinetic energy part of total energy. | e_nlpsp_vfock(OUT)=nonlocal psp + potential Fock ACE part of total energy. | e_xc(OUT)=exchange-correlation energy (hartree) ==== if optene==0, 2 or 3 | e_localpsp(OUT)=local psp energy (hartree) ==== if optene==1, 2 or 3 | e_xcdc(OUT)=exchange-correlation double-counting energy (hartree) rhog(2,nfftf)=work space for rho(G); save intact on return (? MT 08-12-2008: is that true now ?) rhor(nfftf,nspden)=work space for rho(r); save intact on return (? MT 08-12-2008: is that true now ?) taug(2,nfftf*dtset%usekden)=array for Fourier transform of kinetic energy density taur(nfftf,nspden*dtset%usekden)=array for kinetic energy density nspinor should not be modified in the call of rdnpw === if psps%usepaw==1 === nhat(nfftf,nspden*usepaw)= compensation charge density pawrhoij(natom) <type(pawrhoij_type)>= paw rhoij occupancies and related data
NOTES
Be careful to the meaning of nfft (size of FFT grids): - In case of norm-conserving calculations the FFT grid is the usual FFT grid. - In case of PAW calculations: Two FFT grids are used; one with nfft points (coarse grid) for the computation of wave functions ; one with nfftf points (fine grid) for the computation of total density. There is a large amount of overhead in the way this routine do the computation of the energy ! For example, the density has already been precomputed, so why to compute it again here ??
SOURCE
221 subroutine energy(cg,compch_fft,constrained_dft,dtset,electronpositron,& 222 & energies,eigen,etotal,gsqcut,extfpmd,indsym,irrzon,kg,mcg,mpi_enreg,my_natom,nfftf,ngfftf,nhat,& 223 & nhatgr,nhatgrdim,npwarr,n3xccc,occ,optene,paw_dmft,paw_ij,pawang,pawfgr,& 224 & pawfgrtab,pawrhoij,pawtab,phnons,ph1d,psps,resid,rhog,rhor,rprimd,strsxc,symrec,& 225 & taug,taur,usexcnhat,vhartr,vtrial,vpsp,vxc,wfs,wvl,wvl_den,wvl_e,xccc3d,xred,ylm,& 226 & add_tfw,vxctau) ! optional argument 227 228 !Arguments ------------------------------------ 229 !scalars 230 integer,intent(in) :: mcg,my_natom,n3xccc,nfftf,nhatgrdim,optene,usexcnhat 231 logical,intent(in),optional :: add_tfw 232 real(dp),intent(in) :: gsqcut 233 real(dp),intent(out) :: compch_fft,etotal 234 type(MPI_type),intent(inout) :: mpi_enreg 235 type(constrained_dft_t),intent(in) :: constrained_dft 236 type(dataset_type),intent(in) :: dtset 237 type(electronpositron_type),pointer :: electronpositron 238 type(energies_type),intent(inout) :: energies 239 type(extfpmd_type),pointer,intent(inout) :: extfpmd 240 type(paw_dmft_type), intent(inout) :: paw_dmft 241 type(pawang_type),intent(in) :: pawang 242 type(pawfgr_type),intent(in) :: pawfgr 243 type(pseudopotential_type),intent(in) :: psps 244 type(wvl_internal_type), intent(in) :: wvl 245 type(wvl_wf_type),intent(inout) :: wfs 246 type(wvl_denspot_type), intent(inout) :: wvl_den 247 type(wvl_energy_terms),intent(inout) ::wvl_e 248 !arrays 249 !nfft**(1-1/nsym) is 1 if nsym==1, and nfft otherwise 250 integer, intent(in) :: indsym(4,dtset%nsym,dtset%natom) 251 integer :: irrzon(dtset%nfft**(1-1/dtset%nsym),2,(dtset%nspden/dtset%nsppol)-3*(dtset%nspden/4)),kg(3,dtset%mpw*dtset%mkmem) 252 integer, intent(in) :: ngfftf(18),npwarr(dtset%nkpt),symrec(3,3,dtset%nsym) 253 real(dp), intent(in) :: cg(2,mcg),eigen(dtset%mband*dtset%nkpt*dtset%nsppol) 254 real(dp), intent(in) :: occ(dtset%mband*dtset%nkpt*dtset%nsppol),ph1d(2,3*(2*dtset%mgfft+1)*dtset%natom) 255 real(dp), intent(inout) :: nhat(nfftf,dtset%nspden*psps%usepaw) 256 real(dp),intent(in) :: nhatgr(nfftf,dtset%nspden,3*nhatgrdim) 257 !nfft**(1-1/nsym) is 1 if nsym==1, and nfft otherwise 258 real(dp), intent(in) :: phnons(2,dtset%nfft**(1-1/dtset%nsym),(dtset%nspden/dtset%nsppol)-3*(dtset%nspden/4)) 259 real(dp), intent(inout) :: resid(dtset%mband*dtset%nkpt*dtset%nsppol) 260 real(dp), intent(inout) :: rhog(2,nfftf),rhor(nfftf,dtset%nspden) 261 real(dp), intent(inout) :: taug(2,nfftf*dtset%usekden),taur(nfftf,dtset%nspden*dtset%usekden) 262 real(dp), intent(out) :: strsxc(6) 263 real(dp), intent(in) :: rprimd(3,3),vpsp(nfftf),xccc3d(n3xccc),xred(3,dtset%natom) 264 real(dp), intent(out) :: vhartr(nfftf),vtrial(nfftf,dtset%nspden),vxc(nfftf,dtset%nspden) 265 real(dp),intent(out),optional,target :: vxctau(nfftf,dtset%nspden,4*dtset%usekden) 266 real(dp), intent(in) :: ylm(dtset%mpw*dtset%mkmem,psps%mpsang*psps%mpsang*psps%useylm) 267 type(paw_ij_type), intent(in) :: paw_ij(my_natom*psps%usepaw) 268 type(pawfgrtab_type),intent(inout) :: pawfgrtab(my_natom) 269 type(pawrhoij_type),target,intent(inout) :: pawrhoij(my_natom*psps%usepaw) 270 type(pawtab_type),intent(in) :: pawtab(psps%ntypat*psps%usepaw) 271 272 !Local variables------------------------------- 273 !scalars 274 integer :: bdtot_index,blocksize,choice,cplex,cplex_rhoij,cpopt,dimffnl 275 integer :: iband,iband_last,iblock,iblocksize,icg,ider,idir,ierr,ifft,ikg,ikpt,ilm 276 integer :: ipert,ipositron,iresid,ispden,isppol,istwf_k,izero 277 integer :: me_distrb,mpi_comm_sphgrid,my_ikpt,my_nspinor,n1,n2,n3,n4,n5,n6 278 integer :: nband_k,nblockbd,nfftotf,nkpg,nkxc,nk3xc,nnlout,npw_k,nspden_rhoij,option 279 integer :: option_rhoij,paw_opt,signs,spaceComm,tim_mkrho,tim_nonlop 280 logical :: add_tfw_,paral_atom,use_timerev,use_zeromag,with_vxctau 281 logical :: non_magnetic_xc,wvlbigdft=.false. 282 real(dp) :: dotr,doti,eeigk,ekk,enlk,evxc,e_xcdc_vxctau,ucvol,ucvol_local,vxcavg 283 !character(len=500) :: message 284 type(gs_hamiltonian_type) :: gs_hamk 285 type(xcdata_type) :: xcdata 286 !arrays 287 integer,allocatable :: kg_k(:,:) 288 real(dp) :: gmet(3,3),gprimd(3,3),kpg_dum(0,0),kpoint(3),nonlop_out(1,1) 289 real(dp) :: qpt(3),rhodum(1),rmet(3,3),tsec(2),ylmgr_dum(1,1,1),vzeeman(4) 290 real(dp) :: magvec(dtset%nspden) 291 real(dp),target :: vxctau_dum(0,0,0) 292 real(dp),allocatable :: buffer(:) 293 real(dp),allocatable :: cwavef(:,:),eig_k(:),enlout(:),ffnl(:,:,:,:),ffnl_sav(:,:,:,:) 294 real(dp),allocatable :: kinpw(:),kinpw_sav(:),kxc(:,:),occ_k(:),occblock(:) 295 real(dp),allocatable :: ph3d(:,:,:),ph3d_sav(:,:,:) 296 real(dp),allocatable :: resid_k(:),rhowfg(:,:),rhowfr(:,:),vlocal(:,:,:,:) 297 real(dp),allocatable :: vxctaulocal(:,:,:,:,:),ylm_k(:,:),v_constr_dft_r(:,:) 298 real(dp),pointer :: vxctau_(:,:,:) 299 type(bandfft_kpt_type),pointer :: my_bandfft_kpt => null() 300 type(pawcprj_type),target,allocatable :: cwaveprj(:,:) 301 type(pawcprj_type),pointer :: cwaveprj_gat(:,:) 302 type(pawrhoij_type),pointer :: pawrhoij_unsym(:) 303 304 ! ************************************************************************* 305 306 DBG_ENTER("COLL") 307 308 !Check that usekden is not 0 if want to use vxctau 309 with_vxctau = (present(vxctau).and.dtset%usekden/=0) 310 vxctau_ => vxctau_dum ; if (with_vxctau) vxctau_ => vxctau 311 312 !Test size of FFT grids (1 grid in norm-conserving, 2 grids in PAW) 313 nfftotf=PRODUCT(ngfftf(1:3)) 314 if ((psps%usepaw==1.and.pawfgr%nfft/=nfftf).or.(psps%usepaw==0.and.dtset%nfft/=nfftf)) then 315 ABI_BUG('wrong values for nfft, nfftf!') 316 end if 317 318 !If usewvl: wvlbigdft indicates that the BigDFT workflow will be followed 319 wvlbigdft=(dtset%usewvl==1 .and. dtset%wvl_bigdft_comp==1) 320 321 call timab(59,1,tsec) 322 323 !Data for parallelism 324 spaceComm=mpi_enreg%comm_cell 325 if(mpi_enreg%paral_kgb==1) spaceComm=mpi_enreg%comm_kpt 326 if(mpi_enreg%paral_hf==1) spaceComm=mpi_enreg%comm_kpt 327 mpi_comm_sphgrid=mpi_enreg%comm_fft 328 if(dtset%usewvl==1) then 329 spaceComm=mpi_enreg%comm_wvl 330 mpi_comm_sphgrid=mpi_enreg%comm_wvl 331 end if 332 me_distrb=mpi_enreg%me_kpt 333 my_nspinor=max(1,dtset%nspinor/mpi_enreg%nproc_spinor) 334 paral_atom=(my_natom/=dtset%natom) 335 336 !Compute gmet, gprimd and ucvol from rprimd 337 call metric(gmet,gprimd,-1,rmet,rprimd,ucvol) 338 if (dtset%usewvl == 0) then 339 ucvol_local = ucvol 340 #if defined HAVE_BIGDFT 341 else 342 ! We need to tune the volume when wavelets are used because, not 343 ! all FFT points are used. 344 ! ucvol_local = (half * dtset%wvl_hgrid) ** 3 * ngfft(1)*ngfft(2)*ngfft(3) 345 ucvol_local = product(wvl_den%denspot%dpbox%hgrids) * real(product(wvl_den%denspot%dpbox%ndims), dp) 346 #endif 347 end if 348 349 !Compute Hxc potential from density 350 option=1;nkxc=0 351 ipositron=electronpositron_calctype(electronpositron) 352 add_tfw_=.false.;if (present(add_tfw)) add_tfw_=add_tfw 353 non_magnetic_xc=(dtset%usepaw==1.and.mod(abs(dtset%usepawu),10)==4) 354 355 if (ipositron/=1) then 356 357 if (dtset%icoulomb == 0) then 358 ! Use the periodic solver to compute Hxc. 359 call hartre(1,gsqcut,dtset%icutcoul,psps%usepaw,mpi_enreg,nfftf,ngfftf,& 360 &dtset%nkpt,dtset%rcut,rhog,rprimd,dtset%vcutgeo,vhartr) 361 call xcdata_init(xcdata,dtset=dtset) 362 ABI_MALLOC(kxc,(1,nkxc)) 363 ! to be adjusted for the call to rhotoxc 364 nk3xc=1 365 if (ipositron==0) then 366 call rhotoxc(energies%e_xc,kxc, & 367 & mpi_enreg,nfftf,ngfftf,nhat,psps%usepaw,nhatgr,nhatgrdim, & 368 & nkxc,nk3xc,non_magnetic_xc,n3xccc,option,rhor,rprimd,strsxc, & 369 & usexcnhat,vxc,vxcavg,xccc3d,xcdata,taur=taur,vhartr=vhartr, & 370 & vxctau=vxctau_,exc_vdw_out=energies%e_xc_vdw,add_tfw=add_tfw_) 371 else 372 call rhotoxc(energies%e_xc,kxc, & 373 & mpi_enreg,nfftf,ngfftf,nhat,psps%usepaw,nhatgr,nhatgrdim, & 374 & nkxc,nk3xc,non_magnetic_xc,n3xccc,option,rhor,rprimd,strsxc, & 375 & usexcnhat,vxc,vxcavg,xccc3d,xcdata, & 376 & electronpositron=electronpositron,taur=taur,vhartr=vhartr, & 377 & vxctau=vxctau_,exc_vdw_out=energies%e_xc_vdw,add_tfw=add_tfw_) 378 end if 379 ABI_FREE(kxc) 380 else if (dtset%usewvl == 0) then 381 ! Use the free boundary solver. 382 call psolver_rhohxc(energies%e_hartree, energies%e_xc, evxc, & 383 & dtset%icoulomb, dtset%ixc, mpi_enreg, nfftf, & 384 & ngfftf,nhat,psps%usepaw,& 385 & dtset%nscforder,dtset%nspden,n3xccc,rhor,rprimd, & 386 & usexcnhat,psps%usepaw,dtset%usewvl,& 387 & vhartr, vxc, vxcavg, wvl,wvl_den,wvl_e,& 388 & xccc3d,dtset%xclevel,dtset%xc_denpos) 389 end if 390 else 391 energies%e_xc=zero 392 call rhohxcpositron(electronpositron,gprimd,kxc,mpi_enreg,nfftf,ngfftf,nhat,nkxc,dtset%nspden,n3xccc,& 393 & dtset%paral_kgb,rhor,strsxc,ucvol,usexcnhat,psps%usepaw,vhartr,vxc,vxcavg,xccc3d,dtset%xc_denpos) 394 end if 395 if (ipositron/=0) then 396 call dotprod_vn(1,rhor,electronpositron%e_hartree,doti,nfftf,nfftotf,1,1,electronpositron%vha_ep,& 397 & ucvol,mpi_comm_sphgrid=mpi_comm_sphgrid) 398 vhartr=vhartr+electronpositron%vha_ep 399 end if 400 401 !Total local potential (for either spin channel) is 402 !Hartree + local psp + Vxc(spin), minus its mean 403 !(Note : this potential should agree with the input vtrial) 404 do ispden=1,min(dtset%nspden,2) 405 do ifft=1,nfftf 406 vtrial(ifft,ispden)=vhartr(ifft)+vpsp(ifft)+vxc(ifft,ispden) 407 end do 408 end do 409 if (dtset%nspden==4) then 410 do ifft=1,nfftf 411 vtrial(ifft,3:4)=vxc(ifft,3:4) 412 end do 413 end if 414 415 !Add the vzeeman pot in the trial pot 416 !Vzeeman might have to be allocated correctly --> to be checked 417 if (any(abs(dtset%zeemanfield(:))>tol8)) then 418 vzeeman(:) = zero 419 if(dtset%nspden==2)then 420 !TODO: check this against rhotov and setvtr, where the potential is -1/2 and +1/2 for the 2 spin components. 421 ! see comment by SPr in rhotov 422 ! TODO: check this 1/2 factor is for the electron spin magnetic moment. 423 vzeeman(1) = -half*dtset%zeemanfield(3) ! For collinear ispden=1 potential is v_upup 424 vzeeman(2) = +half*dtset%zeemanfield(3) ! For collinear ispden=2 potential is v_dndn 425 end if 426 if(dtset%nspden==4)then 427 vzeeman(1)=-half*dtset%zeemanfield(3) 428 vzeeman(2)= half*dtset%zeemanfield(3) 429 vzeeman(3)=-half*dtset%zeemanfield(1) 430 vzeeman(4)= half*dtset%zeemanfield(2) 431 end if 432 magvec = zero 433 do ispden=1,dtset%nspden 434 do ifft=1,nfftf 435 !TODO: the full cell magnetization will need extra PAW terms, and is certainly calculated elsewhere. 436 !The calculation of the zeeman energy can be moved there 437 magvec(ispden) = magvec(ispden) + rhor(ifft,ispden) 438 vtrial(ifft,ispden)=vtrial(ifft,ispden)+vzeeman(ispden) 439 end do 440 end do 441 if(dtset%nspden==2)then 442 energies%e_zeeman = -half*dtset%zeemanfield(3)*(two*magvec(2)-magvec(1)) ! diff rho = rhoup-rhodown = 2 rhoup - rho 443 else if(dtset%nspden==4)then 444 energies%e_zeeman = -half * (dtset%zeemanfield(1)*magvec(2)& ! x 445 & +dtset%zeemanfield(2)*magvec(3)& ! y 446 & +dtset%zeemanfield(3)*magvec(4)) ! z 447 end if 448 end if 449 450 !Compute the constrained potential for the magnetic moments 451 !NOTE: here in energy.F90 rhor and vtrial are given on nfftf grid 452 !the values coming from mag_penalty may be different from those calculated 453 !calling mag_penalty with nfft in setvtr and rhotov 454 if (dtset%magconon==1.or.dtset%magconon==2) then 455 ABI_MALLOC(v_constr_dft_r, (nfftf,dtset%nspden)) 456 v_constr_dft_r = zero 457 call mag_penalty(constrained_dft,mpi_enreg,rhor,v_constr_dft_r,xred) 458 ! call mag_penalty(dtset%natom, dtset%spinat, dtset%nspden, dtset%magconon, dtset%magcon_lambda, rprimd, & 459 !& mpi_enreg, nfftf, dtset%ngfft, dtset%ntypat, dtset%ratsph, rhor, & 460 !& dtset%typat, v_constr_dft_r, xred) 461 do ispden=1,dtset%nspden 462 do ifft=1,nfftf 463 vtrial(ifft,ispden)=vtrial(ifft,ispden)+v_constr_dft_r(ifft,ispden) 464 end do 465 end do 466 ABI_FREE(v_constr_dft_r) 467 end if 468 469 !Compute Hartree energy - use up+down rhor 470 if (ipositron/=1) then 471 call dotprod_vn(1,rhor,energies%e_hartree ,doti,nfftf,nfftotf,1,1,vhartr,& 472 & ucvol_local,mpi_comm_sphgrid=mpi_comm_sphgrid) 473 if (ipositron==0) energies%e_hartree=half*energies%e_hartree 474 if (ipositron==2) energies%e_hartree = half *(energies%e_hartree-electronpositron%e_hartree) 475 else 476 energies%e_hartree=zero 477 end if 478 479 !Compute local psp energy - use up+down rhor 480 if (optene/=1) then 481 call dotprod_vn(1,rhor,energies%e_localpsp,doti,nfftf,nfftotf,1,1,vpsp,& 482 & ucvol_local,mpi_comm_sphgrid=mpi_comm_sphgrid) 483 end if 484 485 !Compute DC-xc energy - use up+down rhor 486 if (optene>0) then 487 if (ipositron/=1) then 488 if (psps%usepaw==0.or.usexcnhat/=0) then 489 call dotprod_vn(1,rhor,energies%e_xcdc,doti,nfftf,nfftotf,dtset%nspden,1,vxc,& 490 & ucvol_local,mpi_comm_sphgrid=mpi_comm_sphgrid) 491 if (with_vxctau)then 492 call dotprod_vn(1,taur,e_xcdc_vxctau,doti,nfftf,nfftotf,dtset%nspden,1,vxctau(:,:,1),& 493 & ucvol_local,mpi_comm_sphgrid=mpi_comm_sphgrid) 494 energies%e_xcdc=energies%e_xcdc+e_xcdc_vxctau 495 end if 496 else 497 ABI_MALLOC(rhowfr,(nfftf,dtset%nspden)) 498 rhowfr=rhor-nhat 499 call dotprod_vn(1,rhowfr,energies%e_xcdc,doti,nfftf,nfftotf,dtset%nspden,1,vxc,& 500 & ucvol_local,mpi_comm_sphgrid=mpi_comm_sphgrid) 501 ABI_FREE(rhowfr) 502 end if 503 if (ipositron==2) energies%e_xcdc=energies%e_xcdc-electronpositron%e_xcdc 504 else 505 energies%e_xcdc=zero 506 end if 507 end if 508 509 energies%e_eigenvalues=zero 510 energies%e_kinetic=zero 511 energies%e_nlpsp_vfock=zero 512 energies%e_fock0=zero 513 bdtot_index=0 514 icg=0 515 516 n1=dtset%ngfft(1) ; n2=dtset%ngfft(2) ; n3=dtset%ngfft(3) 517 n4=dtset%ngfft(4) ; n5=dtset%ngfft(5) ; n6=dtset%ngfft(6) 518 519 !============================================ 520 !==== Initialize most of the Hamiltonian ==== 521 !============================================ 522 !1) Allocate all arrays and initialize quantities that do not depend on k and spin. 523 !2) Perform the setup needed for the non-local factors: 524 !* Norm-conserving: Constant kleimann-Bylander energies are copied from psps to gs_hamk. 525 !* PAW: Initialize the overlap coefficients and allocate the Dij coefficients. 526 527 call init_hamiltonian(gs_hamk,psps,pawtab,dtset%nspinor,dtset%nsppol,dtset%nspden,& 528 & dtset%natom,dtset%typat,xred,dtset%nfft,dtset%mgfft,dtset%ngfft,rprimd,dtset%nloalg,& 529 & comm_atom=mpi_enreg%comm_atom,mpi_atmtab=mpi_enreg%my_atmtab,mpi_spintab=mpi_enreg%my_isppoltab,& 530 & paw_ij=paw_ij,ph1d=ph1d,electronpositron=electronpositron,& 531 & nucdipmom=dtset%nucdipmom,gpu_option=dtset%gpu_option) 532 533 ABI_MALLOC(vlocal,(n4,n5,n6,gs_hamk%nvloc)) 534 if (with_vxctau) then 535 ABI_MALLOC(vxctaulocal,(n4,n5,n6,gs_hamk%nvloc,4)) 536 end if 537 538 !PAW: additional initializations 539 if (psps%usepaw==1) then 540 ABI_MALLOC(cwaveprj,(dtset%natom,my_nspinor)) 541 call pawcprj_alloc(cwaveprj,0,gs_hamk%dimcprj) 542 if (mpi_enreg%paral_spinor==1) then 543 ABI_MALLOC(cwaveprj_gat,(dtset%natom,dtset%nspinor)) 544 call pawcprj_alloc(cwaveprj_gat,0,gs_hamk%dimcprj) 545 else 546 cwaveprj_gat => cwaveprj 547 end if 548 if (paral_atom) then 549 ABI_MALLOC(pawrhoij_unsym,(dtset%natom)) 550 call pawrhoij_inquire_dim(cplex_rhoij=cplex_rhoij,nspden_rhoij=nspden_rhoij,& 551 & nspden=dtset%nspden,spnorb=dtset%pawspnorb,cpxocc=dtset%pawcpxocc) 552 call pawrhoij_alloc(pawrhoij_unsym,cplex_rhoij,nspden_rhoij,dtset%nspinor,& 553 & dtset%nsppol,dtset%typat,pawtab=pawtab,use_rhoijp=0,use_rhoij_=1) 554 else 555 pawrhoij_unsym => pawrhoij 556 call pawrhoij_init_unpacked(pawrhoij_unsym) 557 end if 558 option_rhoij=1 559 use_timerev=(dtset%kptopt>0.and.dtset%kptopt<3) 560 use_zeromag=(pawrhoij_unsym(1)%nspden==4.and.dtset%nspden==1) 561 else 562 ABI_MALLOC(cwaveprj,(0,0)) 563 end if 564 565 !LOOP OVER SPINS 566 do isppol=1,dtset%nsppol 567 ikg=0 568 569 ! Set up local potential vlocal on the coarse FFT mesh from vtrial taking into account the spin. 570 ! Also take into account the spin. 571 572 call gspot_transgrid_and_pack(isppol, psps%usepaw, dtset%paral_kgb, dtset%nfft, dtset%ngfft, nfftf, & 573 dtset%nspden, gs_hamk%nvloc, 1, pawfgr, mpi_enreg, vtrial, vlocal) 574 call gs_hamk%load_spin(isppol,vlocal=vlocal,with_nonlocal=.true.) 575 576 if (with_vxctau) then 577 call gspot_transgrid_and_pack(isppol, psps%usepaw, dtset%paral_kgb, dtset%nfft, dtset%ngfft, nfftf, & 578 dtset%nspden, gs_hamk%nvloc, 4, pawfgr, mpi_enreg, vxctau, vxctaulocal) 579 call gs_hamk%load_spin(isppol, vxctaulocal=vxctaulocal) 580 end if 581 582 ! Loop over k points 583 do ikpt=1,dtset%nkpt 584 nband_k=dtset%nband(ikpt+(isppol-1)*dtset%nkpt) 585 istwf_k=dtset%istwfk(ikpt) 586 npw_k=npwarr(ikpt) 587 588 ! Skip this k-point if not the proper processor 589 if(proc_distrb_cycle(mpi_enreg%proc_distrb,ikpt,1,nband_k,isppol,me_distrb)) then 590 resid(1+bdtot_index : nband_k+bdtot_index) = zero 591 bdtot_index=bdtot_index+nband_k 592 cycle 593 end if 594 595 ! Parallelism over FFT and/or bands: define sizes and tabs 596 if (mpi_enreg%paral_kgb==1) then 597 my_ikpt=mpi_enreg%my_kpttab(ikpt) 598 nblockbd=nband_k/(mpi_enreg%nproc_band*mpi_enreg%bandpp) 599 my_bandfft_kpt => bandfft_kpt(my_ikpt) 600 else 601 my_ikpt=ikpt 602 nblockbd=nband_k/mpi_enreg%bandpp 603 !if (nband_k/=nblockbd*mpi_enreg%nproc_fft) nblockbd=nblockbd+1 604 end if 605 blocksize=nband_k/nblockbd 606 607 ABI_MALLOC(eig_k,(nband_k)) 608 ABI_MALLOC(occ_k,(nband_k)) 609 ABI_MALLOC(resid_k,(nband_k)) 610 ABI_MALLOC(cwavef,(2,npw_k*my_nspinor*blocksize)) 611 resid_k(:)=zero 612 kpoint(:)=dtset%kptns(:,ikpt) 613 occ_k(:)=occ(1+bdtot_index:nband_k+bdtot_index) 614 eig_k(:)=eigen(1+bdtot_index:nband_k+bdtot_index) 615 if (minval(eig_k)>1.d100) eig_k=zero 616 eeigk=zero ; ekk=zero ; enlk=zero 617 618 ABI_MALLOC(kg_k,(3,npw_k)) 619 kg_k(:,1:npw_k)=kg(:,1+ikg:npw_k+ikg) 620 621 ABI_MALLOC(ylm_k,(npw_k,psps%mpsang*psps%mpsang*psps%useylm)) 622 if (psps%useylm==1) then 623 do ilm=1,psps%mpsang*psps%mpsang 624 ylm_k(1:npw_k,ilm)=ylm(1+ikg:npw_k+ikg,ilm) 625 end do 626 end if 627 628 ! Compute kinetic energy 629 ABI_MALLOC(kinpw,(npw_k)) 630 call mkkin(dtset%ecut,dtset%ecutsm,dtset%effmass_free,gmet,kg_k,kinpw,kpoint,npw_k,0,0) 631 632 ! Compute kinetic energy of each band 633 do iblock=1,nblockbd 634 do iblocksize=1,blocksize 635 iband=(iblock-1)*blocksize+iblocksize 636 if (abs(occ_k(iband))>tol8) then 637 cwavef(1:2,1:npw_k*my_nspinor)= & 638 & cg(:,1+(iband-1)*npw_k*my_nspinor+icg:iband*npw_k*my_nspinor+icg) 639 call meanvalue_g(dotr,kinpw,0,istwf_k,mpi_enreg,npw_k,my_nspinor,cwavef,cwavef,0) 640 energies%e_kinetic=energies%e_kinetic+dtset%wtk(ikpt)*occ_k(iband)*dotr 641 end if 642 end do 643 end do 644 645 ! Compute nonlocal form factors ffnl at all (k+G): 646 ider=0;dimffnl=1;nkpg=0 647 ABI_MALLOC(ffnl,(npw_k,dimffnl,psps%lmnmax,psps%ntypat)) 648 call mkffnl(psps%dimekb,dimffnl,psps%ekb,ffnl,psps%ffspl,& 649 & gmet,gprimd,ider,ider,psps%indlmn,kg_k,kpg_dum,kpoint,psps%lmnmax,& 650 & psps%lnmax,psps%mpsang,psps%mqgrid_ff,nkpg,& 651 & npw_k,psps%ntypat,psps%pspso,psps%qgrid_ff,rmet,& 652 & psps%usepaw,psps%useylm,ylm_k,ylmgr_dum) 653 654 ! Load k-dependent part in the Hamiltonian datastructure 655 ! - Compute 3D phase factors 656 ! - Prepare various tabs in case of band-FFT parallelism 657 ! - Load k-dependent quantities in the Hamiltonian 658 ABI_MALLOC(ph3d,(2,npw_k,gs_hamk%matblk)) 659 call gs_hamk%load_k(kpt_k=dtset%kptns(:,ikpt),istwf_k=istwf_k,npw_k=npw_k,& 660 & kinpw_k=kinpw,kg_k=kg_k,ffnl_k=ffnl,ph3d_k=ph3d,& 661 & compute_ph3d=.true.,compute_gbound=(mpi_enreg%paral_kgb/=1)) 662 663 ! Load band-FFT tabs (transposed k-dependent arrays) 664 if (mpi_enreg%paral_kgb==1) then 665 call bandfft_kpt_savetabs(my_bandfft_kpt,ffnl=ffnl_sav,ph3d=ph3d_sav,kinpw=kinpw_sav) 666 call prep_bandfft_tabs(gs_hamk,ikpt,dtset%mkmem,mpi_enreg) 667 call gs_hamk%load_k(npw_fft_k=my_bandfft_kpt%ndatarecv, & 668 & gbound_k =my_bandfft_kpt%gbound, & 669 & kinpw_k =my_bandfft_kpt%kinpw_gather, & 670 & kg_k =my_bandfft_kpt%kg_k_gather, & 671 & ffnl_k =my_bandfft_kpt%ffnl_gather, & 672 & ph3d_k =my_bandfft_kpt%ph3d_gather) 673 end if 674 675 ! If OpenMP GPU, load "hamiltonian" on GPU device 676 if (dtset%gpu_option == ABI_GPU_OPENMP) then 677 if(dtset%paral_kgb==0) then 678 call ompgpu_load_hamilt_buffers(gs_hamk%kg_k,gs_hamk%kg_kp) 679 else if(istwf_k==1) then 680 call ompgpu_load_hamilt_buffers(gs_hamk%kg_k,gs_hamk%kg_kp,kg_k_gather=my_bandfft_kpt%kg_k_gather) 681 else if(istwf_k==2) then 682 call ompgpu_load_hamilt_buffers(gs_hamk%kg_k,gs_hamk%kg_kp,kg_k_gather=my_bandfft_kpt%kg_k_gather_sym) 683 else 684 ABI_ERROR("istwfk > 2 is not handled with OpenMP GPU offload mode !") 685 end if 686 end if 687 688 ! Setup gemm_nonlop 689 if (gemm_nonlop_use_gemm) then 690 gemm_nonlop_ikpt_this_proc_being_treated = my_ikpt 691 if(dtset%gpu_option==ABI_GPU_DISABLED) then 692 call make_gemm_nonlop(my_ikpt,gs_hamk%npw_fft_k,gs_hamk%lmnmax, & 693 gs_hamk%ntypat, gs_hamk%indlmn, gs_hamk%nattyp, gs_hamk%istwf_k, & 694 gs_hamk%ucvol, gs_hamk%ffnl_k, & 695 gs_hamk%ph3d_k, gs_hamk%kpt_k, gs_hamk%kg_k, gs_hamk%kpg_k) 696 else if(dtset%gpu_option==ABI_GPU_LEGACY .or. dtset%gpu_option==ABI_GPU_KOKKOS) then 697 call make_gemm_nonlop_gpu(my_ikpt,gs_hamk%npw_fft_k,gs_hamk%lmnmax, & 698 gs_hamk%ntypat, gs_hamk%indlmn, gs_hamk%nattyp, gs_hamk%istwf_k, & 699 gs_hamk%ucvol, gs_hamk%ffnl_k, & 700 gs_hamk%ph3d_k, gs_hamk%kpt_k, gs_hamk%kg_k, gs_hamk%kpg_k) 701 else if(dtset%gpu_option==ABI_GPU_OPENMP) then 702 call make_gemm_nonlop_ompgpu(my_ikpt,gs_hamk%npw_fft_k,gs_hamk%lmnmax, & 703 gs_hamk%ntypat, gs_hamk%indlmn, gs_hamk%nattyp, gs_hamk%istwf_k, & 704 gs_hamk%ucvol, gs_hamk%ffnl_k, & 705 gs_hamk%ph3d_k, gs_hamk%kpt_k, gs_hamk%kg_k, gs_hamk%kpg_k) 706 end if 707 end if 708 709 #if defined HAVE_GPU_CUDA 710 if (dtset%gpu_option==ABI_GPU_LEGACY .or. dtset%gpu_option==ABI_GPU_KOKKOS) then 711 call gpu_update_ffnl_ph3d( & 712 & ph3d, INT(size(ph3d),c_int64_t), & 713 & ffnl, INT(size(ffnl),c_int64_t) ) 714 end if 715 #endif 716 717 ! Compute nonlocal psp energy (NCPP) or Rhoij (PAW) 718 ABI_MALLOC(enlout,(blocksize)) 719 ABI_MALLOC(occblock,(blocksize)) 720 do iblock=1,nblockbd 721 iband=(iblock-1)*blocksize+1;iband_last=min(iband+blocksize-1,nband_k) 722 if(proc_distrb_cycle(mpi_enreg%proc_distrb,ikpt,iband,iband_last,isppol,me_distrb)) cycle 723 724 ! Select occupied bands 725 occblock(:)=occ(1+(iblock-1)*blocksize+bdtot_index:iblock*blocksize+bdtot_index) 726 if(abs(maxval(occblock))>=tol8 ) then 727 cwavef(:,1:npw_k*my_nspinor*blocksize)=& 728 & cg(:,1+(iblock-1)*npw_k*my_nspinor*blocksize+icg:iblock*npw_k*my_nspinor*blocksize+icg) 729 730 choice=1-gs_hamk%usepaw ; signs=1 ; idir=0 ; nnlout=blocksize 731 paw_opt=gs_hamk%usepaw;cpopt=gs_hamk%usepaw-1 732 733 if (mpi_enreg%paral_kgb/=1) then 734 tim_nonlop=3 735 call nonlop(choice,cpopt,cwaveprj,enlout,gs_hamk,idir,(/zero/),mpi_enreg,blocksize,nnlout,& 736 & paw_opt,signs,nonlop_out,tim_nonlop,cwavef,cwavef) 737 else 738 tim_nonlop=14 739 call prep_nonlop(choice,cpopt,cwaveprj,enlout,gs_hamk,idir,(/zero/),blocksize,& 740 & mpi_enreg,nnlout,paw_opt,signs,nonlop_out,tim_nonlop,cwavef,cwavef) 741 end if 742 743 do iblocksize=1,blocksize 744 iband=(iblock-1)*blocksize+iblocksize 745 energies%e_eigenvalues=energies%e_eigenvalues+dtset%wtk(ikpt)*occ_k(iband)*eig_k(iband) 746 ! WARNING : the Fock contribution is NOT computed !!! 747 energies%e_nlpsp_vfock=energies%e_nlpsp_vfock+dtset%wtk(ikpt)*occ_k(iband)*enlout(iblocksize) 748 end do 749 750 ! PAW: accumulate rhoij 751 if (psps%usepaw==1) then 752 cplex=merge(1,2,istwf_k>1) 753 if (mpi_enreg%paral_spinor==1) then 754 call pawcprj_gather_spin(cwaveprj,cwaveprj_gat,dtset%natom,1,my_nspinor,dtset%nspinor,& 755 & mpi_enreg%comm_spinor,ierr) 756 call pawaccrhoij(gs_hamk%atindx,cplex,cwaveprj_gat,cwaveprj_gat,0,isppol,dtset%natom,dtset%natom,& 757 & dtset%nspinor,occ_k(iband),option_rhoij,pawrhoij_unsym,use_timerev,use_zeromag,dtset%wtk(ikpt)) 758 else 759 call pawaccrhoij(gs_hamk%atindx,cplex,cwaveprj,cwaveprj,0,isppol,dtset%natom,dtset%natom,& 760 & dtset%nspinor,occ_k(iband),option_rhoij,pawrhoij_unsym,use_timerev,use_zeromag,dtset%wtk(ikpt)) 761 end if 762 end if 763 764 ! End loop on bands 765 end if 766 end do 767 768 ! Compute residual of each band (for informative purposes) 769 call mkresi(cg,eig_k,gs_hamk,icg,ikpt,isppol,mcg,mpi_enreg,nband_k,dtset%prtvol,resid_k) 770 resid(1+bdtot_index : nband_k+bdtot_index) = resid_k(:) 771 772 ! Restore the bandfft tabs 773 if (mpi_enreg%paral_kgb==1) then 774 call bandfft_kpt_restoretabs(my_bandfft_kpt,ffnl=ffnl_sav,ph3d=ph3d_sav,kinpw=kinpw_sav) 775 end if 776 777 ! Incremente indexes 778 bdtot_index=bdtot_index+nband_k 779 if (dtset%mkmem/=0) then 780 icg=icg+npw_k*my_nspinor*nband_k 781 ikg=ikg+npw_k 782 end if 783 784 #if defined HAVE_GPU_CUDA 785 if(dtset%gpu_option==ABI_GPU_LEGACY .or. dtset%gpu_option==ABI_GPU_KOKKOS) then 786 call gpu_finalize_ffnl_ph3d() 787 end if 788 #endif 789 790 ABI_FREE(eig_k) 791 ABI_FREE(occ_k) 792 ABI_FREE(resid_k) 793 ABI_FREE(enlout) 794 ABI_FREE(occblock) 795 ABI_FREE(ffnl) 796 ABI_FREE(kinpw) 797 ABI_FREE(ph3d) 798 ABI_FREE(cwavef) 799 ABI_FREE(kg_k) 800 ABI_FREE(ylm_k) 801 802 ! End loops on isppol and ikpt 803 end do 804 end do 805 806 call gs_hamk%free() 807 if ( dtset%gpu_option == ABI_GPU_OPENMP) then 808 call ompgpu_free_hamilt_buffers() 809 end if 810 811 if(xmpi_paral==1)then 812 ! Accumulate enl eeig and ek on all proc. 813 ABI_MALLOC(buffer,(3+dtset%mband*dtset%nkpt*dtset%nsppol)) 814 buffer(1)=energies%e_nlpsp_vfock ; buffer(2)=energies%e_kinetic ; buffer(3)=energies%e_eigenvalues 815 do iresid=1,dtset%mband*dtset%nkpt*dtset%nsppol 816 buffer(iresid+3)=resid(iresid) 817 end do 818 call timab(48,1,tsec) 819 call xmpi_sum(buffer,spaceComm,ierr) 820 call timab(48,2,tsec) 821 energies%e_nlpsp_vfock=buffer(1) ; energies%e_kinetic=buffer(2) ; energies%e_eigenvalues=buffer(3) 822 do iresid=1,dtset%mband*dtset%nkpt*dtset%nsppol 823 resid(iresid)=buffer(iresid+3) 824 end do 825 ABI_FREE(buffer) 826 ! Accumulate rhoij_ 827 if (psps%usepaw==1) then 828 call pawrhoij_mpisum_unpacked(pawrhoij_unsym,spaceComm,comm2=mpi_enreg%comm_band) 829 end if 830 end if 831 832 !Compute total (free) energy 833 if (optene==0.or.optene==2) then 834 etotal = energies%e_kinetic + energies%e_hartree + energies%e_xc + & 835 !& energies%e_nlpsp_vfock - energies%e_fock0 + 836 ! Should compute the e_fock0 energy !! Also, the Fock contribution to e_nlpsp_vfock 837 & energies%e_nlpsp_vfock + energies%e_localpsp + energies%e_corepsp 838 if (psps%usepaw==1) etotal=etotal + energies%e_paw 839 else if (optene==1.or.optene==3) then 840 etotal = energies%e_eigenvalues - energies%e_hartree + energies%e_xc - & 841 & energies%e_xcdc + energies%e_corepsp - energies%e_corepspdc 842 if (psps%usepaw==1) etotal=etotal + energies%e_pawdc 843 end if 844 etotal = etotal + energies%e_ewald + energies%e_chempot + energies%e_vdw_dftd 845 !Add the contribution of extfpmd to the entropy 846 if(associated(extfpmd)) then 847 energies%entropy=energies%entropy+extfpmd%entropy 848 energies%e_extfpmd=extfpmd%e_kinetic 849 energies%edc_extfpmd=extfpmd%edc_kinetic 850 if(optene==0.or.optene==2) etotal=etotal+energies%e_extfpmd 851 if(optene==1.or.optene==3) etotal=etotal+energies%edc_extfpmd 852 end if 853 if(dtset%occopt>=3 .and. dtset%occopt<=8) etotal=etotal-dtset%tsmear*energies%entropy 854 855 !Additional stuff for electron-positron 856 if (dtset%positron/=0) then 857 if (ipositron==0) then 858 energies%e_electronpositron =zero 859 energies%edc_electronpositron=zero 860 else 861 energies%e_electronpositron =electronpositron%e_hartree+electronpositron%e_xc 862 energies%edc_electronpositron=electronpositron%e_hartree+electronpositron%e_xcdc 863 if (psps%usepaw==1) then 864 energies%e_electronpositron =energies%e_electronpositron +electronpositron%e_paw 865 energies%edc_electronpositron=energies%edc_electronpositron+electronpositron%e_pawdc 866 end if 867 end if 868 if (optene==0.or.optene==2) electronpositron%e0=etotal 869 if (optene==1.or.optene==3) electronpositron%e0=etotal-energies%edc_electronpositron 870 etotal=electronpositron%e0+energies%e0_electronpositron+energies%e_electronpositron 871 end if 872 873 !Compute new charge density based on incoming wf 874 !Keep rhor and rhog intact for later use e.g. in stress. (? MT 08-12-2008: is that true now ?) 875 !=== Norm-conserving psps: simply compute rho from WFs 876 !paw_dmft%use_dmft=0 ! dmft not used here 877 !paw_dmft%use_sc_dmft=0 ! dmft not used here 878 if (psps%usepaw==0) then 879 tim_mkrho=3 880 call mkrho(cg,dtset,gprimd,irrzon,kg,mcg,mpi_enreg,& 881 & npwarr,occ,paw_dmft,phnons,rhog,rhor,rprimd,tim_mkrho,ucvol,wvl_den,wfs,& 882 & extfpmd=extfpmd) 883 if(dtset%usekden==1)then 884 call mkrho(cg,dtset,gprimd,irrzon,kg,mcg,mpi_enreg,& 885 & npwarr,occ,paw_dmft,phnons,taug,taur,rprimd,tim_mkrho,ucvol,wvl_den,wfs,option=1) 886 end if 887 else 888 ! === PAW case: symmetrize rhoij and add compensation charge density 889 tim_mkrho=3;option=1;choice=1 890 call pawrhoij_symrhoij(pawrhoij,pawrhoij_unsym,choice,gprimd,indsym,0,dtset%natom,dtset%nsym,& 891 & dtset%ntypat,option,pawang,dtset%pawprtvol,pawtab,rprimd,dtset%symafm,symrec,dtset%typat,& 892 & comm_atom=mpi_enreg%comm_atom,mpi_atmtab=mpi_enreg%my_atmtab) 893 call pawrhoij_free_unpacked(pawrhoij_unsym) 894 if (paral_atom) then 895 call pawrhoij_free(pawrhoij_unsym) 896 ABI_FREE(pawrhoij_unsym) 897 end if 898 ider=0;izero=0;cplex=1;ipert=0;idir=0;qpt(:)=zero 899 call pawmknhat(compch_fft,cplex,ider,idir,ipert,izero,gprimd,& 900 & my_natom,dtset%natom,nfftf,ngfftf,& 901 & 0,dtset%nspden,dtset%ntypat,pawang,pawfgrtab,rhodum,nhat,pawrhoij,pawrhoij,& 902 & pawtab,qpt,rprimd,ucvol_local,dtset%usewvl,xred,& 903 & comm_fft=mpi_enreg%comm_fft,paral_kgb=dtset%paral_kgb,me_g0=mpi_enreg%me_g0,& 904 & comm_atom=mpi_enreg%comm_atom,mpi_atmtab=mpi_enreg%my_atmtab,& 905 & distribfft=mpi_enreg%distribfft,mpi_comm_wvl=mpi_enreg%comm_wvl) 906 907 ABI_MALLOC(rhowfr,(dtset%nfft,dtset%nspden)) 908 ABI_MALLOC(rhowfg,(2,dtset%nfft)) 909 rhowfr(:,:)=zero 910 call mkrho(cg,dtset,gprimd,irrzon,kg,mcg,mpi_enreg,& 911 & npwarr,occ,paw_dmft,phnons,rhowfg,rhowfr,rprimd,tim_mkrho,ucvol_local,wvl_den,wfs,& 912 & extfpmd=extfpmd) 913 914 call transgrid(1,mpi_enreg,dtset%nspden,+1,1,0,dtset%paral_kgb,pawfgr,rhowfg,rhodum,rhowfr,rhor) 915 rhor(:,:)=rhor(:,:)+nhat(:,:) 916 call fourdp(1,rhog,rhor(:,1),-1,mpi_enreg,nfftf,1,ngfftf,0) 917 if(dtset%usekden==1)then 918 call mkrho(cg,dtset,gprimd,irrzon,kg,mcg,mpi_enreg,npwarr,occ,paw_dmft,phnons,& 919 & rhowfg,rhowfr,rprimd,tim_mkrho,ucvol,wvl_den,wfs,option=1) 920 call transgrid(1,mpi_enreg,dtset%nspden,+1,1,1,dtset%paral_kgb,pawfgr,rhowfg,taug,rhowfr,taur) 921 end if 922 ABI_FREE(rhowfr) 923 ABI_FREE(rhowfg) 924 end if 925 926 ABI_COMMENT('New density rho(r) made from input wfs') 927 928 call timab(59,2,tsec) 929 930 ABI_FREE(vlocal) 931 if (with_vxctau) then 932 ABI_FREE(vxctaulocal) 933 end if 934 935 if (psps%usepaw==1) then 936 call pawcprj_free(cwaveprj) 937 ABI_FREE(cwaveprj) 938 if (mpi_enreg%paral_spinor==1) then 939 call pawcprj_free(cwaveprj_gat) 940 ABI_FREE(cwaveprj_gat) 941 else 942 nullify(cwaveprj_gat) 943 end if 944 end if 945 946 DBG_EXIT("COLL") 947 948 end subroutine energy
ABINIT/m_dft_energy [ Modules ]
NAME
m_dft_energy
FUNCTION
COPYRIGHT
Copyright (C) 1998-2024 ABINIT group (DCA, XG, GMR, AR, MB, MT, EB) This file is distributed under the terms of the GNU General Public License, see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt .
SOURCE
16 #if defined HAVE_CONFIG_H 17 #include "config.h" 18 #endif 19 20 #include "abi_common.h" 21 22 module m_dft_energy 23 24 use defs_basis 25 use defs_wvltypes 26 use m_abicore 27 use m_hamiltonian 28 use m_errors 29 use m_xmpi 30 use m_gemm_nonlop 31 use m_gemm_nonlop_gpu 32 use m_gemm_nonlop_ompgpu 33 use m_xcdata 34 use m_cgtools 35 use m_dtset 36 use m_extfpmd 37 use m_ompgpu_utils 38 39 use defs_datatypes, only : pseudopotential_type 40 use defs_abitypes, only : MPI_type 41 use m_time, only : timab 42 use m_geometry, only : metric 43 use m_kg, only : mkkin 44 use m_energies, only : energies_type 45 use m_electronpositron, only : electronpositron_type, electronpositron_calctype, rhohxcpositron 46 use m_bandfft_kpt, only : bandfft_kpt, bandfft_kpt_type, prep_bandfft_tabs, & 47 bandfft_kpt_savetabs, bandfft_kpt_restoretabs 48 use m_pawang, only : pawang_type 49 use m_pawtab, only : pawtab_type 50 use m_paw_ij, only : paw_ij_type 51 use m_pawfgrtab, only : pawfgrtab_type 52 use m_pawrhoij, only : pawrhoij_type, pawrhoij_alloc, pawrhoij_free, pawrhoij_init_unpacked, & 53 pawrhoij_mpisum_unpacked, pawrhoij_free_unpacked, pawrhoij_inquire_dim, & 54 & pawrhoij_symrhoij 55 use m_pawcprj, only : pawcprj_type,pawcprj_alloc,pawcprj_free,pawcprj_gather_spin 56 use m_pawfgr, only : pawfgr_type 57 use m_paw_dmft, only : paw_dmft_type 58 use m_paw_nhat, only : pawmknhat 59 use m_paw_occupancies, only : pawaccrhoij 60 use m_fft, only : fftpac, fourdp 61 use m_spacepar, only : meanvalue_g, hartre 62 use m_dens, only : constrained_dft_t,mag_penalty 63 use m_mkrho, only : mkrho 64 use m_mkffnl, only : mkffnl 65 use m_getghc, only : getghc 66 use m_rhotoxc, only : rhotoxc 67 use m_mpinfo, only : proc_distrb_cycle 68 use m_nonlop, only : nonlop 69 use m_fourier_interpol, only : transgrid 70 use m_prep_kgb, only : prep_getghc, prep_nonlop 71 use m_psolver, only : psolver_rhohxc 72 73 #ifdef HAVE_FC_ISO_C_BINDING 74 use, intrinsic :: iso_c_binding, only : c_int64_t 75 #endif 76 77 #if defined HAVE_GPU_CUDA 78 use m_manage_cuda 79 #endif 80 81 implicit none 82 83 private
ABINIT/mkresi [ Functions ]
NAME
mkresi
FUNCTION
Make residuals from knowledge of wf in G space and application of Hamiltonian.
INPUTS
cg(2,mcg)=<G|Cnk>=Fourier coefficients of wavefunction gs_hamk <type(gs_hamiltonian_type)>=all data for the Hamiltonian at k icg=shift to be applied on the location of data in the array cg ikpt=index of k-point isppol=index of spin mcg=second dimension of the cg array mpi_enreg=information about MPI parallelization nband=number of bands involved in subspace matrix. npw=number of planewaves in basis sphere at this k point. prtvol=control print volume and debugging output usepaw= 0 for non paw calculation; =1 for paw calculation
OUTPUT
eig_k(nband)$= \langle C_n \mid H \mid C_n \rangle $ for each band. resid_k(nband)=residual for each band $= \langle C_n \mid H H \mid C_n \rangle- \langle C_n \mid H \mid C_n \rangle^2 $.
SOURCE
978 subroutine mkresi(cg,eig_k,gs_hamk,icg,ikpt,isppol,mcg,mpi_enreg,nband,prtvol,resid_k) 979 980 !Arguments ------------------------------------ 981 !scalars 982 integer,intent(in) :: icg,ikpt,isppol,mcg,nband,prtvol 983 type(MPI_type),intent(inout) :: mpi_enreg 984 type(gs_hamiltonian_type),intent(inout) :: gs_hamk 985 !arrays 986 real(dp),intent(in) :: cg(2,mcg) 987 real(dp),intent(out) :: eig_k(nband),resid_k(nband) 988 989 !Local variables------------------------------- 990 !scalars 991 integer,parameter :: tim_getghc=3 992 integer :: blocksize,cpopt,iband,iband_last,iblock,iblocksize,ipw,ipw_shift 993 integer :: my_nspinor,nblockbd,npw_k 994 real(dp) :: doti,dotr 995 !arrays 996 real(dp) :: tsec(2) 997 real(dp),allocatable,target :: cwavef(:,:),ghc(:,:),gsc(:,:),gvnlxc(:,:) 998 real(dp), ABI_CONTIGUOUS pointer :: cwavef_ptr(:,:),ghc_ptr(:,:),gsc_ptr(:,:) 999 type(pawcprj_type) :: cwaveprj(1,1) 1000 1001 ! ************************************************************************* 1002 1003 !Keep track of total time spent in mkresi 1004 call timab(13,1,tsec) 1005 1006 !Parallelism over FFT and/or bands: define sizes and tabs 1007 my_nspinor=max(1,gs_hamk%nspinor/mpi_enreg%nproc_spinor) 1008 if (mpi_enreg%paral_kgb==1) then 1009 nblockbd=nband/(mpi_enreg%nproc_band*mpi_enreg%bandpp) 1010 else 1011 nblockbd=nband/mpi_enreg%nproc_fft 1012 if (nband/=nblockbd*mpi_enreg%nproc_fft) nblockbd=nblockbd+1 1013 end if 1014 blocksize=nband/nblockbd 1015 1016 npw_k=gs_hamk%npw_k 1017 ABI_MALLOC(cwavef,(2,npw_k*my_nspinor)) 1018 ABI_MALLOC(ghc,(2,npw_k*my_nspinor)) 1019 ABI_MALLOC(gvnlxc,(2,npw_k*my_nspinor)) 1020 if (gs_hamk%usepaw==1) then 1021 ABI_MALLOC(gsc,(2,npw_k*my_nspinor)) 1022 else 1023 ABI_MALLOC(gsc,(0,0)) 1024 end if 1025 1026 !Loop over (blocks of) bands 1027 do iblock=1,nblockbd 1028 iband=(iblock-1)*blocksize+1;iband_last=min(iband+blocksize-1,nband) 1029 if(proc_distrb_cycle(mpi_enreg%proc_distrb,ikpt,iband,iband_last,isppol,mpi_enreg%me_kpt)) cycle 1030 1031 ! Load |Cn> 1032 ipw_shift=(iblock-1)*npw_k*my_nspinor*blocksize+icg 1033 !$OMP PARALLEL DO 1034 do ipw=1,npw_k*my_nspinor*blocksize 1035 cwavef(1,ipw)=cg(1,ipw+ipw_shift) 1036 cwavef(2,ipw)=cg(2,ipw+ipw_shift) 1037 end do 1038 1039 ! Compute H|Cn> 1040 cpopt=-1 1041 if (mpi_enreg%paral_kgb==0) then 1042 call getghc(cpopt,cwavef,cwaveprj,ghc,gsc,gs_hamk,gvnlxc,zero,mpi_enreg,1,& 1043 & prtvol,gs_hamk%usepaw,tim_getghc,0) 1044 else 1045 call prep_getghc(cwavef,gs_hamk,gvnlxc,ghc,gsc,zero,nband,mpi_enreg,& 1046 & prtvol,gs_hamk%usepaw,cpopt,cwaveprj,& 1047 & already_transposed=.false.) 1048 end if 1049 1050 !call cg_get_eigens(usepaw, istwf_k, npwsp, nband, cg, ghc, gsc, eig, me_g0, comm_bsf) 1051 !call cg_get_residvecs(usepaw, npwsp, nband, eig, cg, ghc, gsc, gwork) 1052 !call cg_norm2g(istwf_k, npwsp, nband, gwork, resid, me_g0, comm_bsf) 1053 ! MG: Communicators are wrongi if paral_kgb. One should use mpi_enreg%comm_bandspinorfft 1054 1055 ! Compute the residual, <Cn|(H-<Cn|H|Cn>)**2|Cn>: 1056 do iblocksize=1,blocksize 1057 iband=(iblock-1)*blocksize+iblocksize 1058 ipw_shift=(iblocksize-1)*npw_k*my_nspinor 1059 cwavef_ptr => cwavef(:,1+ipw_shift:npw_k*my_nspinor+ipw_shift) 1060 ghc_ptr => ghc (:,1+ipw_shift:npw_k*my_nspinor+ipw_shift) 1061 1062 ! First get eigenvalue <Cn|H|Cn>: 1063 call dotprod_g(dotr,doti,gs_hamk%istwf_k,npw_k*my_nspinor,1,cwavef_ptr,ghc_ptr,& 1064 & mpi_enreg%me_g0,mpi_enreg%comm_spinorfft) 1065 eig_k(iband)=dotr 1066 1067 ! Next need <G|(H-S<Cn|H|Cn>)|Cn> (in ghc): 1068 if (gs_hamk%usepaw==0) then 1069 !$OMP PARALLEL DO PRIVATE(ipw) SHARED(cwavef_ptr,ghc_ptr,eig_k,iband,npw_k,my_nspinor) 1070 do ipw=1,npw_k*my_nspinor 1071 ghc_ptr(1,ipw)=ghc_ptr(1,ipw)-eig_k(iband)*cwavef_ptr(1,ipw) 1072 ghc_ptr(2,ipw)=ghc_ptr(2,ipw)-eig_k(iband)*cwavef_ptr(2,ipw) 1073 end do 1074 else 1075 gsc_ptr => gsc(:,1+ipw_shift:npw_k*my_nspinor+ipw_shift) 1076 !$OMP PARALLEL DO PRIVATE(ipw) SHARED(gsc_ptr,ghc_ptr,eig_k,iband,npw_k,my_nspinor) 1077 do ipw=1,npw_k*my_nspinor 1078 ghc_ptr(1,ipw)=ghc_ptr(1,ipw)-eig_k(iband)*gsc_ptr(1,ipw) 1079 ghc_ptr(2,ipw)=ghc_ptr(2,ipw)-eig_k(iband)*gsc_ptr(2,ipw) 1080 end do 1081 end if 1082 1083 ! Then simply square the result: 1084 call sqnorm_g(dotr,gs_hamk%istwf_k,npw_k*my_nspinor,ghc_ptr,& 1085 & mpi_enreg%me_g0,mpi_enreg%comm_fft) 1086 resid_k(iband)=dotr 1087 1088 end do ! iblocksize 1089 1090 end do ! iblock 1091 1092 ABI_FREE(cwavef) 1093 ABI_FREE(ghc) 1094 ABI_FREE(gvnlxc) 1095 ABI_FREE(gsc) 1096 1097 call timab(13,2,tsec) 1098 1099 end subroutine mkresi