Commit afe6599b authored by Marcello Sega's avatar Marcello Sega

Merge branch 'master' of github.com:espressomd/espresso into shanchen

parents 69eec8f8 5a3eb5c2
......@@ -29,6 +29,8 @@ New user-visible features
* New command time_integration to get the runtime of the integration
loop.
* New harmonic force that runs on the GPU.
User-visible changes
--------------------
* Comfixed now works with periodic boundary conditions.
......
......@@ -300,10 +300,10 @@ AS_IF([test x$with_cuda != xno],[
# NVCC compile check
AC_MSG_CHECKING([whether CUDA compiles])
# if no other compute capability is defined by the user, we require at least 1.1
# if no other compute capability is defined by the user, we default to 2.0
case "$NVCCFLAGS" in
*-arch=*) ;;
*) NVCCFLAGS="$NVCCFLAGS --ptxas-options=-v -gencode arch=compute_11,code=compute_11 -gencode arch=compute_20,code=compute_20"
*) NVCCFLAGS="$NVCCFLAGS --ptxas-options=-v -gencode arch=compute_20,code=compute_20"
esac
# use nvcc
......
......@@ -473,6 +473,21 @@ use the \lit{electrokinetics reaction region} command, one must first set up
a reaction, as described above. To successfully specify a region all the
relevant arguments that go with the shape constraints must be provided.
\subsubsection{\label{sssec:ek-pdb-parse}Parsing PDB Files}
\begin{essyntax}
\require{1 or 2 or 3}{electrokinetics}
\require{2}{pdb-parse}
\var{pdb\_filename}
\var{itp\_filename}
\begin{features}
\required[1]{ELECTROKINETICS}
\required[2]{EK_BOUNDARIES}
\required[3]{EK_REACTIONS}
\end{features}
\end{essyntax}
The \lit{electrokinetics pdb-parse} feature allows the user to parse simple PDB files, a file format introduced by the Protein Data Bank to encode molecular structures. Together with a topology file (here \var{itp\_filename}), the structure is interpolated onto the \lit{electrokinetics} grid. To prepare the input, run the PDB file through a \lit{gromacs} force field to generate the topology file. By convention the PDB file has the extension \lit{.pdb} and the topology file the extension \lit{.itp}; they are passed to the command as \var{pdb\_filename} and \var{itp\_filename}, respectively. \todo{At the moment this fails badly if you try to parse incorrectly formatted files. This will be fixed in the future.}
\subsection{\label{ssec:ek-reac-output}Reaction-Specific Output}
\begin{essyntax}
......
......@@ -155,9 +155,10 @@ configure --help
be used to define compiler flags for the NVIDIA CUDA-compiler
\texttt{nvcc}. For example, \texttt{NVCCFLAGS = "{}-gencode
arch=compute_20,code=sm_20"{}} will compile code only for Fermi
cards. Default is to compile for compute models 1.1 and 2.0,
i.e. everything with a G90 chip or newer. Note that we require at
least compute model 1.1.
cards. Default is to compile for compute model 2.0,
i.e. everything with a Fermi chip or newer. Note that we require at
least compute model 1.1, that is, G90. However, to use a G90 card (e.\,g.~a Tesla
C1060), you need to manually specify compute model 1.1.
\end{description}
\section{\texttt{make}: Compiling, testing and installing \es}
......
......@@ -865,6 +865,20 @@ constraint \opt{\var{num}}
Prints out all constraint information. If \var{num} is specified only this
constraint is displayed, otherwise all constraints will be printed.
\subsection{\texttt{harmonic_force}: Creating a harmonic trap}
\newescommand{harmonic-force}
\begin{essyntax}
harmonic_force \{ \var{x} \var{y} \var{z} \} \var{k}
\begin{features}
\required{HARMONICFORCE}
\required{CUDA}
\end{features}
\end{essyntax}
Calculates a spring force for all particles, where the equilibrium position
of the spring is at \var{x y z} and its force constant is \var{k}. A more
flexible trap can be constructed with constraints, but this one runs on the GPU.
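The trap applies the Hooke force F = k(r0 - r) to each coordinate. A minimal CPU-side sketch of what the GPU kernel accumulates (a hypothetical standalone helper for illustration, not part of \es{}):

```cpp
#include <cstddef>

// Accumulate a harmonic trap force f += k * (center - pos) for n particles.
// pos and f are flat arrays of 3*n floats, matching the GPU layout.
void harmonic_trap(float x, float y, float z, float k,
                   std::size_t n, const float *pos, float *f) {
    for (std::size_t i = 0; i < n; ++i) {
        f[3*i + 0] += k * (x - pos[3*i + 0]);
        f[3*i + 1] += k * (y - pos[3*i + 1]);
        f[3*i + 2] += k * (z - pos[3*i + 2]);
    }
}
```

The CUDA kernel in this commit performs exactly this per-particle update, with one thread per particle instead of a loop.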
\section{Virtual sites}
\label{sec:virtual}
\index{virtual sites|mainindex}
......
......@@ -24,7 +24,7 @@
\newescommand{integrate}
\begin{essyntax}
\variant{1} integrate \var{steps}
\variant{1} integrate \opt{reuse_forces} \var{steps}
\variant{2} integrate set \opt{nvt}
\variant{3} integrate set npt_isotropic \var{p_{ext}} \var{piston} \opt{\var{x\: y\: z}} \opt{-cubic_box}
\end{essyntax}
......@@ -34,6 +34,27 @@ equations of motion. The command \texttt{integrate} with an integer
\texttt{steps} as parameter integrates the system for \texttt{steps}
time steps.
Note that this implementation of the Velocity Verlet algorithm reuses forces,
that is, they are computed once in the middle of the time step, but used twice,
at the beginning and end. However, in the first time step after setting up,
there are no forces present yet. Therefore, \es{} has to compute them before the
first time step. That has two consequences: first, random forces are redrawn,
resulting in a narrower distribution of the random forces, which we compensate
by stretching. Second, coupling forces of e.\,g. the Lattice Boltzmann fluid cannot
be computed and are therefore lacking in the first half time step. In order to
minimize these effects, \es{} uses quite conservative heuristics to decide whether
a change makes it necessary to recompute forces before the first time step. Therefore,
calling \texttt{integrate 1} a hundred times does the same as \texttt{integrate 100},
apart from some small calling overhead.
However, for checkpointing, there is no way for \es{} to tell that the forces that you
read back in actually match the parameters that are set. Therefore, \es{} would recompute
the forces before the first time step, which makes it essentially impossible to checkpoint
LB simulations, where it is vital to keep the coupling forces. For this reason, \texttt{integrate}
has an additional parameter \opt{reuse_forces}, which tells \texttt{integrate} not to recalculate
the forces for the first time step, but to use the values still stored with the particles.
Use this only if you are absolutely sure that the forces stored match your current setup!
Two methods for the integration can be set: For an NVT ensemble
(thermostat) and for an NPT isotropic ensemble (barostat). The current
method can be detected with the command \texttt{integrate set} without
......
......@@ -36,6 +36,7 @@
#define ELECTROKINETICS
#define EK_BOUNDARIES
#define EK_REACTION
#define HARMONICFORCE
#endif
#define AREA_FORCE_GLOBAL
......
......@@ -195,8 +195,20 @@ proc writevsf { file args } {
lappend aids "$from:$to"
}
unset from
puts $file "atom [join $aids ,] $desc"
# let's group atom ranges, so that there are no more than 8 per line
# This fixes a problem with the vmd plugin, and makes the vtf more
# readable anyway.
set start 0
set maxlen 8
set ll [llength [lrange $aids $start end ]]
while { $ll >= $maxlen } {
puts $file "atom [join [lrange $aids $start [expr $start + $maxlen -1]] ,] $desc"
incr start $maxlen
set ll [llength [lrange $aids $start end ]]
}
if { $start < [llength $aids ] } {
puts $file "atom [join [lrange $aids $start end] ,] $desc"
}
}
# Print bond data
......
......@@ -6,10 +6,16 @@
#include <iostream>
/* Need explicit specialization, otherwise some compilers do not produce the objects. */
template class EspressoSystemInterface::const_iterator<SystemInterface::Real>;
template class EspressoSystemInterface::const_iterator<SystemInterface::Vector3>;
template class EspressoSystemInterface::const_iterator<int>;
/********************************************************************************************/
template<class value_type>
const value_type EspressoSystemInterface::const_iterator<value_type>::operator*() const {
value_type EspressoSystemInterface::const_iterator<value_type>::operator*() const {
return (*m_const_iterator);
}
......@@ -53,7 +59,9 @@ void EspressoSystemInterface::gatherParticles() {
if (m_gpu)
{
if(gpu_get_global_particle_vars_pointer_host()->communication_enabled) {
ESIF_TRACE(puts("Calling copy_part_data_to_gpu()"));
copy_part_data_to_gpu();
reallocDeviceMemory(gpu_get_global_particle_vars_pointer_host()->number_of_particles);
if(m_splitParticleStructGpu && (this_node == 0))
split_particle_struct();
}
......
#ifndef ESPRESSOSYSTEMINTERFACE_H
#define ESPRESSOSYSTEMINTERFACE_H
#define ESIF_TRACE(A)
#include "SystemInterface.hpp"
#include "particle_data.hpp"
#include "cuda_interface.hpp"
class EspressoSystemInterface : public SystemInterface {
......@@ -20,7 +21,7 @@ public:
template<class value_type>
class const_iterator : public SystemInterface::const_iterator<value_type> {
public:
const value_type operator*() const;
value_type operator*() const;
SystemInterface::const_iterator<value_type> &operator=(const SystemInterface::const_iterator<value_type> &rhs);
EspressoSystemInterface::const_iterator<value_type> &operator=(typename std::vector<value_type>::const_iterator rhs);
bool operator==(SystemInterface::const_iterator<value_type> const &rhs) const;
......@@ -45,6 +46,7 @@ public:
bool hasQ() { return true; };
#endif
#ifdef CUDA
float *rGpuBegin() { return m_r_gpu_begin; };
float *rGpuEnd() { return m_r_gpu_end; };
bool hasRGpu() { return true; };
......@@ -52,6 +54,8 @@ public:
m_needsRGpu = hasRGpu();
m_splitParticleStructGpu |= m_needsRGpu;
m_gpu |= m_needsRGpu;
if(m_gpu)
enableParticleCommunication();
return m_needsRGpu;
};
......@@ -62,6 +66,8 @@ public:
m_needsVGpu = hasVGpu();
m_splitParticleStructGpu |= m_needsVGpu;
m_gpu |= m_needsVGpu;
if(m_gpu)
enableParticleCommunication();
return m_needsVGpu;
};
......@@ -72,18 +78,55 @@ public:
m_needsQGpu = hasQGpu();
m_splitParticleStructGpu |= m_needsQGpu;
m_gpu |= m_needsQGpu;
if(m_gpu)
enableParticleCommunication();
return m_needsQGpu;
};
bool requestParticleStructGpu() {
m_needsParticleStructGpu = true;
m_gpu |= m_needsParticleStructGpu;
if(m_gpu)
enableParticleCommunication();
return true;
}
float *fGpuBegin() { return (float *)gpu_get_particle_force_pointer(); };
float *fGpuEnd() { return (float *)(gpu_get_particle_force_pointer()) + 3*m_gpu_npart; };
bool hasFGpu() { return true; };
bool requestFGpu() {
m_needsFGpu = hasFGpu();
m_gpu |= m_needsFGpu;
if(m_gpu)
enableParticleCommunication();
return m_needsFGpu;
};
#endif
unsigned int npart_gpu() {
#ifdef CUDA
return m_gpu_npart;
#else
return 0;
#endif
};
protected:
void gatherParticles();
void split_particle_struct();
#ifdef CUDA
void enableParticleCommunication() {
if(!gpu_get_global_particle_vars_pointer_host()->communication_enabled) {
ESIF_TRACE(puts("gpu communication not enabled;"));
ESIF_TRACE(puts("enableParticleCommunication"));
gpu_init_particle_comm();
cuda_bcast_global_part_params();
reallocDeviceMemory(gpu_get_global_particle_vars_pointer_host()->number_of_particles);
}
};
void reallocDeviceMemory(int n);
#endif
Vector3Container R;
#ifdef ELECTROSTATICS
......@@ -115,12 +158,6 @@ protected:
bool m_splitParticleStructGpu;
};
/* Need explicit specialization, otherwise some compilers do not produce the objects. */
template class EspressoSystemInterface::const_iterator<SystemInterface::Real>;
template class EspressoSystemInterface::const_iterator<SystemInterface::Vector3>;
template class EspressoSystemInterface::const_iterator<int>;
extern EspressoSystemInterface espressoSystemInterface;
#endif
#include "EspressoSystemInterface.hpp"
#include "cuda_interface.hpp"
#include "cuda_utils.hpp"
......@@ -57,34 +58,39 @@ __global__ void split_kernel_v(CUDA_particle_data *particles, float *v, int n) {
v[idx + 2] = p.v[2];
}
void EspressoSystemInterface::reallocDeviceMemory(int n) {
if(m_needsRGpu && ((n != m_gpu_npart) || (m_r_gpu_begin == 0))) {
if(m_r_gpu_begin != 0)
cuda_safe_mem(cudaFree(m_r_gpu_begin));
cuda_safe_mem(cudaMalloc(&m_r_gpu_begin, 3*n*sizeof(float)));
m_r_gpu_end = m_r_gpu_begin + 3*n;
}
if(m_needsVGpu && ((n != m_gpu_npart) || (m_v_gpu_begin == 0))) {
if(m_v_gpu_begin != 0)
cuda_safe_mem(cudaFree(m_v_gpu_begin));
cuda_safe_mem(cudaMalloc(&m_v_gpu_begin, 3*n*sizeof(float)));
m_v_gpu_end = m_v_gpu_begin + 3*n;
}
if(m_needsQGpu && ((n != m_gpu_npart) || (m_q_gpu_begin == 0))) {
if(m_q_gpu_begin != 0)
cuda_safe_mem(cudaFree(m_q_gpu_begin));
/* charges are scalar, so only n floats are needed */
cuda_safe_mem(cudaMalloc(&m_q_gpu_begin, n*sizeof(float)));
m_q_gpu_end = m_q_gpu_begin + n;
}
m_gpu_npart = n;
}
void EspressoSystemInterface::split_particle_struct() {
int n = gpu_get_global_particle_vars_pointer_host()->number_of_particles;
if(n == 0)
return;
if( (n != m_gpu_npart) ) {
if(m_needsRGpu) {
if(m_r_gpu_begin != 0)
cuda_safe_mem(cudaFree(m_r_gpu_begin));
cuda_safe_mem(cudaMalloc(&m_r_gpu_begin, 3*n*sizeof(float)));
m_r_gpu_end = m_r_gpu_begin + 3*n;
}
if(m_needsVGpu) {
if(m_v_gpu_begin != 0)
cuda_safe_mem(cudaFree(m_v_gpu_begin));
cuda_safe_mem(cudaMalloc(&m_v_gpu_begin, 3*n*sizeof(float)));
m_v_gpu_end = m_v_gpu_begin + 3*n;
}
if(m_needsQGpu) {
if(m_q_gpu_begin != 0)
cuda_safe_mem(cudaFree(m_q_gpu_begin));
cuda_safe_mem(cudaMalloc(&m_q_gpu_begin, n*sizeof(float)));
m_q_gpu_end = m_q_gpu_begin + n;
}
}
m_gpu_npart = n;
ESIF_TRACE(printf("n = %d, m_gpu_npart = %d\n", n, m_gpu_npart));
dim3 grid(n/512+1,1,1);
dim3 block(512,1,1);
......
#include "HarmonicForce.hpp"
#include "cuda_utils.hpp"
#include <stdio.h>
#ifdef HARMONICFORCE
HarmonicForce *harmonicForce = 0;
__global__ void HarmonicForce_kernel(float x, float y, float z, float k,
int n, float *pos, float *f) {
int id = blockIdx.x * blockDim.x + threadIdx.x;
if(id >= n)
return;
f[3*id + 0] += k*(x - pos[3*id + 0]);
f[3*id + 1] += k*(y - pos[3*id + 1]);
f[3*id + 2] += k*(z - pos[3*id + 2]);
}
void HarmonicForce_kernel_wrapper(float x, float y, float z, float k, int n, float *pos, float *f) {
dim3 grid(1,1,1);
dim3 block(1,1,1);
if(n == 0)
return;
if(n <= 512) {
grid.x = 1;
block.x = n;
} else {
grid.x = n/512 + 1;
block.x = 512;
}
KERNELCALL(HarmonicForce_kernel,grid,block,(x, y, z, k, n, pos, f))
}
#endif
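A side note on the launch configuration above: `grid.x = n/512 + 1` allocates one extra block whenever `n` is an exact multiple of 512, which is harmless given the `id >= n` guard in the kernel. The usual exact form is ceiling division (hypothetical helper, shown for comparison):

```cpp
// Number of CUDA blocks of `block` threads needed to cover n items,
// computed as an exact ceiling division.
unsigned int blocks_for(unsigned int n, unsigned int block = 512) {
    return (n + block - 1) / block;
}
```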
#include "config.hpp"
#ifdef HARMONICFORCE
#include "SystemInterface.hpp"
#include <iostream>
void HarmonicForce_kernel_wrapper(float x, float y, float z, float k,
int n, float *pos, float *f);
class HarmonicForce {
public:
HarmonicForce(float x1, float x2, float x3, float _k, SystemInterface &s) {
x = x1;
y = x2;
z = x3;
k = _k;
if(!s.requestFGpu())
std::cerr << "HarmonicForce needs access to forces on GPU!" << std::endl;
if(!s.requestRGpu())
std::cerr << "HarmonicForce needs access to positions on GPU!" << std::endl;
};
void calc(SystemInterface &s) {
HarmonicForce_kernel_wrapper(x,y,z,k,s.npart_gpu(),
s.rGpuBegin(), s.fGpuBegin());
};
protected:
float x,y,z;
float k;
};
extern HarmonicForce *harmonicForce;
#endif
......@@ -45,6 +45,7 @@ libEspresso_la_SOURCES = \
cuda_init.hpp \
debug.cpp debug.hpp \
domain_decomposition.cpp domain_decomposition.hpp \
electrokinetics_pdb_parse.cpp electrokinetics_pdb_parse.hpp \
energy.cpp energy.hpp \
errorhandling.cpp errorhandling.hpp \
fft.cpp fft.hpp \
......@@ -105,7 +106,8 @@ libEspresso_la_SOURCES = \
ghmc.cpp ghmc.hpp \
Vector.hpp \
SystemInterface.hpp \
EspressoSystemInterface.hpp EspressoSystemInterface.cpp
EspressoSystemInterface.hpp EspressoSystemInterface.cpp \
HarmonicForce.hpp
# nonbonded potentials and forces
libEspresso_la_SOURCES += \
......@@ -194,7 +196,8 @@ CUDA_SOURCES = \
electrokinetics.hpp \
lbgpu_cuda.cu \
p3m_gpu_cuda.cu \
EspressoSystemInterface_cuda.cu
EspressoSystemInterface_cuda.cu \
HarmonicForce.cu
libEspresso_la_SOURCES += $(CUDA_SOURCES)
EXTRA_DIST += $(CUDA_SOURCES)
......
......@@ -18,7 +18,7 @@ public:
template<class value_type>
class const_iterator {
public:
virtual const value_type operator*() const = 0;
virtual value_type operator*() const = 0;
virtual const_iterator<value_type> &operator=(const const_iterator<value_type> &rhs) = 0;
virtual bool operator==(const_iterator<value_type> const &rhs) const = 0;
virtual bool operator!=(const_iterator<value_type> const &rhs) const = 0;
......@@ -27,7 +27,7 @@ public:
class null_vec_iterator : public SystemInterface::const_iterator<Vector3> {
public:
const Vector3 operator*() const { return Vector3(); };
Vector3 operator*() const { return Vector3(); };
const_iterator<Vector3> &operator=(const const_iterator<Vector3> &rhs) { return *this; };
bool operator==(const_iterator<Vector3> const &rhs) const { return true; };
bool operator!=(const_iterator<Vector3> const &rhs) const { return false; };
......@@ -38,7 +38,7 @@ public:
class null_real_iterator : public SystemInterface::const_iterator<Real> {
public:
const Real operator*() const { return Real(); };
Real operator*() const { return Real(); };
const_iterator<Real> &operator=(const const_iterator<Real> &rhs) { return *this; };
bool operator==(const_iterator<Real> const &rhs) const { return true; };
bool operator!=(const_iterator<Real> const &rhs) const { return false; };
......@@ -109,18 +109,25 @@ public:
virtual bool hasQGpu() { return false; };
virtual bool requestQGpu() { m_needsQGpu = hasQGpu(); return m_needsQGpu; }
virtual float *fGpuBegin() { return 0; };
virtual float *fGpuEnd() { return 0; };
virtual bool hasFGpu() { return false; };
virtual bool requestFGpu() { m_needsFGpu = hasFGpu(); return m_needsFGpu; }
virtual const_real_iterator &qBegin() { return null_scalar; };
virtual const const_real_iterator &qEnd() { return null_scalar; };
virtual bool hasQ() { return false; };
virtual bool requestQ() { m_needsQ = hasQ(); return m_needsQ; }
virtual unsigned int npart() = 0;
virtual unsigned int npart_gpu() = 0;
virtual Vector3 box() = 0;
virtual bool needsR() { return m_needsR; };
virtual bool needsRGpu() { return m_needsRGpu;};
virtual bool needsQGpu() { return m_needsQGpu;};
virtual bool needsQ() { return m_needsQ;};
virtual bool needsFGpu() { return m_needsFGpu; };
protected:
bool m_needsR;
......@@ -129,7 +136,7 @@ protected:
bool m_needsRGpu;
bool m_needsVGpu;
bool m_needsQGpu;
bool m_needsFGpu;
};
#endif
......@@ -1172,18 +1172,19 @@ void mpi_remove_particle_slave(int pnode, int part)
}
/********************* REQ_INTEGRATE ********/
int mpi_integrate(int n_steps)
int mpi_integrate(int n_steps, int reuse_forces)
{
if (!correlations_autoupdate) {
mpi_call(mpi_integrate_slave, -1, n_steps);
integrate_vv(n_steps);
mpi_call(mpi_integrate_slave, n_steps, reuse_forces);
integrate_vv(n_steps, reuse_forces);
COMM_TRACE(fprintf(stderr, "%d: integration task %d done.\n", \
this_node, n_steps));
return check_runtime_errors();
} else {
for (int i=0; i<n_steps; i++) {
mpi_call(mpi_integrate_slave, -1, 1);
integrate_vv(1);
mpi_call(mpi_integrate_slave, 1, reuse_forces);
integrate_vv(1, reuse_forces);
reuse_forces = 0; // makes even less sense after the first time step
COMM_TRACE(fprintf(stderr, "%d: integration task %d done.\n", \
this_node, i));
if (check_runtime_errors())
......@@ -1195,10 +1196,10 @@ int mpi_integrate(int n_steps)
return 0;
}
void mpi_integrate_slave(int pnode, int task)
void mpi_integrate_slave(int n_steps, int reuse_forces)
{
integrate_vv(task);
COMM_TRACE(fprintf(stderr, "%d: integration task %d done.\n", this_node, task));
integrate_vv(n_steps, reuse_forces);
COMM_TRACE(fprintf(stderr, "%d: integration of %d steps (reuse_forces=%d) done.\n", this_node, n_steps, reuse_forces));
check_runtime_errors();
}
......
......@@ -310,9 +310,10 @@ void mpi_recv_part(int node, int part, Particle *part_data);
/** Issue REQ_INTEGRATE: start integrator.
@param n_steps how many steps to do.
@param reuse_forces whether to trust the old forces for the first half step
@return nonzero on error
*/
int mpi_integrate(int n_steps);
int mpi_integrate(int n_steps, int reuse_forces);
/** Issue REQ_BCAST_IA: send new ia params.
Also calls \ref on_short_range_ia_change.
......
......@@ -180,6 +180,7 @@ void gpu_change_number_of_part_to_comm() {
#endif
cuda_safe_mem(cudaMalloc((void**)&particle_forces_device, global_part_vars_host.number_of_particles * sizeof(CUDA_particle_force)));
cuda_safe_mem(cudaMalloc((void**)&particle_data_device, global_part_vars_host.number_of_particles * sizeof(CUDA_particle_data)));
cuda_safe_mem(cudaMalloc((void**)&particle_seeds_device, global_part_vars_host.number_of_particles * sizeof(CUDA_particle_seed)));
#ifdef SHANCHEN
......@@ -249,8 +250,8 @@ CUDA_fluid_composition* gpu_get_fluid_composition_pointer() {
}
void copy_part_data_to_gpu() {
COMM_TRACE(printf("global_part_vars_host.communication_enabled = %d && global_part_vars_host.number_of_particles = %d\n", global_part_vars_host.communication_enabled, global_part_vars_host.number_of_particles));
if ( global_part_vars_host.communication_enabled == 1 && global_part_vars_host.number_of_particles ) {
cuda_mpi_get_particles(particle_data_host);
/** get espresso md particle values*/
......
......@@ -38,7 +38,6 @@ void cuda_mpi_get_particles(CUDA_particle_data *particle_data_host)
sizes = (int*) malloc(sizeof(int)*n_nodes);
n_part = cells_get_n_particles();
/* first collect number of particles on each node */
MPI_Gather(&n_part, 1, MPI_INT, sizes, 1, MPI_INT, 0, comm_cart);
......
/* vim: set ts=8 sts=2 sw=2 et: */
#ifndef _PDB_PARSER_H
#define _PDB_PARSER_H
#include "electrokinetics.hpp"
#ifdef ELECTROKINETICS
extern float* pdb_charge_lattice;
extern int* pdb_boundary_lattice;
/* Returns 0 if reading the files was successful, 1 otherwise */
int pdb_parse(char* pdb_filename, char* itp_filename);
int print_charge_field(char* filename);
int print_boundary_lattice(char* filename);
#endif
#endif
......@@ -46,7 +46,7 @@ void calc_long_range_energies();
void energy_calc(double *result)
{
if (!check_obs_calc_initialized())
if (!interactions_sanity_checks())
return;
init_energies(&energy);
......
......@@ -28,6 +28,8 @@ NPT
GHMC
CATALYTIC_REACTIONS
HARMONICFORCE requires CUDA
/* Rotation */
ROTATION
ROTATIONAL_INERTIA implies ROTATION
......@@ -133,6 +135,7 @@ OLD_DIHEDRAL notest
/* turn off nonbonded interactions within molecules */
NO_INTRA_NB notest
/* Debugging */
ADDITIONAL_CHECKS
ASYNC_BARRIER
......
......@@ -54,6 +54,7 @@
#include "iccp3m.hpp"
#include "p3m_gpu.hpp"
#include "cuda_interface.hpp"
#include "HarmonicForce.hpp"
#include "EspressoSystemInterface.hpp"
......@@ -72,9 +73,35 @@ void init_forces();
void force_calc()
{
// Communication step: distribute ghost positions
cells_update_ghosts();
// VIRTUAL_SITES pos (and vel for DPD) update for security reason !!!
#ifdef VIRTUAL_SITES
update_mol_vel_pos();
ghost_communicator(&cell_structure.update_ghost_pos_comm);
#endif
#if defined(VIRTUAL_SITES_RELATIVE) && defined(LB)
// This is on a workaround stage:
// When using virtual sites relative and LB at the same time, it is necessary
// to reassemble the cell lists after all position updates, also of virtual
// particles.
if ((lattice_switch & LATTICE_LB) && cell_structure.type == CELL_STRUCTURE_DOMDEC && (!dd.use_vList) )
cells_update_ghosts();
#endif
#ifdef COLLISION_DETECTION
prepare_collision_queue();
#endif
espressoSystemInterface.update();
#ifdef HARMONICFORCE
if(harmonicForce)
harmonicForce->calc(espressoSystemInterface);
#endif
#ifdef LB_GPU
#ifdef SHANCHEN
if (lattice_switch & LATTICE_LB_GPU && this_node == 0) lattice_boltzmann_calc_shanchen_gpu();
......@@ -82,7 +109,7 @@ void force_calc()
// transfer_momentum_gpu check makes sure the LB fluid doesn't get updated on integrate 0
// this_node==0 makes sure it is the master node where the gpu exists
if (lattice_switch & LATTICE_LB_GPU && transfer_momentum_gpu && this_node==0 ) lb_calc_particle_lattice_ia_gpu();
if (lattice_switch & LATTICE_LB_GPU && transfer_momentum_gpu && (this_node == 0) ) lb_calc_particle_lattice_ia_gpu();
#endif // LB_GPU
#ifdef ELECTROSTATICS
......@@ -146,10 +173,36 @@ void force_calc()
meta_perform();
#endif
#if defined(LB_GPU) || (defined(ELECTROSTATICS) && defined(CUDA))
#ifdef CUDA
copy_forces_from_GPU();
#endif
// VIRTUAL_SITES distribute forces
#ifdef VIRTUAL_SITES
ghost_communicator(&cell_structure.collect_ghost_force_comm);
init_forces_ghosts();
distribute_mol_force();
#endif
// Communication Step: ghost forces