you can’t take eight lawnmower engines, put them together and now claim you have an eight-cylinder Ferrari. –阿南德·钱德拉塞克(Anand Chandrasekher)高通副总裁
eMD用户手册–Python扩展
eMD一个重要的设计目标:形成一套适合分子动力学模拟的编程框架库。纵观现有的成熟MD软件:LAMMPS设计了一套特有的input脚本编程逻辑,但要熟练掌握并不容易,而且受限很大。NAMD可以识别Tcl脚本的配置选项,但所提供的编程逻辑很有限;GROMACS基于命令行工具集以便构建一套“编程”操作环境,但只限于计算模拟前后阶段处理,对计算模拟阶段缺少干预,而这种干预常常是需要的。eMD选择Python脚本语言作为构建编程框架库的基础,相比上述软件提供了更多的灵活操作,这也是新型MD软件如HoodMD、OpenMM等代表软件的选择趋势。
eMD的Python编程框架的几大优势: 1. 采用SWIG包装了底层C/C++代码框架,Python层接近C/C++层的计算速度 2. 与底层C/C++同一套API,提供Python的面向对象类和函数库,可以实现类的继承重载,控制更为细粒度 3. 可对接numpy库的ndarray数据结构,原子信息的数组无需复制可借用ndarray的强大操作 4. 可与MPI4Py库混用,eMD内部MPI通讯可以直接切换为MPI4Py的高级操作
eMD采用JSON格式的配置参数,在Python的使用模式中,封装了jsoncpp库,JSON的这种配置可以划分得很为精细,与Python的字典和列表等数据结构可以混用。因此JSON配置参数有静态和动态之分: 1. 静态方式:通过emdrun使用完整的配置方式,这种配置是静态的,面向用户无需通过参数配置来干预模拟的场合 2. 动态方式:使用eMD的Python模块,动态地配置JSON参数,即参数可以在运行时动态设置,适合对MD模拟过程需要很多自定义的场合。
动态配置的模拟可方便实现分子团作用力的特殊操作,如施加动态拖拉力,模拟参数提取,实现更高级抽样算法等等,这需要灵活的MD“编程”过程,eMD提供的Python模块完全支持这种扩展的使用场景。当前eMD提供的python3模块emd,支持预处理、计算模拟和后处理三大阶段的“编程”处理,并且支持MPI的并行化和GPU异构加速计算。eMD的Python动态模式和emdrun静态模式为MD用户构建一套完整的应用解决方案。
安装简介
eMD python模块编译依赖于Python3、Numpy和Swig工具,另外Python3的MPI4PY模块可选,但推荐先安装。需要常用的MPICXX编译器(如OpenMPI),且支持C++11;如果需编译GPU加速模块,则需要CUDA 8.0+或HIP 2.0+的SDK环境。
eMD python模块在编译时,会自动获取python3、numpy、mpi4py库路径;编译过程会调用swig生成emd_wrap.cpp包装文件,这个文件最后会编译成名为_emd.so的Python模块文件。
编译通过CMake方式,需要添加-DUSE_PYTHON=ON选项,进入eMD源代码根目录之后,一般编译过程为:
mkdir -p build && cd build
rm -rf *
cmake -L -DCMAKE_BUILD_TYPE=Debug -DUSE_PYTHON=ON -DCMAKE_INSTALL_PREFIX=${EMD_ROOT} ..
make -j2 VERBOSE=1
make install
${EMD_ROOT}变量指定安装目录。同时也可以组合GPU编译方式-DUSE_CUDA=ON或-DUSE_HIP=ON
mkdir -p build && cd build
rm -rf *
cmake -L -DCMAKE_BUILD_TYPE=Release -DUSE_PYTHON=ON -DCUDA_NVCC_FLAGS="-std=c++11;-Wno-deprecated-gpu-targets" \
-DUSE_CUDA=ON -DCMAKE_BUILD_TESTING=ON -DCMAKE_INSTALL_PREFIX=${EMD_ROOT} ..
make -j4 VERBOSE=1
make install
make install安装将emd的Python模块正确安装之后,就可以在任意目录运行测试实例:
使用简介
eMD的Python3模块名为emd,其中提供了多个实例文件,使用时推荐参考这些实例,这些实例覆盖大量应用场景。
- lj_from_file.py 动态创建完整的json文件,并基于该配置文件启动Lennard Jones模拟
- lj_from_json.py 使用eMD模块级的Json对象启动Lennard Jones模拟
- lj_from_func.py 使用eMD函数调用方式启动Lennard Jones模拟
- lj_from_class.py 通过构建eMD继承类调用方式启动Lennard Jones模拟
- lj_ave_time.py 在Lennard Jones模拟中基于原子tag和group 来抽样的实例
- lj_pair_one.py 获取Lennard Jones的两种原子类型i和j之间作用势的离散值列表
- lj_graph.py 使用eMD模块和Tkinter构建MD模拟过程的统计量图示化
- mpi_check.py 测试emd.CMPIComm和mpi4py.MPI.Intracomm的MPI通讯和混用模式
- jsoncpp_check.py 测试JSON参数与Python对象之间的互操作
- array_check.py 测试
std::vector<T>,emd::carray<T>和hup::ndvector<T>的C++模版类在Python中的使用,以及与numpy.ndarray的转换。eMD模块中大量使用了以上模版类。
导入emd库的方式有如:
在和MPI4PY混合使用时,考虑到MPI初始化依赖,MPI4PY需要先于emd导入
use_mpi4py = True
try:
from mpi4py import MPI
except ImportError:
use_mpi4py = False
if __package__ or '.' in __name__:
from .emd import *
else:
from emd import *
在EMD对象构建之前,需要生成ContextInitializer的对象,用于初始化运行环境:
from sys import argv
initializer = ContextInitializer(argv, CMPIComm.NeedInitialize())
rank = CMPIComm.getInstance().get_rank()
emd_obj = EMD()
if rank == 0:
emd_obj.help()
emd模块提供了几个测试实例,分别为mpi_check.py、jsoncpp_check.py和array_check.py。执行它们可以用于测试emd模块是否正确,也可做使用参考。
这里给一个较完整的例子(见emd/lj_ave_time.py),实现EMD类的继承,构建了MDVV类,定制化了run_step函数;模拟为Lennard Jones体系,可实现基于atom tag或group的部分原子抽样。以下为脚本内容:
#!/usr/bin/env python3
# -*- coding: UTF-8
help_str = """eMD python interface.
Run by:
export PYTHONPATH=$PYTHONPATH:`pwd`
$ mpirun -n 2 python3 emd/lj_ave_time.py
$ mpirun -n 2 python3 -m emd.lj_ave_time
----------------------------------------------------------------------
Shun Xu <xushun@cnic.cn>
"""
if __package__ or '.' in __name__:
from .emd import *
else:
from emd import *
import numpy as np
class PyMechanics(MDVV):
def __init__(self, ptr):
MDVV.__init__(self, ptr)
def init_all(self, nsteps):
self.init({
"run_steps": nsteps,
"timestep": 0.01,
"slots":{
"type" : "temp_berendsen",
"tstart" : 1.44,
"tstop" : 1.44,
"tau" : 0.12,
"seed" : 3434 }
})
self.pos_time = []
def run_step(self, emd_obj, timer):
self.first_step = 0
self.last_step = emd_obj.nsteps
self.end_step = self.timestep + emd_obj.nsteps
# usually set group after particles.setup()
# gid = new_std_string("wat")
s, gid = emd_obj.particles.group_list.assign_one({"type" : "atom_id",
"list" : [1], "id":"wat" },
emd_obj.comm.comm_world)
emd_obj.particles.group_list.print_state(emd_obj.comm.comm_world);
self.gbit = emd_obj.particles.group_list.group_bit(gid)
emd_obj.comm.comm_world.Barrier()
self.itimestep = self.first_step
while self.itimestep < self.last_step:
self.timestep += 1
self.ev_setup(self.timestep)
timer.stamp()
self.initial_integrate(self.v_req)
self.post_integrate()
timer.stamp(T_MECHANICS)
if emd_obj.force_field.neighbor_decide(self.timestep):
self.build_neighbor(timer)
else:
timer.stamp()
emd_obj.comm.sd_inform_ghosts(emd_obj.force_field.ghost_vel)
timer.stamp(T_COMM)
# calculate the new force with neighbor list
if not self.external_force_clear:
emd_obj.force_field.clear()
timer.stamp()
self.pre_force(self.v_req, 0)
timer.stamp(T_MECHANICS)
emd_obj.force_field.compute_pair(self.e_req, self.v_req)
timer.stamp(T_PAIR);
# force_field->compute=compute_bond+compute_pair
# separated here to do the timing for them
emd_obj.force_field.compute_bond(self.e_req, self.v_req)
timer.stamp(T_BOND);
emd_obj.force_field.compute_long(self.e_req, self.v_req)
timer.stamp(T_LONG);
if emd_obj.force_field.newton_flag:
emd_obj.comm.sd_reverse_ghosts()
timer.stamp(T_COMM)
self.post_force(self.v_req, 0)
self.final_integrate()
self.end_of_step()
timer.stamp(T_MECHANICS)
emd_obj.sample.write(self.timestep)
timer.stamp(T_OUTPUT)
# self.sample_by_tag(emd_obj)
self.sample_by_group(emd_obj)
self.itimestep += 1
# end run nsteps
emd_obj.particles.lost_check(emd_obj.comm.comm_world)
return 0
def sample_by_tag(self, emd_obj):
atom_vec = emd_obj.particles.atom_vec
nlocal = atom_vec.natom_local
atom_tag = 0 # sample the first atom by tag(id)=0 (in json file tag id=1)
i = atom_vec.tagmap_index(atom_tag)
if i >= 0 and i < nlocal:
# print(atom_vec.pos[i])
# pos_np = atom_vec.pos.as_array2d() # return numpy.ndarray
# print(pos_np[i, :])
# print(pos_np[i, 0], pos_np[i, 1], pos_np[i, 2])
self.pos_time.append([atom_vec.pos[i].x, atom_vec.pos[i].y, atom_vec.pos[i].z])
if self.itimestep + 1 == self.last_step:
self.sample_reduce(emd_obj)
def sample_by_group(self, emd_obj):
atom_vec = emd_obj.particles.atom_vec
lst = vector_int()
emd_obj.particles.group_list.group_index(self.gbit, lst)
if lst.size():
id_lst = lst.as_array() # return numpy.ndarray
# print(id_lst)
pos_np = atom_vec.pos.as_array2d()
# save dtype for real or use hup_real_dtype
# self.real_dtype = pos_np.dtype
# print(self.real_dtype)
xmean = pos_np[id_lst, :].mean(axis=0)
# print(xmean)
self.pos_time.append(xmean)
if self.itimestep + 1 == self.last_step:
self.sample_reduce(emd_obj)
def sample_reduce(self, emd_obj):
nsize = np.zeros(1, np.int32)
if len(self.pos_time):
pos_time = np.asarray(self.pos_time, dtype=hup_real_dtype)
nsize[0] = 1
else:
pos_time = np.zeros([1, 3], dtype=hup_real_dtype)
pos_time_mean = pos_time.mean(axis=0)
print(pos_time_mean, pos_time.std(axis=0))
emd_obj.comm.comm_world.NPY_Allreduce(nsize, HUP_MPI_SUM)
emd_obj.comm.comm_world.NPY_Allreduce(pos_time_mean, HUP_MPI_SUM)
# print(nsize[0], nsize.dtype)
if nsize[0] > 0:
print(pos_time_mean / nsize[0])
class PyEMD(EMD):
def __init__(self):
EMD.__init__(self)
# set self variables
self.context = self.get_context()
self.comm = self.get_comm()
self.particles = self.get_particles()
self.sample = self.get_sample()
def init_all(self):
# initiate frist context and comm
self.context.init(Value())
self.comm.init(Value())
# get rank after comm.init
self.mpi_rank = self.comm.comm_world.get_rank()
self.mpi_size = self.comm.comm_world.get_size()
# get logfile after context.init
self.logfile = self.context.get_logfile()
self.particles.lattice_list.append({
"units" : "lj",
"type" : "fcc",
"scale" : 0.8442 }
)
self.particles.region_list.append({
"type" : "block",
"args" : [ 0, 4, 0, 4, 0, 4 ]},
self.particles.domain
)
# region id=1, lattice id=1
self.particles.init({
"position" : {
"dim" : 3,
"boundary" : ["p", "p", "p" ] ,
"create_atoms" : {
"region" : 1,
"lattice" :1,
"basis_types" : [ 1, 1, 2, 2]
}
}
,
"topology" : {
"atom_type" :[
{
"type" : "Kr",
"mass" : 1.0 }
,
[
"Ar",
1.0
]
]
},
"velocity":{
"type" : "create",
"rndseed" : 87287,
"temp" : 1.44,
"dist" : "uniform",
"loop" : "geom" }
})
self.set_force_field("lammps")
self.force_field = self.get_force_field()
self.force_field.init({
"version" : 1.0,
"units" : "lj",
"newton": True,
"ghost_vel": False,
"pair":{
"type" : "lj_cut",
"cutoff" : 2.5,
"shift" : 0,
"tail" : 0 ,
"coeff" : [
{
"itype" : 1,
"jtype" : 1,
"epsilon" : 1,
"sigma" : 1 }
,
[
2, 2, 1, 1
]
]
},
"neighbor":{
"every_step" : 2,
"skin" : 0.3,
"pair_search" : "bin" }
}
)
self.force_field.print_summary(self.logfile)
self.mdvv = PyMechanics(self)
self.set_mechanics_shared(self.mdvv.get_shared_from_this())
self.nsteps = 2000
self.mdvv.init_all(self.nsteps)
self.sample.init_statis([
{
"type" : "force",
"every_step" : 100,
"fields" : [
"temp",
"epot",
"ekin",
"etotal",
"press"
] }
,
{
"type" : "domain",
"every_step" : 100,
"start_step" : 20,
"fields" : [
"vol",
"density",
"lx"
] }
]
)
self.sample.init_trajectory({
"every": 10,
"sort_tag":1,
"file": "traj_out.pdb"}
)
def run_all(self):
timer = self.sample.timer
time_start = timer.stamp_local()
time_cputime = timer.cpu_time()
Context.log_puts("MD velocity verlet running ...\n")
s = self.mdvv.run_step(self, timer)
timer.stamp(T_TOTAL, time_start)
time_end = timer.cpu_time() - time_cputime
timer.add_elapsed(T_CPU_TIME, time_end)
return s
if __name__ == '__main__':
from sys import argv
initializer = ContextInitializer(argv, CMPIComm.NeedInitialize())
if Comm.is_master_rank():
EMD.print_config()
emd_obj = PyEMD()
emd_obj.init_all()
emd_obj.setup();
i = emd_obj.run_all();
emd_obj.final(i);
最后修改: 2024年9月7日,