
Building a Machine-Learning Force Field with DeePMD-kit and DP-GEN

Keywords: DeePMD-kit, DP-GEN, first-principles calculations, neural networks, molecular dynamics, machine-learning force fields

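The underlying theory and the details of first-principles calculations are omitted here; this article covers only the practical setup.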

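Building a machine-learning force field requires training data that pairs a large number of structures with their energies. Generating those coordinates with a home-grown implementation is difficult, so I use a tool called DP-GEN, which explores structures and labels them efficiently.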

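DP-GEN broadly performs three steps.

  1. Training: train machine-learning force fields on the initial structures.

  2. Molecular dynamics and model deviation: run molecular dynamics with force fields trained from different seeds and compute the deviation between their predictions. A large deviation means the models have not converged for that structure, i.e. it is data that should be added to the dataset, so it is passed on to the first-principles calculation in step 3.

  3. First-principles calculation: run first-principles calculations with VASP, Gaussian, Quantum ESPRESSO (QE), or similar.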
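As a rough illustration of step 2, the sketch below computes the maximum per-atom force deviation across an ensemble of models for one frame and sorts the frame into accurate, candidate, or failed. The function names and the toy forces are hypothetical. The default thresholds are the model_devi_f_trust_lo and model_devi_f_trust_hi values from the param.yaml in this article, and the deviation formula and the boundary handling reflect my understanding of the method, not code taken from DP-GEN.

```python
import math

def max_force_deviation(forces_per_model):
    """forces_per_model[m][i] is the (fx, fy, fz) force on atom i from model m."""
    n_models = len(forces_per_model)
    n_atoms = len(forces_per_model[0])
    worst = 0.0
    for i in range(n_atoms):
        # Ensemble-mean force on atom i.
        mean = [sum(forces_per_model[m][i][k] for m in range(n_models)) / n_models
                for k in range(3)]
        # Mean squared distance of each model's force from that mean.
        var = sum(
            sum((forces_per_model[m][i][k] - mean[k]) ** 2 for k in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(var))
    return worst

def classify_frame(deviation, trust_lo=0.05, trust_hi=0.15):
    """Assumed binning: below trust_lo is accurate, at or above trust_hi is failed."""
    if deviation < trust_lo:
        return "accurate"
    if deviation < trust_hi:
        return "candidate"
    return "failed"

# Toy frame: 4 models, 2 atoms; the models disagree only on fx of atom 0.
frame = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
deviation = max_force_deviation(frame)
label = classify_frame(deviation)
print(deviation, label)
```

Only frames labelled candidate would be sent on to step 3 in this picture; accurate frames add little new information, and failed frames are likely unphysical.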

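DP-GEN repeats these three steps to explore and label structures efficiently. The setup procedure is as follows.

  1. Prepare a few hundred initial structures, either by hand or with DP-GEN's init feature, and label them with first-principles calculations.

  2. Load the finished calculations with dpdata and convert them to the numpy format that DeePMD-kit reads.

  3. Point the DP-GEN parameter file (param.json/yaml) at the initial data prepared in step 2.

  4. Set the remaining DP-GEN parameters appropriately: the DeePMD-kit training parameters, the deviation range, the number of first-principles calculations per iteration, and so on.

  5. Describe the compute resources in machine.json/yaml.

  6. Launch it with dpgen run param.json machine.json and leave it running.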

param.yaml

type_map:
  - Na
init_data_prefix: ''
init_data_sys:
  - /home/hogehoge/Documents/DP-GEN_Sodium/GEN_1/data/Na7/
sys_format: lammps/lmp
sys_configs:
  - - /home/hogehoge/Documents/DP-GEN_Sodium/GEN_1/sys_configs/*.lmp
numb_models: 4
default_training_param:
  model:
    type_map:
    - Na
    data_stat_nbatch: 1
    type: standard
    descriptor:
      type: se_e2_a
      activation_function: relu
      sel:
      - 60
      rcut: 6.0
      rcut_smth: 1.8
      neuron:
      - 25
      - 50
      - 100
      axis_neuron: 8
      resnet_dt: false
      precision: float64
      seed: 1
    fitting_net:
      type: ener
      neuron:
      - 240
      - 240
      - 240
      precision: float64
      resnet_dt: true
      seed: 1
  learning_rate:
    type: exp
    start_lr: 0.001
    stop_lr: 1.0e-08
    decay_steps: 5000
  loss:
    type: ener
    start_pref_e: 0.02
    limit_pref_e: 1
    start_pref_f: 1000
    limit_pref_f: 1
    start_pref_v: 0
    limit_pref_v: 0
  training:
    numb_steps: 1000000
    seed: 1
    disp_file: lcurve.out
    disp_freq: 100
    save_freq: 1000
    save_ckpt: model.ckpt
    disp_training: true
    time_training: true
    profiling: false
    profiling_file: timeline.json
model_devi_engine: lammps
model_devi_jobs:
  - sys_idx:
      - 0
    temps:
      - 300
    press:
      - 1
    nsteps: 1000
    ensemble: nvt
    trj_freq: 10
    _idx: "00"
  - sys_idx:
      - 0
    temps:
      - 300
    press:
      - 1
    nsteps: 1000
    ensemble: nvt
    trj_freq: 10
    _idx: "01"
  - sys_idx:
      - 0
    temps:
      - 300
    press:
      - 1
    nsteps: 1000
    ensemble: nvt
    trj_freq: 10
    _idx: "02"
  - sys_idx:
      - 0
    temps:
      - 300
    press:
      - 1
    nsteps: 1000
    ensemble: nvt
    trj_freq: 10
    _idx: "03"
  - sys_idx:
      - 0
    temps:
      - 300
    press:
      - 1
    nsteps: 1000
    ensemble: nvt
    trj_freq: 10
    _idx: "04"
model_devi_dt: 0.002
model_devi_skip: 0
model_devi_f_trust_lo: 0.05
model_devi_f_trust_hi: 0.15
model_devi_clean_traj: false
model_devi_merge_traj: false
model_devi_nopbc: true
ratio_failed: 0.6
fp_style: gaussian
fp_task_max: 5000
fp_task_min: 1
use_clusters: false
fp_params:
  keywords: P B3LYP/6-311+g(d) force
  nproc: 15
  charge: 0

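init_data_sys specifies the initial dataset. sys_configs specifies the structures used as starting configurations for molecular dynamics. sys_configs did not support the numpy format, so I converted the structures to LAMMPS input files with dpdata and selected all of them with a wildcard pattern. sys_idx gives the index into sys_configs. Everything matched by the wildcard is grouped into a single entry, so every job here specifies 0.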
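To illustrate why every sys_idx is 0 here: each entry of sys_configs is a list of patterns, and all files matched within one entry count as a single system. The sketch below mimics that grouping with the standard fnmatch module. The file names are made up and group_systems is a hypothetical helper that only illustrates the behaviour described above; it is not DP-GEN code.

```python
from fnmatch import fnmatch

def group_systems(sys_configs, available_files):
    """Return {system index: sorted matched files}, one index per sys_configs entry."""
    systems = {}
    for idx, patterns in enumerate(sys_configs):
        matched = [f for f in available_files
                   if any(fnmatch(f, p) for p in patterns)]
        systems[idx] = sorted(matched)
    return systems

# Hypothetical directory listing.
files = [
    "sys_configs/Na7_000.lmp",
    "sys_configs/Na7_001.lmp",
    "sys_configs/notes.txt",
]
# Same shape as the YAML: a list containing one list with one pattern.
sys_configs = [["sys_configs/*.lmp"]]

systems = group_systems(sys_configs, files)
print(systems)
```

With a single entry there is only system 0, however many files the pattern matches. Adding a second entry to sys_configs would create system 1, which a job could then select with sys_idx.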

machine.yaml

train:
  command: dp
  machine:
    batch_type: Shell
    context_type: local
    local_root: ./
    remote_root: /home/hogehoge/Documents/DP-GEN_Sodium/GEN_1
  resources:
    cpu_per_node: 14
    gpu_per_node: 1
    group_size: 4
    batch_type: Shell
model_devi:
  command: lmp_mpi
  machine:
    batch_type: Shell
    local_root: ./
    remote_root: /home/hogehoge/Documents/DP-GEN_Sodium/GEN_1
    context_type: local
  resources:
    number_node: 1
    cpu_per_node: 14
    gpu_per_node: 1
    group_size: 0
    batch_type: Shell
fp:
  command: "g16 -m=14GB < input > output || :"
  machine:
    batch_type: Shell
    local_root: ./
    remote_root: /home/hogehoge/Documents/DP-GEN_Sodium/GEN_1
    context_type: local
  resources:
    #group_size  must be 0 for one-by-one processing
    group_size: 0
    number_node: 1
    gpu_per_node: 0
    cpu_per_node: 15
    batch_type: Shell

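To run locally, set batch_type to Shell, local_root to ./, and context_type to local. remote_root is the directory in which the DP-GEN tasks are actually executed. The important setting is group_size: unless it is 0, all generated structures are computed at once and resources are exhausted, so be sure to set it to 0. With 0, tasks run one at a time. gpu_per_node sets the number of GPUs to use; it is 0 for fp because the Gaussian first-principles calculations do not use a GPU. When Gaussian hits a structure that does not converge, it causes a segmentation fault and DP-GEN is forcibly terminated along with it, so "|| :" must be appended to the command so that it always returns 0 (see the link below).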
https://docs.deepmodeling.com/projects/dpdispatcher/en/latest/examples/g16.html
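The effect of appending "|| :" can be checked in isolation. The sketch below uses Python's subprocess module with the shell command false as a stand-in for a failing g16 run. That stand-in is an assumption for illustration, and the example needs a POSIX shell.

```python
import subprocess

# "false" always exits with a non-zero status, like a crashed calculation.
plain = subprocess.run("false", shell=True)

# ":" is the shell no-op that always succeeds, so the whole line exits with 0.
guarded = subprocess.run("false || :", shell=True)

print(plain.returncode)    # non-zero
print(guarded.returncode)  # 0
```

Because the guarded command always reports success, a failed calculation has to be detected from its output instead, which is what the failed tasks line in the log reflects.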

With the parameters set like this, a run produces a log like the following.

2023-12-02 23:20:57,222 - INFO : start running
2023-12-02 23:20:57,238 - INFO : =============================iter.000000==============================
2023-12-02 23:20:57,238 - INFO : -------------------------iter.000000 task 00--------------------------
2023-12-02 23:20:57,240 - INFO : -------------------------iter.000000 task 01--------------------------
2023-12-03 07:34:25,916 - INFO : -------------------------iter.000000 task 02--------------------------
2023-12-03 07:34:25,916 - INFO : -------------------------iter.000000 task 03--------------------------
2023-12-03 07:34:26,382 - INFO : -------------------------iter.000000 task 04--------------------------
2023-12-03 11:37:42,041 - INFO : -------------------------iter.000000 task 05--------------------------
2023-12-03 11:37:42,047 - INFO : -------------------------iter.000000 task 06--------------------------
2023-12-03 11:37:44,228 - INFO : system 000 candidate : 789221 in 835835  94.42 %
2023-12-03 11:37:44,228 - INFO : system 000 failed    :  45378 in 835835   5.43 %
2023-12-03 11:37:44,228 - INFO : system 000 accurate  :   1236 in 835835   0.15 %
2023-12-03 11:37:45,105 - INFO : system 000 accurate_ratio:   0.0015    thresholds: 1.0000 and 1.0000   eff. task min and max   -1 3000   number of fp tasks:   3000
2023-12-03 11:37:48,369 - INFO : -------------------------iter.000000 task 07--------------------------
2023-12-04 05:46:50,925 - INFO : -------------------------iter.000000 task 08--------------------------
2023-12-04 05:46:52,613 - INFO : failed tasks:     25 in   3000    0.83 % 
2023-12-04 05:46:52,614 - INFO : =============================iter.000001==============================
2023-12-04 05:46:52,614 - INFO : -------------------------iter.000001 task 00--------------------------
2023-12-04 05:46:52,624 - INFO : -------------------------iter.000001 task 01--------------------------
2023-12-04 14:01:19,907 - INFO : -------------------------iter.000001 task 02--------------------------
2023-12-04 14:01:19,907 - INFO : -------------------------iter.000001 task 03--------------------------
2023-12-04 14:33:48,472 - INFO : start running
2023-12-04 14:33:48,488 - INFO : continue from iter 001 task 02
2023-12-04 14:33:48,489 - INFO : =============================iter.000000==============================
2023-12-04 14:33:48,489 - INFO : =============================iter.000001==============================
2023-12-04 14:33:48,489 - INFO : -------------------------iter.000001 task 03--------------------------
2023-12-04 14:33:48,908 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-04 20:42:57,157 - INFO : start running
2023-12-04 20:42:57,175 - INFO : continue from iter 001 task 03
2023-12-04 20:42:57,175 - INFO : =============================iter.000000==============================
2023-12-04 20:42:57,175 - INFO : =============================iter.000001==============================
2023-12-04 20:42:57,175 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-04 20:47:33,092 - INFO : start running
2023-12-04 20:47:33,108 - INFO : continue from iter 001 task 03
2023-12-04 20:47:33,109 - INFO : =============================iter.000000==============================
2023-12-04 20:47:33,109 - INFO : =============================iter.000001==============================
2023-12-04 20:47:33,109 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-04 20:50:07,491 - INFO : start running
2023-12-04 20:50:07,508 - INFO : continue from iter 001 task 03
2023-12-04 20:50:07,508 - INFO : =============================iter.000000==============================
2023-12-04 20:50:07,508 - INFO : =============================iter.000001==============================
2023-12-04 20:50:07,508 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-04 20:53:31,239 - INFO : start running
2023-12-04 20:53:31,256 - INFO : continue from iter 001 task 04
2023-12-04 20:53:31,256 - INFO : =============================iter.000000==============================
2023-12-04 20:53:31,256 - INFO : =============================iter.000001==============================
2023-12-04 20:53:31,256 - INFO : -------------------------iter.000001 task 05--------------------------
2023-12-04 20:53:31,256 - INFO : -------------------------iter.000001 task 06--------------------------
2023-12-04 20:54:23,713 - INFO : start running
2023-12-04 20:54:23,729 - INFO : continue from iter 001 task 03
2023-12-04 20:54:23,730 - INFO : =============================iter.000000==============================
2023-12-04 20:54:23,730 - INFO : =============================iter.000001==============================
2023-12-04 20:54:23,730 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-04 20:55:46,467 - INFO : start running
2023-12-04 20:55:46,484 - INFO : continue from iter 001 task 03
2023-12-04 20:55:46,484 - INFO : =============================iter.000000==============================
2023-12-04 20:55:46,484 - INFO : =============================iter.000001==============================
2023-12-04 20:55:46,484 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:12:44,441 - INFO : start running
2023-12-05 09:12:44,457 - INFO : continue from iter 001 task 03
2023-12-05 09:12:44,457 - INFO : =============================iter.000000==============================
2023-12-05 09:12:44,458 - INFO : =============================iter.000001==============================
2023-12-05 09:12:44,458 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:17:44,573 - INFO : start running
2023-12-05 09:17:44,590 - INFO : continue from iter 001 task 03
2023-12-05 09:17:44,590 - INFO : =============================iter.000000==============================
2023-12-05 09:17:44,590 - INFO : =============================iter.000001==============================
2023-12-05 09:17:44,590 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:19:27,871 - INFO : start running
2023-12-05 09:19:27,887 - INFO : continue from iter 001 task 03
2023-12-05 09:19:27,887 - INFO : =============================iter.000000==============================
2023-12-05 09:19:27,887 - INFO : =============================iter.000001==============================
2023-12-05 09:19:27,887 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:25:43,983 - INFO : start running
2023-12-05 09:25:43,999 - INFO : continue from iter 001 task 03
2023-12-05 09:25:43,999 - INFO : =============================iter.000000==============================
2023-12-05 09:25:43,999 - INFO : =============================iter.000001==============================
2023-12-05 09:25:44,000 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:28:08,806 - INFO : start running
2023-12-05 09:28:08,823 - INFO : continue from iter 001 task 03
2023-12-05 09:28:08,823 - INFO : =============================iter.000000==============================
2023-12-05 09:28:08,823 - INFO : =============================iter.000001==============================
2023-12-05 09:28:08,823 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:29:50,583 - INFO : start running
2023-12-05 09:29:50,600 - INFO : continue from iter 001 task 03
2023-12-05 09:29:50,600 - INFO : =============================iter.000000==============================
2023-12-05 09:29:50,600 - INFO : =============================iter.000001==============================
2023-12-05 09:29:50,600 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:30:12,746 - INFO : start running
2023-12-05 09:30:12,762 - INFO : continue from iter 001 task 03
2023-12-05 09:30:12,762 - INFO : =============================iter.000000==============================
2023-12-05 09:30:12,762 - INFO : =============================iter.000001==============================
2023-12-05 09:30:12,763 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:31:06,345 - INFO : start running
2023-12-05 09:31:06,361 - INFO : continue from iter 001 task 03
2023-12-05 09:31:06,361 - INFO : =============================iter.000000==============================
2023-12-05 09:31:06,361 - INFO : =============================iter.000001==============================
2023-12-05 09:31:06,361 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:32:08,480 - INFO : start running
2023-12-05 09:32:08,496 - INFO : continue from iter 001 task 03
2023-12-05 09:32:08,497 - INFO : =============================iter.000000==============================
2023-12-05 09:32:08,497 - INFO : =============================iter.000001==============================
2023-12-05 09:32:08,497 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:34:45,741 - INFO : start running
2023-12-05 09:34:45,758 - INFO : continue from iter 001 task 03
2023-12-05 09:34:45,758 - INFO : =============================iter.000000==============================
2023-12-05 09:34:45,758 - INFO : =============================iter.000001==============================
2023-12-05 09:34:45,758 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:35:23,785 - INFO : start running
2023-12-05 09:35:23,801 - INFO : continue from iter 001 task 03
2023-12-05 09:35:23,801 - INFO : =============================iter.000000==============================
2023-12-05 09:35:23,802 - INFO : =============================iter.000001==============================
2023-12-05 09:35:23,802 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:35:36,729 - INFO : start running
2023-12-05 09:35:36,745 - INFO : continue from iter 001 task 03
2023-12-05 09:35:36,745 - INFO : =============================iter.000000==============================
2023-12-05 09:35:36,745 - INFO : =============================iter.000001==============================
2023-12-05 09:35:36,745 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:37:26,621 - INFO : start running
2023-12-05 09:37:26,637 - INFO : continue from iter 001 task 03
2023-12-05 09:37:26,637 - INFO : =============================iter.000000==============================
2023-12-05 09:37:26,638 - INFO : =============================iter.000001==============================
2023-12-05 09:37:26,638 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:37:46,349 - INFO : start running
2023-12-05 09:37:46,365 - INFO : continue from iter 001 task 03
2023-12-05 09:37:46,365 - INFO : =============================iter.000000==============================
2023-12-05 09:37:46,365 - INFO : =============================iter.000001==============================
2023-12-05 09:37:46,365 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:44:53,742 - INFO : start running
2023-12-05 09:44:53,759 - INFO : continue from iter 001 task 03
2023-12-05 09:44:53,759 - INFO : =============================iter.000000==============================
2023-12-05 09:44:53,759 - INFO : =============================iter.000001==============================
2023-12-05 09:44:53,759 - INFO : -------------------------iter.000001 task 04--------------------------
2023-12-05 09:44:55,693 - INFO : -------------------------iter.000001 task 05--------------------------
2023-12-05 09:44:55,693 - INFO : -------------------------iter.000001 task 06--------------------------
2023-12-05 09:45:31,573 - INFO : start running
2023-12-05 09:45:31,590 - INFO : continue from iter 001 task 05
2023-12-05 09:45:31,590 - INFO : =============================iter.000000==============================
2023-12-05 09:45:31,590 - INFO : =============================iter.000001==============================
2023-12-05 09:45:31,590 - INFO : -------------------------iter.000001 task 06--------------------------
2023-12-05 09:45:34,119 - INFO : system 000 candidate :  46382 in 834834   5.56 %
2023-12-05 09:45:34,119 - INFO : system 000 failed    :     17 in 834834   0.00 %
2023-12-05 09:45:34,119 - INFO : system 000 accurate  : 788435 in 834834  94.44 %
2023-12-05 09:45:34,854 - INFO : system 000 accurate_ratio:   0.9444    thresholds: 1.0000 and 1.0000   eff. task min and max   -1 5000   number of fp tasks:   5000
2023-12-05 09:45:40,740 - INFO : -------------------------iter.000001 task 07--------------------------
2023-12-06 01:27:02,777 - INFO : -------------------------iter.000001 task 08--------------------------
2023-12-06 01:27:05,109 - INFO : failed tasks:   2520 in   5000   50.40 % 
2023-12-06 08:18:52,201 - INFO : start running
2023-12-06 08:18:52,218 - INFO : continue from iter 001 task 07
2023-12-06 08:18:52,218 - INFO : =============================iter.000000==============================
2023-12-06 08:18:52,218 - INFO : =============================iter.000001==============================
2023-12-06 08:18:52,218 - INFO : -------------------------iter.000001 task 08--------------------------
2023-12-06 08:18:53,636 - INFO : failed tasks:   2520 in   5000   50.40 % 
2023-12-06 08:19:30,166 - INFO : start running
2023-12-06 08:19:30,183 - INFO : continue from iter 001 task 07
2023-12-06 08:19:30,183 - INFO : =============================iter.000000==============================
2023-12-06 08:19:30,183 - INFO : =============================iter.000001==============================
2023-12-06 08:19:30,183 - INFO : -------------------------iter.000001 task 08--------------------------
2023-12-06 08:19:31,575 - INFO : failed tasks:   2520 in   5000   50.40 % 
2023-12-06 08:19:31,576 - INFO : =============================iter.000002==============================
2023-12-06 08:19:31,576 - INFO : -------------------------iter.000002 task 00--------------------------
2023-12-06 08:19:31,595 - INFO : -------------------------iter.000002 task 01--------------------------
2023-12-06 10:23:39,194 - INFO : -------------------------iter.000002 task 02--------------------------
2023-12-06 10:23:39,194 - INFO : -------------------------iter.000002 task 03--------------------------
2023-12-06 10:23:39,630 - INFO : -------------------------iter.000002 task 04--------------------------
2023-12-06 12:12:40,783 - INFO : start running
2023-12-06 12:12:40,801 - INFO : continue from iter 002 task 03
2023-12-06 12:12:40,801 - INFO : =============================iter.000000==============================
2023-12-06 12:12:40,801 - INFO : =============================iter.000001==============================
2023-12-06 12:12:40,801 - INFO : =============================iter.000002==============================
2023-12-06 12:12:40,801 - INFO : -------------------------iter.000002 task 04--------------------------
2023-12-06 12:15:50,447 - INFO : start running
2023-12-06 12:15:50,464 - INFO : continue from iter 002 task 00
2023-12-06 12:15:50,464 - INFO : =============================iter.000000==============================
2023-12-06 12:15:50,464 - INFO : =============================iter.000001==============================
2023-12-06 12:15:50,464 - INFO : =============================iter.000002==============================
2023-12-06 12:15:50,464 - INFO : -------------------------iter.000002 task 01--------------------------
2023-12-06 12:16:05,652 - INFO : start running
2023-12-06 12:16:05,669 - INFO : continue from iter 001 task 08
2023-12-06 12:16:05,669 - INFO : =============================iter.000000==============================
2023-12-06 12:16:05,670 - INFO : =============================iter.000001==============================
2023-12-06 12:16:05,670 - INFO : =============================iter.000002==============================
2023-12-06 12:16:05,670 - INFO : -------------------------iter.000002 task 00--------------------------
2023-12-06 12:16:05,687 - INFO : -------------------------iter.000002 task 01--------------------------
2023-12-06 20:32:03,135 - INFO : -------------------------iter.000002 task 02--------------------------
2023-12-06 20:32:03,135 - INFO : -------------------------iter.000002 task 03--------------------------
2023-12-06 20:32:03,554 - INFO : -------------------------iter.000002 task 04--------------------------
2023-12-06 20:58:06,714 - INFO : -------------------------iter.000002 task 05--------------------------
2023-12-06 20:58:06,714 - INFO : -------------------------iter.000002 task 06--------------------------
2023-12-06 21:18:01,152 - INFO : start running
2023-12-06 21:18:01,169 - INFO : continue from iter 002 task 05
2023-12-06 21:18:01,169 - INFO : =============================iter.000000==============================
2023-12-06 21:18:01,169 - INFO : =============================iter.000001==============================
2023-12-06 21:18:01,169 - INFO : =============================iter.000002==============================
2023-12-06 21:18:01,169 - INFO : -------------------------iter.000002 task 06--------------------------
2023-12-06 21:18:32,789 - INFO : start running
2023-12-06 21:18:32,806 - INFO : continue from iter 002 task 03
2023-12-06 21:18:32,806 - INFO : =============================iter.000000==============================
2023-12-06 21:18:32,806 - INFO : =============================iter.000001==============================
2023-12-06 21:18:32,806 - INFO : =============================iter.000002==============================
2023-12-06 21:18:32,806 - INFO : -------------------------iter.000002 task 04--------------------------
2023-12-06 21:18:52,272 - INFO : start running
2023-12-06 21:18:52,290 - INFO : continue from iter 002 task 02
2023-12-06 21:18:52,290 - INFO : =============================iter.000000==============================
2023-12-06 21:18:52,290 - INFO : =============================iter.000001==============================
2023-12-06 21:18:52,290 - INFO : =============================iter.000002==============================
2023-12-06 21:18:52,290 - INFO : -------------------------iter.000002 task 03--------------------------
2023-12-06 21:18:52,707 - INFO : -------------------------iter.000002 task 04--------------------------
2023-12-06 21:28:39,073 - INFO : start running
2023-12-06 21:28:39,130 - INFO : continue from iter 002 task 03
2023-12-06 21:28:39,130 - INFO : =============================iter.000000==============================
2023-12-06 21:28:39,130 - INFO : =============================iter.000001==============================
2023-12-06 21:28:39,130 - INFO : =============================iter.000002==============================
2023-12-06 21:28:39,130 - INFO : -------------------------iter.000002 task 04--------------------------
2023-12-07 01:39:24,155 - INFO : -------------------------iter.000002 task 05--------------------------
2023-12-07 01:39:24,155 - INFO : -------------------------iter.000002 task 06--------------------------
2023-12-07 01:39:26,564 - INFO : system 000 candidate :   5782 in 835835   0.69 %
2023-12-07 01:39:26,564 - INFO : system 000 failed    :      4 in 835835   0.00 %
2023-12-07 01:39:26,565 - INFO : system 000 accurate  : 830049 in 835835  99.31 %
2023-12-07 01:39:27,317 - INFO : system 000 accurate_ratio:   0.9931    thresholds: 1.0000 and 1.0000   eff. task min and max   -1 5000   number of fp tasks:   5000
2023-12-07 01:39:32,334 - INFO : -------------------------iter.000002 task 07--------------------------
2023-12-08 07:55:16,168 - INFO : -------------------------iter.000002 task 08--------------------------
2023-12-08 07:55:18,969 - INFO : failed tasks:     39 in   5000    0.78 % 
2023-12-08 07:55:18,969 - INFO : =============================iter.000003==============================
2023-12-08 07:55:18,969 - INFO : -------------------------iter.000003 task 00--------------------------
2023-12-08 07:55:19,003 - INFO : -------------------------iter.000003 task 01--------------------------
2023-12-08 16:11:17,235 - INFO : -------------------------iter.000003 task 02--------------------------
2023-12-08 16:11:17,235 - INFO : -------------------------iter.000003 task 03--------------------------
2023-12-08 16:11:17,648 - INFO : -------------------------iter.000003 task 04--------------------------
2023-12-08 16:55:21,573 - INFO : -------------------------iter.000003 task 05--------------------------
2023-12-08 16:55:21,573 - INFO : -------------------------iter.000003 task 06--------------------------
2023-12-08 16:55:21,932 - INFO : system 000 candidate :    533 in  84335   0.63 %
2023-12-08 16:55:21,932 - INFO : system 000 failed    :      2 in  84335   0.00 %
2023-12-08 16:55:21,932 - INFO : system 000 accurate  :  83800 in  84335  99.37 %
2023-12-08 16:55:21,992 - INFO : system 000 accurate_ratio:   0.9937    thresholds: 1.0000 and 1.0000   eff. task min and max   -1 5000   number of fp tasks:    533
2023-12-08 16:55:22,519 - INFO : -------------------------iter.000003 task 07--------------------------
2023-12-08 20:25:16,609 - INFO : -------------------------iter.000003 task 08--------------------------
2023-12-08 20:25:16,909 - INFO : failed tasks:      3 in    533    0.56 % 
2023-12-08 20:25:16,910 - INFO : =============================iter.000004==============================
2023-12-08 20:25:16,910 - INFO : -------------------------iter.000004 task 00--------------------------
2023-12-08 20:25:16,931 - INFO : -------------------------iter.000004 task 01--------------------------

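The important lines are the ones like system 000 accurate : 1236 in 835835 0.15 %. They show how well the models (four here) agree with one another on the frames sampled during molecular dynamics. As this fraction grows, the dataset can be considered reasonably complete. Around 10,000 structures seems to be enough for things to work.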
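To follow this fraction across iterations without reading the whole log, the accurate lines can be pulled out with a regular expression. This is a minimal sketch: accurate_ratios is a hypothetical helper, the pattern is written against the log format shown above, and the sample lines are copied from that log.

```python
import re

# Matches "system 000 accurate  :   1236 in 835835   0.15 %"
# but not the "accurate_ratio:" summary line.
ACCURATE = re.compile(r"system (\d+) accurate\s*:\s*(\d+) in\s*(\d+)\s+([\d.]+) %")

def accurate_ratios(log_lines):
    """Return (system, accurate frames, total frames, percent) for each match."""
    results = []
    for line in log_lines:
        m = ACCURATE.search(line)
        if m:
            results.append(
                (m.group(1), int(m.group(2)), int(m.group(3)), float(m.group(4)))
            )
    return results

sample = [
    "2023-12-03 11:37:44,228 - INFO : system 000 accurate  :   1236 in 835835   0.15 %",
    "2023-12-05 09:45:34,119 - INFO : system 000 accurate  : 788435 in 834834  94.44 %",
    "2023-12-05 09:45:34,854 - INFO : system 000 accurate_ratio:   0.9444    "
    "thresholds: 1.0000 and 1.0000   eff. task min and max   -1 5000   "
    "number of fp tasks:   5000",
]

ratios = accurate_ratios(sample)
for system, accurate, total, percent in ratios:
    print(system, accurate, total, percent)
```

Applied to the full log above, this would give one entry per completed model-deviation step, making the jump from 0.15 % to over 99 % easy to see.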

As the iterations proceed, datasets accumulate across multiple directories, so they need to be gathered into one place. cd into the directory where DP-GEN is running and run

$ dpgen collect -p param.json ./ ./

This collects the explored datasets into the current directory. Once enough data has accumulated, you can use it to tune the DeePMD-kit training parameters and move on to building the production force field.
