Trying 4DGaussians on WSL2

January 17, 2024, 22:20

"4D Gaussian Splatting for Real-Time Dynamic Scene Rendering", 4DGaussians for short, has apparently been a hot topic somewhere for "achieving real-time rendering speed", so I'm going to try it out.

The PC I'm using is a Dospara "GALLERIA UL9C-R49". Its specs are:
・CPU: Intel® Core™ i9-13900HX Processor
・Mem: 64 GB
・GPU: NVIDIA® GeForce RTX™ 4090 Laptop GPU (16GB)
・GPU: NVIDIA® GeForce RTX™ 4090 (24GB)
・OS: Ubuntu 22.04 on WSL2 (Windows 11)

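1. Preparation

Prerequisites

(1) CUDA 11
Later steps complain that CUDA 11.7 is required, so install CUDA 11 first.

Here, instead of using update-alternatives, I change PATH so that CUDA 11 is the version that gets picked up.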

export PATH=/usr/local/cuda-11/bin:$PATH
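
To confirm that the PATH change took effect, it can help to check which release the first `nvcc` on PATH reports. The following is a minimal sketch, assuming `nvcc --version` prints a line containing `release 11.7`. The helper names are hypothetical and the sample string is illustrative, not captured from this machine.

```python
import re
import shutil
import subprocess


def parse_cuda_release(version_text):
    """Return (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def active_cuda_release():
    """Run the nvcc found first on PATH and parse its release number."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # no nvcc on PATH at all
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_cuda_release(result.stdout)


# Illustrative sample of the relevant output line (an assumption, not a real log).
sample = "Cuda compilation tools, release 11.7, V11.7.99"
print(parse_cuda_release(sample))
```

If `active_cuda_release()` returns a major version other than 11, the `export PATH=...` line was probably not applied in the current shell.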

(2) COLMAP
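COLMAP is software that reconstructs a 3D point cloud from multiple 2D images. Unfortunately, the colmap package in Ubuntu 22.04 reportedly has no CUDA support, so I built and installed colmap from source beforehand, following the steps in a separate article.

That completes the prerequisites.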

4DGaussians

Create a venv.

python3 -m venv 4dgaussians
cd $_
source bin/activate

Clone the 4DGaussians repository.

git clone https://github.com/hustvl/4DGaussians
cd 4DGaussians
git submodule update --init --recursive

Next, pip install.

pip install -r requirements.txt
pip install -e submodules/depth-diff-gaussian-rasterization
pip install -e submodules/simple-knn
pip install nerfstudio

Here is the pip list output.

$ pip list
Package                     Version      Editable project location
--------------------------- ------------ -------------------------------------------------------------------------------------------------------
addict                      2.4.0
ansi2html                   1.9.1
asttokens                   2.4.1
attrs                       23.2.0
blinker                     1.7.0
certifi                     2023.11.17
charset-normalizer          3.3.2
click                       8.1.7
comm                        0.2.1
ConfigArgParse              1.7
contourpy                   1.2.0
cycler                      0.12.1
dash                        2.14.2
dash-core-components        2.0.0
dash-html-components        2.0.0
dash-table                  5.0.0
decorator                   5.1.1
diff-gaussian-rasterization 0.0.0        /path/to/venv/4dgaussians/4DGaussians/submodules/depth-diff-gaussian-rasterization
exceptiongroup              1.2.0
executing                   2.0.1
fastjsonschema              2.19.1
Flask                       3.0.0
fonttools                   4.47.2
idna                        3.6
imageio                     2.33.1
imageio-ffmpeg              0.4.9
importlib-metadata          7.0.1
ipython                     8.20.0
ipywidgets                  8.1.1
itsdangerous                2.1.2
jedi                        0.19.1
Jinja2                      3.1.3
joblib                      1.3.2
jsonschema                  4.21.0
jsonschema-specifications   2023.12.1
jupyter_core                5.7.1
jupyterlab-widgets          3.0.9
kiwisolver                  1.4.5
lpips                       0.1.4
MarkupSafe                  2.1.3
matplotlib                  3.8.2
matplotlib-inline           0.1.6
mmcv                        1.6.0
nbformat                    5.9.2
nest-asyncio                1.5.9
numpy                       1.26.3
nvidia-cublas-cu11          11.10.3.66
nvidia-cuda-nvrtc-cu11      11.7.99
nvidia-cuda-runtime-cu11    11.7.99
nvidia-cudnn-cu11           8.5.0.96
open3d                      0.18.0
opencv-python               4.9.0.80
packaging                   23.2
pandas                      2.1.4
parso                       0.8.3
pexpect                     4.9.0
pillow                      10.2.0
pip                         22.0.2
platformdirs                4.1.0
plotly                      5.18.0
plyfile                     1.0.3
prompt-toolkit              3.0.43
psutil                      5.9.7
ptyprocess                  0.7.0
pure-eval                   0.2.2
Pygments                    2.17.2
pyparsing                   3.1.1
pyquaternion                0.9.9
python-dateutil             2.8.2
pytorch-msssim              1.0.0
pytz                        2023.3.post1
PyYAML                      6.0.1
referencing                 0.32.1
requests                    2.31.0
retrying                    1.3.4
rpds-py                     0.17.1
scikit-learn                1.3.2
scipy                       1.11.4
setuptools                  59.6.0
simple-knn                  0.0.0        /path/to/venv/4dgaussians/4DGaussians/submodules/simple-knn
six                         1.16.0
stack-data                  0.6.3
tenacity                    8.2.3
threadpoolctl               3.2.0
tomli                       2.0.1
torch                       1.13.1
torchaudio                  0.13.1
torchvision                 0.14.1
tqdm                        4.66.1
traitlets                   5.14.1
typing_extensions           4.9.0
tzdata                      2023.4
urllib3                     2.1.0
wcwidth                     0.2.13
Werkzeug                    3.0.1
wheel                       0.42.0
widgetsnbextension          4.0.9
yapf                        0.40.2
zipp                        3.17.0

2. Trying it out

The Data Preparation section of the README describes how to prepare two kinds of datasets:
・For synthetic scenes:
・For real dynamic scenes
For now, let's try both.

3. Trying the synthetic scenes

Download the provided data.zip and extract it.

wget https://www.dropbox.com/s/0bf6fl0ye2vz3vr/data.zip
mkdir data
unzip -d data data.zip
mv data/data data/dnerf

Don't forget to rename the directory; otherwise the later scripts won't work.

Checking the data

Let's check which files the sample data set bouncingballs contains.
As shown below, there are 193 files.

data/dnerf/bouncingballs/
├── test
│   ├── r_000.png
(snip)
│   └── r_019.png
├── train
│   ├── r_000.png
(snip)
│   └── r_149.png
├── transforms_test.json
├── transforms_train.json
├── transforms_val.json
└── val
    ├── r_000.png
(snip)
    └── r_019.png

3 directories, 193 files

Under test: 20 files.

Under train: 150 files.

Under val (validation): 20 files.

These are the inputs for training.
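
The per-directory counts above can be reproduced with a few lines of Python. This is a minimal sketch; `count_split_images` is a hypothetical helper, and it assumes the frames are PNG files directly under `train`, `val`, and `test`, as in the tree shown earlier.

```python
from pathlib import Path


def count_split_images(scene_dir, splits=("train", "val", "test")):
    """Count PNG frames in each split directory of a D-NeRF style scene."""
    scene = Path(scene_dir)
    return {split: len(list((scene / split).glob("*.png"))) for split in splits}


if __name__ == "__main__":
    scene = Path("data/dnerf/bouncingballs")
    if scene.is_dir():
        # For this scene the tree above implies 150 / 20 / 20 images.
        print(count_split_images(scene))
```

Together with the three transforms_*.json files, 150 + 20 + 20 images account for the 193 files reported by the tree.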

Training

It takes about 10 minutes.

python train.py -s data/dnerf/bouncingballs --port 6017 --expname "dnerf/bouncingballs" --configs arguments/dnerf/bouncingballs.py

The files generated at this point (excluding JPG images) are as follows.

$ find output/dnerf/bouncingballs -type f -ls | sort -k 10 | grep -v jpg
  1126699      4 -rw-r--r--   1 user user     2809 Jan 17 21:24 output/dnerf/bouncingballs/cfg_args
  1127424  13220 -rw-r--r--   1 user user 13535576 Jan 17 21:24 output/dnerf/bouncingballs/point_cloud/coarse_iteration_1000/deformation.pth
  1127426     40 -rw-r--r--   1 user user    37065 Jan 17 21:24 output/dnerf/bouncingballs/point_cloud/coarse_iteration_1000/deformation_accum.pth
  1127425      4 -rw-r--r--   1 user user     3785 Jan 17 21:24 output/dnerf/bouncingballs/point_cloud/coarse_iteration_1000/deformation_table.pth
  1127423    736 -rw-r--r--   1 user user   751481 Jan 17 21:24 output/dnerf/bouncingballs/point_cloud/coarse_iteration_1000/point_cloud.ply
  1127508  13220 -rw-r--r--   1 user user 13535576 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/coarse_iteration_3000/deformation.pth
  1127510    204 -rw-r--r--   1 user user   208713 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/coarse_iteration_3000/deformation_accum.pth
  1127509     20 -rw-r--r--   1 user user    18057 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/coarse_iteration_3000/deformation_table.pth
  1127507   4200 -rw-r--r--   1 user user  4299370 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/coarse_iteration_3000/point_cloud.ply
  1127712  13220 -rw-r--r--   1 user user 13535576 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/iteration_1000/deformation.pth
  1127714    216 -rw-r--r--   1 user user   218569 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/iteration_1000/deformation_accum.pth
  1127713     20 -rw-r--r--   1 user user    18889 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/iteration_1000/deformation_table.pth
  1127711   4400 -rw-r--r--   1 user user  4502730 Jan 17 21:25 output/dnerf/bouncingballs/point_cloud/iteration_1000/point_cloud.ply
(snip)
  1128260  13220 -rw-r--r--   1 user user 13535576 Jan 17 21:34 output/dnerf/bouncingballs/point_cloud/iteration_20000/deformation.pth
  1128262    332 -rw-r--r--   1 user user   333193 Jan 17 21:34 output/dnerf/bouncingballs/point_cloud/iteration_20000/deformation_accum.pth
  1128261     28 -rw-r--r--   1 user user    28489 Jan 17 21:34 output/dnerf/bouncingballs/point_cloud/iteration_20000/deformation_table.pth
  1128259   6712 -rw-r--r--   1 user user  6871874 Jan 17 21:34 output/dnerf/bouncingballs/point_cloud/iteration_20000/point_cloud.ply
$
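
The listing shows snapshots saved under both `coarse_iteration_*` and `iteration_*` directories, with 20000 as the last fine iteration. To locate the newest snapshot programmatically, something like the following could be used. It is a sketch with a hypothetical helper name and is not taken from the repository's own loading code.

```python
import re
from pathlib import Path


def latest_iteration(point_cloud_dir):
    """Return the largest N among sub-directories named exactly `iteration_N`."""
    pattern = re.compile(r"^iteration_(\d+)$")
    iterations = []
    for entry in Path(point_cloud_dir).iterdir():
        match = pattern.match(entry.name)
        # `coarse_iteration_N` is skipped because the pattern is anchored.
        if entry.is_dir() and match:
            iterations.append(int(match.group(1)))
    return max(iterations) if iterations else None


if __name__ == "__main__":
    target = Path("output/dnerf/bouncingballs/point_cloud")
    if target.is_dir():
        print(latest_iteration(target))
```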

Rendering

This one finishes in about 10 seconds.

python render.py --model_path output/dnerf/bouncingballs  --skip_train --configs arguments/dnerf/bouncingballs.py

The generated files are listed below.
Under the video directory, the following files are created:
・PNG images: 160 files
・MP4: 1 file

  1128264     80 -rw-r--r--   1 user user    79965 Jan 17 21:38 output/dnerf/bouncingballs/test/ours_20000/gt/00000.png
(snip)
  1128281     68 -rw-r--r--   1 user user    69044 Jan 17 21:38 output/dnerf/bouncingballs/test/ours_20000/gt/00019.png
  1128284     84 -rw-r--r--   1 user user    85835 Jan 17 21:38 output/dnerf/bouncingballs/test/ours_20000/renders/00000.png
(snip)
  1128301     84 -rw-r--r--   1 user user    83103 Jan 17 21:38 output/dnerf/bouncingballs/test/ours_20000/renders/00019.png
  1128303     80 -rw-r--r--   1 user user    78748 Jan 17 21:38 output/dnerf/bouncingballs/test/ours_20000/video_rgb.mp4
  1128304    100 -rw-r--r--   1 user user   100009 Jan 17 21:38 output/dnerf/bouncingballs/video/ours_20000/renders/00000.png
(snip)
  1128462     88 -rw-r--r--   1 user user    88443 Jan 17 21:38 output/dnerf/bouncingballs/video/ours_20000/renders/00159.png
  1128464    252 -rw-r--r--   1 user user   256773 Jan 17 21:38 output/dnerf/bouncingballs/video/ours_20000/video_rgb.mp4

Here is the rendering result for the bouncingballs data.

This is bouncingballs from data[dot]zip, trained and then rendered. pic.twitter.com/1nvakNAPFE
— NOGUCHI, Shoji (@noguchis) January 17, 2024

Here is the result for the hellwarrior data.

This is hellwarrior from data[dot]zip, trained and then rendered. pic.twitter.com/9NdczZCzQv
— NOGUCHI, Shoji (@noguchis) January 17, 2024

Evaluation (val)

Here are the command for evaluating the generated model and its output.

$ python metrics.py --model_path "output/dnerf/bouncingballs/"

Scene: output/dnerf/bouncingballs/
Method: ours_20000
Metric evaluation progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:19<00:00,  1.05it/s]
Scene:  output/dnerf/bouncingballs/ SSIM :    0.9943415
Scene:  output/dnerf/bouncingballs/ PSNR :   40.7844009
Scene:  output/dnerf/bouncingballs/ LPIPS-vgg:    0.0151397
Scene:  output/dnerf/bouncingballs/ LPIPS-alex:    0.0061164
Scene:  output/dnerf/bouncingballs/ MS-SSIM:    0.9953993
Scene:  output/dnerf/bouncingballs/ D-SSIM:    0.0023004
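
As a rough aid for reading these numbers, the sketch below shows the standard PSNR formula and one common way to turn a similarity score into a dissimilarity. The reported D-SSIM (0.0023004) is numerically consistent with (1 - MS-SSIM) / 2 for the MS-SSIM above, but whether metrics.py computes it exactly this way is an assumption here, and both function names are hypothetical.

```python
import math


def psnr_from_mse(mse, max_value=1.0):
    """PSNR in dB for a mean squared error measured on a [0, max_value] scale."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def dssim_from_msssim(ms_ssim):
    """Structural dissimilarity derived from a similarity score in [0, 1]."""
    return (1.0 - ms_ssim) / 2.0


# An MSE of 1e-4 on a [0, 1] pixel scale corresponds to 40 dB.
print(round(psnr_from_mse(1e-4), 2))
# MS-SSIM value taken from the log above.
print(dssim_from_msssim(0.9953993))
```

A PSNR of about 40.8 dB therefore corresponds to a mean squared error slightly below 1e-4 per pixel value, keeping in mind that the script averages per-image scores, so the correspondence is only approximate.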

4. Trying the real dynamic scenes

I'll use "vrig_chicken.zip" from the Hypernerf Dataset for this.

Download and extract

wget https://github.com/google/hypernerf/releases/download/v0.1/vrig_chicken.zip
mkdir data/hypernerf
unzip -d data/hypernerf vrig_chicken.zip

Building 3D from 2D

Proceed using the provided colmap.sh.
(Note) colmap.sh invokes COLMAP internally.

Add the COLMAP install directory to the PATH environment variable, then run the script.
The run takes about 35 minutes.

$ export PATH=/usr/local/colmap/bin:$PATH
$ time bash colmap.sh data/hypernerf/vrig-chicken hypernerf
(snip)
real    34m58.285s
user    42m1.977s
sys     4m41.207s
$

The file data/hypernerf/vrig-chicken/colmap/dense/workspace/fused.ply is generated.

Downsampling

Run downsample_point.py, a script that thins out the input points (the fused.ply file) until 40,000 or fewer remain.

$ time python scripts/downsample_point.py data/hypernerf/vrig-chicken/colmap/dense/workspace/fused.ply data/hypernerf/vrig-chicken/points3D_downsample.ply
Total points: 1153637
Downsampled points: 646002
Downsampled points: 374661
Downsampled points: 233434
Downsampled points: 156957
Downsampled points: 109059
Downsampled points: 80196
Downsampled points: 60776
Downsampled points: 47421
Downsampled points: 37841

real    0m2.473s
user    0m4.281s
sys     0m5.294s
$

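The run takes a few seconds.
As a result, the following ply file is created:
・data/hypernerf/vrig-chicken/points3D_downsample.ply

(Note) This ply file is referenced inside the readHyperDataInfos function in scene/dataset_readers.py, so if it has not been generated, the next step, "Training", fails with an error.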
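
The log above shows the point count shrinking step by step until it drops below 40,000. The script's internals are not shown in this article, so the following is only a sketch of one way to get that behavior: a voxel-grid downsample that is repeated with a growing cell size. The function names, the starting `voxel_size`, and the `growth` factor are all assumptions chosen for illustration.

```python
import random


def voxel_downsample(points, voxel_size):
    """Keep one averaged point per occupied cubic cell of edge `voxel_size`."""
    cells = {}
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        cells.setdefault(key, []).append((x, y, z))
    return [
        tuple(sum(axis) / len(members) for axis in zip(*members))
        for members in cells.values()
    ]


def downsample_to_limit(points, limit=40000, voxel_size=0.02, growth=1.5):
    """Coarsen the grid step by step until at most `limit` points remain."""
    if limit < 8:
        # Points around the origin can occupy up to 8 cells however coarse the grid.
        raise ValueError("limit must be at least 8")
    current = list(points)
    while len(current) > limit:
        current = voxel_downsample(current, voxel_size)
        print(f"Downsampled points: {len(current)}")
        voxel_size *= growth  # assumed growth factor, chosen for illustration
    return current


if __name__ == "__main__":
    random.seed(0)
    cloud = [(random.random(), random.random(), random.random()) for _ in range(5000)]
    print(len(downsample_to_limit(cloud, limit=500)))
```

Averaging the points in each cell keeps the overall shape of the cloud while bounding the number of points handed to training.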

Training

$ time python train.py -s data/hypernerf/vrig-chicken --port 6070 --expname hypernerf/vrig-chicken --configs arguments/hypernerf/chicken.py
(snip)
real    17m45.223s
user    21m20.504s
sys     3m31.512s

The run takes about 18 minutes.
Note that if you skip the downsampling step, the following error occurs.

FileNotFoundError: [Errno 2] No such file or directory: '/home/shoji_noguchi/devsecops/venv/4dgaussians/4DGaussians/data/hypernerf/vrig-chicken/points3D_downsample.ply'
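
To catch this before launching train.py, a small pre-flight check can be run first. This is a sketch with a hypothetical helper name. It only tests for the one file named in the error above and does not cover whatever else train.py expects.

```python
from pathlib import Path


def missing_inputs(scene_dir, required=("points3D_downsample.ply",)):
    """Return the required files that are absent from the scene directory."""
    scene = Path(scene_dir)
    return [name for name in required if not (scene / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs("data/hypernerf/vrig-chicken")
    if missing:
        print("Run the downsampling step first; missing:", ", ".join(missing))
```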

Rendering

This step generates images and then creates an mp4 from them.

python render.py --model_path output/hypernerf/vrig-chicken  --configs arguments/hypernerf/chicken.py --skip_train --skip_test

But alas, an error occurs.

Rendering progress:  66%|█████████████████████████████████████████▉                      | 328/500 [00:04<00:02, 73.02it/s]
FPS: 111.11941862504221 [17/01 20:14:36]
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (536, 960) to (544, 960) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
[swscaler @ 0x61eb500] Warning: data is not aligned! This can lead to a speed loss
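
The resize mentioned in the warning is plain rounding up to the next multiple of the macro block size: 536 is not a multiple of 16, so it becomes 544, while 960 already is one. A minimal sketch of that arithmetic follows (the helper name is hypothetical).

```python
def padded_size(first, second, macro_block_size=16):
    """Round each dimension up to the next multiple of `macro_block_size`."""
    def round_up(value):
        # Ceiling division without floats, then scale back up.
        return -(-value // macro_block_size) * macro_block_size
    return round_up(first), round_up(second)


print(padded_size(536, 960))  # (544, 960), matching the warning above
```

With a macro block size of 1, every size is already a multiple and no resize takes place, which is the option the warning itself suggests for preventing it.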

Modify render.py as follows (add macro_block_size = None).

diff --git a/render.py b/render.py
index 45c7da8..8f7cbd1 100644
--- a/render.py
+++ b/render.py
@@ -83,7 +83,7 @@ def render_set(model_path, name, iteration, views, gaussians, pipeline, backgrou
     multithread_write(render_list, render_path)


-    imageio.mimwrite(os.path.join(model_path, name, "ours_{}".format(iteration), 'video_rgb.mp4'), render_images, fps=30)
+    imageio.mimwrite(os.path.join(model_path, name, "ours_{}".format(iteration), 'video_rgb.mp4'), render_images, fps=30, macro_block_size = None)
 def render_sets(dataset : ModelParams, hyperparam, iteration : int, pipeline : PipelineParams, skip_train : bool, skip_test : bool, skip_video: bool):
     with torch.no_grad():
         gaussians = GaussianModel(dataset.sh_degree, hyperparam)

Then run it again.

$ time python render.py --model_path output/hypernerf/vrig-chicken  --configs arguments/hypernerf/chicken.py --skip_train --skip_test
(snip)
[swscaler @ 0x60585c0] Warning: data is not aligned! This can lead to a speed loss

real    0m20.161s
user    3m21.555s
sys     0m26.046s
$

Processing takes about 20 seconds.

Here is the generated mp4.

Trying out 4DGaussians.
This is the mp4 I got by downloading vrig-chicken from the Hypernerf Dataset, then processing and rendering it on my local PC. pic.twitter.com/ulyEAsRazX
— NOGUCHI, Shoji (@noguchis) January 17, 2024

Perhaps the viewpoint is shifting; the result looks like it flickers.
Maybe my environment setup is at fault. Who knows...

The total time was around 55 minutes, so roughly an hour.
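
For reference, the `real` times measured above can be added up as follows. This is a small sketch with a hypothetical helper; it covers only the four timed commands, so downloads and the first failed render are not included.

```python
import re


def to_seconds(text):
    """Convert a `time` style duration such as '34m58.285s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", text).groups()
    return int(minutes) * 60 + float(seconds)


# `real` values copied from the runs above.
steps = {
    "colmap.sh": "34m58.285s",
    "downsample_point.py": "0m2.473s",
    "train.py": "17m45.223s",
    "render.py": "0m20.161s",
}
total = sum(to_seconds(value) for value in steps.values())
print(f"{int(total // 60)} min {total % 60:.0f} s")
```

The timed steps alone come to a little over 53 minutes, with colmap.sh accounting for roughly two thirds of it.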

Bonus

I'm also posting a result generated from "interp_aleks-teapot.zip" in the Hypernerf Dataset.
No config file for this dataset was included, so I used default.py.

# Download and extract
wget https://github.com/google/hypernerf/releases/download/v0.1/interp_aleks-teapot.zip
unzip -d data/hypernerf interp_aleks-teapot.zip

# From 2D to 3D
bash colmap.sh data/hypernerf/aleks-teapot hypernerf

# Downsampling
python scripts/downsample_point.py data/hypernerf/aleks-teapot/colmap/dense/workspace/fused.ply data/hypernerf/aleks-teapot/points3D_downsample.ply

# Training
python train.py -s data/hypernerf/aleks-teapot --port 6070 --expname hypernerf/aleks-teapot --configs arguments/hypernerf/default.py

# Rendering
python render.py --model_path output/hypernerf/aleks-teapot  --configs arguments/hypernerf/default.py --skip_train --skip_test

Here is the generated mp4. This one is smooth.

The mp4 produced by running colmap, downsample, train, and render on aleks-teapot from the Hypernerf Dataset. pic.twitter.com/N3QpGFHiF8
— NOGUCHI, Shoji (@noguchis) January 17, 2024
