






ironawi@ironawi-ally:~$ cd terraform/kubernetes/
ironawi@ironawi-ally:~/terraform/kubernetes$ ls
kubernetes_node.tf  modules  output.tf  sg-k8s-master.tf  sg-k8s-worker.tf  sg-misskey...tf  terraform.tfstate  terraform.tfstate.backup
ironawi@ironawi-ally:~/terraform/kubernetes$ terraform plan
Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply" which may have affected this plan:

  # module.worker_node2.aws_instance.main has been deleted
  - resource "aws_instance" "main" {
        id                                   = "i-0891cba0a7c60c5fe"
      - public_dns                           = "ec2-15-152-119-16.ap-northeast-3.compute.amazonaws.com" -> null
        tags                                 = {
            "Name" = "worker_node2"
        # (33 unchanged attributes hidden)

        # (9 unchanged blocks hidden)

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo or respond to these changes.


Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.worker_node2.aws_instance.main will be created
  + resource "aws_instance" "main" {
      + ami                                  = "ami-0c1531991482a24e1"
      + arn                                  = (known after apply)
      + associate_public_ip_address          = (known after apply)
      + availability_zone                    = (known after apply)
      + cpu_core_count                       = (known after apply)
      + cpu_threads_per_core                 = (known after apply)
      + disable_api_stop                     = (known after apply)
      + disable_api_termination              = (known after apply)
      + ebs_optimized                        = (known after apply)
      + get_password_data                    = false
      + host_id                              = (known after apply)
      + host_resource_group_arn              = (known after apply)
      + iam_instance_profile                 = (known after apply)
      + id                                   = (known after apply)
      + instance_initiated_shutdown_behavior = (known after apply)
      + instance_lifecycle                   = (known after apply)
      + instance_state                       = (known after apply)
      + instance_type                        = "t3.small"
      + ipv6_address_count                   = (known after apply)
      + ipv6_addresses                       = (known after apply)
      + key_name                             = "yaiwata-dev-northeast3"
      + monitoring                           = (known after apply)
      + outpost_arn                          = (known after apply)
      + password_data                        = (known after apply)
      + placement_group                      = (known after apply)
      + placement_partition_number           = (known after apply)
      + primary_network_interface_id         = (known after apply)
      + private_dns                          = (known after apply)
      + private_ip                           = (known after apply)
      + public_dns                           = (known after apply)
      + public_ip                            = (known after apply)
      + secondary_private_ips                = (known after apply)
      + security_groups                      = (known after apply)
      + source_dest_check                    = true
      + spot_instance_request_id             = (known after apply)
      + subnet_id                            = "subnet-0988129ff19aad0e4"
      + tags                                 = {
          + "Name" = "worker_node2"
      + tags_all                             = {
          + "Name" = "worker_node2"
      + tenancy                              = (known after apply)
      + user_data                            = (known after apply)
      + user_data_base64                     = (known after apply)
      + user_data_replace_on_change          = false
      + vpc_security_group_ids               = [
          + "sg-03bf3ff0d56c8f475",
          + "sg-0fc354d2e7824d1e6",

      + instance_market_options {
          + market_type = "spot"

          + spot_options {
              + instance_interruption_behavior = (known after apply)
              + max_price                      = "0.01"
              + spot_instance_type             = (known after apply)
              + valid_until                    = (known after apply)

      + root_block_device {
          + delete_on_termination = true
          + device_name           = (known after apply)
          + encrypted             = (known after apply)
          + iops                  = (known after apply)
          + kms_key_id            = (known after apply)
          + throughput            = (known after apply)
          + volume_id             = (known after apply)
          + volume_size           = 30
          + volume_type           = "gp3"

Plan: 1 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  ~ worker_node2 = "ec2-15-152-119-16.ap-northeast-3.compute.amazonaws.com" -> (known after apply)


Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.

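First, check the current state with terraform plan. The output shows that one of the workers has disappeared and that the missing worker will be recreated.
With the state confirmed, recreate the VM with terraform apply.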

ironawi@ironawi-ally:~/terraform/kubernetes$ terraform apply
Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.worker_node2.aws_instance.main: Creating...
module.worker_node2.aws_instance.main: Still creating... [10s elapsed]
module.worker_node2.aws_instance.main: Provisioning with 'local-exec'...
module.worker_node2.aws_instance.main (local-exec): Executing: ["/bin/sh" "-c" "modules/ec2/../scripts/check_ssh_connection.sh <host name>"]
module.worker_node2.aws_instance.main (local-exec): checking ssh connection...
module.worker_node2.aws_instance.main (local-exec): ssh connection established!
module.worker_node2.aws_instance.main: Provisioning with 'local-exec'...
module.worker_node2.aws_instance.main (local-exec): Executing: ["/bin/sh" "-c" "ansible-playbook -i <host name>, modules/ec2/../ansible/setup_k8s.yaml"]

module.worker_node2.aws_instance.main (local-exec): PLAY [all] *********************************************************************
module.worker_node2.aws_instance.main: Creation complete after 1m19s [id=<instance id>]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.


master_node = "<host name>"
worker_node1 = "<host name>"
worker_node2 = "<host name>"
worker_node3 = "<host name>"
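
The plan output above notes that, without the -out option, Terraform cannot guarantee that apply takes exactly the planned actions. A minimal sketch of saving the plan and applying that exact plan could look like the following. The file name tfplan and the function names are assumptions for illustration and are not used elsewhere in this post.

```shell
#!/bin/sh
# Sketch: save the reviewed plan to a file and later apply exactly that plan.
# "tfplan" is a hypothetical file name, not something used in this post.
PLAN_FILE=tfplan

# Write the execution plan to $PLAN_FILE, then print it again for review.
save_plan() {
    terraform plan -out="$PLAN_FILE" || return 1
    terraform show "$PLAN_FILE"
}

# Apply the saved plan. With a saved plan file, Terraform applies it
# without asking for the interactive "yes" confirmation.
apply_saved_plan() {
    terraform apply "$PLAN_FILE"
}
```

Run save_plan in the directory that holds the .tf files, read the output, and call apply_saved_plan only once the plan looks right.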



SSH into the control plane node and clean up the resources.


kubectl edit deploy -n misskey web-deployment


kubectl get po -n misskey
kubectl get node
kubectl drain --ignore-daemonsets --force <node name>
kubectl delete node <node name>
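
To avoid picking the node name out of the kubectl get node output by eye, the steps above can be sketched as two small helpers. This assumes the lost worker shows up with a STATUS that starts with NotReady; the function names are hypothetical.

```shell
#!/bin/sh
# Print the names of nodes whose STATUS column starts with NotReady.
# Expects the output of `kubectl get node --no-headers` on stdin
# (columns: NAME STATUS ROLES AGE VERSION).
not_ready_nodes() {
    awk '$2 ~ /^NotReady/ { print $1 }'
}

# Drain and then delete one node, using the same flags as the commands above.
remove_node() {
    node=$1
    kubectl drain --ignore-daemonsets --force "$node" || return 1
    kubectl delete node "$node"
}
```

For example, `kubectl get node --no-headers | not_ready_nodes` lists the candidates, and `remove_node <node name>` removes one of them after checking that it really is the lost worker.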

Joining the newly built k8s worker node to the cluster

Following the worker node creation section of the page above, perform the steps up to restarting kubelet on the worker node.

Following the token creation section of the page above, reissue a cluster join token on the control plane node.
Running the displayed kubeadm join command on the worker node adds the worker node to the k8s cluster.
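
For reference, a sketch of the token step, assuming the referenced page reissues the token with kubeadm token create --print-join-command (the exact command used there is not shown in this post). The function names and the 300s timeout are hypothetical.

```shell
#!/bin/sh
# Run on the control plane node: create a new bootstrap token and print
# the complete "kubeadm join ..." command that goes with it.
print_join_command() {
    kubeadm token create --print-join-command
}

# Run on the control plane node after the worker has joined:
# block until the node reports the Ready condition.
# The 300s timeout is an arbitrary assumption.
wait_for_node() {
    node=$1
    kubectl wait --for=condition=Ready "node/$node" --timeout=300s
}
```

The line printed by print_join_command is what gets executed on the worker node. wait_for_node is only a convenience over checking kubectl get node repeatedly.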


Apply the DB manifest kept on the control plane node to restart the DB.

kubectl apply -f db.yaml 
kubectl get po -n misskey

Once the db Pod is confirmed to be running, send the backup file into the Pod.

kubectl cp /k8s/misskey/backup/<backup file>.tar.gz misskey/db:/
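
Instead of polling kubectl get po by hand, the wait and the copy can be combined. The following is a minimal sketch that reuses the namespace and Pod name from the commands above; the function name and the 120s timeout are assumptions.

```shell
#!/bin/sh
# Namespace and Pod name as used in the commands in this post.
NAMESPACE=misskey
POD=db

# Wait until the db Pod is Ready, then copy the backup archive to / in the Pod.
# $1: path to the backup .tar.gz on the control plane node.
# The 120s timeout is an arbitrary assumption.
copy_backup_to_pod() {
    archive=$1
    kubectl wait --for=condition=Ready "pod/$POD" -n "$NAMESPACE" --timeout=120s || return 1
    kubectl cp "$archive" "$NAMESPACE/$POD:/"
}
```

If the Pod does not become Ready within the timeout, kubectl wait exits with a non-zero status and the copy is skipped.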

Enter the Pod with kubectl exec, extract the tar.gz that was sent in to get the backup file, and restore the DB.

kubectl exec -it -n misskey db -- /bin/bash
tar zxf <backup file>.tar.gz
psql -U misskey-user misskey < tmp/backup/dump.sql 
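
The same restore can also be run without an interactive shell. This is a sketch under the assumptions that the archive was copied to / as above and that it unpacks to tmp/backup/dump.sql, as the psql command suggests. The function name is hypothetical, and the archive name must not contain a single quote.

```shell
#!/bin/sh
# Extract the archive inside the db Pod and load the dump with psql,
# all in one non-interactive kubectl exec call.
# $1: file name of the archive already copied to / in the Pod.
restore_db() {
    archive_name=$1
    kubectl exec -n misskey db -- /bin/bash -c \
        "cd / && tar zxf '$archive_name' && psql -U misskey-user misskey < tmp/backup/dump.sql"
}
```

The cd / makes the relative path tmp/backup/dump.sql resolve regardless of the container's working directory.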


If redis has gone down, putting the backed-up dump.rdb back in /k8s/misskey/redis/ and restarting redis is enough.
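
A minimal sketch of putting the file back, assuming /k8s/misskey/redis/ is a plain directory that the redis Pod reads its dump.rdb from. The function name is hypothetical, and how redis is restarted afterwards depends on how it is deployed, which this post does not show.

```shell
#!/bin/sh
# Copy a backed-up dump.rdb into the redis data directory.
# $1: path to the backed-up dump.rdb
# $2: destination directory (defaults to the path used in this post)
restore_redis_dump() {
    src=$1
    dest_dir=${2:-/k8s/misskey/redis}
    if [ ! -f "$src" ]; then
        echo "backup not found: $src" >&2
        return 1
    fi
    # -p keeps the timestamps and mode of the backup file.
    cp -p "$src" "$dest_dir/dump.rdb"
}
```

After the copy succeeds, restart redis so that it loads the restored dump.rdb on startup.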


