tjtjtj's notes

Notes for my own reference.

Nomad trial: reviewing jobs

I had completely forgotten how this works, so here is a review.

https://www.nomadproject.io/intro/getting-started/jobs.html

Starting and stopping the agent

Start the agent in dev mode

$ sudo nomad agent -dev

Stop the agent

ctrl-c

Check the agent status

$ nomad agent-info

Check the cluster status?

I don't fully understand this part yet.

$ nomad node status
ID        DC   Name   Class   Drain  Eligibility  Status
b62b7daf  dc1  nomad  <none>  false  eligible     ready
$ nomad server members
Name          Address    Port  Status  Leader  Protocol  Build  Datacenter  Region
nomad.global  127.0.0.1  4648  alive   true    2         0.8.6  dc1         global

job, taskgroup, task

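The structure is job > taskgroup > task.

  • job: a collection of taskgroups <--- example
  • taskgroup: the scheduling unit; all of its tasks run on one node <--- cache (an alloc is one placed instance of it)
  • task: the smallest unit of execution in Nomad <--- redis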
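
To make the nesting concrete, here is a trimmed sketch of the kind of job file that nomad job init generates, written out with a heredoc. The file name example-min.nomad is made up for this note, and the values are assumptions pieced together from the names and numbers that appear in this walkthrough (job example, group cache, task redis, image redis:3.2, port label db, 500 MHz, 256 MiB). The real generated file has more stanzas than this.

```shell
#!/bin/sh
# Write a trimmed-down job file that shows only the job > group > task nesting.
# NOTE: example-min.nomad is a hypothetical file name; the real example.nomad
# produced by `nomad job init` contains more stanzas (update, restart, ...).
jobfile="${TMPDIR:-/tmp}/example-min.nomad"

cat > "$jobfile" <<'EOF'
job "example" {                 # job: a collection of task groups
  datacenters = ["dc1"]
  type        = "service"

  group "cache" {               # task group: placed on one node as an allocation
    count = 1

    task "redis" {              # task: smallest unit of execution
      driver = "docker"

      config {
        image = "redis:3.2"
        port_map {
          db = 6379
        }
      }

      resources {
        cpu    = 500            # MHz
        memory = 256            # MB
        network {
          mbps = 10
          port "db" {}
        }
      }
    }
  }
}
EOF

# Print the three nesting levels found in the file.
grep -E '^ *(job|group|task) "' "$jobfile"
```

The grep at the end prints just the job, group and task lines, which is the same example > cache > redis chain that shows up later in the status output.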

Running a job

Check the jobs. Nothing is there yet.

$ nomad job status
No running jobs

Generate the job file. nomad job init does not overwrite an existing file, so I deleted the old one and generated it again.

$ nomad job init
Job 'example.nomad' already exists
$ rm example.nomad
$ nomad job init
Example job file written to example.nomad
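
Instead of deleting the old file, it could be moved aside first. A minimal sketch, assuming a .bak suffix is acceptable; backup_job_file is a helper name invented for this note, and the demo runs in a throwaway directory rather than calling nomad itself.

```shell
#!/bin/sh
# Sketch: move an existing job file aside so `nomad job init` can write a new one.
# ASSUMPTION: the ".bak" suffix and the backup_job_file name are made up here.
backup_job_file() {
  # $1 = job file; does nothing if the file does not exist.
  if [ -f "$1" ]; then
    mv "$1" "$1.bak"
    echo "moved $1 to $1.bak"
  fi
}

# Demo in a throwaway directory instead of a real working directory.
workdir=$(mktemp -d)
echo 'job "example" {}' > "$workdir/example.nomad"
backup_job_file "$workdir/example.nomad"
# After this, running `nomad job init` in that directory would write a fresh example.nomad.
```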

Run the job

$ nomad job run example.nomad
==> Monitoring evaluation "8a175ecb"
    Evaluation triggered by job "example"
    Allocation "dfee8772" created: node "b62b7daf", group "cache"
    Evaluation within deployment: "6c7ed692"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "8a175ecb" finished with status "complete"

List the jobs

$ nomad job status
ID       Type     Priority  Status   Submit Date
example  service  50        running  2019-04-09T11:09:58Z

Job details. taskgroup:cache was placed on node:b62b7daf as alloc:dfee8772.

$ nomad status example
ID            = example
Name          = example
Submit Date   = 2019-04-09T11:09:58Z
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0

Latest Deployment
ID          = 6c7ed692
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
cache       1        1       1        0          2019-04-09T11:20:25Z

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
dfee8772  b62b7daf  cache       0        run      running  3m11s ago  2m44s ago

Subcommands of nomad job

Subcommands:
    deployments    List deployments for a job
    dispatch       Dispatch an instance of a parameterized job
    eval           Force an evaluation for the job
    history        Display all tracked versions of a job
    init           Create an example job file
    inspect        Inspect a submitted job
    plan           Dry-run a job update to determine its effects
    promote        Promote a job's canaries
    revert         Revert to a prior version of the job
    run            Run a new job or update an existing job
    status         Display status information about a job
    stop           Stop a running job
    validate       Checks if a given job specification is valid

alloc

Inspect alloc:dfee8772. The structure is example (job) > cache (group) > redis (task).

$ nomad alloc status dfee8772
ID                  = dfee8772
Eval ID             = 8a175ecb
Name                = example.cache[0]
Node ID             = b62b7daf
Job ID              = example
Job Version         = 0
Client Status       = running
Client Description  = <none>
Desired Status      = run
Desired Description = <none>
Created             = 6m32s ago
Modified            = 6m5s ago
Deployment ID       = 6c7ed692
Deployment Health   = healthy


Task "redis" is "running"
Task Resources
CPU        Memory            Disk     IOPS  Addresses
3/500 MHz  1000 KiB/256 MiB  300 MiB  0     db: 127.0.0.1:28263

Task Events:
Started At     = 2019-04-09T11:10:07Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                  Type        Description
2019-04-09T11:10:07Z  Started     Task started by client
2019-04-09T11:09:58Z  Driver      Downloading image redis:3.2
2019-04-09T11:09:58Z  Task Setup  Building Task Directory
2019-04-09T11:09:58Z  Received    Task received by client
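
The Addresses column is the part I would actually need in order to talk to the task. A small sketch of pulling the db address out of that output; the sample line is the Task Resources row from above. It assumes redis-cli is installed and only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: extract the address of the "db" port from alloc status output.
# The sample line is the Task Resources row shown above.
resources_row='3/500 MHz  1000 KiB/256 MiB  300 MiB  0     db: 127.0.0.1:28263'

# The address is the last whitespace-separated field on the line containing "db: ".
addr=$(printf '%s\n' "$resources_row" | awk '/db: / { print $NF }')
host=${addr%:*}     # drop the trailing ":port" part
port=${addr##*:}    # keep only what follows the last ":"

# ASSUMPTION: redis-cli is available; this only prints the command to run.
echo "redis-cli -h $host -p $port ping"
```

Against a live agent, the same awk filter could read from nomad alloc status dfee8772 instead of the sample line.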

Subcommands of nomad alloc. I covered logs last time.

Subcommands:
    fs        Inspect the contents of an allocation directory
    logs      Streams the logs of a task.
    status    Display allocation status information and metadata

Trying fs

$ nomad alloc fs dfee8772
Mode        Size     Modified Time         Name
drwxrwxrwx  4.0 KiB  2019-04-09T11:09:58Z  alloc/
drwxrwxrwx  4.0 KiB  2019-04-09T11:10:06Z  redis/

Scaling 1 -> 3

Edit example.nomad: change count = 1 to 3 in group:cache.

$ vi example.nomad
↓
count = 1  --->  count = 3
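
The same edit can be scripted instead of done in vi. A minimal sketch, assuming the job file has exactly one uncommented count line as in the generated example; set_count is a name invented for this note, and a throwaway file stands in for example.nomad.

```shell
#!/bin/sh
# Sketch: change the group count without opening an editor.
# ASSUMPTION: the job file has exactly one uncommented "count = N" line.
# A throwaway file stands in for example.nomad here.
workdir=$(mktemp -d)
jobfile="$workdir/example.nomad"
printf '%s\n' 'group "cache" {' '  count = 1' '}' > "$jobfile"

set_count() {
  # $1 = job file, $2 = new count
  # Write to a temp file and move it into place (portable alternative to sed -i).
  sed -E "s/^([[:space:]]*count[[:space:]]*)=[[:space:]]*[0-9]+/\1= $2/" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

set_count "$jobfile" 3
grep 'count' "$jobfile"
```

Comment lines that mention count start with # and so are not touched by the pattern, which only matches lines where count is the first word.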

Check the change plan. I did not notice it at first, but this step is plan, not run.

$ nomad job plan example.nomad
+/- Job: "example"
+/- Task Group: "cache" (2 create, 1 in-place update)
  +/- Count: "1" => "3" (forces create)
      Task: "redis"

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 27
To submit the job with version verification run:

nomad job run -check-index 27 example.nomad

When running the job with the check-index flag, the job will only be run if the
server side version matches the job modify index returned. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
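
Copying the index by hand is easy to get wrong, so here is a sketch of extracting it with awk. The sample text is copied from the plan output above; modify_index is a helper name invented for this note.

```shell
#!/bin/sh
# Sketch: pull the "Job Modify Index" out of plan output so it can be passed
# to -check-index. The sample text is copied from the plan output above.
plan_output='+/- Job: "example"
+/- Task Group: "cache" (2 create, 1 in-place update)
  +/- Count: "1" => "3" (forces create)
      Task: "redis"

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 27'

modify_index() {
  # Reads plan output on stdin and prints the number after "Job Modify Index:".
  awk -F': *' '/^Job Modify Index:/ { print $2 }'
}

index=$(printf '%s\n' "$plan_output" | modify_index)
echo "nomad job run -check-index $index example.nomad"
```

With a live agent the input would come from nomad job plan example.nomad instead of the sample text. As far as I know, nomad job plan exits with a non-zero status when the plan would create or destroy allocations, so a script using set -e would need to allow for that.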

Apply the change plan

$ nomad job run -check-index 27 example.nomad
==> Monitoring evaluation "432f4471"
    Evaluation triggered by job "example"
    Allocation "47dff45a" created: node "b62b7daf", group "cache"
    Allocation "baf43930" created: node "b62b7daf", group "cache"
    Allocation "dfee8772" modified: node "b62b7daf", group "cache"
    Evaluation within deployment: "ae570b04"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "432f4471" finished with status "complete"

Check the job details. Two allocs were added, for a total of three.

$ nomad status example
ID            = example
Name          = example
Submit Date   = 2019-04-09T11:31:50Z
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         3        0       0         0

Latest Deployment
ID          = ae570b04
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
cache       3        3       3        0          2019-04-09T11:42:10Z

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
47dff45a  b62b7daf  cache       1        run      running  1m36s ago   1m16s ago
baf43930  b62b7daf  cache       1        run      running  1m36s ago   1m21s ago
dfee8772  b62b7daf  cache       1        run      running  23m28s ago  1m25s ago

Changing the image redis:3.2 -> redis:4.0

Change the image of task "redis" from redis:3.2 to redis:4.0.

$ vi example.nomad
↓
image = "redis:3.2"  --->  image = "redis:4.0"

Check the plan. Or rather than checking, this generates the plan (a dry run).

$ nomad job plan example.nomad
+/- Job: "example"
+/- Task Group: "cache" (1 create/destroy update, 2 ignore)
  +/- Task: "redis" (forces create/destroy update)
    +/- Config {
      +/- image:           "redis:3.2" => "redis:4.0"
          port_map[0][db]: "6379"
        }

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 58
To submit the job with version verification run:

nomad job run -check-index 58 example.nomad

When running the job with the check-index flag, the job will only be run if the
server side version matches the job modify index returned. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.

At this point the allocs are not affected yet.

$ nomad status example
↓
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
47dff45a  b62b7daf  cache       1        run      running  8m33s ago   8m13s ago
baf43930  b62b7daf  cache       1        run      running  8m33s ago   8m18s ago
dfee8772  b62b7daf  cache       1        run      running  30m25s ago  8m22s ago

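Apply the plan. My first guess was that several evaluations hang off one plan and that check-index advances with each evaluation. As far as I understand it now, though, the Job Modify Index is the server-side modify index of the job itself, and -check-index only verifies that the job has not changed since the plan was made. A new alloc:33eac15d was created. Only one was created rather than three, presumably because an image change is rolled out one alloc at a time, moving on to the next once the previous one is healthy (the update stanza in the generated example.nomad sets max_parallel = 1, if I read it correctly).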

$ nomad job run -check-index 58 example.nomad
==> Monitoring evaluation "e226b835"
    Evaluation triggered by job "example"
    Allocation "33eac15d" created: node "b62b7daf", group "cache"
    Evaluation within deployment: "03079e18"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "e226b835" finished with status "complete"

Poll the status of job:example. I could watch the allocs switch over to version:2.

$ nomad status example
Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
33eac15d  b62b7daf  cache       2        run      running   10s ago     5s ago
47dff45a  b62b7daf  cache       1        run      running   10m27s ago  10m7s ago
baf43930  b62b7daf  cache       1        run      running   10m27s ago  10m12s ago
dfee8772  b62b7daf  cache       1        stop     complete  32m19s ago  5s ago
↓
Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
bca76393  b62b7daf  cache       2        run      pending   3s ago      3s ago
33eac15d  b62b7daf  cache       2        run      running   28s ago     4s ago
47dff45a  b62b7daf  cache       1        run      running   10m45s ago  10m25s ago
baf43930  b62b7daf  cache       1        stop     running   10m45s ago  3s ago
dfee8772  b62b7daf  cache       1        stop     complete  32m37s ago  23s ago
↓
Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
f624164c  b62b7daf  cache       2        run      pending   0s ago      0s ago
bca76393  b62b7daf  cache       2        run      running   25s ago     1s ago
33eac15d  b62b7daf  cache       2        run      running   50s ago     26s ago
47dff45a  b62b7daf  cache       1        stop     running   11m7s ago   0s ago
baf43930  b62b7daf  cache       1        stop     complete  11m7s ago   20s ago
dfee8772  b62b7daf  cache       1        stop     complete  32m59s ago  45s ago
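
To follow the rollout without reading the whole table each time, the Allocations rows can be summarised per version. A sketch using the middle snapshot above as sample input; running_per_version is a name invented for this note, and it assumes the task group name contains no spaces so that the fields line up.

```shell
#!/bin/sh
# Sketch: count running allocations per job version from an Allocations table.
# The sample rows are the middle snapshot from the polling output above.
allocs='ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
bca76393  b62b7daf  cache       2        run      pending   3s ago      3s ago
33eac15d  b62b7daf  cache       2        run      running   28s ago     4s ago
47dff45a  b62b7daf  cache       1        run      running   10m45s ago  10m25s ago
baf43930  b62b7daf  cache       1        stop     running   10m45s ago  3s ago
dfee8772  b62b7daf  cache       1        stop     complete  32m37s ago  23s ago'

running_per_version() {
  # Skip the header line; in data rows field 4 is Version and field 6 is Status.
  # ASSUMPTION: the task group name has no spaces, otherwise the fields shift.
  awk 'NR > 1 && $6 == "running" { n[$4]++ }
       END { for (v in n) print "version " v ": " n[v] " running" }' | sort
}

summary=$(printf '%s\n' "$allocs" | running_per_version)
echo "$summary"
```

For this snapshot it prints two running allocs on version 1 and one on version 2. baf43930 still counts as running even though its desired state is already stop.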

Stopping the job

Stop

$ nomad job stop example
==> Monitoring evaluation "2c30cd46"
    Evaluation triggered by job "example"
    Evaluation within deployment: "03079e18"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "2c30cd46" finished with status "complete"

Check

$ nomad job status
ID       Type     Priority  Status          Submit Date
example  service  50        dead (stopped)  2019-04-09T11:42:07Z

Restarting the job

Restart

$ nomad job run example.nomad
==> Monitoring evaluation "332362e1"
    Evaluation triggered by job "example"
    Allocation "1f19159f" created: node "b62b7daf", group "cache"
    Allocation "381f28af" created: node "b62b7daf", group "cache"
    Allocation "513949b9" created: node "b62b7daf", group "cache"
    Evaluation within deployment: "705c27d6"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "332362e1" finished with status "complete"

Check

$ nomad job status
ID       Type     Priority  Status   Submit Date
example  service  50        running  2019-04-09T11:49:04Z

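Check the details. There is no version:3, presumably because the stop was also counted as a job version? nomad job history example should be able to confirm this.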

$ nomad status example
↓
Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created     Modified
1f19159f  b62b7daf  cache       4        run      running   13s ago     1s ago
381f28af  b62b7daf  cache       4        run      running   13s ago     11s ago
513949b9  b62b7daf  cache       4        run      running   13s ago     11s ago
f624164c  b62b7daf  cache       2        stop     complete  6m20s ago   1m40s ago
bca76393  b62b7daf  cache       2        stop     complete  6m45s ago   1m40s ago
33eac15d  b62b7daf  cache       2        stop     complete  7m10s ago   1m39s ago
47dff45a  b62b7daf  cache       1        stop     complete  17m27s ago  1m44s ago
baf43930  b62b7daf  cache       1        stop     complete  17m27s ago  1m44s ago
dfee8772  b62b7daf  cache       1        stop     complete  39m19s ago  1m44s ago

A little more of the Nomad tutorial remains. I still have not finished what I wanted to look into on Kubernetes. I also want to try Kafka.