
Hi, 靖宗 here.
Next up is "Dynamic supervisors".
This post picks up where the previous one left off.

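I had nearly forgotten, but at the end of chapter 3 we ended up with

{:ok, pid} = KV.Bucket.start_link([]) # link!
ref = Process.monitor(pid) # monitor!

and the conclusion was that linking and monitoring the same process is redundant and not great.
As things stand, if a Bucket terminates abnormally the Registry goes down with it, and we could lose all of our data.
So we will work on the implementation until the following test, added to test/kv/registry_test.exs, passes.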

test "removes bucket on crash", %{registry: registry} do
  KV.Registry.create(registry, "shopping")
  {:ok, bucket} = KV.Registry.lookup(registry, "shopping")

  # Stop the bucket with non-normal reason
  Agent.stop(bucket, :shutdown) # this is the only difference
  assert KV.Registry.lookup(registry, "shopping") == :error
end

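This is the same as the "removes bucket on exit" test we wrote earlier, except that here the bucket is stopped with Agent.stop(bucket, :shutdown).
When a process terminates for a reason other than :normal, the exit signal it sends to its linked processes takes them down too, unless they are trapping exits. (The Registry is linked to the bucket and does not trap exits, so it terminates as well.)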
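To make the link behaviour concrete, here is a minimal sketch using plain processes instead of the real Registry and Bucket. `LinkDemo` and `parent_survives?/1` are hypothetical names made up for this illustration; the parent stands in for the Registry and the child for a bucket.

```elixir
defmodule LinkDemo do
  # Spawns a parent that links to a child, makes the child exit with `reason`,
  # and returns true if the parent is still alive afterwards.
  def parent_survives?(reason) do
    caller = self()

    parent =
      spawn(fn ->
        child =
          spawn_link(fn ->
            receive do
              :stop -> exit(reason)
            end
          end)

        send(caller, {:child, child})

        # Wait forever; the parent only ends if an exit signal takes it down.
        receive do
          :never_sent -> :ok
        end
      end)

    parent_ref = Process.monitor(parent)

    child =
      receive do
        {:child, pid} -> pid
      end

    send(child, :stop)

    receive do
      {:DOWN, ^parent_ref, :process, _pid, _reason} -> false
    after
      200 ->
        # The parent survived; clean it up before returning.
        Process.demonitor(parent_ref, [:flush])
        Process.exit(parent, :kill)
        true
    end
  end
end

IO.inspect(LinkDemo.parent_survives?(:normal), label: "child exits :normal")
IO.inspect(LinkDemo.parent_survives?(:shutdown), label: "child exits :shutdown")
```

With :normal the parent keeps running, while with :shutdown it goes down together with the child, which is the situation the Registry is in.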
Let's run the tests and see.

PS > mix test
...

  1) test removes bucket on crash (KV.RegistryTest)
     test/kv/registry_test.exs:26
     ** (exit) exited in: GenServer.call(#PID<0.165.0>, {:lookup, "shopping"}, 5000)
         ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
     code: assert KV.Registry.lookup(registry, "shopping") == :error
     stacktrace:
       (elixir) lib/gen_server.ex:989: GenServer.call/3
       test/kv/registry_test.exs:32: (test)

...

Finished in 0.04 seconds
1 doctest, 6 tests, 1 failure

Randomized with seed 710000

Because a Bucket started by the Registry terminated abnormally, the Registry linked to it terminated as well, so the process (the Registry) that GenServer.call/3 is trying to reach no longer exists.

DynamicSupervisor is apparently the tool for solving this, so let's see how it is put to use.

The bucket supervisor

For now I will just follow the sample.
We define KV.BucketSupervisor by editing init in lib/kv/supervisor.ex.

  def init(:ok) do
    children = [
      {KV.Registry, name: KV.Registry},
      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one} # added
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end

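It looks like there is no need to create a dedicated KV.BucketSupervisor module; it is just the name given to the DynamicSupervisor.
Let's check that it works properly with iex -S mix.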

iex> {:ok, bucket} = DynamicSupervisor.start_child(KV.BucketSupervisor, KV.Bucket)
{:ok, #PID<0.72.0>}
iex> KV.Bucket.put(bucket, "eggs", 3)
:ok
iex> KV.Bucket.get(bucket, "eggs")
3

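With this, buckets no longer have to be started directly with KV.Bucket.start_link.
The name "dynamic" presumably comes from the fact that children are started one at a time, on demand, instead of being listed up front.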
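As a rough sketch of the "on demand" part outside of the KV project: the supervisor below starts with zero children and gains one per start_child call. `DemoBucket` is a hypothetical stand-in for KV.Bucket, and the supervisor is left unnamed on purpose so the snippet does not clash with anything else.

```elixir
defmodule DemoBucket do
  use Agent

  # Hypothetical stand-in for KV.Bucket: an Agent holding a map.
  def start_link(_opts), do: Agent.start_link(fn -> %{} end)
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

# No children are declared up front.
IO.inspect(DynamicSupervisor.count_children(sup).active, label: "before")

# Children are added at runtime, one call at a time.
{:ok, a} = DynamicSupervisor.start_child(sup, DemoBucket)
{:ok, b} = DynamicSupervisor.start_child(sup, DemoBucket)
IO.inspect(DynamicSupervisor.count_children(sup).active, label: "after")

# Each child is an independent process with its own state.
Agent.update(a, &Map.put(&1, "eggs", 3))
IO.inspect(Agent.get(a, &Map.get(&1, "eggs")), label: "eggs in a")
IO.inspect(Agent.get(b, &Map.get(&1, "eggs")), label: "eggs in b")
```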

Now we modify KV.Registry so that it uses this DynamicSupervisor.

  def handle_cast({:create, name}, {names, refs}) do
    if Map.has_key?(names, name) do
      {:noreply, {names, refs}}
    else
      {:ok, pid} = DynamicSupervisor.start_child(KV.BucketSupervisor, KV.Bucket) # changed
      ref = Process.monitor(pid)
      refs = Map.put(refs, ref, name)
      names = Map.put(names, name, pid)
      {:noreply, {names, refs}}
    end
  end
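To try the same idea in isolation, here is a stripped-down sketch that is not the KV code itself. `MiniRegistry` is a hypothetical module; unlike KV.Registry it uses a synchronous call for create, and its buckets are plain Agents marked restart: :temporary so that the supervisor does not start replacements the registry would not know about (both are assumptions made for this sketch).

```elixir
defmodule MiniRegistry do
  use GenServer

  # `bucket_sup` is the DynamicSupervisor that will own the buckets.
  def start_link(bucket_sup), do: GenServer.start_link(__MODULE__, bucket_sup)
  def create(registry, name), do: GenServer.call(registry, {:create, name})
  def lookup(registry, name), do: GenServer.call(registry, {:lookup, name})

  @impl true
  def init(bucket_sup), do: {:ok, %{sup: bucket_sup, names: %{}, refs: %{}}}

  @impl true
  def handle_call({:lookup, name}, _from, state) do
    {:reply, Map.fetch(state.names, name), state}
  end

  def handle_call({:create, name}, _from, state) do
    if Map.has_key?(state.names, name) do
      {:reply, :ok, state}
    else
      spec = %{id: :bucket, start: {Agent, :start_link, [fn -> %{} end]}, restart: :temporary}
      # The bucket is linked to the supervisor, not to this registry.
      {:ok, pid} = DynamicSupervisor.start_child(state.sup, spec)
      # The registry only monitors the bucket.
      ref = Process.monitor(pid)
      names = Map.put(state.names, name, pid)
      refs = Map.put(state.refs, ref, name)
      {:reply, :ok, %{state | names: names, refs: refs}}
    end
  end

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, _reason}, state) do
    {name, refs} = Map.pop(state.refs, ref)
    {:noreply, %{state | names: Map.delete(state.names, name), refs: refs}}
  end

  def handle_info(_msg, state), do: {:noreply, state}
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
{:ok, registry} = MiniRegistry.start_link(sup)
:ok = MiniRegistry.create(registry, "shopping")
{:ok, bucket} = MiniRegistry.lookup(registry, "shopping")

Agent.stop(bucket, :shutdown)
# Give the :DOWN message a moment to reach the registry.
Process.sleep(50)

registry_alive = Process.alive?(registry)
lookup_after_crash = MiniRegistry.lookup(registry, "shopping")
IO.inspect({registry_alive, lookup_after_crash})
```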

With this, the Registry should no longer be taken down as well.
Let's run the tests.

PS > mix test
Compiling 2 files (.ex)
.......

Finished in 0.03 seconds
1 doctest, 6 tests, 0 failures

Randomized with seed 614000

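It passes ✌('ω')
However, the Bucket stopped in the test is apparently being restarted behind the scenes: the DynamicSupervisor detects the termination and starts a new one.
The Registry currently has no way of learning that new process's PID, so a process nobody knows about just keeps running.
To fix this, we need to state explicitly on the Bucket: "if this crashes, do not restart it".
We write that in KV.Bucket.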

defmodule KV.Bucket do
  use Agent, restart: :temporary

Apparently this can be given as an option to use Agent.
Let's also write a test that checks that :temporary is explicitly set on the Bucket.

  test "are temporary workers" do
    assert Supervisor.child_spec(KV.Bucket, []).restart == :temporary
  end

As long as the option is passed to use Agent, the test should pass without any trouble.
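The difference the option makes can be observed with a small experiment. This is only a sketch: `PermanentBucket`, `TemporaryBucket` and `RestartDemo` are hypothetical names, and the short sleep is just a crude way of giving the supervisor time to react.

```elixir
defmodule PermanentBucket do
  # Default restart value (:permanent)
  use Agent
  def start_link(_opts), do: Agent.start_link(fn -> %{} end)
end

defmodule TemporaryBucket do
  use Agent, restart: :temporary
  def start_link(_opts), do: Agent.start_link(fn -> %{} end)
end

defmodule RestartDemo do
  # Starts one child of `module`, stops it with :shutdown, and returns how many
  # children the supervisor is running afterwards.
  def children_after_shutdown(module) do
    {:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
    {:ok, pid} = DynamicSupervisor.start_child(sup, module)

    Agent.stop(pid, :shutdown)
    # Crude wait so the supervisor has handled the exit before we count.
    Process.sleep(100)

    %{active: active} = DynamicSupervisor.count_children(sup)
    DynamicSupervisor.stop(sup)
    active
  end
end

IO.inspect(RestartDemo.children_after_shutdown(PermanentBucket), label: "permanent")
IO.inspect(RestartDemo.children_after_shutdown(TemporaryBucket), label: "temporary")
```

The permanent bucket is replaced by a fresh process, so one child is still running; the temporary one is simply gone.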

At this point one might think "if it never gets restarted, why put it under a Supervisor at all?", but apparently there are benefits to running processes inside a supervision tree.
Supervision tree?

Supervision trees

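A supervision tree seems to be the hierarchy you get when supervisors supervise other supervisors and workers. Here KV.Supervisor sits above KV.Registry and KV.BucketSupervisor, and KV.BucketSupervisor sits above the buckets.
In this section we adjust the start order of the children, how they depend on each other, and the restart behavior (strategy).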

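First, in this application KV.Supervisor manages two children: KV.Registry and KV.BucketSupervisor.
KV.Registry calls KV.BucketSupervisor to start buckets, so KV.BucketSupervisor has to be up before KV.Registry starts. Children are started in the order they are listed, so we make that order explicit.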

  def init(:ok) do
    children = [
      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one}, # order swapped
      {KV.Registry, name: KV.Registry} # order swapped
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end

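Next we adjust the restart behavior (strategy).
If KV.Registry terminates while the buckets stay alive, we are left with pointless processes that nobody can reach any more.
So we change things so that when KV.Registry terminates, all KV.Bucket processes are terminated too.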

  def init(:ok) do
    children = [
      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one},
      {KV.Registry, name: KV.Registry}
    ]

    Supervisor.init(children, strategy: :one_for_all) # changed
  end

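There appear to be three strategies: :one_for_one, where the children are independent and only the child that crashed is restarted; :rest_for_one, where the crashed child and every child started after it are terminated and restarted; and :one_for_all, where a crash in any one child causes all the other children to be terminated and restarted as well.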
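The three strategies can be compared side by side with a small experiment. This is a sketch only: `StrategyDemo` is a hypothetical helper, the children are throwaway Agents with ids :a, :b and :c, and the sleep is a crude wait for the restarts to finish.

```elixir
defmodule StrategyDemo do
  # Starts three agents :a, :b, :c (in that order) under `strategy`, stops :b
  # abnormally, and returns the ids of the children whose pid changed.
  def restarted_after_stopping_b(strategy) do
    children =
      for id <- [:a, :b, :c] do
        %{id: id, start: {Agent, :start_link, [fn -> id end]}}
      end

    {:ok, sup} = Supervisor.start_link(children, strategy: strategy)
    old = pids(sup)

    Agent.stop(old[:b], :shutdown)
    # Crude wait so the supervisor can finish restarting children.
    Process.sleep(100)

    new = pids(sup)
    restarted = for id <- [:a, :b, :c], old[id] != new[id], do: id
    Supervisor.stop(sup)
    restarted
  end

  # Builds a map of child id => pid from Supervisor.which_children/1.
  defp pids(sup) do
    for {id, pid, _type, _modules} <- Supervisor.which_children(sup), into: %{} do
      {id, pid}
    end
  end
end

for strategy <- [:one_for_one, :rest_for_one, :one_for_all] do
  IO.inspect(StrategyDemo.restarted_after_stopping_b(strategy), label: strategy)
end
```

Only :b comes back under :one_for_one, :b and :c under :rest_for_one, and all three under :one_for_all.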

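The order and the strategy probably deserve careful thought. This looks like exactly the kind of setting that leads to "it doesn't work for some reason" or "it works for some reason"...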

Shared state in tests

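Now that the Registry and friends are managed by a supervisor, we need to check whether the tests written so far are still fine.
In particular, the Registry used to manage the Buckets on its own, but the Buckets are now started under a supervisor, so the tests may end up behaving slightly differently. Specifically, with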

setup do
  registry = start_supervised!(KV.Registry)
  %{registry: registry}
end

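each test is set up to use its own Registry.
However, every Bucket is registered under the single, globally named KV.BucketSupervisor, so a test that counts the total number of Buckets, for example with DynamicSupervisor.count_children(KV.BucketSupervisor), would also see buckets created by other tests and would need to be adjusted.
Nothing needs fixing this time, but this is something to watch out for when testing applications that use supervisors.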
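A tiny sketch of how that shared state shows up: two "tests" (here just two calls to an anonymous function) each create one bucket under the same named supervisor, and the second one sees a count of 2. `SharedBucketSupervisor` is a hypothetical name standing in for KV.BucketSupervisor.

```elixir
{:ok, _sup} =
  DynamicSupervisor.start_link(name: SharedBucketSupervisor, strategy: :one_for_one)

# Stand-in for a test body: create one bucket, then count all buckets.
simulated_test = fn ->
  spec = %{id: :bucket, start: {Agent, :start_link, [fn -> %{} end]}, restart: :temporary}
  {:ok, _bucket} = DynamicSupervisor.start_child(SharedBucketSupervisor, spec)
  DynamicSupervisor.count_children(SharedBucketSupervisor).active
end

first = simulated_test.()
second = simulated_test.()

# The second "test" also counts the bucket left over from the first one.
IO.inspect({first, second})
```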

Observer

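This is the one that did not come up back in the Debugging section.
At the time I did not really understand the iex -S mix command, but now I do, so let's give it a try.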

iex> :observer.start

It started ✌('ω')

(Screenshot: the Observer window)

The relationships between the supervisors (the supervision tree?) are laid out clearly in a GUI.
Apparently, adding a Bucket at this point is reflected in the GUI as well.

iex(2)> KV.Registry.create(KV.Registry, "shopping")
:ok
iex(3)> KV.Registry.lookup(KV.Registry, "shopping")
{:ok, #PID<0.176.0>}

(Screenshot: the Observer window after creating the "shopping" bucket)

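This is really nice ^ω^
You can double-click to see details, or right-click to kill a process or send it a message. Ridiculously handy.
Being able to operate things from this GUI seems like reason enough on its own to manage processes with supervisors.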

Introduction to Elixir (Mix and OTP, chapter 5: Dynamic supervisors)