Elixir Syn: Custom Event Handler Callbacks for Process Conflict Resolution

👋 I'm a dev at Supabase

I work on logging and analytics, and manage the underlying service that powers Supabase Logs and Logflare. The service handles over 7 billion requests each day, with traffic constantly growing, and these devlog posts talk a bit about my day-to-day open source dev work.

It serves as some insight into what one can expect when working on high availability software, with real code snippets and PRs too. Enjoy! 😊

The :syn library provides a distributed process registry for Elixir applications, offering an alternative to :global for name registration across clusters. It allows you to define custom event handler callbacks to handle process conflicts and registration scenarios.

The out-of-the-box features will suit the majority of use cases, but there are a few important behaviours to consider:

  1. By default, :syn keeps the most recently registered process. The state held by the older process may be lost as part of conflict resolution.
  2. By default, :syn compares process recency with millisecond precision. In clusters with a high number of nodes, two registrations can share the same timestamp, so conflicts may be resolved incorrectly unless you add a deterministic resolution strategy.
  3. Once a custom event handler callback is implemented, it overrides the default behaviour of :syn, and all process conflicts MUST be resolved and handled within the callback. :syn will not perform any cleanup of processes after the callback returns, so it is very important to terminate all unwanted processes within the callback to prevent memory leaks or other unexpected behaviour.

Understanding Syn Event Handlers

When multiple processes attempt to register with the same name across a distributed cluster, :syn provides custom event handlers to resolve these conflicts. These handlers are useful for process migration between nodes, network partition recovery, supervisor restart scenarios, and cases where high-precision timestamp-based conflict resolution is needed.

Let's explore a few scenarios where custom event handlers can be useful.

Killing Processes and Supervisors

In scenarios where you want to ensure only one process exists for a given name, you might want to terminate conflicting processes or their supervisors.

defmodule MyApp.SynEventHandler do
  @behaviour :syn_event_handler

  @impl true
  def on_process_registered(_scope, _name, _pid, _meta, _reason) do
    # Process successfully registered
    :ok
  end

  @impl true
  def on_process_unregistered(_scope, _name, _pid, _meta, _reason) do
    # Process unregistered
    :ok
  end

  # Each side of the conflict arrives as {pid, meta, time}; the return value
  # is the pid to keep.
  @impl true
  def resolve_registry_conflict(_scope, _name, {pid1, meta1, _time1}, {pid2, meta2, _time2}) do
    # Terminate the losing process, through its supervisor when one is found
    case compare_registration_priority(meta1, meta2) do
      :keep_first ->
        terminate_process_and_supervisor(pid2)
        pid1

      :keep_second ->
        terminate_process_and_supervisor(pid1)
        pid2
    end
  end

  defp terminate_process_and_supervisor(pid) do
    # Find the supervisor and ask it to terminate the child
    case find_supervisor(pid) do
      {:ok, supervisor_pid} ->
        # Assumes a DynamicSupervisor, which terminates children by pid.
        # A plain Supervisor expects the child id instead.
        DynamicSupervisor.terminate_child(supervisor_pid, pid)

      :error ->
        try_to_stop_process(pid)
    end
  end

  # Tries to stop a process gracefully. If stopping raises, it sends the
  # force signal. If the stop call exits (for example because the process is
  # already gone or the stop timed out), it returns :noop.
  @spec try_to_stop_process(pid(), atom(), atom()) :: :ok | :noop
  defp try_to_stop_process(pid, signal \\ :shutdown, force_signal \\ :kill) do
    GenServer.stop(pid, signal, 5_000)
    :ok
  rescue
    _ ->
      Process.exit(pid, force_signal)
      :ok
  catch
    :exit, _ ->
      :noop
  end

  defp find_supervisor(_pid) do
    # Placeholder: implementation to find the supervisor of a given process.
    # This could involve walking the supervision tree.
    :error
  end

  defp compare_registration_priority(_meta1, _meta2) do
    # Placeholder: custom logic to determine which process should be kept.
    # Could be based on node priority, timestamps, etc.
    :keep_first
  end
end
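The find_supervisor/1 helper above is left as a stub. One possible way to fill it in, sketched here under the assumption that the process was started through proc_lib (GenServer, Agent, Supervisor and friends) and runs on the local node, is to read the "$ancestors" entry from its process dictionary. The module name SupervisorLookup is hypothetical and not part of :syn.

```elixir
defmodule SupervisorLookup do
  # Hypothetical helper: returns {:ok, parent_pid} for a local process that
  # was started through proc_lib, or :error when no parent can be determined.
  def find_supervisor(pid) when is_pid(pid) and node(pid) == node() do
    with {:dictionary, dict} <- Process.info(pid, :dictionary),
         {:"$ancestors", [parent | _]} <- List.keyfind(dict, :"$ancestors", 0),
         parent_pid when is_pid(parent_pid) <- to_pid(parent) do
      {:ok, parent_pid}
    else
      _ -> :error
    end
  end

  # Remote pids cannot be inspected this way, so give up on them.
  def find_supervisor(_pid), do: :error

  # Ancestors are stored either as pids or as registered names.
  defp to_pid(pid) when is_pid(pid), do: pid
  defp to_pid(name) when is_atom(name), do: Process.whereis(name)
end
```

The first ancestor is the process that started the child, which is usually, but not necessarily, its supervisor, so treat the result as a hint rather than a guarantee.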

Keeping the Original Process

Sometimes you want to preserve the original process and reject new registration attempts:

defmodule MyApp.KeepOriginalHandler do
  @behaviour :syn_event_handler

  require Logger

  @impl true
  def resolve_registry_conflict(_scope, name, {pid1, _meta1, timestamp1}, {pid2, _meta2, timestamp2}) do
    # Always keep the first registered process.
    # The timestamps are in millisecond precision.
    # Note: this example does not stop the discarded process.
    if timestamp1 < timestamp2 do
      Logger.info("Keeping original process #{inspect(pid1)} for #{inspect(name)}")
      pid1
    else
      Logger.info("Keeping original process #{inspect(pid2)} for #{inspect(name)}")
      pid2
    end
  end
end

However, what if we somehow have a situation where the timestamps are exactly the same (no matter how unlikely it is)? We can use nanosecond timestamps stored in process metadata to resolve the conflict with higher precision.
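To make the tie concrete, here is a small sketch using two made-up nanosecond timestamps (illustrative values, not taken from a real cluster) that fall within the same millisecond:

```elixir
# Two hypothetical registration times, 0.8 ms apart, expressed in nanoseconds.
t1_ns = 1_700_000_000_000_100_000
t2_ns = 1_700_000_000_000_900_000

to_ms = fn ns -> System.convert_time_unit(ns, :nanosecond, :millisecond) end

# Both values truncate to the same millisecond, so a millisecond
# comparison cannot tell which registration came first.
same_ms? = to_ms.(t1_ns) == to_ms.(t2_ns)

# The nanosecond values still order the two registrations.
first_is_older? = t1_ns < t2_ns

IO.inspect({same_ms?, first_is_older?})
```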

Nanosecond Timestamp Resolution

First, register processes with nanosecond timestamp metadata:

defmodule MyApp.MyProcess do
  use GenServer

  @doc """
  Starts the process and registers it with nanosecond timestamp metadata for high-precision conflict resolution.
  """
  def start_link(some_args) do
    nanosecond_timestamp = System.os_time(:nanosecond)

    GenServer.start_link(__MODULE__, some_args,
      name: {:via, :syn, {:my_scope, __MODULE__, %{timestamp: nanosecond_timestamp}}}
    )
  end

  @impl true
  def init(some_args), do: {:ok, some_args}
end

Then implement the event handler with fallback to syn's built-in millisecond timestamp when metadata isn't available:

defmodule MyApp.SynEventHandler do
  @moduledoc """
  Event handler for syn. Always keeps the oldest process.
  """
  @behaviour :syn_event_handler

  require Logger

  @impl true
  def resolve_registry_conflict(_scope, name, pid_meta1, pid_meta2) do
    {original, to_stop} = keep_original(pid_meta1, pid_meta2)
    Logger.info("Keeping #{inspect(original)} for #{inspect(name)}")

    # Only stop the process if it lives on the local node
    if node() == node(to_stop) do
      try_to_stop_process(to_stop, :shutdown, :kill)
    end

    original
  end

  # Use nanosecond-precision timestamp from metadata when available
  defp keep_original(
         {pid1, %{timestamp: timestamp1}, _syn_timestamp1},
         {pid2, %{timestamp: timestamp2}, _syn_timestamp2}
       ) do
    if timestamp1 < timestamp2, do: {pid1, pid2}, else: {pid2, pid1}
  end

  # Fallback to syn's built-in millisecond timestamp when metadata isn't present
  defp keep_original(
         {pid1, _meta1, syn_timestamp1},
         {pid2, _meta2, syn_timestamp2}
       ) do
    if syn_timestamp1 < syn_timestamp2, do: {pid1, pid2}, else: {pid2, pid1}
  end

  defp try_to_stop_process(pid, signal, force_signal) do
    GenServer.stop(pid, signal, 5_000)
  rescue
    _ -> Process.exit(pid, force_signal)
  catch
    :exit, _ -> :noop
  end
end
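Even nanosecond timestamps can, in principle, be equal. A minimal sketch of one way to make keep_original fully deterministic is to fall back to the node name when the timestamps tie, so every node reaches the same decision. The ConflictOrder module and the node-name tie-breaker are assumptions for illustration, not something :syn provides.

```elixir
defmodule ConflictOrder do
  # Hypothetical variant of keep_original/2 that returns {pid_to_keep, pid_to_stop}.
  # Prefers the metadata timestamp when both sides carry one.
  def keep_original({pid1, %{timestamp: t1}, _syn1}, {pid2, %{timestamp: t2}, _syn2}) do
    pick({t1, pid1}, {t2, pid2})
  end

  # Otherwise falls back to the timestamps recorded by syn itself.
  def keep_original({pid1, _meta1, syn1}, {pid2, _meta2, syn2}) do
    pick({syn1, pid1}, {syn2, pid2})
  end

  # Orders by timestamp first, then by node name as the tie-breaker.
  # On a full tie (same timestamp, same node) the first argument is kept.
  defp pick({t1, pid1}, {t2, pid2}) do
    if {t1, node(pid1)} <= {t2, node(pid2)}, do: {pid1, pid2}, else: {pid2, pid1}
  end
end
```

Because node names are unique within a cluster, two conflicting registrations from different nodes can never compare as equal under this ordering.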

Configuration and Usage of a Custom Event Handler

Configure your syn event handler in your application:

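# In your application.ex
def start(_type, _args) do
  # syn runs as its own OTP application, so it is configured through its API
  # rather than started as a child of your supervisor.
  :syn.set_event_handler(MyApp.SynEventHandler)
  :syn.add_node_to_scopes([:my_scope])

  children = [
    # Your own children...
  ]

  Supervisor.start_link(children, strategy: :one_for_one)
end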

Register processes with metadata for conflict resolution:

# Register with timestamp metadata
:syn.register(:my_scope, "unique_name", self(), %{
  # The key matches the %{timestamp: ...} pattern used by the event handler.
  # OS time is used because monotonic time is not comparable across nodes.
  timestamp: System.os_time(:nanosecond),
  node: Node.self(),
  priority: 1
})

Best Practices

  1. Always include timestamps in metadata for conflict resolution
  2. Handle supervisor relationships carefully when terminating processes
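  3. Use OS system time (for example System.os_time/1) rather than monotonic time for ordering across nodes, since monotonic time is only comparable within a single node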
  4. Log conflict resolutions for debugging and monitoring
  5. Test partition scenarios thoroughly

Monitoring and Observability

Monitor syn registry conflicts and resolutions:

# Add telemetry events in your event handler
def resolve_registry_conflict(scope, name, proc1, proc2) do
  :telemetry.execute(
    [:syn, :conflict, :resolved],
    %{count: 1},
    %{scope: scope, name: name}
  )

  # ... conflict resolution logic, returning the pid to keep
end
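On the consuming side, a handler has to be attached for the event to be observed. A minimal sketch, assuming the :telemetry package is available and using a hypothetical MyApp.ConflictLogger module and handler id, could look like this:

```elixir
defmodule MyApp.ConflictLogger do
  require Logger

  # Must match the event name emitted in the event handler.
  @event [:syn, :conflict, :resolved]

  # Call once at startup, for example from your application's start/2.
  def attach do
    :telemetry.attach("log-syn-conflicts", @event, &__MODULE__.handle_event/4, nil)
  end

  # Telemetry handlers receive the event name, measurements, metadata and config.
  def handle_event(@event, %{count: count}, %{scope: scope, name: name}, _config) do
    Logger.info(
      "syn conflict resolved scope=#{inspect(scope)} name=#{inspect(name)} count=#{count}"
    )
  end
end
```

Logging is only one option here. The same handler could instead forward the measurement to whatever metrics system you already use.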

The :syn library's event handler system enables you to manage distributed process registration conflicts, resulting in robust and predictable behavior in complex distributed systems.