Class: Evilution::ProcessSupervisor
- Inherits: Object
  - Object
    - Evilution::ProcessSupervisor
- Defined in: lib/evilution/process_supervisor.rb
Overview
Single owner of the process-lifecycle invariant: every pid spawned here is group-isolated, tracked in a signal-safe registry, group-signalled through a TERM/KILL ladder, and reaped – with its fds closed and sandbox dir removed.
EV-9f3b / EV-5rrh, Track A step 1. Generalizes the lock-free COW WorkerRegistry (EV-jwao) and absorbs ProcessCleanup.safe_kill/safe_wait semantics. Pure unit: no call sites are migrated here – Isolation::Fork (inner path) and WorkQueue::Worker (outer path) are routed through it in later steps (EV-3aw3, EV-dg69, EV-7a91).
Shape: instances own the lifecycle of the children they spawn, but every handle is also recorded in ONE process-global registry, so the Runner signal trap can call `.signal_all` and reach every fork site through a single owner.
Signal-safety: under MRI a trap handler runs on the main thread between VM instructions, so it must not acquire a Mutex (the main thread may hold it -> deadlock). register/unregister swap @registry for a freshly built frozen array via a single atomic reference assignment (copy-on-write). The trap reads the current reference once and iterates that complete, immutable snapshot – no torn reads, no lock.
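The copy-on-write swap described above can be sketched in isolation. This is a minimal illustration rather than the library's code: `CowRegistry` and `Entry` are hypothetical names, and the sketch assumes MRI, where publishing a new array is a single reference assignment.

```ruby
# Hypothetical stand-in for the supervisor's handle; only pid/pgid matter here.
Entry = Struct.new(:pid, :pgid)

# Minimal copy-on-write registry (illustrative, not the library's class).
module CowRegistry
  @entries = [].freeze

  class << self
    # Readers (including trap handlers) take the reference once and iterate
    # that frozen snapshot; later writes cannot mutate it.
    attr_reader :entries

    def register(entry)
      # Build a new frozen array, then publish it with one assignment.
      @entries = (@entries + [entry]).freeze
    end

    def unregister(entry)
      @entries = @entries.reject { |e| e.pid == entry.pid }.freeze
    end
  end
end

CowRegistry.register(Entry.new(101, 101))
snapshot = CowRegistry.entries # what a trap handler would capture
CowRegistry.register(Entry.new(202, 202))
CowRegistry.unregister(Entry.new(101, 101))

snapshot.map(&:pid)            # the old snapshot is unchanged: [101]
CowRegistry.entries.map(&:pid) # the current registry: [202]
```

Because writers never mutate an array that has already been published, a reader holding `snapshot` iterates a complete list even if a registration happens mid-iteration.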
Defined Under Namespace
Classes: Handle
Constant Summary
- GRACE_PERIOD = 2
Class Attribute Summary
- .registry ⇒ Object (readonly)
  Frozen snapshot.
Class Method Summary
- .kill_and_reap_all ⇒ Object
  Trap-safe teardown of every registered child: SIGKILL each process group (sweeping grandchildren) and the bare leader pid, then reap the leaders so they cannot zombie, and clear the registry.
- .register(handle) ⇒ Object
- .reset_for_child! ⇒ Object
  Drop every inherited entry so a freshly forked child starts owning nothing.
- .signal_all(sig) ⇒ Object
- .unregister(handle) ⇒ Object
Instance Method Summary
- #reap(handle) ⇒ Object
  Reap the leader (ECHILD-tolerant if already reaped), then unconditionally release the resources the handle owns: close parent-side fds, remove the sandbox dir, and drop the handle from the registry.
- #reap_nonblock(handle) ⇒ Object
  Non-blocking reap for callers that poll a child’s liveness as part of a read protocol (e.g. Isolation::Fork’s marshal-pipe loop).
- #signal_group(sig, handle) ⇒ Object
  Signal the child’s whole process group (-pgid) to sweep any grandchildren, then the bare pid as a fallback for the case where setpgid failed (no group exists, so the group signal is a harmless Errno::ESRCH).
- #spawn(sandbox_dir: nil, fds: [], isolate_in_child: true) ⇒ Object
  Fork a child that becomes its own process-group leader and runs the block, returning a Handle.
- #terminate(handle, grace: GRACE_PERIOD) ⇒ Object
  Bounded TERM -> grace -> KILL ladder, then reap.
Class Attribute Details
.registry ⇒ Object (readonly)
Frozen snapshot. Safe to read from a signal handler.
    # File 'lib/evilution/process_supervisor.rb', line 39
    def registry
      @registry
    end
Class Method Details
.kill_and_reap_all ⇒ Object
Trap-safe teardown of every registered child: SIGKILL each process group (sweeping grandchildren) and the bare leader pid, then reap the leaders so they cannot zombie, and clear the registry. Reads the COW snapshot once – no Mutex, safe from a signal handler.
EV-7a91: a process about to die on a fatal signal must not leave the children it OWNS behind. The Runner’s group-kill reaches only the worker groups; the inner per-mutation children left those groups (setpgid, EV-2sh8) and live in the worker’s own registry, so only the worker – their parent – can kill AND reap them before it dies. Without the reap they survive as zombies until some ancestor exits and init collects them, which never comes when evilution runs embedded in a long-lived host process.
    # File 'lib/evilution/process_supervisor.rb', line 80
    def kill_and_reap_all
      snapshot = @registry
      snapshot.each do |handle|
        kill_tolerant("KILL", -handle.pgid)
        kill_tolerant("KILL", handle.pid)
      end
      # Reap only after every group has been signalled, so a slow-to-die child
      # never delays killing the others' subtrees.
      snapshot.each { |handle| reap_tolerant(handle.pid) } # rubocop:disable Style/CombinableLoops
      @registry = (@registry - snapshot).freeze
    end
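The helpers `kill_tolerant` and `reap_tolerant` are not shown on this page. The sketch below is one plausible shape for them, assuming they simply swallow the errors raised when the target is already gone; the bodies are hypothetical, not the library's implementation.

```ruby
# Hypothetical bodies for the two helpers; the real ones are not shown here.

# Send a signal, tolerating "no such process/group" and permission errors.
def kill_tolerant(sig, pid)
  Process.kill(sig, pid)
  true
rescue Errno::ESRCH, Errno::EPERM
  false
end

# Blocking reap that tolerates a child that has already been reaped.
def reap_tolerant(pid)
  Process.wait(pid)
  true
rescue Errno::ECHILD
  false
end

pid = fork { sleep 60 }          # a child that would otherwise outlive us
first_kill  = kill_tolerant("KILL", pid)
first_reap  = reap_tolerant(pid) # collects the exit status, so no zombie
second_reap = reap_tolerant(pid) # already reaped: ECHILD is swallowed
```

Returning a boolean instead of raising keeps the teardown loop moving: one child that is already gone does not stop the remaining children from being killed and reaped.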
.register(handle) ⇒ Object
    # File 'lib/evilution/process_supervisor.rb', line 41
    def register(handle)
      @registry = (@registry + [handle]).freeze
    end
.reset_for_child! ⇒ Object
Drop every inherited entry so a freshly forked child starts owning nothing. A child inherits a COW copy of this registry, but the handles in it belong to the PARENT (e.g. sibling workers); if the child later signalled or reaped them – via signal_all / kill_and_reap_all in its own signal handler – it would tear down processes it never spawned. The child re-registers only what it spawns itself.
    # File 'lib/evilution/process_supervisor.rb', line 64
    def reset_for_child!
      @registry = [].freeze
    end
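The inheritance problem this method addresses can be observed with a small stand-in registry. `DemoRegistry` is a hypothetical module used only for illustration; it mirrors the class-level state described above but is not the library's class.

```ruby
# Illustrative registry standing in for ProcessSupervisor's class-level state.
module DemoRegistry
  @entries = [].freeze

  class << self
    attr_reader :entries

    def register(entry)
      @entries = (@entries + [entry]).freeze
    end

    def reset_for_child!
      @entries = [].freeze
    end
  end
end

DemoRegistry.register(:sibling_worker) # something the PARENT owns

reader, writer = IO.pipe
pid = fork do
  reader.close
  inherited = DemoRegistry.entries.size # the child sees the parent's entry
  DemoRegistry.reset_for_child!
  writer.puts "#{inherited},#{DemoRegistry.entries.size}"
  writer.close
  exit!(0) # skip at_exit handlers inherited from the parent
end
writer.close
inherited, after_reset = reader.read.strip.split(",").map(&:to_i)
reader.close
Process.wait(pid)

# inherited is 1 and after_reset is 0; the parent's registry still has 1 entry.
```

The reset only affects the child's copy of the registry, so the parent keeps tracking its own children.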
.signal_all(sig) ⇒ Object
    # File 'lib/evilution/process_supervisor.rb', line 49
    def signal_all(sig)
      @registry.each do |handle|
        Process.kill(sig, -handle.pgid)
      rescue Errno::ESRCH
        # Group already gone (leader + subtree reaped) -- nothing to signal.
        nil
      end
    end
.unregister(handle) ⇒ Object
    # File 'lib/evilution/process_supervisor.rb', line 45
    def unregister(handle)
      @registry = @registry.reject { |existing| existing.pid == handle.pid }.freeze
    end
Instance Method Details
#reap(handle) ⇒ Object
Reap the leader (ECHILD-tolerant if already reaped), then unconditionally release the resources the handle owns: close parent-side fds, remove the sandbox dir, and drop the handle from the registry.
    # File 'lib/evilution/process_supervisor.rb', line 160
    def reap(handle)
      safe_wait(handle.pid)
    ensure
      release(handle)
    end
#reap_nonblock(handle) ⇒ Object
Non-blocking reap for callers that poll a child’s liveness as part of a read protocol (e.g. Isolation::Fork’s marshal-pipe loop). Returns false while the child is still running – the handle stays registered so a signal trap can still reach it. Once the child has exited (or was already reaped), it releases the handle in the same step it reaps, so the process-global registry never holds a stale, already-reaped pgid.
    # File 'lib/evilution/process_supervisor.rb', line 172
    def reap_nonblock(handle)
      return false unless nonblocking_wait(handle.pid)

      release(handle)
      true
    end
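`nonblocking_wait` is not shown on this page. A minimal sketch of what such a helper might look like follows, assuming it is built on `Process.wait` with `Process::WNOHANG` and treats an already-reaped child as exited; the body is hypothetical.

```ruby
# Hypothetical body for nonblocking_wait; the real helper is not shown here.
# Returns true once the child has exited (or was already reaped), false while
# it is still running.
def nonblocking_wait(pid)
  # With WNOHANG, Process.wait returns nil if the child has not exited yet.
  !Process.wait(pid, Process::WNOHANG).nil?
rescue Errno::ECHILD
  true # already reaped elsewhere: treat as exited
end

reader, writer = IO.pipe
pid = fork do
  writer.close
  reader.read # block until the parent closes its write end
  exit!(0)
end
reader.close

still_running = !nonblocking_wait(pid) # the child is blocked on the pipe
writer.close                           # let the child exit
sleep 0.01 until nonblocking_wait(pid) # poll, as a read loop would
already_reaped = nonblocking_wait(pid) # ECHILD path
```

A caller polling inside a read loop can interleave this check with its reads and stop as soon as it returns true.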
#signal_group(sig, handle) ⇒ Object
Signal the child’s whole process group (-pgid) to sweep any grandchildren, then the bare pid as a fallback for the case where setpgid failed (no group exists, so the group signal is a harmless Errno::ESRCH).
    # File 'lib/evilution/process_supervisor.rb', line 141
    def signal_group(sig, handle)
      safe_kill(sig, -handle.pgid)
      safe_kill(sig, handle.pid)
    end
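The effect of signalling `-pgid` can be demonstrated with a child that forks a grandchild. In this sketch `safe_kill` is a hypothetical stand-in with assumed error handling, and the `alive` pipe is just a test device: its read end reaches EOF only after every descendant holding the write end has died.

```ruby
# Hypothetical tolerant kill, mirroring the semantics described for safe_kill.
def safe_kill(sig, pid)
  Process.kill(sig, pid)
rescue Errno::ESRCH, Errno::EPERM
  nil
end

ready_r, ready_w = IO.pipe
alive_r, alive_w = IO.pipe # write end stays open while any descendant lives

pid = fork do
  Process.setpgid(0, 0) # become leader of a new process group
  fork { sleep 60 }     # grandchild inherits the group and alive_w
  ready_w.puts "ready"
  ready_w.close
  sleep 60
end
ready_w.close
alive_w.close
ready_r.gets # wait until the grandchild exists

safe_kill("KILL", -pid) # negative pid: signal the whole process group
safe_kill("KILL", pid)  # fallback in case the group was never created
Process.wait(pid)

# alive_r becomes readable (EOF) only when child AND grandchild are both dead.
swept = !IO.select([alive_r], nil, nil, 5).nil?
```

Signalling only the bare pid would leave the grandchild running, and `IO.select` would time out instead of seeing EOF.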
#spawn(sandbox_dir: nil, fds: [], isolate_in_child: true) ⇒ Object
Fork a child that becomes its own process-group leader and runs the block, returning a Handle. By default the child calls setpgid(0, 0) before yielding so any grandchildren it forks join its group and can be swept by a group signal; the parent repeats setpgid(pid, pid) to close the race where it signals before the child has isolated itself. The handle is registered BEFORE the parent-side setpgid so the trap can never observe a child that is already a group leader yet missing from the registry (EV-jwao race).
isolate_in_child: false suppresses the child-side setpgid for long-lived workers (the outer path): the child must NOT become its own group leader until the parent has registered it, otherwise a trap firing between fork and register would see a leader it cannot signal. With only the parent-side, post-register setpgid, the child stays in the parent group (reachable by the terminal signal directly) until the registry already lists it.
    # File 'lib/evilution/process_supervisor.rb', line 121
    def spawn(sandbox_dir: nil, fds: [], isolate_in_child: true)
      pid = ::Process.fork do
        self.class.reset_for_child!
        isolate_self if isolate_in_child
        yield
      end
      # Track the sandbox first thing after fork: if the parent takes a fatal
      # signal before isolate_child returns, Runner's trap (TempDirTracker
      # .cleanup_all) can still see and remove it, narrowing the leak window.
      Evilution::TempDirTracker.register(sandbox_dir) if sandbox_dir
      handle = Handle.new(pid: pid, pgid: pid, fds: fds, sandbox_dir: sandbox_dir)
      self.class.register(handle)
      isolate_child(pid)
      handle
    end
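The double `setpgid` described above can be sketched on its own. The helpers below borrow the names `isolate_self` and `isolate_child` from the listing, but their bodies are assumptions: the real implementations are not shown on this page.

```ruby
# Hypothetical helpers named after the ones spawn calls; bodies are assumed.

# Child side: become the leader of a new process group.
def isolate_self
  Process.setpgid(0, 0)
rescue Errno::EPERM, Errno::EACCES
  nil
end

# Parent side: repeat the call so the group exists no matter who runs first.
def isolate_child(pid)
  Process.setpgid(pid, pid)
rescue Errno::ESRCH, Errno::EPERM, Errno::EACCES
  nil # child already exited or exec'd, or the call was not permitted
end

go_r, go_w = IO.pipe
pid = fork do
  go_w.close
  isolate_self
  go_r.read # stay alive until the parent is done inspecting
  exit!(0)
end
go_r.close

isolate_child(pid) # harmless if the child has already isolated itself
leader = Process.getpgid(pid) == pid
go_w.close
Process.wait(pid)
```

Whichever side wins the race, the child ends up leading a group whose id equals its pid, which is why the handle can record `pgid: pid` before either call has returned.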
#terminate(handle, grace: GRACE_PERIOD) ⇒ Object
Bounded TERM -> grace -> KILL ladder, then reap. Always ends with the child reaped and its resources released, whichever rung it dies on.
    # File 'lib/evilution/process_supervisor.rb', line 148
    def terminate(handle, grace: GRACE_PERIOD)
      signal_group("TERM", handle)
      unless exited?(handle.pid)
        sleep(grace)
        signal_group("KILL", handle) unless exited?(handle.pid)
      end
      reap(handle)
    end
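Both rungs of the ladder can be observed with two children, one that exits on TERM and one that ignores it. The sketch is simplified: `terminate_sketch` and `spawn_ready` are hypothetical names, and it operates on a bare pid rather than a Handle and its process group.

```ruby
# Simplified ladder on a bare pid (hypothetical; the real method works on a
# Handle and signals the whole group). Returns the child's Process::Status.
def terminate_sketch(pid, grace:)
  Process.kill("TERM", pid)
  _, status = Process.wait2(pid, Process::WNOHANG)
  return status if status # died immediately on TERM

  sleep(grace)
  _, status = Process.wait2(pid, Process::WNOHANG)
  return status if status # died during the grace period

  Process.kill("KILL", pid)
  Process.wait2(pid).last # KILL cannot be ignored, so this wait is bounded
end

# Fork a child, run the block in it to set up its TERM behaviour, and return
# only after the child reports that the setup is done.
def spawn_ready
  r, w = IO.pipe
  pid = fork do
    r.close
    yield
    w.puts "ready"
    w.close
    sleep 60
  end
  w.close
  r.gets
  r.close
  pid
end

polite   = spawn_ready { trap("TERM") { exit!(0) } }
stubborn = spawn_ready { trap("TERM", "IGNORE") }

polite_status   = terminate_sketch(polite, grace: 0.5)
stubborn_status = terminate_sketch(stubborn, grace: 0.5)
```

The cooperative child exits with status 0 on the first rung, while the one ignoring TERM is ended by SIGKILL on the second; in both cases the child has been reaped by the time the method returns.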