Class: Ignis::Collective::Topology::Matrix
- Inherits:
- Object
- Defined in:
- lib/nvruby/collective/topology.rb
Overview
Topology matrix for a set of GPUs
Instance Attribute Summary
-
#device_ids ⇒ Array<Integer>
readonly
List of GPU device IDs.
-
#paths ⇒ Hash<Array<Integer>, Path>
readonly
Map of [src, dst] to Path.
Instance Method Summary
-
#full_p2p_mesh? ⇒ Boolean
Check whether every path in the matrix supports P2P.
-
#initialize(device_ids) ⇒ Matrix
constructor
A new instance of Matrix.
-
#nvlink_paths ⇒ Array<Path>
Get all paths with NVLink connectivity.
-
#optimal_ring_order ⇒ Array<Integer>
Get a ring order based on topology. A greedy nearest-neighbor heuristic aims to reduce total latency by placing NVLink-connected GPUs adjacent.
-
#p2p_paths ⇒ Array<Path>
Get all paths with P2P support.
-
#path(src, dst) ⇒ Path?
Get path between two GPUs.
-
#to_s ⇒ String
Human-readable matrix representation.
Constructor Details
#initialize(device_ids) ⇒ Matrix
Returns a new instance of Matrix.
# File 'lib/nvruby/collective/topology.rb', line 103

def initialize(device_ids)
  @device_ids = device_ids.dup.freeze
  @paths = {}
  build_matrix!
end
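The constructor stores a frozen copy of the ID list, so later changes to the caller's array should not leak into the matrix. A minimal sketch of that copy-then-freeze idiom in plain Ruby, without the Matrix class itself; `caller_ids` and `stored_ids` are illustrative names, not part of the library:

```ruby
# Illustrative only: mirrors `device_ids.dup.freeze` from the constructor.
caller_ids = [0, 1, 2, 3]
stored_ids = caller_ids.dup.freeze

# Mutating the caller's array leaves the stored copy untouched.
caller_ids << 4

# The frozen copy rejects in-place mutation with FrozenError (Ruby 2.5+).
mutation_rejected =
  begin
    stored_ids << 5
    false
  rescue FrozenError
    true
  end

puts stored_ids.inspect
puts mutation_rejected
```

Note that `freeze` is shallow: it protects the array, not objects inside it. For Integer IDs that makes no difference.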
Instance Attribute Details
#device_ids ⇒ Array<Integer> (readonly)
Returns the list of GPU device IDs.
# File 'lib/nvruby/collective/topology.rb', line 97

def device_ids
  @device_ids
end
#paths ⇒ Hash<Array<Integer>, Path> (readonly)
Returns a map of [src, dst] to Path.
# File 'lib/nvruby/collective/topology.rb', line 100

def paths
  @paths
end
Instance Method Details
#full_p2p_mesh? ⇒ Boolean
Check whether every path in the matrix supports P2P.
# File 'lib/nvruby/collective/topology.rb', line 157

def full_p2p_mesh?
  @paths.values.all?(&:p2p_supported)
end
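The predicate can be tried on a plain hash. In this sketch `DemoPath` is a hypothetical stand-in for Path that models only the `p2p_supported` attribute, and `mesh_check` is an illustrative name:

```ruby
# Hypothetical stand-in for Path; only the attribute used here is modeled.
DemoPath = Struct.new(:p2p_supported)

full_mesh = {
  [0, 1] => DemoPath.new(true),
  [1, 0] => DemoPath.new(true)
}
partial_mesh = {
  [0, 1] => DemoPath.new(true),
  [1, 0] => DemoPath.new(false)
}

# Same predicate as full_p2p_mesh?, applied to a plain hash.
mesh_check = ->(paths) { paths.values.all?(&:p2p_supported) }

puts mesh_check.call(full_mesh)
puts mesh_check.call(partial_mesh)
puts mesh_check.call({}) # all? is true for an empty collection
```

Because `all?` returns true for an empty collection, a matrix whose paths hash is empty would report a full mesh. Callers who care about that case may want to check `paths.empty?` first.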
#nvlink_paths ⇒ Array<Path>
Get all paths with NVLink connectivity
# File 'lib/nvruby/collective/topology.rb', line 145

def nvlink_paths
  @paths.values.select { |p| p.interconnect_type == :nvlink }
end
#optimal_ring_order ⇒ Array<Integer>
Get a ring order based on topology. A greedy nearest-neighbor heuristic aims to reduce total latency by placing NVLink-connected GPUs adjacent.
# File 'lib/nvruby/collective/topology.rb', line 122

def optimal_ring_order
  return @device_ids.dup if @device_ids.size <= 2

  # Greedy nearest-neighbor heuristic
  remaining = @device_ids.dup
  order = [remaining.shift]

  until remaining.empty?
    current = order.last
    # Find GPU with best connection to current
    best_next = remaining.min_by do |gpu|
      path_obj = path(current, gpu)
      path_obj ? path_obj.performance_rank : Float::INFINITY
    end
    order << best_next
    remaining.delete(best_next)
  end

  order
end
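A worked example of the greedy walk may help. This sketch reimplements the same loop against made-up data. `FAST_PAIRS`, `demo_rank`, and `greedy_ring` are hypothetical names, and the rank values 1 and 3 are invented. It assumes a lower `performance_rank` means a better link, which is what the use of `min_by` implies.

```ruby
# Hypothetical topology: these GPU pairs share a fast link in both directions.
FAST_PAIRS = [[0, 2], [2, 1], [1, 3]].freeze

# Stand-in for Path#performance_rank: 1 for a fast pair, 3 otherwise.
def demo_rank(src, dst)
  fast = FAST_PAIRS.include?([src, dst]) || FAST_PAIRS.include?([dst, src])
  fast ? 1 : 3
end

# Same greedy nearest-neighbor loop as optimal_ring_order.
def greedy_ring(device_ids)
  return device_ids.dup if device_ids.size <= 2

  remaining = device_ids.dup
  order = [remaining.shift]

  until remaining.empty?
    current = order.last
    # Pick the unvisited GPU with the lowest rank relative to the current one.
    best_next = remaining.min_by { |gpu| demo_rank(current, gpu) }
    order << best_next
    remaining.delete(best_next)
  end

  order
end

ring = greedy_ring([0, 1, 2, 3])
puts ring.inspect # [0, 2, 1, 3]
```

The walk always starts from the first device and never scores the closing edge from the last GPU back to the first (here 3 to 0, rank 3). The result is a heuristic ordering and is not guaranteed to be optimal.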
#p2p_paths ⇒ Array<Path>
Get all paths with P2P support
# File 'lib/nvruby/collective/topology.rb', line 151

def p2p_paths
  @paths.values.select(&:p2p_supported)
end
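The two filters, nvlink_paths and p2p_paths, are independent: a path can support P2P without being NVLink. A small sketch with a hypothetical `DemoLink` struct shows this. The `src` and `dst` fields and the `:pcie` type are assumptions for illustration; only `:nvlink`, `interconnect_type`, and `p2p_supported` appear in the source above.

```ruby
# Hypothetical stand-in for Path with just enough fields for the two filters.
DemoLink = Struct.new(:src, :dst, :interconnect_type, :p2p_supported)

links = {
  [0, 1] => DemoLink.new(0, 1, :nvlink, true),
  [0, 2] => DemoLink.new(0, 2, :pcie, true),
  [1, 2] => DemoLink.new(1, 2, :pcie, false)
}

# Same selections as nvlink_paths and p2p_paths.
nvlink = links.values.select { |p| p.interconnect_type == :nvlink }
p2p    = links.values.select(&:p2p_supported)

puts nvlink.map { |p| [p.src, p.dst] }.inspect
puts p2p.map { |p| [p.src, p.dst] }.inspect
```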
#path(src, dst) ⇒ Path?
Get path between two GPUs
# File 'lib/nvruby/collective/topology.rb', line 113

def path(src, dst)
  return nil if src == dst

  @paths[[src, dst]]
end
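The lookup is keyed on the ordered pair `[src, dst]`, so each direction is a separate entry. The sketch below uses a plain hash with symbols in place of Path objects; `demo_paths` and `lookup` are illustrative names.

```ruby
# Symbols stand in for Path objects; keys are ordered [src, dst] pairs.
demo_paths = {
  [0, 1] => :path_0_to_1,
  [1, 0] => :path_1_to_0
}

# Same logic as #path: nil for a device paired with itself, else a hash lookup.
lookup = ->(src, dst) { src == dst ? nil : demo_paths[[src, dst]] }

puts lookup.call(0, 1).inspect
puts lookup.call(1, 0).inspect
puts lookup.call(0, 0).inspect
puts lookup.call(0, 2).inspect
```

Array keys are compared by content, so a fresh `[src, dst]` literal finds the stored entry. A nil result does not distinguish "same device" from "no such pair", so callers that need the difference have to compare `src` and `dst` themselves.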
#to_s ⇒ String
Returns a human-readable matrix representation.
# File 'lib/nvruby/collective/topology.rb', line 162

def to_s
  header = "Topology Matrix (#{@device_ids.size} GPUs)\n"
  rows = @device_ids.map do |src|
    cols = @device_ids.map do |dst|
      if src == dst
        " - "
      else
        path_obj = path(src, dst)
        type_abbr = path_obj.interconnect_type.to_s[0..3].upcase
        "#{type_abbr.ljust(5)}"
      end
    end
    "GPU#{src}: #{cols.join(' | ')}"
  end
  header + rows.join("\n")
end
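Each off-diagonal cell is the first four characters of the interconnect type, upcased and padded to five columns. The sketch below isolates that rule. `abbreviate` is an illustrative name, and `:pcie` and `:sys` are hypothetical type symbols; only `:nvlink` appears in the source above.

```ruby
# Same cell formatting as to_s: first four characters, upcased, padded to 5.
abbreviate = ->(type) { type.to_s[0..3].upcase.ljust(5) }

cells = [:nvlink, :pcie, :sys].map { |t| abbreviate.call(t) }

# Brackets make the trailing padding visible.
cells.each { |c| puts "[#{c}]" }
```

Note that to_s calls `interconnect_type` on the result of `path(src, dst)` without a nil check. It therefore appears to assume that every off-diagonal pair has a Path, and a missing entry would raise NoMethodError.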