Top Level Namespace
Defined Under Namespace
Modules: Clacky, Enumerable, RubyRich, YAMLCompat
Classes: BuiltinSkillsInstaller, ExternalSkillsImportRunner, ExternalSkillsImporter, FeishuSkillsInstaller, OpenClawImporter, RetryableError, ZipSkillInstaller
Constant Summary
- MIN_CONTENT_BYTES =
Minimum useful output (in bytes). Below this, a step is considered a miss and the next fallback is tried.
20
- SCRIPT_DIR =
Script directory — resolve sibling .py helpers relative to this file so it works both from the gem’s default_parsers/ dir and from the copied-to-user ~/.clacky/parsers/ dir.
File.dirname(File.expand_path(__FILE__))
- PRIMARY_HOST =
Primary CDN-accelerated endpoint. Fallback bypasses EdgeOne and is used when the primary times out or errors.
ENV.fetch("CLACKY_LICENSE_SERVER", "https://www.openclacky.com")
- FALLBACK_HOST =
"https://openclacky.up.railway.app"
- API_HOSTS =
When the env override is set we use only that host (dev/test mode).
ENV["CLACKY_LICENSE_SERVER"] ? [PRIMARY_HOST] : [PRIMARY_HOST, FALLBACK_HOST]
- HMAC_SECRET =
ENV.fetch("CARD_HMAC_SECRET", "openclacky-card-v1-default-secret-change-me")
- TOKEN_FILE =
File.expand_path("~/clacky_workspace/personal_website/token.json")
- OPEN_TIMEOUT =
8
- READ_TIMEOUT =
15
- ATTEMPTS_PER_HOST =
2
- INITIAL_BACKOFF =
0.5
- ILINK_BASE_URL =
"https://ilinkai.weixin.qq.com"
- BOT_TYPE =
"3"
- QR_POLL_TIMEOUT_S =
slightly above server’s 35s long-poll
37
- LOGIN_DEADLINE_S =
5 * 60
- CLACKY_SERVER_URL =
begin
url = "http://#{ENV.fetch("CLACKY_SERVER_HOST")}:#{ENV.fetch("CLACKY_SERVER_PORT")}"
uri = URI.parse(url)
raise "Invalid CLACKY_SERVER_URL: #{url}" unless uri.is_a?(URI::HTTP) && uri.host && uri.port
url
end
- FETCH_QR_MODE =
ARGV.include?("--fetch-qr")
- QRCODE_ID_IDX =
ARGV.index("--qrcode-id")
- GIVEN_QRCODE_ID =
QRCODE_ID_IDX ? ARGV[QRCODE_ID_IDX + 1] : nil
- WEIXIN_LOG_FILE =
Logging (suppress in --fetch-qr mode so stdout is clean JSON)
File.expand_path("~/.clacky/weixin_setup_debug.log")
- DISCORD_API_BASE =
"https://discord.com/api/v10"
- DISCORD_OAUTH_BASE =
"https://discord.com/oauth2/authorize"
- DISCORD_PORTAL_URL =
"https://discord.com/developers/applications"
- DEFAULT_BOT_PERMS =
"274877990912"
- DEFAULT_BOT_SCOPES =
"bot applications.commands"
- WATCH_GUILD_DEADLINE =
10 * 60
- WATCH_GUILD_INTERVAL =
3
- USER_AGENT =
"DiscordBot (https://github.com/clackyai/openclacky, 1.0)"
- DINGTALK_REG_BASE =
"https://oapi.dingtalk.com"
- DINGTALK_REG_SOURCE =
Registration source ID assigned by DingTalk (not a brand string — do not rebrand).
"DING_DWS_CLAW"
- POLL_INTERVAL =
3
- POLL_TIMEOUT =
300
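The host selection described for PRIMARY_HOST, FALLBACK_HOST and API_HOSTS can be sketched as a small function. This is only an illustration: `api_hosts` and its keyword arguments are hypothetical names, and the environment is passed in as a plain hash so both cases can be tried without changing the real ENV.

```ruby
# Hypothetical helper mirroring the API_HOSTS constant: when the
# CLACKY_LICENSE_SERVER override is present, only that host is used.
def api_hosts(env,
              default_primary: "https://www.openclacky.com",
              fallback: "https://openclacky.up.railway.app")
  primary = env.fetch("CLACKY_LICENSE_SERVER", default_primary)
  # Override set: dev/test mode, no failover host.
  env["CLACKY_LICENSE_SERVER"] ? [primary] : [primary, fallback]
end

p api_hosts({})
p api_hosts({ "CLACKY_LICENSE_SERVER" => "http://localhost:3000" })
```

With an empty environment the list holds both hosts; with the override it holds only the overriding URL.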
Instance Method Summary
- #build_markdown_table(rows) ⇒ Object
- #cmd_delete(slug: nil) ⇒ Object
- #cmd_publish(name:, html_file:) ⇒ Object
Commands.
- #device_fingerprint ⇒ Object
HMAC signing.
- #discord_get(path, bot_token:, timeout: 15) ⇒ Object
- #display_qr(qrcode_url) ⇒ Object
QR code display (non-fetch-qr mode only).
- #do_http_request(method, base, path, body:, extra_headers:) ⇒ Object
- #extract_runs(para_node) ⇒ Object
- #extract_text(shape_node) ⇒ Object
- #fail!(msg) ⇒ Object
- #hmac_headers ⇒ Object
- #http_request(method, path, body: nil, extra_headers: {}) ⇒ Object
Resilient HTTP request: retries on transient errors, then fails over to the fallback host before giving up.
- #ilink_get(path, extra_headers: {}, timeout: 15) ⇒ Object
- #load_token_data ⇒ Object
Token storage.
- #log(msg) ⇒ Object
In fetch-qr mode, write to stderr so stdout stays clean JSON.
- #mode_poll(device_code, expires_in: POLL_TIMEOUT, interval: POLL_INTERVAL) ⇒ Object
Mode: --poll <device_code>. Poll until SUCCESS or a terminal state.
- #mode_print_qr ⇒ Object
Mode: --print-qr. Call init + begin, print JSON with qr_url / device_code / expires_in, exit 0.
- #ok(msg) ⇒ Object
- #parse_paragraph(node, styles, numbering) ⇒ Object
- #parse_row(row_node, shared_strings) ⇒ Object
- #parse_slide(doc, slide_num) ⇒ Object
- #parse_table(tbl_node) ⇒ Object
- #poll_until_confirmed(qrcode) ⇒ Object
Long-poll loop (shared by all modes).
- #post_json(url, payload) ⇒ Object
- #random_wechat_uin ⇒ Object
iLink HTTP helpers.
- #read_document_xml(body) ⇒ Object
- #read_numbering(body) ⇒ Object
- #read_styles(body) ⇒ Object
- #read_zip_entry(body, name) ⇒ Object
- #safe_utf8(str) ⇒ Object
- #save_to_server(bot_token:) ⇒ Object
Clacky server: save credentials.
- #save_token_data(data) ⇒ Object
- #saved_bot_token ⇒ Object
- #server_get(path) ⇒ Object
- #server_post(path, body) ⇒ Object
- #step(msg) ⇒ Object
- #try_antiword(path) ⇒ Object
Use antiword to extract text from .doc files (Linux/WSL).
- #try_libreoffice(path, ext) ⇒ Object
Convert WPS formats to text using LibreOffice headless mode.
- #try_ocr(path) ⇒ Object
OCR fallback for scanned/image-only PDFs.
- #try_pdfplumber(path) ⇒ Object
- #try_pdftotext(path) ⇒ Object
- #try_textutil(path) ⇒ Object
Use macOS textutil to convert .doc → txt.
- #warn(msg) ⇒ Object
- #warn!(msg) ⇒ Object
- #wlog(msg) ⇒ Object
Instance Method Details
#build_markdown_table(rows) ⇒ Object
# File 'lib/clacky/default_parsers/xlsx_parser.rb', line 33
def build_markdown_table(rows)
col_count = rows.map(&:size).max
lines = []
rows.each_with_index do |row, i|
padded = row + [""] * [col_count - row.size, 0].max
lines << "| #{padded.join(" | ")} |"
lines << "|#{" --- |" * col_count}" if i == 0
end
lines.join("\n")
end
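As a quick illustration of the padding and separator behaviour, here is an indented copy of the method applied to a ragged set of rows. The sample data is made up for this example.

```ruby
# Indented copy of build_markdown_table from xlsx_parser.rb, for illustration.
def build_markdown_table(rows)
  col_count = rows.map(&:size).max
  lines = []
  rows.each_with_index do |row, i|
    # Pad short rows with empty cells so every line has col_count columns.
    padded = row + [""] * [col_count - row.size, 0].max
    lines << "| #{padded.join(" | ")} |"
    # Emit the header separator once, right after the first row.
    lines << "|#{" --- |" * col_count}" if i == 0
  end
  lines.join("\n")
end

puts build_markdown_table([["Name", "Qty"], ["apple", "3"], ["pear"]])
# | Name | Qty |
# | --- | --- |
# | apple | 3 |
# | pear |  |
```

The first row is always treated as the header, so a sheet without a header row still gets a separator after its first data row.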
#cmd_delete(slug: nil) ⇒ Object
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 180
def cmd_delete(slug: nil)
token_data = load_token_data
token = token_data["update_token"]
slug = slug || token_data["slug"]
unless token && slug
warn "❌ No published website found (#{TOKEN_FILE} missing or incomplete)."
warn " Nothing to delete."
exit 1
end
status, body = http_request("DELETE", "/api/v1/personal_websites/#{slug}",
extra_headers: { "X-Card-Token" => token })
if status == 200
File.delete(TOKEN_FILE) if File.exist?(TOKEN_FILE)
puts "✅ Personal website deleted: /~#{slug}"
else
warn "❌ Delete failed (#{status}): #{body["error"] || body.inspect}"
exit 1
end
end
#cmd_publish(name:, html_file:) ⇒ Object
Commands.
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 133
def cmd_publish(name:, html_file:)
unless File.exist?(html_file)
warn "❌ HTML file not found: #{html_file}"
exit 1
end
html_content = File.read(html_file, encoding: "utf-8")
if html_content.bytesize > 1_048_576
warn "❌ HTML file exceeds 1MB (#{html_content.bytesize / 1024}KB)"
exit 1
end
token_data = load_token_data
if token_data["slug"] && token_data["update_token"]
slug = token_data["slug"]
token = token_data["update_token"]
status, body = http_request("PATCH", "/api/v1/personal_websites/#{slug}",
body: { html_content: html_content },
extra_headers: { "X-Card-Token" => token })
if status == 200
puts "✅ Website updated: #{body["url"]}"
else
warn "❌ Update failed (#{status}): #{body["error"] || body.inspect}"
exit 1
end
else
status, body = http_request("POST", "/api/v1/personal_websites",
body: { name: name, html_content: html_content })
if status == 201
save_token_data("slug" => body["slug"], "update_token" => body["update_token"])
puts "✅ Website published: #{body["url"]}"
puts " Slug: #{body["slug"]}"
puts " Token saved to: #{TOKEN_FILE}"
else
warn "❌ Publish failed (#{status}): #{body["error"] || body.inspect}"
exit 1
end
end
end
#device_fingerprint ⇒ Object
HMAC signing.
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 43
def device_fingerprint
parts = []
parts << `hostname`.strip
hw = `system_profiler SPHardwareDataType 2>/dev/null | grep 'Hardware UUID'`.strip
parts << hw unless hw.empty?
parts << ENV["USER"].to_s
Digest::SHA256.hexdigest(parts.join("|"))[0, 16]
end
#discord_get(path, bot_token:, timeout: 15) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/discord_setup.rb', line 61
def discord_get(path, bot_token:, timeout: 15)
uri = URI("#{DISCORD_API_BASE}#{path}")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
http.read_timeout = timeout
http.open_timeout = 10
req = Net::HTTP::Get.new(uri.request_uri)
req["Authorization"] = "Bot #{bot_token}"
req["User-Agent"] = USER_AGENT
req["Accept"] = "application/json"
res = http.request(req)
body = res.body.to_s
parsed = (JSON.parse(body) rescue nil)
unless res.is_a?(Net::HTTPSuccess)
msg = parsed.is_a?(Hash) ? (parsed["message"] || body.slice(0, 200)) : body.slice(0, 200)
raise "Discord HTTP #{res.code} #{path}: #{msg}"
end
parsed
end
#display_qr(qrcode_url) ⇒ Object
QR code display (non-fetch-qr mode only)
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 112
def display_qr(qrcode_url)
displayed = false
if system("which qrencode > /dev/null 2>&1")
ascii = `qrencode -t ANSIUTF8 -o - #{Shellwords.shellescape(qrcode_url)} 2>/dev/null`
if $?.success? && !ascii.empty?
puts ascii
displayed = true
end
end
unless displayed
tmp_path = "/tmp/clacky-weixin-qr-#{Process.pid}.png"
if system("which qrencode > /dev/null 2>&1") &&
system("qrencode", "-o", tmp_path, qrcode_url, exception: false)
step("QR code saved to: #{tmp_path}")
system("open", tmp_path, exception: false) if RUBY_PLATFORM.include?("darwin")
displayed = true
end
end
unless displayed
$stderr.puts("[weixin-setup] Open this URL with WeChat to login:")
puts " #{qrcode_url}"
end
end
#do_http_request(method, base, path, body:, extra_headers:) ⇒ Object
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 92
def do_http_request(method, base, path, body:, extra_headers:)
uri = URI.parse("#{base}#{path}")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = uri.scheme == "https"
http.open_timeout = OPEN_TIMEOUT
http.read_timeout = READ_TIMEOUT
req_class = { "POST" => Net::HTTP::Post, "PATCH" => Net::HTTP::Patch,
"DELETE" => Net::HTTP::Delete }[method]
req = req_class.new(uri.path)
hmac_headers.each { |k, v| req[k] = v }
extra_headers.each { |k, v| req[k] = v }
req.body = body.to_json if body
response = http.request(req)
parsed = JSON.parse(response.body) rescue { "raw" => response.body }
[response.code.to_i, parsed]
rescue Net::OpenTimeout, Net::ReadTimeout,
Errno::ECONNREFUSED, Errno::EHOSTUNREACH, Errno::ENETUNREACH,
Errno::ECONNRESET, EOFError, OpenSSL::SSL::SSLError => e
raise RetryableError, e.message
end
#extract_runs(para_node) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 90
def extract_runs(para_node)
parts = []
REXML::XPath.each(para_node, "w:r") do |run|
rpr = REXML::XPath.first(run, "w:rPr")
bold = REXML::XPath.first(rpr, "w:b") if rpr
text = REXML::XPath.match(run, "w:t").map(&:text).compact.join
next if text.empty?
parts << (bold ? "**#{text}**" : text)
end
parts.join
end
#extract_text(shape_node) ⇒ Object
# File 'lib/clacky/default_parsers/pptx_parser.rb', line 25
def extract_text(shape_node)
paras = []
REXML::XPath.each(shape_node, ".//a:p") do |para|
text = REXML::XPath.match(para, ".//a:t").map(&:text).compact.join
paras << text unless text.strip.empty?
end
paras.join("\n")
end
#fail!(msg) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 68
def fail!(msg)
if FETCH_QR_MODE
$stdout.puts(JSON.generate({ error: msg }))
else
$stderr.puts("[weixin-setup] ❌ #{msg}")
end
exit 1
end
#hmac_headers ⇒ Object
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 52
def hmac_headers
ts = Time.now.to_i.to_s
fingerprint = device_fingerprint
payload = "openclacky:#{ts}:#{fingerprint}"
signature = OpenSSL::HMAC.hexdigest("SHA256", HMAC_SECRET, payload)
{
"X-Card-Timestamp" => ts,
"X-Card-Fingerprint" => fingerprint,
"X-Card-Signature" => signature,
"Content-Type" => "application/json"
}
end
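For orientation, a receiver could check these headers roughly as follows. This is a sketch under assumptions: the payload layout "openclacky:<timestamp>:<fingerprint>" comes from the listing above, while `valid_card_signature?`, the 300-second freshness window, and the byte-wise comparison are illustrative choices, not the server's documented behaviour.

```ruby
require "openssl"

# Hypothetical receiving-side check for the X-Card-* headers.
def valid_card_signature?(headers, secret, now: Time.now.to_i, max_skew: 300)
  ts          = headers["X-Card-Timestamp"].to_s
  fingerprint = headers["X-Card-Fingerprint"].to_s
  given       = headers["X-Card-Signature"].to_s
  # Reject missing or stale timestamps (the window size is an assumption).
  return false if ts.empty? || (now - ts.to_i).abs > max_skew

  expected = OpenSSL::HMAC.hexdigest("SHA256", secret, "openclacky:#{ts}:#{fingerprint}")
  return false unless expected.bytesize == given.bytesize

  # Compare every byte so timing does not reveal the first mismatch.
  expected.bytes.zip(given.bytes).reduce(0) { |acc, (a, b)| acc | (a ^ b) }.zero?
end

secret = "example-secret"
ts = Time.now.to_i.to_s
headers = {
  "X-Card-Timestamp"   => ts,
  "X-Card-Fingerprint" => "0123456789abcdef",
  "X-Card-Signature"   => OpenSSL::HMAC.hexdigest("SHA256", secret, "openclacky:#{ts}:0123456789abcdef")
}
p valid_card_signature?(headers, secret)
```

Because the fingerprint is part of the signed payload, changing either the timestamp or the fingerprint header invalidates the signature.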
#http_request(method, path, body: nil, extra_headers: {}) ⇒ Object
Resilient HTTP request: retries on transient errors, then fails over to the fallback host before giving up.
Returns [http_code_int, parsed_body_hash]. Calls exit(1) on network failure (all hosts/attempts exhausted).
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 72
def http_request(method, path, body: nil, extra_headers: {})
last_error = nil
API_HOSTS.each_with_index do |base, host_index|
ATTEMPTS_PER_HOST.times do |attempt|
begin
result = do_http_request(method, base, path, body: body, extra_headers: extra_headers)
return result
rescue RetryableError => e
last_error = e
backoff = INITIAL_BACKOFF * (2**attempt)
sleep(backoff)
end
end
end
warn "❌ Network error: #{last_error&.message || "unknown"}"
exit 1
end
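The retry schedule is easier to see with the sleep call injected. The sketch below follows the same loop shape (per-host attempts with exponential backoff, then the next host) but is not the code from publish.rb: `with_failover` and `sleeper:` are hypothetical names, a single error class stands in for RetryableError, and it raises instead of calling exit.

```ruby
# Sketch of retry-then-failover with an injectable sleeper.
def with_failover(hosts, attempts_per_host: 2, initial_backoff: 0.5, sleeper: ->(s) { sleep(s) })
  last_error = nil
  hosts.each do |host|
    attempts_per_host.times do |attempt|
      begin
        return yield(host)
      rescue Errno::ECONNREFUSED => e
        last_error = e
        # Exponential backoff that restarts for each host: 0.5s, 1.0s, ...
        sleeper.call(initial_backoff * (2**attempt))
      end
    end
  end
  raise last_error
end

delays = []
calls = []
result = with_failover(%w[primary fallback], sleeper: ->(s) { delays << s }) do |host|
  calls << host
  raise Errno::ECONNREFUSED if host == "primary"
  "ok from #{host}"
end
p result  # "ok from fallback"
p calls   # ["primary", "primary", "fallback"]
p delays  # [0.5, 1.0]
```

With the constants listed earlier (2 attempts per host, 0.5 s initial backoff, two hosts), the worst case sleeps 0.5 + 1.0 seconds per host, 3.0 seconds in total, on top of the connection timeouts.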
#ilink_get(path, extra_headers: {}, timeout: 15) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 86
def ilink_get(path, extra_headers: {}, timeout: 15)
uri = URI("#{ILINK_BASE_URL}/#{path}")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
http.read_timeout = timeout
http.open_timeout = 10
req = Net::HTTP::Get.new(uri.request_uri)
req["AuthorizationType"] = "ilink_bot_token"
req["X-WECHAT-UIN"] = random_wechat_uin
extra_headers.each { |k, v| req[k] = v }
res = http.request(req)
fail!("HTTP #{res.code} from #{path}: #{res.body.slice(0, 200)}") unless res.is_a?(Net::HTTPSuccess)
JSON.parse(res.body)
rescue Net::ReadTimeout, Net::OpenTimeout
nil
rescue => e
fail!("iLink request failed (#{path}): #{e.message}")
end
#load_token_data ⇒ Object
Token storage.
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 120
def load_token_data
return {} unless File.exist?(TOKEN_FILE)
JSON.parse(File.read(TOKEN_FILE)) rescue {}
end
#log(msg) ⇒ Object
In fetch-qr mode, write to stderr so stdout stays clean JSON.
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 63
def log(msg)
$stderr.puts("[weixin-setup] #{msg}")
wlog(msg)
end
#mode_poll(device_code, expires_in: POLL_TIMEOUT, interval: POLL_INTERVAL) ⇒ Object
Mode: --poll <device_code>. Poll until SUCCESS or a terminal state. Exits with:
0: SUCCESS, credentials saved and adapter started
2: WAITING, user hasn't scanned yet (Agent should ask user to scan and retry)
1: terminal failure (expired, fail, or server error)
# File 'lib/clacky/default_skills/channel-manager/dingtalk_setup.rb', line 96
def mode_poll(device_code, expires_in: POLL_TIMEOUT, interval: POLL_INTERVAL)
step "Phase 3 — Checking DingTalk authorization..."
client_id = nil
client_secret = nil
deadline = Time.now + expires_in
loop do
if Time.now > deadline
puts "[dingtalk-setup] WAITING_TIMEOUT"
exit 2
end
poll_data = post_json("#{DINGTALK_REG_BASE}/app/registration/poll",
{ device_code: device_code })
status = poll_data["status"].to_s.upcase
case status
when "WAITING"
puts "[dingtalk-setup] WAITING"
exit 2
when "SUCCESS"
client_id = poll_data["client_id"].to_s.strip
client_secret = poll_data["client_secret"].to_s.strip
fail! "Authorization succeeded but missing client credentials" if client_id.empty? || client_secret.empty?
ok "Authorization complete! client_id=#{client_id}"
break
when "EXPIRED"
fail! "Authorization QR code expired. Please re-run."
when "FAIL"
fail! "Authorization failed: #{poll_data["fail_reason"] || "unknown reason"}"
else
warn "Unknown status=#{status}, retrying..."
sleep interval
end
end
step "Phase 4 — Saving credentials to clacky server..."
begin
res = server_post("/api/channels/dingtalk",
{ client_id: client_id, client_secret: client_secret, enabled: true })
if res.code.to_i == 200
ok "Credentials saved, DingTalk Stream adapter starting..."
else
body = JSON.parse(res.body) rescue { "error" => res.body }
fail! "Server rejected credentials: #{body["error"] || res.body}"
end
rescue StandardError => e
fail! "Could not reach clacky server: #{e.message}"
end
step "Phase 5 — Waiting for DingTalk Stream connection..."
ws_ready = false
ws_deadline = Time.now + 30
loop do
break if Time.now > ws_deadline
begin
res = server_get("/api/channels")
channels = JSON.parse(res.body)["channels"] || []
dingtalk = channels.find { |c| c["platform"] == "dingtalk" }
if dingtalk&.fetch("running", false)
ws_ready = true
break
end
rescue StandardError => e
warn "Channel status check failed: #{e.message}"
end
sleep 2
end
if ws_ready
ok "DingTalk Stream WebSocket connected."
else
warn "Stream connection not confirmed within 30s — it may still be starting."
end
ok "🎉 DingTalk channel setup complete! Search for your robot in DingTalk to start chatting."
ok " client_id: #{client_id}"
end
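A caller that launches the script in --poll mode only needs to branch on the three documented exit codes. The mapping below is a minimal sketch; `interpret_poll_exit` and the returned symbols are hypothetical, and only the codes 0, 1 and 2 come from the documentation above.

```ruby
# Hypothetical caller-side mapping of the mode_poll exit codes.
def interpret_poll_exit(code)
  case code
  when 0 then :connected    # credentials saved and adapter started
  when 2 then :ask_to_scan  # still WAITING: prompt the user, then poll again
  else        :failed       # expired, failed, or server error
  end
end

p interpret_poll_exit(0)
p interpret_poll_exit(2)
p interpret_poll_exit(1)
```

Note that in the listing a WAITING response exits with 2 immediately rather than sleeping, so repeated polling is driven by the caller.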
#mode_print_qr ⇒ Object
Mode: --print-qr. Call init + begin, print JSON with qr_url / device_code / expires_in, exit 0.
# File 'lib/clacky/default_skills/channel-manager/dingtalk_setup.rb', line 71
def mode_print_qr
step "Phase 1 — Starting DingTalk Device Flow registration..."
init_data = post_json("#{DINGTALK_REG_BASE}/app/registration/init",
{ source: DINGTALK_REG_SOURCE })
nonce = init_data["nonce"].to_s.strip
fail! "Missing nonce in init response" if nonce.empty?
begin_data = post_json("#{DINGTALK_REG_BASE}/app/registration/begin", { nonce: nonce })
device_code = begin_data["device_code"].to_s.strip
qr_url = begin_data["verification_uri_complete"].to_s.strip
expires_in = (begin_data["expires_in"] || POLL_TIMEOUT).to_i
fail! "Missing device_code in begin response" if device_code.empty?
fail! "Missing verification_uri_complete" if qr_url.empty?
ok "Device Flow started. QR expires in #{expires_in}s."
puts JSON.generate({ qr_url: qr_url, device_code: device_code, expires_in: expires_in })
end
#ok(msg) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 60
def ok(msg); $stderr.puts("[weixin-setup] ✅ #{msg}") unless FETCH_QR_MODE; wlog("✅ #{msg}"); end
#parse_paragraph(node, styles, numbering) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 102
def parse_paragraph(node, styles, numbering)
ppr = REXML::XPath.first(node, "w:pPr")
style = REXML::XPath.first(ppr, "w:pStyle")&.attributes&.[]("w:val") if ppr
num_pr = REXML::XPath.first(ppr, "w:numPr") if ppr
text = extract_runs(node)
return nil if text.strip.empty?
if style && styles[style]
level = styles[style][:heading]
return "#{"#" * level} #{text}"
end
if num_pr
ilvl = REXML::XPath.first(num_pr, "w:ilvl")&.attributes&.[]("w:val").to_i
indent = " " * ilvl
return "#{indent}- #{text}"
end
text
end
#parse_row(row_node, shared_strings) ⇒ Object
# File 'lib/clacky/default_parsers/xlsx_parser.rb', line 25
def parse_row(row_node, shared_strings)
REXML::XPath.match(row_node, ".//c").map do |c|
v = REXML::XPath.first(c, "v")&.text
next "" unless v
c.attributes["t"] == "s" ? (shared_strings[v.to_i] || "") : v
end
end
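To make the shared-strings lookup concrete, here is an indented copy of the method run against a tiny hand-written row. The XML is a simplified, namespace-free stand-in for real worksheet markup and exists only for this example.

```ruby
require "rexml/document"

# Indented copy of parse_row from xlsx_parser.rb, for illustration.
def parse_row(row_node, shared_strings)
  REXML::XPath.match(row_node, ".//c").map do |c|
    v = REXML::XPath.first(c, "v")&.text
    next "" unless v
    # t="s" means the value is an index into the shared-strings table.
    c.attributes["t"] == "s" ? (shared_strings[v.to_i] || "") : v
  end
end

# Three cells: a shared-string reference, an inline value, and an empty cell.
xml = '<row><c t="s"><v>0</v></c><c><v>42</v></c><c/></row>'
row = REXML::Document.new(xml).root
p parse_row(row, ["Name"])  # ["Name", "42", ""]
```

Every cell comes back as a string, and an out-of-range shared-string index yields an empty string rather than an error.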
#parse_slide(doc, slide_num) ⇒ Object
# File 'lib/clacky/default_parsers/pptx_parser.rb', line 54
def parse_slide(doc, slide_num)
lines = []
title_text = nil
REXML::XPath.each(doc, "//p:sp") do |sp|
ph = REXML::XPath.first(sp, ".//p:ph")
next unless ph
ph_type = ph.attributes["type"]
if ph_type == "title" || ph_type == "ctrTitle"
title_text = extract_text(sp).strip
break
end
end
lines << "## Slide #{slide_num}#{title_text && !title_text.empty? ? ": #{title_text}" : ""}"
REXML::XPath.each(doc, "//p:sp") do |sp|
ph = REXML::XPath.first(sp, ".//p:ph")
if ph
ph_type = ph.attributes["type"]
next if %w[title ctrTitle sldNum dt ftr].include?(ph_type)
end
text = extract_text(sp).strip
next if text.empty?
next if text == title_text
text.each_line do |line|
lines << "- #{line.rstrip}" unless line.strip.empty?
end
end
REXML::XPath.each(doc, "//a:tbl") do |tbl|
lines << parse_table(tbl)
end
lines.join("\n")
end
#parse_table(tbl_node) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 124
def parse_table(tbl_node)
rows = []
REXML::XPath.each(tbl_node, "w:tr") do |tr|
cells = REXML::XPath.match(tr, "w:tc").map do |tc|
REXML::XPath.match(tc, ".//w:t").map(&:text).compact.join(" ").strip
end
rows << cells
end
return "" if rows.empty?
col_count = rows.map(&:size).max
lines = []
rows.each_with_index do |row, i|
padded = row + [""] * [col_count - row.size, 0].max
lines << "| #{padded.join(" | ")} |"
lines << "|#{" --- |" * col_count}" if i == 0
end
lines.join("\n")
end
#poll_until_confirmed(qrcode) ⇒ Object
Long-poll loop (shared by all modes)
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 173
def poll_until_confirmed(qrcode)
deadline = Time.now + LOGIN_DEADLINE_S
scanned_once = false
started_at = Time.now
loop do
fail!("Login timed out. Please run setup again.") if Time.now > deadline
resp = ilink_get(
"ilink/bot/get_qrcode_status?qrcode=#{CGI.escape(qrcode)}",
extra_headers: { "iLink-App-ClientVersion" => "1" },
timeout: QR_POLL_TIMEOUT_S
)
if resp.nil?
wlog("poll: timeout/nil, retrying...")
next
end
wlog("poll response: #{resp.to_json}")
case resp["status"]
when "wait"
when "scaned"
unless scanned_once
$stderr.puts("[weixin-setup] WeChat scanned! Please confirm in the app...")
scanned_once = true
end
when "confirmed"
elapsed = Time.now - started_at
token = resp["bot_token"].to_s.strip
base_url = resp["baseurl"].to_s.strip
base_url = ILINK_BASE_URL if base_url.empty?
fail!("Login confirmed but no token received") if token.empty?
if elapsed < 3 && !scanned_once
wlog("confirmed too fast (#{elapsed.round(1)}s), treating as stale session")
fail!("[stale-session] QR session confirmed immediately — account already logged in. Run --fetch-qr to get a fresh QR code.")
end
wlog("confirmed after #{elapsed.round(1)}s")
return { token: token, base_url: base_url }
when "expired"
fail!("QR code expired. Please run setup again.")
else
$stderr.puts("[weixin-setup] Unknown status: #{resp["status"]}, continuing...")
end
end
end
#post_json(url, payload) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/dingtalk_setup.rb', line 39
def post_json(url, payload)
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = uri.scheme == "https"
req = Net::HTTP::Post.new(uri.path, "Content-Type" => "application/json")
req.body = JSON.generate(payload)
resp = http.request(req)
data = JSON.parse(resp.body)
fail! "API error (#{resp.code}): #{data["errmsg"] || resp.body}" if data["errcode"] && data["errcode"] != 0
data
rescue JSON::ParserError => e
fail! "JSON parse error from #{url}: #{e.message}"
end
#random_wechat_uin ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 81
def random_wechat_uin
uint32 = SecureRandom.random_bytes(4).unpack1("N")
Base64.strict_encode64(uint32.to_s)
end
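The header value produced here is the Base64 encoding of the number's decimal string, not of its four raw bytes. The short example below, an indented copy of the method plus a decode step added for illustration, makes that visible.

```ruby
require "securerandom"
require "base64"

# Indented copy of random_wechat_uin from weixin_setup.rb, for illustration.
def random_wechat_uin
  # 4 random bytes read as a big-endian unsigned 32-bit integer ("N").
  uint32 = SecureRandom.random_bytes(4).unpack1("N")
  # The decimal string of the number is encoded, not the raw bytes.
  Base64.strict_encode64(uint32.to_s)
end

uin = random_wechat_uin
puts uin                          # different on every call
puts Base64.strict_decode64(uin)  # the same number as decimal digits
```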
#read_document_xml(body) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 47
def read_document_xml(body)
xml = read_zip_entry(body, "word/document.xml")
raise "Could not extract content — possibly encrypted or invalid format" unless xml
xml
end
#read_numbering(body) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 53
def read_numbering(body)
result = {}
xml = read_zip_entry(body, "word/numbering.xml")
return result unless xml
doc = REXML::Document.new(xml)
REXML::XPath.each(doc, "//w:abstractNum") do |an|
id = an.attributes["w:abstractNumId"]
levels = {}
REXML::XPath.each(an, "w:lvl") do |lvl|
ilvl = lvl.attributes["w:ilvl"].to_i
fmt = REXML::XPath.first(lvl, "w:numFmt")&.attributes&.[]("w:val")
levels[ilvl] = { fmt: fmt || "bullet" }
end
result[id] = levels
end
result
rescue
{}
end
#read_styles(body) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 73
def read_styles(body)
result = {}
xml = read_zip_entry(body, "word/styles.xml")
return result unless xml
doc = REXML::Document.new(xml)
REXML::XPath.each(doc, "//w:style") do |s|
sid = s.attributes["w:styleId"]
name = REXML::XPath.first(s, "w:name")&.attributes&.[]("w:val").to_s
if name =~ /^heading (\d)/i
result[sid] = { heading: $1.to_i }
end
end
result
rescue
{}
end
#read_zip_entry(body, name) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 38
def read_zip_entry(body, name)
xml = nil
Zip::File.open_buffer(StringIO.new(body)) do |zip|
entry = zip.find_entry(name)
xml = safe_utf8(entry.get_input_stream.read) if entry
end
xml
end
#safe_utf8(str) ⇒ Object
# File 'lib/clacky/default_parsers/docx_parser.rb', line 30
def safe_utf8(str)
utf8 = str.dup.force_encoding("UTF-8")
return utf8 if utf8.valid_encoding?
str.encode("UTF-8", "binary", invalid: :replace, undef: :replace, replace: "")
end
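The two branches behave quite differently, which a small example shows. The method body is an indented copy of the listing above, and the sample strings are made up.

```ruby
# Indented copy of safe_utf8 from docx_parser.rb, for illustration.
def safe_utf8(str)
  utf8 = str.dup.force_encoding("UTF-8")
  return utf8 if utf8.valid_encoding?
  # Treat the input as raw bytes and drop whatever cannot map to UTF-8.
  str.encode("UTF-8", "binary", invalid: :replace, undef: :replace, replace: "")
end

valid   = "caf\u00E9".b  # well-formed UTF-8 bytes, tagged as binary
invalid = "caf\xE9".b    # a lone 0xE9 byte is not valid UTF-8
p safe_utf8(valid)    # "café"
p safe_utf8(invalid)  # "caf"
```

When the bytes are already valid UTF-8 they are only relabelled. In the fallback path the input is converted from binary, so every non-ASCII byte in the string is dropped, not just the malformed ones.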
#save_to_server(bot_token:) ⇒ Object
Clacky server: save credentials
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 146
def save_to_server(token:, base_url:)
uri = URI("#{CLACKY_SERVER_URL}/api/channels/weixin")
body = JSON.generate({ token: token, base_url: base_url })
http = Net::HTTP.new(uri.host, uri.port)
http.read_timeout = 15
http.open_timeout = 5
req = Net::HTTP::Post.new(uri.path, "Content-Type" => "application/json")
req.body = body
res = http.request(req)
data = JSON.parse(res.body) rescue {}
unless res.is_a?(Net::HTTPSuccess) && data["ok"]
fail!("Failed to save Weixin config: #{data["error"] || res.body.slice(0, 200)}")
end
ok("Credentials saved via clacky server")
rescue => e
fail!("Could not reach clacky server: #{e.message}")
end
#save_token_data(data) ⇒ Object
# File 'lib/clacky/default_skills/personal-website/publish.rb', line 125
def save_token_data(data)
FileUtils.mkdir_p(File.dirname(TOKEN_FILE))
File.write(TOKEN_FILE, JSON.pretty_generate(data))
File.chmod(0600, TOKEN_FILE)
end
#saved_bot_token ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/discord_setup.rb', line 85
def saved_bot_token
yml_path = File.expand_path("~/.clacky/channels.yml")
return nil unless File.exist?(yml_path)
data = YAML.safe_load_file(yml_path, permitted_classes: [Symbol], aliases: true) rescue nil
data&.dig("channels", "discord", "bot_token") || data&.dig(:channels, :discord, :bot_token)
end
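The double `dig` exists because channels.yml may carry either string keys or symbol keys. The sketch below isolates that lookup on in-memory YAML; `bot_token_from` is a hypothetical helper and the two sample documents are invented for the example.

```ruby
require "yaml"

# Hypothetical helper isolating the key lookup used by saved_bot_token.
def bot_token_from(yaml_text)
  data = YAML.safe_load(yaml_text, permitted_classes: [Symbol], aliases: true) rescue nil
  # Try string keys first, then symbol keys.
  data&.dig("channels", "discord", "bot_token") || data&.dig(:channels, :discord, :bot_token)
end

string_keys = "channels:\n  discord:\n    bot_token: abc123\n"
symbol_keys = ":channels:\n  :discord:\n    :bot_token: abc123\n"
p bot_token_from(string_keys)  # "abc123"
p bot_token_from(symbol_keys)  # "abc123"
```

Symbol keys only load because Symbol is listed in permitted_classes; without it, safe loading raises and the rescue turns the result into nil.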
#server_get(path) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/dingtalk_setup.rb', line 62
def server_get(path)
uri = URI(CLACKY_SERVER_URL)
Net::HTTP.start(uri.host, uri.port, open_timeout: 3, read_timeout: 10) do |h|
h.request(Net::HTTP::Get.new(path))
end
end
#server_post(path, body) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/dingtalk_setup.rb', line 53
def server_post(path, body)
uri = URI(CLACKY_SERVER_URL)
Net::HTTP.start(uri.host, uri.port, open_timeout: 3, read_timeout: 10) do |h|
req = Net::HTTP::Post.new(path, "Content-Type" => "application/json")
req.body = JSON.generate(body)
h.request(req)
end
end
#step(msg) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 59
def step(msg); $stderr.puts("[weixin-setup] #{msg}") unless FETCH_QR_MODE; wlog(msg); end
#try_antiword(path) ⇒ Object
Use antiword to extract text from .doc files (Linux/WSL)
# File 'lib/clacky/default_parsers/doc_parser.rb', line 36
def try_antiword(path)
stdout, _stderr, status = Open3.capture3("antiword", path)
return nil unless status.success?
text = stdout.strip
return nil if text.bytesize < MIN_CONTENT_BYTES
text
rescue Errno::ENOENT
nil
end
#try_libreoffice(path, ext) ⇒ Object
Convert WPS formats to text using LibreOffice headless mode. .et (spreadsheet) → csv for structured output; .wps/.dps → txt.
# File 'lib/clacky/default_parsers/wps_parser.rb', line 30
def try_libreoffice(path, ext)
Dir.mktmpdir("clacky-wps") do |dir|
output_ext = ext == ".et" ? "csv" : "txt"
_stdout, _stderr, status = Open3.capture3(
"libreoffice", "--headless", "--convert-to", output_ext,
"--outdir", dir, path
)
return nil unless status.success?
output_file = Dir.glob(File.join(dir, "*.#{output_ext}")).first
return nil unless output_file && File.exist?(output_file)
text = File.read(output_file).strip
return nil if text.bytesize < MIN_CONTENT_BYTES
text
end
rescue Errno::ENOENT
nil
end
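The try_* helpers share one contract: return the extracted text, or nil when the tool is missing, fails, or produces fewer than MIN_CONTENT_BYTES bytes. How the parsers chain them is not shown on this page, so the following is only a sketch of that idea; `first_useful` and the lambdas are hypothetical stand-ins for calls such as try_pdftotext(path).

```ruby
# Sketch of a fallback chain: the first step with enough output wins.
def first_useful(steps, min_bytes: 20)
  steps.each do |name, step|
    text = step.call
    # nil or too-short output counts as a miss; move on to the next step.
    return [name, text] if text && text.bytesize >= min_bytes
  end
  nil
end

steps = {
  missing_tool: -> { nil },        # e.g. binary not installed
  too_short:    -> { "page 1" },   # below the 20-byte threshold
  good:         -> { "Quarterly report, section 1: overview" }
}
p first_useful(steps)  # [:good, "Quarterly report, section 1: overview"]
```

Ordering the steps from cheapest to most expensive keeps slow fallbacks such as OCR for the cases where the faster extractors produce nothing useful.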
#try_ocr(path) ⇒ Object
OCR fallback for scanned/image-only PDFs. See pdf_parser_ocr.py for the actual extraction logic.
Installation hints (also printed on final failure):
macOS: brew install tesseract tesseract-lang poppler
pip3 install pytesseract pdf2image
Linux: apt install tesseract-ocr tesseract-ocr-chi-sim poppler-utils
pip3 install pytesseract pdf2image
# File 'lib/clacky/default_parsers/pdf_parser.rb', line 73
def try_ocr(path)
_stdout, _stderr, status = Open3.capture3("tesseract", "--version")
return nil unless status.success?
script = File.join(SCRIPT_DIR, "pdf_parser_ocr.py")
return nil unless File.exist?(script)
stdout, stderr, status = Open3.capture3("python3", script, path)
unless status.success?
warn stderr.strip unless stderr.strip.empty?
return nil
end
text = stdout.strip
return nil if text.bytesize < MIN_CONTENT_BYTES
text
rescue Errno::ENOENT
nil
end
#try_pdfplumber(path) ⇒ Object
# File 'lib/clacky/default_parsers/pdf_parser.rb', line 52
def try_pdfplumber(path)
script = File.join(SCRIPT_DIR, "pdf_parser_plumber.py")
return nil unless File.exist?(script)
stdout, _stderr, status = Open3.capture3("python3", script, path)
return nil unless status.success?
text = stdout.strip
return nil if text.bytesize < MIN_CONTENT_BYTES
text
rescue Errno::ENOENT
nil
end
#try_pdftotext(path) ⇒ Object
# File 'lib/clacky/default_parsers/pdf_parser.rb', line 42
def try_pdftotext(path)
stdout, _stderr, status = Open3.capture3("pdftotext", "-layout", "-enc", "UTF-8", path, "-")
return nil unless status.success?
text = stdout.strip
return nil if text.bytesize < MIN_CONTENT_BYTES
text
rescue Errno::ENOENT
nil
end
#try_textutil(path) ⇒ Object
Use macOS textutil to convert .doc → txt
# File 'lib/clacky/default_parsers/doc_parser.rb', line 25
def try_textutil(path)
stdout, _stderr, status = Open3.capture3("textutil", "-convert", "txt", "-stdout", path)
return nil unless status.success?
text = stdout.strip
return nil if text.bytesize < MIN_CONTENT_BYTES
text
rescue Errno::ENOENT
nil
end
#warn(msg) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/dingtalk_setup.rb', line 33
def warn(msg); puts("[dingtalk-setup] ⚠️ #{msg}"); end
#warn!(msg) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/discord_setup.rb', line 50
def warn!(msg); $stderr.puts("[discord-setup] #{msg}"); end
#wlog(msg) ⇒ Object
# File 'lib/clacky/default_skills/channel-manager/weixin_setup.rb', line 53
def wlog(msg)
File.open(WEIXIN_LOG_FILE, "a") { |f| f.puts("[#{Time.now.strftime("%H:%M:%S")}] #{msg}") }
rescue StandardError
end