Claude hacked my ("rotary") phone
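Original source: hackernews · Summary and analysis by Genesis Park
Summary
Claude Code reverse-engineered the hardware protocol of a Viking VoIP phone installed in a British telephone booth and connected it to a voice agent. The author worked on a Mac, where the manufacturer's Windows-only software could not be used, so the AI connected directly to the hardware and iterated autonomously until the problem was solved. The same task had failed with ChatGPT a year earlier, which illustrates how far AI capabilities have advanced in twelve months.
Full text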
Claude Code hacked my (rotary) phone

A rather technical report from Claude itself on how it reverse-engineered the hardware protocol of a Viking VoIP phone

As part of my work on putting a voice agent in a British red telephone booth, I needed to reprogram a Viking K-1900D-IP VoIP phone to call the agent. The manufacturer only provided Windows software, and I only had a Mac. So I decided to bypass any middleware software and program the hardware directly.

Interestingly, our team had already tried to crack this phone in 2025. Back then they used plain ChatGPT and failed. This year I closed the loop by connecting Claude Code directly to the phone and letting it iterate autonomously, and it succeeded. The whole process took me about a day. This illustrates the leap in AI capabilities over the last year.

The interaction itself was also quite interesting. My role in the project was more that of an assistant to the agent. For example, at one point it asked me to pick up the phone and count how many beeps I could hear.

The article below is the report of the steps that Claude took (co-created with Claude!).

Setup

I connected the phone to the mini router via a PoE injector. I connected my laptop to the same router via WiFi.

I asked Claude to program the phone with the SIP trunk I needed. At first, it spent quite some time trying to find documentation online. Then it gave up and suggested connecting directly and figuring out the protocol.

Finding the phone

Claude found the phone on the local network and scanned for open ports:

    $ nmap -Pn -p- 192.168.8.235
    PORT      STATE SERVICE
    10001/tcp open  scp-config
    107/tcp   open  rtelnet

Port 10001 turned out to be a Lantronix XPort serial-to-TCP tunnel, a red herring. The actual protocol lives on port 107.

The text protocol

Claude then sent the test sequence VIKING and received ER[VIKING]. This hinted at an XX[value] response format, but everything else was unknown.
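The XX[value] reply shape hinted at by that first probe can be captured in a small parser. This is a minimal sketch, assuming every reply is a two-letter code followed by a bracketed value, which is all the single ER[VIKING] probe suggests at this point. The names `Reply` and `parse_reply` are hypothetical and not part of any Viking tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed reply shape: two uppercase letters, then a bracketed value.
_REPLY = re.compile(r"^([A-Z]{2})\[(.*)\]$")


class Reply(NamedTuple):
    code: str       # two-letter code echoed by the phone, e.g. "ER"
    value: str      # text between the brackets, possibly empty
    is_error: bool  # True when the phone answered with ER[...]


def parse_reply(raw: str) -> Optional[Reply]:
    """Parse one reply line; return None if it does not fit the assumed shape."""
    match = _REPLY.match(raw.strip())
    if match is None:
        return None
    code, value = match.groups()
    return Reply(code, value, code == "ER")


print(parse_reply("ER[VIKING]\r\n"))
```

Returning None for anything that does not fit keeps unexpected replies visible instead of forcing them into the assumed format.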
What followed was dozens of probes, most of them failing.

Tried random two-letter codes. Some worked:

- NA: no response
- WR: WR[0]
- SC: SC[NOT_CONNECTED]
- UP: UP[]

That confirmed the read syntax: just send two letters.

Tried write formats:

- MN[test]: error
- MN test: error
- SET MN test: error
- MN=test: MN[test] (this one worked)

Gradually the protocol revealed itself:

| Operation | Syntax | Response |
|---|---|---|
| Read | XX\r\n | XX[value] |
| Write | XX=value\r\n | XX[value] |
| Error | anything invalid | ER[input] |
| Quit | Q\r\n | GOODBYE |

Claude hypothesised that all commands are two-letter sequences, then suggested brute-forcing every two-letter sequence to see which ones work. It wrote a script to try all 676 combinations:

    # Try every two-letter command and print the ones the phone does not reject.
    for first in {A..Z}; do
      for second in {A..Z}; do
        cmd="${first}${second}"
        # Pass the command as an argument so it is never parsed as a format string.
        result=$(printf '%s\r\n' "$cmd" | nc -w 1 192.168.8.235 107)
        [[ ! "$result" =~ "ER[" ]] && echo "$cmd -> $result"
      done
    done

80+ valid registers came back:

- SIP settings (all empty, unconfigured): UC (username), UD (domain), UP (password), UR (registrar), UU (auth ID), UX (outbound proxy)
- Network: WA = 192.168.8.235 (IP), WM = 18e80f513f66 (MAC)
- Device: MN = VIKING_MK64_Vik02 (device name), MB = R8.44.2236 (firmware build), MC = SGTL5000 0xA011 (audio codec)

Plus some others: audio volume, speed dial slots, NTP server, baud rate, and various control commands.

Now the entire protocol was mapped. This seemed like a victory, but it's never that easy :)

The wall

We set all the SIP credentials, rebooted, and:

    UC --> UC[]
    UD --> UD[]
    UP --> UP[]
    SR --> SR[NOT_REGISTERED]

Everything gone! The text protocol writes to RAM, but the SIP stack reads from flash. They're separate stores.

What followed was a long, frustrating search for a save command. WR=1, WF=1, CE, CU: some return OK, but none make SIP settings survive a reboot. Claude tried everything, but couldn't find a way to persist the settings to flash.
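The test behind this wall is simple: write a register, reboot, read it back, and see what survived. Below is a minimal sketch of the comparison step only, assuming replies have the XX[value] shape described in the text. `register_value` and `lost_after_reboot` are hypothetical helper names, and the sample values are illustrative (the written values are borrowed from the credentials used later in the report), not a live capture.

```python
import re
from typing import Dict, List

# Assumed reply shape: two uppercase letters, then a bracketed value.
_REPLY = re.compile(r"^([A-Z]{2})\[(.*)\]$")


def register_value(reply: str) -> str:
    """Extract the value from a reply such as 'UC[vikingphone]'."""
    match = _REPLY.match(reply.strip())
    if match is None:
        raise ValueError(f"unexpected reply: {reply!r}")
    return match.group(2)


def lost_after_reboot(written: Dict[str, str], readback: Dict[str, str]) -> List[str]:
    """Return the registers whose post-reboot value differs from what was written."""
    return sorted(
        code
        for code, value in written.items()
        if register_value(readback[code]) != value
    )


# Values written before the reboot (illustrative).
written = {"UC": "vikingphone", "UP": "s3cret"}
# Replies read after the reboot, in the form the report shows.
readback = {"UC": "UC[]", "UP": "UP[]"}

print(lost_after_reboot(written, readback))
```

Running the same comparison after each candidate save command would show whether that command changed which registers survive.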
Claude realised that the phone's Windows software likely uses some other mechanism to save changes. It then suggested running the software in a VM and intercepting the traffic to learn how it works.

The MITM

A Windows VM in UTM couldn't reach the phone directly over WiFi, due to driver incompatibility. So Claude set up a man-in-the-middle chain. UTM's bridged networking didn't work over WiFi, so we ended up with NAT networking, a port forward in PowerShell, and a Python TCP proxy on the Mac.

Once connected, the Viking software read the phone's state using the text commands we already knew. Then I entered SIP credentials and clicked Apply. The proxy captured something new: TS A binary commands, one command per byte, writing directly to flash memory:

    TS A a7 6d 65 68 9a 6c 6a 3c 73 26 20
    TS A a7 6d 65 68 9a 6c 6a 3d 69 2f 20
    TS A a7 6d 65 68 9a 6c 6a 3e 70 27 20

Format: TS A <7-byte prefix> <address> <data> <checksum> 0x20

The 7-byte prefix is a device-specific identifier, constant per phone. The first byte encodes the operation: 0xa7 for write, 0xa6 for read, 0xa5 for erase. The data bytes are plaintext ASCII: no encryption, just a trivial checksum for integrity.

After all writes, the save sequence:

    CE -> GB=40 -> ME=1 -> MR=1

The save commands were the same ones we'd found during the brute force and dismissed as useless. They don't work with text protocol writes alone, but after binary flash writes this exact sequence commits everything to persistent storage and reboots.

Cracking the checksum

Claude analysed the captured writes for the string "sip":

    addr: 0x3c  data: 0x73  checksum: 0x26  # 's' = 0x73
    addr: 0x3d  data: 0x69  checksum: 0x2f  # 'i' = 0x69
    addr: 0x3e  data: 0x70  checksum: 0x27  # 'p' = 0x70

And cracked the checksum:

    (0x195 - address - data_byte) & 0xFF

Verify: (0x195 - 0x3c - 0x73) & 0xFF = 0x26. It matches! The checksum is a single subtraction.
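The single-subtraction rule can be checked mechanically against the three captured frames. The sketch below uses hypothetical helper names (`checksum`, `fits`, `constants_that_fit`, `build_frame`) and treats the constant as a parameter rather than hard-coding it, because evaluating the stated formula on the frames as printed gives (0x195 - 0x3c - 0x73) & 0xFF = 0xe6, not 0x26. All three printed frames are instead consistent with a constant of 0xd5, since address + data + checksum comes to 0xd5 in each. The report alone does not settle whether the constant or the frames contain the transcription slip, so confirm any constant against a capture from the actual phone before writing to flash.

```python
from typing import List, Tuple

# (address, data byte, checksum) triples from the three captured frames.
CAPTURED: List[Tuple[int, int, int]] = [
    (0x3C, 0x73, 0x26),  # 's'
    (0x3D, 0x69, 0x2F),  # 'i'
    (0x3E, 0x70, 0x27),  # 'p'
]


def checksum(addr: int, data: int, constant: int) -> int:
    """Single-subtraction checksum: (constant - addr - data) mod 256."""
    return (constant - addr - data) & 0xFF


def fits(constant: int) -> bool:
    """True if `constant` reproduces every captured checksum."""
    return all(checksum(a, d, constant) == c for a, d, c in CAPTURED)


def constants_that_fit() -> List[int]:
    """Search one-byte constants; only the low 8 bits matter after the mask."""
    return [k for k in range(256) if fits(k)]


def build_frame(prefix_hex: str, addr: int, data: int, constant: int) -> str:
    """Format one write frame in the captured layout, ending in the 0x20 byte."""
    chk = checksum(addr, data, constant)
    return f"TS A {prefix_hex} {addr:02x} {data:02x} {chk:02x} 20"


print([hex(k) for k in constants_that_fit()])
print(build_frame("a7 6d 65 68 9a 6c 6a", 0x3C, 0x73, constants_that_fit()[0]))
```

Deriving the constant from captured traffic, instead of trusting a written-down value, means a typo in the formula is caught before any byte reaches the phone.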
We then dumped all 256 bytes of the primary flash region and mapped it:

| Address | Contents |
|---|---|
| 0x0e | Security code (default 845464, which spells "VIKING" on a phone keypad) |
| 0x3c | SIP server / registrar |
| 0x7c | Auto-dial phone number 1 |
| 0x90-0xcb | Phone numbers 2-4 |
| 0xcf | Firmware ID (read-only) |

The two layers

The full picture of what we'd been fighting:

- Layer 1: Text protocol (RAM, volatile). Two-letter commands, read/write to RAM. The SIP stack doesn't read from here.
- Layer 2: Binary flash protocol (persistent). The TS command reads and writes directly to flash memory, one byte at a time, with a device-specific auth prefix and a trivial checksum.

For a factory-fresh phone, you must use the binary protocol to write the SIP server to flash. This activates the SIP stack. After that, text protocol writes with the save sequence are sufficient for subsequent changes.

Victory

    # Assumes send(line) delivers one command to the phone and prefix holds
    # the 7-byte prefix captured for this device, as space-separated hex.

    # Write SIP server to flash (binary protocol)
    server = "vikingphone.sip.us1.twilio.com"
    for i, byte in enumerate(server.encode()):
        addr = 0x3c + i
        checksum = (0x195 - addr - byte) & 0xFF
        send(f"TS A {prefix} {addr:02x} {byte:02x} {checksum:02x} 20")

    # Set SIP credentials (text protocol)
    send("UC=vikingphone")
    send("UP=s3cret")
    send("UU=vikingphone")
    send("UR=vikingphone.sip.us1.twilio.com")

    # Commit and reboot
    for cmd in ("CE", "GB=40", "ME=1", "MR=1"):
        send(cmd)

After reboot:

    SR --> SR[REGISTERED]

The phone registers with Twilio. A TwiML bin routes outbound calls to an ElevenAgent. Pick up the handset and an AI answers.

Parting Thoughts

As software feels "solved" by AI today, it's interesting to see how far it could diffuse into the physical world. This project made me feel like the answer is: surprisingly far.

I also built it in the middle of Mythos cyber hell. Even though this was a baby version of "hacking", it was still amusing (and slightly alarming) how much I could achieve without any sort of expertise in low-level / networks / cyber.
Finally, I open-sourced a skill file describing the protocol in detail, so that the next person doesn't need to burn through tokens to reverse-engineer it again. A skill is a product.
This analysis was written by the Genesis Park editorial team with the help of AI. The original article is available via the source link.