Security Research
Independent vulnerability research and responsible disclosure from the HoneyLens project. Our approach combines manual code review, LLM-assisted analysis, and automated fuzzing to find vulnerabilities in open-source software. All findings go through coordinated disclosure via national CERT teams before publication.
Apache Tomcat WebDAV — CVE-2026-41284
Coordinated disclosure of an unbounded-resource issue in Apache Tomcat’s WebdavServlet. Reported through Apache Security on 2026-04-11; rated Low and published 2026-05-12 in the May release advisory.
TP-Link Archer A6 v4.0 — LLM-Assisted Router Penetration Testing
48-hour LLM-assisted pentest of a consumer WiFi router. Reversed the encrypted API protocol, decompressed the TPOS firmware (4.3 MB ARM binary), and discovered 8 vulnerabilities, 2 of them rated Critical. Firmware update 1.14.30 fixed 1 of the 8. 120+ test vectors across 6 rounds. Report submitted to TP-Link PSIRT.
wolfSSL 5.9.0 — LLM-Augmented Fuzzing Campaign
1.22 billion AFL++ executions, 3 LLMs scanning 202K lines of C, zero exploitable vulnerabilities. wolfSSL survived everything we threw at it. The value was in building and validating the methodology.
BearSSL — Applying the wolfSSL Methodology to an Unfuzzed Target
Same pipeline, different target. BearSSL has never been through OSS-Fuzz. 419M AFL++ executions, 3 LLMs, cross-model confirmation of findings. Findings are under manual verification and exploit development.
Vision Model Comparison — Hardware Pentest Tool Identification
Can LLM vision models identify hardware security tools from a photograph? Three rounds of testing across 6 models revealed that local vision models are not yet viable, and that controlled testing methodology matters more than you’d think.
LLM Protocol Recognition — Can AI Identify Protocols Better Than Shodan?
12 test cases covering DNS tunneling, C2 beaconing, ICS/SCADA, and protocol evasion. Sonnet scored 100%, qwen2.5:14b 75%. Even a local GPU model would have avoided the Shodan misclassification that killed our server.