Security Research

Independent vulnerability research and responsible disclosure from the HoneyLens project. Our approach combines manual code review, LLM-assisted analysis, and automated fuzzing to find vulnerabilities in open-source software. All findings go through coordinated disclosure via national CERT teams before publication.

Complete — CVE-2026-41284 (Low) 2026-04-11 reported · 2026-05-12 published

Apache Tomcat WebDAV — CVE-2026-41284

Coordinated disclosure of an unbounded-resource issue in Apache Tomcat’s WebdavServlet. Reported through Apache Security on 2026-04-11; rated Low and published 2026-05-12 in the May release advisory.

Apache Tomcat · WebDAV · CVE-2026-41284 · Low severity
Responsible disclosure in progress 2026-04-04

TP-Link Archer A6 v4.0 — LLM-Assisted Router Penetration Testing

48-hour LLM-assisted pentest of a consumer Wi-Fi router. Reversed the encrypted API protocol, decompressed the TPOS firmware (4.3 MB ARM binary), and discovered 8 vulnerabilities, including 2 Critical. Firmware update 1.14.30 fixed 1 of the 8. Testing covered 120+ test vectors across 6 rounds. Report submitted to TP-Link PSIRT.

IoT · Router · Firmware RE · ARM · 8 vulnerabilities · 2× Critical
Complete 2026-04-02

wolfSSL 5.9.0 — LLM-Augmented Fuzzing Campaign

1.22 billion AFL++ executions, 3 LLMs scanning 202K lines of C, zero exploitable vulnerabilities. wolfSSL survived everything we threw at it. The value was in building and validating the methodology.

Fuzzing · AFL++ · CMPLOG · ASAN · TLS · ASN.1
Manual verification 2026-04-02

BearSSL — Applying the wolfSSL Methodology to an Unfuzzed Target

Same pipeline, different target. BearSSL has never been through OSS-Fuzz. 419M AFL++ executions, 3 LLMs, and cross-model confirmation of findings. Findings are under manual verification and exploit development.
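
Cross-model confirmation can be pictured as a simple voting filter: a finding is kept for manual review only when more than one model reports it independently. The sketch below is an assumption about how such a filter might look, not the project's actual pipeline. The `confirmed_findings` helper, the `min_models` threshold, and all model, file, and function names are hypothetical placeholders.

```python
from collections import defaultdict


def confirmed_findings(reports, min_models=2):
    """Return findings flagged by at least `min_models` distinct models.

    `reports` maps a model name to an iterable of (file, function) pairs.
    The result maps each confirmed finding to the sorted list of models
    that reported it.
    """
    seen_by = defaultdict(set)
    for model, findings in reports.items():
        for finding in findings:
            seen_by[finding].add(model)
    return {
        finding: sorted(models)
        for finding, models in seen_by.items()
        if len(models) >= min_models
    }


# Hypothetical example data, not actual campaign output.
example_reports = {
    "model_a": [("file_a.c", "func_1"), ("file_b.c", "func_2")],
    "model_b": [("file_a.c", "func_1")],
    "model_c": [("file_a.c", "func_1"), ("file_b.c", "func_2"), ("file_c.c", "func_3")],
}

confirmed = confirmed_findings(example_reports)
for finding, models in confirmed.items():
    print(finding, "confirmed by", models)
```

Raising `min_models` to 3 keeps only findings that every model agrees on, which trades recall for fewer false positives during manual verification.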

Fuzzing · AFL++ · Multi-model · TLS
Complete 2026-03-31

Vision Model Comparison — Hardware Pentest Tool Identification

Can LLM vision models identify hardware security tools from a photograph? Three rounds of testing across six models revealed that local vision models are not yet viable, and that controlled testing methodology matters more than you’d think.

Vision · LLM benchmark · Hardware pentest · OCR · Ollama
Complete 2026-03-31

LLM Protocol Recognition — Can AI Identify Protocols Better Than Shodan?

12 test cases covering DNS tunneling, C2 beaconing, ICS/SCADA, and protocol evasion. Sonnet scored 100%, qwen2.5:14b 75%. Even a local GPU model would have avoided the Shodan misclassification that killed our server.

Protocol analysis · LLM benchmark · ICS/SCADA · C2 detection · Ollama
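
The benchmark percentages for the 12 test cases can be read as simple exact-match scoring. The sketch below assumes each test case has one expected protocol label and equal weight, in which case 75% corresponds to 9 of 12 correct. The `accuracy` helper and the placeholder labels are hypothetical and are not the project's actual harness or data.

```python
def accuracy(predictions, expected):
    """Percentage of predictions matching the expected labels.

    Comparison ignores case and surrounding whitespace.
    """
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    if not expected:
        raise ValueError("at least one test case is required")
    correct = sum(
        p.strip().lower() == e.strip().lower()
        for p, e in zip(predictions, expected)
    )
    return 100.0 * correct / len(expected)


# Hypothetical labels standing in for the 12 test cases.
expected_labels = [f"protocol_{i}" for i in range(12)]

# A model that labels every case correctly.
all_correct = list(expected_labels)

# A model that gets 3 of the 12 cases wrong.
three_wrong = list(expected_labels)
for i in (0, 5, 11):
    three_wrong[i] = "unknown"

score_all = accuracy(all_correct, expected_labels)
score_partial = accuracy(three_wrong, expected_labels)
print(score_all, score_partial)
```

With only 12 cases, each one is worth about 8.3 percentage points, so small differences between models should be read with that granularity in mind.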