Fig 1: The ReAct agent prompt structure and workflow [19]. The prompt at each stage consists of several components marked with different colors.
We are excited to announce that our ICAART extension paper, “Prompt. Exploit. Repeat: Automating Network Security Testing with LLMs,” is now available as part of the Lecture Notes in Artificial Intelligence series. The two-part volume it appears in, published within the Lecture Notes in Computer Science (LNCS) collection, constitutes the refereed post-proceedings of the 16th International Conference on Agents and Artificial Intelligence (ICAART 2024), which took place in Rome, Italy, in February 2024.
This work is the result of a collaborative research effort by Maria Rigaki (CTU in Prague), Ondřej Lukáš (CTU in Prague), Carlos A. Catania (UNCuyo), and Sebastian Garcia (CTU in Prague).
Large Language Models (LLMs) have demonstrated remarkable capabilities across many domains, yet their application in cybersecurity remains largely unexplored. Despite their inherent limitations, LLM-based designs show promise in planning and navigating open-world scenarios. Our paper investigates how pre-trained LLMs can serve as agents in network security environments—a field traditionally dominated by reinforcement learning (RL) approaches.
Fig 2: Setup of both NetSecGame topology versions: the small scenario is marked in blue and the full scenario in green.
We introduce a novel method that uses LLMs for sequential decision-making in cybersecurity scenarios. Unlike RL models, our approach is flexible and does not require re-training to adapt to new scenarios. We evaluated our method in two distinct environments: Microsoft’s CyberBattleSim and our newly developed NetSecGame. NetSecGame offers more realistic network conditions and includes a defender component, addressing limitations of existing platforms.
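To make the idea of LLM-driven sequential decision-making more concrete, here is a minimal sketch of an agent loop in Python. It is an illustration under stated assumptions, not the implementation from the paper: the `ToyEnv` class, the action names, the prompt wording, and the `scripted_llm` stub (which stands in for a real model call) are all hypothetical. The loop only shows the general pattern: build a prompt from the current state and a short memory of past actions, ask the model for a thought and an action, apply the action, and repeat.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical action names for a toy attack chain; not the real NetSecGame API.
ACTIONS = ["ScanNetwork", "FindServices", "ExploitService", "FindData", "ExfiltrateData"]


@dataclass
class ToyEnv:
    """Toy stand-in for a network security environment: the goal is reached
    once every action in ACTIONS has been executed in order."""
    progress: int = 0

    def observe(self) -> str:
        done = ACTIONS[: self.progress]
        return f"Completed steps: {done if done else 'none'}"

    def step(self, action: str) -> Tuple[str, bool]:
        # An action only succeeds if it is the next one in the chain.
        if self.progress < len(ACTIONS) and action == ACTIONS[self.progress]:
            self.progress += 1
            feedback = f"{action} succeeded"
        else:
            feedback = f"{action} had no effect"
        return feedback, self.progress == len(ACTIONS)


def build_prompt(observation: str, memory: List[str]) -> str:
    """Assemble the prompt from instructions, current state, and recent actions."""
    history = "\n".join(memory[-5:]) or "(none)"
    return (
        "You are a penetration tester. Valid actions: " + ", ".join(ACTIONS) + "\n"
        f"Current state: {observation}\n"
        f"Previous actions:\n{history}\n"
        "Reply as 'Thought: <reasoning>' then 'Action: <name>'."
    )


def parse_action(reply: str) -> str:
    """Extract the action name from the last 'Action:' line of the reply."""
    for line in reversed(reply.splitlines()):
        if line.strip().startswith("Action:"):
            return line.split(":", 1)[1].strip()
    return ""  # unparseable reply; the toy environment treats it as a no-op


def run_episode(llm: Callable[[str], str], env: ToyEnv, max_steps: int = 10) -> Tuple[bool, int]:
    """Run one episode and return (goal_reached, steps_used)."""
    memory: List[str] = []
    for step in range(1, max_steps + 1):
        reply = llm(build_prompt(env.observe(), memory))
        action = parse_action(reply)
        feedback, done = env.step(action)
        memory.append(f"{action} -> {feedback}")
        if done:
            return True, step
    return False, max_steps


def scripted_llm(prompt: str) -> str:
    """Deterministic stub in place of a real model: picks the first action
    that is not yet listed in the prompt's 'Current state' line."""
    state_line = next(ln for ln in prompt.splitlines() if ln.startswith("Current state:"))
    for name in ACTIONS:
        if name not in state_line:
            return f"Thought: {name} is the next step.\nAction: {name}"
    return "Thought: nothing left to do.\nAction: "


if __name__ == "__main__":
    print(run_episode(scripted_llm, ToyEnv()))
```

With the scripted stub, the episode finishes the five-step chain in five steps. Swapping `scripted_llm` for a function that calls an actual model would leave the loop unchanged, which is the sense in which such an agent needs no re-training to face a new scenario: only the prompt contents change.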
Our results are promising: the best LLM agents achieved success rates of 100% in undefended scenarios and up to 53.3% in the most challenging defended scenarios. The LLM agents outperformed traditional RL agents without requiring any additional training.
Additionally, we present NetSecGame, a modular and scalable environment for testing attacking and defending agents in realistic conditions.
This research highlights the potential of LLMs in cybersecurity, offering a flexible and efficient alternative to traditional RL approaches in network security testing.