ReinforceWall

AI-Powered Network Defense Through Reinforcement Learning

GitHub Repo

Overview / About

ReinforceWall is a Reinforcement Learning-based network defense system that trains an intelligent agent to detect and respond to cyberattacks in real time. Instead of relying on static rules, the system uses a Deep Q-Network (DQN) to learn optimal defensive strategies — deciding whether to block, alert, log, or ignore each incoming network request based on 20-dimensional behavioral feature vectors.
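At inference time that decision reduces to a single forward pass: the 20-dimensional feature vector goes into the Q-network, and the highest-valued of the four defensive actions is taken. A minimal PyTorch sketch — the layer sizes and the action ordering here are assumptions for illustration, not the project's actual architecture:

```python
import torch
import torch.nn as nn

# Assumed action ordering; the real project may differ.
ACTIONS = ["block", "alert", "log", "ignore"]

# Illustrative Q-network: 20-d behavioral features in, one Q-value per action out.
q_net = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, len(ACTIONS)),
)

state = torch.randn(1, 20)                      # one 20-dimensional feature vector
with torch.no_grad():
    action = q_net(state).argmax(dim=1).item()  # greedy defensive choice
```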

Problem

Traditional intrusion detection systems depend on hand-crafted rules and known attack signatures. They struggle with novel threats, produce high false-positive rates, and require constant manual tuning. As attackers evolve their strategies, static defenses fall behind.

Solution

Train an RL agent that learns from experience — observing patterns in network traffic and adapting its defense strategy over thousands of simulated episodes. The agent receives rewards for correctly identifying threats and penalties for false alarms, naturally developing a balanced and effective security policy.

Key Features

  • 10 Attack Types: SQL Injection, XSS, Brute Force, DDoS, Command Injection, Path Traversal, Port Scanning, CSRF, MITM, Phishing
  • Deep Q-Network: PyTorch-based DQN agent with experience replay, epsilon-greedy exploration, and target network updates
  • Custom RL Environment: Gymnasium-compatible environment with a 20-dimensional state space and a 4-action defensive action space
  • Curriculum Learning: progressive difficulty levels that gradually increase attack complexity during training
  • Real-time Dashboard: Flask + WebSocket dashboard for live training monitoring, metrics visualization, and model management
  • Firewall Integration: supports both simulation mode and real iptables integration for production deployment
  • Baseline Comparison: rule-based attack detector included as a performance baseline

Tech Stack

  • Core AI/ML: Python, PyTorch, Gymnasium, NumPy
  • Environment: custom Gym environment, attack traffic simulator, 20D state feature extraction
  • Training: DQN with experience replay, target networks, curriculum learning, epsilon decay
  • Dashboard: Flask, Flask-SocketIO, WebSocket, HTML/CSS/JS, Chart.js
  • Infrastructure: iptables integration (optional), structured logging, metrics tracking (CSV/JSON)

How It Works

  1. Traffic Simulation — The AttackSimulator generates realistic network requests, mixing normal traffic with 10 attack types at configurable probabilities
  2. State Extraction — A StateExtractor converts each raw request into a 20-dimensional feature vector capturing behavioral patterns (request rate, payload entropy, suspicious headers, etc.)
  3. Agent Decision — The DQN agent observes the state and selects a defensive action: Block, Alert, Log, or Ignore
  4. Reward Signal — The environment provides rewards: +8 for correctly blocking attacks, −10 for ignoring attacks, −2 for blocking legitimate traffic
  5. Learning — Through thousands of episodes with epsilon-greedy exploration, experience replay, and target network updates, the agent converges on an optimal defense policy
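Steps 3–5 boil down to the standard DQN update. This sketch uses illustrative hyperparameters and layer sizes (not the project's), and omits terminal flags for brevity, but it shows the epsilon-greedy selection, replay sampling, and target-network bootstrapping named above:

```python
import random
from collections import deque
import torch
import torch.nn as nn

def make_net():
    # Illustrative Q-network: 20-d state in, Q-values for 4 actions out.
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 4))

q_net, target_net = make_net(), make_net()
target_net.load_state_dict(q_net.state_dict())   # target starts as a copy
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                    # experience replay buffer
gamma, epsilon = 0.99, 0.1                       # discount factor, exploration rate

def select_action(state):
    # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
    if random.random() < epsilon:
        return random.randrange(4)
    with torch.no_grad():
        return q_net(state).argmax().item()

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    states, actions, rewards, next_states = zip(*random.sample(replay, batch_size))
    states, next_states = torch.stack(states), torch.stack(next_states)
    actions = torch.tensor(actions)
    rewards = torch.tensor(rewards, dtype=torch.float32)
    # Q(s, a) for the actions actually taken.
    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Bootstrap the target from the slowly-updated target network.
    with torch.no_grad():
        target = rewards + gamma * target_net(next_states).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Periodically during training: target_net.load_state_dict(q_net.state_dict())
```

Sampling uncorrelated transitions from the replay buffer and bootstrapping against a frozen target network are what keep this update stable over thousands of episodes.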

Results / Outcomes

  • The agent successfully trains to detect and respond to all 10 attack categories
  • Agent learns to balance security vs. availability — minimizing both false negatives (missed attacks) and false positives (blocked legitimate traffic)
  • Curriculum learning enables tackling progressively harder attack mixes
  • Real-time dashboard provides full visibility into training progress and model performance

My Role

  • Designed the complete RL pipeline: environment, state representation, reward structure, and agent architecture
  • Implemented 10 realistic attack traffic generators with configurable patterns
  • Built the DQN agent in PyTorch with experience replay and curriculum learning
  • Created a real-time Flask + WebSocket dashboard for live training monitoring
  • Developed a comprehensive metrics tracking and evaluation system