Active · Mar 1, 2026

IoT Network Attack Detection

A multi-class machine learning project that detects IoT network attacks using behavioral time-window traffic features from a 27M-packet dataset.

Python, Scikit-learn, Pandas, NumPy, Machine Learning, Data Mining

Highlights

  • Processed and sampled a 27M-packet IoT network dataset.
  • Built multi-class classification models to detect SYN DoS, ARP MitM, Mirai, and other attacks.
  • Evaluated model performance using precision, recall, F1-score, and ROC-AUC.
  • Analyzed feature importance across 115 engineered temporal traffic features.

Overview

This project was completed for IS 480: Advanced Business Analytics and focuses on detecting IoT network attacks using supervised machine learning.

The data comes from the Kitsune Network Attack dataset (UCI Machine Learning Repository), which contains over 27 million packet-level records captured from a commercial IP-based surveillance and IoT environment under controlled attack scenarios.

Each packet is represented by 115 engineered behavioral traffic features extracted across five historical time windows.
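To make the idea of time-windowed features concrete, here is a minimal sketch of one way such statistics can be maintained: an exponentially decayed running mean and standard deviation, updated once per packet, with one instance per window. The class name `DampedStat`, the decay rates, and the toy packet values are illustrative assumptions, not the dataset's actual feature extractor.

```python
import math


class DampedStat:
    """Exponentially decayed running statistics for one time window (illustrative)."""

    def __init__(self, decay: float):
        self.decay = decay          # larger decay = shorter effective window
        self.weight = 0.0           # decayed packet count
        self.linear_sum = 0.0       # decayed sum of values
        self.squared_sum = 0.0      # decayed sum of squared values
        self.last_time = None

    def update(self, value: float, timestamp: float) -> None:
        # Fade the existing totals according to the time elapsed since the last packet.
        if self.last_time is not None:
            factor = 2.0 ** (-self.decay * (timestamp - self.last_time))
            self.weight *= factor
            self.linear_sum *= factor
            self.squared_sum *= factor
        self.weight += 1.0
        self.linear_sum += value
        self.squared_sum += value * value
        self.last_time = timestamp

    def mean(self) -> float:
        return self.linear_sum / self.weight if self.weight else 0.0

    def std(self) -> float:
        if not self.weight:
            return 0.0
        variance = self.squared_sum / self.weight - self.mean() ** 2
        return math.sqrt(max(variance, 0.0))


# Five windows with assumed decay rates (fast to slow); values are hypothetical.
DECAY_RATES = [5.0, 3.0, 1.0, 0.1, 0.01]
windows = [DampedStat(rate) for rate in DECAY_RATES]

# Toy packets as (timestamp in seconds, packet size in bytes).
toy_packets = [(0.00, 60.0), (0.05, 1500.0), (0.40, 60.0), (2.00, 1500.0)]
for timestamp, size in toy_packets:
    for window in windows:
        window.update(size, timestamp)

# Three statistics per window give a 15-value feature vector for the latest packet.
feature_vector = [v for w in windows for v in (w.weight, w.mean(), w.std())]
```

Windows with a high decay rate forget older packets quickly and react to bursts, while slow-decay windows summarize longer-term behavior.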


Problem

IoT devices are increasingly vulnerable to network-based attacks including denial-of-service, man-in-the-middle, reconnaissance scanning, and botnet infections.

Traditional signature-based intrusion detection systems struggle to identify behavioral or previously unseen threats.

This project investigates whether temporal behavioral network statistics can be used to accurately classify and distinguish between multiple IoT attack types.


Dataset

  • 27M+ packet records
  • 115 numerical traffic behavior features
  • Sequential, time-series structure
  • Multi-class traffic labels, including:
    • SYN DoS
    • ARP MitM
    • Mirai Botnet
    • SSDP Flood
    • Benign traffic

Because of the dataset's size, stratified sampling was used to keep training computationally feasible while preserving the class distribution.
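A minimal sketch of stratified sampling with pandas, shown on synthetic data. The column names, class counts, and the 10% fraction are assumptions for illustration; the project's actual sampling rate is not stated here.

```python
import numpy as np
import pandas as pd


def stratified_sample(df: pd.DataFrame, label_col: str, frac: float, seed: int = 42) -> pd.DataFrame:
    """Draw the same fraction from every class so class shares are preserved."""
    return df.groupby(label_col).sample(frac=frac, random_state=seed)


# Synthetic stand-in for the packet table (hypothetical columns and class sizes).
rng = np.random.default_rng(0)
labels = np.array(["benign"] * 8000 + ["syn_dos"] * 1500 + ["mirai"] * 500)
packets = pd.DataFrame({"feature_0": rng.normal(size=labels.size), "label": labels})

subset = stratified_sample(packets, label_col="label", frac=0.1)

# Compare class proportions before and after sampling.
full_share = packets["label"].value_counts(normalize=True)
subset_share = subset["label"].value_counts(normalize=True)
```

Sampling within each class, rather than from the table as a whole, keeps rare attack types from disappearing in the reduced dataset.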


Approach

Data Processing

  • Loaded and merged selected attack datasets
  • Applied stratified sampling
  • Scaled numerical features where required
  • Structured labels for multi-class classification
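The label-structuring and scaling steps above can be sketched as follows, using synthetic data. The label strings, array shapes, and split ratio are assumptions; the scaler is fitted on the training split only so that test statistics do not leak into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic features and text labels standing in for the sampled packets.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(600, 5))
y_text = np.repeat(["benign", "mirai", "syn_dos"], 200)

# Map class names to integer codes for multi-class classifiers.
encoder = LabelEncoder()
y = encoder.fit_transform(y_text)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit scaling parameters on the training split, then apply them to both splits.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Scaling mainly matters for distance- or gradient-based models such as logistic regression; tree-based models are largely insensitive to it.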

Models Evaluated

  • Logistic Regression
  • Decision Tree
  • Random Forest
  • Gradient Boosting (where feasible)
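One way to compare these four model families side by side is to fit them in a loop on the same split. The sketch below uses scikit-learn with synthetic data from `make_classification`; the hyperparameters are library defaults or arbitrary choices, not the project's tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic four-class problem standing in for the sampled traffic data.
X, y = make_classification(
    n_samples=1500, n_features=20, n_informative=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    # Logistic regression is wrapped with a scaler; the tree models use raw features.
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and record its held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
```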

Evaluation Metrics

  • Accuracy
  • Precision
  • Recall
  • F1-score
  • ROC-AUC
  • Confusion matrix analysis

Special emphasis was placed on recall and F1-score due to class imbalance and the security context of attack detection.
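A hedged sketch of how these metrics can be computed for a multi-class model with scikit-learn. The data is synthetic and deliberately imbalanced; macro averaging and one-vs-rest ROC-AUC are assumed choices, since the project's averaging scheme is not specified.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Imbalanced synthetic four-class data (class weights are illustrative).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, n_classes=4,
    n_clusters_per_class=1, weights=[0.7, 0.15, 0.1, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Macro averaging weights every class equally, so rare attacks count as much as benign traffic.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro", zero_division=0
)
# One-vs-rest ROC-AUC computed from predicted class probabilities.
roc_auc = roc_auc_score(y_test, y_proba, multi_class="ovr", average="macro")
# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
```

On imbalanced data, accuracy can look strong while minority-class recall is poor, which is why the macro-averaged figures and the confusion matrix are worth reading together.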


Key Focus

  • Multi-class attack detection rather than simple binary classification
  • Behavioral feature importance analysis
  • Understanding which temporal traffic patterns most strongly indicate specific attack types
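Feature importance analysis could look something like the sketch below, which ranks a random forest's impurity-based importances. The feature names and synthetic data are placeholders; this is one common approach, not necessarily the method used in the project.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# With shuffle=False, the first four columns are informative and the rest are noise.
X, y = make_classification(
    n_samples=1500, n_features=10, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, shuffle=False, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort them from most to least important.
importances = pd.Series(forest.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)
top_features = importances.head(4)
```

To see which features matter for a single attack type, the same ranking can be repeated on a binary version of the labels (that attack versus everything else).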

Status

Model training and evaluation experiments are ongoing. A formal performance comparison and technical write-up will be published upon completion.