active · Mar 1, 2026

IoT Network Attack Detection

A multi-class machine learning project that detects IoT network attacks using behavioral time-window traffic features from a 27M-packet dataset.

Python, Scikit-learn, Pandas, NumPy, Machine Learning, Data Mining

Highlights

- Processed and sampled a 27M-packet IoT network dataset.
- Built multi-class classification models to detect SYN DoS, ARP MitM, Mirai, and other attacks.
- Evaluated model performance using precision, recall, F1-score, and ROC-AUC.
- Analyzed feature importance across 115 engineered temporal traffic features.

Overview

This project was completed for IS 480: Advanced Business Analytics and focuses on detecting IoT network attacks using supervised machine learning.

The data comes from the Kitsune Network Attack dataset (UCI Machine Learning Repository), which contains over 27 million packet-level records captured from a commercial IP-based surveillance and IoT environment under controlled attack scenarios.

Each packet is represented by 115 engineered behavioral traffic features extracted across five historical time windows.
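To make the idea of time-window features concrete, here is a minimal sketch that computes the mean and standard deviation of packet size over five trailing windows with pandas. The column names (`timestamp`, `size`), the window lengths, and the `window_features` helper are illustrative assumptions. The dataset ships with its features precomputed, and its own extractor may define windows and statistics differently.

```python
import pandas as pd

# Illustrative window lengths only (assumption, not taken from the dataset).
WINDOWS = ["100ms", "500ms", "1500ms", "10s", "60s"]


def window_features(packets: pd.DataFrame) -> pd.DataFrame:
    """Per-packet mean and std of packet size over several trailing time windows."""
    sizes = packets.set_index("timestamp")["size"].sort_index()
    features = {}
    for window in WINDOWS:
        rolled = sizes.rolling(window)  # time-based window ending at each packet
        features[f"size_mean_{window}"] = rolled.mean()
        features[f"size_std_{window}"] = rolled.std()
    return pd.DataFrame(features)


# Tiny hypothetical capture: six packets with arrival times in milliseconds.
packets = pd.DataFrame({
    "timestamp": pd.to_datetime([0, 50, 120, 700, 2000, 12000], unit="ms"),
    "size": [60, 1500, 60, 400, 60, 1500],
})
window_feats = window_features(packets)
print(window_feats.round(1))
```

Short windows react to bursts while long windows capture slower baseline behavior, which is why the same statistic is repeated at several time scales.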

Problem

IoT devices are increasingly vulnerable to network-based attacks, including denial-of-service, man-in-the-middle, reconnaissance scanning, and botnet infections.

Traditional signature-based intrusion detection systems struggle to identify behavioral or previously unseen threats.

This project investigates whether temporal behavioral network statistics can accurately distinguish between multiple IoT attack types.

Dataset

- 27M+ packet records
- 115 numerical traffic behavior features
- Sequential, time-series structure
- Multi-class attack categories, including:
  - SYN DoS
  - ARP MitM
  - Mirai Botnet
  - SSDP Flood
  - Benign traffic

Due to dataset size, a stratified sampling strategy was used to ensure computational feasibility while preserving class distribution.
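One way such a stratified sample could be drawn is sketched below with pandas. The `stratified_sample` helper, the 10% fraction, the column names, and the synthetic class counts are assumptions for illustration, not the project's actual settings.

```python
import numpy as np
import pandas as pd


def stratified_sample(df: pd.DataFrame, label_col: str, frac: float, seed: int = 42) -> pd.DataFrame:
    """Draw the same fraction of rows from every class so proportions are preserved."""
    return df.groupby(label_col).sample(frac=frac, random_state=seed)


# Synthetic, imbalanced stand-in for the packet table (hypothetical counts).
rng = np.random.default_rng(0)
labels = np.repeat(["Benign", "Mirai", "SYN DoS"], [8000, 1500, 500])
full = pd.DataFrame({"feature_0": rng.normal(size=labels.size), "label": labels})

sample = stratified_sample(full, "label", frac=0.1)

# Class proportions before and after sampling should match.
print(full["label"].value_counts(normalize=True))
print(sample["label"].value_counts(normalize=True))
```

Sampling per class rather than over the whole table keeps rare attack types from disappearing in the reduced dataset.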

Approach

Data Processing

- Loaded and merged selected attack datasets
- Applied stratified sampling
- Scaled numerical features where required
- Structured labels for multi-class classification
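The processing steps above can be sketched roughly as follows with pandas and scikit-learn. The `synthetic_capture` helper, the four feature columns, and the 80/20 train/test split are assumptions made so the example runs on its own. A real run would read the dataset files instead.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

rng = np.random.default_rng(0)


def synthetic_capture(label: str, n_rows: int, shift: float) -> pd.DataFrame:
    """Hypothetical stand-in for one capture file: random features plus a label."""
    frame = pd.DataFrame(
        rng.normal(shift, 1.0, size=(n_rows, 4)),
        columns=[f"f{i}" for i in range(4)],
    )
    frame["label"] = label
    return frame


# Merge the selected captures into one table.
merged = pd.concat(
    [
        synthetic_capture("Benign", 300, 0.0),
        synthetic_capture("SYN DoS", 100, 2.0),
        synthetic_capture("Mirai", 100, -2.0),
    ],
    ignore_index=True,
)

# Encode string labels as integers for multi-class classification.
encoder = LabelEncoder()
y = encoder.fit_transform(merged["label"])
X = merged.drop(columns="label")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on training data only to avoid leaking test statistics.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Scaling matters mainly for models such as logistic regression. Tree-based models are insensitive to feature scale, which is consistent with scaling only "where required".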

Models Evaluated

- Logistic Regression
- Decision Tree
- Random Forest
- Gradient Boosting (where feasible)
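A minimal sketch of how these four model families could be compared side by side in scikit-learn is shown below. The synthetic data from `make_classification` and every hyperparameter value are placeholders, not the project's configuration or results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic four-class problem standing in for the sampled traffic features.
X, y = make_classification(
    n_samples=1500, n_features=20, n_informative=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    # Only the linear model is wrapped with a scaler.
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(max_depth=10, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test), average="macro")

for name, score in scores.items():
    print(f"{name}: macro F1 = {score:.3f}")
```

Gradient boosting trains its trees sequentially, so it is usually the slowest of the four on large samples, which may be why it is listed as "where feasible".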

Evaluation Metrics

- Accuracy
- Precision
- Recall
- F1-score
- ROC-AUC
- Confusion matrix analysis

Special emphasis was placed on recall and F1-score due to class imbalance and the security context of attack detection.
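The sketch below shows one way to compute these metrics for a multi-class model with scikit-learn, using macro averaging so that rare classes count as much as common ones. The imbalanced synthetic data and the random forest are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced four-class data; class 0 plays the role of benign traffic.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, n_classes=4,
    n_clusters_per_class=1, weights=[0.7, 0.15, 0.1, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)

# Per-class precision, recall, and F1, plus macro and weighted averages.
report = classification_report(y_test, y_pred, output_dict=True, zero_division=0)
macro_recall = report["macro avg"]["recall"]
macro_f1 = report["macro avg"]["f1-score"]

# One-vs-rest ROC-AUC, macro-averaged across classes.
auc = roc_auc_score(y_test, y_proba, multi_class="ovr", average="macro")

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)

print(f"macro recall={macro_recall:.3f}  macro F1={macro_f1:.3f}  ROC-AUC={auc:.3f}")
print(cm)
```

Accuracy alone can look strong when benign traffic dominates, so per-class recall and the confusion matrix are the more informative views for spotting missed attacks.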

Key Focus

- Multi-class attack detection rather than simple binary classification
- Behavioral feature importance analysis
- Understanding which temporal traffic patterns most strongly indicate specific attack types
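As a rough sketch of how feature importance might be inspected, the example below ranks features by a random forest's impurity-based importance, then fits a one-vs-rest forest per class to see which feature stands out for each class. The synthetic data, the `feat_*` names, and the one-vs-rest approach are assumptions, not the project's confirmed method.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic three-class data; with shuffle=False the informative features come first.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, shuffle=False, random_state=0,
)
feature_names = [f"feat_{i}" for i in range(X.shape[1])]  # hypothetical names

# Global ranking across all classes.
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = pd.Series(clf.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)
print(importances.head(5))

# Per-class view: one binary "this class vs the rest" forest per class.
top_feature_per_class = {}
for cls in np.unique(y):
    ovr = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y == cls)
    ranked = pd.Series(ovr.feature_importances_, index=feature_names)
    top_feature_per_class[int(cls)] = ranked.idxmax()
print(top_feature_per_class)
```

Impurity-based importances can favor high-cardinality or correlated features, so a check with `sklearn.inspection.permutation_importance` on held-out data is a reasonable complement.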

Status

Model training and evaluation experiments are ongoing. A formal performance comparison and technical write-up will be published upon completion.