Hornet 40: Network Dataset of Geographically Placed Honeypots

Citation

Data In Brief Journal

Valeros, V., & Garcia, S. (2022). Hornet 40: Network dataset of geographically placed honeypots. Data in Brief, 40, 107795. doi: 10.1016/j.dib.2022.107795

Dataset

Valeros, Veronica (2021), “Hornet 40: Network Dataset of Geographically Placed Honeypots”, Mendeley Data, V1, doi: 10.17632/tcfzkbpw46.1

Description

Hornet 40 is a dataset of forty days of network attack traffic captured on cloud servers used as honeypots, intended to help understand how geography may affect the inflow of network attacks. The honeypots are located in eight different cities: Amsterdam, London, Frankfurt, San Francisco, New York, Singapore, Toronto, and Bangalore. The data was captured in April, May, and June 2021.

The eight cloud servers were created and configured simultaneously following identical instructions. The network capture was performed using the Argus network monitoring tool on each cloud server. The cloud servers had only one service running (SSH on a non-standard port) and were fully dedicated honeypots. No honeypot software was used in this dataset.

The dataset consists of eight scenarios, one for each geographically located cloud server. Each scenario contains bidirectional NetFlow files, distributed in the following formats:

  - hornet40-biargus.tar.gz: all scenarios, with bidirectional NetFlow files in Argus binary format
  - hornet40-netflow-v5.tar.gz: all scenarios, with bidirectional NetFlow v5 files in CSV format
  - hornet40-netflow-extended.tar.gz: all scenarios, with bidirectional NetFlow files in CSV format containing all features provided by Argus
  - hornet40-full.tar.gz: all the data in one download (biargus, NetFlow v5, and extended NetFlow)
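As a sketch of how the NetFlow v5 CSV files might be explored, the snippet below counts flows per destination port with awk. The column names and sample rows are hypothetical stand-ins; the real files use the header produced by ra with the configuration shipped with the dataset.

```shell
# Hypothetical miniature NetFlow v5 CSV; the real files follow the
# header produced by the dataset's ra configuration.
cat > /tmp/sample-netflow.csv <<'EOF'
StartTime,Proto,SrcAddr,Sport,DstAddr,Dport,TotPkts,TotBytes
2021-04-01 00:00:01,tcp,203.0.113.5,51324,10.0.0.2,22,4,240
2021-04-01 00:00:03,tcp,198.51.100.9,44211,10.0.0.2,23,2,120
2021-04-01 00:00:07,tcp,203.0.113.5,51330,10.0.0.2,22,6,360
EOF

# Count flows per destination port (column 6), skipping the header line.
awk -F, 'NR > 1 { count[$6]++ } END { for (p in count) print p, count[p] }' \
    /tmp/sample-netflow.csv | sort
```

With the sample rows above this prints one line per destination port with its flow count (here, two flows to port 22 and one to port 23).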

Steps to reproduce

This dataset used cloud server instances from DigitalOcean. All cloud servers had the same technical configuration:

  a. Operating System: Ubuntu 20.04 LTS
  b. Instance Capacity: 1 GB RAM / 1 Intel CPU
  c. Instance Storage: 25 GB NVMe SSD
  d. Instance Transfer: 1000 GB transfer
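The description does not state how the instances were provisioned; assuming the doctl command-line client, a loop like the one below could create eight identically sized droplets. The region slugs are real DigitalOcean slugs matching the eight cities, but the droplet names and SSH key ID are placeholders, and the command is only echoed here rather than executed.

```shell
# Dry-run provisioning sketch: droplet names and SSH_KEY_ID are
# placeholders; the doctl command is echoed, not executed.
SSH_KEY_ID="example-key-id"
for region in ams3 lon1 fra1 sfo3 nyc1 sgp1 tor1 blr1; do
    echo doctl compute droplet create "honeypot-$region" \
        --region "$region" \
        --size s-1vcpu-1gb \
        --image ubuntu-20-04-x64 \
        --ssh-keys "$SSH_KEY_ID"
done
```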

Once the cloud instances were created, the servers were configured simultaneously using the parallel-ssh and parallel-scp tools:

  1. Update the software repository: apt update
  2. Install Argus: apt install -yq argus-client argus-server
  3. Upload a common SSH configuration (with SSH on a non-standard port) to /etc/ssh/sshd_config on each server
  4. Restart SSH servers: /etc/init.d/ssh restart
  5. Upload a common Argus configuration to /etc/argus.conf on each server
  6. Start the Argus server: argus -F /etc/argus.conf -i eth0
  7. Create a folder to store the NetFlow files: mkdir /root/dataset
  8. Start rasplit to store the network data received by Argus: rasplit -S 127.0.0.1:900 -M time 1h -w /root/dataset/%Y/%m/%d/do-sensor.%H.%M.%S.biargus
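The eight steps above could be driven from a single workstation roughly as sketched below. This is a dry run: each parallel-ssh / parallel-scp command is echoed instead of executed, and the hosts file name and local configuration file names are placeholders.

```shell
# Dry-run sketch of steps 1-8; hosts.txt would list the eight droplet
# IPs, one per line. Commands are echoed, not executed.
HOSTS=hosts.txt
run()  { echo parallel-ssh -h "$HOSTS" "$@"; }
copy() { echo parallel-scp -h "$HOSTS" "$@"; }

run apt update
run apt install -yq argus-client argus-server
copy sshd_config /etc/ssh/sshd_config     # SSH on a non-standard port
run /etc/init.d/ssh restart
copy argus.conf /etc/argus.conf
run argus -F /etc/argus.conf -i eth0
run mkdir /root/dataset
run rasplit -S 127.0.0.1:900 -M time 1h \
    -w /root/dataset/%Y/%m/%d/do-sensor.%H.%M.%S.biargus
```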

Additional Files

In the resources folder, available in the full download (hornet40-full.tar.gz), we provide the argus, ra, and sshd configuration files.
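To obtain plain-text output from the binary Argus files, the ra client can convert them to comma-separated text. The invocation below is only echoed as a dry run; the field list is illustrative, and the dataset's own CSVs were produced with the ra configuration shipped in the resources folder.

```shell
# Dry run: print an ra invocation that would convert one binary Argus
# file to comma-separated text; -r reads a file, -c sets the separator,
# and -s selects the output fields (the field list here is illustrative).
echo ra -r do-sensor.00.00.00.biargus -c , \
    -s stime proto saddr sport daddr dport pkts bytes
```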