RanDS
The largest ransomware dataset.
The Raw Strings Dataset contains ASCII and UTF-16 strings extracted from PE files using static analysis. The strings are cleaned, normalized, and consolidated into text for each sample.
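As a rough illustration of this kind of static string extraction, the sketch below scans raw bytes for runs of printable ASCII and UTF-16LE characters and then merges them into one text. It is not the dataset's actual pipeline: the minimum length of 4, the function names `extract_strings` and `consolidate`, and the specific normalization (whitespace collapsing and de-duplication) are assumptions made for the example.

```python
import re

MIN_LEN = 4  # assumed minimum run length, mirroring the common `strings` default

# Runs of printable ASCII bytes.
ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
# Runs of printable ASCII characters stored as UTF-16LE (character byte + NUL byte).
UTF16_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)


def extract_strings(data: bytes) -> list:
    """Return ASCII strings first, then UTF-16LE strings, found in `data`."""
    ascii_strings = [m.group().decode("ascii") for m in ASCII_RE.finditer(data)]
    utf16_strings = [m.group().decode("utf-16-le") for m in UTF16_RE.finditer(data)]
    return ascii_strings + utf16_strings


def consolidate(strings) -> str:
    """Collapse whitespace, drop empties and duplicates (keeping order), join by newline."""
    seen = {}
    for s in strings:
        s = " ".join(s.split())
        if s:
            seen.setdefault(s, None)
    return "\n".join(seen)
```

For a file on disk, something like `consolidate(extract_strings(open(path, "rb").read()))` would give one text per sample, where `path` is a placeholder for a PE file location.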
A filtered version of the Raw Strings Dataset that keeps only meaningful English words, as identified by the Python Enchant library (PyEnchant).
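A dictionary-based filter of this kind might look like the sketch below. The tokenization rule (alphabetic tokens of at least three letters), the `en_US` dictionary tag, and the helper names `keep_english_words` and `enchant_checker` are assumptions for illustration; the filtering rules actually used for the dataset may differ.

```python
import re
from typing import Callable

# Assumed tokenization: purely alphabetic tokens of at least 3 letters.
WORD_RE = re.compile(r"[A-Za-z]{3,}")


def keep_english_words(text: str, is_word: Callable[[str], bool]) -> str:
    """Keep only the tokens that `is_word` accepts, joined by single spaces."""
    tokens = WORD_RE.findall(text)
    return " ".join(t for t in tokens if is_word(t))


def enchant_checker(tag: str = "en_US") -> Callable[[str], bool]:
    """Build a word checker backed by PyEnchant (requires `pip install pyenchant`)."""
    import enchant  # imported lazily so the filter itself has no hard dependency

    return enchant.Dict(tag).check
```

With PyEnchant installed, `keep_english_words(raw_text, enchant_checker())` would apply the filter to one sample's text. Passing the checker as a parameter also makes it easy to swap in another word list.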
The APIs Dataset contains the imported and exported APIs of PE files, extracted with the Python pefile module and stored in structured JSON format.
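The following sketch shows one way imports and exports can be read with pefile and serialized to JSON. The record layout (`imports` keyed by DLL name, `exports` as a flat list), the `ordinal_N` placeholder for unnamed symbols, and the function names are assumptions; the dataset's own JSON schema is not specified here and may be organized differently.

```python
import json


def read_pe_apis(path: str) -> dict:
    """Collect imported and exported symbol names from the PE file at `path`."""
    import pefile  # third-party: pip install pefile

    pe = pefile.PE(path)
    imports = {}
    # The attribute is absent when the file has no import directory.
    for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
        dll = entry.dll.decode(errors="replace")
        for imp in entry.imports:
            # Symbols imported by ordinal have no name.
            name = imp.name.decode(errors="replace") if imp.name else f"ordinal_{imp.ordinal}"
            imports.setdefault(dll, []).append(name)

    exports = []
    export_dir = getattr(pe, "DIRECTORY_ENTRY_EXPORT", None)
    if export_dir is not None:
        for sym in export_dir.symbols:
            name = sym.name.decode(errors="replace") if sym.name else f"ordinal_{sym.ordinal}"
            exports.append(name)

    pe.close()
    return {"imports": imports, "exports": exports}


def to_json(record: dict) -> str:
    """Serialize one sample's record as indented JSON with stable key order."""
    return json.dumps(record, indent=2, sort_keys=True)
```

A call such as `to_json(read_pe_apis("sample.exe"))` would then produce the JSON text for one sample, where `sample.exe` is a placeholder file name.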
A demangled version of the APIs Dataset, produced with Demumble, which supports both Itanium and Visual Studio symbols.
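Demumble is a command-line tool, so a demangling pass can be sketched as a thin wrapper around it. The example below assumes a `demumble` binary is available on `PATH` and uses its stdin filter mode, feeding one symbol per line. The function name `demangle` and its `demumble` parameter are hypothetical, and this is not necessarily how the dataset was produced.

```python
import shutil
import subprocess


def demangle(symbols, demumble="demumble"):
    """Demangle `symbols` by piping them, one per line, through the demumble binary."""
    if not symbols:
        return []
    exe = shutil.which(demumble)
    if exe is None:
        raise FileNotFoundError(f"demumble binary not found: {demumble}")
    result = subprocess.run(
        [exe],
        input="\n".join(symbols) + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise if demumble exits with an error
    )
    # In filter mode, text that is not a mangled symbol passes through unchanged,
    # so the output is expected to have one line per input symbol.
    return result.stdout.splitlines()
```

Names that are not mangled, such as plain Win32 API names, should come back as they went in.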
Generated by executing PE files in the CAPEv2 and Cuckoo sandboxes to capture runtime behavior, including registry, file, process, network, and API activity.
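To give a sense of how such behavioral data can be consumed, the sketch below counts API calls per category in a sandbox report. It assumes a Cuckoo-style `report.json` layout, with calls under `behavior.processes[].calls[]` and each call carrying `api` and `category` fields. The format in which this dataset distributes its dynamic data may differ, and `count_calls_by_category` is a hypothetical helper.

```python
from collections import Counter


def count_calls_by_category(report: dict) -> Counter:
    """Count recorded API calls per category across all processes in a report."""
    counts = Counter()
    for proc in report.get("behavior", {}).get("processes", []):
        for call in proc.get("calls", []):
            # Calls without a category are grouped under "unknown".
            counts[call.get("category", "unknown")] += 1
    return counts
```

A report loaded with `json.load` can be passed in directly. The resulting counter gives a compact per-sample profile of registry, file, process, and network activity.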
Paper
For more details about the dataset, please see our paper listed below.