Security Data Science at Avast
Shell Language Processing: parsing Unix commands for Machine Learning
In this workshop, we present the Shell Language Preprocessing (SLP) concept, which focuses on preparing Unix command-line telemetry for Machine Learning pipelines. We describe the rationale for a new approach, with specific examples where conventional Natural Language Processing (NLP) pipelines fail. Furthermore, we will share a Notebook with a dataset that allows hands-on evaluation of our methodology on a security classification task against widely accepted information and communications technology (ICT) tokenization techniques, where it achieves a significant improvement in F1-score from 0.392 to 0.874.
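To illustrate the kind of failure described above, here is a minimal sketch in Python that contrasts a generic word tokenizer with a shell-aware one. It is not the SLP implementation: the function names `nlp_style_tokens` and `shell_aware_tokens` and the sample command are hypothetical, and the standard-library `shlex` module stands in for a real shell parser.

```python
import re
import shlex


def nlp_style_tokens(command: str) -> list[str]:
    """Generic word tokenizer (hypothetical baseline): keep alphanumeric runs only."""
    return re.findall(r"[A-Za-z0-9]+", command)


def shell_aware_tokens(command: str) -> list[str]:
    """Shell-aware tokenizer (illustrative stand-in, not the SLP library).

    shlex in POSIX mode respects quoting, and punctuation_chars=True makes
    operators such as | and > separate tokens (requires Python 3.8+ when
    combined with whitespace_split).
    """
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


# Sample command chosen for illustration only
command = "cat /etc/passwd | grep 'root user' > /tmp/out.txt"

print(nlp_style_tokens(command))
print(shell_aware_tokens(command))
```

The generic tokenizer breaks paths into fragments such as `etc` and `passwd`, splits the quoted argument, and drops the pipe and redirect entirely, while the shell-aware version keeps `/etc/passwd`, `root user`, `|`, and `>` as distinct tokens that a downstream classifier can use.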
Dmitrijs is a Security Data Scientist at Avast. Previously: Red Teaming, Threat Hunting, participation in NATO cybersecurity events. Publications and talks at DefCon and CAMLIS. Two master's degrees (MSc Data Science, MSc Network Security). Certified OSCP, CCNA, CCSA, deeplearning.ai.