An Improved Hidden Markov Model for Anomaly Detection Using Frequent Common Patterns
Host-based intrusion detection techniques are needed to ensure the safety and security of software systems, especially, if these systems handle sensitive data. Most host-based intrusion detection systems involve building some sort of reference models offline, usually from execution traces (in the absence of the source code), to characterize the system healthy behavior. The models can later be used as a baseline for online detection of abnormal behavior. Perhaps the most popular techniques are the ones based on the use of Hidden Markov Models (HMM). These techniques, however, require long training time of the models, which makes them computationally infeasible, the main reason being the large size of typical traces, often millions of lines long. In this paper, we propose an improved HMM using the concept of frequent common patterns. In other words, we build models based on extracting the largest n-grams (patterns) in the traces instead of taking each trace event on its own. We show through a case study that our approach can reduce the training time by 31.96%-48.44% compared to the original HMM algorithms while keeping almost the same accuracy rate.
Tracks concerned:
- AHLS›Scalable Detection infrastructure - Harmonized Anomaly Detection Techniques