A machine learning approach to detecting attacks by identifying anomalies in network traffic
Mahoney, Matthew V.
The current approach to detecting novel attacks in network traffic is to model the normal frequency of session IP addresses and server port usage and to signal unusual combinations of these attributes as suspicious. We make four major contributions to the field of network anomaly detection. First, rather than just model user behavior, we also model network protocols from the data link layer through the application layer in order to detect attacks that exploit vulnerabilities in the implementation of these protocols. Second, we introduce a time-based model suitable for the bursty nature of network traffic: the probability of an event depends on the time since it last occurred rather than just its average frequency. Third, we introduce an algorithm for learning conditional rules from attack-free training data that are sensitive to anomalies. Fourth, we extend the model to cases where attack-free training data is not available. On the 1999 DARPA/Lincoln Laboratory intrusion detection evaluation data set, our best system detects 75% of novel attacks by unauthorized users at 10 false alarms per day after training only on attack-free traffic. However, this result is misleading because the background traffic is simulated and our algorithms are sensitive to artifacts of the simulation. We compare the background traffic to real traffic collected from a university departmental server and conclude that we could realistically expect to detect 30% of these attacks in this environment, or 47% if we are willing to accept 50 false alarms per day.
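
The time-based idea in the second contribution can be illustrated with a minimal sketch. The abstract gives no formula, so the scoring rule here is an assumption chosen for illustration: a value never seen in training scores t * n / r, where t is the time since the attribute last produced a novel value, n is the number of training observations, and r is the number of distinct values seen in training. The class name, method names, and example ports are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TimeBasedAttribute:
    """Hypothetical time-based anomaly model for one attribute (e.g. server port)."""
    n: int = 0                                # training observations seen
    seen: set = field(default_factory=set)    # distinct values seen in training
    last_anomaly: float = 0.0                 # time of the most recent novel value

    def train(self, value, now):
        """Record one attack-free observation at time `now` (seconds)."""
        self.n += 1
        if value not in self.seen:
            self.seen.add(value)
            self.last_anomaly = now

    def score(self, value, now):
        """Return 0 for known values, else t * n / r (assumed scoring rule)."""
        if value in self.seen:
            return 0.0
        t = now - self.last_anomaly           # time since the last novel value
        self.last_anomaly = now               # a burst of anomalies scores less each time
        r = max(len(self.seen), 1)            # avoid division by zero if untrained
        return t * self.n / r


model = TimeBasedAttribute()
for time_s, port in enumerate([80, 25, 80, 80]):
    model.train(port, float(time_s))

known = model.score(80, 10.0)       # value seen in training
first = model.score(6667, 101.0)    # novel value after a long quiet period
second = model.score(6667, 102.0)   # same novel value one second later
```

Under these assumptions, a novel value that arrives after a long quiet period scores high, while a repeat one second later scores much lower. That is the behavior a frequency-only model cannot express for bursty traffic.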