This isn’t an article (it’s a scholarly paper), but I thought it would be interesting to revisit last week’s post now that I have done some research. As a refresher, last week I posted an article about machine learning algorithms being able to predict cyber intrusions and, in essence, learn what counts as a relevant attack by analyzing (a) whether the system was misused and (b) what the loss was. I also posited that this could potentially reduce the need for the human element in IDPS systems, which currently rely on it.
Last week I went back to work and asked the team working on such algorithms, and I learned a few things from them: (1) the algorithm uses statistical ensembles, a concept from mathematical physics that provides a probability distribution over a set of systems and their properties, and (2) on top of the statistical ensemble they use partition functions, which essentially describe the statistical properties of the set of systems at equilibrium. I find this fascinating because the Black-Scholes model for asset pricing also draws on physics equations.
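To make the partition function idea concrete, here is a minimal sketch of the textbook canonical-ensemble version: each state i gets a weight exp(-beta * E_i), the partition function Z is the sum of those weights, and dividing by Z turns them into a probability distribution. This is not the team’s actual implementation; the "energy" values and the `beta` parameter are hypothetical numbers chosen only for illustration.

```python
import math

def partition_function(energies, beta=1.0):
    """Z = sum over all states of exp(-beta * E_i)."""
    return sum(math.exp(-beta * e) for e in energies)

def state_probabilities(energies, beta=1.0):
    """Probability of each state: exp(-beta * E_i) / Z."""
    z = partition_function(energies, beta)
    return [math.exp(-beta * e) / z for e in energies]

# Hypothetical "energies" (costs) of three system states; lower means more likely.
energies = [0.0, 1.0, 4.0]
probs = state_probabilities(energies)

for e, p in zip(energies, probs):
    print(f"energy={e:.1f}  probability={p:.3f}")
```

The point of the sketch is only that a single quantity, Z, normalizes the weights of every state, which is why it summarizes the statistical properties of the whole set of systems.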
Essentially the algorithm is instructed to (1) map all possible states of equilibrium for the set of systems and its characteristics, and (2) if there is any variance outside of those designated equilibrium states, investigate it and use partition functions to map its characteristics. Here is where the learning comes into play: using an ANN (artificial neural network) or another type of machine learning/information intelligence, the algorithm will use the historical series, multi-correlation regression and time series to build prediction models. As the size and complexity of the system(s) grow, the models change, the machine “learns,” and its learning tasks grow. The system is now fluid, and the more inputs are placed into the ANN, the more accurate and reliable the output becomes, thus improving the ANN’s generalization ability. One should keep in mind that the paper points to certain flaws in a single-network ANN; as such, the instructions (the algorithm) fed to the ANN, which dictate how it behaves, are based on an ensemble (which consists of multiple systems).
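The two steps above can be sketched in a few lines. This is a simplified stand-in, not the paper’s method: instead of an ensemble of neural networks it uses an ensemble of simple z-score detectors (one per behavioral feature) that vote on whether an observation is anomalous. The feature names, the sample history and the threshold of 3 standard deviations are all assumptions made up for illustration.

```python
import statistics

def fit_baseline(history):
    """Step 1: summarize 'normal' behavior as (mean, standard deviation)."""
    return statistics.mean(history), statistics.stdev(history)

def is_anomalous(value, baseline, threshold=3.0):
    """Step 2: flag a value that falls too far outside the baseline."""
    mean, stdev = baseline
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold

def ensemble_vote(observation, baselines, threshold=3.0):
    """Flag the observation only if a majority of detectors flag it."""
    votes = [
        is_anomalous(observation[name], baselines[name], threshold)
        for name in baselines
    ]
    return sum(votes) > len(votes) / 2

# Hypothetical history of one database user's past behavior.
history = {
    "queries_per_hour": [20, 22, 19, 21, 20, 23, 18],
    "rows_read": [100, 120, 90, 110, 105, 95, 115],
}
baselines = {name: fit_baseline(values) for name, values in history.items()}

normal = {"queries_per_hour": 21, "rows_read": 108}
suspicious = {"queries_per_hour": 400, "rows_read": 50000}

print(ensemble_vote(normal, baselines))
print(ensemble_vote(suspicious, baselines))
```

Requiring a majority vote mirrors the reason for using an ensemble in the first place: no single detector (or single network) decides on its own, so one unreliable member is less likely to trigger a false alarm.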
The paper details the results of such an ANN employed at the database level, which it refers to as a “statistical database anomaly prediction system.” In the paper’s words, “[the system] has been presented to learn previously observed user behavior in order to prevent future intrusions in database systems…”
The idea of a prediction system that can learn the behavior of agents is fascinating. This could be a paradigm shift in the field. As Professor Mackey said, though, these ANNs aren’t yet in operation; rather, they’re still being researched and tested.
What makes me think it could disrupt the demand for the human element? The answer stems from economics and the reality of the production function, that is, the relation between economic inputs and outputs.
Output (Q) is a function of L (labor), K (capital / human capital), M (raw materials) and T (technology): Q = f(L, K, M, T)
It is a basic idea in microeconomics that when there is a macro-level increase or change in technology, the short-run micro-utility (the value society gets from the introduction of that new technology) is at first diminished, then rises exponentially up to a point, and then its growth slows. I’ll provide two good examples…
(1) In the cockpit of a commercial passenger airplane there used to be three people: a pilot, a co-pilot and an engineer. Since we started transporting people by air, that was the way of the world. As technology grew and the cockpit instruments became more sophisticated, the need for the engineer in the cockpit decreased. Today there are only two people in the cockpit, the pilot and the co-pilot. The system replaced the engineer: the job of airline instruments engineer does not exist today, because the system tracks and re-calibrates the instruments used during flight.
(2) E-ZPass: we’re witnessing the same kind of phase-out right now, this time of toll booth collectors.
It would be reasonable to expect that this type of machine learning and anomaly prediction system would replace at least one human in the field once implemented.
Article:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.7461&rep=rep1&type=pdf
PS: I used many articles during my research and have them in an email if anyone is interested.
Wade Mackey says
You may want to take a glance at Imperva’s product line. They essentially have a behavioral tool for Web, Database, and Cloud. It is probably not as sophisticated as what the article describes, but they do compete in this space.