We ran into issues in networks where servers failed to serve requests owing to sudden surges in traffic or uneven, dynamically shifting request loads. The features assumed to best account for failure were throughput and latency. This simple model detects anomalies, which may help avoid larger breakdowns. It was later applied to a dataset with 100+ features.
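The write-up does not spell out the exact model, so the following is only a minimal sketch of one common approach, density estimation with an independent Gaussian per feature, applied to the two assumed features (throughput and latency). The feature values, function names, and the placeholder threshold are illustrative assumptions, not the project's actual code.

```python
import numpy as np

def estimate_gaussian(X):
    """Fit an independent Gaussian to each feature (column) of X."""
    mu = X.mean(axis=0)
    var = X.var(axis=0)
    return mu, var

def gaussian_density(X, mu, var):
    """Per-example probability: product of per-feature Gaussian densities."""
    coeff = 1.0 / np.sqrt(2 * np.pi * var)
    exponent = -((X - mu) ** 2) / (2 * var)
    return np.prod(coeff * np.exp(exponent), axis=1)

# Illustrative data: rows are time windows, columns are [throughput, latency]
X_train = np.array([[950.0, 12.0],
                    [1010.0, 11.5],
                    [980.0, 13.0],
                    [240.0, 95.0]])   # last row looks like a struggling server

mu, var = estimate_gaussian(X_train)
p = gaussian_density(X_train, mu, var)
epsilon = 1e-4                # placeholder threshold; chosen automatically below
anomalies = p < epsilon       # True where an example looks anomalous
```

Because both functions work column-wise on NumPy arrays, the same code runs unchanged whether the input has 2 features or 100+.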
Selecting a threshold for flagging anomalies was ambiguous to us at first. We had to run a rough model several times on the test dataset and keep whichever threshold gave a decent F1 score. We then learnt to write a function that automatically chooses the threshold that maximizes the F1 score (see the sketch below). To make the model faster and lighter, we vectorized every loop in our functions instead of relying on explicit loops or recursion. We then had to figure out how to make the functions compatible with a multi-dimensional dataset, which we ultimately applied to a dataset with 100+ features that more closely resembled real-life data.
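A minimal sketch of the automatic threshold search, assuming a labeled validation set is available: `y_val` (1 = anomaly, 0 = normal) and `p_val` (per-example densities from a model like the one above) are assumed names, and the candidate-threshold grid is an illustrative choice. The prediction and counting steps are vectorized over all examples, so only the small loop over candidate thresholds remains.

```python
import numpy as np

def select_threshold(y_val, p_val, n_candidates=1000):
    """Pick the threshold epsilon that maximizes F1 on a labeled validation set."""
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_val.min(), p_val.max(), n_candidates):
        preds = p_val < eps                              # vectorized predictions
        tp = np.sum(preds & (y_val == 1))                # true positives
        fp = np.sum(preds & (y_val == 0))                # false positives
        fn = np.sum(~preds & (y_val == 1))               # false negatives
        if tp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_f1, best_eps = f1, eps
    return best_eps, best_f1

# Usage (illustrative): epsilon, f1 = select_threshold(y_val, p_val)
```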
Technologies used
Discussion