Further Reading: Chapter 22
Anomaly Detection: Isolation Forests, Autoencoders, and Finding Needles in Haystacks
Foundational Papers
1. "Isolation Forest" --- Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou (2008) The paper that introduced Isolation Forest. Liu et al. proposed the key insight that anomalies are "few and different," making them susceptible to isolation by random partitioning. The paper proves that anomalies have shorter average path lengths in random trees and introduces the normalization using the expected path length of binary search trees. Published at ICDM 2008, with an extended version in ACM Transactions on Knowledge Discovery from Data (2012). This is the definitive reference for understanding why Isolation Forest works and what its theoretical guarantees are.
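The mechanics are easy to try out. The sketch below (synthetic data and parameter choices are illustrative, not from the paper) runs scikit-learn's IsolationForest on a Gaussian blob with a few planted far-away points, and also computes the paper's normalization constant c(n), the expected path length of an unsuccessful binary-search-tree lookup:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 2))            # bulk of the data
X_anom = rng.uniform(4.0, 6.0, size=(10, 2))    # "few and different" points
X = np.vstack([X_normal, X_anom])

clf = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
labels = clf.fit_predict(X)                     # -1 = anomaly, 1 = inlier

def c(n):
    """Expected path length of an unsuccessful BST search on n points:
    c(n) = 2 * H(n - 1) - 2 * (n - 1) / n, used to normalize path lengths."""
    harmonic = np.log(n - 1) + np.euler_gamma   # H(i) ~ ln(i) + Euler-Mascheroni
    return 2.0 * harmonic - 2.0 * (n - 1) / n
```

The paper's score s(x, n) = 2^(-E[h(x)] / c(n)) maps short average path lengths to scores near 1 (anomalous) and long ones toward 0 (normal).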
2. "Isolation-Based Anomaly Detection" --- Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou (2012) The extended journal version of the original Isolation Forest paper. It includes additional analysis of the algorithm's linear time complexity, experiments on high-dimensional data, and a detailed comparison with LOF, One-Class SVM, and Random Forest-based anomaly detection. Published in ACM Transactions on Knowledge Discovery from Data, Vol. 6, No. 1. Read this for the most complete treatment of Isolation Forest, including its behavior on datasets with varying dimensionality and contamination rates.
3. "LOF: Identifying Density-Based Local Outliers" --- Markus Breunig, Hans-Peter Kriegel, Raymond Ng, and Jörg Sander (2000)
The paper that introduced Local Outlier Factor. Breunig et al. formalized the concept of local density comparison: an observation is an outlier if its local density is significantly lower than the density of its neighbors. This local perspective enables detection of outliers in datasets with clusters of varying density, where global methods fail. Published at ACM SIGMOD 2000. This is the foundational reference for understanding why LOF handles multi-density data and how the n_neighbors parameter affects sensitivity.
Autoencoder-Based Anomaly Detection
4. "Anomaly Detection with Robust Deep Autoencoders" --- Chong Zhou and Randy Paffenroth (2017) Zhou and Paffenroth address a critical limitation of standard autoencoders for anomaly detection: contamination of the training data. They propose a robust deep autoencoder that decomposes the input into a low-rank component (captured by the autoencoder) and a sparse component (the anomalies), inspired by robust PCA. Published at KDD 2017. Read this if you need to train autoencoders on data that contains unlabeled anomalies.
5. "Deep Learning for Anomaly Detection: A Review" --- Guansong Pang, Chunhua Shen, Longbing Cao, and Anton van den Hengel (2021) A comprehensive survey covering autoencoders, variational autoencoders (VAEs), GANs, and self-supervised methods for anomaly detection. The paper categorizes methods by their assumptions (clean training data vs. contaminated), data types (tabular, image, sequence, graph), and evaluation protocols. Published in ACM Computing Surveys, Vol. 54, No. 2. This is the best starting point for understanding the landscape of deep learning approaches to anomaly detection beyond simple autoencoders.
6. "Variational Autoencoder Based Anomaly Detection Using Reconstruction Probability" --- Jinwon An and Sungzoon Cho (2015) An and Cho demonstrate that reconstruction probability (from a VAE) is a more principled anomaly score than reconstruction error (from a standard autoencoder). The VAE models the distribution of reconstructions rather than a single point estimate, providing uncertainty quantification alongside the anomaly score. Published as a technical report at Seoul National University. Read this if you want to move beyond standard autoencoders to probabilistic anomaly scoring.
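Whatever the architecture, the autoencoder recipe in this section reduces to: fit a compressive model on (mostly) normal data, then score points by how badly the model reconstructs them. Since a linear autoencoder with a k-unit bottleneck recovers the same subspace as rank-k PCA, the idea can be sketched without a deep learning framework (synthetic low-rank data and the 99th-percentile threshold are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# Normal data lies near a 3-dimensional subspace of an 8-dimensional space
W = rng.normal(size=(3, 8))
X_train = rng.normal(size=(1000, 3)) @ W + 0.05 * rng.normal(size=(1000, 8))

# A linear autoencoder with a 3-unit bottleneck is equivalent to rank-3 PCA
ae = PCA(n_components=3).fit(X_train)

def reconstruction_error(X):
    X_hat = ae.inverse_transform(ae.transform(X))
    return np.mean((X - X_hat) ** 2, axis=1)      # per-row squared error

threshold = np.percentile(reconstruction_error(X_train), 99)
x_anom = rng.normal(size=(5, 8)) * 3.0            # points off the subspace
```

A trained neural autoencoder replaces the PCA transform/inverse-transform pair; An and Cho's refinement replaces the squared error with the reconstruction probability under a VAE.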
Statistical Methods
7. "Robust Statistics: Theory and Methods" --- Ricardo Maronna, Douglas Martin, and Victor Yohai (2019, 2nd Edition) The standard reference for robust statistical methods, including robust estimation of location and scatter (used in Mahalanobis distance with the Minimum Covariance Determinant estimator). Chapters 6-8 cover robust multivariate methods directly applicable to statistical anomaly detection. Published by Wiley. Read the relevant chapters if you use Mahalanobis distance on data where outliers may corrupt the covariance estimate.
8. "A Fast Algorithm for the Minimum Covariance Determinant Estimator" --- Peter Rousseeuw and Katrien Van Driessen (1999)
The paper behind scikit-learn's EllipticEnvelope and MinCovDet. The Minimum Covariance Determinant (MCD) estimator finds the subset of observations whose covariance matrix has the smallest determinant, providing a robust estimate of location and scatter. This is critical for Mahalanobis-based anomaly detection because the standard covariance matrix is itself corrupted by the anomalies you are trying to detect. Published in Technometrics, Vol. 41, No. 3.
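The difference matters in practice. In the sketch below (synthetic data with 10% planted contamination; parameters illustrative), the classical covariance estimate is inflated by the very outliers being hunted, while MCD-based distances from scikit-learn's MinCovDet keep them far from the bulk:

```python
import numpy as np
from sklearn.covariance import MinCovDet, EmpiricalCovariance

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=300)
X[:30] = rng.uniform(-8, 8, size=(30, 2))   # 10% contamination

robust = MinCovDet(random_state=0).fit(X)   # MCD location and scatter
naive = EmpiricalCovariance().fit(X)        # classical estimate

d_robust = robust.mahalanobis(X)            # squared Mahalanobis distances
d_naive = naive.mahalanobis(X)
```

The robust distances separate the contaminated rows from the inliers far more sharply than the classical ones, which is exactly the failure mode the MCD estimator was designed to avoid.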
One-Class SVM
9. "Estimating the Support of a High-Dimensional Distribution" --- Bernhard Schölkopf, John Platt, John Shawe-Taylor, Alex Smola, and Robert Williamson (2001)
The paper that introduced One-Class SVM. Schölkopf et al. formulated novelty detection as finding a hyperplane in kernel feature space that separates the data from the origin with maximum margin. The nu parameter controls the trade-off between the fraction of training points outside the boundary and the margin width. Published in Neural Computation, Vol. 13, No. 7. This is the theoretical foundation for scikit-learn's OneClassSVM.
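A quick way to internalize nu is to check its bound empirically. In the sketch below (synthetic training data; parameter choices illustrative), roughly a nu-fraction of the training points ends up outside the learned boundary, and a far-away test point falls on the negative side:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(4)
X_train = rng.normal(size=(500, 2))          # assumed-clean training data

# nu upper-bounds the fraction of training points outside the boundary
# and lower-bounds the fraction of support vectors
oc = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_train)

train_outlier_frac = np.mean(oc.predict(X_train) == -1)
novel = oc.predict([[6.0, 6.0], [0.0, 0.0]])  # far point vs. central point
```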
Evaluation and Benchmarks
10. "Revisiting Time Series Outlier Detection: Definitions and Benchmarks" --- Kwei-Herng Lai, Daochen Zha, Junjie Xu, Yue Zhao, Guanchu Wang, and Xia Hu (2021) A rigorous benchmark of anomaly detection methods on time series data, with careful attention to evaluation methodology. The authors demonstrate that evaluation choices (point-based vs. range-based scoring, thresholding strategy, dataset selection) can reverse the ranking of methods. Published at NeurIPS 2021 Datasets and Benchmarks. Read this to understand how evaluation methodology affects conclusions about which method is "best."
11. "ADBench: Anomaly Detection Benchmark" --- Songqiao Han, Xiyang Hu, Hailiang Huang, Minqi Jiang, and Yue Zhao (2022) The most comprehensive benchmark of anomaly detection algorithms on tabular data to date. ADBench evaluates 30 algorithms across 57 datasets, comparing unsupervised, semi-supervised, and supervised approaches. Key finding: supervised methods (when labels exist) consistently outperform unsupervised methods, and among unsupervised methods, Isolation Forest and ECOD (Empirical-Cumulative-distribution-based Outlier Detection) perform best overall. Published at NeurIPS 2022. This is the paper to cite when justifying your algorithm choice.
12. "Precision and Recall for Time Series" --- Nesime Tatbul, Tae Jun Lee, Stan Zdonik, Mejbah Alam, and Justin Gottschlich (2018) Tatbul et al. extend precision and recall to handle the temporal structure of time series anomaly detection, where a single anomaly event spans multiple time points. Standard point-wise precision and recall can be misleading: detecting 1 of 100 anomalous time points in an event should count as detecting the event, not as 1% recall. Published at NeurIPS 2018. Read this if you are applying anomaly detection to sensor time series like the TurbineTech scenario.
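The paper's full definitions weight overlap size, cardinality, and position; the toy functions below implement only the simplest event-level notion (an event counts as detected if any of its points is flagged) to show how far it can diverge from point-wise recall:

```python
def point_recall(y_true, y_pred):
    """Standard point-wise recall over binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    return tp / sum(y_true)

def event_recall(y_true, y_pred):
    """Group consecutive anomalous points into events; an event counts as
    detected if at least one of its points is flagged."""
    events, current = [], []
    for i, t in enumerate(y_true):
        if t:
            current.append(i)
        elif current:
            events.append(current)
            current = []
    if current:
        events.append(current)
    detected = sum(1 for ev in events if any(y_pred[i] for i in ev))
    return detected / len(events)

# One 100-point anomaly event; the detector flags only its first point
y_true = [0] * 10 + [1] * 100 + [0] * 10
y_pred = [0] * 10 + [1] + [0] * 99 + [0] * 10
# point-wise recall: 0.01; event-level recall: 1.0
```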
Practical Applications
13. "Machine Learning for Predictive Maintenance: A Multiple Classifier Approach" --- Gian Antonio Susto, Andrea Schirru, Simone Pampuri, Seán McLoone, and Alessandro Beghi (2015) A practical overview of ML-based predictive maintenance in semiconductor manufacturing, covering anomaly detection, remaining useful life estimation, and failure prediction. The authors discuss the challenges of label scarcity, concept drift (equipment aging), and integration with maintenance scheduling systems. Published in IEEE Transactions on Industrial Informatics, Vol. 11, No. 3. Directly relevant to the TurbineTech case study.
14. "Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery" --- Thomas Schlegl, Philipp Seeböck, Sebastian Waldstein, Ursula Schmidt-Erfurth, and Georg Langs (2017) Schlegl et al. introduced AnoGAN, which uses a GAN trained on normal data to detect anomalies in medical images. The anomaly score combines reconstruction error (how well the GAN can reproduce the input) with discrimination loss (how realistic the GAN considers the input). Published at IPMI 2017. While focused on images, the approach of using generative models for anomaly detection extends to tabular data.
15. "Real-Time Anomaly Detection for Streaming Analytics" --- Subutai Ahmad, Alexander Lavin, Scott Purdy, and Zuha Agha (2017) Ahmad et al. describe the Numenta Anomaly Benchmark (NAB) for evaluating real-time streaming anomaly detection. The paper introduces a scoring framework that rewards early detection and penalizes late detection, directly relevant to the TurbineTech use case where early warning time matters. Published in Neurocomputing, Vol. 262. Read this for evaluation methodology for streaming/time-series anomaly detection.
Extended Isolation Forest and Variants
16. "Extended Isolation Forest" --- Sahand Hariri, Matias Carrasco Kind, and Robert Brunner (2021)
The original Isolation Forest uses axis-aligned splits, which creates artifacts when anomalies lie along diagonal directions in feature space. Hariri et al. propose Extended Isolation Forest, which uses random hyperplane splits (not axis-aligned), eliminating these artifacts. The implementation is available as the eif Python package. Published in IEEE Transactions on Knowledge and Data Engineering. Read this if you observe Isolation Forest producing ghost anomalies along feature axes.
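The difference between the two split rules is small in code. The sketch below is not the eif implementation, just a minimal illustration of a single tree node, contrasting an axis-aligned cut with a random-hyperplane cut:

```python
import numpy as np

rng = np.random.default_rng(6)

def axis_aligned_split(X):
    """Standard Isolation Forest node: one random feature, one random threshold."""
    j = rng.integers(X.shape[1])
    t = rng.uniform(X[:, j].min(), X[:, j].max())
    return X[:, j] < t

def hyperplane_split(X):
    """Extended Isolation Forest node: random normal vector n and intercept p;
    branch on the sign of (x - p) . n, so cuts need not align with the axes."""
    n = rng.normal(size=X.shape[1])
    p = rng.uniform(X.min(axis=0), X.max(axis=0))
    return (X - p) @ n < 0.0
```

Because the axis-aligned rule can only carve feature space into rectangles, regions of artificially low path length appear along the axes through dense clusters; the hyperplane rule removes that bias.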
17. "Deep Isolation Forest for Anomaly Detection" --- Hongzuo Xu, Guansong Pang, Yijie Wang, and Yongjun Wang (2023) Xu et al. combine the isolation principle with deep neural network representations, building isolation trees in learned feature spaces rather than the raw feature space. The method handles tabular, image, and sequential data. Published in IEEE Transactions on Knowledge and Data Engineering. Read this for the state of the art in combining isolation-based and deep learning approaches.
Software Libraries
18. PyOD: Python Outlier Detection Library --- Yue Zhao, Zain Nasrullah, and Zheng Li (2019)
PyOD is the most comprehensive Python library for anomaly detection, implementing 30+ algorithms including Isolation Forest, LOF, One-Class SVM, autoencoders, ECOD, COPOD, and many others. It provides a unified API similar to scikit-learn, making it easy to compare methods. Published in the Journal of Machine Learning Research, Vol. 20. Available at pyod.readthedocs.io. If you work with anomaly detection regularly, PyOD should be in your toolkit alongside scikit-learn.
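PyOD detectors follow the familiar fit/predict pattern, so comparing algorithms is mostly a dictionary of constructors. The sketch below makes the same point with scikit-learn's own detectors to stay dependency-light; with PyOD, the entries would come from pyod.models instead (data and parameters illustrative):

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(size=(200, 2)),            # inliers
               rng.uniform(5.0, 7.0, size=(5, 2))])  # planted anomalies

# Swapping detectors is a one-line change when they share fit_predict
detectors = {
    "iforest": IsolationForest(contamination=0.025, random_state=0),
    "lof": LocalOutlierFactor(n_neighbors=20, contamination=0.025),
    "ocsvm": OneClassSVM(nu=0.05),
}
labels = {name: det.fit_predict(X) for name, det in detectors.items()}
```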
19. scikit-learn User Guide --- Outlier and Novelty Detection
scikit-learn's documentation for IsolationForest, LocalOutlierFactor, OneClassSVM, and EllipticEnvelope. Includes clear explanations of the difference between outlier detection (training data may contain outliers, and the model scores that same data) and novelty detection (training data is assumed clean, and the model predicts on new data), practical examples, and parameter guidance. Available at scikit-learn.org in the User Guide under "Unsupervised Learning."
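In scikit-learn the switch between the two modes is a constructor flag. The sketch below (synthetic, assumed-clean training set) uses LocalOutlierFactor with novelty=True, which disables fit_predict on the training data and enables predict and score_samples on new observations:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(7)
X_train = rng.normal(size=(300, 2))   # novelty detection assumes clean training data

# novelty=True switches LOF from outlier detection (scores the training set)
# to novelty detection (predicts on data not seen during fitting)
detector = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train)
preds = detector.predict([[0.0, 0.0], [6.0, 6.0]])   # central vs. far-away point
```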
20. Alibi Detect --- Seldon Technologies
A Python library focused on outlier detection, adversarial detection, and concept drift detection for production ML systems. Includes implementations of autoencoders, VAEs, and sequence-based detectors (for time series). The library is designed for deployment, with integrations for serving frameworks. Available at docs.seldon.io/projects/alibi-detect. Read the documentation if you plan to deploy anomaly detection in a production pipeline.
How to Use This List
If you read nothing else, read Liu et al. (item 1) on Isolation Forest and Han et al. (item 11, ADBench) for an empirical benchmark showing how methods compare on real datasets. Together, these two papers justify why Isolation Forest is the default starting point and when you might need something else.
If you want to go deeper on autoencoders, read Pang et al. (item 5) for a survey of deep learning approaches and An and Cho (item 6) for the argument that reconstruction probability (VAE) is more principled than reconstruction error (AE).
If you are working on time series anomaly detection (like TurbineTech), read Tatbul et al. (item 12) on evaluation methodology and Ahmad et al. (item 15) on streaming detection. Standard point-wise metrics are misleading for temporal data.
If you need a practical toolkit, install PyOD (item 18) alongside scikit-learn. PyOD's unified API lets you swap algorithms with a single line change, which is invaluable during the exploratory phase of an anomaly detection project.
This reading list supports Chapter 22: Anomaly Detection. Return to the chapter to review the core concepts before diving into these references.