Merchandising teams watch sales, orders, and commissions week over week. When something drifts in a bad way, you want a signal early. You also want fewer false alarms so people still trust the system. Our alerting stack pulls structured performance data, prepares slices for different kinds of partners (for example high-volume merchants and affiliates), and runs detection. One of the main detectors is Principal Component Analysis (PCA) used as an anomaly detector, not as a decorative chart for a slide deck.
The approach follows classic multivariate process monitoring: correlated metrics are summarized by a few components; then each week is checked both inside that summary space and outside it (reconstruction residual). That two-part check is the same family of methods used for decades in chemical and industrial process control under names such as Hotelling’s T² and squared prediction error (SPE). It remains a solid, interpretable baseline when you have moderate dimensionality and enough history per entity.
Because the production data is company-sensitive, this post stays high level. You will see how the pieces fit together, what PCA is doing in plain language, and a few short code excerpts so you can map concepts to implementation. You will not see real merchant names, live numbers, or internal product identifiers.
Prerequisites
- Basic comfort with Python, pandas, and scikit-learn.
- Familiarity with weekly or panel-style data (the same entity observed over many weeks).
- Optional: any short primer on PCA scores versus reconstruction error for outlier work (search terms: PCA, SPE, Hotelling T², multivariate SPC).
What this pipeline contains
Think of the research side of alerting as a layered stack, not a single notebook cell:
- SQL builds rolling windows of metrics from the warehouse (for example long lookback extracts aligned to click week).
- Preprocessing loads configuration, connects to the warehouse with those templates, and builds subsets such as “top” merchants or affiliates by rolling activity so PCA runs on entities that actually have enough signal.
- A dedicated PCA module implements the rolling anomaly detector used in production-style runs and back-tests.
- Parallel paths (for example IQR and percentage-based rules) handle a different population: lower-volume partners where multivariate PCA would be starved for stable counts.
- Back-testing replays the same building blocks on historical extracts, compares variants (for example false positive reduction tweaks), and feeds reporting helpers.
- Reporting and utilities cover logs, database connectors, and summary exports.
So the story is the path from warehouse query to CSV outputs and reports, with PCA as one star player for the high-volume segments.
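As a rough illustration of the preprocessing step that separates "top" entities from the rest, here is a minimal pandas sketch. The function name `split_by_rolling_activity`, the column names (`partner`, `click_week`, `orders`), and the ranking rule (total activity over the most recent weeks) are all assumptions made for this example; the production subsetting logic is not shown in this post and may differ.

```python
import pandas as pd


def split_by_rolling_activity(df, entity_col, week_col, activity_col,
                              lookback_weeks=12, top_n=10):
    """Rank entities by total activity over the latest weeks (hypothetical rule)."""
    # Keep only the most recent `lookback_weeks` distinct weeks.
    recent_weeks = sorted(df[week_col].unique())[-lookback_weeks:]
    recent = df[df[week_col].isin(recent_weeks)]
    # Sum activity per entity and sort from most to least active.
    totals = (recent.groupby(entity_col)[activity_col]
              .sum()
              .sort_values(ascending=False))
    top = totals.index[:top_n].tolist()
    # Everything else goes to the simpler low-volume detectors.
    low = [e for e in df[entity_col].unique() if e not in top]
    return top, low


# Toy data: three made-up partners observed over four weeks.
toy = pd.DataFrame({
    "partner": ["A", "B", "C"] * 4,
    "click_week": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "orders": [100, 10, 1] * 4,
})
top, low = split_by_rolling_activity(
    toy, "partner", "click_week", "orders", lookback_weeks=2, top_n=1
)
```

In this sketch the "top" list would feed the PCA branch and the "low" list would go to the IQR and percentage-based branch.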
Why PCA for this problem
Weekly merchant or affiliate metrics tend to move together. When orders rise, revenue often rises too. PCA learns a small set of directions in which most of the “normal” variation lives. The current week is then compared to that learned subspace in two complementary ways:
- Distance inside the PCA subspace (Hotelling-type T²): “Is this week extreme along the main directions of variation we have seen lately?”
- Reconstruction error (often called Q or SPE): “After we project the week onto those directions and reconstruct, how much of the original vector is left unexplained?”
If the week is unusual in scale along familiar patterns, T² tends to react. If the week breaks the usual relationships between metrics, Q tends to react when there is a residual subspace to measure. Our implementation labels combinations of those flags into simple types such as structure versus scale, which helps analysts reason about what kind of unusual week they got.
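One way to picture the flag-to-type mapping described above is a small lookup like the following. This is a sketch only: the function name `label_anomaly_type` and the exact label strings are placeholders chosen for illustration, not the names used in the production module.

```python
def label_anomaly_type(q_flag: bool, t2_flag: bool) -> str:
    """Map the two detector flags to a coarse anomaly type (hypothetical labels)."""
    if q_flag and t2_flag:
        # Unusual size and broken relationships between metrics.
        return "structure_and_scale"
    if q_flag:
        # Metrics no longer move together the way they usually do.
        return "structure"
    if t2_flag:
        # Familiar pattern, unusual magnitude.
        return "scale"
    return "none"
```

A week with only the T² flag set would come out as "scale" here, which matches the intuition that the usual pattern held but its size did not.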
Why three components for three variables
We monitor a small, fixed set of weekly metrics (three numeric features in production). You might expect PCA to use one or two components so that Q can watch a “leftover” residual space for odd coupling between variables. We instead keep three components for three variables, which is a deliberate design choice, not an oversight.
What happens mathematically. After standardizing the training window, PCA with n_components equal to the number of features is an orthogonal change of basis. Reconstruction from all three components is exact (up to floating point noise), so the residual vector is essentially zero and Q (squared prediction error) is not informative. There is no leftover subspace to monitor.
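The claim that reconstruction is exact with a full set of components is easy to check on synthetic data. The sketch below uses random numbers in place of real metrics, and the helper name `spe` is made up for this example; it compares the squared prediction error of a full model (three components) with a reduced one (two components).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
window = rng.normal(size=(12, 3))            # 12 synthetic weeks, 3 metrics
scaler = StandardScaler().fit(window)
X = scaler.transform(window)
# A synthetic "current week", deliberately far from the window.
x_new = scaler.transform(rng.normal(size=(1, 3)) * 5)


def spe(n_components):
    """Squared prediction error (Q) of x_new for a PCA with n_components."""
    pca = PCA(n_components=n_components).fit(X)
    recon = pca.inverse_transform(pca.transform(x_new))
    return float(np.sum((x_new - recon) ** 2))


q_full = spe(3)      # full basis: residual is floating point noise only
q_reduced = spe(2)   # one direction left over: residual is meaningful
print(f"Q with 3 components: {q_full:.3e}")
print(f"Q with 2 components: {q_reduced:.3e}")
```

With three components the printed Q should sit at the level of rounding error, which is why thresholding it would only produce noise.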
What we still get. T² is computed on all three score dimensions, scaled by the eigenvalues from the same window. That is the multivariate analogue of asking whether this week is far from the recent joint norm in a space where axes are uncorrelated and ordered by how much variance they explain in that window.
How the code reflects it. When components cover the full feature space, Q-based flags are turned off so alerts are not driven by numerical dust on a meaningless residual:
# pca_model.py: full PCA space means Q is not used for alerting
full_space = self.n_components >= len(self.features)
q_flag = (not full_space) and (Q > Q_lim)
t2_flag = T2 > T2_lim
So “PCA with three components on three variables” here means: decorrelate and variance-weight the metrics from recent history, then judge the current week with T² in that full three-dimensional space, while promotion filters, persistence, and percent-deviation gates handle the operational side.
How rolling detection works (conceptually)
For each merchant or affiliate, we walk forward in time. For each week we:
- Take a lookback window of recent non-promotion weeks only (promotion weeks are treated separately so campaigns do not look like emergencies).
- Standardize the feature columns so scale differences do not dominate blindly.
- Fit PCA on that window and project the current week.
- Compute Q from the reconstruction residual and T² from the scores and explained variance.
- Compare each statistic to a threshold derived from the window.
- Optionally require persistence (an anomaly must show up across more than one step) so one noisy week does not spam the inbox.
- Optionally apply a minimum percent deviation gate so tiny numerical blips that barely cross a statistical line do not become alerts.
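The first five steps above can be sketched for a single entity and a single week as follows. Several details here are assumptions rather than a description of the production code: the function name `score_week`, the `is_promo` column, the metric names, and in particular the T² limit, which uses the F-distribution form commonly found in the multivariate SPC literature. The post only says that thresholds are derived from the window, so the real rule may be different.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def score_week(history, current, features, window=12, n_components=3, alpha=0.99):
    """Score one week against recent non-promotion weeks (illustrative only)."""
    # Step 1: lookback window of non-promotion weeks.
    train = history.loc[~history["is_promo"], features].tail(window)
    n, k = len(train), n_components
    if n <= k:
        return None  # not enough history for this sketch
    # Step 2: standardize using the training window only.
    scaler = StandardScaler().fit(train.to_numpy())
    X = scaler.transform(train.to_numpy())
    x = scaler.transform(current[features].to_numpy(dtype=float).reshape(1, -1))
    # Step 3: fit PCA on the window and project the current week.
    pca = PCA(n_components=k).fit(X)
    scores = pca.transform(x)
    # Step 4: T² from scores and eigenvalues, Q from the residual.
    T2 = float(np.sum(scores ** 2 / pca.explained_variance_))
    Q = float(np.sum((x - pca.inverse_transform(scores)) ** 2))
    # Step 5 (assumed rule): F-based limit for a new observation.
    T2_lim = k * (n - 1) * (n + 1) / (n * (n - k)) * stats.f.ppf(alpha, k, n - k)
    return {"T2": T2, "Q": Q, "T2_lim": float(T2_lim), "t2_flag": T2 > T2_lim}


# Synthetic history: three correlated metrics, two promotion weeks.
rng = np.random.default_rng(1)
base = rng.normal(100, 10, size=20)
history = pd.DataFrame({
    "orders": base,
    "revenue": base * 50 + rng.normal(0, 100, size=20),
    "commission": base * 5 + rng.normal(0, 20, size=20),
    "is_promo": [False] * 20,
})
history.loc[[3, 9], "is_promo"] = True
features = ["orders", "revenue", "commission"]

# A week sitting at the window mean, and one at three times that level.
typical = history.loc[~history["is_promo"], features].tail(12).mean()
extreme = typical * 3
typical_result = score_week(history, typical, features)
extreme_result = score_week(history, extreme, features)
```

Persistence and the percent deviation gate are left out of this sketch on purpose; they act on the sequence of weekly flags rather than on a single week.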
A compact view of the flow:
Warehouse SQL
↓
Preprocessing (top vs low tier, rolling filters)
↓
Rolling PCA detector (fit_detect)
↓
Per-week labels: ALERT, OK, or IN_PROMO
↓
CSV exports and downstream reporting
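The last detection stage in this flow, turning raw statistical flags into ALERT, OK, or IN_PROMO, might look roughly like the sketch below. The function name `label_weeks` and its inputs are hypothetical. It assumes that persistence means consecutive flagged weeks, that a promotion week resets the streak, and that percent deviation is expressed as a fraction of some baseline; none of these details are confirmed by the production code shown in this post.

```python
def label_weeks(raw_flags, pct_deviations, promo_weeks,
                persistence=1, min_pct_deviation=0.0):
    """Turn per-week flags into ALERT / OK / IN_PROMO labels (illustrative)."""
    labels = []
    streak = 0  # consecutive weeks that were flagged and passed the gate
    for flagged, pct_dev, in_promo in zip(raw_flags, pct_deviations, promo_weeks):
        if in_promo:
            labels.append("IN_PROMO")
            streak = 0  # assumption: a promotion week breaks the streak
            continue
        # Deviation gate: ignore flags whose practical size is tiny.
        passes_gate = abs(pct_dev) >= min_pct_deviation
        streak = streak + 1 if (flagged and passes_gate) else 0
        # Persistence: alert only after enough consecutive qualifying weeks.
        labels.append("ALERT" if streak >= persistence else "OK")
    return labels


labels = label_weeks(
    raw_flags=[True, True, False, True, True, True],
    pct_deviations=[0.30, 0.30, 0.00, 0.01, 0.30, 0.30],
    promo_weeks=[False, False, False, False, False, True],
    persistence=2,
    min_pct_deviation=0.05,
)
```

With `persistence=2`, the first flagged week stays OK and only the second consecutive one becomes an ALERT, which is the behavior that keeps a single noisy week out of the inbox.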
A peek at the implementation
The class RollingPCAAnomalyDetector documents the intent: Q and T² statistics, thresholds, persistence, exports, and contributor summaries.
# pca_model.py
class RollingPCAAnomalyDetector:
    def __init__(
        self,
        df: pd.DataFrame,
        target: str = "merchant_name",
        features: Optional[list] = None,  # None avoids a shared mutable default
        window: int = 12,
        n_components: int = 3,
        alpha: float = 0.99,
        persistence: int = 1,
        min_pct_deviation: float = 0.0,
    ):
        ...
The core score computation for a test week, after fitting PCA on the training window:
# pca_model.py: reconstruction residual (Q) and score energy (T²)
x_recon = pca.inverse_transform(x_test_pca)
residual = x_test_scaled - x_recon
Q = np.sum(residual ** 2)
T2 = np.sum((x_test_pca ** 2) / pca.explained_variance_)
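To see the two statistics react to different kinds of weeks, the same formulas can be run on synthetic data with a reduced model. Note that this sketch deliberately keeps one component for three variables, unlike the three-component production setup described earlier, so that a residual subspace exists. The data, the helper name `q_and_t2`, and the two test weeks are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
shared = rng.normal(size=(50, 1))                 # common driver of all metrics
raw = shared + 0.05 * rng.normal(size=(50, 3))    # three tightly correlated metrics
X = StandardScaler().fit_transform(raw)
pca = PCA(n_components=1).fit(X)                  # reduced model, illustration only


def q_and_t2(x_scaled):
    """Return (Q, T2) for one already-standardized observation."""
    x_scaled = np.asarray(x_scaled, dtype=float).reshape(1, -1)
    scores = pca.transform(x_scaled)
    residual = x_scaled - pca.inverse_transform(scores)
    Q = float(np.sum(residual ** 2))
    T2 = float(np.sum(scores ** 2 / pca.explained_variance_))
    return Q, T2


# All metrics high together: the usual pattern, unusual size.
Q_scale, T2_scale = q_and_t2([4.0, 4.0, 4.0])
# Metrics pulling in opposite directions: the usual pattern is broken.
Q_struct, T2_struct = q_and_t2([2.0, -2.0, 0.0])
```

The first week should show the larger T² and the second the larger Q, mirroring the scale versus structure distinction.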
How this sits next to other detectors
PCA is not the only tool in the stack. Low-volume segments use IQR and percentage-based detectors with their own tuned multipliers. That split matters: multivariate PCA needs enough stable history; a tiny affiliate with sparse counts may be better served by simpler fences. Architecturally, that is one pipeline with multiple branches rather than one model pretending to fit everyone.
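For comparison, a univariate IQR fence of the kind used for low-volume segments can be written in a few lines. This is a generic sketch, not the production detector: the function name `iqr_flag` is hypothetical, and the multiplier of 1.5 is the conventional textbook default standing in for the tuned multipliers mentioned above.

```python
import numpy as np


def iqr_flag(history, current, multiplier=1.5):
    """Flag `current` if it falls outside the IQR fences built from `history`."""
    q1, q3 = np.percentile(history, [25, 75])
    iqr = q3 - q1
    lower = q1 - multiplier * iqr
    upper = q3 + multiplier * iqr
    return bool(current < lower or current > upper)


# Made-up weekly order counts for a small partner.
weekly_orders = [10, 12, 11, 13, 9, 10, 12, 11]
normal_week = iqr_flag(weekly_orders, 11)
spike_week = iqr_flag(weekly_orders, 30)
```

A fence like this needs only a handful of observations of one metric, which is why it suits partners whose history is too sparse for a stable covariance estimate.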
Conclusion
This slice of a merchandising alerting system is intentionally boring in shape: query, preprocess, detect, export, report, back-test. PCA gives a principled way to monitor correlated weekly metrics for high-volume merchants and affiliates when you pair rolling windows, a T²-led multivariate score (three components on three metrics, so residual-based Q is not part of the alert rule), persistence, and explicit promotion handling. Operators get a simple label at a glance (ALERT vs OK vs IN_PROMO) while the heavier statistics stay in well-tested Python.
If you are building something similar, start with clear entity definitions and enough history per entity, then invest in false positive controls (persistence, deviation gates, and honest back-tests). The PCA part is only valuable if people still open the email when it fires.