Probability-Based Adaptive Detection🙏🏻 PBAD (Probability-Based Adaptive Detection): an adaptive control tool for outlier || novelty detection, made for worst-case data & processes, at the highest time complexity O(n^2) among the alternatives (explained in a sec). Thresholds are completely data-driven and axiomatic: no hyperparameters to provide, nothing learned or optimized. The method accepts multiple weights, e.g. both temporal and volatility weights.
Method briefly explained (I can go deeper if any1 asks explicitly):
Performs weighted KDE on the initial input data, finds the KDE global maximum (mode), and creates a new “residuals” dataset by centering the initial data around this value;
Performs weighted KDE on the residuals and uses sigmoid-based probability mass targets with increasing probability coverage to construct a set of non-disjoint High Density Intervals (also called HDR or HPD in Bayesian terms);
Uses these intervals to calculate analogs of central & standardized moments;
Uses these ^^ moments to construct a set of control thresholds. The scheme used in PBAD is not based only on a central threshold or on neighboring ones; each threshold utilizes all the previous thresholds, gaining more information.
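(A tiny Python sketch of these steps, in case the words above feel too abstract. The Gaussian kernel, the bandwidth constant and the sigmoid coverage schedule here are placeholders I picked for readability, not the script's actual choices; the real thing runs the student5 kernel with its own AMISE constants and weights.)

```
import numpy as np

def weighted_kde(x, grid, weights, h):
    # Weighted KDE evaluated on `grid`; a plain Gaussian kernel is assumed here for brevity.
    w = weights / weights.sum()
    u = (grid[:, None] - x[None, :]) / h
    return (np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi) * w[None, :]).sum(axis=1) / h

def hdi(grid, dens, coverage):
    # Highest-density interval: keep grid points in decreasing density order
    # until the accumulated probability mass reaches `coverage`.
    p = dens / dens.sum()
    order = np.argsort(p)[::-1]
    cum = np.cumsum(p[order])
    keep = order[: np.searchsorted(cum, coverage) + 1]
    return grid[keep].min(), grid[keep].max()

x = np.random.standard_t(df=5, size=500)        # toy data with heavy-ish tails
w = np.linspace(0.5, 1.0, x.size)               # e.g. temporal weights (newer = heavier)
grid = np.linspace(x.min(), x.max(), 512)
h = (x.max() - x.min()) / np.sqrt(12) * x.size ** (-0.2)   # placeholder bandwidth

mode = grid[np.argmax(weighted_kde(x, grid, w, h))]        # step 1: KDE global max
resid = x - mode                                           # step 1: residuals around it

rgrid = np.linspace(resid.min(), resid.max(), 512)
rdens = weighted_kde(resid, rgrid, w, h)                   # step 2: KDE on residuals

coverages = 1.0 / (1.0 + np.exp(-np.arange(1, 6)))         # sigmoid targets: .73, .88, .95, ...
hdis = [hdi(rgrid, rdens, c) for c in coverages]           # step 2: non-disjoint (nested) HDIs
print(mode, hdis)                                          # steps 3-4 (moments & thresholds) build on these
```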
...
The most important part is to understand whether you really need PBAD. Even tho it seems to be the best one given its highest algo complexity, irl it will work worse when your data doesn't actually require it.
Here's the menu (aka taxonomy omg) of methods, so you can make the right choice:
Moment-Based Adaptive Detection (MBAD) :
Norm: L2
Time complexity: originally O(n), reduced to O(1) per update in the online version
Use case: default, general purpose
Based on: method of moments (powers of residuals from mean)
Thresholds architecture: centralized
Quantile-Based Adaptive Detection (QBAD):
Norm: L1
Time complexity: O(n log n)
Use case: either bad data Or process instability
Based on: quantile moments (dyadic percentiles of residuals from median)
Thresholds architecture: chained/recursive/sequential
Probability-Based Adaptive Detection (PBAD):
Norm: L0
Time complexity: O(n^2)
Use case: both bad data And process instability
Based on: probability moments (target probability masses of residuals from KDE mode)
Thresholds architecture: decentralized (for lack of a better name xd; the idea is that these thresholds gain information from all the other thresholds and are Not exclusively based on the central or neighboring ones)
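To make the choice concrete, here's the menu above rewritten as a toy decision rule (the two booleans and the mapping are just my paraphrase of the list, not anything taken from the scripts):

```
def pick_method(bad_data: bool, process_instability: bool) -> str:
    if bad_data and process_instability:
        return "PBAD"   # both at once -> L0, O(n^2), KDE-mode based
    if bad_data or process_instability:
        return "QBAD"   # either one  -> L1, O(n log n), quantile based
    return "MBAD"       # fine data   -> L2, O(n) / O(1) online, moment based

print(pick_method(bad_data=True, process_instability=False))   # QBAD
```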
...
Examples of true use cases:
^^ an appropriate financial instrument to use PBAD
^^ and another one
...
Additional details about how to use it:
Keep the student5 kernel; it's the best you can do. I added the others mostly for comparison, and for cases when you want to use the tool Not for its primary purpose (i.e. on fine data).
The “Calculate for N bars” and “Starting at bar N” options let you restrict the calculation period to only the last N bars, or to the bars following a chosen one. That's vital, because the calculations here are heavy.
Keep the plotting offset at 1 (it lets you visually compare the current bar with the previous bar's threshold values). This is the way it should be done on price data.
HLC3 is the optimal source input, unless you have your own better one-point estimate of each datapoint (in the best case obtained by using PBAD itself on the OHLC+ values).
In essence it should be used just like MBAD or QBAD: fade/push extensions and limits, fade/push/skip deviations & basis, or other strategies of yours. Again, the only reason for the 3 methods to exist is to be chosen according to your data characteristics.
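A toy illustration of two notes from the list above, the HLC3 one-point estimate and the plotting offset of 1 (the bar values and thresholds below are made up for the example):

```
bars = [  # (high, low, close), made-up values
    (101.0, 99.0, 100.0),
    (103.0, 100.0, 102.5),
]
thresholds = [(100.8, 99.2), (101.9, 100.1)]    # hypothetical (upper, lower) per bar

hlc3 = lambda h, l, c: (h + l + c) / 3.0        # one-point estimate of a bar
cur = hlc3(*bars[-1])                           # current bar
upper_prev, lower_prev = thresholds[-2]         # previous bar's thresholds (offset = 1)
print("above upper:", cur > upper_prev, "| below lower:", cur < lower_prev)
```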
Btw:
This is the initial version; I don't consider it perfected tbh, even tho it works as expected. However, this method is very situational anyways.
In this script the KDE function is modified to ensure the resulting probabilities Do sum up to 1. I didn't do this normalization in the Weighted KDE Mode script, but there it's not required, since we just need the KDE global max.
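A quick sanity check of that normalization note: scaling a density by a positive constant doesn't move its argmax, which is why the mode-only script can skip normalization while PBAD needs proper probability masses to accumulate coverage:

```
import numpy as np
dens = np.array([0.2, 0.9, 0.4, 0.1])                    # unnormalized KDE values
print(np.argmax(dens) == np.argmax(dens / dens.sum()))   # True: the mode doesn't care
```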
see ya
∞
Weighted KDE Mode🙏🏻 The ‘ultimate’ typical value estimator, at the highest computational cost @ time complexity O(n^2). I am not afraid to say it: this is the last-resort BFG9000 you can ‘ever’ get to make dem market demons kneel before y’all
Quick guide
pls read it, you won’t find it anywhere else in open access
When to use:
If current market activity is so crazy || things on your charts are really so bad (contaminated data && (very heavy tails || a very pronounced peak)), the only option left is to use the peak (mode) of the Kernel Density Estimate instead of the median, not even mentioning the mean. So when WMA won't help, when WPNR won't help, you need this thing.
Setting it up:
Interval: choose what u need; you can use the usual moving windows, but I also added yearly and session anchors like in old VWAP (always prefer 24h instead of Session if your plan allows). Other options like a cumulative window are also there.
Parameters: this script ain't no joke, it needs time to do the calculations, so I added a setting to calculate only for the last N bars (when “Starting at bar N” is set to 0). If it's not zero, it acts as a starting point after which the calculations happen (useful for backtesting). Keep the other parameters as they are, keep the student5 kernel, and turn off the appropriate weights if u apply it to something other than chart data, e.g. on other studies.
But instead of listening to me, just experiment with the parameters and see what they change; it would take 5 mins max.
Been always saying that VWAP is ish: not time-aware etc, and the volume info is incorporated in a lil bit wrong way… So I decided not just to fix VWAP (you can do it yourself in 5 mins), but instead to drop the Ultimate xD typical value estimator that it's possible to make. Time-aware, volume / inferred-volume aware, resistant to all kinds of BS. This is your shieldwall.
How it works:
You can easily do a weighted kernel density estimation, in our case including temporal and intensity information while accumulating densities. Here are some details worth mentioning about the thing:
Kernels are raw (not scaled to unit variance); that's easier to work with later.
The h_constants for each kernel were calculated ^^ given that ^^ with the Python mpmath module at high decimal precision.
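For the curious, here's the kind of mpmath computation I mean, shown for the gaussian kernel against a normal reference density, where the AMISE recipe lands on the familiar Silverman constant ~1.0592. Which reference density the script's actual h_constants assume isn't spelled out here, so treat this purely as the mechanics:

```
import mpmath as mp
mp.mp.dps = 50                                            # high decimal precision

K = lambda u: mp.exp(-u**2 / 2) / mp.sqrt(2 * mp.pi)      # gaussian kernel, just for the demo
R_K = mp.quad(lambda u: K(u)**2, [-mp.inf, mp.inf])       # kernel roughness, integral of K^2
mu2 = mp.quad(lambda u: u**2 * K(u), [-mp.inf, mp.inf])   # kernel second moment
C_K = (8 * mp.sqrt(mp.pi) * R_K / (3 * mu2**2)) ** mp.mpf("0.2")
print(C_K)                                                # ~1.05922..., used as h = C_K * scale * n^(-1/5)
```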
In the bandwidth calculation, instead of using the empirical standard deviation as a scaler, I use... ta.range(src, len) / math.sqrt(12)
...which takes the data range and converts it to a standard deviation, assuming the data is uniformly distributed. That's exactly what we need: a scaler that is coherent with the KDE and has nothing to do with stdevs, just like the kernels (except for the gaussian ones, which we don't even need to use). More importantly, if u take multiple windows and watch over time which distro they approach in the long term, it would be the uniform one (not the normal one, as many think). Sometimes windows are multimodal, sometimes Laplace-like etc, so in general, all together, they are uniform-ish.
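Quick numerical check of that scaler, in Python: for data that is roughly Uniform(a, b), the standard deviation is (b - a)/sqrt(12), so range / sqrt(12) recovers it:

```
import numpy as np
u = np.random.default_rng(0).uniform(10.0, 20.0, 100_000)
print(u.std())                                   # ~ (20 - 10) / sqrt(12) ~ 2.887
print((u.max() - u.min()) / np.sqrt(12))         # the range-based estimate of the same thing
```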
The one and only kernel you really need is Student t with v = 5, for the use case I highlighted in the first part of the post for TV users. It's as far as u can get before ish becomes crazy, like undefined variance etc. It has the highest kurtosis (= 9) of all the kernels included here, perfect for the real use case I mentioned. Otherwise you don't even need KDE 4 real, but I still included other senseful kernels for comparison, or in case I am trippin there.
Btw, don’t believe all that hype about the Epanechnikov kernel, which in essence is made from the beta distribution with alpha = beta = 2; idk why folk call it by that weird name, it’s the beta2 kernel. Yes, on paper it really minimises AMISE (that’s how I calculated the h constants for all dem kernels in the script), but for the really crazy data that is our proper use case, it doesn’t perform even ‘closely’ compared with the student5 kernel. Not much else to add.
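If you want to verify the beta claim yourself, here's a two-line check (scipy's Beta(2, 2) rescaled from [0, 1] to [-1, 1] is exactly the 3/4 * (1 - u^2) Epanechnikov kernel):

```
import numpy as np
from scipy.stats import beta
u = np.linspace(-1.0, 1.0, 201)
epan = 0.75 * (1 - u**2)                         # the "Epanechnikov" kernel on [-1, 1]
beta22 = beta(2, 2).pdf((u + 1) / 2) / 2         # Beta(2,2) rescaled to [-1, 1] (Jacobian 1/2)
print(np.allclose(epan, beta22))                 # True
```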
Shout out to @RicardoSantos for inspiration, I saw your KDE script a long time ago brotha, finna got my hands on it.
∞

