FREL: A Stable Feature Selection Algorithm for trading system parameter selection



When performing walk-forward optimization, the high number of parameters a trading system may have means we often do not know which parameters are really critical and which add no significant value to the robust performance of the system. The FREL algorithm is a useful method for ranking, and even weighting, predictor candidate variables in a classification application that is relatively low noise but is plagued by high dimensionality. Feature selection has been an active research area in machine learning and data mining for decades. It is an important and frequently used technique for dimension reduction, removing irrelevant and redundant information from a data set. It is also a knowledge discovery tool, providing insight into the problem through interpretation of the most relevant features.
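FREL itself is specified in the attached 2015 paper and in Masters' book, so the sketch below is not FREL; it is only a rough, simplified illustration of the nearest-neighbor, margin-based feature-weighting family that FREL belongs to, using a classic Relief-style update. The function name relief_weights, the plain Manhattan distance, and the assumption that features are already scaled to comparable ranges are illustrative choices, not anything taken from the attached documents.

// Illustrative Relief-style feature weighting (NOT the FREL algorithm itself).
// Cases are rows of a feature matrix X with one integer class label per row in y.
#include <vector>
#include <cmath>
#include <limits>
#include <cstddef>

// Returns one weight per feature: large positive weights suggest features that
// separate the classes well among nearest neighbors.
std::vector<double> relief_weights(const std::vector<std::vector<double>>& X,
                                   const std::vector<int>& y)
{
    const std::size_t n = X.size();
    const std::size_t d = X.empty() ? 0 : X[0].size();
    std::vector<double> w(d, 0.0);

    auto dist = [&](std::size_t a, std::size_t b) {      // Manhattan distance
        double s = 0.0;
        for (std::size_t f = 0; f < d; ++f) s += std::fabs(X[a][f] - X[b][f]);
        return s;
    };

    for (std::size_t i = 0; i < n; ++i) {
        // Find the nearest case of the same class (hit) and of a different class (miss).
        std::size_t hit = i, miss = i;
        double best_hit  = std::numeric_limits<double>::max();
        double best_miss = std::numeric_limits<double>::max();
        for (std::size_t j = 0; j < n; ++j) {
            if (j == i) continue;
            const double dij = dist(i, j);
            if (y[j] == y[i]) { if (dij < best_hit)  { best_hit = dij;  hit = j; } }
            else              { if (dij < best_miss) { best_miss = dij; miss = j; } }
        }
        // Reward features that differ from the nearest miss and agree with the nearest hit.
        for (std::size_t f = 0; f < d; ++f)
            w[f] += std::fabs(X[i][f] - X[miss][f]) - std::fabs(X[i][f] - X[hit][f]);
    }
    for (double& wf : w) wf /= static_cast<double>(n);    // average contribution per case
    return w;
}

Note that an exact duplicate of a case would always be its nearest hit at distance zero, which is exactly why the bootstrap variation described below must sample without replacement.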


Timothy Masters (in Data Mining Algorithms in C++, 2018, pages 149 to 164) provides a multithreaded C++ implementation. His implementation includes an approximate Monte Carlo permutation test (MCPT) of the null hypothesis that all predictors have equal value, as well as an MCPT of the null hypothesis that the predictors, taken as a group, are worthless. He notes that he was unable to devise a FREL-based MCPT of any null hypothesis concerning individual predictors taken in isolation. A frequently useful variation on the original algorithm is to take many bootstrap samples from the dataset and compute the final weight estimate by averaging the estimates produced from each bootstrap sample. The sampling must be done without replacement, because nearest-neighbor algorithms are irreparably damaged when the dataset contains exact replications of cases.
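As a minimal sketch of the general MCPT pattern (not Masters' actual code), assume a caller-supplied criterion callable that runs FREL on the data and reduces the resulting weights to a single group statistic, for example their sum. The test repeatedly shuffles the class labels, recomputes the statistic, and counts how often a permuted run matches or beats the original:

// Generic Monte Carlo permutation test skeleton (a sketch, not Masters' code).
#include <vector>
#include <random>
#include <algorithm>
#include <functional>

// criterion: runs FREL (or any weighting scheme) on (X, y) and reduces the
// resulting weights to one group statistic, e.g. the sum of the weights.
double mcpt_p_value(const std::vector<std::vector<double>>& X,
                    std::vector<int> y,               // copied so it can be shuffled
                    const std::function<double(const std::vector<std::vector<double>>&,
                                               const std::vector<int>&)>& criterion,
                    int n_reps, unsigned seed = 12345)
{
    std::mt19937 gen(seed);
    const double original = criterion(X, y);          // statistic on the real labels

    int count_ge = 0;
    for (int rep = 0; rep < n_reps; ++rep) {
        std::shuffle(y.begin(), y.end(), gen);         // destroy the label/feature link
        if (criterion(X, y) >= original)
            ++count_ge;
    }
    // Count the unpermuted run itself so the estimated p-value is never zero.
    return (count_ge + 1.0) / (n_reps + 1.0);
}

A small p-value is then evidence against the null hypothesis that the predictors, taken as a group, are worthless; the equal-value test would use a different statistic (and possibly a different permutation scheme), but the counting and p-value logic follow the same general pattern.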


Bootstrapping FREL has at least two major advantages over doing one FREL analysis of the entire dataset.


1) Stability is usually improved. A critical aspect of any weighting scheme is that the computed optimal weights should be affected as little as possible by small changes in the dataset. Such changes might be inclusion or exclusion of a few training cases or the addition of noise to the data. An average over bootstrap samples is much more robust to such changes than a single FREL run on the complete dataset.
2) Because the run time of the FREL algorithm is proportional to the square of the number of cases, we can greatly decrease total run time by performing many iterations on small samples.


For these reasons, bootstrapping is generally recommended. The sample size must be large enough that each sample is virtually guaranteed to contain a significant number of representatives from each target class. For the number of iterations, Masters' rough rule of thumb is that the number of iterations times the sample size should be about twice the number of training cases.
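A minimal sketch of this bootstrapped-and-averaged scheme under those guidelines, assuming a caller-supplied one_run callable that performs a single FREL pass on one subsample (the names and data layout here are illustrative, not taken from the attached book):

// Sketch of bootstrapped FREL: average the weights over many small subsamples
// drawn WITHOUT replacement, so no subsample ever contains an exact duplicate case.
#include <vector>
#include <random>
#include <algorithm>
#include <numeric>
#include <iterator>
#include <functional>
#include <cstddef>

using Matrix   = std::vector<std::vector<double>>;
using WeightFn = std::function<std::vector<double>(const Matrix&, const std::vector<int>&)>;

// one_run: a single FREL pass on one subsample (hypothetical stand-in).
std::vector<double> bootstrap_frel(const Matrix& X, const std::vector<int>& y,
                                   std::size_t sample_size, const WeightFn& one_run,
                                   unsigned seed = 12345)
{
    const std::size_t n_cases = X.size();
    const std::size_t n_feats = X.empty() ? 0 : X[0].size();
    // Rule of thumb: iterations * sample_size should be about twice the number of cases.
    const std::size_t n_iters = std::max<std::size_t>(1, (2 * n_cases) / sample_size);

    std::mt19937 gen(seed);
    std::vector<std::size_t> all(n_cases);
    std::iota(all.begin(), all.end(), 0);

    std::vector<double> avg(n_feats, 0.0);
    for (std::size_t it = 0; it < n_iters; ++it) {
        // Draw sample_size distinct case indices (sampling without replacement).
        std::vector<std::size_t> idx;
        std::sample(all.begin(), all.end(), std::back_inserter(idx), sample_size, gen);

        Matrix Xs;
        std::vector<int> ys;
        for (std::size_t i : idx) { Xs.push_back(X[i]); ys.push_back(y[i]); }

        const std::vector<double> w = one_run(Xs, ys);   // one FREL run on the subsample
        for (std::size_t f = 0; f < n_feats; ++f) avg[f] += w[f];
    }
    for (double& a : avg) a /= static_cast<double>(n_iters);  // final estimate = mean
    return avg;
}

The subsample draw uses std::sample, which samples without replacement, and the iteration count follows directly from the rule of thumb above; in practice the chosen sample size would also be checked against the class frequencies so that every target class is well represented in each subsample.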


A Monte Carlo permutation test is a useful, though time-consuming, way to test certain null hypotheses about the predictor candidates fed to the FREL algorithm. It is vital to understand that these group-level tests are significantly different from permutation tests of individual predictors taken in isolation.



Attachments
Stability of feature selection algorithm A review 2019.pdf
(1014.10 KiB)
Data Mining Algorithms in C - Data Patterns and Algorithms for Modern Applications Apress 2018.pdf
(4.36 MiB)
FREL A Stable Feature Selection Algorithm 2015.pdf
(1.14 MiB)
  • Votes +1
  • Project StrategyQuant X
  • Type Feature
  • Status New
  • Priority Normal

History

#1 jdelcarm66 11.05.2020 18:54
Task created

#2 jdelcarm66 11.05.2020 18:55
Voted for this task.

Votes: +1
