FFX: Massively Shallow Learning

FFX is a technique for symbolic regression, to induce whitebox models given X/y training data. It does Fast Function Extraction. It is:

  • Fast - runtime 5-60 seconds, depending on problem size (1GHz cpu)
  • Scalable - 1000 input variables, no problem!
  • Deterministic - no need to "hope and pray".

If you ignore the whitebox-model aspect, FFX can be viewed as a regression tool. It's been used this way for thousands of industrial problems with 100K+ input variables. It can also be used as a classifier (FFXC), by wrapping the output with a logistic map. 

The original FFX project work was 2011-2013. It was embedded in Solido software and since then,  used successfully on thousands of industrial problems.

Technical details:

Open-source Code:

  • See github.com/trentmc/ffx2011
  • FFX.py implements FFX algorithm, and runffx.py is a testing toolkit. License.
  • Dependencies - python (2.5, 2.6, and 2.7 work), numpy/scipy (0.9.0 works), scikit-learn (0.8 works)
  • (Alt open source code: natekupp/ffx. Derived from v1.3. 2012-2021. 🙏 Nathan Kupp.)

Real-world test datasets:

Representative papers:

[1] T. McConaghy, FFX: Fast, Scalable, Deterministic Symbolic Regression Technology, Genetic Programming Theory and Practice IX, Edited by R. Riolo, E. Vladislavleva, and J. Moore, Springer, 2011.

[2] T. McConaghy, High-Dimensional Statistical Modeling and Analysis of Custom Integrated Circuits, Proc. Custom Integrated Circuits Conference, Sept. 2011