FFX: Massively Shallow Learning

FFX (Fast Function Extraction) is a symbolic regression technique: given X/y training data, it induces whitebox models. It is:

  • Fast - runtime of 5-60 seconds on a 1 GHz CPU, depending on problem size
  • Scalable - 1000 input variables, no problem!
  • Deterministic - no need to "hope and pray".
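To make "inducing a whitebox model from X/y data" concrete, here is a toy sketch in Python. It is not the FFX algorithm itself: the candidate basis functions, the single-basis model form, and the names `BASES`, `fit_line` and `extract_model` are assumptions made purely for illustration. It only shows the general flavor: try a fixed set of candidate expressions deterministically, fit each by least squares, and return a readable equation.

```python
import math

# Hypothetical candidate basis functions (illustrative only, not FFX's actual set).
BASES = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "log10(x)": lambda x: math.log10(x),
}

def fit_line(b, y):
    """Closed-form least squares for y ~ a0 + a1 * b."""
    n = len(b)
    mean_b = sum(b) / n
    mean_y = sum(y) / n
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    cov = sum((bi - mean_b) * (yi - mean_y) for bi, yi in zip(b, y))
    a1 = cov / var_b
    a0 = mean_y - a1 * mean_b
    return a0, a1

def extract_model(xs, ys):
    """Fit every candidate basis and return (sse, name, a0, a1) for the best fit."""
    best = None
    for name, f in BASES.items():
        b = [f(x) for x in xs]
        a0, a1 = fit_line(b, ys)
        sse = sum((a0 + a1 * bi - yi) ** 2 for bi, yi in zip(b, ys))
        if best is None or sse < best[0]:
            best = (sse, name, a0, a1)
    return best

# Training data generated from y = 3 + 2*x^2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 + 2.0 * x * x for x in xs]

sse, name, a0, a1 = extract_model(xs, ys)
print(f"y = {a0:.2f} + {a1:.2f} * {name}")  # prints: y = 3.00 + 2.00 * x^2
```

Because there is no random search involved, running the sketch twice on the same data gives the same equation, which is the sense of "deterministic" used above.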

If you ignore the whitebox-model aspect, FFX can be viewed as a general regression tool. It has been used this way on thousands of industrial problems with 100K+ input variables. It can also be used as a classifier (FFXC) by wrapping the model output in a logistic function. FFXC has also been used successfully on thousands of industrial problems.
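The logistic wrapping mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not FFXC's actual code: the model `whitebox_model`, its coefficients, and the 0.5 decision threshold are all hypothetical, and how FFXC fits its models is not shown here.

```python
import math

def whitebox_model(x1, x2):
    """Hypothetical regression model of the kind a symbolic regressor might return."""
    return -1.5 + 0.8 * x1 - 0.3 * x2 * x2

def logistic(z):
    """Numerically stable logistic function: maps any real z into [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def classify(x1, x2, threshold=0.5):
    """Wrap the regression output in a logistic function, then threshold it."""
    p = logistic(whitebox_model(x1, x2))
    return p, int(p >= threshold)

p, label = classify(4.0, 1.0)
print(f"p = {p:.3f}, class = {label}")
```

The wrapped model stays readable: the class-1 score is just the logistic function applied to the original expression.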

Technical details:

Open-source code (2012, derived from v1.3 code):

  • Available on GitHub or PyPI. A big thanks to Nathan Kupp for maintaining and improving the code!
  • Feel free to contribute to the code!

v1.3 code (2011):

Real-world test datasets:

Representative papers:

  • T. McConaghy, FFX: Fast, Scalable, Deterministic Symbolic Regression Technology, Genetic Programming Theory and Practice IX, edited by R. Riolo, E. Vladislavleva, and J. Moore, Springer, 2011.
  • T. McConaghy, High-Dimensional Statistical Modeling and Analysis of Custom Integrated Circuits, Proc. Custom Integrated Circuits Conference, Sept. 2011.