- “A Transformer That Solves Small Tabular Classification Problems in a Second”
- https://arxiv.org/abs/2207.01848
- https://medium.com/chat-gpt-now-writes-all-my-articles/tabpfn-the-new-xgboost-easy-models-for-efficient-high-performance-machine-learning-1327de095b4f
- “The training is computationally expensive, requiring significant time and computational resources. However, it is a one-time offline step done during algorithm development.”
- it uses a transformer that encodes each feature vector and its label as a single token
- so each incoming feature vector is effectively a word vector? I guess you skip the tokenizer step
- but how do you turn a row of features (e.g. age) into a 512-dim vector, for example? see the sketch after this list
- github: https://github.com/automl/TabPFN
- https://www.youtube.com/watch?v=9cE8lqQiLyM
    - a video by one of the authors
    - this video isn’t very good
- limitations:
    - up to 1000 data points, 100 features, 10 classes
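Partial answer to the embedding question above: per the paper, the per-row encoder is essentially a linear layer over the zero-padded feature vector, so there is no tokenizer or vocabulary step. A minimal sketch of that idea (my own illustration, not code from the repo; `D_MODEL = 512` and the `nn.Embedding` label encoder are assumptions):

```python
# Hypothetical sketch (not TabPFN's actual code) of how a numeric row
# becomes a transformer token: a single linear projection, no tokenizer.
import torch
import torch.nn as nn

N_FEATURES = 100  # TabPFN zero-pads every dataset up to 100 features
D_MODEL = 512     # illustrative embedding width, matching the question above

feature_embed = nn.Linear(N_FEATURES, D_MODEL)  # one token per row
label_embed = nn.Embedding(10, D_MODEL)         # up to 10 classes (one simple choice)

row = torch.zeros(1, N_FEATURES)
row[0, 0] = 37.0                  # e.g. an "age" feature (normalized in practice)
row_token = feature_embed(row)    # shape (1, 512): the row's "word vector"

# For training rows, the label embedding is added to the same token,
# so each (x, y) pair enters the transformer as one position.
train_token = row_token + label_embed(torch.tensor([1]))
```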
Gotchas
- Do not power transform your data before feeding it into TabPFN (it already normalizes and transforms inputs internally, so you would be transforming twice); see the sketch after these gotchas.
- “We do not have special nan handling built into our model. We replace nan values with zero at test time”
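A minimal usage sketch following the README of github.com/automl/TabPFN, with both gotchas marked in comments (the `N_ensemble_configurations` argument is from the v1 API and may not exist in newer releases):

```python
# Usage sketch based on the automl/TabPFN README (v1 API assumed).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)  # 569 rows, 30 features, 2 classes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

X_test = X_test.copy()
X_test[0, 0] = np.nan  # gotcha: NaNs are replaced with zero at test time

# Gotcha: pass raw numeric features; TabPFN normalizes inputs itself,
# so don't power-transform beforehand.
clf = TabPFNClassifier(device="cpu", N_ensemble_configurations=32)
clf.fit(X_train, y_train)             # cheap: mostly just stores the data
print(clf.predict_proba(X_test)[:3])  # the transformer forward pass happens here
```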
Limitations
- “We also focused the development of TabPFN to purely numerical datasets without missing values, and while they can be applied to datasets with categorical features and/or missing values, their performance is generally worse.”
- “we did not consider the existence of many uninformative features in our prior, leading to performance degradation when such features are added”
- since TabPFN does “training” and inference in a single forward pass (the training set is part of the model’s input), inference is slower than for a conventionally fitted model; see the timing sketch below
- it works much better for classification than for regression (the paper only targets classification)
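One rough way to see the inverted cost profile from the training-plus-inference point above: time `fit` against `predict`. A sketch under the same v1-API assumption as the earlier example:

```python
# "fit" is near-instant because it only stores the training set;
# "predict" runs the full transformer pass over train + test rows.
import time
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier  # v1 API assumed, as above

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = TabPFNClassifier(device="cpu")
t0 = time.perf_counter(); clf.fit(X_train, y_train)
t1 = time.perf_counter(); clf.predict(X_test)
t2 = time.perf_counter()
print(f"fit: {t1 - t0:.3f}s  predict: {t2 - t1:.3f}s")  # predict dominates
```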