• One assumption is: “The public leaderboard should be similar to the private leaderboard”
    • If your public leaderboard score is very different from what you see on your train data, the test data probably follows a different distribution, and you have to make your train dataset look like the test data as best you can infer it
  • How to probe?
  • TLDR: Try your best to figure out the distribution of the test data, and make sure your training data has a distribution that matches it.
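One common way to probe, sketched here under assumptions the notes do not state: the task is binary classification, the metric is log loss, and the leaderboard score is computed on a fixed subset of the test set. Submitting the same constant probability `p` for every row then gives a score from which the share of positives can be solved for algebraically. The function names below are hypothetical, and the reweighting step is just one simple way to make the train set match the inferred test distribution.

```python
import math


def positive_rate_from_constant_submission(score, p):
    """Infer the share of positives from the log loss of a constant prediction p.

    For a constant prediction p and positive rate r:
        score = -(r * log(p) + (1 - r) * log(1 - p))
    Solving for r gives the expression returned below.
    """
    if not 0.0 < p < 1.0 or p == 0.5:
        # At p = 0.5 the score does not depend on r, so nothing can be inferred.
        raise ValueError("p must be in (0, 1) and different from 0.5")
    return (score + math.log(1 - p)) / (math.log(1 - p) - math.log(p))


def class_weights(train_labels, target_rate):
    """Per-row weights that make the weighted positive rate equal target_rate."""
    train_rate = sum(train_labels) / len(train_labels)
    if not 0.0 < train_rate < 1.0:
        raise ValueError("train_labels must contain both classes")
    w_pos = target_rate / train_rate
    w_neg = (1 - target_rate) / (1 - train_rate)
    return [w_pos if y == 1 else w_neg for y in train_labels]
```

In use, you would submit a file filled with, say, `p = 0.2`, read the score off the leaderboard, pass both to `positive_rate_from_constant_submission`, and feed the resulting weights to any model that accepts per-sample weights. The estimate only describes the public part of the test set, so it carries over to the private leaderboard only if the assumption in the first bullet holds.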