@inthehands Great thread!
I wanted to point out a small caveat. It is often said that ML methods *can't* generalize outside their training domain. While that is often the case, it *isn't always true*: this paper shows simple settings where a model trained on only sparse addition examples generalizes strongly.
It is true that big ML models often struggle to generalize, but people should know that strong generalization isn't *impossible*; it just might be hard to achieve in practice.