ML "Research" - Passing The Buck of Complexity
A response to the recent blog post To Understand Language is to Understand Generalization and the earlier Just Ask for Generalization.
TL;DR of those posts: ML "generalization" is equivalent to language.
To put my position up front: the post is an awful mess.
It essentially passes the buck of ignorance and complexity from one area of vague understanding - ML "generalization" - to another area of vague understanding: human language.
This seems like a cop out.
Generalization Who?
Large amounts of diverse data are more important to generalization than clever model biases. - Just Ask for Generalization
This is likely because the only thing ML models do is model some existing mathematical relation in which each tuple is unique. Diverse data therefore gives a more complete picture of the relation, since it avoids unnecessary redundancy.
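To make the framing concrete (this toy sketch is mine, not either post's): treat the target as a relation, a set of unique (x, y) tuples, and compare a model fit on redundant samples against one fit on diverse samples of the same relation.

```python
import numpy as np

# The "relation" we want to model: a set of unique (x, y) tuples from y = sin(x).
true_relation = np.sin
rng = np.random.default_rng(0)

# Redundant dataset: 200 samples, all crowded into a narrow slice of the domain.
x_redundant = rng.uniform(0.0, 1.0, size=200)
# Diverse dataset: the same 200 samples, spread across the whole domain.
x_diverse = rng.uniform(-np.pi, np.pi, size=200)

# Fit the same highly-parameterized model (a degree-5 polynomial) to each dataset.
fit_redundant = np.polyfit(x_redundant, true_relation(x_redundant), deg=5)
fit_diverse = np.polyfit(x_diverse, true_relation(x_diverse), deg=5)

# Evaluate both fits on tuples of the relation neither saw during fitting.
x_test = np.linspace(-np.pi, np.pi, 100)
mse = lambda fit: np.mean((np.polyval(fit, x_test) - true_relation(x_test)) ** 2)
print(f"MSE, redundant data: {mse(fit_redundant):.4f}")
print(f"MSE, diverse data:   {mse(fit_diverse):.4f}")
```

Same sample budget, same model; the diverse dataset simply covers more of the relation's unique tuples, so the fit holds up away from the redundant slice.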
reinforcement learning algorithms are much less computationally efficient when it comes to absorbing vast quantities of diverse data needed for generalization. - Just Ask for Generalization
I wonder why.
could require thousands of trials in order to achieve statistical certainty within a couple percent ... could encompass millions of unique tasks and scenarios, which makes accurate evaluation prohibitively expensive. - Just Ask for Generalization
Ah, of course. More complexity. Always more complexity.
Language Is Not Generalization
And how do you define generalization, anyway?
You don't.
... such definitions are not useful for ...
Anything really.
“What data is out-of-distribution?”
Are we asking which data is bad? Or are we asking which unique tuples of the underlying mathematical relation (the relation we are attempting to model with highly-parameterized statistical models) are missing from our dataset?
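If it is the second question, there is nothing mystical about it. A two-line illustration (entirely my own construction, with a toy relation):

```python
# The full relation we want to model: every unique (input, output) tuple.
relation = {(x, x * x) for x in range(10)}

# The tuples we actually managed to collect as training data.
dataset = {(x, x * x) for x in (0, 1, 2, 3, 7)}

# Under this framing, "out-of-distribution" is nothing mysterious:
# it is just the set of tuples of the relation missing from the dataset.
out_of_distribution = relation - dataset
print(sorted(out_of_distribution))  # [(4, 16), (5, 25), (6, 36), (8, 64), (9, 81)]
```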
“Is my model + data robust to adversarial examples?”
Sigh. Exactly measuring adversarial robustness is computationally intractable. So how could we ever measure it properly without a massive evaluation dataset of good and bad data? Just increment through all possible vectors in the input space?
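To put a number on "all possible vectors in the input space" (my arithmetic, not the post's), consider even a tiny 28x28 8-bit grayscale input:

```python
# Even a toy input space defeats "increment through all possible vectors":
pixels = 28 * 28          # a tiny 28x28 grayscale image
values_per_pixel = 256    # 8 bits per pixel

# Every possible input vector for this one small input format.
total_inputs = values_per_pixel ** pixels

# Count the digits rather than converting to float (it would overflow).
print(f"~10^{len(str(total_inputs)) - 1} possible inputs to evaluate")
```

That is roughly 10^1888 candidate inputs, astronomically beyond any evaluation budget.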
“How can we train models to know what they don’t know?”
You cannot provably do this. So, if you cannot prove that it can be done (in training your model) or that you are doing it, or that you have done it, then how could you ever know what you don't know?
Like the parable of the blind men and the elephant
There is no elephant. Or, at least, the elephant in the room is not the existence of some currently misunderstood yet profound notion of "generalization" but rather the non-existence of that profoundness.
“compositionality”
Oh boy, here we go again, back down the rabbit hole of the algebraization of everything as a method of "applied mathematical modeling". Maybe we can bust out category theory next?
Do you know what else are “disentangled, standalone concepts”? Words!
Wait, wait. I think I get how generalization is human language now.
No, nevermind. Still lost.
the precise boundaries of anything - even definitions around the basic structures of language itself - become fuzzy if you look too hard.
And by "look too hard" we are implying "never question the assertions being made"?
But if you accept that “generalization is language”
I don't. I avoid "accepting" things that make claims to be sound and rigorous but make no attempt to actually be such.
Language is generalization is cognition.
Wow! We solved AGI and brain science? Man, I am way behind.
in an infinitely-composable way
Infinite composition! It's composition all the way down, folks. Cuz Chomsky said so.
Of course, we then tie these absolutely insane conclusions and claims into a working paper that comes nowhere close to proving or validating them, but that is too difficult for the average reader to understand, or to care that it does not.
ML "research" comes closer every day to pure snake-oil-salesman clickbait.
I don’t mean to suggest that they are ready today for solving ML once and for all
See! We were not getting carried away at all. Just ignore all the rambling. We are reasonable researchers.
But we still continue to make crazy claims without any substance or backing, such as:
The structure of language is the structure of generalization.
What does that even mean?
Chomsky has a lot more to say ...
No, he doesn't. Almost everything Chomsky has ever said or devised is amateurish and worthless.
humans are born with some ability to generalize, even if what they are generalizing is their learning ability.
It's funny how ML "researchers" feel confident making statements about highly complex human biology before they have even properly defined their own jargon, let alone formally specified how that jargon should be interpreted when applied to biology.
Decision Transformers
But essentially, both blog posts seem to have been inspired by the Decision Transformer paper.
A few basic facts about the paper: it is 21 pages long, and 10 of those pages are table of contents, references, and essentially nothingness. That means 48% of the entire published paper is not even, well, anything. There are 9 authors. Across the remaining 11 pages you get half a dozen images, half a dozen tables, and half a page of source code.
How much work is actually here? 9 authors for a few pages of writing; each person was assigned half a page of work? (Yes, yes, I see that there were two primary authors and some advisors. I am being partially facetious, and you know it.)
Anyways, the core concept of the paper is also ... something. Did you know that, since RL algorithms are about finding paths in state spaces, you could instead train a transformer to output the optimal path (a sequence of {state, action} pairs) directly? But perhaps the most inspiring part: delayed/sparse rewards don't trip up the transformer model. Incredible.
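For the record, here is roughly what that core concept amounts to (a minimal sketch of my own, not the authors' code): flatten each trajectory into a (return-to-go, state, action) token stream and train an ordinary sequence model to predict the next action.

```python
# A hypothetical trajectory of per-timestep (state, action, reward) triples.
trajectory = [
    ("s0", "a0", 0.0),
    ("s1", "a1", 0.0),
    ("s2", "a2", 1.0),  # a sparse reward that only arrives at the very end
]

# Return-to-go at each step: the sum of all rewards from that step onward.
rewards = [r for (_, _, r) in trajectory]
returns_to_go = [sum(rewards[i:]) for i in range(len(rewards))]

# Flatten into the (return-to-go, state, action) stream the sequence model sees.
tokens = []
for (state, action, _), rtg in zip(trajectory, returns_to_go):
    tokens += [("rtg", rtg), ("state", state), ("action", action)]

# Training is just next-action prediction over this stream. The sparse final
# reward is visible at every timestep via the return-to-go, which is exactly
# why delayed rewards don't "trip up" the model.
print(tokens)
```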
I mean, what else can be said? Generalization is truly language.
Conclusions
I genuinely hate to have to say this, but this is getting really bad. We are seeing, time and time again, that the ML "research" community is full of language but very little generalization (pun intended, of course). Everyone is so scared of breaking out of the bubble of "acceptable and academically-accredited paths" to intelligence research. But the bubble will burst inevitably, regardless of individual desires or behaviors.
It's time for ML "research" to come entirely to ALife. There is a lot more to be done and a lot more doable over here.