Those chocolate chip cookie recipes online are the real AI danger


Ever try to google a recipe and you get the most annoying freaking article about how the author’s mom’s best friend’s adopted dog’s vacation to South America made them really want to learn how to make a proper Chilean Sea Bass? It goes on for pages. Normally, with my hands covered in flour or something sticky to coat the screen of my phone, I scroll and scroll and scroll to find the one ingredient that I can’t remember – getting angrier at the stupid article.
(Look! I made cookies and I didn’t need to know about your brother’s Zen retreat to Hershey, PA.)
The reason this happens is one of the biggest problems in AI. Everyone is worried about safety, and the destruction of the world, or Terminator, or job losses. Nope, the biggest problem is the tons of crap that precede the ingredient list or oven temperature in a recipe.
This is because any AI or Machine Learning algorithm is really good at optimizing what you told it you wanted. And yes, search engines like Google are AI. You’ve been using types of AI for decades now.
You told it you wanted the best recipe for some chocolate chip cookies – but a computer doesn’t know jackshit about cookies – it knows about words that describe cookies. And really, it doesn’t even know that. It finds patterns of words that people click on when they type words about cookies. So rather than knowing anything about cookies, it solves a proxy problem that seems to be close to it. Most of the time, a search engine is really good at finding words on webpages that are related to your search. That includes words for the thing you want that you didn’t think to type, or didn’t even know: “Napoleon Defeat” → “Waterloo”. They co-occur, and they often answer your “information seeking need”.
(All the words on the left vs. only the recipe on the right)
But for that killer Paella, there aren’t enough words in my 15 step recipe to distinguish it from the recipes of the 50,000 other people who have posted something. So having a lot more words helps Google/Bing/whatever the fuck else is out there/Ask Jeeves find that recipe. If I describe every possible detail about saffron, some of those words are bound to be deemed related to paella, or better yet, even more closely related to saffron than that other jerk posting his recipe without his personal travel blog attached.
This is a classic problem where the algorithm has over-optimized for the proxy problem. The score it is chasing is what we call the objective function. It is really good at finding related words for what you search for – and most of the time that’s what you want – but not all the time. It’s good at something that appears similar to what you want. And it will get as good as possible at doing that thing – but it’s not that thing. It isn’t really telling you how much wasabi you need. It is trying to win a game and get the best score. More words about “Japan” or “horseradish” make that little score go up.
So instead of finding the perfect recipe, it just rewards longer and longer articles. The law of unintended consequences.
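If you want to see that little score misbehave, here is a toy sketch. To be clear, this is nothing like what Google actually runs: the `RELATED` table, the `proxy_score` function and both "pages" are made up for illustration. It only shows how a score that counts related words ends up liking the rambling version more.

```python
# Toy proxy objective: count how many words on a page look "related" to the query.
# RELATED is a made-up lookup table standing in for whatever a real engine has learned.
RELATED = {
    "wasabi": {"wasabi", "japan", "horseradish", "sushi", "paste"},
}


def proxy_score(query, page):
    """Return the number of words on the page found in the query's related-word set."""
    related = RELATED[query]
    # Lowercase each word and strip trailing punctuation before comparing.
    words = [w.strip(".,!?").lower() for w in page.split()]
    return sum(1 for w in words if w in related)


# The page you actually wanted: just the instruction.
recipe = "Mix 1 tsp wasabi paste into the soy sauce."

# The same instruction, buried under a travel story.
blog = (
    "On our trip to Japan we fell in love with sushi. "
    "Real wasabi is not horseradish, although most wasabi paste outside Japan "
    "is dyed horseradish. Anyway, mix 1 tsp wasabi paste into the soy sauce."
)

recipe_score = proxy_score("wasabi", recipe)
blog_score = proxy_score("wasabi", blog)
print(recipe_score, blog_score)  # 2 10
```

Both pages tell you exactly the same thing about wasabi. The travel story just gives the score more words to count, so it wins.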
And this is all those fancy AI bots are good at as well … making some little score go up. You feed the algorithm the whole internet and it gets good at knowing that “United States of” is probably going to be followed by “America” … or when you ask ChatGPT to talk like a pirate, that the last word is probably “Yarrrrr”.
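The "guess the next word" game can be sketched in a few lines too. Big caveat: real chatbots use enormous neural networks, not a word-count table, and the tiny `corpus` and the `predict_next` helper below are invented for this example. The flavor of the objective is the same though: pick whatever usually came next.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus standing in for "the whole internet".
corpus = (
    "the united states of america . "
    "the united states of america . "
    "the united states of mexico . "
    "talk like a pirate yarrrrr ."
).split()

# For every word, count which words follow it and how often.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1


def predict_next(word):
    """Return the word that most often followed `word` in the corpus."""
    return following[word].most_common(1)[0][0]


guess = predict_next("of")
print(guess)  # america
```

It has no idea what America is. It just saw "america" after "of" more often than anything else, and saying so makes its score go up.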
This is the biggest problem in AI … we are hyper-optimizing for little scores that aren’t exactly what we want. We aren’t going to destroy the world … if anything, we are just going to flood our emails with verbose, AI-generated content that is wayyy too long and resembles the quote-unquote perfect cookie recipe that no one actually asked for.
Me, I’d take the killer robots any day of the week over having to read another stupid story about some personal discovery when I’m trying to cook, but unfortunately that’s not what we are going to get.
Written by

Academically Impertinent
I am a professor who has been working on Artificial Intelligence for almost two decades. These are my unfiltered, NSFW takes on some of the developments in the field. Things should still be a proper reflection, and scientifically rigor-ish, just without all the egos and pomp that I see in most of Academia/Silicon Valley/Wall Street.