kleinbl00  ·  17 days ago  ·  post: Ed Zitron has lost all patience with your AI Boosterism

The first phrase that really captured my understanding and experience of LLMs was "stochastic parrot."

The parrot has context and analysis around what it says, but that context and analysis is very... parrot-centric. If I say that parrots mostly speak in non-sequiturs, very few people will argue with me. If I say that there's a certain randomness (stochasticity) to parrot speech, the chin-strokers will nod. But if I say an LLM has less of a handle on its outputs than a parrot, Blake Lemoine will tell me I'm a monster and the TESCREAL posse will laugh and point about how I won't be able to scream when I have no mouth.

The most recent phrase that gets to the heart of the problem is "bag of heuristics." I came across it in this piece, which talks about "world models" and "brittleness" - what are the LLMs solving for, and is that anything like what we're solving for?

    The OthelloGPT world-model story faced a new complication when, in mid-2024 a group of student researchers released a blog post entitled “OthelloGPT Learned a Bag Of Heuristics.” The authors were part of a training program created by DeepMind’s Neel Nanda, and their project was to follow up on Nanda’s own work, and do careful experiments to look more deeply into OthelloGPT’s internal representations. The students reported that, while OthelloGPT’s internal activations do indeed encode the board state, this encoding is not a coherent, easy-to-understand model like, say, an orrery, but rather a collection of “many independent decision rules that are localized to small parts of the board.” As one example, they found a particular neuron (i.e., neural network unit) at one layer whose activation represents a quite specific rule: “If the move A4 was just played AND B4 is occupied AND C4 is occupied, then update B4 and C4 and D4 to ‘yours’ [assuming the mine, yours, or empty classification labels]”. Another neuron’s activation represents the rule “if the token for B4 does not appear before A4 in the input string, then B4 is empty.”
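
To make "bag of heuristics" concrete, here's a toy sketch - mine, not the students' actual probe code - of the difference between rules welded to specific squares and the single general rule a coherent world model would encode:

```python
# Toy illustration only - not OthelloGPT's real internals.

# "Bag of heuristics": every rule is welded to particular squares.
def neuron_rule_1(last_move, board):
    """Fires for exactly one corner case, like the neuron the students found."""
    if last_move == "A4" and board["B4"] != "empty" and board["C4"] != "empty":
        for sq in ("B4", "C4", "D4"):
            board[sq] = "yours"

def neuron_rule_2(tokens, board):
    """Another square-specific rule: B4 is empty unless its token precedes A4's."""
    if "A4" in tokens and "B4" not in tokens[: tokens.index("A4")]:
        board["B4"] = "empty"

# World model: ONE rule for every square, because it encodes what a move
# *means* instead of memorizing local co-occurrences.
def apply_move(board, move, me="yours", them="mine"):
    """Flip every run of opposing pieces enclosed between `move` and my pieces."""
    col, row = ord(move[0]) - ord("A"), int(move[1:]) - 1
    board[move] = me
    for dc, dr in [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                   (0, 1), (1, -1), (1, 0), (1, 1)]:
        run, c, r = [], col + dc, row + dr
        while 0 <= c < 8 and 0 <= r < 8:
            sq = chr(ord("A") + c) + str(r + 1)
            if board.get(sq, "empty") == them:
                run.append(sq)          # candidate pieces to flip
            elif board.get(sq, "empty") == me:
                for s in run:           # enclosed run: flip it all
                    board[s] = me
                break
            else:
                break                   # empty square: nothing enclosed
            c, r = c + dc, r + dr
```

The students found a pile of things shaped like the first two functions and nothing shaped like the third.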

I think there's a real dividing line here: the credulous are all about "it gives me what I want." The incredulous are all about "it doesn't give me what I need." And the incredulous are mostly being castigated while the credulous are being pandered to - why block out "things that look like Miyazaki" if it's all over Twitter? I mean... that video was certainly Batman-adjacent, why harp on the fact that it's hammered ass?

My daughter's classmates are absolutely using AI to do their homework. She's 12. My daughter isn't, because the point isn't doing the homework, the point is having fun with it. I think the real blessing and real danger of AI as it is currently envisioned is that it gives you the mediocre middle of everything you ask for. Which, if you're looking for mediocre friendships, a mediocre relationship, mediocre entertainment, mediocre prose, mediocre images? Saltman will hook you up for $8 a month. As you point out above, however, Google will do that shit for free, so... sorry, Sam. It was a former Google VP who first explained to me that Google's business model is to find a disruptable market and suck all the profit out of it so it can sell the eyeballs of everyone using it to advertisers - and if there's any company that can starve out some dipshit who wants a trillion dollars to plagiarize the Tower of Babel, it's Google.

    Socially, I think we need to move real fast to figure out guardrails or norms or anything before we get a perpetual Her situation for any mildly loner kid.

It's the mediocrity of the situation that gives me hope. The "holy shit this is fucking awesome" aspects of "AI" for me? Were all back here.

That shit is still there. Is still interesting. Still holds promise for all sorts of creative weirdness. It's a tool! But what we're sold is the mediocre middle. And the thing is? If being a passive potato makes you happy, the passive potato path now has more creativity to it. That's a big part of it for me - the use cases for AI, as envisioned by every douche in a suit, are some form of TPS report. The use cases for AI for every incel with a twitter account is some form of hentai waifu. I think you would have been fine if you were fifteen in 2025 because it would have taken you a few days to see there's no ghost in the machine.

I think the downfall of OpenAI will be that there's nothing it does that's worth the money. And that, really, is my basic beef: all these AI dipshits keep talking about how exciting it all is and it's boring as fuck, dude. It'll make a meme without you having to spend ten minutes in MS Paint. It'll vomit up a mediocre essay without you having to search for it. It'll give you a bunch of code that sort of works, it'll provide you with a bunch of forgettable fucking content.

I say "Loab" and that woman is staring right at you.

I think there's a helluva future for AI... once everyone lets go of LLMs and starts focusing on stuff that can build a world model instead of a bag of heuristics. I even think there's a helluva future for LLMs... once everyone lets go of paying Saltman $8 a month and starts focusing on training their own. Damn near every post here is an LLM stretched to breaking, and it's when it's stretched to breaking that it gets interesting.

https://hubski.com/domain/aiweirdness.com

There's fucktons of cool video games out there... and there's Farmville. And yeah, everyone played Farmville and everyone played it a little too much and then everyone moved the fuck on. And that's kinda where I'm at right now. Ain't nothing wrong with video games but Farmville is fucking boring and I judge you if you play it. Ain't nothing wrong with artificial intelligence but mass-trained LLMs are fucking boring and I judge you if you think otherwise.

I think most kids aren't boring enough to get suckered into Farmville. I think the people who got suckered into Farmville are the ones who sucked at Zelda. Yeah, they should get to play video games too but it's really fucking stupid to act like Farmville is the second coming of video games just because all the dipshits who think Comic Sans is okay are impressed by it.


veen  ·  15 days ago

    It's the mediocrity of the situation that gives me hope.

At risk of circling back on points already made… it’s the same mediocrity that isn’t giving me hope, because the mediocrity is not determined by our sense of taste but by the information set processed to create the model. If LLMs are mostly stochastically picking something in the middle of what they know/have seen, that means the middle is defined by the model’s inputs (& training & heuristics) and, crucially, not by our judgement of what the model outputs. In other words: the middle of what range does it produce? It used to be that Midjourney could produce at best a mediocre, deep-fried jpeg of, say, Will Smith - one that was recognizable but not much more. 4o can clearly produce a mediocre Miyazaki. There is a mark of progress in that jump - the median of blurry jpegs is objectively worse than the median Miyazaki frame. Similarly, the models have evolved from mediocre code noob to mediocre CS grad to mediocre junior engineer. Without ever doing anything other than seeking the middle of the road, there have been steps of progress towards smaller, more professional, better niches to produce mediocrity in.
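
To put that in code - a deliberately silly sketch, nothing like an actual model - the ‘middle’ moves whenever the inputs move, no taste required:

```python
# Toy sketch: the "middle of the road" is a property of the corpus,
# not of our judgement of the outputs. Corpora and labels are made up.
from collections import Counter

def mediocre_output(corpus):
    """Return the most common item - the model's middle of the road."""
    return Counter(corpus).most_common(1)[0][0]

blurry_era  = ["deep-fried jpeg"] * 7 + ["passable portrait"] * 3
current_era = ["passable portrait"] * 2 + ["mediocre Miyazaki"] * 8

print(mediocre_output(blurry_era))   # deep-fried jpeg
print(mediocre_output(current_era))  # mediocre Miyazaki - better middle, same method
```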

Now, it’s no replacement for Miyazaki. And I’m pretty sure younger me would’ve realized quickly that there is no ghost in the shell. But if I’d chosen to accept that lower quality, that mediocrity, I might at least have had someone to talk to about my day or my feelings.

A lot (most?) of the frustration surrounding AI right now comes from other people choosing to ignore or accept the mediocrity because they get something out of it. The other day I got a document to review from my procurement specialist. After a while I realized he’d given me a largely AI-generated document. At that point I’d already spent a good half hour rewriting the text. I felt… betrayed and a bit surprised. Isn’t this your job you’re not doing? but also How did I not notice sooner?

In the meeting to discuss it I confronted him with “dude, if you’re handing me slop you should tell me”. He said he prefers doing it this way because it means he needs only two or three revisions instead of five-plus to get to v1.0 of the procurement document. So he got something out of it (speed) and we’d be rewriting it anyway. If he’d been upfront about it I think I’d have been on board, because the truth is that I now regularly use the same workflow: getting the AI to write the shitty first draft so I can get to v1 in 3 hours instead of 6.

Yeah, I played Farmville for a bit because I’m not actually a gamer. It was fun for a few weeks! Then I got bored. But at no point was I thinking “boy, if only I were playing a better game right now”, because mediocrity is often just passable enough that you don’t second-guess what you’re doing. And I think that’s why I’m worried. I feel like you and I have a pretty good grasp of the technology and its edges - enough to assess when to use it, when to doubt it, and when to absolutely not touch it with a ten-foot pole. What about the rest, though?

Maybe another analogy here is that of ultra-processed foods - they too give you what you want (tastes and textures) but not what you need (a varied and healthy diet). Now, I might walk through a Kroger thinking about how bad most of these products are, how uninteresting they taste, but most people will still load their carts full of them, won’t they? The mediocrity is not the saving grace there, and it feels like it won’t be with AI either.

kleinbl00  ·  15 days ago

    it’s the same mediocrity that isn’t giving me hope, because the mediocrity is not determined by our sense of taste but by the information set processed to create the model.

This is important because the money in LLMs is choosing mediocrity. The Tay Filter on everything is so extreme that it walls off the hard edges pretty much everywhere (except Facebook's, which are simply awful). And it's important because humans will settle for mediocrity, they won't choose it. A bunch of mediocre fake Miyazakis don't hold a candle to a single real Miyazaki and, more importantly, the fake Miyazakis are worthless without the context of the real Miyazaki. Tastemakers are a thing, they just are.

I know sound. Sound is good for technological analogies because it's pretty much always been the cutting edge of filmed entertainment. Cinematographers like to believe it's all about the image, but sound without picture is story; picture without sound is b-roll. Doctor Who looks like ass but the sound effects are still used today. You want a great analogy? One of the early successes of Apple's App Store was a $1.99 download called "I Am T-Pain" that let anybody sound like... well, T-Pain. But I Am T-Pain doesn't exist without Antares Auto-Tune, which was never really used for much until that fateful day when Cher and her production team decided to push it past its point of pain.

I don't even need to link the video. I don't even need to include a picture. You know immediately what I'm talking about because that one song changed ten years of music.

You want another example? I've got a phat stack of Pro Tools and I can use it. It allows me to do things with ridiculous ease. Some of the stuff I do is creative, and I'm proud of it. I've got presets published in some Eventide plugins. But if I were sitting down at the Moviola and the Steenbeck with my razor blade and tape, I'd be a lot slower. And I don't know if I'd ever come up with this:

Ask any stand-up comic or comedy writer and they'll tell you: humor is always found on the edge. If you aren't coming right up to the line without crossing it, you can't be funny. You have to be creative enough to go places most people wouldn't, for the simple reason that humor is a response to discomfort - if you want to generate humor, you have to generate discomfort. Not so much that the discomfort overshadows the laughs? But you need a grain of sand to make a pearl.

You know that hackneyed, awful bass drop that was in every trailer for a thousand years? Patient Zero was Sicario, and it was dope.

But every attempt after that was basically "I wish I was watching the trailer to Sicario." The thing is? AI can do the tenth bass drop. It can't do the first. It can't even do the eighth. The current implementation of LLMs requires any new trend to have decayed into a trope before it's part of the training data. Story time!

I worked on a big dumb reality show that went live to the Internet. It was pretty rock'n'roll because there were approximately fifteen seconds between whatever colossal fuckup I made in the studio and a minimum of 30,000 viewers (often millions). You had to be on your fucking game, and we were. We were all on our fucking game. The one time I wasn't (technical difficulties - a mismatch between the way our patchbay was configured and the way I thought it was configured), my 4-second screwup racked up 2m views on Vine. Launched a conspiracy theory. It sucked.

I worked on another big dumb reality show that went live to the Internet. It was on a different network, run by twits, all of whom eventually and deservedly lost their jobs. And rather than deal with "oh shit, we're 15 seconds from infamy," that network built the Mother of All Still Stores - a rack processor that sat there, ingested six HD streams and held them in a buffer for fifteen minutes. This was so that the producers could hear a transgression, call the network, have the network assemble a tiger team to mull over the transgression, deliver a decision, and have the verdict delivered to the team on the ground in time to decide whether to hit the button or not. Except of course that didn't work - have you ever tried to get a meeting together in fifteen minutes? So mostly the show didn't air to the internet. They spent $4m on a chunk of rack the likes of which the world hadn't seen before, and ultimately anything vaguely controversial ended up embargoed. The fans were pissed because they were paying $30/mo for content they weren't allowed to watch.
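
Conceptually, that rack was nothing but a giant delay line. A toy sketch - my made-up class and numbers, not their $4m box:

```python
# Toy broadcast-delay buffer: hold every frame for fifteen minutes, air the
# old ones unless someone hits the embargo button. Sizing is illustrative.
from collections import deque

FPS = 30
BUFFER_FRAMES = FPS * 15 * 60  # fifteen minutes of video

class DelayLine:
    def __init__(self):
        self.buf = deque(maxlen=BUFFER_FRAMES)
        self.embargoed = False

    def ingest(self, frame):
        """Push the live frame in; get the fifteen-minute-old frame out."""
        aged = self.buf[0] if len(self.buf) == BUFFER_FRAMES else None
        self.buf.append(frame)  # a full deque evicts the oldest automatically
        return None if self.embargoed else aged
```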

OpenAI is the network run by twits. So are all the rest of them. The line between "creativity" and "controversy" isn't a line, it's a synonym. And so long as the principal goal of the system is "create shit that we can't get sued out of existence for," it will never serve up anything but hackneyed bullshit.

But is it good enough?

    In the meeting to discuss it I confronted him with “dude, if you’re handing me slop you should tell me”. He said he prefers doing it this way because it means he needs only two or three revisions instead of five-plus to get to v1.0 of the procurement document. So he got something out of it (speed) and we’d be rewriting it anyway. If he’d been upfront about it I think I’d have been on board, because the truth is that I now regularly use the same workflow: getting the AI to write the shitty first draft so I can get to v1 in 3 hours instead of 6.

The real question is what parts of the procurement document exist because they're vital and what parts exist because they prop up other, less-vital aspects of bureaucracy. Bureaucracy is valuable and bureaucracy is vital because it is a stabilizing influence on authority - without bureaucracy, Trump would have plunged the United States into darkness weeks ago. But bureaucracy is also a scaffold of self-reinforcing rules whose whole purpose is the continuation of the status quo. Make-work, in other words. It's the problems a Bitcoin miner solves to keep the network running. It's the decks being swabbed. And sometimes the swabbies get to use a Roomba, and sometimes they don't.

My architect submitted my permit documents without sending them to me first. As a consequence they're full of all sorts of useless boilerplate bullshit like "install 110v smoke detectors in all bedrooms per IFC blah blah" while also limiting the scope to the area without bedrooms. This is because my architect is a nincompoop who got totally fired. She knows what a permit set should look like but apparently she's never actually put one together. That boilerplate? When I used it I knew what every fucking thing was there for and I knew if it applied. Her? She's a dipshit Microserf who freelances on the side and she doesn't have a fucking clue.

When I copypasta my boilerplate? I know what the fuck I'm doing. So when it counts, I'm getting through permit. When she copypastas her boilerplate, all she does is cause me problems. Because now I have to explain that bullshit to the inspector. It's like Air Canada's marketing department letting an AI loose on their phone tree, only to discover that the AI will happily hand out bereavement fares. Would a combined team of legal, financial, IT and PR have come up with a better system? You damn betcha. But the bus is being driven by the fucktards.

    Maybe another analogy here is that of ultra-processed foods - they too give you what you want (tastes and textures) but not what you need (a varied and healthy diet). Now, I might walk through a Kroger thinking about how bad most of these products are, how uninteresting they taste, but most people will still load their carts full of them, won’t they?

They'll buy what they can afford. The basic beef with the whole of the organics industry is that it's too expensive and too much of a profit center, and boo hiss, you're sitting there slurping Newman's Own while the proles are stuck in a food desert. That's a whole problem, don't get me wrong. But if the average consumer couldn't distinguish on taste and quality there'd be no organic food. There'd be nothing left but Chef Boyardee.

People can distinguish between mediocre and superior and they will choose superior, all else being equal. It is my contention (and Ed Zitron's) that the whole reason AI hype has gotten this far is that there's a whole-ass media wing credulously mimeographing the assertion that it's not mediocre, it's superior - despite the utter dearth of quality in every fucking thing AI does.

usualgerman  ·  14 days ago

    The OthelloGPT world-model story faced a new complication when, in mid-2024 a group of student researchers released a blog post entitled “OthelloGPT Learned a Bag Of Heuristics.” The authors were part of a training program created by DeepMind’s Neel Nanda, and their project was to follow up on Nanda’s own work, and do careful experiments to look more deeply into OthelloGPT’s internal representations. The students reported that, while OthelloGPT’s internal activations do indeed encode the board state, this encoding is not a coherent, easy-to-understand model like, say, an orrery, but rather a collection of “many independent decision rules that are localized to small parts of the board.” As one example, they found a particular neuron (i.e., neural network unit) at one layer whose activation represents a quite specific rule: “If the move A4 was just played AND B4 is occupied AND C4 is occupied, then update B4 and C4 and D4 to ‘yours’ [assuming the mine, yours, or empty classification labels]”. Another neuron’s activation represents the rule “if the token for B4 does not appear before A4 in the input string, then B4 is empty.”

I’ve never understood why this is a problem. This is how all thinking actually works. When I’m solving a problem, I’m not inventing a solution de novo every time I do it. I’m using heuristics. Thing X generally leads to thing Y, and therefore I need to do A, B and C to correct for it. That’s a heuristic. It’s also how we predict the weather: low pressure meets high pressure means rain is likely, so wear a poncho or carry an umbrella. We’d call that thinking, but it’s basically using heuristics: X -> Y, requiring solution A. When I solve math equations, it’s nothing but procedures and heuristics: PEMDAS as the order of operations, the math operations being basically procedures. And when using mathematics to solve a problem, you’re basically deciding which heuristics of mathematics to use. Even political predictions are based on heuristics gleaned from history: X, Y and Z happen during the rise of revolutionary thinking; therefore, if you see them, predict revolution.
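
My weather example, written out as code - a toy condition-action rule and nothing more:

```python
# A heuristic is just a condition-action rule: X -> Y, therefore do A.
# (Toy rule from my weather example - obviously not a meteorological model.)
def weather_heuristic(low_pressure_meets_high: bool) -> list[str]:
    if low_pressure_meets_high:          # X generally leads to Y (rain)
        return ["wear a poncho", "carry an umbrella"]  # so do A
    return []
```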

To be frank, even a world model is basically a systematically constructed bag of heuristics. The religious worldview: God exists, gave us rule book X, and those who follow rule book X get rewarded; therefore, do what rule book X says. The secular worldview replaces rule book X with principles derived from science and neoliberalism, but the basic building blocks are the same: these heuristic principles lead to good outcomes, thus doing them is a good idea. It covers more domains than your ad hoc heuristics as AI is using them today, but the difference isn’t the approach, it’s the scale. A person living by the Torah or the KJV Bible is still using heuristics to figure out how to live; the question is scale, as the sacred book in question covers everything, where AI tends to be bound to whatever applicable training sets it was given.

kleinbl00  ·  14 days ago

    This is how all thinking actually works.

It is very much absolutely positively 100% not.

    When I’m solving a problem, I’m not inventing a solution de novo every time I do it. I’m using heuristics.

You are using LONG TERM heuristics. Your world model is of the world, generalized across your life-long experience. The ruleset for Tai Chi and the ruleset for ballroom dance have overlap with the ruleset you learned skipping rope and the ruleset you learned playing hopscotch. The quote you listed above illustrates that there are no heuristics that even apply to the whole board of Othello. The LLM didn't even learn that all squares on the board are equal.

    PEMDAS as the order of operations, the math operations being basically procedures.

This, again, is a world model. If you train an LLM on math, it will not come up with PEMDAS. It will come up with thousands of patchwork rules covering individual numbers, because there is no methodology to Markov chains that gives you an overall picture.
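
Here's that contrast in toy form - my sketch, not a probe of any actual model:

```python
# World model vs. bag of heuristics for arithmetic. Toy code.
import ast, operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def world_model_eval(expr):
    """PEMDAS as one general rule: recurse on the parse tree, any numbers."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

# Bag of heuristics: rules welded to the specific numbers seen in training.
MEMORIZED = {"2+3*4": 14, "7*8": 56}  # patchwork; nothing carries over

def bag_of_heuristics_eval(expr):
    return MEMORIZED.get(expr)  # None the moment you leave the training set

print(world_model_eval("2+3*5"))        # 17 - generalizes to unseen numbers
print(bag_of_heuristics_eval("2+3*5"))  # None - "no training data for that"
```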

THIS IS THE IMPORTANT BIT. It all goes back to autocomplete on your keyboard, which all goes back to Robert Mercer, which all goes back to Renaissance Technologies, which all goes back to Markov chains, which CAN. NOT. BE. complete sets.

"Informally, this may be thought of as, "What happens next depends only on the state of affairs now."

Your heuristics are "here's how to play chess." The LLM's heuristics are "if this was the last move, here's the list of legal next moves," times literally every possible permutation of the board. Your heuristics of "here's how to play chess" can be extrapolated to "here's how you would probably play 3D chess" and "here's what the rules for 'battle chess' might be" and "here are the similarities and differences between chess and checkers." The LLM's heuristics are "I have no training data for that" three ways.
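
The Markov property in miniature - a toy bigram chain, my sketch:

```python
# Toy Markov chain over chess openings: the next state depends ONLY on the
# current state. No board, no rules, no plan - just "what followed what."
import random

transitions = {
    "e4":  ["e5", "c5"],  # everything the chain "knows" about e4
    "e5":  ["Nf3"],
    "c5":  ["Nf3"],
    "Nf3": [],            # nothing ever followed Nf3 in training
}

def next_move(current_state):
    options = transitions.get(current_state, [])
    return random.choice(options) if options else None  # no data, no answer
```

Nothing in that table can tell you what chess is, and no amount of adding rows changes that.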

    To be frank, even a world model is basically a systematically constructed bag of heuristics.

The key phrase there is "systematically." That makes it a set of heuristics. They are interrelated, interdependent and extensible. They are portable from situation to situation and they can be generalized. The word "bag" is used instead of "set" because there is no system. There are no interrelations, there is no interdependency and there is no extensibility. LLMs never learned that there are only five fingers on a hand; six-fingered men had to be hand-coded out of the model. LLMs never learned about perspective; perspective had to be (painstakingly) coded into the model. There is no adaptability to LLMs. In order to solve their blind alleys, they have to be hand-coded around them - you cannot teach an LLM "do not reproduce copyrighted material," you have to give it a laundry list of the parts of its training data it cannot reproduce within a certain percentage match. It cannot go "I must not draw Mario, therefore I must not draw Luigi."
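
What that laundry list looks like in practice is something like this - my toy version, not any lab's actual filter:

```python
# Toy guardrail: you can't teach the model the *concept* "don't reproduce
# copyrighted text," so you bolt on an n-gram overlap check against a
# blocklist of specific protected passages. Threshold is made up.
def ngrams(text, n=5):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def too_close(output, protected_texts, threshold=0.3):
    """Flag output whose 5-gram overlap with any protected text is too high."""
    out = ngrams(output)
    if not out:
        return False
    return any(len(out & ngrams(p)) / len(out) > threshold
               for p in protected_texts)
```

Note what it can't do: nothing in there generalizes from Mario to Luigi.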

    It covers more domains than your ad hoc heuristics as AI is using them today, but the difference isn’t the approach, it’s the scale.

It's not the scale, it's the approach. Any creature that thinks will create generalizations. That's how T-mazes work: will the rat associate the left turn with a reward, and carry that left-turn lesson to the next maze? LLMs do not create generalizations; they create ad hoc frameworks to report the stochastic mean of the problem in front of them right now. Here's Manhattan as mapped by an LLM:

Will it navigate? No. Will it give mostly-accurate turn-by-turn directions based on the inputs and outputs of the training data? Yes. But it will NEVER generalize to "street intersections are almost always ninety degrees."