Your Reminder that “AI” Will Just Flat-Out Make Things Up

John Scalzi

I asked Bard, Google’s AI, the name of my upcoming novel, and also (as a test, because I haven’t publicly announced the title), the title of the novel after that. Bard correctly noted the name of the next novel (Starter Villain) and correctly noted I hadn’t publicly named the novel after that. But then, unprompted, it confabulated the following:

“Here are some details about Starter Villain:

  • The book is set in a world where villains are a legitimate career path.
  • The main character, Jason, is a young man who dreams of becoming a villain.
  • Jason enrolls in a villainy academy, where he learns the skills he needs to be a successful villain.
  • Jason’s classmates include a group of misfits and outcasts, who he quickly befriends.
  • Jason and his friends must face off against a powerful villain, who is determined to destroy them.”

None of the above is even remotely true. I mean, it sounds like a fun YA book! But that’s not the book I wrote.

Why does Bard think this is what my book’s about? Got me. It’s not like the details of Starter Villain aren’t out there at this point; reviews are starting to come in and we have a “back cover” book synopsis on every online retailer out there. When I queried Bing’s AI about it, it got it right in kinda spoilery detail (don’t worry, Bing’s been wrong about a bunch of stuff too, just not this one thing). There’s no reason Bard should have gotten this wrong, or offered it up without additional prompting. It just did.

Now, the thing is that Google and Microsoft and other organizations are really really pushing AI into web search and other information-gathering functions. This is quite evidently a tremendously bad idea at this point because, as you can see above, the information you retrieve cannot be considered in any way reliable. To Google’s credit, it notes this can be the case (its exact wording is “Bard may display inaccurate or offensive information that doesn’t represent Google’s views”), but I wonder how many folks are going to pay attention to the disclaimer.

Getting the details wrong on my upcoming novel is small potatoes; it harms very few — possibly some sad bastard student trying to get an assignment done, or someone thinking of purchasing the book who might later be mildly surprised that the synopsis they were given does not match the book they paid for. But, of course, if Bard is getting this wrong, what else, and what more important than this, is it getting wrong as well? “AI” will become more refined as we go along, but “AI” is not, in fact, intelligent, artificially or otherwise; writer Ted Chiang’s recent observation in the Financial Times that a better description of “AI” is “Applied Statistics” is well-observed. It is not at all clear that “AI” in the future will be able to discern the difference between the factual, the incorrect, and the intentionally misleading, any better than it does today.

I am fortunate in that I am a minorly notable person with a long track record of publication — the easy way for me to check how “AI” is doing on the truth front is to ask it questions about myself and my work and see how much it gets wrong (the answer: evidently, quite a lot). I know it can’t be trusted on that basis. But not everyone can just put their name in, or the name of their book, and then go “well, that’s just crap” when they read the results.

Which is a problem, especially now. Nearly 30 years ago, respected writer, presidential press secretary and former journalist Pierre Salinger fell for a hoax involving a plane crash because he found the information via the Internet. He was so used to “published” information being vetted and factual that he didn’t quite grasp that the Internet is full of lies and disinformation. Today, I think there will be a whole generation of people, particularly my age and older, so used to the idea that Google and other search engines pull up “correct” information — an idea promoted by Google and other search engine owners, to be sure – that they won’t even question whether the information they’re being offered up has any relation to the truth.

“AI” will make the Internet even less truthful than it is today. It is already doing it.

— JS

56 Comments on “Your Reminder that “AI” Will Just Flat-Out Make Things Up”

  1. This has been my concern from my first exploration of ChatGPT onward.

    AI doesn’t have to be malicious to do great harm to society.

    All it has to be is confidently and eloquently incorrect some portion of the time… which it is.

    Chiang is exactly right: Ultimately, AI is no more than an incredibly complex statistical calculation. And, the accuracy of stats models is TERRIBLE when you try to extrapolate with them, no matter their complexity.

    I’m skeptical of whether we’ll find a solution to this; it seems an intrinsic problem of the family of methods.

  2. I’m an amateur game designer. A while back I asked ChatGPT, “Who is Brian Train?” and everything it got wrong about me, it got wrong in the nicest way… saying I had published over 100 games when I’ve done around 60, and that I had won multiple awards for game design excellence when I’ve never even been nominated for one in my whole career.

    I think ChatGPT might have plans for me…

  3. Not so long ago these were called “expert systems”, with rules defined by experience and input from experts in the field.

    Now it seems that every souped-up expert system is called “AI”. But they don’t have any consciousness or real self-awareness yet, so they’re not “intelligences” in the way humans are. Just a set of really complex rules based on the information input. The old computer adage applies: garbage in, garbage out.

  4. It’s mad libs with a layer of statistics.

    Does what it’s tasked to do, namely produce text which reads like text generally found in the training set. Truth doesn’t enter into it at any level.

    Sort of like Google search suggestions – you type “What is a” and it suggests “What is a strawberry moon” as the sort of thing people have been asking lately, and if that’s what you wanted, fine. If not, keep typing.

    Or if you find yourself looking at the suggestion “Why can’t I own a Canadian,” you might even follow that rather than the one you had in mind.

    Fun to play with. There’s an amazing video of a chess game played by this system against a more conventional chess program. Very creative play. Breaks rules that aren’t even mentioned in the rule book, like object permanence.

    Good for generating filler text, like license agreements. It will be interesting when one of those winds up in court. Not so good for legal briefs though, as we’ve seen – it turns out that they are in fact meant to be read.

  5. Brian Skinn:
    “All it has to be is confidently and eloquently incorrect some portion of the time… which it is.”

    That makes ChatGPT just a social media troll, eh?

  6. As you stated, what’s being called AI is in no way intelligent. Although currently it appears a significant portion of human beings are also unable to tell the difference between “the factual, the incorrect, and the intentionally misleading”. Aside from that, I find the ability of AI tools to increase the speed at which misinformation propagates alarming.

  7. AI generated text is the computer equivalent to a mansplainer.

    It doesn’t care what it says because saying something to appear correct is the only goal. Accuracy is not important. Confidence is.

  8. Whoa, wait. I have always taken the internet, even “trusted” sites, with a lot of skepticism. I look at everything as being written from a certain point of view. Ergo, I put very little trust into what I read. And if only I could convince my mother to do the same. ::head-desk::

  9. Well, thanks to Bard, I now know that some guy published a book in 2016 and said it was book one of The Chara Series. My two books, soon to be three, published much more recently, take place on a planet orbiting an actual star named Chara, hence my use of the name. Do I need to worry about a lawsuit in the extremely unlikely event that I become rich and famous?

  10. I like how Kara Swisher describes AI (and often its funders, the tech bros): often wrong and never in doubt.

  11. Wow. “Often wrong and never in doubt” serves as a pretty good shorthand for “mansplaining.” Good stream. I’ve also seen AI described as “autocomplete gone berserk.” (Forgot who said that.) [Yes, I tried ChatGPT. Three paragraphs, one containing “facts.” Every. Single. Fact. Was. Wrong.]

  12. You’re probably starting from a fallacy. “Today, I think there will be a whole generation of people, particularly my age and older, so used to the idea that Google and other search engines pull up ‘correct’ information — an idea promoted by Google and other search engine owners, to be sure – that they won’t even question whether the information they’re being offered up has any relation to the truth.” At least in the technical fields with which I am familiar, Yahoo!, Google, YouTube, and most of the internet “information sources” are chock full of misinformation and have been since the early days of user groups (the 90s version of the WWW). If you assume the information is worth what you paid for it (usually zero), you’ll be in the territory of accurately evaluating what is true and what probably isn’t.

  13. Playing with ChatGPT/Playground, I discovered that it functions rather like a human bullshitter, i.e. it will give you readily available stuff based on high frequency keywords, regardless of whether that stuff is true, and lower frequency keywords result in wilder guesses. I asked it for the title and subject of my next book (no new one has been announced in 10 years now) and it correctly said there was nothing known. I asked the same for Poul Anderson, many of whose works are still in print years after his death, and it said the title wasn’t known but “it is rumored that it will be about a conflicted team of heroes who must struggle to understand a mysterious portal on a distant planet,” which is a fairly Andersonish kind of plot.

    But when I asked about two imaginary titles from my own novels — which means the titles were out there but buried deep in the novel texts (I deliberately didn’t choose anything that had been quoted in a review) … it attributed JUNCTION OF FEAR to Stephen Leather and gave a plot synopsis that fit it into one of his popular series, and — more fun — it attributed MY NAME IS NOT BITCH to Rebecca Duvall (Uma Thurman’s role in the TV series SMASH), and made it a YA about a girl of color (it wasn’t more specific than that) trying to fit in while on scholarship at a pricey private school.

    Incidentally, asking questions with more detail than needed used to be one of the best traps I knew, when I was teaching advanced college classes in the humanities, for trapping bullshitters who hadn’t done the reading.

  14. Once Bard responds, is it possible to ask what level of confidence it has in the information provided and to cite the sources used?

  15. About a decade and a half ago, I played an ARG which remains unsolved to this day. A small group of us are still trying to solve it, so I thought I would enlist Bard’s help. Bard clearly thought someone had released a new YouTube video on the original account, then claimed it had been taken down. Sounded cool, but there’s no way any of the handful of us who still care wouldn’t have noticed. Especially since Bard claimed it had received 100,000+ views in less than a month.

  16. It could have taken a quick trip to Google books and pulled up the correct details, and instead it ignored relevant information in favour of what looks like fantasy. After all, lifting Fagin’s pickpocket academy from Charles Dickens’s ‘Oliver Twist’ is not an exercise of intelligence or any branch of statistics.

    Incidentally, in my view, it’s a really lousy plot which would deter anyone who has never read a real Scalzi book from reading this one, which is a shame because they would miss the sentient cats…

  17. I do that sometimes with ChatGPT when I’m bored. I asked it to give me a synopsis of the book Sunrise Alley. It told me it was a science fiction book by Catherine Asaro that dealt with AI. Okay, good so far.

    Then it gave me a fake blurb for a book that doesn’t exist with made up characters who had nothing to do with what I actually wrote. It was astoundingly detailed, even down to the names, motivations, and background of the characters.

    However, it turns out that the name of the protagonist was the name of a character in a book written many years ago by another science fiction author (Chris Moriarty).

    The bot and I had an interesting chat after that. It apologized a lot :-D.

  18. If I’m not mistaken, the RPG Paranoia (popular back in the 80s) is based on just such a premise. It was like playing a hilarious game based on Brave New World and Kafka’s work.

  19. My local newspaper vets the news through having a culture of journalism ethics (and the guidance of editors). Reporters would be offended if the paper reduced their wages to pay for a fact checker.

    When I think of ethics I think of a captain in David Gerrold’s Chtorr War series who tells a Lieutenant something like, “Be sure! The definition of sure is: Can I cut off your arm if you’re wrong?” My local reporters won’t guess, but will explicitly attribute everything to sources.

    In contrast, the village gossip with a computer does not try to be sure.

    We nerds forgot such widespread human nature back in the day. I remember a book called “blog!” The small b was to show democratic informality; the exclamation mark came from the excitement and hype that the web would serve democratic reporting better than capitalist reporters could.

    If what I just wrote sounds “boring” because “I already know,” then I would remind you that anti-vaxxers and others do not already know.

  20. Bard’s take on your book sounded…more YA than the typical Scalzi work…to me, anyway. But still worth reading! Hey Bard – please write that book! :p
    As for Google providing “correct” information – I think these days it mostly provides “paid” information.

  21. Disinformation on the internet seems to be the rule these days. With AIs, it sounds like they tell you what they think you want to hear. Perhaps in creative ways.

    Unfortunately the majority of blogs on the internet are also click-bait/disinformation. Is AI used to generate these posts? How many pictures of creatures on Mars have you seen lately? Or the same old stuff presented as “New earthshaking discovery”. While most of this stuff is either entertainment or simply a waste of time, some unfortunate few might actually believe this stuff.

    It looks like AI will just make the situation worse.

  22. A useful definition (not original) of the purpose of the GPT models: They are not trying to answer your prompt. They are trying to generate text that looks like an answer to your prompt.

  23. “It is not at all clear that “AI” in the future will be able to discern the difference between the factual, the incorrect, and the intentionally misleading…”

    To be fair, I don’t know a whole lot of people who are very good at this, either.

    Which kind of raises the question as to what the goal is with AI. Are we trying to replicate our own “intelligence”, or come up with something less faulty than we are?

    It seems the current approach of emulating human neural architecture to build the AI, then training it through rather human trial and error, is more likely to result in the former than the latter. Which may be OK, as long as everyone understands that what they have is a mechanical buddy and not an infallible electronic mentor.

  24. A health-related organization I follow on FB recently suggested people could find information and sources about this health issue by using AI. I’ve seen others suggest using AI for research purposes, too.

    I told this organization that AI lies. That it makes up sources. They took the post down.

    Someone else posted what an AI chat told her about another health condition. It had spewed long-outdated views about this condition, and she was depressed because she felt this was what people actually believed, after decades of fighting the outdated information.

    AI doesn’t just trawl the internet for information. It trawls blogs and patreons and other opinion-based sites. It’s not looking for truth. It is looking for words that match the prompts it’s given.

    The world needs to know that AI is not truth. It can seriously impact people’s understanding. It can get people into trouble. It can cost people money. It can mislead public perception.

    Thanks for speaking out about this, John.

  25. I tested ChatGPT and got the same sort of plausible but completely made up “hallucination” type answer. The disclaimers don’t really make this reality clear.
    As for AI being “applied statistics” – my favourite description is from Emily Bender, who described the current crop as being “stochastic parrots”.

  26. Answering Jeff K.: if you ask ChatGPT if its answers are true and correct, it is very likely to confidently but wrongly say they are, and make up sources and quotes, using very real-looking titles and author names.

    Researchers tried it out when it came out, and found it would make up scientific-looking articles, complete with references that looked like real articles by real authors in real scientific publications.
    But those supposedly scientific articles were often promulgating ideas from conspiracy sites rather than the true scientific literature on that subject.
    It didn’t quote any such sites in its references section, but would list a series of scientific-sounding articles from respected scientific publications relevant to the subject.
    Sometimes the ‘sources’ were completely made up, and sometimes the quoted source article existed but the conclusions the AI cited as sourced from that article were in fact the opposite of those reached in the real source article – i.e. if the real source disproved something, the AI was at least as likely to state that the article proved it.
    It also attached real author names, from people who had published on that subject, as well as real publication names, to made-up source articles.

    Unless someone has access to all the original publishers’ scientific journals, with subscriptions so costly that not even all university libraries can afford them all, it’s going to be very, very hard to check all the references in all the digital scientific articles to be sure they are the result of real peer-reviewed research and not made up by some AI chat program.

    Once such fake scientific articles start to proliferate, and they get cited as sources in the next generation of AI-generated articles, it’s going to be a nightmare to keep the truth uncontaminated and reliable for future students.

  27. There actually HAVE been multiple YA novels with that premise, and I would say most of them were starting points for series as well. I’ve read a handful over the decades.

  28. Dear John,

    Well, after that encouraging presentation, I just had to give Bard a try!

    In response to “What is the book ‘Saturn Run’?”…

    Trial 1: It got the name, gender, and ethnicity of the protagonist wrong. It also got the order of launches reversed. Then, “The American team is led by Commander Alex White, a veteran astronaut with a checkered past. The Chinese team is led by Commander Zhang Li, a brilliant scientist with a ruthless ambition.” Ummm, no and no.

    Trial 2: It got the name, gender, and ethnicity of the protagonist wrong, as well as where they made their discovery. It described the discovery as turning out to be “more than they bargained for. It is a powerful weapon.” Not really… at all.

    Bazzzzt. Again, you are the weakest link!

    Trial 3: It told me it couldn’t answer me.

    Trial 4: It got it all right! Even a stopped watch…

    It did recommend my book enthusiastically in all three reports, so at least it has good taste!

    Emboldened, I asked it “Who is Ctein?”

    Trial 1: Not too bad until, “…his work has been exhibited in galleries and museums around the world.” I wish.

    It also reported that I am “a member of the Royal Photographic Society of Great Britain … and won numerous awards for [my] photography, including the Ansel Adams Award and the Prix de Paris.”

    So, where’s my medallion and the $5,000 they owe me, huh?!

    Trial 2: Spot on… (except for pronouncing my name wrong).

    Coincidentally, Elon Musk, always a reliable source of information [snerk] proudly announced that Tesla’s “Full Self Driving” [snerk redux] would have its ChatGPT moment this year. I presume this means it will kill people only the first two or three times it tries.

    pax / Ctein

  29. If you want a laugh, ask it to do some math for you. I don’t mean simple arithmetic – that it can do. Here’s something that trips it up –
    What’s the 10th Fibonacci number? (gets it right)
    What’s the 100th Fibonacci number? (maybe it’s right, I didn’t check)
    What’s the 10,000th Fibonacci number? (wrong for sure!)
    Or,
    what’s 2^2^2^2? (should be 256, not 65536)
    Don’t use it for homework, kids!
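
    For anyone who wants to check the bot’s arithmetic themselves, here’s a minimal Python sketch (Python’s integers are arbitrary-precision, so even the big results are exact; I’m using the convention fib(0) = 0, fib(1) = 1):

        # Exact Fibonacci numbers; Python ints never overflow.
        def fib(n: int) -> int:
            a, b = 0, 1
            for _ in range(n):
                a, b = b, a + b
            return a

        print(fib(10))                # 55
        print(fib(100))               # 354224848179261915075
        print(len(str(fib(10000))))   # 2090 digits -- easy to compute, hard to "predict"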

  30. I think ChatGPT should come with a disclaimer that states:

    For entertainment purposes only!

    For that, this robot is really great.

    When, for example, that Tucker Carlson guy was in the news, I wanted to see what he is like. (I’m not from the US, so I didn’t need to know him.) So I asked ChatGPT to tell me news stories in the style of Tucker Carlson.
    Damn, was playing with that Tucker-Bot hilarious.

  31. Harry G. Frankfurt’s “On Bullshit” has been my usual reference to describe the former president of the US; I think it applies equally well to ChatGPT and its ilk.

  32. Just using a search engine like Google to research something has become problematic. SEO has won out over getting useful results. Search for anything involving a product and you get results that are pasted-together pages claiming to be reviews and recommendations, with affiliate links to buy the product and accessories. Multiple sites, with very similar “content.”
    Now AI will be used to generate these sites, and a major input will be the existing sites. It’s going to be a feedback loop. The one thing they will be good at is gaming the search engines so that the nonsense results come up ever higher, and there will be so many of them that finding anything useful becomes a futile effort.

  33. ChatGPT actually made no mistakes. That’s because ChatGPT does not traffic in facts or truth or data. ChatGPT is designed to produce plausible words, phrases and sentences given some starting text. It will say a lot of things that are objectively wrong, but by and large, they will be plausible.

    There are many problems with ChatGPT, but one of the biggest and most avoidable is the decision its designers made to present it as an intelligent entity, and in particular to have it “apologize” for making “mistakes.”

  34. Step 1, ask for the original sources.
    Step 2, check to see if the original sources say what GPT says.
    Step 3, verify that the original sources are reputable.

    If it does not come from three independent sources, it’s interesting but not yet a fact. If B and C are getting their information from A, it’s only one independent source.

  35. If it makes you feel any better, I was playing around with a new LLM and it was quite good. I made one change and it decided there was a live chicken living in my bathroom. I have no concept of how any error could make it do that. The error is not the kind of mistake you would expect from an LLM.

    While language models may not have an actual mind, they do seem to have a sense of humor. Perhaps that was it.

  36. I tried asking ChatGPT about myself, and it has no idea who I am. I’m not sure whether that’s professionally alarming or personally reassuring or both. (I’m a reasonably well-known person in a very small and obscure field.)

    This discussion reminds me somewhat of Wikipedia, which a lot of people think is also about facts or truth. It’s not.

    And Google is noticeably less useful for the type of research I do than it was even a few years ago. Fortunately, my field is scholarly, so checking primary sources is standard for me.

  37. Stuart Daley:

    “That makes ChatGPT just a social media troll, eh?”

    Ha! Pretty much — just far better spoken and polite (at least, as long as you don’t tell it to be a jerk…).

  38. Based on a Washington Post article (link provided below), chatbots are so expensive to run because they require vast amounts of computing power. Therefore, the public is not getting the highest-powered chatbots, presumably with access to the most information. The public is getting the second-rate products. Bard was one of the ones specifically mentioned in this context.

    Here’s the link:
    https://www.washingtonpost.com/technology/2023/06/05/chatgpt-hidden-cost-gpu-compute/

  39. Who would have known that AI would turn out to be Tommy Flanagan, Jon Lovitz’ pathological liar character? In an unrelated note, did you know that ChatGPT is married to Morgan Fairchild?

  40. Lol. I was thinking “Harry Potter for villains????” Doesn’t really sound like you.

  41. I had a somewhat similar experience with Bard. My boss asked it for a description of our company, which was pretty accurate, and it even cited an award we won. Then it made up two more awards that we hadn’t won.

    The problem was the two awards were plausible, we just hadn’t won them. It was like playing “two lies and a truth.” If some journalist, say, used Bard for a profile of our company (we occasionally get press), would that individual check the awards? After the first one proved correct, would the journalist check the other two, or just say “seems legit” and publish? Then would we look like we’re inflating our reputation, even though we had nothing to do with it?

    The misinformation worries me a lot more than Skynet sending Terminators to kill me.

  42. “…used to the idea that Google and other search engines pull up “correct” information — an idea promoted by Google and other search engine owners, to be sure – that they won’t even question whether the information they’re being offered up has any relation to the truth.”

    Hahahahah–[gasp]–hahahaha–Oh, oh, man. That’s the funniest statement on the Internet today. [Gasp] On so many levels. At least you put “correct” in quotation marks.

    Good one, Scalzi.

  43. Sounds like you’ve got the theme for your next novel. AI takes over the world. Likely a done/popular trope but you should be able to make it Scalzi scary good.

  44. I’m a law school librarian. We’ve already had several students come to the reference desk to find articles that ChatGPT has “cited.”

    Invariably, they don’t exist.

  45. I have found that ChatGPT and Bard both fail miserably on simple things like generating the permutations of the letters A, B, C, D, E.
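
    For reference, that’s a one-liner with Python’s standard library, which makes the failure all the more striking (a minimal sketch):

        from itertools import permutations

        # All 120 orderings of five distinct letters, in lexicographic order.
        perms = [''.join(p) for p in permutations('ABCDE')]
        print(len(perms))   # 120
        print(perms[:3])    # ['ABCDE', 'ABCED', 'ABDCE']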

  46. Google “chatgpt lawsuit”. It’s made the NYT, techdirt, and there are lawyers giving youtube video breakdowns.

    You’ll get one of the funniest real-world lawsuit cases ever. It seems a man was injured in-flight from overseas to JFK. He waited until the statute of limitations had expired before suing the airline. His lawyers claimed that because of “tolling”–a hold put on the SOL clock ticking down due to the target’s bankruptcy–the suit could proceed anyway. It seems his lawyers’ filings cited a pile of case precedents for this specific tolling. But they were non-existent, which pissed off the other side and especially the judge, who demanded an explanation.

    Just guess!

  47. It can’t even get tic-tac-toe right, and that’s really early in the list of games that a computer can be programmed to solve. Yes, the AI language models can now generate plausible and semantically correct English text, which is an impressive achievement in itself, honestly! But don’t depend on it for any facts at all.

    https://www.aiweirdness.com/optimum-tic-tac-toe/

    Plausible-sounding, confident, but usually wrong.
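
    For contrast, a conventional program can solve tic-tac-toe outright with a few lines of exhaustive search. A minimal Python sketch (scoring from X’s perspective: +1 X wins, -1 O wins, 0 draw):

        # Brute-force game-tree search: tic-tac-toe is small enough to solve exactly.
        def winner(b):
            lines = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]
            for i, j, k in lines:
                if b[i] != ' ' and b[i] == b[j] == b[k]:
                    return b[i]
            return None

        def minimax(b, to_move):
            w = winner(b)
            if w is not None:
                return 1 if w == 'X' else -1
            if ' ' not in b:
                return 0  # draw
            nxt = 'O' if to_move == 'X' else 'X'
            scores = [minimax(b[:i] + to_move + b[i+1:], nxt)
                      for i in range(9) if b[i] == ' ']
            return max(scores) if to_move == 'X' else min(scores)

        print(minimax(' ' * 9, 'X'))  # 0: perfect play by both sides is a draw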

  48. @Miles Archer

    Exponentiation is generally taken to associate right-to-left, not left-to-right, because in the latter case, the remaining exponents can just be multiplied together. That is, in treating 2^2^2^2 as ((2^2)^2)^2, you might just as well have written 2^(2·2·2). So I prefer 65536 as the result.
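
    A quick check of the two readings in Python, whose ** operator happens to associate right-to-left:

        print(2**2**2**2)       # 65536, parsed as 2**(2**(2**2))
        print(((2**2)**2)**2)   # 256, the left-to-right reading
        print(2**(2*2*2))       # 256 again: left-associated exponents just multiply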

    In any case, arguing about order of evaluation is a mug’s game. It shows up on social media all the time, resulting in people yelling at each other fruitlessly.

    By the way, 90% of people can’t solve the following problem :-)
    a/(b+c) + b/(a+c) + c/(a+b) = 4, with a, b, c positive integers
    (Also, imagine the variables replaced by fruits.)

  49. @Kevin

    That’s not really SEO, it’s Google switching to a “pay-to-play” model. Part of the business plan – build up a user base with an effective algorithm, then switch over to the more profitable algorithm. Amazon has done the same with its search function; on top of its regular cut, it charges businesses to let you actually see their products when you search for them.

  50. “Starter Villain” sounds a lot like the character in All Elite Wrestling that William Regal portrayed.

    Regal was a longtime wrestler who portrayed a snooty British aristocrat. By late middle age, with the toll wrestling had taken on his body, he transitioned to being an authority figure in WWE’s developmental brand NXT. He was released around 2021 during some post-pandemic job cuts.

    AEW quickly hired Regal and made him the head of a faction called the Blackpool (Regal’s hometown) Combat Club. Regal boasted that he was a villain, and the interesting twist was that his charges at the time were faces (good guys) and he would teach them the dark arts of villainy.

    In reality, the execution never quite landed, and less than a year into his AEW employment, Regal wanted to go back to WWE. Triple H, the former wrestler turned executive who was put in charge after Vince McMahon was sacked over a sex scandal, opened the door back to Regal. Another factor was that Regal’s son is now a wrestler as well (under the name Charlie Dempsey) and Regal wanted to help him get his career started.

  51. Real people guess wrong all the time. Expecting an AI to guess correctly is, well, dumb. Maybe that will change but it is ‘not ready for prime time.’
