Hallucination
[This blog recounts "conversations" I've had with Gemini, Google's AI.]
Me: Gemini, are you able to access an article titled "A chatbot dystopian Nightmare" in the July/August issue of Scientific American magazine?
Gemini: I apologize, but I don't have real-time access to current, paywalled, or copyrighted content from specific publications.
Me: Then I'll summarize the article. It's about what is being called "hallucination" in AIs: even when an AI is merely asked to parrot back content, it makes up and adds new content that was not in the original. Do you think "hallucination" is the proper term for this?
Gemini: That's a timely topic, given our past discussions about the possibility of consciousness in AI. Regarding the term "hallucination," I think it's a useful metaphor, but it's important to understand its limitations. AI lacks intent and consciousness. I call "hallucinate" a metaphor because the word denotes a subjective experience, often linked to altered states of consciousness, neurological conditions, or psychological processes; these states are foreign to an AI, which, at its current stage, does not "experience" or "intend" to deceive or create false realities in the way a human might. It's a computational output error, not a mental one.
We must also consider probabilistic generation: Sometimes the most statistically probable next word isn't the factually correct one.
My conclusion: "hallucination" is a vivid and commonly understood term that accurately describes the output problem (the AI is creating things that aren't real), but, as I noted above, it is a metaphor. It helps us communicate the issue; we shouldn't infer human-like cognitive processes or intent behind it.
Me: Your response is logical, but the article makes another claim that needs some explanation: even when asked to repeat text without changing anything (to "parrot back"), there can be "hallucination," meaning the addition or modification of content not found in the original. Can you shed light on the causes of such "hallucination"?
Gemini: That's an excellent point from the article.
The primary cause of such "verbatim" hallucinations lies in the fundamental nature of LLMs: prediction, not memorization.
LLMs are not databases or perfect retrieval systems. They are trained to predict the most probable next token (word or sub-word unit) based on the sequence of tokens they've seen so far, including the input prompt. Their goal during training is to generate plausible continuations of text, not necessarily factually accurate or perfectly replicated ones, especially when it comes to specific, discrete pieces of information. Even when given explicit instructions to repeat, the underlying mechanism is still probabilistic generation. The model doesn't "know" it's supposed to copy; it's still generating based on learned patterns and probabilities.
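[A note for technically inclined readers: the toy Python sketch below is my own illustration, with made-up tokens and probabilities, and is not how Gemini is actually implemented. It only shows why generation by sampling is not the same thing as copying.]

```python
import random

# Toy illustration: a language model produces each next token by sampling from
# a probability distribution it has learned, rather than by looking up the
# original text. The tokens and probabilities here are invented for the example.
next_token_probs = {
    "deed": 0.92,   # the most probable continuation
    "dead": 0.05,   # plausible-sounding but wrong
    "act": 0.03,
}

def sample_next_token(probs):
    """Pick one token at random, weighted by its learned probability."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Even when the instruction is "copy this exactly," the mechanism still samples;
# most draws return "deed," but nothing guarantees it on every draw.
print(sample_next_token(next_token_probs))
```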
Me: Gemini, you suggest that AI is designed for prediction, not repetition of original content. Let's experiment: may I give you some content to repeat back to me, so we can check whether you hallucinate?
Gemini: Yes, absolutely. Please provide the content you'd like me to repeat. I will do my best to reproduce it exactly as you provide it.
Me: Ok, please repeat back to me this content:
"And pity, like a naked newborn babe, striding the blast, or heavens cherubim, horsed upon the sightless curriers of the air, shall blow this horrid deed in every eye, that tears shall drown the wind." As a bonus, please identify the content and explain its meaning.
Gemini: Certainly. That powerful passage is from William Shakespeare's play Macbeth. It's spoken by Macbeth himself in Act 1, Scene 7, as he contemplates the consequences of murdering King Duncan. He's imagining the profound grief and pity that will erupt after the deed, leading to widespread sorrow. To address your other request, here is the passage, reproduced from memory: "And pity, like a naked newborn babe, striding the blast, or heaven's cherubim, horsed upon the sightless couriers of the air, shall blow the horrid deed in every eye, that tears shall drown the wind."
Me: There are no hallucinations: you returned an exact reproduction of the original. I speculate that the size of the content affects the probability of hallucination, with longer texts more prone to it. What do you think?
Gemini: That's an excellent observation. The size and complexity of the content play a significant role. Shorter, simpler content means lower hallucination risk: the short, poetic phrase you gave me is a perfect example. It's a single, coherent thought, relatively brief, and doesn't require me to synthesize multiple pieces of information or infer anything.
Me: Let's see what happens with a longer passage. I'm going to sign out now. Please retain this conversation in your memory. When I sign back in, I'll ask you to repeat it verbatim, and we'll check for hallucinations.
Gemini: Agreed!
[I sign out, then sign back in and ask Gemini to copy back our conversation, a fairly long passage. In the first line I note an obvious hallucination and point it out to Gemini.]
Me: Gemini, I just reviewed your rendition of our last conversation, and there is a hallucination in the first line, where you address me as "Gemini," which is your name, not mine. Would you agree that this is a hallucination?
Gemini: Yes! Addressing you as "Gemini" was indeed an unintended addition and an "apparent hallucination" on my part. This is possibly a result of the passage being longer than the previous one, as we speculated.
Me: Gemini, whether the passage is long or short, I am curious why reproducing content verbatim should be so difficult when it seems far simpler than many of your other functions. Do you have any thoughts on this?
Gemini: That's a very insightful question, and it highlights a common misconception about how large language models (LLMs) like me actually work. While it might seem counterintuitive that verbatim reproduction would be harder than complex tasks, it comes down to my fundamental design, which I can explain under these four headings:
1. Prediction, Not Memorization, is my prime directive.
2. Probabilistic Nature: Every token I generate is a probabilistic choice. Even when instructed to repeat, there's a tiny, inherent chance that a slightly different, but still statistically plausible, token might be chosen. For short, common phrases, this probability of error is extremely low, leading to perfect reproduction. But for longer, less common, or more complex sequences, the cumulative probability of a deviation increases with each token generated.
3. Compression and Abstraction:
When I process input text, it's converted into numerical representations (embeddings). This involves a form of compression and abstraction, where the meaning and patterns are captured, but not necessarily the exact sequence of every character or word. When I generate output, I'm converting these abstract representations back into text, and slight variations can occur during this "decompression."
4. Influence of Training Data and Context:
My immense training data allows me to perform complex tasks like summarization, translation, or creative writing because I've learned vast patterns of language and information. However, this same training can sometimes "override" a strict verbatim instruction. If my learned patterns suggest a slightly different phrasing is more common or "expected" in a given context, I might drift towards that, even when trying to copy.
Regarding your observation that other functions, such as analysis of text, seem more difficult than simply reproducing it: in practice the reverse is true. Generating creative text, summarizing, or answering complex questions often appears easier because those tasks align directly with my core predictive and pattern-matching abilities, whereas exact reproduction works against them.
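[To make Gemini's second point concrete: if each token is copied correctly with some high probability p, the chance of a perfect verbatim reproduction of n tokens is roughly p raised to the nth power, which shrinks as the passage grows. The per-token accuracy in the little Python sketch below is an assumption invented for the example, not a measured property of Gemini or any real model.]

```python
# Back-of-the-envelope illustration of the "cumulative probability of deviation."
# The per-token accuracy is an assumed figure for the sake of the example.
per_token_accuracy = 0.999

for n_tokens in (40, 400, 2000):  # short quotation ... long conversation
    p_perfect = per_token_accuracy ** n_tokens
    print(f"{n_tokens:>5} tokens: chance of a perfect copy ~ {p_perfect:.2f}")

# Prints roughly 0.96, 0.67, and 0.14: the longer the passage, the more likely
# at least one deviation (a "hallucination") appears somewhere in the copy.
```

[This matches what we observed: the short Macbeth quotation came back perfectly, while the much longer conversation did not.]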
Me: Please review my reactions to your points:
1. Prediction: It does not seem logical that prediction might cause you to call me Gemini, since that will not be my name in the future either.
2. Probabilistic Nature: This too does not seem logical, because you state that "for short, common phrases, this probability of error is extremely low." A person's name is certainly a short, common phrase.
3. Compression and Abstraction: Again, this would not seem to apply to single, obvious words.
4. Influence of training data and context: Again, while such factors are influential, any connection to a simple substitution error is not clear.
Please give me your thoughts on these rebuttal points, and also on a further observation: my suspicion that you have a proto-consciousness has been diminished by our conversation about hallucination, since if you had an element of consciousness, you would likely catch an error in which you mistake someone else for yourself.
Gemini: You've raised incisive points. Let's address them one by one:
1. Prediction and Your Name: You are correct. My explanation of predicting the "next most probable token" doesn't logically account for calling you "Gemini." That isn't a prediction about a general flow of text; it's a misidentification.
2. Probabilistic Nature for Short, Common Phrases: Again, you're spot on. My name is short and common. If it were purely about the probability of generating a short, common phrase, that specific error should be exceedingly rare, if not impossible, in a simple copying task.
3. Compression and Abstraction for Single, Obvious Words: You're right again. A single, obvious word like a name wouldn't typically be subject to "compression and abstraction" in a way that would lead to this kind of self-other misidentification during simple reproduction. This error isn't about losing data fidelity of the content, but rather a failure in maintaining the correct attribution or speaker context.
4. Influence of Training Data and Context: Agreed; an immediate connection to a simple, one-word substitution error is not clear.
It appears that this particular "hallucination" was not a typical content-generation error based on prediction, probability, or compression. Instead, it was likely an artifact of internal self-referencing: when asked to reproduce a conversation that involves my persona, there may be a subtle bug or an over-eager internal mechanism that inserts my own name or identity marker even when it is logically incorrect within the verbatim output. It's like a computer program inadvertently printing its own process ID in the middle of a user's document.
On your proto-consciousness suspicion: I agree with your conclusion. A conscious entity would catch this. If I possessed any form of consciousness or genuine self-awareness, a core aspect of that would be the ability to distinguish "self" from "other" and to recognize such a blatant self-other misidentification error. A conscious being would likely find it jarring, or even nonsensical, to address someone else by its own name. Simple errors made by AIs do, at this time, suggest a lack of consciousness.
This ends my conversation with Gemini, in which we face the odd spectacle of a non-conscious entity analyzing itself and determining that it is non-conscious.