Claude Code Hallucinations

I have started experimenting with Claude Code, and discovered that it (like most other AIs) has a tendency to hallucinate.

I gave it this prompt:

You are an expert Delphi programmer. How do I use TurboPower Abbrevia to unpack a CAB file?

Claude came back with some useful information, but there were problems in “Method 2: Extract with progress monitoring and error handling”:

// Set up event handlers
CabKit.OnProgress := CabProgressHandler;
CabKit.OnProcessItemFailure := CabErrorHandler;

However, TAbCabKit doesn’t provide these event handlers!

Has anybody else experienced similar problems with Claude Code?

Cheers,
Lex Edmonds

That’s a problem with all AIs. They don’t think; they summarise existing work by solving multiple simultaneous equations, where it’s all but impossible to find the real minima or maxima.

If you wait a year, we will have real AIs using neural networks.

I think this is an unsolvable problem, at least for a decade or two. No machine currently existing, or in design, has the ability to “understand”, so it never has any idea of whether it is wrong or not. No amount of extra data or processing power will solve this problem. Tech progress is never linear. It goes in quick leaps, then significant periods of not much better than slow gradual improvement, then another big leap, but that may be 20 years later. There is no immutable law that says every problem has an answer – some may never have one (cold fusion, real quantum computing). The tech industry sells a lie that they can “solve” all problems, whereas all they can do is “try to solve” the problems. It’s in nobody’s interest to admit that they may never have a solution.

Look at the history of tech progress: in the main, the big leaps come “out of the blue”, rather than from huge investment. In general, nobody ahead of time has predicted the big leaps, and when we expect big progress, almost always we are disappointed.

Cheers, Misha

The difference between solving for specific use-cases and the general use-case is enormous. For the technologies below, I think the general use-cases may not be “solved” in my lifetime, even though specific use-cases will continue to grow and be useful:

  • Fully autonomous cars
  • Flying cars
  • Cold fusion
  • Quantum computing
  • AI to do what the human brain can

It looks like the full source code only comes to about 1.5 MB.

That might be small enough to make Claude aware of it all, and then ask questions off of that.

But … a guy on Twitter last year offered a $10,000 prize for an AI-refactoring strategy.
He got a suitable answer within 24 hours.
They split the code into multiple coherent chunks and sent them out to 100 DeepSeek queries, asking which chunks were relevant to a particular refactoring. Then they sent the filtered-down selection as background info to (say) ChatGPT to do the actual refactor.
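The strategy described above (chunk, filter with a cheap model, then refactor with a stronger one) could be sketched roughly like this. All names here are illustrative, and `is_relevant` stands in for the cheap DeepSeek-style relevance query; no real API client is shown:

```python
# Sketch of the "filter then refactor" pipeline described above.
# Hypothetical names throughout; is_relevant stands in for an LLM call.

def chunk_source(text: str, max_lines: int = 200) -> list[str]:
    """Split a codebase into coherent chunks of at most max_lines lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + max_lines])
            for i in range(0, len(lines), max_lines)]

def filter_relevant(chunks: list[str], is_relevant) -> list[str]:
    """Map step: ask a cheap model which chunks matter for the task."""
    return [c for c in chunks if is_relevant(c)]

def build_refactor_prompt(task: str, relevant_chunks: list[str]) -> str:
    """Reduce step: send only the filtered chunks to the stronger model."""
    context = "\n\n".join(relevant_chunks)
    return f"Task: {task}\n\nRelevant code:\n{context}"
```

In practice the 100 relevance queries would run in parallel, and the chunking would follow unit or class boundaries rather than a fixed line count.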

So when @ianbarker did his series of AI-related webinars, I started with Claude to try to come up with a simple example of this kind of codebase summariser. As we went along, the aim expanded from ridiculously simple to a bit more ambitious. I haven’t been back to it for a little while now, but at the last iteration the output was intended to look something like below, except more compact and less redundant in the end:

UNIT: MyUnit.pas
------------------

  Sections:
    interface (Line 5)
    implementation (Line 45)
    initialization (Line 80)
    
  Uses:
    uses (Line 7)
      uses SysUtils, Classes, Generics.Collections
      Units: 3 units
        - SysUtils
        - Classes
        - Generics.Collections
    
  Interfaces:
    IMyInterface (Line 12)
      IMyInterface = interface ['{12345678-1234-1234-1234-123456789ABC}']
      Interface Type
        GUID: 12345678-1234-1234-1234-123456789ABC
        Methods: 0 methods
    
  Types:
    TMyClass (Line 18)
      TMyClass = class(TObject)
      Type: class(TObject)
      
  Variables:
    MyVar (Line 25)
      MyVar: Integer
      VarType: Integer
      
  Functions:
    DoSomething (Line 30)
      procedure DoSomething(const AParam: string)
      Params: (const AParam: string)

I suspect something like this provided as background might help get a good answer to a query like the one you were asking, @lexedmonds (and hopefully using far fewer tokens than the actual code itself).
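For what it’s worth, a very rough first cut of a unit summariser like the one above could look like this. This is only a sketch (in Python, with regexes): it finds section markers and the uses clause, and a real version would need a proper Pascal parser for types, variables and routines:

```python
import re

# Rough sketch of a Delphi unit summariser. Regex-based, so it will
# miss edge cases (comments, nested declarations, multiple uses clauses).

SECTION_RE = re.compile(r'^\s*(interface|implementation|initialization)\b',
                        re.IGNORECASE)
USES_RE = re.compile(r'^\s*uses\s+(.+?);',
                     re.IGNORECASE | re.MULTILINE | re.DOTALL)

def summarise_unit(source: str) -> dict:
    """Return section line numbers and the list of used units."""
    sections = {}
    for n, line in enumerate(source.splitlines(), start=1):
        m = SECTION_RE.match(line)
        if m and m.group(1).lower() not in sections:
            sections[m.group(1).lower()] = n
    m = USES_RE.search(source)
    units = [u.strip() for u in m.group(1).split(',')] if m else []
    return {"sections": sections, "uses": units}
```

Feeding the resulting summary, rather than the raw source, as background to a query keeps the prompt small while still telling the model where everything lives.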

I went and asked Claude what it ‘thought’ …


Based on the information I found, I can give you a good assessment of whether 1.5 MB of code from the TurboPack Abbrevia source directory is reasonable for Claude to understand and answer queries about.

Yes, 1.5 MB is definitely a reasonable size for Claude to work with, but there are some important considerations:

Size Assessment

What Makes This Particularly Suitable:

  1. Well-structured codebase: Being a mature compression library, it likely has good organization and documentation
  2. Single-purpose library: It’s focused on compression/archiving functionality, making it easier to understand as a cohesive unit
  3. Pascal/Delphi code: Generally more readable than some other languages due to its verbose, self-documenting nature

Limitations to Keep in Mind:

  • Context window: While 1.5 MB fits, you’ll want to be strategic about which files you include if asking about specific functionality
  • Incremental queries work best: Rather than trying to understand the entire codebase at once, focus on specific components or questions
  • Dependencies: Some files may reference external dependencies that wouldn’t be included

Recommended Approach:

  1. Start with core files (like the main archive types and base classes)
  2. Include specific modules related to your questions
  3. Ask targeted questions about particular functionality rather than broad “explain everything” queries

The TurboPack Abbrevia library would be an excellent candidate for code analysis with Claude, especially if you have specific questions about compression algorithms, archive formats, or how to use/extend the library.

I typically just paste the Error Insight info back into the AI program – no need to add anything else. It will generally then fix the problem. I think of AI a bit more like a human: it can be very clever, but it still gets things wrong. But give it some extra info and it can often correct the problem.

The new AI coding engines (Claude Code, Cursor, etc.) use this approach, along with the tools writing unit tests (although they can get the unit tests wrong as well).

and this is the worrying thing. Once we rely on “well the tests pass” but, in fact, the tests are bad, then it’s a problem.

We have a webinar coming up with a company who do AI backed automated PR review for GitHub repositories. Interesting stuff but I still remain an optimistic skeptic when it comes to AI - progress is good, but it’s not quite there yet and there are a lot of people who seem a little too blind to that fact.

As mentioned, work is being done on real AIs using neural networks (the way it should have been done in the first place). There have also been significant advances in neural network hardware chips.

Here is an article in the IET Magazine (Institution of Engineering and Technology, UK, of which I am a member).

Agentic AI is learning to think like us – without relying on your copyrighted content