That's a problem with all AIs. They don't think: they summarise existing work by solving multiple simultaneous equations, where it's all but impossible to find the real minima or maxima.
If you wait a year, we will have real AIs using neural networks.
I think this is an unsolvable problem, at least for a decade or two. No machine currently in existence, or in design, has the ability to “understand”, so it never has any idea of whether it is wrong or not, and no amount of extra data or processing power will solve that. Tech progress is never linear: it goes in quick leaps, then significant periods of little more than slow gradual improvement, then another big leap, which may come 20 years later. There is no immutable law that says every problem has an answer; some may never have one (cold fusion, real quantum computing). The tech industry sells a lie that it can “solve” all problems, whereas all it can do is “try to solve” them. It's in nobody's interest to admit that they may never have a solution.
Look at the history of tech progress: in the main, the big leaps come “out of the blue” rather than from huge investment. In general, nobody has predicted the big leaps ahead of time, and when we expect big progress we are almost always disappointed.
The difference between solving for specific use-cases and the general use-case is enormous. For the technologies below, I think the general use-cases may not be “solved” in my lifetime, even though specific use-cases will continue to grow and be useful:
It looks like the full source code only comes to about 1.5 MB.
That might be small enough to make Claude aware of it all, and then ask questions based on that.
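Whether 1.5 MB fits depends on how it converts to tokens; a rough sketch, using the common (approximate) heuristic of about 4 characters per token, gives a ballpark figure. The ratio is an assumption and varies by language and tokenizer:

```python
# Rough token estimate for a codebase, assuming ~4 characters per token.
# This heuristic is an approximation, not an exact tokenizer figure.
CHARS_PER_TOKEN = 4  # assumed average; real ratios vary by content

def estimate_tokens(size_bytes: int) -> int:
    """Return a rough token count for `size_bytes` of source text."""
    return size_bytes // CHARS_PER_TOKEN

print(estimate_tokens(1_500_000))  # prints 375000
```

So 1.5 MB is in the region of a few hundred thousand tokens, which is why being selective about what you send still matters.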
But … a guy on Twitter last year offered a $10,000 prize for an AI-refactoring strategy.
He got a suitable answer within 24 hours.
They split the code into multiple coherent chunks and sent them out to 100 DeepSeek queries, asking which chunks were relevant to a particular refactoring. Then they sent the filtered-down selection as background info to (say) ChatGPT to do the actual refactor.
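The chunk-and-filter strategy above can be sketched roughly as below. This is a hypothetical illustration, not the winner's actual code: `read` and `ask_model` are placeholder callables standing in for file I/O and a (cheap) model API call.

```python
# Hypothetical sketch of the chunk-and-filter refactoring strategy:
# 1) split the codebase into coherent chunks,
# 2) ask a cheap model which chunks are relevant to the task,
# 3) send only the survivors to a stronger model for the refactor.
from typing import Callable, List

def chunk_source(files: List[str], read: Callable[[str], str],
                 max_chars: int = 8000) -> List[str]:
    """Split each file's text into chunks of at most max_chars characters."""
    chunks = []
    for path in files:
        text = read(path)
        for start in range(0, len(text), max_chars):
            # Prefix each chunk with its source path for context.
            chunks.append(f"// {path}\n" + text[start:start + max_chars])
    return chunks

def filter_relevant(chunks: List[str], task: str,
                    ask_model: Callable[[str], str]) -> List[str]:
    """Keep only chunks the model flags as relevant to the task."""
    relevant = []
    for chunk in chunks:
        answer = ask_model(
            f"Task: {task}\nIs this chunk relevant? Answer YES or NO.\n{chunk}")
        if answer.strip().upper().startswith("YES"):
            relevant.append(chunk)
    return relevant
```

The design point is that the relevance queries are cheap and embarrassingly parallel (hence the 100 simultaneous queries), while the expensive model only ever sees the filtered subset.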
So when @ianbarker did his series of AI-related webinars, I started with Claude to try to come up with a simple example of this kind of codebase summariser. As we went along, the aim expanded a bit from ridiculously simple to a bit more ambitious. I haven't been back to it for a little bit now, but at the last iteration the output was intended to look something like the output below, except more compact and less redundant in the end:
I suspect something like this provided as background might help get a good answer to a query like the one you were asking @lexedmonds (and hopefully use up far fewer tokens than the actual code itself).
Based on the information I found, I can give you a good assessment of whether 1.5 MB of code from the TurboPack Abbrevia source directory is reasonable for Claude to understand and answer queries about.
Yes, 1.5 MB is definitely a reasonable size for Claude to work with, but there are some important considerations:
The codebase appears to be written primarily in Pascal/Delphi (Object Pascal)
1.5 MB of source code is well within Claude’s context window capabilities
What Makes This Particularly Suitable:
Well-structured codebase: Being a mature compression library, it likely has good organization and documentation
Single-purpose library: It’s focused on compression/archiving functionality, making it easier to understand as a cohesive unit
Pascal/Delphi code: Generally more readable than some other languages due to its verbose, self-documenting nature
Limitations to Keep in Mind:
Context window: While 1.5 MB fits, you’ll want to be strategic about which files you include if asking about specific functionality
Incremental queries work best: Rather than trying to understand the entire codebase at once, focus on specific components or questions
Dependencies: Some files may reference external dependencies that wouldn’t be included
Recommended Approach:
Start with core files (like the main archive types and base classes)
Include specific modules related to your questions
Ask targeted questions about particular functionality rather than broad “explain everything” queries
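The “include specific modules” step above can be sketched as a simple keyword filter. This is a hedged illustration (the file names in the test are made up, not actual Abbrevia paths): pick files whose names or contents mention terms from your question, and send only those.

```python
# Hypothetical sketch: select only the source files related to a query,
# by matching keywords against file names and contents (case-insensitive).
from typing import Dict, List

def select_files(files: Dict[str, str], keywords: List[str]) -> List[str]:
    """Return paths whose name or text mentions any keyword."""
    hits = []
    for path, text in files.items():
        haystack = (path + "\n" + text).lower()
        if any(k.lower() in haystack for k in keywords):
            hits.append(path)
    return sorted(hits)
```

Even a crude filter like this keeps the prompt focused on the modules your question is actually about.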
The TurboPack Abbrevia library would be an excellent candidate for code analysis with Claude, especially if you have specific questions about compression algorithms, archive formats, or how to use/extend the library.
I typically just paste the error insight info back into the AI program, with nothing else added, and it will generally then fix the problem. I think of AI a bit more like a human: it can be very clever, but it still gets things wrong. Give it some extra info, though, and it can often correct the problem.
The new AI coding engines (Claude Code, Cursor, etc.) use this approach, along with having the tools write unit tests (although they can get the unit tests wrong as well).
And this is the worrying thing: once we rely on “well, the tests pass” but, in fact, the tests are bad, then it's a problem.
We have a webinar coming up with a company who do AI backed automated PR review for GitHub repositories. Interesting stuff but I still remain an optimistic skeptic when it comes to AI - progress is good, but it’s not quite there yet and there are a lot of people who seem a little too blind to that fact.
As mentioned, work is being done on real AIs using neural networks (the way it should have been done in the first place). There have also been significant advances in neural-network hardware chips.
Here is an article in the IET magazine (the Institution of Engineering and Technology, UK, of which I am a member).