Agentic AI

How does AI 'think'?

Max Corbridge
Cofounder
August 28, 2025
I recently found myself engrossed in an hour-long video on how AI models can ‘think’. It was a very timely YouTube recommendation, as this is something my team at Secure Agentics is actively researching. The video tackled some conundrums which I believe almost all of us have faced when working with AI and watching it ‘think’ through complex actions, despite always being told that all AI does is ‘predict the next word in the sentence’. More interesting than getting answers to those questions, however, was realising that even the teams working full-time in this area at the largest AI providers in the world still don’t fully understand how AI works, which gave me a lot of food for thought.

So, this week I decided to jump in at the deep end and try to explain how AI thinks in my own words, and talk through some of the thoughts I was left with after watching the video. Full disclosure: we’ll be staying as high level as the original video does, and I’m not going to pretend to be an expert in complex machine learning theory, but I do hope to leave you with a slightly clearer picture of how AI arrives at the conclusions it does, and what the current state of play is regarding our collective knowledge, or lack thereof, of why it does so.

Doesn’t AI just predict the next word?

When asked how AI works, many people will recount the infamous line ‘AI just predicts the next word'. This is something I’ve heard for years now, and it is still true today. Okay, fine…if I am asking it to complete the final word in my sentence then I can see how that explanation holds true. The same, perhaps, even for writing nonsensical but funny poetry. However, many of the use-cases I have for AI include researching novel topics, analysing detailed project plans, drawing novel conclusions from existing research and providing tailored recommendations in areas where I am sure I am the first person to ask.

So how does ‘it’s just predicting the next word’ make any sense?! How could it possibly just be predicting the next word when I can see it clearly ‘thinking’ through complex scenarios that even I struggle to grasp fully? Well, the answer is fascinating. Technically, it is true that all AI has ever really learned to do is predict the next word in a sentence. However, this is also a huge oversimplification of what it is doing. A tangible comparison would be saying that through evolution humans have only ever learned how to improve their chances of reproduction, which is ultimately the driver of evolution through the ages. Whilst much of human behaviour and achievement could theoretically be explained this way, it would feel like a gross simplification of the complexities of humankind, and of all that has happened over the last 2,000 years, to simply say that we were ‘just improving our chances of reproduction’. Left to our own devices we have developed political systems, societal structures, technology and poetry - all of which are complex structures that have afforded us new ways of looking at the world and, yes, improved our chances of reproduction.

To bring this back to AI, then, we need to think about how it was created. This will be, in itself, a gross oversimplification, but it is sufficient for now. AI was essentially trained by processing huge volumes of human-written text, and tasked with getting ever more accurate at predicting the next word in a sentence before reading it. When the examples are as simple as ‘the sky is …’ you can pretty quickly see how an AI system will learn to predict ‘blue’. However, when the examples were more complex, these rudimentary AI systems struggled to predict the correct next word. This is where it gets interesting. In order to improve their ability to predict the next word in increasingly complex contexts, AI (or, perhaps more accurately, machine and deep learning algorithms) self-developed methods for understanding human behaviour and concepts.
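
If you like to see things in code, here is a deliberately toy sketch of that training objective. Real models learn a neural network over sub-word tokens rather than counting whole words, so treat this purely as an illustration of ‘get better at predicting the next word from what came before’:

```python
# Deliberately toy illustration of "predict the next word".
# Real models learn a neural network over sub-word tokens; this just
# counts which word tends to follow another in a tiny made-up corpus,
# then predicts the most frequently seen continuation.
from collections import Counter, defaultdict

corpus = [
    "the sky is blue",
    "the sky is clear",
    "the grass is green",
    "the sky is blue today",
]

# "Training": for every word, count the words observed to follow it.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current_word, following_word in zip(words, words[1:]):
        next_word_counts[current_word][following_word] += 1

def predict_next(word: str) -> str:
    """Return the most frequently observed continuation of `word`."""
    counts = next_word_counts.get(word)
    if not counts:
        return "<unknown>"
    return counts.most_common(1)[0][0]

print(predict_next("sky"))  # -> "is"
print(predict_next("is"))   # -> "blue" (seen more often than "clear" or "green")
```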

For example, think about AI learning how to predict someone’s response in a conversation at a dinner party. At first, they might get by with simple patterns, like if someone says “cheers” you respond with “cheers.” But to keep predicting correctly in more complex conversations, they quickly need to understand humour, sarcasm, cultural references, and even people’s moods. Accurately predicting the “next word” suddenly requires building a mental model of social dynamics and human behaviour. AI works in a similar way: by getting better at predicting words in increasingly complicated contexts, it ends up learning far more than just vocabulary - it learns structures, relationships, and meaning.

This is where it gets a bit spooky. Through its training, AI developed countless ways of understanding these more complex concepts, but which ways, and how many, no one actually knows. The thing to remember here is that we didn’t ‘program’ AI to understand humour; we simply tasked it with predicting the next word, and in order to get better at doing that it taught itself what humour was. Even more spooky is the fact that we don’t even have a clear view of how it is using these lessons it has learned. What we can do is ask it to think of certain things and watch its ‘brain’ light up in different areas. Yes, think of it like an fMRI scan for humans. When we give it a task we can watch, with only limited accuracy, areas of the AI neural network fire. By watching which of these fire and when, we can start to map out what they are for.

For example, if we give AI the task of explaining some code, we may see certain areas of the ‘brain’ ‘light up’. Even though we gave it no instruction to find bugs in the code, just to explain it, what we see is that as the model processes the code it notices when there are mistakes, and a certain part of the brain lights up. We believe it is essentially storing these faults in case they are needed later on, but we can now also see, with some accuracy, which part of the brain is responsible for detecting bugs in code.
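
To make the ‘fMRI’ analogy a little more concrete, here is a rough sketch of the basic mechanism behind watching parts of a network fire: attach hooks to a model and record what each layer produces as an input passes through. The model below is a throwaway toy of my own invention, not the sophisticated tooling used in the actual research, so take it as an illustration only:

```python
# Rough sketch of the "fMRI for AI" idea: attach hooks to a model and
# record each layer's output as an input passes through, so we can see
# which units "lit up". The model here is a throwaway toy network; real
# interpretability research uses actual language models and far more
# sophisticated tooling (e.g. sparse autoencoders over activations).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 8),
)

captured = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Stash this layer's activations for later inspection.
        captured[name] = output.detach()
    return hook

# Register a hook on every Linear layer in the toy model.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_hook(make_hook(name))

# Run a (random) input through and see what each layer produced.
_ = model(torch.randn(1, 16))
for name, activations in captured.items():
    print(f"layer {name}: strongest activation = {activations.abs().max().item():.3f}")
```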

There are countless other cool examples mentioned in the video from around 12 to 18 minutes in, so I would urge you to go and watch it if you are interested. Another cool example mentioned there is that the brain lights up in similar ways when discussing the same concepts in different languages. This tells us that the internal thought process of AI is not in any one language, or even in any human language at all, but that it can then translate those thoughts into any number of languages.
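
You can poke at that language-agnostic idea yourself, in a very hand-wavy way, by embedding the same sentence in different languages with a publicly available multilingual model and comparing the resulting vectors. This looks at output embeddings rather than the internal features discussed in the video, so it is only a loose illustration, and the library and model name below are simply one common, public choice:

```python
# Hand-wavy demo of the language-agnostic idea: embed the same sentence
# in different languages with a multilingual model and compare vectors.
# This looks at output embeddings rather than the internal features the
# video discusses, so treat it as a loose illustration only.
# Assumes `pip install sentence-transformers` and this public model name.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

sentences = [
    "The sky is blue.",        # English
    "Le ciel est bleu.",       # French, same meaning
    "The stock market fell.",  # English, unrelated meaning
]
embeddings = model.encode(sentences)

# Same meaning in two languages: similarity should be high.
print(util.cos_sim(embeddings[0], embeddings[1]).item())
# Different meanings in the same language: similarity should be lower.
print(util.cos_sim(embeddings[0], embeddings[2]).item())
```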

Why does AI hallucinate?

One final example I’ll talk through before giving my takeaways is hallucination. If you’ve used AI a lot you will no doubt have encountered hallucination at some point or other. This is essentially where AI produces something that appears to be a completely reasonable response, when it is in fact totally made up. One example which comes to mind for me was when I was asking AI to help me build the correct syntax for a tool I was using. It read the documentation of the tool and gave me the necessary flags to use for my use-case. However, when running the tool I got a syntax error, which told me that I’d used a flag which didn’t exist. I went over to the documentation, compared it with my command, and sure enough the AI system had spotted a trend in the naming scheme for the flags (such as ‘output’ being controlled by ‘-output’ and ‘mode’ being controlled by ‘-mode’) and had just assumed that the flag I was after for, let’s say, ‘proxy’ was ‘-proxy’, when in fact this flag did not exist.

You can see how the AI arrived at this answer, but it was completely made up. So, why does it do this? Well, it comes down (yet again) to how AI was trained. Remember, during training the AI is trying to guess the next word. If models were restricted to only answering questions they were 100% certain about, they wouldn’t be able to say anything, as their nature is to try to predict the next word. When AI models were a shadow of what they are today and you asked the question ‘what is the capital of France’, we might have just gotten the answer ‘a city’. Over time it got better and more confident in predicting the answer to that question, perhaps ‘a French city’, and then eventually ‘Paris’. Throughout training, AI was just told to give its best guess at answering the question, which it got better and better at doing.

Where AI models became extremely confident in their best-guess answer, they were instructed to provide it as the answer; where they were not sure, they should back out of the question and not give an answer. So there are essentially two thought processes: what is the answer to the question? And am I confident enough in this answer to say it? If you are like me then you are already starting to see how hallucination comes about - sometimes the model gets that second step wrong in its effort to be useful, and it confidently says that it knows the answer to the question when, in fact, it made it up.
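
Here is a toy sketch of those two thought processes, with every answer and number invented purely for illustration. The point is simply that hallucination falls out of the second step going wrong: the model’s estimate of its own confidence is too high, so a plausible-looking guess gets stated as fact:

```python
# Toy sketch of the two thought processes described above: produce a
# best guess, then decide whether confidence is high enough to state it.
# Every answer and number below is invented purely for illustration;
# hallucination corresponds to the confidence estimate being wrong.

def best_guess(question: str) -> tuple[str, float]:
    """Return a (guess, estimated_confidence) pair for a question."""
    guesses = {
        "capital of France": ("Paris", 0.99),      # genuinely known
        "flag to set a proxy": ("-proxy", 0.90),   # plausible-looking, but made up
    }
    return guesses.get(question, ("no idea", 0.0))

def answer(question: str, threshold: float = 0.8) -> str:
    guess, confidence = best_guess(question)
    if confidence >= threshold:
        return guess           # stated as fact, whether right or wrong
    return "I'm not sure."     # back out of the question instead of guessing

print(answer("capital of France"))    # correct, and confidently stated
print(answer("flag to set a proxy"))  # confidently wrong: a hallucination
```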

The good news is that hallucination is becoming less frequent - especially compared with some of the old models back in the day, which struggled to string a sentence together - and the model providers are working on improving how that second question is answered.

Conclusion

So, where does this leave us? Well, I find this topic fascinating in and of itself. I don’t think I had fully appreciated just how ‘self-taught’ AI really is, and that even those working at the companies providing us all with AI don’t fully understand how it thinks and what systems it has developed. Putting my tin foil hat on, I also think this goes a long way towards explaining why there are so many initiatives to ensure that AI stays in line with our own goals and doesn’t go rogue on its own crusade. When you appreciate that we can’t currently fully understand what or how AI is thinking, it raises the question of whether we would even know it was ‘going rogue’ in the first place.

Secondly, and less cynically, I think it’s great that we have large teams of people working on understanding more about this. The video mentions that we currently understand roughly 20% of model behaviour, but that the aim is to get closer to 80-100% in the near future. The good news is that, unlike human brain surgery, seeing into the mind of AI is something we can do, and keep researching, around the clock, at scale and with increasingly powerful tools.

I hope you enjoyed my breakdown, as this was a newsletter I really enjoyed writing. I find this stuff incredibly interesting and, as mentioned, it is something we’re getting hands-on with ourselves in our own research at Secure Agentics. Much more to come on that down the line, but that is it for now and I’ll catch you next week.
