Seeing like a human
Human experience is innately structured in a way foreign to AI systems

In a previous essay, I illustrated some differences between humans and AI by contrasting the empiricism of David Hume with the ‘transcendental idealism’ of Immanuel Kant. Put simply, Hume believed that all human knowledge is derived from experience, while Kant argued that certain concepts or structures are hardwired into the way we experience the world.1
My argument was that we train generative AI in a Humean way. We feed it lots of data or experiences and let it derive all its knowledge from the data. However, we humans have essential concepts or structures hardwired into us, which means our cognition and our knowledge will be structurally different to that of any generative AI system.
Importantly, it is structurally different in ways that make it easier for us to have accurate knowledge of the world. These essential structures (or, to use Kant’s term, ‘categories’) form neat building blocks for the theories, stories or world models that make up our knowledge. We don’t have to start from scratch. There is important information about the structure of the world hardwired into our minds that we don’t need to learn, and this gives us an advantage over other potential minds or knowers.
Our experiences are inherently structured
In the previous essay I focused on causation as an example of one of these essential concepts or structures. However, I have found another philosophy essay on Substack, by Barnes, that provides a clearer example, even though it is on a different topic. The essay starts from a famous article called “What is it like to be a bat?”, in which Thomas Nagel argues that humans can never acquire certain types of knowledge, such as the subjective experience of what it is like to be a bat (or any other creature). Nagel picked this example to make a clear point. Humans see with their eyes while bats echo-locate with their ears, so we have fundamentally different ways of experiencing the world.
In the essay, Barnes critiques this core point of Nagel’s argument on the basis that there are blind humans who have taught themselves to echo-locate. Therefore, there cannot be as big a chasm between our subjective experiences and those of bats as Nagel assumed. The substance of that argument is not relevant here, but Barnes made a telling observation along the way.
We tend to assume that we see space. That is, our understanding of spatial dimensions is derived from the way we see with our eyes. However, as Barnes observes:
Congenitally blind individuals report structured, oriented, navigable space. Not a stream of tactile contacts but rooms with corners and centers, hallways with length, environments with layout.
Barnes’ point is that we tend to assume our understanding of the world is tied to a particular kind of sensory input. Instead, we have an underlying structure that we use to organise that input, whatever form it takes. As Barnes puts it:
spatial consciousness does not require vision. The felt sense of inhabiting a structured world (surfaces at distances, paths between them, a body moving through) can be achieved through non-visual means. The maps exist without pictures.
In my words, this shows that humans are hardwired to think of or imagine the world in three-dimensional space. We cannot help but make sense of the world in this way, regardless of what our sensory experiences are. This is exactly the point on which Kant’s ‘transcendental idealism’ was based. There are structures in our minds that structure our experiences. And, interestingly, space is one of the transcendental categories, or hardwired structures, that Kant included in his list.2
AI lacks these innate structures
The fact that we humans are hardwired to experience the world in three spatial dimensions is incredibly useful. Given the world (as far as we know) has three spatial dimensions, our minds are structured in a way that matches the world we live in. It’s a bit like cheating. We don’t have to work to understand space and figure it out from scratch, as our minds already work in the right way. In turn, this innate structure provides a foundation or logic for us to use as we build theories, world models and our understanding of the world. We don’t have to develop a mental model of three-dimensional space; we can simply place things or objects into the hardwired model we already have and play around with them.
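As a loose analogy (my own illustration, not anything Kant or any AI lab has proposed), the innate structure works a bit like a ready-made container type: the three spatial ‘slots’ come for free, and all we do is place objects into them and move them around. A minimal Python sketch, purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class Thing:
    # The three spatial "slots" are given up front; nothing about the
    # structure of space has to be learned from data.
    name: str
    x: float
    y: float
    z: float

    def moved(self, dx: float, dy: float, dz: float) -> "Thing":
        # "Playing around" with an object is just shifting it within
        # the structure we already have.
        return Thing(self.name, self.x + dx, self.y + dy, self.z + dz)

cup = Thing("cup", x=1.0, y=0.5, z=0.8)   # on the table
imagined = cup.moved(0.0, 0.0, 0.7)       # now imagine it on a shelf
print(imagined)                           # Thing(name='cup', x=1.0, y=0.5, z=1.5)
```

The point of the analogy is only that the structure is supplied in advance; a purely Humean learner would have to induce even the existence of those three slots from its data.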
The Kantian argument is that space is only one of many such structures or concepts that are hardwired into us.3 And all of them serve the same function. They provide ready-made structures for understanding the world that we don’t have to figure out or derive ourselves. In turn, these structures provide powerful building blocks for our theories and understanding of the world.
As noted, generative AI systems are not trained in this way, but in a highly Humean or empiricist manner. They don’t have existing structures to work with but have to derive everything from scratch. This is one reason, for example, why it takes vastly more training data to teach an AI system a concept than it takes a human to learn it.
Building blocks for world models
In my most recent article, I looked at the push to add world models into generative AI systems, but noted there are a few wrinkles that might get in our way. The first was that we don’t know how to meaningfully translate between human language and the world (or world models). A second one was that we don’t know how we humans build our world models.
Here we start to get some clues about what this might look like: we could use our hardwired concepts as building blocks. If we want to take this approach, we need to find some way to code these into generative AI systems, or to train AI systems so they develop them. The wrinkle is that we don’t know how to define many of these concepts, or how to teach them to something that doesn’t already think in the right way. Causation is a classic example. We all know what it means in practice, but its definition and real meaning are a live and heated debate.4
Space and time seem like obvious exceptions to this skeptical observation. Computers calculate spatial dimensions and render spatial objects easily, all the time. Computer games and self-driving cars run up-to-date spatial and temporal models of the world they care about. However, they may be the exceptions that prove the rule.
Computers reason spatially and temporally in a very different way to humans. Computers run sophisticated mathematical calculations on Cartesian (or other) coordinate systems - they reduce space and time to numbers and then reverse the process. By contrast, humans think spatially in a highly analogue, non-exact way that is only occasionally quantifiable. And we all know that, for humans, time never feels like it progresses at a consistent, mathematical rate. Readers can disagree about whether this means computers can genuinely reason spatially or temporally, but if they do, it is in a different way to humans.
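To make that contrast concrete, here is a minimal Python sketch (my own illustration, with made-up numbers) of what ‘reducing space and time to numbers’ looks like in practice: locations become coordinate tuples, distance becomes an equation, and time becomes a uniform counter.

```python
import math

# A machine's spatial model: locations are just tuples of numbers
# in a Cartesian coordinate system (illustrative values only).
start = (0.0, 0.0, 0.0)        # e.g. a vehicle at the origin
goal = (30.0, 40.0, 0.0)       # a point 30 m east and 40 m north

# "Distance" is an equation, not a felt sense of how far away something is.
distance = math.dist(start, goal)          # Euclidean distance: 50.0 m

# "Time" is reduced to numbers too: a fixed rate turns distance into duration.
speed = 10.0                               # metres per second
travel_time = distance / speed             # 5.0 seconds

# Reconstructing positions along the path is just more arithmetic:
# interpolate the coordinates at uniform one-second ticks.
for t in range(int(travel_time) + 1):
    frac = t / travel_time
    position = tuple(s + (g - s) * frac for s, g in zip(start, goal))
    print(f"t={t}s  position={position}")
```

Nothing in that sketch inhabits a structured world in the sense Barnes describes; the space exists only as numbers to be calculated over and then rendered back into something a human can read.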
This all means that AI systems face a huge challenge if we want to include useful world models in them that connect to our human knowledge. We humans have in-built, hardwired concepts and structures we use to make sense of and think about the world - and we use these to build our models. We would need to build these from scratch for AI systems but we cannot define many of them and so have no clear idea of how to even start coding them. The critical point is that these concepts aren’t equations or definitions, but inherent ways we experience, relate to and think about the world.
1. Kant tied this to more metaphysical claims about whether reality is knowable or not, but we will leave all that to one side in this essay.
2. If you are curious, the full list of twelve is in my previous post.
3. Kant listed twelve foundational ‘categories’. While I don’t think it is an accurate list, it is a useful start and illustration of the type of thing we are talking about.
4. If you want to dip your toes into this debate, start with one of the eleven articles on the Stanford Encyclopedia of Philosophy with the concept of causation in the title.
