“History consists of a corpus of ascertained facts. The facts are available to the historian in documents, inscriptions and so on, like fish on the fishmonger’s slab. The historian collects them, takes them home, and cooks and serves them in whatever style appeals to him.”
E.H. Carr, What is History?

Although I have had a career in technology for nearly thirty years, my academic background is in history, and E.H. Carr’s What is History? was one of the first books I read when I started out at university.
You may of course be wondering how this is relevant to Copilot and Generative AI, but please bear with me.
E.H. Carr’s What is History? explores the nature of historical facts, emphasizing that they are not objective truths but are selected and interpreted by historians based on context, perspective, and significance. Historical facts, for Carr, are those deemed relevant by the historian’s judgment, shaped by their present concerns and the questions they ask of the past. This theoretical framework can be applied to how Generative AI uses material for its learning.
Selection of “Facts” in Training Data: Just as historians select facts from the vast pool of past events, Generative AI models are trained on curated datasets chosen by developers. These datasets, whether text, images, or other media, are neither neutral nor comprehensive; they reflect biases, priorities, and availability. For instance, if an AI is trained primarily on English-language texts from the internet, it may prioritise certain cultural or temporal perspectives, akin to how a historian’s focus shapes which events become “historical facts.”
Interpretation and Contextualization: Carr argues that facts only become meaningful through interpretation. Similarly, Generative AI processes raw data through algorithms, assigning weights and patterns based on its architecture and training objectives. The AI’s output is an interpretation of the training data, not a direct reproduction of “truth.” For example, when generating a response, the AI constructs meaning based on patterns in its training data, much like a historian constructs a narrative from selected facts.
Subjectivity and Bias: Carr highlights the subjectivity inherent in historical inquiry, as historians’ values and contexts influence their work. In Generative AI, biases in training data or model design shape outputs. For instance, if historical texts in the training data overrepresent certain viewpoints, the AI may reproduce these biases, treating them as “facts” unless corrected through fine-tuning or diverse data inclusion.
Facts of the Past vs. Historical Facts: Carr distinguishes between raw events (facts of the past) and those elevated to significance (historical facts). In AI, raw data (e.g., billions of web pages) is analogous to facts of the past, while the processed, weighted patterns the model learns are akin to historical facts. The AI’s learning algorithm determines which data points are significant based on statistical relevance, not necessarily truth or historical importance, potentially skewing its “understanding.”
Evolving Narratives: Carr notes that history evolves as new questions arise. Similarly, Generative AI models are updated with new data or fine-tuned to reflect current priorities. This mirrors how historians revisit the past with fresh perspectives, reshaping which facts matter.
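For readers who like to see an idea in code, the point about statistical relevance can be illustrated with a deliberately tiny sketch. It is not how Copilot or any real model works; the corpus, the sentences, and the function name most_likely_next_word are all hypothetical, invented purely for illustration. It simply counts which word most often follows a prompt in a small, skewed set of "sources".

```python
from collections import Counter

# Hypothetical toy corpus: five sources comment on the same treaty.
# Four share one viewpoint; one dissents. Nothing here says which is "true".
corpus = [
    "the treaty was just",
    "the treaty was just",
    "the treaty was just",
    "the treaty was just",
    "the treaty was harsh",
]


def most_likely_next_word(corpus, prompt):
    """Return the word that most often follows `prompt` in the corpus."""
    counts = Counter()
    for sentence in corpus:
        if sentence.startswith(prompt + " "):
            # Take the first word after the prompt, if there is one.
            rest = sentence[len(prompt) + 1:].split()
            if rest:
                counts[rest[0]] += 1
    # Frequency alone decides the answer; None if the prompt never appears.
    return counts.most_common(1)[0][0] if counts else None


print(most_likely_next_word(corpus, "the treaty was"))
```

The majority view wins because it is the majority, not because it is right, which is much the same as Carr's historian deciding which fish make it to the table.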
So, with apologies for getting way too theoretical, the ‘knowledge’ contained within Copilot is based on the data it can see. The way in which we use this potentially weighted information can be further twisted by our own personal interpretations. There is not, and never has been, a single universal truth, and Copilot reflects this.
All of which is a very long-winded way of saying Copilot is only the Copilot; we are still the Pilots, and our judgements are key. We need to be mindful and careful with how we present information, as we always have been. Copilot may do some of the heavy lifting, but the moment we add our name to a document, we have to be certain it reflects the truth we wish to present.
