A.I. and Assyriology

To study Assyriology is both an intellectual joy and an exercise in sadness. The latter may come as a surprise to those who have read my breezy blogs about the fascinating world of the ancient Near East. While it is true that we possess many useful sources from which we have learned a lot over the decades, we still lack a lot of essential information.1 For example, to use the story-based methodologies that can be found within the environmental humanities, with which we aim to establish how the people back then used to think about and deal with nature, we do need some stories.2 When these specific sources have been lost to time, such as with respect to ancient Elam – roughly the southwest and east of modern Iran – applying such methodologies is quite impossible.3 But we should not abandon all hope! Not only are there probably still clay tablets and other documents preserved in undiscovered archeological find spots across the Near East, but there are also thousands of unread fragments of clay tablets, papyri, and other media just waiting to be studied, some of which haven’t been read since they were taken out of the ground. Extensively celebrating the latter reassurance would be premature, though. At the moment, there are simply not enough experts to study all these documents – even if they can be read.4 Luckily, we came into some good news recently: A.I., an abbreviation of the term ‘artificial intelligence,’ can be of service! Or can it?

One of the frustrating aspects of writing this blog is the knowledge that my writings will probably be fed into the large language models that are currently the most prominent product styled with the moniker A.I. – though whether they represent true artificial intelligence is debated.5 All so these models can produce mostly mediocre – and probably wrongly sourced – essays that straddle the line between non-fiction and, well, fiction.6 Not to mention the accompanying environmental burden.7 But this is, of course, merely the current way some businesses have created, employed, and marketed their so-called chatbots. It is not necessarily an indication of the potential of these programs nor of A.I. in general.8 And not all is ill in this regard, as a recent flurry of news showed. Programs and technologies which are grouped in with the development of A.I., such as machine learning and neural networks, could perhaps also be deployed to further the research conducted within the humanities – even including translating cuneiform texts!9 This week, I am therefore going to highlight some of the potential and some of the obstacles of adding A.I. to the arsenal of the digital humanities.10

This blog is also available in Dutch.

The Digital Humanities

The latest A.I. craze, which can be said to have started with the introduction of the chatbot ChatGPT to the wider public in 2022, is naturally not the first time the humanities turned to digital tools.11 Much of the ground-breaking research of the last few decades, and especially with regard to the study of the ancient Near East, would not have been possible on that scale and in the short amount of time that it took, without digital resources!12 In the first place, there is the accessibility brought about by the digital realm.13 One can, for instance, view many ancient texts and artefacts through the worldwide web – even if they have not been properly published in a fancy monograph. In addition, digital tools can be employed, for example, to visualize the relations between ancient families, professionals, and administrators that are mentioned in our sources, to create databases of ancient vocabularies, and to compare texts on a scale heretofore unimaginable.14 Part of such endeavors may be aided by machine learning. This technology has been deployed extensively to mine data from large, digitized archives and can also be utilized to clarify hitherto unintelligible sources.15 And the knowledge acquired in this manner can be shared faster than perhaps during any time in history.16 In this way, the humanities can not only make progress at an accelerated speed, but also conceive of new questions that may be answered through further research.17

Notwithstanding these great advantages granted by the digital humanities, we should keep in mind that these tools and their uses have been brought about by human beings with all their usual blind spots.18 As such, it remains important to be on the lookout for biases, unfounded promises, and tunnel vision – with regard to both our research and the (digital) tools at our disposal. For instance, just feeding data into an algorithm without care for the origins of a data collection or for the data that may still elude us is not productive.19 And this same cautious approach – embracing the possibilities while being mindful of the risks – applies to the way A.I. has been employed with regard to the research into cuneiform texts in recent years.

Computing Cuneiform

The study of the ancient world is characterized by many elegant intricacies. Using ancient sources is not just a matter of reading or observing them. One has to understand the nuances and multifaceted meanings of languages and scripts, dissect the purposes and perhaps even hidden motives behind the documents and objects that we found, and place the art of the deep past in its proper context.20 And that is just the beginning! It truly is, as Hans-Georg Gadamer famously observed, a hermeneutic undertaking.21 Newly acquired knowledge begets more knowledge, because it recontextualizes or reinterprets what we already thought we knew. And this is especially true with regard to cuneiform texts, which make up the bulk of the writing on the clay tablets that we have found in the Near East. These texts can be read in various ways, because cuneiform signs often have multiple meanings – representing both sounds and words – as do the vocabularies that are written with them.22 Which begets the question: can the various digital tools that we designate with the moniker A.I. handle such complexities?
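To make this ambiguity a little more tangible, here is a minimal Python sketch. The sign list is a heavily simplified toy subset that I assembled for illustration (real sign lists contain many more values per sign), and the function name is hypothetical. It only enumerates the possible readings of a short sign sequence; choosing the right one is exactly the part that requires context.

```python
from itertools import product

# Toy sign list (assumption: a tiny, simplified subset of attested values).
# Lowercase entries stand for sound values, uppercase entries for word signs.
SIGN_READINGS = {
    "AN": ["an", "DINGIR (god)", "AN (sky)"],
    "UD": ["ud", "tam", "UTU (sun)", "U4 (day)"],
}


def candidate_readings(signs):
    """Return every combination of readings for a sequence of sign names."""
    options = [SIGN_READINGS[sign] for sign in signs]
    return [" - ".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    readings = candidate_readings(["AN", "UD"])
    # Two signs with 3 and 4 values already yield 3 * 4 = 12 candidates.
    print(len(readings))
    for reading in readings:
        print(reading)
```

Even in this toy setting the number of candidate readings multiplies with every added sign, which is why a reader, human or digital, needs the surrounding context to narrow them down.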

The technologies that are commonly grouped in with other concepts of A.I. – like the aforementioned machine learning, neural networks, and large language models – and which have been applied to cuneiform texts seem to have, until now, mainly focused on the reconstruction and translation of clay tablets.23 Because these tablets are often found in a fragmentary state, with pieces spread in collections all over the world, the reunion of a single text can be a Sisyphean effort.24 And there are so many texts that await translation, interpretation, and publication that new solutions which enable us to outsource or speed up the latter activity would be more than welcome. These efforts to create computational aids are therefore both commendable and useful. The underlying codebases and digital tools have really fitting names, by the way! Like Akkademia and Atrahasis – respectively referring to the ancient language Akkadian that was mostly written in cuneiform and the hero of the eponymous Near Eastern epic.25 Such an eye for detail really illustrates the zeal of the researchers involved to make these projects a success.

All things considered, we are – as far as I can see – still in the developmental stage with respect to these tools, though.26 The translations, for example, remain kind of basic. Not only are the programs mostly incapable of reading the clay tablets themselves – humans have to put the cuneiform signs into Unicode signs or even enter the sign values into the programs themselves – they also seem to fail to incorporate the intricacies of the texts they are fed.27 In one example, the program translated a stated profession differently, depending on being fed the transliteration – when the sign values of the cuneiform are already converted into the alphabet – or the cuneiform writing itself.28 Similar problems are observed with reuniting broken texts. The programs are not suited yet to recombine either 3D fragments or 2D cuneiform representations. As such, it is mostly texts that were already manually transliterated which are entered into the programs that compare them for possible reunifications.29 We can expect that these technologies will keep advancing, but there are a few stumbling blocks that I can see on the road ahead, which may keep causing us digital headaches for a very long time.
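The idea of comparing already transliterated texts to find candidates for reunification can be illustrated with a small Python sketch. This is a deliberately naive stand-in of my own, not the method used by the projects cited above: it ranks candidate texts by the longest run of identical sign tokens they share with a fragment. The function names, the tablet labels, and the placeholder tokens (`s1`, `s2`, and so on) are all hypothetical.

```python
from difflib import SequenceMatcher


def longest_shared_run(fragment_a, fragment_b):
    """Return the longest run of consecutive tokens shared by two texts."""
    a = fragment_a.split()
    b = fragment_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def rank_candidates(fragment, candidates):
    """Rank candidate texts by the length of their longest shared run.

    Candidates without any shared token are dropped; ties are broken by name.
    """
    scored = [
        (len(longest_shared_run(fragment, text)), name)
        for name, text in candidates.items()
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked if score > 0]


if __name__ == "__main__":
    # Placeholder tokens stand in for transliterated sign values.
    fragment = "s3 s4 s5 s6"
    candidates = {
        "tablet_A": "s1 s2 s3 s4 s5",
        "tablet_B": "s9 s4 s8",
        "tablet_C": "t1 t2",
    }
    print(rank_candidates(fragment, candidates))
```

Note what the sketch takes for granted: someone has already read the signs and typed them in. It knows nothing about the shape of a break, the curvature of the clay, or a damaged sign that could be read in two ways, which is precisely where the difficulties described here begin.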

Pixelated Pitfalls

The hardest part of working with clay tablets is perhaps the fact that these are not mere texts but also objects. Objects on which people happened to write a few thousand years ago. These texts are now often in a state of reduced legibility and, as we saw above, the tablets on which they were written are often fragmented and incomplete. As such, the real challenge of making a clay tablet accessible to the wider world is the part before the current digital tools come in: it is recombining various pieces of clay and figuring out the exact signs written on them, before going on to guess their meaning within that context.30 As is noted in an interview with Alwin Kloekhorst, an associate professor at Leiden University, it is the 3D nature of clay tablets that makes the interaction between them and our current digital tools so cumbersome.31

Another difficulty with making cuneiform texts on clay tablets comprehensible for historians, anthropologists, the general public, and other interested parties arises after the fragments of a tablet are reunited, transliterated, and translated. Not only do we have to determine the meaning of the cuneiform signs in their present context – as far as they are still legible! – but we also need to interpret the text itself. After all, the understanding of ancient documents often requires foreknowledge that is hard to come by – if it even is available to us in the twenty-first century CE.32 As such, we need a proper background with regard to the society that produced the text, which is primarily achieved through reading as much of the secondary literature and as many other ancient texts as possible. Only then may we hope to establish, for instance, whether and in what measure a text presents us with historical truths as opposed to a potentially biased account of events that does not aid us when we want to learn more about the actual human condition in the deep past.33 To put it in less abstract terms: a contemporary commercial for dishwasher tablets would not be of much use for future archeologists, if they want to determine their actual efficacy! Though it may teach them about the way producers went about their marketing in the current day and age. And this kind of prerequisite foreknowledge is often difficult to build into our current digital tools.34

Then there are the problems peculiar to the creation and training of the most common digital tools that are currently in use. Many of these tools need large training sets – enormous piles of relevant data, so to speak – to become useful for the tasks that they are built for. And with regard to some ancient languages that were written on clay tablets, as again noted by Kloekhorst, we simply do not have enough material!35 As such, the results of digital translations may not be as accurate as we would like them to be, just because our tools cannot be trained sufficiently.

Finally, there is the risk that we prejudice our tools. There were many times in the history of the study of the ancient Near East, when widely researched hypotheses turned out to be spectacularly wrong.36 This is the proverbial name of the academic game, of course, and also an indispensable part of the hermeneutic mechanisms that scaffold this kind of research. But it is difficult to endow our A.I. tools with this kind of hermeneutic attitude. As a result, we run the very real risk that we innocently embed wrong scholarly turns or ideas in our digital programs and that these will subsequently keep pursuing dead ends. Dead ends that shall not be immediately obvious to many researchers, because these turns and ideas are no longer explicitly visible to most of us when they become a part of the architecture of our digital tools.37

In summation, the current digital tools that are adjacent to the developments which are often summarized with the term A.I. seem to be, at the moment, only suitable for a rather minor part of the process with which an excavated clay tablet can be made accessible and be used to reconstruct the world of the deep past, including the lives and attitudes of its inhabitants. But the underlying mechanisms are fascinating and would sanction a cautious optimism.

Conclusion: Cautious Optimism

Though the progress looks promising, many of the digital tools that are categorized as A.I. are a long way off from really propelling the study of the ancient Near East into the future, pun intended. And here we should not only fear that our present ideas of and views on the past inadvertently color the output of our computer programs. After all, large language models, machine learning techniques, and neural networks are notorious for including harmful societal prejudices.38 As such, we should remain vigilant with regard to the input as well as the output of our digital tools – in general and when studying the ancient past. Because the humanities have the potential to be an antidote against prejudices, to be a window into the variety of the human experience throughout the ages, and to be a reminder of our shared humanity.39 This potential would be ill-served if we do not interrogate that which might get lost in the black boxes that the digital tools which I discussed today still represent to many of us who are not specialized in the digital humanities. But this vigilance must not inhibit our fair assessment of any new resources that may enlighten and uncover the human condition – including the existence of those who lived in antiquity! – better than ever before.

Because in the end, that is what this is all about. All those digital marvels that you read about above are mere instruments that serve a purpose: to retrieve our history from the claws of time, decay, and destruction. And it is the results of this endeavor that should encourage us to keep innovating our methods, digital or otherwise. Soon we are therefore going to continue to learn about these results. And I think the first of these blogs will be about the discovery of one of the coolest action scenes in Mesopotamian literature.

References

  1. Alan Lenzi, An Introduction to Akkadian Literature: Contexts and Content (Winona Lake: Eisenbrauns, 2019), p. 29-33, 39-41. The existence of missing texts, for instance, is confirmed by the references to them in texts we do have, see: Ibidem, p. 80, note 12.
  2. Christopher Schliephake, The Environmental Humanities and the Ancient World: Questions and Perspectives (Cambridge: Cambridge University Press, 2020), p. 25.
  3. Jan Tavernier, “Elamite”, in: Rebecca Hasselbach-Andee (ed.), A Companion to Ancient Near Eastern Languages (Hoboken: John Wiley & Sons, 2020), p. 164-166; Margaret L. Khachikian, The Elamite Language (Roma: Consiglio Nazionale delle Ricerche Istituto per gli Studi Micenei ed Egeo-anatolici, 1998), p. 1-2. And the literary fragments that are said to have been discovered in the Elamite language are often subject to continuing scholarly debate, see: Igor M. Diakonoff & N.B. Jankowska, “An Elamite Gilgameš Text from Argištihenele, Urartu (Armavir-Blur, 8th Century B.C.)”, Zeitschrift für Assyriologie und Vorderasiatische Archäologie 1990, 80 (1–2), p. 102–23; Andrew R. George, The Babylonian Gilgamesh Epic: Introduction, Critical Edition and Cuneiform Texts – Volume 1 (Oxford: Oxford University Press, 2003), p. 24, note 67.
  4. Lenzi, An Introduction to Akkadian Literature, p. 193.
  5. Susan Ariel Aaronson, Data Disquiet: Concerns about the Governance of Data for Generative AI (Waterloo: Centre for International Governance Innovation, 2024), p. 6-8.
  6. Suzanne de Winter, “Studenten Laten ChatGPT Massaal Verslagen Schrijven”, De Gelderlander June 8th 2023, Algemeen, p. 3; Jos de Mul, “Het Parasitaire Karakter van ChatGPT”, Groene Amsterdammer 2024, 148 (13), p. 40-43.
  7. Matthias C Rillig et al, “Risks and Benefits of Large Language Models for the Environment”, Environmental Science & Technology 2023, 57 (9), p. 3465.
  8. James Manyika, “Getting AI Right: Introductory Notes on AI & Society”, Daedalus 151 (2), p. 5-27. For the history of less-than-accurate predictions regarding (future) uses of the various technologies that are often lumped together under the moniker ‘A.I.’, see: Walter J. Scheirer, A History of Fake Things on the Internet (Stanford: Stanford University Press, 2024), p. 155-169.
  9. Bart Funnekotter, “Computer Raadt Missend Spijkerschrift”, NRC Handelsblad September 8th 2020, Wetenschap, p. 16; Editorial Board Leiden University Website, “Artificial Intelligence and Clay Tablets: Not Yet a Perfect Match”, UniversiteitLeiden.nl October 10th 2023 (retrieved on August 6th 2024).
  10. In this blog, I focus on digital tools that aid our study of the past. For all the other phenomena that can be categorized as belonging to the digital humanities, see: Susan Schreibman, Ray Siemens & John Unsworth (eds.), A Companion to Digital Humanities (Malden: Blackwell, 2004); Patrik Svensson, Big Digital Humanities: Imagining a Meeting Place for the Humanities and the Digital (Ann Arbor: University of Michigan Press, 2016), p. 5.
  11. V. Rajaraman, “From ELIZA to ChatGPT: History of Human-Computer Conversation”, Resonance 2023, 28 (6), p. 897.
  12. Rens Bod, De Vergeten Wetenschappen: Een Geschiedenis van de Humaniora (Amsterdam: Prometheus, 2020), p. 443.
  13. Svensson, Big Digital Humanities, p. 2.
  14. Caroline Waerzeggers, “Social Network Analysis of Cuneiform Archives – a New Approach”, in: Heather D. Baker & Michael Jursa (eds.), Documentary Sources in Ancient Near Eastern and Greco-Roman Economic History (Philadelphia: Oxbow Books, 2014), p. 210-213; John Huehnergard, A Grammar of Akkadian (Winona Lake: Eisenbrauns, 2005), p. xxix-xxx; Bod, De Vergeten Wetenschappen, p. 443.
  15. Stephen Ramsay, On the Digital Humanities: Essays and Provocations (Minnesota: University of Minnesota, 2023), p. 135-136; Svensson, Big Digital Humanities, p. 5.
  16. Anne Burdick et al, Digital_Humanities (Cambridge: MIT Press, 2012), p. 33.
  17. Bod, De Vergeten Wetenschappen, p. 443; Anne Burdick et al, Digital_Humanities, p. 30.
  18. Ramsay, On the Digital Humanities, p. 74; Alan Liu, “Where Is Cultural Criticism in the Digital Humanities?”, in: Matthew K. Gold & Lauren F. Klein (eds.), Debates in the Digital Humanities (Minneapolis: University of Minnesota Press, 2012), p. 491, 495.
  19. Anne Burdick et al, Digital_Humanities, p. 32-33. See in general: David J. Hand, Statistics: A Very Short Introduction (Oxford: Oxford University Press, 2008), p. 9-10.
  20. Lenzi, An Introduction to Akkadian Literature, p. 61-62; Zainab Bahrani, Mesopotamia: Ancient Art and Architecture (London: Thames and Hudson, 2017), p. 8-9.
  21. Hans-Georg Gadamer, Waarheid & Methode: Hoofdlijnen van een Filosofische Hermeneutiek, Translated by Mark Wildschut (Nijmegen: Uitgeverij vanTilt, 2014), p. 170-253. See in general: Jens Zimmermann, Hermeneutics: A Very Short Introduction (Oxford: Oxford University Press, 2015), p. 57-71.
  22. Marc van de Mieroop, Philosophy before the Greeks: The Pursuit of Truth in Ancient Babylonia (Woodstock: Princeton University Press, 2016), p. 59-84.
  23. For the most pertinent examples that I could find, see: Ethan Fetaya et al, “Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks”, PNAS 2020, 117 (37), p. 22743–22751; Gai Gutherz et al, “Translating Akkadian To English with Neural Machine Translation”, PNAS Nexus 2023, 2 (1), p. 1:1-10; Shai Gordin et al, “Reading Akkadian Cuneiform Using Natural Language Processing”, PLoS ONE 2020, 15 (10), e0240511.
  24. Ethan Fetaya et al, “Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks”, p. 22743.
  25. Gutherz et al, “Translating Akkadian To English with Neural Machine Translation”, p. 1:1.
  26. Ethan Fetaya et al, “Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks”, p. 22744.
  27. Gutherz et al, “Translating Akkadian To English with Neural Machine Translation”, p. 1:2, 1:5.
  28. Ibidem, p. 1:2.
  29. Ethan Fetaya et al, “Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks”, p. 22744.
  30. Lenzi, An Introduction to Akkadian Literature, p. 9-21; Benjamin R. Foster, Before the Muses: An Anthology of Akkadian Literature (Bethesda: CDL press, 2005), p. 10.
  31. Editorial Board Leiden University Website, “Artificial Intelligence and Clay Tablets: Not Yet a Perfect Match”, UniversiteitLeiden.nl October 10th 2023 (retrieved on August 6th 2024).
  32. Lenzi, An Introduction to Akkadian Literature, p. 33-34.
  33. Though such pieces of what might be called propaganda are naturally still useful to glean the attitudes and interests of some persons back then, see: Ibidem, p. 132-149.
  34. This conundrum, I think, is the reason why many of these projects – such as those cited in note 23 – have up until now largely focused on more predictable, formulaic texts, like administrative documents.
  35. Editorial Board Leiden University Website, “Artificial Intelligence and Clay Tablets: Not Yet a Perfect Match”, UniversiteitLeiden.nl October 10th 2023 (retrieved on August 6th 2024).
  36. Mario Liverani, The Ancient Near East: History, Society and Economy, Translated by Soraia Tabatabai (New York: Routledge/Taylor & Francis Group, 2014), p. 1-16.
  37. Manyika, “Getting AI Right”, p. 12.
  38. Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (London: Penguin Books, 2017), p. 87–91, 148–55; Evgeny Morozov, To Save Everything Click Here: Technology, Solutions and the Urge to Fix Problems that Don’t Exist (London: Penguin Books, 2014), p. 210–12.
  39. Michiel Leezenberg & Gerard H. de Vries, Wetenschapsfilosofie voor Geesteswetenschappen (Amsterdam: Amsterdam University Press, 2012), p. 297-315.