Uncertainty in Manuscript Technologies and the Potential of Computational Tools
Everybody, you're very welcome tonight to the second in our series of lectures on the topic of trust and authority in the digital age. This is an initiative of what we call the Digital Humanities Network lecture series. There are many digital humanities networks, but this one in particular is a collaboration between the University of Birmingham in the UK and my own university, Trinity College Dublin in Ireland. What we are very interested in in this network is not only questions of the digital humanities and how it can make a greater impact on society, perhaps, but also questions of what we can actually learn from the humanities in an increasingly digitized and digital age. So this is the second of our lectures in the series, and I'm going to turn over to my colleague in Birmingham, Aengus Ward, to introduce our speaker tonight. Aengus, take it away.
Thank you, Jennifer. I'm sure that to pretty much all of you Professor Treharne will need no introduction, but I'm going to use my position and do that anyway. Professor Treharne, from Stanford University, is of course a very distinguished medievalist and has been one of the principal scholars working on manuscript studies for some time, going back to, for instance, the project on the production and use of English manuscripts some time ago. She is also at the forefront of digital humanities initiatives in medieval studies and manuscript studies, and that is one of the reasons that we have asked her to give one of our lectures today. She's also the director of Stanford Text Technologies and has been working on digital interpretive frameworks more recently. She has a book coming out with OUP entitled Perceptions of Medieval Manuscripts: The Phenomenal Book, one to be looked out for, I think, and she's going to speak to us this evening under the title 'Uncertainty in medieval manuscript studies and the potential of computational tools'. You're more than welcome.
Thanks very much, Aengus, for the introduction and also for inviting me to participate in this series of three lectures on trust and authority in the digital age. My talk will be principally on uncertainty in manuscript studies, and especially palaeography, the study of old handwriting. But you'll see, I think, that this issue of uncertainty is directly linked with both trust and authority, those larger themes that you're interested in. There's a whole cogent analysis to be made of that, but sadly not this evening, so I'm going to limit it more to uncertainty. All right, so I am going to just share my screen with you.
So, 1939. As World War Two ominously approached, Basil Brown made a discovery at Sutton Hoo that changed the world of archaeology and the entire understanding of early medieval history in England. On the estate of Edith Pretty he excavated the shadow of a ship, in the midst of which a dissolved royal body had lain buried in a chamber. The events of the summer of 1939 have recently been reimagined in the film The Dig, starring Carey Mulligan and Ralph Fiennes, and in that film only a handful of the finds are depicted. So here's the gold, garnet and millefiori glass purse lid.
Among the scores of objects meticulously retrieved are a pair of sixth- to seventh-century silver spoons inscribed in Greek letters with the names Saulos and Paulos, and this is my own photo from the exhibition in the British Museum a few years ago. These are not the most glamorous, expensive, extraordinary or attention-grabbing items found in the burial chamber. Measuring 25.4 centimetres to the bowl of the spoon, which is four centimetres, they're about the size of a small modern serving spoon. But the spoons came to early East Anglia from the Eastern Mediterranean. They're Byzantine, and this crucially demonstrates east-west trading and cultural exchange in the earliest centuries of the medieval period. Much about the interpretation of the spoons is uncertain: their route to eastern England from the Eastern Mediterranean, perhaps via North Africa; their precise date and place of origin; their intended function in the burial assemblage; and the significance of their inscriptions.
For some decades it was assumed that these were baptismal spoons, the Saulos representing the unsaved state of Saul before he was baptized as the Christian Paul, Paulos. This baptismal interpretation suggested a meaning to the spoons far beyond their self-significations: it was surmised that they might have been given as a baptismal gift to King Rædwald, the widely accepted monarch at the heart of the Sutton Hoo burial. This meaning has obvious import for a scholarly agenda that wishes to promote the conversion narrative: the East Anglians, pagan settlers, conforming to the dominant religion of what was to become England. But in an article in Speculum in 1967, the medievalist Bob Kaske queried this interpretation and sought, through palaeographical analysis, the analysis of ancient handwriting, to prove that these objects should be thought of as a like pair of Paulos spoons, thus removing the highly charged conversion emphasis that the spoons seemed to confer on the burial.
He based this reading, despite himself saying 'I cannot speak with much authority as an author', on the style of the writing on the Saulos spoon, thinking of the sigma on the Saulos spoon as a mistake for the pi on the Paulos spoon. He determined that the writing on the Saulos spoon is, and I quote, 'crude, inconsistent and executed with no regard for symmetry'. Kaske concluded that the Saulos spoon must have been made in a different workshop to the other, as an inferior copy or a reproduction that bespeaks, and I quote, 'an obtuseness, ignorance or carelessness', and that it should effectively then be dismissed as representing anything larger than its spoon-iness. Here, then, palaeographical and epigraphical analysis is proposed to solve a conundrum, an unknown, through the application of a possibility.
The pair of objects is complicated by their placement in the burial chamber adjacent to silver bowls with cruciform design, of Eastern Mediterranean provenance, the bowls and the spoons together providing deluxe utilitarian dinnerware for the afterlife. But what should be acknowledged here, despite Kaske's superimposed and derogatory aesthetic of obtuseness, lack of symmetry and so on, is the range of questions that emerge from the uncertainty posed by these objects. Questions include asking what the possibly unprovable function might have been. What do these spoons reveal about cultural exchange and about trade routes in the seventh century? What do these finds suggest about elite burial beliefs and cultural mores? Whose was the vanished body? What kinds of craftspersonship emerged from the excavation? Our present state of knowledge, as in 1939 and despite the subsequent vast array of scholarship that has emerged on Sutton Hoo, means that we may never have definitive answers. There is no right or wrong here; there is only a fuzzy state of knowledge.
For humanists, perhaps particularly those who study peoples and cultures for which the historical record is partially or even wholly unknowable, uncertainty is simply ingrained in what we do, an integral component of our understanding and our sets of approaches. The idiosyncrasies of chirographic technology, manuscript technology, and the uniqueness of each medieval manuscript or scroll or single-sheet document are difficult to capture digitally and computationally. There is a distrust of that for which there is no evidence-based truth. But new work underway suggests that exciting things lie ahead and interesting discoveries can be made if humanists can get a look-in via interdisciplinary artificial intelligence and machine learning teams.
Including humanists in technological research, and especially AI, is vital. Abeba Birhane, in her article 'The Impossibility of Automating Ambiguity', published in Artificial Life very recently, makes no bones about the parlous state of affairs at present. Reminding readers of the unpredictable and messy way in which people interact with the world, Birhane deems the fixed and universalizing tendencies of machine learning and AI to be dangerous and reductive, replicating elitist, sexist, racist and classist past patterns into the future.
She says that when machine learning systems pick up stable patterns, they also identify harmful current and historical norms, prejudices and injustices, and go on to promote those. Birhane comments that, and I quote, the very practice of scoring, characterizing and assigning algorithmic identities about people's awareness risks treating people like objects, and she shows ways in which the idiosyncrasies of individual humans are not accounted for in machine learning. She finishes with a suggestion that, and I quote, 'one way of moving forward to a just society is to envisage a fundamentally different kind of technology that is grounded in ambiguity, fluidity and diversity of experience'. Achille Mbembe, more apocalyptically still, understands technological escalation as leading to the emergence of a computational capitalism that threatens to transform the human into an artifact, generating further forms of injustice and racial rationalization.
To counter this, or to diminish the ostensible inevitability that comes along with technological determinism, I would argue very strongly that technologists and computer scientists must include humanists on their teams, and humanists must take responsibility for participating collaboratively. Arguably the most ideally placed, in terms of their working with uncertainty, are premodernists and, more significantly, those from cultures and countries whose own traditions have been destroyed or elided by colonization. Achille Mbembe reminds us that, in order to counter an ever-mutating racism globally, it is our absolute duty to widen the archive. For many excellent archives, this means working with manuscript traditions in many different languages and from many different points of origin. Widen the archive, and bring the idiosyncrasies and uncertainties of the record to the fore. Uneasiness might be felt by collaborating scientists or engineers in relation to that which is not known or knowable, not certain or definable.
As Didier Dubois and Henri Prade remind us for artificial intelligence, and I quote, 'a piece of information is said to be uncertain for an agent when the latter does not know whether this piece of information is true or false'. What, though, of nuance and degrees of certainty on a spectrum of knowingness? How do artificial intelligence and machine learning account for what medievalists deal with every day on a spectrum like this? You can make your own order out of this, and there are lots of other potential, possible, probable entries you could put into this spectrum.
On a spectrum like this, it is the right side, as we're looking at it, that dominates manuscript studies, and palaeography and writing in particular. The use of AI in analytics means we do get answers, but a consequence for, say, mathematicians, or some of them at least, is disquiet about deep neural networks: artificial intelligence that gives an answer that is useful, but for which there is no way to interpret the answer or to be sure about how it was arrived at. By contrast, for perhaps the majority of humanists, uncertainty, rather than causing uneasiness, can or should engender an excitement and an interpretive open-endedness. For many humanists, medievalists in particular, it is simply the mode in which we work. There is no deriving a right or wrong answer, no incontestable truth, in much of the research that we undertake.
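The contrast drawn above, between a true/false notion of uncertainty and a spectrum of knowingness, can be sketched in a few lines of Python. This is only an illustration: the `Certainty` labels and the helper names are assumptions made for the example, not the entries on the speaker's slide and not a scheme from any cited work.

```python
from enum import IntEnum
from typing import Optional


class Certainty(IntEnum):
    """Hypothetical ordinal scale; the labels are illustrative only."""
    UNKNOWN = 0
    POSSIBLE = 1
    PROBABLE = 2
    CERTAIN = 3


def binary_view(known_truth: Optional[bool]) -> str:
    """The quoted definition: a claim is true, false, or simply uncertain."""
    if known_truth is None:
        return "uncertain"
    return "true" if known_truth else "false"


def at_least(level: Certainty, threshold: Certainty) -> bool:
    """A graded view lets us ask how far along the spectrum a claim sits."""
    return level >= threshold


# Placeholder claims, ordered from least to most certain.
claims = {"claim A": Certainty.PROBABLE, "claim B": Certainty.POSSIBLE}
ranked = sorted(claims, key=claims.get)
```

The binary view collapses every hedged judgment into the single value "uncertain", whereas the ordinal scale keeps "possible" and "probable" distinct and comparable.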
In terms of interdisciplinary collaboration, then, is it possible to embrace uncertainty, to regard it positively? Is it possible to spread the message that uncertainty should be regarded positively? Uncertainty is productive and generative. It inspires, or indeed urges, the asking of new questions and the taking of new approaches. And here, too, is where computation and digital tools come into play. They force us to look at information as data, as quantifiable. They encourage the encountering of data in a completely new light, to see objects of study in completely fresh ways and to ask innovative and potentially field-changing questions.
As such, if AI specialists, or machine learning experts more broadly, were to engage with humanists as a matter of course, and vice versa, obviously, the synergy and collaboration of methods, tools, approaches and domain expertise could be the most effective way of expanding horizons of research. The recognition that single solutions are not always required to move tricky questions forward is surely one of the most significant. Now, palaeographers, epigraphers and those who work with early handmade technologies of text deal with the most comprehensive set of uncertainties about the principal objects of their study. For the sake of clarity, when I talk about manuscript technologies I'm generally referring to anything that's written by hand, including cuneiform tablets, graffiti, charters, autograph books, letters and lecture notes, from the earliest writing systems globally to the present day.
My principal case studies in the remainder of the lecture will be medieval manuscripts and rolls. In this lecture I'll talk about the nature of a handful of these uncertainties and the kinds of expectations that scholars might realistically have of some computational and digital tools and methods. I'll also refer to digitized manuscripts, those images made available through varieties of processing and within wholly unstandardized interpretive frameworks by the various repositories that promote their digitized collections.
And I'll also refer to projects that use machine learning and artificial intelligence tools to provide data for further analysis. In terms of definitions within the computational, let me define machine learning as the capability to produce huge amounts of data algorithmically to discover patterns, and artificial intelligence I'm defining as using classification or conceptual capabilities to examine the data produced. Digital objects like digitized images are data at the level of the pixel, not just individually visible items to be compared or annotated. And by the use of the digital, I mean working with actual materials online, especially through the use of digitized aspects of manuscripts and their metadata. Having defined these various categories, which many times, I think, are very loosely used, the difficulty in a public lecture like this is knowing where to start. So my aim now is twofold.
First, to demonstrate some of the ways in which my own work as a manuscript scholar is benefiting from machine learning, AI and the world of digital aspects. And second, to point out what is already clear: that the way forward, as I've said, is spirited collaboration that turns on reciprocity and respect for domain expertise. My main point is that uncertainty, human idiosyncrasy, is a phenomenon to be fully accounted for and fully appreciated. It is not something to be erased or dismissed, either in the field of medieval studies or in computer science more broadly. For those of us in the beleaguered humanities, and I must do this, let me protest the redundancies and closures of departments of English, of Greek and Latin, and of archaeology. Let me object now to the myopic eradication of the earliest English and of English language at the University of Leicester. These fields, these subjects, these objects of study, the past itself, cannot be eradicated or ignored.
So, for those of us in these beleaguered areas of research and teaching, there is now, and has always been, a need to engage with other disciplines, whether that is physics, computer science, or library and information sciences. Each scholar has their own sources of knowledge that can be brought to bear in the work of teams seeking to solve particular problems. As the French philosophers Henri Bergson and Maurice Merleau-Ponty contended, advancing the human effort relies on an openness to approaches and methods of analysis, even as scientific fields seek to expose, define and model the physical world, space, time and data, to grossly oversimplify. Yet that which is peculiarly human cannot be made subject to predictable laws and concrete facts. Such positivism cannot easily be applied to thoughts, creativity, feelings and imagination, and it is these that are manifested through the preparation and production of text in manuscript form.
And I think creativity exists outside of the world of the arts as we separate them out; at my institution the arts are disciplinarily separate from the humanities. Creativity involves the calligraphy of scribes in medieval manuscripts, the production of a cuneus to create the cuneiform itself. I mean creativity in the sense of the production of something from the heart of the spirited individual. So we must not set aside creativity as a fundamental part of this collaboration. Just before the world went into the global lockdown, Stanford Text Technologies held a workshop with a leading computer scientist, Professor Mohamed Cheriet, who runs the Synchromedia lab at the University of Quebec.
I had collaborated with him previously on a project called Global Currents. In this workshop, on uncertainty in artificial intelligence situations, we asked researchers to consider what an AI can do when faced with uncertainty. Machine learning algorithms that use classifications relying on posterior probabilities of membership often present ambiguous results where, due to unavailable training data or ambiguous cases, the likelihood of any outcome is approximately even. In such situations, the human programmers must decide how the machine handles ambiguity, whether by making a best-fit classification or by reporting potential error. There is always a possible conflict between the mathematical rigor of the model and the ambiguity of real-world use cases.
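The choice described above, between forcing a best-fit classification and reporting the ambiguity, can be sketched as follows. This is a minimal illustration under stated assumptions: the script labels, the function names and the `margin` threshold are hypothetical and are not taken from the workshop or from any particular system.

```python
from typing import Dict, Optional, Tuple


def best_fit(posteriors: Dict[str, float]) -> str:
    """Always answer: return the label with the highest posterior."""
    return max(posteriors, key=posteriors.get)


def classify_or_flag(
    posteriors: Dict[str, float], margin: float = 0.2
) -> Tuple[Optional[str], str]:
    """Answer only when the top label clearly beats the runner-up.

    `margin` is an assumed, adjustable threshold: if the two highest
    posteriors differ by less than this, the case is reported as ambiguous.
    """
    if not posteriors:
        raise ValueError("posteriors must not be empty")
    ranked = sorted(posteriors.values(), reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    if top - runner_up < margin:
        return None, "ambiguous: top candidates are nearly even"
    return best_fit(posteriors), "confident"


# Hypothetical posteriors over script labels for two letter forms.
near_even = {"caroline": 0.36, "protogothic": 0.34, "gothic": 0.30}
clear_cut = {"caroline": 0.85, "protogothic": 0.10, "gothic": 0.05}

forced = best_fit(near_even)              # picks a label regardless
flagged = classify_or_flag(near_even)     # declines and reports ambiguity
decided = classify_or_flag(clear_cut)     # answers with a label
```

Where to set the threshold, and what should happen to the flagged cases, are interpretive decisions of the kind on which a domain expert can usefully advise.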
Since humanists in particular are adept and professionally skilled at working with interpretive paradigms that are neither right nor wrong, our insights into uncertainty, ambiguity and indecision are crucial to AI research and development. Moreover, in working in depth on all manner of research (let's take, if you're a literary scholar, irony and satire, genres that are deliberately ambiguous and indeterminate, or, for manuscript studies, colour description, one of the most ambiguous of all linguistic and conceptual categories), humanists expose nuance and evaluate attributions of unclarity on multiple levels, to frame and advance more effective questions that take uncertainty into account. In research institutions worldwide, researchers who are creating training sets that engage with uncertainty need humanist or social scientific domain experts, particularly when deciding between reflecting real-world data and curating data sets to avoid bias.
In other project teams, again, it is the humanists who can effectively frame ontologies, typologies and epistemologies that can account for, and help solve, ambiguity in data and indecision in AI. In a real-world practical case, and specifically in relation to medieval manuscripts, a couple of years ago I chanced to meet a person heading up the handwriting recognition OCR team at one of the major global tech companies in Silicon Valley. This person told me the team was finally, after a very, very long time, decades, far along in bringing effective software to market. They revealed that they do use later medieval Gothic script as the training data, and then proceeded to tell me all about set hands and how medieval scribes practised their craft.
Having had all this thoroughly explained to me, I asked how many professional humanists, particularly palaeographers, worked on their team. I think you all know the answer to that question. So it had taken them decades. The question then is not how successful these tools eventually are, but how much more successful they could have been, or how much more effective and timely the tools' development could have been had they bothered to ask a domain expert. So, working with textual objects in the West in the eleventh to thirteenth centuries, though this is extrapolatable to other writing traditions, times and places, is to engage in some of the richest but most frustrating research. From the seventeenth century onwards, the related disciplines of palaeography and diplomatic, the study of ancient handwriting and the study of documentary practice, have busied themselves with identifying and describing the historical and literary records, classifying scripts, determining dates from handwriting, and formally ascertaining how and why textual materials were produced.
As Dr Orietta Da Rold at the University of Cambridge has pointed out to me, and as my research into page layout, and particularly pricking and ruling, has shown, many manuscript scholars in the nineteenth and twentieth centuries were themselves essentially data gatherers, manually accumulating vast amounts of information that they used to make informed judgments about place and time, and about the method and means of preparing the substrate and laying out and inscribing text and image. So data gathering came in all forms, from collecting the categories of scripts to beginning to think about the ways in which manuscript pages are prepared.
These scholars, and scholars now, of course, spend time with thousands of manuscripts, manually gathering vast amounts of data. In his work on thousands of manuscripts' pricking and ruling patterns, for example, Leslie Webber Jones in the mid-twentieth century ascertained the probability of dates and modes of production for hundreds of manuscripts. His work is extraordinary and meticulous, to the level of individual prickings in the margins of the earliest Western codices.
Thus he deduced, in the case of Rome, Vatican Library, Pal. lat. 259, an eighth-century collection of Gregory the Great's homilies written by multiple scribes, that it represented a failure to use any of the insular systems of pricking, showing the insular foundations on the continent eventually breaking away from insular influences. Jones goes on to say, presciently, 'we need more data about the duration of these influences in individual centers'. He said this in 1944, and here we are nearly 80 years later with a chance, perhaps, to revive interest at scale in things like the writing grid. This is a fascinating and very under-investigated element of manuscript production: the preparation for the writing on the page.
Jones, like most scholars until the twenty-first century, did his primary research where he was able. Especially through the Second World War, a lot of the time he was not able to work with the manuscripts themselves in repositories, and his ancillary research was with plates published in Codices Latini Antiquiores and other facsimile collections. I've talked about this elsewhere.
But working with plates is analogous to working with manuscript fragments. The information is obviously limited by the extracted textual object, an object that emphasizes, through its presence on the pages of the facsimile, that which is absent: its entire host volume. This absence is visually remedied now by the digitization of many of the manuscripts previously only remotely accessible, or accessible as individual plates. And this is the first and most significant consequence of digitization and open access: that the field of manuscript research as we know it, as we have received it, must inevitably transform, because there is now a growing recognition that many of the foundational studies will need to be revised in the light of massive amounts of newly visible material.
And so this is that one example from Leslie Webber Jones, of manuscript Pal. lat. 259, now available in full to us through the digitization work of the Vatican Library. This is the manuscript that Jones declares does not adhere to insular practices, heralding a break in tradition on the continent. So there's a huge amount of weight put on this one codex. But on this occasion Jones has been following the plates provided by E. A. Lowe in the CLA, and those plates have provided Jones with the wrong information, because looking at the whole manuscript, as scholars can now do because of digitization, shows that there are in fact prickings in the inner and outer margins, the practice followed by insular scribes.
Now, this is not an isolated example. I could pull out many, many more instances, of more recent work than Jones's in palaeography, where categorical statements are made, as if quantitatively, about the date and appearance of particular letter forms, statements that can be dismissed by scrutinizing the manuscript now online in its entirety. The certainty declared by the earlier scholar is demonstrably misplaced, and this is an important and challenging consequence of the opening up of access to digital images.
What is important now is how this will affect the practice of, and responses to, the field of palaeography. Such previous scholarly certainties in palaeography are a product of the late nineteenth-century self-defining of a field by greats such as Sir Walter de Gray Birch and Sir Edward Maunde Thompson. They saw palaeography as a science capable of being separated from hesitation and doubt and firmly aligned with fact and certainty. In the light of the then contemporary positivist turn in science, of the development of photography as truth-telling, and of the making available of plates in the volumes of the CLA or the New Palaeographical Society, the scholar-reader could be assured that what they see would be what they get.
Of course, under scrutiny now, images do not tell incontrovertible truth, and individual folios do not represent the full evidence of even a scribal stint, never mind an entire manuscript. Letter forms change, perhaps minutely, from line to line. They change according to precise context of continuity, and they can transform significantly as a scribe develops their style through space or time or age or proclivity. Individual letter forms, excised manually or automatically for evaluation, are often deeply unrepresentative of certainty.
And this becomes apparent even when looking at whole digitized manuscripts, and apparent even at scale, perhaps especially at scale. So I'm referring specifically to individual graphs extrapolated for palaeographical letter-by-letter analysis. A project like Erik Kwakkel's quantitative work on script in the long twelfth century is very useful in promoting particular kinds of trends. But since scribal hands are individual, lifelong and often unlocalizable, even certain predominances or trends fail to provide so much as an assurance about particular letter forms, never mind a certainty. Currently, then, the more that is revealed through the digitization efforts of repositories, the more we discover about what we do not in fact know, and actually have never really known.
Moreover, the more that's open access and scrutinizable, the less early scholarship can be taken as providing a sure foundation. This is not to diminish the work of the early scholars. It was groundbreaking, and it still has a significance that is critical to our work going forward. But it is time respectfully to challenge certain kinds of palaeographical grand narratives about script, particularly those relying principally on individual graphs or letter forms, or on categories and labels, or on issues of aesthetic form and grade. The aims of palaeography are, to quote the manuscript scholar Julian Brown, 'first, to read ancient texts with accuracy; secondly, to date and localize their handwriting'. The first, reading ancient texts with accuracy, is of course one area where computational tools have really begun to make a difference. But I would argue that it is not the primary concern, because to think of reading the content of the manuscript without its context is to perform extractive scholarship, and I would hope that we're beginning slowly to get past that now, acknowledging the significance of the dimensionality and materiality of the textual artifact in its wholeness.
Digital material may treat script as if it were font, flat and transcribable, whereas ink on substrate is three-dimensional. But this is how textual editors have always treated manuscripts: as flat landscapes for extraction. So, to the dating and localization of handwriting. This is what palaeography has tried to do, and what thus far machines have not been especially useful in assisting. In traditional palaeography, to date a manuscript, features of the handwriting and the whole physical makeup of the book or document are assessed in the light of what is already known, uncertain though that knowing may be.
To reduce uncertainty, the scholar will often begin with a dated hand, and it can be relatively subjective, and it's obviously limited by the evidence that we have. The most recent of Kevin Kiernan's publications on the Beowulf manuscript, for example, is in the spring 2021 issue of Manuscript Studies, published by Penn. In his article Kiernan returns to the issue of the dating of the second hand in the Beowulf manuscript, London, British Library, Cotton MS Vitellius A. xv, part two, the image on the slide. In this long, quite brilliant article, Kiernan comprehensively dismisses David Dumville's dating of this scribe's work as pre-1016. David Dumville categorically says the manuscript cannot have been produced after the reign of King Æthelred, or words to that effect. And Kiernan systematically lays out the evidence from dated charters and from manuscripts that can be contextually dated.
He shows (please read the article), unequivocally to my mind, that Vitellius A. xv is in fact datable to post-1016, or could very well be datable to post-1016, to the reign of Cnut, the Anglo-Scandinavian king. And that, of course, dramatically transforms its reception: the manuscript's reception, the poem's reception, Beowulf's reception and its interpretive potential. And while ultimately the precise dating will never be known, I would argue it can still be made less uncertain through this Kiernan kind of careful, precise and holistic detective work, and, of course, his digital work. And still we must acknowledge, even through this wide range of evidence that Kiernan brings to bear, that all scholars have their own agenda, their own methods and their own determinism in evidence gathering. No data, as we know, is ever neutral, but neither is any scholar.
In most cases, the work of the palaeographer involves many specialist skills: textual evaluation, language skills, knowledge of the practices, tools and posture of scribes, and an understanding of writing environments and potential institutions of production. Even with expertise in these scholarly skills, which are increasingly in short supply, and even with a dated textual object with which to work, and an excellent knowledge of the types of scripts that one might expect to discover, it is a field, palaeography and manuscript studies, built on expertise and authority. Assessment often involves uncertainty, and it always involves opinion. Opinion, we know, is not equal to authority, but it can all too often be, as many of us have to deal with, treated as if it were. And so palaeography has always had its critics, who see it not even as an ancillary field but as a parallel mode of scholarly investigation.
In an article published in 2018 and based on his John Coffin lecture, A. S. G. Edwards is determined to crush palaeography as a single mode of study, seeing it as lacking authority and certainty, as filled with possibilities and perhapses. But I feel his issue in that article is less with the validity of the discipline as a whole and more with the way in which palaeographers explain their work and collaborate with other methods of study, like textual criticism.
In this, then, the work of the palaeographer is akin to the work of a computer scientist: the method can quickly become a black box, with expertise and specialized knowledge having to be taken on trust by those looking at the outcomes of a project or computation. So uncertainty here, in this sense, is viewed as negative and expertise is distrusted. But surely there is another component to all research that is important to acknowledge, both in computer science, in physics, in music and in palaeography, and that is, of course, as I've mentioned, the scholar's own agenda.
In the case of the article by A. S. G. Edwards, What is palaeography for? is the name of the article, and Edwards seems actually ultimately desirous of dismantling the work of specific medievalists who ascribe to the theory that Adam Pinkhurst was Chaucer's scribe, thought to be responsible for this manuscript, Peniarth 392 D, the Hengwrt manuscript, the Chaucer manuscript in Aberystwyth.
He was also the penman of numerous literary manuscripts in the later 14th and 15th century, and for those of you who have not been following this very long-standing debate, there's a lot now being written on it. And each kind of person who contributes has their own scholarly driving agenda. Edwards's criticism is aimed, justifiably, at the making certain of that which cannot be certain: assertion rather than suggestion. Of Mooney's entry for Pinkhurst in the Oxford Dictionary of National Biography, Edwards says, and I quote, that she has created a trap for the credulous that can cause unquantifiable trouble for posterity, given the authority the work, the Oxford Dictionary of National Biography, possesses, and the authority it thereby gives here to questionable palaeographical assertion. Yet this particular issue of Adam Pinkhurst is not an issue that can be laid squarely at the door of an entire field. So palaeography is trivialized by some researchers, and Edwards partially warns us of this, because opinion becomes evidence, he says, and uncertainty is the only certainty.
But opinion is very often what passes for evidence in scholarship, and the allied fields with which Edwards wants palaeographers to couple are also broadly interpretative subject specializations, like editorial and literary studies. Now, I don't recognize the narrowness of palaeography as he describes it, but I think he is right in pointing to expansive scholarship, accrual of evidence-based assertion, and collaborative spirit, and this is something for all of us to take on board going forward, but in the fashion of mutual respect, mutual respect of disciplines and subfields. And, most importantly, the appreciation of uncertainty, not as something to be dismissed and made negative, but actually as a contributor to move scholarship forward, as a productive mode of research.
Being less fearful of uncertainty and more certain of its representation of human endeavor is critical for AI. Being less fearful of uncertainty is critical for AI, and this is where there can be fruitful collaboration between machine learning, artificial intelligence and manuscript studies in its broadest sense. There must be acknowledgement that transparency, not obfuscation of parts of the method, is vital, and there must be care taken to admit the subjective nature of data processing, whether it's manual or computational. That is, we require an acceptance of the generative uses of uncertainty.
Uncertainties exist in abundance in machine learning and AI, but are submerged beneath the facade of results that can masquerade as fact. These uncertainties, as my colleagues Mark Algee-Hewitt at Stanford and Mohamed Cheriet of the University of Quebec have pointed out, are different from that which we do not or cannot know about a topic, phenomenon or, in my case, a manuscript.
Uncertainties in manuscript studies range from contextless books and their contents, to inherited labels for model scripts, as Kiernan discusses in relation to Square minuscule in his article on the Beowulf scripts, to lack of clarity and scholarly uncertainties about categories of scripts and various forms of terminology. Uncertainty exists in the ways in which one can go about the identification of scribal hands, as A. S. G. Edwards discusses with regard to the rumbling on of Pinkhurst in Chaucer's scribal circle.
In pre-Conquest manuscript studies, this kind of scribe identification has particularly important implications for understanding what kinds of writing environments existed, affiliated to or situated in religious institutions. In the case of Winchester in the earlier 10th century, in the post-Alfredian effort to establish a set of texts for freemen acquiring literacy, it's important to establish how many scribes were engaged in this effort of manuscript production, and I've written briefly about this case in the context of an article that I wrote on the role of palaeography in manuscript studies, which came out in late 2019, I think, in the Lisi Oliver memorial volume. This manuscript, Cambridge, Corpus Christi College, 173, is the Parker Chronicle, the pre-Conquest English Chronicle, and there are, it is said, one, two or even six scribes writing folios 16 verso to 25 verso of Cambridge, Corpus Christi College, 173. Now this has implications for the assessment of the writing office at Winchester: either there are six scribes, a well-staffed, highly trained group of scribes endeavoring to get primary texts produced, or there's one or two scribes in a much less professionally provisioned writing scenario.
So here's the argument: this kind of situation typifies uncertainty in manuscript technologies. We don't know who or where or when or how. But we do know where this book and its associated manuscripts were made, in actuality, and I have shown, along with David Dumville, that there is one scribe at work here. And my method differed from anybody else's: previous scholars have been working on individual graphs, based on the assessment of letter forms, rather like palaeographical study generally, but my method was the assessment at the level of the lexeme, incorporating the space between letter forms and graphs. And this in itself, the lexeme approach, is one that can be well suited to automated detection, and possibly even as a byproduct of new tools for handwriting recognition.
So let me show you another example, which derives from our NEH-funded part of the project called Global Currents. At Stanford, our team completed the funded phase in late 2018, and this was the last part of the larger international collaboration spearheaded by Professor Andrew Piper at McGill and involving teams of scholars exploring diverse textual traditions, to see what could be discovered through automated processes of feature extraction.
Our corpus of manuscripts was supplied from the Parker Library Old English manuscripts, a collaborative project of Stanford University and Corpus Christi College, Cambridge, consisting of 210 manuscripts dated between 1016 and 1220, and 63,000 total page images. An early collaboration in the project was between Professor Lambert Schomaker and his team at Groningen and the Stanford team, who were developing the automated handwriting recognition tool Monk, using training data supplied by Stanford undergraduate research assistants from a range of these twelfth-century manuscripts. Now, I was much less interested, in the end, in how the actual reading process was effected than in the byproduct, which was the identification of scribes through dialect and/or particular practices in their writing.
And this is demonstrable at the level of the lexeme, the unit of meaning, and its efficiency as a method of identification results from not simply the graph, the individual letter form and its scrutiny, but rather from the letter form plus latent adjacency, that is, that which lies between letter forms: space and contiguity. This again reinforces that research that I showed you on the Winchester scribes, which was a manual process, so it became clear to me how beneficial automating that process, this global recognition, would become. This part of the project, though, was superseded by a focus on feature extraction, and I've discussed different aspects of this on a number of occasions.
Andrew Piper and Mohamed Cheriet have recently published the findings of their particular research focus too, which was on the feature extraction of, I think it's the 18th- and 19th-century, footnotes from a variety of textual corpora. But what did the medieval part of Global Currents achieve, and how does it assist in mitigating or highlighting uncertainty? It's difficult to qualify or quantify at the moment because, while the data exists in the public domain, we're only now returning to its analysis and considering our next steps. But the research questions that motivated Stanford Global Currents concern the key moments of change in mise-en-page, page layout, in manuscript production in the Western European 12th century. And it's in this century, and I say this with certainty, that information retrieval tools that are still a key feature of textual production were first brought into combined, regular and consolidated use.
And among these are running headers, clear textual demarcation through rubrics, intertextual space, and enlarged or decorated initials. And these have been seen in various configurations previously, but in the 12th century new scripts evolve, and new designs for books and new functions for books are developed. The motivations for these changes concern the emergence of Scholasticism, increases in literacy and innovation in genre, but there is uncertainty about how these changes were effected and disseminated, and the timeline for these changes is still uncertain, as is what the impact might have been across writing institutions. By categorizing key elements of mise-en-page and providing training data through marked-up PDFs, we wanted to see if machine learning could automate the extraction of these elements. Now, we only had time in the process, the latter part of the two-year grant, to look at four principal features, as you can see, and the team's collaboration with Mohamed Cheriet's Synchromedia lab proved immensely successful.
We tested the capability of computational processes to find and retrieve. And we learned as much from the errors that were returned as from the positive extractions. And our data has been transformed into a palatable, selective gallery of tiles behind which, if you click on them, sits the full folio for context. And this is, of course, where the work starts: all of the uncertainty present in manuscript studies really emerges now, in the labeling, in the facts that we know or cannot know, in the ways in which these textual materials are described and handled by their various repositories.
Is it actually impossible to standardize this kind of work without an overwhelming degree of data cleaning? Not if we set the parameters effectively. So there are all kinds of immediate parcels of information to be garnered from cursory work at a rapid pace on these tiles, and this kind of visualization, as an automated process, can really change the way that we perceive the manuscripts we've worked on.
In terms of the big bibles that were popular in the middle decades of the 12th century, the Bury and the Dover Bible being chief among these in the Corpus collection, just looking at this slide, this is a snippet of the Bury Bible. You can see that the color yellow was used relatively frequently, and in fact it was used much more frequently in the Dover Bible than in the Bury Bible, which is shown with this set of tiles. Since the Dover Bible was made for the priory of St Martin's, Dover, by its mother house, Christ Church, Canterbury, and since yellow only accounts for 0.75% of the total of our sample data, we can deduce from this quite simply that where the outlier yellow appears in other mid-12th-century manuscripts from England, the first possible localization to be tested must surely be Christ Church.
So this is feature focusing at scale, and in ways through which these elements are not usually seen. And we indirectly discovered a great deal about categorization, as happens when training data has to be supplied. We discovered that the littera notabilior, or the embellished initial, as a descriptor was not enough, and the category had itself to be subdivided.
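The reasoning about yellow as an outlier can be illustrated with a small sketch. This is not the project's code: the function names, the 1% threshold and the toy sample are all assumptions made for illustration, with the sample invented so that yellow comes out at the 0.75% share mentioned in the talk.

```python
from collections import Counter


def colour_shares(labels):
    """Return each colour's share of all labelled initials (hypothetical helper)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {colour: n / total for colour, n in counts.items()}


def rare_colours(labels, threshold=0.01):
    """List colours whose share falls below the threshold (1% is an assumed cut-off)."""
    shares = colour_shares(labels)
    return sorted(colour for colour, share in shares.items() if share < threshold)


# Invented toy sample: 400 extracted initials, 3 of them yellow.
sample = ["red"] * 250 + ["blue"] * 100 + ["green"] * 47 + ["yellow"] * 3

shares = colour_shares(sample)
print(f"yellow share: {shares['yellow']:.2%}")
print("colours to follow up:", rare_colours(sample))
```

A colour flagged this way is only a prompt for a localization hypothesis to be tested, in the spirit of the talk, not a result in itself.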
But we also discovered that further subdivisions of practice, from littera notabilior or decorated initial to enlarged capitals and then other forms of capital, were also essential for clarity. We also discovered that errors that were returned through these automated processes are as interesting as precision. The algorithm occasionally thinks that parts of litterae notabiliores or enlarged capitals are the whole feature, as you can see on the right there, and I'll just show you with my cursor here. And that might suggest that further subdivisions of letter forms might be a useful way of moving this research forward. And we also discovered errors like holes in manuscripts that are O-shaped being thought of as O by the machine learning process. So is there a way in which we could develop this kind of technology to focus at scale on particular elements of a graph or letter form or, in fact, to identify and categorize damage to membrane?
For dating and localization potential, we discovered the rarest kinds of initials. So here's a lot of Ts, and I am really interested in this form of T here on the lower right, and we discovered that the rarest kinds of initials can help us in thinking through localization of manuscripts, so I am beginning to discover that these appear to be southeastern in origin and datable to around 1170 or 1180. And also it's interesting to see how automated searching itself kind of visualizes textual materials. It's interesting to note that there are particular kinds of clustering of scribal practices in particular manuscript volumes, such as the use of the fancy brackets that you see here in the latter quires only of the Dover Bible, so these things leap out in this visualized representation.
What does not leap out, as with palaeography through graphic analysis rather than through the lexeme or the word, what does not leap out is white space. One of my main areas of interest is the empty, the blank, the space between texts and in the margins, what goes into building the book behind and beyond the ink, and this formed one of our four principal categories of exploration. This was undertaken to determine if textual separation became systematized in the 12th century, as I in fact anticipated. Rather remarkably, though with the lowest precision rate, at 60%, intertextual space was detectable through our algorithm. I don't know if I thought it would be, but it is. And a bank of tiles in fact shows that nothing is actually blank space. The algorithm detects space that follows a punctus, whether it's a small space, as you can see here, or more expansive, but even the galleries show that the detection of space does more than find textlessness. These galleries show how the program envisages color, so it also shows show-through, as you can see, from the other side of the folio.
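For readers less used to the term, a precision rate like the 60% quoted for intertextual space is conventionally the share of returned items that turn out to be correct. The transcript does not say how the project computed it, so the sketch below simply assumes the standard definition, with hypothetical counts chosen to reproduce 60%.

```python
def precision(true_positives, false_positives):
    """Standard precision: correct detections divided by all detections returned."""
    retrieved = true_positives + false_positives
    if retrieved == 0:
        raise ValueError("precision is undefined when nothing was retrieved")
    return true_positives / retrieved


# Hypothetical review of 100 tiles returned as intertextual space:
# 60 confirmed by a human reviewer, 40 rejected (for example pale ink read as blank).
confirmed, rejected = 60, 40
rate = precision(confirmed, rejected)
print(f"precision: {rate:.0%}")
```

On this definition the rejected tiles are exactly the errors the talk describes as being as instructive as the correct extractions.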
But it recognizes that show-through is not the blankness, does not count as the blankness, on the page that it's actually examining. So show-through does not get picked up, which is good, but less good is the fact that pale color, faded-color ink, does not get retrieved and is anticipated as being blankness, okay, here in the prefer to, or here in these pale blue rubrics. So the bank of tiles also demonstrates the very varied range of off-whiteness of the substrate. And all of this kind of led Mateusz Fafinski at the Freie Universität in Berlin, Mohamed Cheriet in Quebec and me to think about the way that color is depicted through these images, through this automated process, and to begin the development of a new project called Global Hues,
which would deploy multispectral and terahertz imaging to determine granular color and to bring to the fore, through feature modeling, the dominance of visible and invisible patterns in the layered folio of the already existing IIIF images. It's clear from this work that we can now move apace to other related research questions, or questions that emerged from the dominance of uncertainty in manuscript technologies, that is, can we automate the process of dating?
Taking core components determined palaeographically, the scholar would need to train a model on specific features across a large enough set of materials to do that work. Could this be done through the hands of the scribe, perhaps components of the hand, coupled with color recognition of ink? Can we correlate this data with other features of mise-en-page, decoration and illuminations, the visible writing grid? Teams would need to split out the features for which we had a degree of dating and localization. A lot of these questions emanate from conversations that I've had with Benjamin Albritton at Stanford University Libraries, with Mark Algee-Hewitt in the English department at Stanford, with Mateusz Fafinski, Orietta Da Rold, Andrew Prescott, and other scholars.
Are the advances in knowledge likely to make a real difference in the authority of computational palaeography and codicology, to reveal what the eye simply cannot accomplish? How much authority does data provide in relation to the information retrievable by the lone expert, whose expertise is called into question by skeptics like A. S. G. Edwards and others? The determinism of singular approaches and the absolute need now for collaborative enterprises is well illustrated, briefly, by my last case study, Medieval Networks of Memory.
This is a project that's been running for the last three months, managed by Mateusz Fafinski, with four undergraduate research assistants entering data from two independent textual witnesses to the class of medieval artifact called the mortuary roll. It aims to reveal a new and dynamic picture of 13th-century religious and social networks and community collaboration, achieved through the describing, mapping, visualizing and analyzing of this roll, London, British Library, Egerton MS 2849, the mortuary roll of Lucy, prioress of Hedingham, which is datable to about 1225 to 1230, and Cambridge, St John's College, MS N.31, the mortuary roll of Amphelisa, prioress of Lillechurch, datable to circa 1210 to 1225.
These mortuary rolls comprise hundreds of written prayers said in commemoration of the deceased prioresses and entered by scribes of houses visited by the breviator, the carrier of the roll. We're interested also in trying to find women scribes, one of the most uncertain phenomena of all in this tradition in this period. So our research will show which English institutions of men and women were united in their efforts to remember Prioress Lucy de Vere and Prioress Amphelisa, and how they sought to inscribe their respects to them.
The rolls reveal a great deal about the varieties of religious houses and the nature of scribal practices and, as you can see, these two entries are from each roll but the same house, St Sepulchre in Canterbury, and they also show something of the esteem in which a nun might hold Lucy's or Amphelisa's house. So our team is producing a database, and let me just run you through a moving example which Mateusz Fafinski has prepared for me. So we're producing a database that contains information for each inscription and religious house written into the roll, creating manipulable data for an interactive map behind which will be locational, descriptive and evaluative evidence.
Such data permits a much closer account of spiritual networks in this period, together with an assessment of the religious houses' resources and abilities to connect perceptively with each other. New questions are already emerging from this project's work that we hope will allow subsequent innovative research on the holy women, their communities and their scribal capability, script types and trends in the earlier 13th century, and the significance and methods of collective memory formation in medieval England. And then, like so many other projects, like manuscripts, rather like those in Global Currents, with the mortuary rolls we have many certainties: we know roughly when they were produced, by whom and why, but how they were produced and what they transmitted about the status of writing and institutional representation is yet to be determined.
So, to conclude, there is great promise in machine learning and AI. It heralds the automation of processes that improve on humans' work, including clinical workflows, critical workflows, or predicting legal needs, but there is substantial cynicism about its safety, its potential to be unbiased, and its regulation. Presently, machine learning and AI are almost entirely STEM-centered, with some social sciences input and the unfortunate conflation of humanities under the umbrella of ethics, which is everywhere in relation to AI at the moment. Cultural heritage, the record of human endeavor, has had very little impact on machine learning and AI, but, as I've discussed here, where this world of the algorithm really falls down is in its inability to narrate stories, to convey nuance and emotion, to understand ambiguity and uncertainty, and it's why the humanities research I've shown is important.
Uncertainty is our bread and butter, and we should be advocates for its use in AI, advocates because uncertainty, unpredictability, nuance and idiosyncrasy is how humans function. We are not machines and machines cannot be us. So this is the moment for scholars to come together to collaborate in interdisciplinary teams, with mutual respect for distinctive domain expertise. Plenty of lip service is paid to this ambition that humanists must be meaningfully integrated into broad scholarly ecosystems; however, little in the last three years has occurred to demonstrate interdisciplinarity in practice. We don't talk such different languages; there are commonalities that must be aired and shared. Principal among these is uncertainty. The brave new world we've created in the last few decades is not living up to much. We're learning that a world of innovative technologies does not bring unequivocal benefits, any kind of global health or harmony or social justice. Technology is not in charge, though it'll take acknowledgement that scholars of all disciplines and all periods are needed to share their skill sets and improve the human condition. But you can only improve the human condition when you already understand what is meant by it. We ignore and undervalue humanities, medieval studies and the human record, then, at our peril. Thank you.
Thank you, Elaine, for that fantastic talk, and you would have to imagine a wave of applause running around the room, but unfortunately the nature of the setup here is that I think nobody is able actually to applaud, either by reaction or by clapping, but on their behalf I would just like to say thank you so much. I think there are many, many things in there which we could fruitfully discuss for some time. Just to say, to the weather right, there is a possibility of reaction.
We have about half an hour, about 30 minutes, until about half past, UK time, so there is plenty of time for all of you in the audience to be able to raise questions, if you would like to do so. As Vicki has done, feel free to raise a hand in the attendees list. The way this works, however, is that the Q&A button is the way to ask questions, as it stands at the moment, so if you have a question, if you could put it into the Q&A box, then I'll make sure that Elaine will be able to see it as well, and I'll make sure that we get a response. So, Vicki, I guess you have the question; if you can put that, please, into the Q&A, then we will address that and any others which come up.
I'm sure there are many other questions. I'm going to abuse my position, as is traditionally the case, just to say a few things. First of all, I cannot echo more what it is that you said about many things, about transparency, about reciprocity and many other abstract nouns. But I would also especially like to echo what it is you said about our disciplines, about the threat to them, and I think Leicester in particular. And I'm sure many others of us would say the same. And Vicki didn't actually have a question; she just wanted to say thank you.
I would like, Elaine, just to kick this off, perhaps by starting at the macro level. You mentioned a couple of times reciprocity and the respect for domain expertise, and I think we would all echo that. I just wonder, are we speaking to the right people? In a way, we who are humanists are trained to respect creativity and uncertainty as things which are positive. But are we still just talking to ourselves?
And is there a way of communicating that more widely that we are perhaps missing? And I appreciate this is not really the objective of your talk, but it was something that struck me on the macro level, about what it is we, not only as medievalists but as humanities scholars, ought to be doing to place the human in the algorithm, so to speak.
Yeah, I think that's probably the $64,000 question, isn't it? That question, are we speaking to the right people? Not here, now, no, Aengus, because, you know, I suspect that we're all humanists, humanistically inclined, here, and already know this issue is a major problem. And how do we speak to the people we need to speak to? So, first of all, you know, we need to kind of confirm this among ourselves, that those people who are interested in automated processes, whatever they are, those people should really, kind of, I think, louden the surge of voices.
So humanists themselves need to really respect what other humanists are doing in different fields, and also social sciences, and I don't say that as a kind of aside, but, honestly, you know, social scientists can be as bad at not including humanists as anyone in any other sort of sets of schools, and again sort of vice versa. But also, as I say, humanities has been kind of really diluted, put under the umbrella of ethics, and because ethics and AI is such a huge topic at the moment, AI, CS and engineering colleagues think that they are including humanists in their work, but that's not necessarily the case at all, because a lot of ethics work is done by social scientists, and besides, it's not the whole of humanities.
So how to get our voices heard, that really is the question, certainly at Stanford and in other places where we're working with collaborators. The squeaky wheel gets the oil, right? So if you keep on and on kind of shouting about it, I think at some point there's a kind of, we better just pay a bit of attention. That certainly is what is eventually happening here after a goodly number of years. I've given presentations that are very technical to the human-centered artificial intelligence institute here and the data science institute here, and they all applaud me, think it's lovely and nice pictures, and then I never hear a thing, and I just think, well, maybe it's got to be the way I'm selling it.
And that is the issue. I think it's partly that we don't recognize the value of our own work, and that might be as a result of decades of battering from funding agencies, governments, our own institutions. There is no doubt that we are in an absolutely critical tipping-point moment for many humanistic specializations in America and certainly in the UK and, possibly, I think, actually more broadly in northern Europe. It is absolutely critical that we galvanize ourselves now and start to really kind of knock on doors and sing our own praises and show why the work that we do is so important, and I think one of those key areas would be in ontologies, epistemologies and typologies, right, our work on categories and our work on things like uncertainty and open-endedness.
I just, you know, I think we need to muscle in where the action is, and when you apply for a grant and you don't get it, don't accept that: demand feedback. Find people who will collaborate with us; there are lots of humanistically inclined CS people. Involve the students in our work, in our labs and in our projects, wherever they are. So I think a multi-pronged kind of approach, because right now our message is not getting through, and I've said this before and it's a little bit rude, but I'll say it again.
What I think really needs to be made apparent through the loud, clear, positive message that we send out, what needs to be made apparent, is that just because you're a human being doesn't make you a humanist. Just because you can read a book doesn't make you a history scholar. Just because you play the flute on a Friday night with your orchestra does not make you a humanist, right? These are skills that have taken us years to acquire. What are those skills? If I hear critical thinking one more time, I think my head will explode, because what does that even mean? And are we saying that only humanists have critical skills? Of course not.
Biologists have critical skills, mechanical engineers have critical skills. So what is it that humanists do without which, frankly, the scholarly world would be much poorer? That is what we have to articulate. And I know the British Academy has the SHAPE, or In SHAPE, or whatever, project, and various kinds of conglomerations of people have got together to do kind of save-the-humanities and, you know, shout-out-for-the-humanities kinds of initiatives, and I think those are all fine, but the professional associations need to lobby, and senior scholars need to lobby in their own institutions.
We need to involve a kind of transgenerational ripple effect that is loud, because, again, and I think this is true, the squeaky wheels will get the oil, and so we need to be very, very squeaky wheels.
There, though, I couldn't agree more. I think perhaps I was going to be even less polite and say we just need to be more shameless about what it is that we do, and why it is important. Sorry, I'm taking over this discussion, and there's a specific question from camellia Mr Oliver, and I don't know if you can see that, but she was just wondering if you ought to take into account the size of letters in manuscripts, and how different plates, then, could possibly affect the handwriting of a certain scribe.
I don't know what different plates means: different plates, different places in volumes. So, the size of the letters in manuscripts, well, I am obsessed with size, I can tell you that. Phenomenologically, I think size matters. Arguably, in the first kind of layer of interpretation, size matters more than anything, and it matters in all areas of life. So I bought recently an artist's book of chauce