Computers & Texts No. 11
March 1996

Computer-Assisted Literary Analysis Using the TACT Text-Retrieval Program

Guyda Armstrong
Department of Italian
University of Edinburgh

TACT is a text-retrieval program which allows the user to work with a text in order to discover the location of lexical units and internal patterns. It is therefore a potentially sophisticated tool for literary analysis, permitting investigation of the text's stylistic, grammatical and lexical features, as well as acting on the simplest level as a computerised concordance. This article describes how TACT has been integrated into the teaching of Italian literature at Edinburgh University.

The TACT program has now been incorporated into the Italian undergraduate teaching curriculum at Edinburgh University as part of the second year literature course. The text chosen for this is Machiavelli's Il Principe (The Prince). The course is taught in the traditional way, with a series of lectures on the text and its context in Renaissance Florence, but parallel to this, the computerised textual analysis program is also presented as a means of familiarisation with the book. Students are therefore able to apply the knowledge they have gained from their own reading and the course of lectures to the textual database of Il Principe, to investigate key words and concepts that they have identified, and come to an understanding of their meaning.

Il Principe was selected for this work not only because of its place in the second year teaching curriculum, but also because it is a key text in the development of political thought. In this work Machiavelli creates new meanings from old words, and through this lexical development we can see the beginnings of political concepts which have become fundamental to later generations. However, it is this semantic development of terms which students find hardest to grasp, and so the TACT program offers them various ways to analyse the occurrence and context of these new terms.

By using certain features of the program, students are able to juxtapose textual data in such a way as to establish relationships between words which illuminate each other. The intellectual purpose of this exercise is to develop the ability to select relevant critical concepts for investigation; the student must first formulate clear critical questions, and then translate these questions into a series of commands. Because the program responds only to clearly specified commands, the students are encouraged to approach the text in a creative but organised manner, considering the whole text, but also predicting specific results and testing hypotheses. The real value of the exercise lies in the speed and ease with which students can prove or disprove their 'hunches' about various aspects of the text, in a readily accessible graphical format.

The Machiavelli database has been placed on the menus in the Languages CAL laboratory, and so all students of the Italian department (and other language departments) have access to it. Indeed, many students who are not following this particular course have expressed an interest in using the program, and they are free to familiarise themselves with it in their own time.

TACT Basics

The TACT program does not work on a simple text file; instead it demands a textual database which contains not only the full text of the book, but also complete indexes of all the positions of the words in the text. The textual database also contains all the information about the text's formal structure, for example, where a new chapter begins, or the position of the chapter heading (which provides a summary of the theme for each chapter, and is therefore significant to an understanding of Machiavelli's work). Once completed, this textual database cannot be modified; however, the TACT program allows the users to create a parallel personal database where they can store and manipulate their findings.
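The idea of a database that records the position of every word can be illustrated with a minimal Python sketch. This is not TACT's own file format or code, which this article does not describe; the function names `tokenize` and `build_index` and the sample sentence are invented purely for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into word tokens (letters only)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def build_index(tokens):
    """Map each word form to the list of token positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(tokens):
        index[word].append(position)
    return dict(index)


# Invented sample sentence, not a quotation from Il Principe.
sample = "La fortuna aiuta; la virtu vince la fortuna."
tokens = tokenize(sample)
index = build_index(tokens)

print(index["fortuna"])   # token positions of every occurrence
print(len(index["la"]))   # frequency of a word form
```

Once such an index exists, looking up a word no longer requires rescanning the text, which is one plausible reason a retrieval program would build its database in advance.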

The TACT program is menu-based, and is controlled in the first instance by using the 'Action Bar' at the top of the Introduction Screen. This allows the user to move along the menus in order to implement a command. The Select menu allows the user to view the Word List of the entire text and to select a keyword from it. More sophisticated searches can be done using the 'Autoselection' command, where combinations of characters or words can be specified.

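Once the lexical item is selected, TACT offers a number of different displays in which the results of the search are presented, making it possible, for example, to see which adjectives are used with particular terms. These displays are found in the Current menu and include the Index display (a one-line concordance), the collocate display and the distribution graph, all of which are used in the exercise described below.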

From this brief outline of some of the functions of the TACT program, it is clear that such an application offers many advantages to the literature student (and teacher). Electronic text analysis is of value to the study of the structure, style, vocabulary and content of the text. On its own it evidently cannot analyse the meaning of the text, but in response to an intelligently structured command it will provide quantitative data for critical appreciation. The TACT program encourages sensitivity to the linguistic content of the text, and allows the students to carry out their own independent research from an empirical basis.

TACT and Il Principe

The second year students were introduced to the TACT program in the two weeks of their course in a 'hands-on' introductory session. Many of these students had no previous experience of using computers, and so the group of 45 was divided into three smaller groups so that everyone could benefit from individual attention from the demonstrator.

The course was organised so that every student had to attend one of the presentation hours, and then follow this up with further work done alone in the public access computing labs.

The first session consisted of a guided exercise where the students were able to familiarise themselves with the various functions of the program, allowing them to work at their own pace but to achieve a tangible result at the end of the session.

The exercise that was devised for them centres on the key concepts of virtu ('prowess') and fortuna ('fate'), and the associated concept of prudenzia ('caution'). They investigated these words using the various displays, first individually and then together in more sophisticated searches.

The final part of this exercise was to compare the distribution graphs of all three words and draw some putative conclusions about the overall structure of Il Principe. As an example, I have included extracts from this exercise.

The students were asked to select virtu from the word list, and consider it using the various displays available, particularly the collocate display. This completed, they were then asked to repeat the process for fortuna.

The word list shows immediately that virtu occurs 59 times in the whole text, and also shows the words which are lexically linked to virtu, such as virtuosamente ('virtuously', 'skilfully'), virtuoso ('able'), and even virtuosissimamente ('extremely skilfully'). The next step for the student is to examine this keyword in context, and so the Index display (one-line concordance) offers an initial context for all the occurrences of this word (see screen shot 1). This data can be further refined using the collocate display, which presents the results as a table of percentages. The word with the highest Z-score associated with virtu is fortuna, which occurs 54 times in the whole text, 15 of these within a margin of five words of virtu (see screen shot 2). This is not surprising when we consider that one of Machiavelli's crucial arguments in the text is to demonstrate how human ability (virtu) is able to overcome bad fortune (fortuna). Machiavelli considers fortuna to be an intractable force which wreaks havoc on human affairs, but which can, exceptionally, be mastered if one possesses enough virtu.
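To make the notion of a collocate Z-score more concrete, the following Python sketch ranks the words found within a fixed margin of a keyword. It uses one common formulation of the collocation z-score (observed count minus expected count, divided by the standard deviation under a chance distribution); the article does not state which formula TACT itself uses, so this should be read as an assumption. The function name `collocate_z_scores` and the toy token list are invented for illustration.

```python
import math
from collections import Counter


def collocate_z_scores(tokens, node, span=5):
    """Rank the words found within `span` tokens of `node` by z-score."""
    total = len(tokens)
    freq = Counter(tokens)
    node_positions = [i for i, w in enumerate(tokens) if w == node]

    observed = Counter()
    window_size = 0  # number of token slots examined around the node
    for i in node_positions:
        lo, hi = max(0, i - span), min(total, i + span + 1)
        for j in range(lo, hi):
            if j != i:
                window_size += 1
                observed[tokens[j]] += 1

    scores = {}
    for word, obs in observed.items():
        if word == node:
            continue
        # Chance of meeting `word` at any slot not occupied by the node.
        p = freq[word] / (total - freq[node])
        expected = p * window_size
        sd = math.sqrt(expected * (1 - p))
        if sd > 0:
            scores[word] = (obs - expected) / sd
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: 'fortuna' is rare overall but always sits next to 'virtu'.
tokens = (["x"] * 20 + ["virtu", "fortuna"] + ["y"] * 20
          + ["fortuna", "virtu"] + ["z"] * 20)
ranked = collocate_z_scores(tokens, "virtu", span=2)
print(ranked[0][0])  # the strongest collocate in the toy data
```

A word that is rare in the text as a whole but frequent near the keyword receives a high score, which is why a relatively infrequent word can outrank very common function words. Overlapping windows are counted twice in this sketch, a simplification a fuller implementation would need to address.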

The TACT program therefore makes no startling discoveries when it demonstrates the relationship between virtu and fortuna, but it does offer evidence which can be used to prove or disprove theories, or 'hunches'. The above example is very useful as a starting point for the exercise because it demonstrates that the relationship between virtu and fortuna (of which the student is already aware from the lectures) can be deduced from the data in front of them. Of course, the program is not fool-proof and may easily miss relevant evidence if it does not fall within the strictly specified parameters of the search. The above search, for example, would automatically exclude any occurrences of fortuna which lie more than five words away from virtu, even if they are part of the same phrase. TACT, however, allows the parameters of each search to be modified according to personal specifications, so the context search could be done several times, with widening word-margins, in order to avoid this problem. The change of emphasis produced by these 'innocent' modifications can be quite revealing.
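The effect of widening the word-margin can be sketched in a few lines of Python. This is an illustration of the principle only, not of TACT's interface; the function name `cooccurrences` and the toy token sequence are hypothetical.

```python
def cooccurrences(tokens, node, other, margin):
    """Count occurrences of `other` lying within `margin` tokens of any `node`."""
    node_positions = [i for i, w in enumerate(tokens) if w == node]
    count = 0
    for j, word in enumerate(tokens):
        if word == other and any(abs(j - i) <= margin for i in node_positions):
            count += 1
    return count


# Invented token sequence; single letters stand in for other words.
tokens = "virtu a b fortuna c d e f g h fortuna i j k l m n o p virtu".split()

# Repeat the same search with progressively wider margins.
for margin in (2, 5, 10):
    print(margin, cooccurrences(tokens, "virtu", "fortuna", margin))
```

In this toy sequence a margin of two finds nothing, a margin of five finds one co-occurrence and a margin of ten finds both, which shows how strongly the evidence returned depends on the chosen parameter.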

After they have produced the display, the students are invited to make judgements on the co-occurrence of these two words and use the data provided by TACT to consider the relationship between these keywords. A further step is to create a distribution graph for each of the words, from which they can see which chapters use these words most often, and to analyse this lexical choice in the context of the text.
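A distribution of this kind amounts to counting a keyword chapter by chapter. The Python sketch below assumes the text is already available as a list of (chapter label, chapter text) pairs; the function names `distribution` and `bar_chart` are invented, and the three short 'chapters' are placeholder phrases, not quotations from Il Principe.

```python
import re


def distribution(chapters, word):
    """Return (label, count) for `word` in each chapter, in document order."""
    rows = []
    for label, text in chapters:
        tokens = re.findall(r"[^\W\d_]+", text.lower())
        rows.append((label, tokens.count(word)))
    return rows


def bar_chart(rows):
    """Render the counts as a simple text bar chart, one chapter per line."""
    return "\n".join(f"{label:>4} {'*' * count} {count}" for label, count in rows)


# Placeholder chapters for illustration only.
chapters = [
    ("I", "acquistare lo stato con virtu"),
    ("II", "fortuna e virtu"),
    ("III", "prudenzia e ancora prudenzia nel mantenere"),
]

print(bar_chart(distribution(chapters, "prudenzia")))
```

Raw counts favour long chapters, so a fuller version might divide each count by the chapter's length before comparing chapters.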

The second part of the exercise concerned the word prudenzia, a keyword which is conceptually linked to virtu and fortuna. The students were asked to consider this keyword's distribution and draw conclusions about Machiavelli's use of the word (see screen shot 3).

The word prudenzia only emerges as a lexical choice in the later chapters of the book, and from their own knowledge of the text, the students should be able to identify that Machiavelli does not advise 'prudence' in the earlier chapters, which concentrate on obtaining a state. The keyword does, however, feature in the later chapters, which are concerned with the maintenance and ruling of the state, and so the students should ideally be able to get an idea of the logical development of the themes of the text, and its overall structure. Thus Machiavelli advocates 'throwing caution to the winds' when obtaining the state, and prudence when maintaining it.

Most students were comfortable dealing with the quantitative information generated by the displays, but had more trouble when forced to draw qualitative conclusions from them. However, this problem is certainly not limited to computerised texts.

It can be seen from this example that TACT can only provide evidence, and not concrete answers. It cannot compensate for inadequate preparation, but it can offer a new perspective on the text, and produce unexpected leads which can be followed up elsewhere. From their work in the lab, the students can continue their work in more traditional ways, by investigating the etymology of these keywords, the development of these concepts, synonyms and associated words, and so on.

One advantage of using the TACT program with the Machiavelli database is the fact that the students are forced to investigate the text in the original Italian. Because of its linguistic difficulty, Il Principe is taught to the second year students in translation, but by using TACT the students are immersed in the original language and so constantly develop their linguistic knowledge.

The motivation in using TACT to teach Il Principe is predominantly academic. The program is most useful as a tool to quickly prove or disprove the user's 'hunches' about the text, and also for its quick contextualisation. However, it offers no sure answers, and can only prompt the user to draw conclusions from the data provided. Far from absolving the student from having to think about the text, the Machiavelli database demands a logical thought process when forming commands to use the program, and this skill can subsequently be applied when analysing computerised and non-computerised texts.

The students are given the option of writing an essay paper based on their original research. TACT is very user-friendly in this respect because the personal database created by the student can be saved as an ASCII file and accessed by any word-processing program.

Some of the searches which TACT performs could, of course, be done with a simple word processor, but the TACT program is much faster and offers many more features, the most important of these for our purposes being the swift contextualisation facility.

I have also recruited a group of students who are particularly interested in this method of literary analysis to assist me in my own research towards developing a coherent pedagogy for teaching Italian Literature using computerised teaching aids, and would be interested to hear from anyone who is engaged in similar work.

We hope to later apply this work to an English-language textual database of The Prince, which will be of use to students of political science and history.

Computer-assisted literary analysis is a useful end in itself, and it becomes even more so at a time when university resources seem likely to remain under growing financial pressure. The Machiavelli database allows students to investigate the meaning of the text from an empirical basis, and therefore continues the tradition of academic literary analysis in a new medium which is well suited to the demands on resources.

References

1. Virtu has a range of meanings in the context of Machiavelli's work, and there is no exact equivalent in either modern Italian or English. Among its meanings in Il Principe are 'prowess'; 'ability'; 'courage'; and 'intelligence', depending on context.

The project outlined in this article was done using TACT version 1.2. This version of TACT and also the newer version (TACT 2.1) are freely available from the Centre for Computing in the Humanities (University of Toronto) web page (http://www.chass.utoronto.ca:8080/cch/tact.html). TACT is also distributed by the CTI Centre for History, University of Glasgow, 1 University Gardens, Glasgow, G12 8QQ. Tel: 0141 330 4942; Fax: 0141 330 5518; Email: ctich@glasgow.ac.uk.




Computers & Texts 11 (1996), 8. Not to be republished in any form without the author's permission.

HTML Author: Michael Fraser (mike.fraser@oucs.ox.ac.uk)
Document Created: 25 April 1996
Document Modified: 27 April 1996

The URL of this document is http://www.ox.ac.uk/ctitext/publish/comtxt/ct11/armstron.html