Transaction counts by age, gender, and publication date.
If someone read A, what else did they read?
A spreadsheet which may help with understanding serial reading.
Another spreadsheet which may help with understanding the percentage of Southern reading in a person's fiction reading (for all years, and for 1898 to 1902).
New transaction count spreadsheets: 1898 to 1902 (all transactions) and 1898 to 1902 (just with demographics).
A spreadsheet listing the contents of the Lost Cause corpus. Note that the listing includes 11 rows for novels for which we could not find a text. And one spreadsheet listing transaction counts for titles (i.e., collapsing titles with multiple volumes or copies into one row).
A second listing of transaction counts for titles. This one includes two counts: one for all transactions, and a second for transactions for which we have demographic data.
A start on time slices (scroll down for graphs), i.e., graphs of the number of checkouts per month for the novels in our corpus.
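The time slices amount to a per-month tally of checkouts. A minimal sketch in Python, using invented transaction records (the titles and dates are hypothetical, not from our spreadsheets):

```python
from collections import Counter
from datetime import date

# Hypothetical transaction records: (title, checkout date).
transactions = [
    ("Red Rock", date(1899, 1, 4)),
    ("Red Rock", date(1899, 1, 21)),
    ("Red Rock", date(1899, 2, 3)),
]

# Tally checkouts per (title, year-month) "slice".
per_month = Counter(
    (title, d.strftime("%Y-%m")) for title, d in transactions
)
```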
A network graph visualizing "overlapping" Lost Cause books. Books, represented by circles, are connected if at least 15 people checked out both.
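The connection criterion for the network graph can be sketched as follows. This is a minimal stdlib-only version with an invented sample and a lowered threshold (the actual graph uses 15 shared readers); the real computation works from the Muncie transaction records:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (reader, title) checkout pairs.
checkouts = [
    ("r1", "Red Rock"), ("r1", "The Grandissimes"),
    ("r2", "Red Rock"), ("r2", "The Grandissimes"),
    ("r3", "Red Rock"),
]

# Collect the set of readers for each title.
readers_of = defaultdict(set)
for reader, title in checkouts:
    readers_of[title].add(reader)

# Connect two books if enough people checked out both.
MIN_SHARED = 2  # the graph described above uses 15
edges = [
    (a, b, len(readers_of[a] & readers_of[b]))
    for a, b in combinations(sorted(readers_of), 2)
    if len(readers_of[a] & readers_of[b]) >= MIN_SHARED
]
```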
Two simple lists: one, of readers and the Lost Cause books they read, and the other, of Lost Cause books and the "overlapping" Lost Cause books.
Bubble graphs for ALL of the titles in the Lost Cause corpus, and for ALL of the authors in the Lost Cause corpus. Note that the mouse-over legend was changed from "median" to "mean"; these, like the earlier graphs, actually report mean age.
Bubble graphs (1898 to 1902) for ALL of the titles in the Lost Cause corpus, and for ALL of the authors in the Lost Cause corpus.
Comparing the Lost Cause corpus to the whole of the Muncie corpus. In the graphs, books in the Lost Cause corpus are colored red; other books in the Muncie corpus are colored blue. The next-to-last graph ("T-SNE -- lda") is the only one which suggests that there may be some clustering/grouping worth looking into, although it's not entirely persuasive.
Demographic characteristics -- Lost Cause readers vs. readers generally. I don't see much difference . . .
Network graph of readers. Readers are connected if they read similar books. Green readers read Red Rock.
Our entire corpus contains 159 volumes, and we have record of 11,838 checkouts of those volumes. 61 of those volumes (3,365 transactions) are represented by texts from Internet Archive or Hathi Trust (note the "source" column in the spreadsheet listing the contents of the corpus), and 98 (8,473 transactions) by texts from Project Gutenberg. For most of our purposes, the text source (IA? PG?) doesn't matter. However, for the chunk topic modeling which follows, it does matter, because I use paragraphs (available only in PG) as chunks, my assumption being that paragraphs mark a reasonably coherent, authorial intention of "aboutness" (i.e., a paragraph is "about" something in a way that some arbitrary chunk would not be).
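Paragraph chunking of that sort can be sketched as a split on blank lines. This is only an illustration of the approach, not the project's actual chunking code:

```python
import re

def paragraph_chunks(text):
    """Split a plain-text file (e.g. a Project Gutenberg e-text) into
    paragraph chunks: runs of non-blank lines separated by blank lines."""
    pieces = re.split(r"\n\s*\n", text.strip())
    # Collapse internal line wrapping to single spaces.
    return [" ".join(p.split()) for p in pieces if p.strip()]

sample = "A first paragraph,\nwrapped across lines.\n\n\nA second paragraph."
```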
Project Gutenberg-only bubble graphs for titles in the Lost Cause corpus, and for authors in the Lost Cause corpus. The graphs are intended only to give some insight into how the demographics of readers of the PG-only texts differed from those of readers of the corpus as a whole. This was important because the chunked topic modeling was done only on PG texts. These graphs offered us some assurance that the chunked topic modeling was not drawing on a set of readers whose demographics were wildly different from those of the readers of the entire corpus.
A chunk topic modeling browser (50 topics).
Paragraph/chunk topic modeling for the Project Gutenberg texts in our corpus. Intended as a reading interface, and not an argument (50 topics).
Topic correlation (50 topics).
A chunk topic modeling browser (100 topics).
Paragraph/chunk topic modeling for the Project Gutenberg texts in our corpus. Intended as a reading interface, and not an argument (100 topics).
Topic correlation (100 topics).
Lynne Tatlock, Steve Pentecost, and Doug Knox propose to present their (preliminary) findings on reading about the American South in the Muncie Public Library, 30-35 years after the end of the Civil War and approximately 20 years after the end of Reconstruction. We propose to follow the methods our team (plus Matt Erlin) employed in Lynne Tatlock, Matt Erlin, Douglas Knox, and Stephen Pentecost, “Crossing Over: Gendered Reading Formations at the Muncie Public Library, 1891-1902,” Journal of Cultural Analytics, 3.22.18, namely to study and connect readers (demographic information and book selections) and texts. We have preliminarily identified 177 pertinent titles with southern contents; we will home in on a selection of these based on their circulation numbers. Even a preliminary look suggests that the texts sort in various ways, each of these groups offering not just different takes on the "Old South" but different ways of presenting these views. We are particularly interested in tracing the presence of elements of the complex of ideas associated with the "Lost Cause" according to genre, narratives, tropes, and themes and in determining what reading selections told reading populations to think and feel. Perhaps needless to say, the presence or absence of African Americans and their textual representations will play a critical role, as will more generally the vocabulary deployed in depictions of landscapes, social relations, and built environments. At the very least, we will be able to present a descriptive account, but based on this work we hope to propose what Muncie reading tells us more generally about the diffusion of cultural myths.
A chunk topic modeling browser (50 topics).
Paragraph/chunk topic modeling for the Project Gutenberg texts in our corpus. Intended as a reading interface, and not an argument (50 topics).
A chunk topic modeling browser (100 topics).
Paragraph/chunk topic modeling for the Project Gutenberg texts in our corpus. Intended as a reading interface, and not an argument (100 topics).
(OBSOLETE -- SEE ABOVE) Bubble graphs for the titles in the Lost Cause corpus.
(OBSOLETE) Bubble graphs for the authors in the Lost Cause corpus.
(OBSOLETE) Bubble graphs for the **(Project Gutenberg only)** titles in the Lost Cause corpus.
(OBSOLETE) Bubble graphs for the **(Project Gutenberg only)** authors in the Lost Cause corpus.
Two simple lists: one, of readers and the Lost Cause books they read, and the other, of Lost Cause books and the "overlapping" Lost Cause books. The reader report is easy to understand; it looks like this:
Knowlton, Bobbie
    Adams, William T. Bear and forbear.
    Adams, William T. Down the river.
    Adams, William T. Fighting for the right.
    Adams, William T. Taken by the enemy.
    Allen, James Lane. The blue-grass region of Kentucky.
    Allen, James Lane. The reign of law.
where "Knowlton, Bobbie" is a reader, and indented below his name are the Lost Cause books he checked out.
The "book" report is not much more complicated, although it's harder to explain. Here's an example:
Adams, William T. A victorious union. 137
    Adams, William T. Within the enemys lines. 86
    Adams, William T. Stand by the Union. 78
    Adams, William T. On the blockade. 77
    Adams, William T. Fighting for the right. 76
    Fosdick, Charles Austin. Frank on the lower Mississippi. 63
This report says that one book, "A victorious union," was checked out 137 times. The indented lines beneath mean that "Within the enemys lines" was checked out by 86 people who also checked out "A victorious union"; that "Stand by the Union" was checked out by 78 people who also checked out "A victorious union"; and so forth.
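The book report's logic can be sketched in Python. The reader records below are invented, and this sketch counts overlapping readers (as the report does for the indented lines), not raw transactions:

```python
from collections import Counter

# Hypothetical reader -> titles-checked-out records.
books_by_reader = {
    "r1": {"A victorious union", "On the blockade"},
    "r2": {"A victorious union", "On the blockade"},
    "r3": {"A victorious union"},
}

def overlap_report(focal):
    """For a focal title, count the readers who also checked out
    each other title, sorted by that count (descending)."""
    counts = Counter()
    for titles in books_by_reader.values():
        if focal in titles:
            counts.update(titles - {focal})
    return counts.most_common()
```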
The Lost Cause corpus is available on Box, along with a bunch of old notebooks. The notebooks include the topic modeling code (see the next paragraph below), as well as two experiments which we may not have completed: 1) code to identify cliches, and to cluster texts based on the cliches they use; and 2) code to identify passages of pastoral.
A spreadsheet listing the contents of the Lost Cause corpus. Please note that the listing includes texts we wanted to find, but were not able to.
A couple of topic modeling runs using the Lost Cause corpus. If you look at this output from topic modeling chunks and search for "TOPIC 13", you can get a quick sense of just how much dialect is in this corpus.
A couple of simple, lightweight search interfaces: