Greiffenberg -- sliding windows

Try to find words which are "clumped" in the Greiffenberg poem, "clumps" being a proxy for progression or word frequency changes. Do some words occur more often in specific parts of the poem?

This process looks only at non-stopwords which occur 25 times or more in the poem (158 words; called words_for_analysis in the code). It processes the poem as a set of shingles instead of as a set of chunks. Shingles are like chunks except that, unlike chunks as we've used them in the past, shingles overlap each other. Shingles produce a more smoothed set of bar plots than chunks. Because shingles overlap, any given word occurrence can appear in more than one shingle, depending on where it falls in the poem; however, shingles mitigate some of the arbitrary chopping apart of the poem that results from chunks.

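The process is set to run with a shingle size of 2100 lines and a shingle overlap of 400. Despite its name, shingle_overlap is the step between the starts of consecutive shingles: each shingle begins 400 lines after the previous one, so neighboring shingles share 1700 lines. This produces 11 shingles (numbered 0 to 10) of about 2100 lines each. Going by the totals printed below (28464 non-stopword tokens on 6078 lines), that is roughly 9,800 non-stopword tokens per shingle. Word counts--not word frequencies--are used in determining variance, finding "clumps", etc; since the shingles are about the same size, word counts are functionally the same as word frequencies.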
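
The window arithmetic can be checked with a small sketch. It assumes, as in get_shingles further down, that sizes are measured in poem lines and that the second parameter is the step between shingle starts. shingle_bounds is a hypothetical helper written for this illustration; it is not part of the notebook.

```python
def shingle_bounds(n_lines, shingle_size, step):
    """Return (start, stop) line ranges for overlapping windows."""
    bounds = []
    start = 0
    while True:
        stop = start + shingle_size
        # clip the last window to the end of the poem
        bounds.append((start, min(stop, n_lines)))
        if stop >= n_lines:
            break
        start += step
    return bounds

# 6078 lines, size 2100, step 400: the settings used in this run
bounds = shingle_bounds(6078, 2100, 400)
print(len(bounds))   # number of shingles
print(bounds[-1])    # line range of the last shingle
```

With these settings the last shingle starts at line 4000 and is clipped at line 6078, so it is only slightly shorter than the others.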

The process tries to find words_for_analysis which have a high variance in their distribution across shingles. (Strictly, the score reported as "variance" below is the variance of a word's per-shingle counts divided by their mean.) Words which have a high variance are words which are "clumped" in the text. For example, they may tend to occur mostly in the middle of the poem, or at the beginning and the end of the poem.
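
As a rough illustration of the score, here is a minimal sketch of a variance-to-mean ratio over per-shingle counts. The main loop computes np.var(...) / np.mean(...); this sketch swaps in the standard library's statistics module (pvariance is the population variance, which is also numpy's default), and the two count lists are made up.

```python
from statistics import mean, pvariance

def dispersion(counts):
    """Variance-to-mean ratio of per-shingle counts (population variance)."""
    m = mean(counts)
    return pvariance(counts) / m if m else 0.0

even = [10, 10, 10, 10, 10]    # same count in every shingle
clumped = [0, 0, 50, 0, 0]     # same total, all in one shingle

print(dispersion(even))
print(dispersion(clumped))
```

Both toy words occur 50 times in total, but only the second one gets a high score, which is the sense in which the score measures "clumpiness".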

Outputs are described in detail in the section "Main process loop" below.

stopwords

I'm using the standard (i.e., modern) set of nltk german stopwords. It's not the best set, perhaps, although I don't think it makes a lot of difference . . . I also added "u" to the list.

In [2]:
import codecs, re, textwrap
from collections import defaultdict, Counter
from nltk.corpus import stopwords

sw = set(stopwords.words('german') + ['u'])
wrapper = textwrap.TextWrapper(width=60)

print 'stopwords:', 
print
print '\n' + '\n'.join(textwrap.wrap(' '.join(sorted(list(sw))), 80))
stopwords:

aber alle allem allen aller alles als also am an ander andere anderem anderen
anderer anderes anderm andern anderr anders auch auf aus bei bin bis bist da
damit dann das dasselbe dazu daß dein deine deinem deinen deiner deines dem
demselben den denn denselben der derer derselbe derselben des desselben dessen
dich die dies diese dieselbe dieselben diesem diesen dieser dieses dir doch dort
du durch ein eine einem einen einer eines einig einige einigem einigen einiger
einiges einmal er es etwas euch euer eure eurem euren eurer eures für gegen
gewesen hab habe haben hat hatte hatten hier hin hinter ich ihm ihn ihnen ihr
ihre ihrem ihren ihrer ihres im in indem ins ist jede jedem jeden jeder jedes
jene jenem jenen jener jenes jetzt kann kein keine keinem keinen keiner keines
können könnte machen man manche manchem manchen mancher manches mein meine
meinem meinen meiner meines mich mir mit muss musste nach nicht nichts noch nun
nur ob oder ohne sehr sein seine seinem seinen seiner seines selbst sich sie
sind so solche solchem solchen solcher solches soll sollte sondern sonst u um
und uns unse unsem unsen unser unses unter viel vom von vor war waren warst was
weg weil weiter welche welchem welchen welcher welches wenn werde werden wie
wieder will wir wird wirst wo wollen wollte während würde würden zu zum zur zwar
zwischen über

Load the poem

Strip out blank lines, normalize spaces.

In [3]:
poem_lines = []
for l in codecs.open('Siegessäule_corrected.txt', 'r', encoding='utf-8').read().split('\n'):
    # skip blank lines; collapse runs of whitespace to a single space
    if l.strip():
        poem_lines.append(re.sub(r'\s+', ' ', l.strip()))

print 'len(poem_lines)', len(poem_lines)
len(poem_lines) 6078

tokenization, word counting, etc

How many times do words occur? On what lines?

Of particular importance is word_lines, which for every word contains a list of the lines on which the word occurs, like:

u'herr': [602, 654, 698, 768, 784, 890, 915, 926, 929, 1194, 1594, 2929, 2993, 4120, 5057, 5063, 5373, 5379, 5931,
    6066], 
u'dorn': [4801],
u'truge': [1864, 3208, 3772, 4158]

"herr", for example, occurs on lines 602, 654, 698, etc; "dorn" only on line 4801; and "truge" on lines 1864, 3208, etc.

This cell also creates words_for_analysis, a list of non-stopwords which occur 25 times or more in the poem (controlled, and easily changed, by variable LOWER_WORD_LIMIT).
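
A toy version of this bookkeeping may make the data structure easier to picture. The three lines, the two stopwords, the threshold of 2, and the simplified split pattern are all assumptions made for the illustration; the real cell uses the poem, the nltk stopword list, LOWER_WORD_LIMIT = 25, and a longer pattern.

```python
import re
from collections import defaultdict

toy_lines = ["Herr, hilf!", "Der Herr ist gut", "Herr Gott"]
toy_stopwords = {"der", "ist"}
toy_limit = 2  # stands in for LOWER_WORD_LIMIT

# word -> list of line numbers on which the word occurs
word_lines = defaultdict(list)
for line_n, line in enumerate(toy_lines):
    for t in re.split(r"[\s.,!?;:]+", line.lower()):
        if t and t not in toy_stopwords:
            word_lines[t].append(line_n)

# keep only words that reach the occurrence threshold
frequent = sorted(w for w, ls in word_lines.items() if len(ls) >= toy_limit)

print(dict(word_lines))
print(frequent)
```

Note that a word appearing twice on the same line gets that line number twice, so the length of each list is the word's number of occurrences rather than the number of distinct lines.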

In [4]:
word_counts = defaultdict(int)
n_words = 0

for t in re.split(u'\s+|\.|!|/|’|;|:|\'|-|’|\)|\(|\?|\,', ' '.join(poem_lines).lower()):
    if t > '' and t not in sw:
        n_words += 1
        word_counts[t] += 1
            
print
print 'n_words', n_words
print 'len(word_counts)', len(word_counts)
    
word_lines = defaultdict(list)
lines_words = {}

for line_n, line in enumerate(poem_lines):
    
    lines_words[line_n] = []

    for t in re.split(u'\s+|\.|!|/|’|;|:|\'|-|’|\)|\(|\?|\,', line.lower()):
        if t > '' and t not in sw:
            word_lines[t].append(line_n)
            lines_words[line_n].append(t)

LOWER_WORD_LIMIT = 25

words_for_analysis = []
for word, lines in word_lines.iteritems():
    if len(lines) >= LOWER_WORD_LIMIT and word not in sw:
        words_for_analysis.append(word)
                    
print
print 'len(word_lines)', len(word_lines)
print 'len(lines_words)', len(lines_words)
print 'len(words_for_analysis)', len(words_for_analysis)
print
print 'words_for_analysis'
print
print '\n' + '\n'.join(textwrap.wrap(' '.join(sorted(words_for_analysis)), 80))
n_words 28464
len(word_counts) 8459

len(word_lines) 8459
len(lines_words) 6078
len(words_for_analysis) 158

words_for_analysis


ab ach all allein allmacht augen bald beut bey biß blut buß christ christen
christenheit dadurch diß drum eh ehr ehren end engel erd erden erst erz ewig
feind feld feur flammen fort freuden fried frucht furcht gab ganz gar geben
gefahr gehn geht geist geistes gibt glauben gleich glück gnad gnaden gott gottes
grund gut hand haubt heer heil held helden her herz herzen himmel himmels hätt
höchsten hülf hülff ja je jesu jezt kan kommen krafft krieg kron könig land
lassen lauf laß leben lieb liebe ließ lob lust macht mann meer mehr muht mund
must muß nie nit noht o ohn pflegt raht recht reich ruh schaar schlacht schon
schutz seelen selber sey seyn sieg sieges sinn sohn solt sonn stadt stark
sternen streit stäts tausend theil thun tod treu trieb trost tugend türk türken
unsre volk voll waffen wann ward welt wer werd werk wider wol wolt wort wunder
wunsch wurd wär zeit ziel

Helper functions

The next three cells contain functions called from the "main process loop" (see below).

get_shingles breaks the poem into overlapping "shingles" (shingles are like chunks, except that they overlap). Note that shingle_size and shingle_overlap are passed into this routine as parameters, so it's very easy to change them, and to run this notebook with different settings. Both are measured in lines, and shingle_overlap is the distance between the starts of consecutive shingles. Interestingly enough, if shingle_size = shingle_overlap, then this routine will produce non-overlapping shingles (i.e., "chunks" as we usually have understood them).
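
The difference between shingles and chunks shows up clearly on a tiny input. The sketch below follows the same windowing idea as get_shingles, but toy_shingles and its four one- or two-word "lines" are hypothetical and exist only for this illustration.

```python
from itertools import chain

def toy_shingles(lines_words, size, step):
    """Flatten windows of `size` lines, starting a new window every `step` lines."""
    shingles = []
    for start in range(0, len(lines_words), step):
        stop = start + size
        shingles.append(list(chain.from_iterable(lines_words[start:stop])))
        if stop >= len(lines_words):  # this window already reaches the end
            break
    return shingles

lines = [["a"], ["b", "c"], ["d"], ["e"]]

overlapping = toy_shingles(lines, 2, 1)  # step < size: windows share lines
chunks = toy_shingles(lines, 2, 2)       # step == size: plain chunks

print(overlapping)
print(chunks)
```

With a step of 1 the words "b", "c" and "d" each land in two shingles; with a step equal to the size every word lands in exactly one.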

graph_word produces the bar plots that appear below.

find_local_maximums locates the "peak" or "peaks" in the bar plots. It works with the shingle_size and shingle_overlap settings which produced the bar plots below; however, this line of code:

window_size = int(len(shingle_scores) * 0.25)

may cause the function to not work correctly with other shingle_size and shingle_overlap settings, the problem being the fixed 0.25 factor used to set window_size.
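
One way around the fixed factor would be to pass the window radius in explicitly. The sketch below is an alternative, not the function that produced the results in this notebook. It also looks equally far in both directions, whereas the loop in find_local_maximums runs over range(slice_start, slice_end) and therefore looks one position less to the right than to the left. The name local_maxima and the counts are made up.

```python
def local_maxima(scores, radius):
    """Return (index, score) pairs whose score is non-zero and is not
    exceeded by any score within `radius` positions on either side."""
    peaks = []
    for i, s in enumerate(scores):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)  # +1 makes the window symmetric
        if s != 0 and all(scores[j] <= s for j in range(lo, hi)):
            peaks.append((i, s))
    return peaks

counts = [69, 40, 12, 5, 3, 2, 4, 9, 30, 80, 130]  # made-up per-shingle counts
print(local_maxima(counts, 2))
```

As in the original function, tied neighbors both count as maximums, because a position is only ruled out by a strictly larger score.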

In [5]:
import itertools

def get_shingles(lines_words, shingle_size, shingle_overlap):
    
    shingles = []
    
    n_shingles = (len(lines_words) / shingle_overlap) + 1

    for a in range(0, n_shingles):

        shingle_start = (a * shingle_overlap)
        shingle_stop = ((a * shingle_overlap) + shingle_size)
    
        shingles.append(list(itertools.chain.from_iterable(lines_words[shingle_start: shingle_stop])))
        
        if shingle_stop >= len(lines_words):
            break
    
    return shingles
In [6]:
%matplotlib inline

import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import unicodecsv as csv

from pylab import rcParams
rcParams['figure.figsize'] = 15, 3

import seaborn as sns

def graph_word(variance, word, n_occurences, shingle_scores, high_score, local_maximums):
    
    print
    print word, 'n_occurences', n_occurences, \
            'variance', variance, \
            'local_maximums', local_maximums
    
    plt.bar(range(len(shingle_scores)), shingle_scores.values(), align='center', color='#98AFC7', alpha=1.0)

    plt.title(word)
    plt.xlabel('shingle')
    plt.ylabel('n words')
    plt.ylim(0, high_score)
    
    plt.show()
In [7]:
def find_local_maximums(shingle_scores):
        
    window_size = int(len(shingle_scores) * 0.25)
    
    local_maximums = []
    
    for a in range(0, len(shingle_scores)):
        
        slice_start = a - window_size
        slice_end = a + window_size
        
        a_is_local_max = True
        for b in range(slice_start, slice_end):
            if b != a and b >= 0 and b < len(shingle_scores.values()):
                if shingle_scores.values()[b] > shingle_scores.values()[a]:
                    a_is_local_max = False
        
        if a_is_local_max == True and shingle_scores.values()[a] != 0:
            local_maximums.append((a, shingle_scores.values()[a]))
        
    local_maximums = sorted(list(set(local_maximums)))
    
    return local_maximums

Main process loop

This cell, which calls the functions listed in the previous three cells, produces two outputs:

  • For every word in words_for_analysis (158 words, in this run), output a line of text listing its number of occurrences, its variance (i.e., its amount of "clumpiness"), and its local maximums. For example:

    gott n_occurences 228 variance 15.6544929059 local_maximums [(0, 69), (10, 130)]

    Local maximums are expressed as pairs (shingle_number, number of occurrences). "gott", for example, has two local maximums, one in shingle 0 (69 occurrences; shingle counting starts with zero, not one), and one in shingle 10 (130 occurrences). So "gott" is clumped at the beginning and the end of the poem.

    Words are listed in variance ("clumpiness") order, high to low.

  • After the graphs, the process lists for each shingle the words which have a local maximum in that shingle. In other words, the list shows which words are clumped where. I include only words with a variance > 1.0 and with three or fewer local maximums; i.e., the list includes only clearly "clumpy" words. This section (scroll to the bottom) would seem to be the most critically interesting. The graphs serve two purposes: one, to demonstrate the method; and two, to provide background for the list of "clumpy" words. For example, "türken" appears in the list at shingle 5; however, if you look at its graph, you'll see that shingle 5 is its peak, and that it also occurs frequently in shingles before and after 5.
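
The filtering and grouping behind the second output can be sketched as follows. The four rows are taken from the printed results further down, with scores rounded to two decimals; the variable names are assumptions for the illustration, and the code is written for Python 3.

```python
from collections import defaultdict

# (score, word, local maximums) for four words from the results
rows = [
    (18.28, "ward", [(4, 98)]),
    (15.65, "gott", [(0, 69), (10, 130)]),
    (6.82, "je", [(0, 29), (6, 6), (9, 10), (10, 10)]),  # four maximums: dropped
    (0.98, "raht", [(0, 13), (7, 6), (10, 9)]),          # score not above 1.0: dropped
]

# shingle number -> list of (word, count) peaking in that shingle
by_shingle = defaultdict(list)
for score, word, maxima in rows:
    if score > 1.0 and len(maxima) <= 3:
        for shingle_n, count in maxima:
            by_shingle[shingle_n].append((word, count))

for shingle_n in sorted(by_shingle):
    print(shingle_n, by_shingle[shingle_n])
```

A word with two or three peaks is listed under each of them, which is why "gott" shows up under both shingle 0 and shingle 10.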

This cell contains a lot of commented-out code (the lines prefixed with "#"), where I experiment with different shingle sizes, check the number of words in the resulting shingles, etc.

Bottom line?

  • There's a lot of clumping, and a lot of similar words clumping, in shingles 0 and 10 (i.e., at the beginning and end of the poem). Does the poem begin and end with similar concerns?

  • There's significant clumping in shingles 4, 5 and 6, although not as much as in 0 and 10. One of these shingles (5, the middle of the poem) shows that "türk", etc. appear there, much as we expected. Interestingly, and unlike 0 and 10, the clumpy words in 4, 5 and 6 are different from one another.

In [8]:
from gensim import corpora, models
import numpy as np

highest_score = -1.0

#for SHINGLE_SIZE, SHINGLE_OVERLAP, HIGH_SCORE in [[1000, 200, 76], [2000, 400, 125]]:
for SHINGLE_SIZE, SHINGLE_OVERLAP, HIGH_SCORE in [[2100, 400, 135]]:
    
    shingles = get_shingles(lines_words.values(), SHINGLE_SIZE, SHINGLE_OVERLAP)
    
    #for sn, s in enumerate(shingles):
    #    print 'shingle number', sn, 'number of words', len(s)
    #print
    
    print
    print '************************************************************'
    print 'SHINGLE_SIZE', SHINGLE_SIZE, 'SHINGLE_OVERLAP', SHINGLE_OVERLAP, 'len(shingles)', len(shingles)
    print '************************************************************'

    dictionary = corpora.Dictionary(shingles)
    corpus = [dictionary.doc2bow(doc) for doc in shingles]
    
    #tfidf = models.TfidfModel(corpus)
    #corpus_tfidf = tfidf[corpus]

    #corpus_tf = []
    #for a in range(0, len(corpus)):
    #    new_row = []
    #    for b in corpus[a]:
    #        new_row.append([b[0], float(b[1]) / float(len(shingles[a]))])
    #    corpus_tf.append(new_row)
    
    doc_word_scores = []
    #for doc in corpus_tfidf:
    #for doc in corpus_tf:
    for doc in corpus:
        word_scores = {}
        for id, value in doc:
            
            word = dictionary.get(id)
            
            if word in words_for_analysis:
                word_scores[word] = value
            
                if value > highest_score:
                    highest_score = value
                
        doc_word_scores.append(word_scores)
    
    scores_by_variance = []
    
    for word in words_for_analysis:
        
        plot_results = {}
        
        for dn, d in enumerate(doc_word_scores):
            plot_results[dn] = 0.0
            try:
                plot_results[dn] = d[word]
            except KeyError:
                pass
            
        plot_results_total = 0.0
        for v in plot_results.values():
            plot_results_total += v
        
        plot_results_scaled = []
        for v in plot_results.values():
            plot_results_scaled.append(v / plot_results_total)
        
        #  COMPUTE VARIANCE USING THE RAW DF SCORES, OR SCALED SCORES?
        #scores_by_variance.append([np.var(plot_results_scaled), word, len(word_lines[word]), plot_results])
        #scores_by_variance.append([np.var(plot_results.values()), word, len(word_lines[word]), plot_results])
        
        scores_by_variance.append([(np.var(plot_results.values()) / np.mean(plot_results.values())), 
                                       word, len(word_lines[word]), plot_results])
        
    all_local_maximums = {}
        
    scores_by_variance.sort(reverse=True)
    
    print
    print 'ALL ********************************************************'
    #print 'HIGH *******************************************************'
    #print 'LOW ********************************************************'
    
    for s in scores_by_variance:
    #for s in scores_by_variance[:10]:
    #for s in scores_by_variance[-10:]:
        
        local_maximums = find_local_maximums(s[3])
    
        if s[0] > 1.0 and len(local_maximums) <= 3:
        
            for l in local_maximums:
                try:
                    all_local_maximums[l[0]].append([s[1], l[1]])
                except KeyError:
                    all_local_maximums[l[0]] = [[s[1], l[1]]]
    
        graph_word(s[0], s[1], s[2], s[3], HIGH_SCORE, local_maximums)
    
    print
    print 'LOCAL MAXIMUMS **********************************************'
    print
    for shingle_n in sorted(all_local_maximums.keys()):
        print 'shingle', shingle_n, 'words:',
        for wn, w in enumerate(all_local_maximums[shingle_n]):
            if wn == len(all_local_maximums[shingle_n]) - 1:
                print w[0] + ' ' + str(w[1])
            else:
                print w[0] + ' ' + str(w[1]) + ',',
        print
        
print
print 'highest_score', highest_score
        
************************************************************
SHINGLE_SIZE 2100 SHINGLE_OVERLAP 400 len(shingles) 11
************************************************************

ALL ********************************************************

ward n_occurences 106 variance 18.2782117334 local_maximums [(4, 98)]
gott n_occurences 228 variance 15.6544929059 local_maximums [(0, 69), (10, 130)]
ach n_occurences 104 variance 12.0020352782 local_maximums [(0, 61), (10, 38)]
wann n_occurences 171 variance 11.6178615905 local_maximums [(0, 73), (10, 85)]
bald n_occurences 108 variance 10.932096475 local_maximums [(4, 75)]
türken n_occurences 91 variance 10.2868593124 local_maximums [(5, 70)]
kan n_occurences 139 variance 10.0681818182 local_maximums [(0, 48), (9, 73)]
herz n_occurences 109 variance 10.055078472 local_maximums [(0, 41), (10, 61)]
stadt n_occurences 49 variance 8.69672727273 local_maximums [(4, 43)]
gnaden n_occurences 58 variance 8.19865319865 local_maximums [(0, 33), (10, 23)]
herzen n_occurences 56 variance 8.125 local_maximums [(0, 13), (6, 18), (10, 40)]
schlacht n_occurences 38 variance 7.73857493857 local_maximums [(4, 33)]
gnad n_occurences 64 variance 7.70022172949 local_maximums [(0, 39), (10, 21)]
je n_occurences 43 variance 6.82406801831 local_maximums [(0, 29), (6, 6), (9, 10), (10, 10)]
laß n_occurences 26 variance 6.67736185383 local_maximums [(0, 15), (10, 11)]
wort n_occurences 49 variance 6.50474898236 local_maximums [(0, 17), (10, 29)]
heer n_occurences 59 variance 6.42657342657 local_maximums [(4, 47)]
türk n_occurences 31 variance 6.24395857307 local_maximums [(5, 27)]
krafft n_occurences 82 variance 6.23079503522 local_maximums [(0, 32), (10, 45)]
lieb n_occurences 72 variance 5.9739195231 local_maximums [(0, 27), (9, 40), (10, 40)]
wer n_occurences 83 variance 5.66885585897 local_maximums [(0, 34), (10, 43)]
ja n_occurences 118 variance 5.46021364726 local_maximums [(0, 42), (1, 42), (9, 59)]
welt n_occurences 123 variance 5.44575526255 local_maximums [(0, 51), (10, 55)]
must n_occurences 48 variance 5.38041357784 local_maximums [(4, 36)]
himmel n_occurences 85 variance 5.24209840339 local_maximums [(0, 40), (10, 40)]
buß n_occurences 27 variance 5.20068610635 local_maximums [(0, 20), (5, 3), (6, 3), (7, 3), (9, 6)]
wurd n_occurences 47 variance 4.94164350933 local_maximums [(5, 37)]
o n_occurences 46 variance 4.73863636364 local_maximums [(0, 21), (10, 24)]
liebe n_occurences 29 variance 4.41306266549 local_maximums [(0, 6), (6, 14), (10, 20)]
volk n_occurences 51 variance 4.15791379935 local_maximums [(5, 34)]
thun n_occurences 38 variance 3.98863636364 local_maximums [(0, 12), (6, 16), (8, 21)]
ziel n_occurences 46 variance 3.91051136364 local_maximums [(0, 17), (6, 6), (10, 25)]
schaar n_occurences 36 variance 3.88084232152 local_maximums [(4, 27), (5, 27)]
werd n_occurences 34 variance 3.82461786002 local_maximums [(0, 10), (5, 8), (8, 19), (10, 19)]
heil n_occurences 33 variance 3.75593775594 local_maximums [(0, 12), (10, 20)]
tausend n_occurences 55 variance 3.69440559441 local_maximums [(4, 33), (6, 37)]
land n_occurences 54 variance 3.64155844156 local_maximums [(4, 37)]
glauben n_occurences 44 variance 3.58764111705 local_maximums [(0, 16), (9, 24), (10, 24)]
geist n_occurences 86 variance 3.58713488677 local_maximums [(0, 39), (9, 37)]
seelen n_occurences 30 variance 3.56909090909 local_maximums [(0, 6), (4, 6), (10, 20)]
gab n_occurences 29 variance 3.55770470664 local_maximums [(5, 22), (6, 22)]
unsre n_occurences 32 variance 3.50707070707 local_maximums [(0, 20), (5, 3), (8, 11)]
meer n_occurences 43 variance 3.4976076555 local_maximums [(1, 24), (10, 14)]
jesu n_occurences 25 variance 3.45933014354 local_maximums [(0, 14), (1, 14), (10, 10)]
werk n_occurences 42 variance 3.45797598628 local_maximums [(0, 15), (6, 6), (10, 23)]
wunsch n_occurences 25 variance 3.3986013986 local_maximums [(0, 8), (7, 6), (10, 15)]
mund n_occurences 27 variance 3.36786469345 local_maximums [(0, 7), (4, 3), (7, 13), (10, 18)]
könig n_occurences 25 variance 3.29090909091 local_maximums [(4, 19)]
mann n_occurences 27 variance 3.26613965744 local_maximums [(4, 18), (6, 23)]
helden n_occurences 40 variance 3.24195804196 local_maximums [(4, 29)]
beut n_occurences 26 variance 3.19696969697 local_maximums [(4, 21)]
ehr n_occurences 63 variance 3.08237302603 local_maximums [(1, 17), (10, 40)]
sieg n_occurences 99 variance 3.06702057067 local_maximums [(4, 43), (6, 49), (9, 47)]
muß n_occurences 63 variance 3.04802744425 local_maximums [(0, 33), (10, 23)]
augen n_occurences 27 variance 2.90909090909 local_maximums [(0, 11), (10, 15)]
seyn n_occurences 138 variance 2.90457368718 local_maximums [(0, 53), (9, 60)]
gottes n_occurences 146 variance 2.86622320769 local_maximums [(0, 53), (10, 69)]
ganz n_occurences 74 variance 2.84308048639 local_maximums [(1, 40)]
geistes n_occurences 27 variance 2.8014354067 local_maximums [(0, 10), (1, 10), (6, 5), (10, 16)]
engel n_occurences 35 variance 2.73500967118 local_maximums [(0, 7), (2, 8), (7, 20), (9, 20)]
feind n_occurences 101 variance 2.68013468013 local_maximums [(4, 54), (5, 54), (7, 55)]
sonn n_occurences 27 variance 2.65837320574 local_maximums [(0, 16), (6, 5), (9, 9)]
fried n_occurences 37 variance 2.51210328133 local_maximums [(4, 24), (5, 24)]
allmacht n_occurences 32 variance 2.47164716472 local_maximums [(0, 12), (5, 5), (9, 18)]
stäts n_occurences 30 variance 2.41306266549 local_maximums [(0, 17), (10, 10)]
wunder n_occurences 59 variance 2.38883888389 local_maximums [(0, 19), (10, 31)]
ab n_occurences 55 variance 2.35537190083 local_maximums [(0, 16), (5, 32), (6, 32)]
christen n_occurences 105 variance 2.34880998829 local_maximums [(3, 57), (4, 57)]
sohn n_occurences 25 variance 2.24687239366 local_maximums [(3, 15), (5, 16)]
höchsten n_occurences 40 variance 2.22068008328 local_maximums [(0, 21), (6, 7), (10, 17)]
erden n_occurences 42 variance 2.21231671554 local_maximums [(0, 22), (1, 22), (9, 16), (10, 16)]
pflegt n_occurences 40 variance 2.17103235747 local_maximums [(0, 18), (6, 8), (10, 18)]
jezt n_occurences 57 variance 2.11038961039 local_maximums [(0, 25), (9, 26)]
zeit n_occurences 71 variance 2.06843341161 local_maximums [(2, 35)]
sey n_occurences 54 variance 2.03805496829 local_maximums [(0, 19), (10, 24)]
krieg n_occurences 70 variance 2.02026456516 local_maximums [(5, 42)]
ließ n_occurences 28 variance 1.96920821114 local_maximums [(0, 7), (2, 13), (4, 18), (5, 18)]
gibt n_occurences 26 variance 1.96037296037 local_maximums [(0, 11), (10, 13)]
hand n_occurences 77 variance 1.88060606061 local_maximums [(0, 21), (8, 37)]
mehr n_occurences 115 variance 1.84514637904 local_maximums [(2, 54), (4, 54)]
erd n_occurences 28 variance 1.81818181818 local_maximums [(0, 10), (9, 16)]
hülff n_occurences 40 variance 1.7987012987 local_maximums [(0, 14), (3, 26), (10, 10)]
biß n_occurences 32 variance 1.7886977887 local_maximums [(5, 21)]
christ n_occurences 28 variance 1.76023976024 local_maximums [(0, 8), (4, 6), (8, 14), (10, 15)]
nie n_occurences 34 variance 1.74242424242 local_maximums [(0, 20), (4, 15)]
treu n_occurences 30 variance 1.72806324111 local_maximums [(0, 8), (7, 17)]
tugend n_occurences 47 variance 1.71231300345 local_maximums [(0, 23), (3, 19), (6, 17), (10, 10)]
nit n_occurences 25 variance 1.71159874608 local_maximums [(0, 16), (3, 11), (8, 6)]
noht n_occurences 95 variance 1.6854754441 local_maximums [(0, 47), (6, 31), (8, 32)]
held n_occurences 25 variance 1.66532582462 local_maximums [(7, 16)]
all n_occurences 47 variance 1.6444701795 local_maximums [(0, 20), (1, 20), (8, 18), (10, 20)]
gleich n_occurences 53 variance 1.61722488038 local_maximums [(1, 13), (2, 13), (7, 26), (8, 26)]
kron n_occurences 35 variance 1.55244755245 local_maximums [(3, 18), (5, 20), (10, 12)]
erz n_occurences 49 variance 1.54396423249 local_maximums [(0, 12), (1, 12), (3, 14), (8, 24), (10, 24)]
trost n_occurences 25 variance 1.53965183752 local_maximums [(1, 14), (9, 10), (10, 10)]
sternen n_occurences 25 variance 1.53759820426 local_maximums [(0, 13), (2, 13), (6, 6), (7, 6), (10, 7)]
tod n_occurences 59 variance 1.53129657228 local_maximums [(0, 11), (3, 28), (4, 28), (9, 27)]
himmels n_occurences 33 variance 1.50368550369 local_maximums [(0, 18), (6, 8), (8, 10), (9, 10), (10, 10)]
allein n_occurences 32 variance 1.46920821114 local_maximums [(1, 10), (6, 13), (8, 16), (10, 19)]
haubt n_occurences 51 variance 1.46683046683 local_maximums [(8, 27)]
christenheit n_occurences 37 variance 1.45920745921 local_maximums [(0, 9), (6, 20)]
voll n_occurences 37 variance 1.45916795069 local_maximums [(0, 14), (6, 9), (10, 18)]
sieges n_occurences 50 variance 1.42549203374 local_maximums [(3, 18), (5, 18), (8, 25), (10, 25)]
hülf n_occurences 31 variance 1.31923890063 local_maximums [(2, 16), (5, 16)]
ohn n_occurences 36 variance 1.25426136364 local_maximums [(0, 13), (6, 16), (7, 16), (10, 15)]
flammen n_occurences 29 variance 1.22727272727 local_maximums [(0, 11), (6, 7), (9, 14)]
frucht n_occurences 28 variance 1.19592476489 local_maximums [(0, 9), (3, 8), (7, 16), (8, 16)]
schon n_occurences 77 variance 1.19191919192 local_maximums [(0, 27), (8, 33), (10, 34)]
ewig n_occurences 28 variance 1.17373737374 local_maximums [(0, 12), (10, 12)]
trieb n_occurences 29 variance 1.17254174397 local_maximums [(0, 8), (2, 8), (3, 8), (6, 9), (10, 17)]
lob n_occurences 35 variance 1.14909090909 local_maximums [(0, 10), (1, 10), (2, 10), (5, 9), (10, 18)]
leben n_occurences 50 variance 1.13662337662 local_maximums [(0, 17), (3, 13), (7, 20), (9, 23)]
kommen n_occurences 28 variance 1.12727272727 local_maximums [(0, 12), (1, 12), (4, 14)]
wolt n_occurences 57 variance 1.107771261 local_maximums [(5, 30)]
eh n_occurences 52 variance 1.10716099543 local_maximums [(0, 24), (5, 15), (10, 14)]
grund n_occurences 31 variance 1.10197628458 local_maximums [(0, 15), (2, 15), (10, 12)]
glück n_occurences 85 variance 1.087562744 local_maximums [(0, 21), (3, 32), (6, 41), (9, 31), (10, 31)]
blut n_occurences 76 variance 1.03227716908 local_maximums [(0, 27), (3, 27), (7, 36), (8, 36)]
ruh n_occurences 40 variance 1.02233766234 local_maximums [(2, 17), (5, 23)]
her n_occurences 28 variance 1.01461038961 local_maximums [(0, 14), (3, 14), (8, 10), (10, 11)]
raht n_occurences 26 variance 0.982323232323 local_maximums [(0, 13), (7, 6), (10, 9)]
freuden n_occurences 28 variance 0.960227272727 local_maximums [(2, 9), (6, 8), (10, 16)]
dadurch n_occurences 25 variance 0.949090909091 local_maximums [(0, 7), (1, 7), (4, 8), (5, 8), (7, 14)]
ehren n_occurences 28 variance 0.947052947053 local_maximums [(0, 11), (1, 11), (10, 13)]
end n_occurences 39 variance 0.937584803256 local_maximums [(0, 11), (2, 11), (4, 11), (8, 17), (10, 19)]
lassen n_occurences 31 variance 0.918313570487 local_maximums [(2, 11), (6, 17), (7, 17)]
reich n_occurences 77 variance 0.895021645022 local_maximums [(3, 38), (4, 38), (8, 30), (9, 30)]
gefahr n_occurences 39 variance 0.858225108225 local_maximums [(0, 11), (2, 14), (5, 21)]
geht n_occurences 29 variance 0.848484848485 local_maximums [(0, 13), (6, 8), (10, 12)]
lust n_occurences 33 variance 0.834028356964 local_maximums [(0, 9), (2, 10), (3, 10), (7, 9), (10, 17)]
feld n_occurences 32 variance 0.816864295125 local_maximums [(1, 11), (4, 17), (5, 17)]
streit n_occurences 33 variance 0.775184275184 local_maximums [(4, 18), (5, 18)]
lauf n_occurences 33 variance 0.712369597615 local_maximums [(0, 14), (2, 15), (3, 15), (10, 12)]
recht n_occurences 64 variance 0.703689615824 local_maximums [(0, 31), (7, 24), (8, 24)]
furcht n_occurences 29 variance 0.68601986249 local_maximums [(0, 12), (1, 12), (3, 15), (10, 7)]
erst n_occurences 28 variance 0.669090909091 local_maximums [(0, 12), (1, 12), (2, 12), (8, 8), (10, 9)]
macht n_occurences 198 variance 0.6665084263 local_maximums [(2, 82), (8, 69)]
diß n_occurences 46 variance 0.655844155844 local_maximums [(1, 12), (2, 12), (4, 17), (5, 17), (8, 20)]
muht n_occurences 48 variance 0.529644268775 local_maximums [(3, 24), (7, 22)]
hätt n_occurences 26 variance 0.472727272727 local_maximums [(5, 14), (9, 10)]
geben n_occurences 26 variance 0.460227272727 local_maximums [(0, 12), (6, 10), (7, 10)]
fort n_occurences 29 variance 0.454545454545 local_maximums [(4, 14), (5, 14), (6, 14), (8, 15)]
gar n_occurences 75 variance 0.431121026213 local_maximums [(2, 35)]
sinn n_occurences 36 variance 0.427972027972 local_maximums [(2, 13), (6, 12), (10, 18)]
theil n_occurences 25 variance 0.420240137221 local_maximums [(0, 8), (1, 8), (2, 8), (4, 11), (6, 12), (7, 12), (8, 12)]
gut n_occurences 26 variance 0.356060606061 local_maximums [(0, 8), (4, 10), (6, 10), (7, 10), (10, 11)]
solt n_occurences 27 variance 0.353146853147 local_maximums [(0, 8), (2, 9), (3, 9), (5, 12), (6, 12), (9, 12)]
wär n_occurences 31 variance 0.342817487856 local_maximums [(1, 13), (8, 15)]
waffen n_occurences 38 variance 0.338791643139 local_maximums [(4, 18), (7, 17)]
schutz n_occurences 25 variance 0.335392762577 local_maximums [(1, 9), (5, 11), (7, 12), (8, 12)]
gehn n_occurences 35 variance 0.316261203585 local_maximums [(0, 14), (3, 15), (5, 15)]
wider n_occurences 49 variance 0.300737100737 local_maximums [(0, 17), (3, 15), (6, 21)]
selber n_occurences 26 variance 0.3004784689 local_maximums [(0, 10), (3, 10), (4, 10), (9, 10), (10, 10)]
feur n_occurences 32 variance 0.291486291486 local_maximums [(0, 13), (5, 12), (6, 12), (9, 14)]
stark n_occurences 28 variance 0.223011363636 local_maximums [(0, 10), (4, 15), (7, 13)]
drum n_occurences 44 variance 0.197582764057 local_maximums [(0, 15), (2, 17), (6, 19)]
bey n_occurences 81 variance 0.179552534693 local_maximums [(1, 29), (3, 32), (6, 33), (10, 30)]
wol n_occurences 48 variance 0.143939393939 local_maximums [(0, 18), (6, 16), (7, 16), (10, 17)]
LOCAL MAXIMUMS **********************************************

shingle 0 words: gott 69, ach 61, wann 73, kan 48, herz 41, gnaden 33, herzen 13, gnad 39, laß 15, wort 17, krafft 32, lieb 27, wer 34, ja 42, welt 51, himmel 40, o 21, liebe 6, thun 12, ziel 17, heil 12, glauben 16, geist 39, seelen 6, unsre 20, jesu 14, werk 15, wunsch 8, muß 33, augen 11, seyn 53, gottes 53, sonn 16, allmacht 12, stäts 17, wunder 19, ab 16, höchsten 21, pflegt 18, jezt 25, sey 19, gibt 11, hand 21, erd 10, hülff 14, nie 20, treu 8, nit 16, noht 47, christenheit 9, voll 14, flammen 11, schon 27, ewig 12, kommen 12, eh 24, grund 15

shingle 1 words: ja 42, meer 24, jesu 14, ehr 17, ganz 40, trost 14, kommen 12

shingle 2 words: zeit 35, mehr 54, hülf 16, grund 15, ruh 17

shingle 3 words: christen 57, sohn 15, hülff 26, nit 11, kron 18

shingle 4 words: ward 98, bald 75, stadt 43, schlacht 33, heer 47, must 36, schaar 27, tausend 33, land 37, seelen 6, könig 19, mann 18, helden 29, beut 21, sieg 43, feind 54, fried 24, christen 57, mehr 54, nie 15, kommen 14

shingle 5 words: türken 70, türk 27, wurd 37, volk 34, schaar 27, gab 22, unsre 3, feind 54, fried 24, allmacht 5, ab 32, sohn 16, krieg 42, biß 21, kron 20, hülf 16, wolt 30, eh 15, ruh 23

shingle 6 words: herzen 18, liebe 14, thun 16, ziel 6, tausend 37, gab 22, werk 6, mann 23, sieg 49, sonn 5, ab 32, höchsten 7, pflegt 8, noht 31, christenheit 20, voll 9, flammen 7

shingle 7 words: wunsch 6, feind 55, treu 17, held 16

shingle 8 words: thun 21, unsre 11, hand 37, nit 6, noht 32, haubt 27, schon 33

shingle 9 words: kan 73, lieb 40, ja 59, glauben 24, geist 37, sieg 47, seyn 60, sonn 9, allmacht 18, jezt 26, erd 16, trost 10, flammen 14

shingle 10 words: gott 130, ach 38, wann 85, herz 61, gnaden 23, herzen 40, gnad 21, laß 11, wort 29, krafft 45, lieb 40, wer 43, welt 55, himmel 40, o 24, liebe 20, ziel 25, heil 20, glauben 24, seelen 20, meer 14, jesu 10, werk 23, wunsch 15, ehr 40, muß 23, augen 15, gottes 69, stäts 10, wunder 31, höchsten 17, pflegt 18, sey 24, gibt 13, hülff 10, kron 12, trost 10, voll 18, schon 34, ewig 12, eh 14, grund 12


highest_score 130