The software below has been developed for research purposes only. If you need any of these tools, please contact me.
· GrTokenizer
A tokenizer for Modern Greek, based on regular expressions and written in Perl. It converts text to vertical format, one token per line. Directions can be found in the Readme file inside Stylometrics.zip [zip]
· FileSplitter
A utility for splitting a text file into n files of equal word count, where n is a user-selected value.
· stTTR
A program that calculates the Standardized Type/Token Ratio using equal-sized text samples, thus avoiding the text-size dependence of the index.
· Stylometrics
A program that calculates over 100 stylometric indices. This version works for Modern Greek corpora and produces tab-delimited text results.
· Roman Stylometrics
A program that calculates over 100 stylometric indices. This version works for Latin-script corpora (e.g. English, Italian, etc.) and produces tab-delimited text results.
· TermCount
A Perl script that counts the relative frequency of the words in a user-selected wordlist in a corpus.
· VocabGrowth
A program that calculates the relative growth of type frequency in a text.
· Episimiotis
Software for error coding in learner texts, combining custom error taxonomies and metalanguage data in XML output files.
· CorpusManager
A software suite for managing megacorpora and producing
subcorpora using customized metalanguage criteria.
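
As an illustration of the idea behind stTTR above, the following is a minimal sketch in Python, not the code of the tool itself. It assumes that a token is a run of word characters and that the text is cut into consecutive chunks of a fixed size, with any trailing partial chunk discarded. The function names `tokenize` and `standardized_ttr` and the parameter `chunk_size` are hypothetical, chosen only for this example.

```python
import re


def tokenize(text):
    # Assumption: a token is a maximal run of word characters, lowercased.
    # The actual tools may tokenize differently.
    return re.findall(r"\w+", text.lower())


def standardized_ttr(tokens, chunk_size=1000):
    """Return the mean type/token ratio over consecutive equal-sized chunks.

    A trailing chunk shorter than chunk_size is discarded, so every ratio
    is computed on the same number of tokens. Returns None when the text
    has fewer than chunk_size tokens.
    """
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        # Types are the distinct tokens within the chunk.
        ratios.append(len(set(chunk)) / chunk_size)
    if not ratios:
        return None
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the rug"
    print(standardized_ttr(tokenize(sample), chunk_size=4))
```

Because every chunk has the same length, the averaged ratio can be compared across texts of different sizes, which a plain type/token ratio does not allow.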