Friday, March 7, 2008

TextSTAT


TextSTAT is a simple program for the analysis of texts. It reads ASCII/ANSI texts (in different encodings) and HTML files (directly from the internet) and produces word frequency lists and concordances from them. This version includes a web spider, which reads as many pages as you want from a particular website and puts them in a TextSTAT corpus. The new news reader puts news messages in a TextSTAT-readable corpus file.
TextSTAT now reads MS Word and OpenOffice files (OOo 1 (.sxw) and OOo 2 (.odt)). No conversion is needed: just add the files to your corpus.
In TextSTAT you can use regular expressions, which give you powerful search possibilities. The program is multilingual: because it uses Unicode internally, TextSTAT can cope with many different languages and file encodings. The user interface comes in three languages: English, German, and Dutch.
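
To give a rough idea of what a word frequency list and a regular-expression concordance are, here is a minimal Python sketch. It is not TextSTAT's own code: the function names `word_frequencies` and `concordance`, the `width` parameter, and the sample sentence are all made up for illustration, and the tokenisation (runs of word characters, lower-cased) is just one simple assumption.

```python
import re
from collections import Counter

# Assumption: a "word" is any run of Unicode word characters.
WORD_RE = re.compile(r"\w+")


def word_frequencies(text):
    """Count lower-cased word forms in the text (hypothetical helper)."""
    return Counter(m.group(0).lower() for m in WORD_RE.finditer(text))


def concordance(text, pattern, width=20):
    """Return (left context, match, right context) for every regex match.

    `width` is the maximum number of characters of context on each side.
    """
    hits = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog saw the cat."

    # Frequency list, most common word forms first.
    for word, count in word_frequencies(sample).most_common(3):
        print(word, count)

    # Keyword-in-context lines for a regular-expression query.
    for left, hit, right in concordance(sample, r"\bcat\b", width=10):
        print(f"{left:>10}[{hit}]{right}")
```

Because Python strings are Unicode, the same sketch works for texts in other languages once the file has been decoded with the right encoding.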

Website

Download