RuDiRe, a Corpus of Political Discourse of Russian Revolutions
RuDiRe is a text corpus of works and documents published in Russia between 1890 and 1930. It was compiled as a reference corpus for the Lenin corpora.
The corpus includes texts written by Russian "influencers" of this period: politicians, writers, journalists, scholars, artists, and activists. The texts represent different genres: essays, journalism, political speeches, letters, diaries, etc. No fiction was included.
Texts were collected from the internet.
Size of the corpus (January 2026): 181 documents, 2 million running words
Metadata includes the title of each work and the volume in which the document was included.
Hosting of the corpus: Linux server, Tampere University
Corpus manager: NoSketch Engine (non-commercial version of Sketch Engine)
Morpho-syntactic parsing: Turku neural parser pipeline
Parsing of corpus files: Python script developed by Juho Härme
To obtain access to the corpus, contact Mikhail Mikhailov (mikhail.mikhailov(at)tuni.fi).