GREgORI - Softwares, linguistic data and tagged corpus for ancient GREek and ORIental languages
ISSN 2736-7657
GREgORI logo by Anahita Nikdast
Contact: Collège Érasme
New online interfaces are now available
The GREgORI Project, hosted at the Institut orientaliste of UCLouvain (Louvain-la-Neuve, Belgium), provides scholars with lemmatized corpora in Classical and Byzantine Greek as well as in some of the main languages of the Christian East, such as Ancient Armenian, Old Georgian, and Syriac.
These data are now available through online, open-access interfaces that combine a concordancer, a text browser, and an editions viewer.
Read more about these interfaces here
Free access to the online interfaces here
The GREgORI Project analyses your texts, with you, or for you
The GREgORI Project analyses texts in Greek, Armenian, Georgian, and Syriac (lemmatization and part-of-speech tagging) by sharing tools and data with its collaborators, turning the texts into searchable tagged corpora available online. The GREgORI Project also provides scholars with lemmatized concordances and other lexicographical tools (alphabetical index, reverse index, frequency index; see samples here).
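As a rough illustration of what these lexicographical tools contain, the following minimal sketch derives an alphabetical index, a reverse index, and a frequency index from a lemmatized token list. It is not the GREgORI Project's own software: the `TOKENS` sample, the `(form, lemma, part of speech)` layout, and the `build_indexes` function are all hypothetical, and the toy data is transliterated so that plain string sorting is sufficient. Real Greek, Armenian, Georgian, or Syriac data would need language-aware collation.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (form, lemma, part of speech), transliterated.
TOKENS = [
    ("logos", "logos", "N"),
    ("logou", "logos", "N"),
    ("legei", "lego", "V"),
    ("anthropos", "anthropos", "N"),
    ("logon", "logos", "N"),
    ("legousi", "lego", "V"),
]


def build_indexes(tokens):
    """Return (alphabetical, reverse, frequency, forms) for a tagged corpus."""
    # Count how often each lemma occurs in the corpus.
    counts = Counter(lemma for _form, lemma, _pos in tokens)

    # Collect the attested forms under each lemma.
    forms = defaultdict(set)
    for form, lemma, _pos in tokens:
        forms[lemma].add(form)

    # Alphabetical index: lemmas in plain string order.
    alphabetical = sorted(counts)
    # Reverse index: lemmas ordered by their spelling read backwards,
    # which groups words sharing the same ending.
    reverse = sorted(counts, key=lambda lemma: lemma[::-1])
    # Frequency index: most frequent lemma first, ties broken alphabetically.
    frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

    return alphabetical, reverse, frequency, {k: sorted(v) for k, v in forms.items()}


alphabetical, reverse, frequency, forms = build_indexes(TOKENS)
print(alphabetical)  # ['anthropos', 'lego', 'logos']
print(reverse)       # ['lego', 'logos', 'anthropos']
print(frequency)     # [('logos', 3), ('lego', 2), ('anthropos', 1)]
```

The reverse index sorts on the reversed spelling, so lemmas with the same ending (here `lego`, then the two lemmas in `-os`) end up next to each other.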
Information and terms of collaboration: contact the GREgORI Project at
They already trust us:
- ERC Project “Florilegia Syriaca. The Intercultural Dissemination of Greek Christian Thought in Syriac and Arabic in the First Millennium CE” (lexical and part-of-speech tagging of Syriac texts from patristic anthologies).
- ERC Project “HunaynNet. Transmission of Classical Scientific and Philosophical Literature from Greek into Syriac and Arabic” (lexical and part-of-speech tagging of scientific and philosophical Syriac texts translated from Greek).
- “The digital future of a founding text: the Iliad and the Genavensis Graecus 44” (lemmatization and bilingual alignment of the Homeric text of Iliad I-III from the ms. Genavensis Graecus 44 (13th c.) and its Byzantine paraphrase).
- The Syriac Galen Palimpsest: Galen's On Simple Drugs and the Recovery of Lost Texts through Sophisticated Imaging Techniques (lexical analysis of the Greek text and Arabic and Syriac versions of the treatise On Simple Drugs by Galen of Pergamon).
- Project “CRILEX. Crise et altérité dans la Méditerranée tardo-antique : une approche lexicale” (lemmatization and part-of-speech tagging of the Life of Apollonios by Philostratos of Athens).
- The eCAB project. A digital version of the Corpus des Astronomes Byzantins (CAB) (lemmatization and part-of-speech tagging of the texts currently published in the Corpus des Astronomes Byzantins).
Papers and Documents about the GREgORI Project
Recently Published (2022-2023)
- B. Kindt, G. Vidal-Gorène, S. Delle Donne, Analyse automatique du grec ancien par réseau de neurones. Évaluation sur le corpus De Thessalonica Capta, in Babelao, 10-11 (2022), p. 525-550.
- Kindt B., Du texte à l’index. L’étiquetage lexical du De Septem Orbis Spectaculis de Philon le Paradoxographe : méthode et finalité, in G. Labarre, Sources, Histoire et Éditions. Les outils de la recherche. Formation et recherche en sciences de l’Antiquité (Institut des sciences et techniques de l'Antiquité), Besançon, 2021, p. 175‑218.
- Vidal-Gorène Ch., La reconnaissance automatique d'écriture à l'épreuve des langues peu dotées, in Programming Historian en français, 5 (2023).
- Vidal-Gorène C., Kindt B., From manuscript to tagged corpora. An automated process for Ancient Armenian or other under resourced languages of the Christian East, in Armeniaca, 1 (2022), 73-95.
- A. Capone (ed.), Sancti Gregorii Nazianzeni Opera. Versio Latina, I. Epistulae 102-101 cum indice verborum a B. Kindt et B. Coulie confecto (Corpus Christianorum. Series Graeca, 99. Corpus Nazianzenum, 32), Turnhout, 2021 (Bilingual Greek-Latin lemmatized index, p. 25-99).
Previously published
- Chahan Vidal-Gorène, B. Kindt, Lemmatization and POS-tagging process by using joint learning approach. Experimental results on Classical Armenian, Old Georgian and Syriac, in Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA), 2020, p. 22-27.
- B. Kindt, J.-C. Haelewyck, A.B. Schmidt, N. Atas, La concordance bilingue grecque - syriaque des Discours de Grégoire de Nazianze, in BABELAO, 7 (2018), p. 51-80.
- Bastien Kindt, La lemmatisation des sources patristiques et byzantines au service d'une description lexicale du grec ancien. Les principes de formulation des lemmes du Dictionnaire Automatique Grec (D.A.G.), in Byzantion, 74 (2004), p. 213-272.
- Bastien Kindt, Du texte à l'index. L'étiquetage lexical du De Septem Orbis Spectaculis de Philon le Paradoxographe: méthode et finalité. Paper presented at the Journée d’étude EDOCSA (École doctorale romande en Sciences de l’Antiquité). Les outils de la recherche. Histoire – Philologie – Littératures — Les outils informatiques et les ressources linguistiques du projet GREgORI, Fribourg (CH), September 25, 2014.
- Andrea Barbara Schmidt, Open Access Datenbank von ein- und zweisprachigen Konkordanzen der syrischen Literatur am Orientinstitut der Universität Louvain, in: Sh. Talay (ed.), Überleben im Schatten: Geschichte und Kultur des syrischen Christentums. Beiträge des 10. Deutschen Syrologentages an der FU Berlin 2018, (Göttinger Orientforschungen, I ; Syriaca, 58), Wiesbaden: Harrassowitz, 2020, p. 229-248.
- Schmidt A.B., Eine syrische Amulettrolle mit Beschwörungen für Frauen : Erevan, Matenadaran, rot. syr. 72, Teil I. Edition und Übersetzung, in Tracing Written Heritage in a Digital Age, ed. E.A. Ishac, Th. Csanády, Th. Zammit Lupi (with an Introduction by Robert A. Kitchen), Wiesbaden, 2021, p. 13-58.
- Schmidt A.B., Kindt B., Eine syrische Amulettrolle mit Beschwörungen für Frauen : Erevan, Matenadaran, rot. syr. 72, Teil II. Wortindex, in Tracing Written Heritage in a Digital Age, ed. E.A. Ishac, Th. Csanády, Th. Zammit Lupi (with an Introduction by Robert A. Kitchen), Wiesbaden, 2021, p. 59-76 (Lemmatized Syriac index).