What AthDGC contributes; Symposium on Computational Methods for Intertextuality, Tel Aviv University, 22 June 2026
NKUA · Athens Digital Glossa Chronos Research Network
Nikolaos Lavidas, NKUA · Athens Digital Glossa Chronos Research Network · Διαχρονία Γλώσσας :Χρόνος
HFRI Project No. 20577 · Greece 2.0 NRRP · Compute: GRNET ARIS, allocation pa260305
Nikolaos Lavidas (PI, NKUA) · Kiki Nikiforidou (NKUA) · Dag Haug (Oslo, PROIEL Director) · Leonid Kulikov (Ghent) · Vassiliki Geka (NKUA) · Vassileios Symeonidis (NKUA) · Theodoros Michalareas (NKUA) · Sofia Chionidi (NKUA) · Anastasia Tsiropina (NKUA) · Eleni Plakoutsi (NKUA) · Evangelos Argyropoulos (NKUA)
Three universities (NKUA · Oslo · Ghent); eleven researchers; one platform. HFRI Project No. 20577 · Greece 2.0 NRRP.
AthDGC is the Athens node of the PROIEL family (“Athens-PROIEL”). Dag Haug, founding director of PROIEL at Oslo, is a co-author and co-PI. We adopt the PROIEL XML 2.0 schema verbatim and we extend it diachronically.
Historical linguistics has no native speakers. It has only texts.
The only window onto Archaic, Classical, Koine, Late Antique, Byzantine, Late Byzantine, and Early Modern Greek is the written record. Every claim about the syntax of a period reduces to a claim about what its surviving texts permit and what they exclude.
In a corpus that spans 3,000 years of one language, intertextuality is not a stylistic ornament. It is the primary data-generating process.
The same canon is re-rendered by every generation:
Retelling. Same language, later century. Same story, new register. (Niketas Choniates retelling Homer in a 12th-century Byzantine epitome.)
Retranslation. Same canon, target language shifts. (Hebrew Bible → Greek Septuagint → Latin Vulgate → Gothic Wulfila → OCS Marianus → Classical Armenian.)
Inner-textual citation. Surface re-use of one text inside another. (NT quoting LXX; Patristic homily quoting NT; Byzantine chronicle quoting Patristic.)
Under sustained written language contact, donor and recipient grammars do not displace one another. They coexist as distinct active systems inside the same writer.
Lavidas, N. 2021. The Diachrony of Written Language Contact: A Contrastive Approach. Brill’s Studies in Historical Linguistics. Leiden, Boston: Brill.
Written transmission has its own physics:
Each of these leaves a different signature in the syntactic record.
PROIEL XML 2.0 is the file format that stores each sentence as a tree of word-by-word grammatical relations. Developed at Oslo for the early Indo-European languages; extended in AthDGC to all of Greek.
The same 26-relation inventory annotates:
One schema; eight periods; queryable comparison.
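To make "a tree of word-by-word grammatical relations" concrete, here is a minimal Python sketch that reads one sentence and lists its head-relation-dependent edges. The fragment is hand-written for illustration and is not taken from the AthDGC corpus. The attribute names (`form`, `lemma`, `part-of-speech`, `head-id`, `relation`) and the relation labels (`pred`, `sub`, `aux`) are assumptions based on publicly documented PROIEL conventions, so they should be checked against the actual schema before reuse.

```python
import xml.etree.ElementTree as ET

# Hand-written illustration (NOT corpus data). Attribute names and relation
# labels are assumed to follow PROIEL conventions.
SAMPLE = """
<sentence id="1">
  <token id="1" form="καὶ" lemma="καί" part-of-speech="C-" head-id="2" relation="aux"/>
  <token id="2" form="ἐγένετο" lemma="γίγνομαι" part-of-speech="V-" relation="pred"/>
  <token id="3" form="φῶς" lemma="φῶς" part-of-speech="Nb" head-id="2" relation="sub"/>
</sentence>
"""


def dependency_edges(sentence_xml):
    """Return (head form, relation, dependent form) for every token."""
    sentence = ET.fromstring(sentence_xml)
    forms = {tok.get("id"): tok.get("form") for tok in sentence.iter("token")}
    edges = []
    for tok in sentence.iter("token"):
        # A token without head-id is treated here as the root of the tree.
        head_form = forms.get(tok.get("head-id"), "ROOT")
        edges.append((head_form, tok.get("relation"), tok.get("form")))
    return edges


edges = dependency_edges(SAMPLE)
for head, relation, dependent in edges:
    print(f"{head} --{relation}--> {dependent}")
```

Because every period would share the same relation labels, the same function could be run unchanged on a Classical or a Byzantine sentence, which is what makes the comparison queryable.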
The diachronic extension was first proposed by Lavidas and Haug (2012, Thessaloniki-Oslo PROIEL pilot on Sphrantzes’ Chronicle). AthDGC v0.4 ships that 2012 idea, generalised to the full diachronic span.
Every partition carries a stable identity:
athdgc.<author>.<work>.<src_lang>.<tgt_lang>.<translator>.v<revision>
Example:
athdgc.017.001.grc.000.000.v1 (Sphrantzes, Chronicle, Greek)
Every Text Id resolves to a row in a metadata register, to a source edition, and to a citable printed reference.
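The identifier template above can be sketched as a small parser. This is an illustration only: the `TextId` class, `parse_text_id`, and `is_translation` are hypothetical names, not part of any AthDGC tooling. Reading `000` as "no target language, no translator" is an assumption inferred from the single Sphrantzes example.

```python
import re
from typing import NamedTuple


class TextId(NamedTuple):
    author: str
    work: str
    src_lang: str
    tgt_lang: str
    translator: str
    revision: int


# Fixed prefix, five dot-separated fields, then v<revision>, per the template.
TEXT_ID_RE = re.compile(
    r"athdgc\.(?P<author>[^.]+)\.(?P<work>[^.]+)\.(?P<src_lang>[^.]+)"
    r"\.(?P<tgt_lang>[^.]+)\.(?P<translator>[^.]+)\.v(?P<revision>\d+)"
)


def parse_text_id(text_id: str) -> TextId:
    """Split a Text Id into its named fields, or raise ValueError."""
    match = TEXT_ID_RE.fullmatch(text_id)
    if match is None:
        raise ValueError(f"not a well-formed Text Id: {text_id!r}")
    fields = match.groupdict()
    fields["revision"] = int(fields["revision"])
    return TextId(**fields)


def is_translation(tid: TextId) -> bool:
    # ASSUMPTION: "000" in the target-language slot marks an untranslated text.
    return tid.tgt_lang != "000"


sphrantzes = parse_text_id("athdgc.017.001.grc.000.000.v1")
print(sphrantzes.author, sphrantzes.work, sphrantzes.src_lang, sphrantzes.revision)
```

Keeping the fields as strings (apart from the revision) preserves leading zeros such as `017`, which matters if the identifier is later used as a key into the metadata register.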
All open source. MIT · Apache 2.0 · BSD.
Eight diachronic periods, all on one PROIEL XML 2.0 spine:
| Period | Status |
|---|---|
| Archaic, Classical, Hellenistic, Koine | partitions in flight |
| Late Antique, Byzantine | partitions in flight |
| Late Byzantine, Early Modern, Modern | partitions in flight |
New Testament verse-level alignment (cross-alignment: machine-readable matching of corresponding sentences and words between texts in different languages):
CC BY 4.0 for the corpus. Citable per-version DOIs on each release.
10.91 M tokens PROIEL-annotated · 173 annotation batches complete · 8 Greek periods covered · 3 Indo-European parallels currently aligned at the New Testament
Dashboard snapshot: 19 June 2026 21:17 Athens. Latin, Gothic, Old Church Slavonic aligned in v0.4; Classical Armenian in ingestion for v0.5. v0.5 target 16 M tokens; v0.6 target 24 M.
Thread A. The Homeric chain. Iliad 1.1 in Archaic Greek → its Byzantine epitome (Tzetzes) → its 1955 modern Greek prose retelling (Kakridis-Kazantzakis).
Thread B. The New-Testament chain. John 1.1 in Koine → its Vulgate translation → its Gothic and Old Church Slavonic sisters.
Thread C. The Septuagint chain. Psalm 1.1 in Hebrew (source) → LXX Greek → Vulgate → Modern Greek liturgy.
The Septuagint (LXX) is the most consequential single text in the Greek chain:
Two well-studied phenomena that AthDGC makes queryable across periods:
Both are syntactic transfers from Hebrew. Both propagate outward, through Vulgate, Gothic, Old Church Slavonic.
Genesis 1:3-4 (the fiat lux sequence)
Heb. וַיֹּ֥אמֶר אֱלֹהִ֖ים יְהִ֣י א֑וֹר וַֽיְהִי־א֖וֹר׃ וַיַּ֧רְא אֱלֹהִ֛ים אֶת־הָא֖וֹר כִּי־טֽוֹב
Tr. wa-yyōʾmer ʾĕlōhîm yəhî ʾôr wa-yhî ʾôr. wa-yyarʾ ʾĕlōhîm ʾeṯ ha-ʾôr kî ṭôḇ.
LXX καὶ εἶπεν ὁ θεός Γενηθήτω φῶς. καὶ ἐγένετο φῶς. καὶ εἶδεν ὁ θεὸς τὸ φῶς ὅτι καλόν.
Gloss and said the god let-become light. and became light. and saw the god the light that good.
Three Hebrew wayyiqtol forms (consecutive narrative tense) are each calqued by clause-initial καί + finite verb in the LXX (καὶ εἶπεν, καὶ ἐγένετο, καὶ εἶδεν). Classical Greek prose would instead subordinate or correlate (participle, μέν / δέ, ὅτε clause).
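The count of three can be checked with a rough surface test on the LXX line quoted above. The sketch is a proxy only: it splits on full stops and tests whether each clause opens with καί, ignoring accents. It does not verify that a finite verb follows, which would need the morphological annotation. The function names are hypothetical.

```python
import unicodedata

# LXX Genesis 1:3-4, as quoted in the example above.
LXX_GEN_1_3_4 = (
    "καὶ εἶπεν ὁ θεός Γενηθήτω φῶς. καὶ ἐγένετο φῶς. "
    "καὶ εἶδεν ὁ θεὸς τὸ φῶς ὅτι καλόν."
)


def strip_diacritics(text):
    """Remove accents and breathings so that καὶ and καί compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def kai_initial_clauses(text):
    """Return the full-stop-delimited clauses whose first word is καί."""
    clauses = [c.strip() for c in text.split(".") if c.strip()]
    return [
        clause
        for clause in clauses
        if strip_diacritics(clause.split()[0]).lower() == "και"
    ]


hits = kai_initial_clauses(LXX_GEN_1_3_4)
for clause in hits:
    print(clause)
```

On annotated data the same question would be asked of the dependency tree rather than of the raw string, which removes the dependence on punctuation.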
The eight periods of Greek on the AthDGC PROIEL spine:
Contact: nikolaos.lavidas@gmail.com · nlavidas@enl.uoa.gr
Διαχρονία Γλώσσας :Χρόνος · Athens Digital Glossa Chronos
Funded by the Hellenic Foundation for Research and Innovation (HFRI) under the 3rd Call for HFRI Research Projects to support Post-Doctoral Researchers, Project No. 20577; with complementary support from the Greece 2.0 National Recovery and Resilience Plan.
Compute supplied by GRNET ARIS (the Greek national high-performance computing cluster), allocation pa260305.
Corpus licence CC BY 4.0. Tooling licences MIT / Apache 2.0 / BSD.