PROIEL samples
Representative sentences from each diachronic period of the AthDGC corpus, with full PROIEL XML 2.0 annotation and dependency-tree visualization
Every sample on this page is real, annotated PROIEL XML 2.0 output from the AthDGC pipeline. For the Iliad 1.1 demo below you see the full AthDGC review format in seven sections: (i) the one-glance PROIEL card (simplest), (ii) the predicate-argument valency frame, (iii) the PROIEL token-review table, (iv) the colour-coded compact dependency overview, (v) the integrated linguist dashboard (morphology + syntax + info structure + argument structure + LightSIDE-AthDGC features), (vi) the generative-grammar projection (X-bar / minimalist), and (vii) the thematic-role flow (functional / role-and-reference). The raw PROIEL XML follows. For the period-by-period samples we keep only the colour-coded compact dependency overview for screen economy.
No CoNLL-U. No UD export. PROIEL XML 2.0 only.
tree key: indentation = depth · ▸ = head · italic label = PROIEL relation (sub, obj, obl, adv, atr, pred, aux, comp, xobj, nonsub, etc.)
Iliad 1.1 - the AthDGC PROIEL review format
μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος "Sing, goddess, of the wrath of Peleus' son Achilles"
The format below is the AthDGC house style for presenting PROIEL XML 2.0 annotation. It is designed to be reviewed in seconds by any linguist (generative or functional) who needs to verify a single sentence's parse without prior knowledge of AthDGC conventions. Seven sections, in this fixed order: (i) one-glance PROIEL card, (ii) predicate-argument valency frame, (iii) PROIEL token-review table, (iv) compact dependency overview, (v) integrated linguist dashboard, (vi) generative-grammar X-bar projection, and (vii) thematic-role flow. Coloured by argument role; scan top-to-bottom.
(i) One-glance PROIEL card
The simplest possible visualization of a PROIEL-annotated sentence: each token on its own line in reading order, with its surface form, its head pointer (token id + head's surface form), and its coloured PROIEL relation. Scan top-to-bottom and the whole analysis is visible at a glance.
Read it: ἄειδε is the root predicate. μῆνιν (the wrath) is its direct object. θεά (the goddess) is its vocative addressee. Ἀχιλῆος attributively modifies μῆνιν. Πηληϊάδεω attributively modifies Ἀχιλῆος. That is the entire syntactic analysis of Iliad 1.1, in five lines, PROIEL-strict.
(ii) Predicate-argument valency frame
The verb's full valency frame: voice / aspect / mood / person-number on top, followed by each PROIEL argument slot with its filler, morphology, and semantic role gloss. Generative linguists read it as a head-projection summary; functional linguists read it as a participant-and-role inventory. Either way it is reviewable in one glance.
Reading the frame: the verb ἄειδε "sing" has three argument slots in PROIEL - sub (subject, here a 2sg implicit pronoun, semantic role agent), obj (direct object, here the accusative μῆνιν + its genitive attributives, semantic role theme/patient), and voc (vocative addressee, here θεά, semantic role addressee). The bottom line gives the linear-bracketed form preferred by generative readers; the slot table gives the role-and-filler form preferred by functional readers.
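A minimal sketch of how such a frame could be collected from the token list, assuming a plain tuple representation of the Iliad 1.1 tokens. The names `SLOT_RELATIONS`, `valency_frame` and `bracketed` are illustrative and are not part of the AthDGC pipeline. The implicit 2sg subject has no token of its own, so it does not appear in the computed frame.

```python
# Relations treated as argument slots in this sketch (an assumption).
SLOT_RELATIONS = ("sub", "obj", "voc")

# Iliad 1.1 tokens: (id, form, head id or None for the root, relation).
TOKENS = [
    (1, "μῆνιν", 2, "obj"),
    (2, "ἄειδε", None, "pred"),
    (3, "θεά", 2, "voc"),
    (4, "Πηληϊάδεω", 5, "atr"),
    (5, "Ἀχιλῆος", 1, "atr"),
]


def valency_frame(tokens, verb_id):
    """Map each slot relation to the forms that depend directly on the verb."""
    frame = {}
    for _tid, form, head, rel in tokens:
        if head == verb_id and rel in SLOT_RELATIONS:
            frame.setdefault(rel, []).append(form)
    return frame


def bracketed(tokens, verb_id):
    """Render the frame in a linear-bracketed form, e.g. verb[obj:X, voc:Y]."""
    verb = next(form for tid, form, _head, _rel in tokens if tid == verb_id)
    frame = valency_frame(tokens, verb_id)
    slots = ", ".join(f"{rel}:{' '.join(forms)}" for rel, forms in frame.items())
    return f"{verb}[{slots}]"


print(bracketed(TOKENS, 2))
```

Only direct dependents of the verb fill a slot; the genitive attributives hang off μῆνιν and are therefore reached through the obj filler rather than listed separately.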
(iii) PROIEL token-review table
| # | form | lemma | POS | morphology | head | relation | function |
|---|---|---|---|---|---|---|---|
| 1 | μῆνιν | μῆνις | Nb (common noun) | acc.sg.f | 2 → ἄειδε | obj | direct object of ἄειδε "sing" |
| 2 | ἄειδε | ἀείδω | V (verb) | 2sg.impv.act | (root) | pred | root predicate (clause head) |
| 3 | θεά | θεά | Nb (common noun) | voc.sg.f | 2 → ἄειδε | voc | vocative addressee of ἄειδε "sing" |
| 4 | Πηληϊάδεω | Πηληϊάδης | Ne (proper noun) | gen.sg.m | 5 → Ἀχιλῆος | atr | attributive of Ἀχιλῆος "of Achilles" |
| 5 | Ἀχιλῆος | Ἀχιλλεύς | Ne (proper noun) | gen.sg.m | 1 → μῆνιν | atr | attributive of μῆνιν "of the wrath" |
Reviewer key. The head column shows both the head token's numeric id and its surface form so head assignment can be verified at a glance without re-counting positions. The relation column carries the PROIEL relation label exactly as stored in the XML. The function column glosses the relation in plain prose so an annotator can spot an obvious error (a vocative tagged as subject, an oblique tagged as object, etc.) without consulting the PROIEL guidelines.
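The head column described above can be derived mechanically from the XML. The following is a rough sketch using only Python's standard library; it assumes the attribute names used in this page's samples (`id`, `form`, `head-id`, `relation`) and treats an empty `head-id` as the root marker. `review_rows` is a hypothetical helper, not the AthDGC renderer.

```python
import xml.etree.ElementTree as ET

# Iliad 1.1, reduced to the attributes the head and relation columns need.
SAMPLE = """
<sentence id="hom.il.1.1" status="annotated">
  <token id="1" form="μῆνιν" head-id="2" relation="obj"/>
  <token id="2" form="ἄειδε" head-id="" relation="pred"/>
  <token id="3" form="θεά" head-id="2" relation="voc"/>
  <token id="4" form="Πηληϊάδεω" head-id="5" relation="atr"/>
  <token id="5" form="Ἀχιλῆος" head-id="1" relation="atr"/>
</sentence>
"""


def review_rows(xml_text):
    """Return one (id, form, head display, relation) tuple per token."""
    sentence = ET.fromstring(xml_text.strip())
    tokens = sentence.findall("token")
    form_by_id = {t.get("id"): t.get("form") for t in tokens}
    rows = []
    for t in tokens:
        head = t.get("head-id")
        # An empty head-id marks the root in these samples.
        head_display = f"{head} → {form_by_id[head]}" if head else "(root)"
        rows.append((t.get("id"), t.get("form"), head_display, t.get("relation")))
    return rows


for row in review_rows(SAMPLE):
    print(" | ".join(row))
```

Looking the head form up by id, rather than by position, means a dangling `head-id` raises a `KeyError` instead of silently printing the wrong head.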
(iv) Compact dependency overview
Tree depth = indentation. Relation label is coloured by role (predicate, core argument, attributive, vocative, etc.) so the whole sentence is readable in one scan.
ἄειδε pred · 2sg.impv.act (root)
├ μῆνιν obj · acc.sg.f
│ └ Ἀχιλῆος atr · gen.sg.m
│   └ Πηληϊάδεω atr · gen.sg.m
└ θεά voc · voc.sg.f
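The indentation-equals-depth layout can be produced from head pointers alone. The sketch below is one possible way to do it, not the AthDGC renderer; `render_tree` is a hypothetical name, and the morphology labels and colours are left out to keep the example short.

```python
# (id, form, head id or None for the root, relation), from the token-review table.
TOKENS = [
    (1, "μῆνιν", 2, "obj"),
    (2, "ἄειδε", None, "pred"),
    (3, "θεά", 2, "voc"),
    (4, "Πηληϊάδεω", 5, "atr"),
    (5, "Ἀχιλῆος", 1, "atr"),
]


def render_tree(tokens):
    """Return the dependency tree as a list of indented text lines."""
    children = {}
    info = {}
    root = None
    for tid, form, head, rel in tokens:
        info[tid] = (form, rel)
        if head is None:
            root = tid
        else:
            children.setdefault(head, []).append(tid)

    lines = ["%s %s" % info[root]]

    def walk(node, prefix):
        kids = children.get(node, [])
        for i, kid in enumerate(kids):
            last = i == len(kids) - 1
            form, rel = info[kid]
            lines.append(prefix + ("└ " if last else "├ ") + f"{form} {rel}")
            # Keep a vertical rule open only while siblings remain below.
            walk(kid, prefix + ("  " if last else "│ "))

    walk(root, "")
    return lines


print("\n".join(render_tree(TOKENS)))
```

Children are listed in token order, which is why μῆνιν precedes θεά under the root.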
(v) Integrated linguist dashboard
One panel, five rows, scan-in-seconds. The five rows are the five layers a linguist usually checks when reviewing a single sentence's analysis: morphology, PROIEL syntax, information structure, argument structure, and LightSIDE-AthDGC features (what this sentence contributes to a downstream classifier). The target token is the verb ἄειδε (the clause head). PROIEL relation labels keep their colour from the legend below.
Reviewer key. Each row is one analytic layer; each layer reuses the standard PROIEL relation colour where applicable (pred burgundy, obj teal, voc amber, atr ochre). The last row shows the actual feature strings that LightSIDE-AthDGC emits for this single token - read it as "what a classifier would learn from this sentence". The square-bracketed period signal is illustrative.
(vi) Generative-grammar projection
For readers in the generative tradition: the same sentence rendered as a minimalist phrase-structure projection. PROIEL relations map onto X-bar slots: pred -> head of TP, sub -> Spec,TP, obj -> complement of V, voc -> adjunct to CP (discourse layer), atr -> N-bar adjunct. Tree drawn with thin rules only, no boxes; the AthDGC house style applies here too.
Reading the projection: the imperative carries a covert 2sg subject (pro_2sg) in Spec,TP; the V head ἄειδε selects the DP complement μῆνιν (with its genitive attributive chain Ἀχιλῆος - Πηληϊάδεω); θεά sits at the CP discourse layer as a vocative adjunct. Each of the five overt PROIEL tokens maps onto exactly one node of this projection; the covert pro_2sg is the only node with no token behind it.
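The relation-to-slot correspondence stated in this section amounts to a lookup table. A small sketch, with the slot names copied from the mapping given for this section; `XBAR_SLOT` and `project` are illustrative names only, and relations outside the mapping are reported as unmapped rather than guessed.

```python
# PROIEL relation -> X-bar slot, as listed in the generative-grammar section.
XBAR_SLOT = {
    "pred": "head of TP",
    "sub": "Spec,TP",
    "obj": "complement of V",
    "voc": "adjunct to CP",
    "atr": "N-bar adjunct",
}


def project(tokens):
    """Pair each (form, relation) token with its X-bar slot."""
    return [(form, XBAR_SLOT.get(rel, "unmapped")) for form, rel in tokens]


ILIAD_1_1 = [
    ("μῆνιν", "obj"),
    ("ἄειδε", "pred"),
    ("θεά", "voc"),
    ("Πηληϊάδεω", "atr"),
    ("Ἀχιλῆος", "atr"),
]

for form, slot in project(ILIAD_1_1):
    print(f"{form}: {slot}")
```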
(vii) Thematic-role flow
For readers in the functional / role-and-reference tradition: the same sentence drawn as participant-and-role flow. Thematic roles labelled, voice/aspect/mood on the predicate head.
Reading the flow: the agent (implicit 2sg) acts on the patient μῆνιν via the predicate ἄειδε "sing"; the discourse-layer vocative θεά sits below the predicate as the addressee. Functional linguists read this as a transparent participant-and-role inventory; generative linguists can re-project it onto the X-bar tree above.
Relation-colour legend
pred root predicate · sub subject · obj direct object · iobj indirect object · obl oblique · atr attributive · adv adverbial · voc vocative · aux auxiliary · xcomp / comp complement clause · coord coordination
Raw PROIEL XML 2.0
<sentence id="hom.il.1.1" status="annotated">
<token id="1" form="μῆνιν" lemma="μῆνις" pos="Nb-s---fa-" head-id="2" relation="obj" morphology="Nb-s---fa-"/>
<token id="2" form="ἄειδε" lemma="ἀείδω" pos="V--sma---i" head-id="" relation="pred" morphology="V--sma---i"/>
<token id="3" form="θεά" lemma="θεά" pos="Nb-s---fv-" head-id="2" relation="voc" morphology="Nb-s---fv-"/>
<token id="4" form="Πηληϊάδεω" lemma="Πηληϊάδης" pos="Nb-s---mg-" head-id="5" relation="atr" morphology="Nb-s---mg-"/>
<token id="5" form="Ἀχιλῆος" lemma="Ἀχιλλεύς" pos="Nb-s---mg-" head-id="1" relation="atr" morphology="Nb-s---mg-"/>
</sentence>
Full-paragraph samples
The three marquee passages are extended here from one line to their full opening paragraph, each with a line-by-line PROIEL compact dependency overview. These are the showcase examples for the full-corpus style the v0.5 release will use.
Homer, Iliad 1.1-7 (the proem)
μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος οὐλομένην, ἣ μυρί' Ἀχαιοῖς ἄλγε' ἔθηκε, πολλὰς δ' ἰφθίμους ψυχὰς Ἄϊδι προΐαψεν ἡρώων, αὐτοὺς δὲ ἑλώρια τεῦχε κύνεσσιν οἰωνοῖσί τε πᾶσι, Διὸς δ' ἐτελείετο βουλή, ἐξ οὗ δὴ τὰ πρῶτα διαστήτην ἐρίσαντε Ἀτρεΐδης τε ἄναξ ἀνδρῶν καὶ δῖος Ἀχιλλεύς.
Seven lines, parsed here as five clause-level units. Each unit is parsed independently; PROIEL relations are colour-coded.
1.1-2
ἄειδε pred · 2sg.impv.act (root)
├ μῆνιν obj · acc.sg.f
│ ├ οὐλομένην atr · ptcp.acc.sg.f
│ ├ ἣ … ἔθηκε atr · rel.cl
│ │ ├ μυρία ἄλγεα obj · acc.pl.n
│ │ └ Ἀχαιοῖς iobj · dat.pl.m
│ └ Ἀχιλῆος atr · gen.sg.m
│   └ Πηληϊάδεω atr · gen.sg.m
└ θεά voc · voc.sg.f
1.3-4a
προΐαψεν pred · aor.ind.act.3sg
├ πολλὰς ψυχάς obj · acc.pl.f
│ └ ἰφθίμους ἡρώων atr
└ Ἄϊδι iobj · dat.sg.m
1.4b-5a
τεῦχε pred · impf.ind.act.3sg
├ αὐτοὺς obj · acc.pl.m
├ ἑλώρια xobj · secondary predicate
└ κύνεσσιν οἰωνοῖσί τε πᾶσι iobj · dat.pl
1.5b
ἐτελείετο pred · impf.ind.mp.3sg
└ Διὸς βουλή sub · nom.sg.f
1.6-7
διαστήτην pred · aor.ind.act.3du
├ Ἀτρεΐδης τε ἄναξ ἀνδρῶν καὶ δῖος Ἀχιλλεύς sub · nom.du
└ ἐρίσαντε xobj · aor.ptcp.nom.du
New Testament, Gospel of John 1:1-3
Ἐν ἀρχῇ ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν, καὶ θεὸς ἦν ὁ λόγος. Οὗτος ἦν ἐν ἀρχῇ πρὸς τὸν θεόν. Πάντα δι' αὐτοῦ ἐγένετο, καὶ χωρὶς αὐτοῦ ἐγένετο οὐδὲ ἕν ὃ γέγονεν.
John 1:1 (three coordinated copular clauses)
ἦν pred · impf.ind.act.3sg
├ Ἐν ἀρχῇ obl · prep+dat
└ ὁ λόγος sub · nom.sg.m
καὶ ἦν coord · impf.ind.act.3sg
├ ὁ λόγος sub
└ πρὸς τὸν θεόν obl · prep+acc
καὶ ἦν coord
├ ὁ λόγος sub
└ θεὸς pred-nom · nom.sg.m
John 1:2
ἦν pred · impf.ind.act.3sg
├ Οὗτος sub · nom.sg.m
├ ἐν ἀρχῇ obl
└ πρὸς τὸν θεόν obl
John 1:3
ἐγένετο pred · aor.ind.mid.3sg
├ Πάντα sub · nom.pl.n
└ δι' αὐτοῦ obl · agent / instrument
καὶ ἐγένετο coord
├ οὐδὲ ἕν ὃ γέγονεν sub · nom.sg.n + rel.cl
└ χωρὶς αὐτοῦ obl · prep+gen
NT John 1:1-3 is also the cross-lingual alignment showcase (Latin/Gothic/OCS); see the cross-lingual section further down for the four-language parallel of John 1:1.
Plato, Apology 17a (opening period)
Ὅ τι μὲν ὑμεῖς, ὦ ἄνδρες Ἀθηναῖοι, πεπόνθατε ὑπὸ τῶν ἐμῶν κατηγόρων, οὐκ οἶδα· ἐγὼ δ' οὖν καὶ αὐτὸς ὑπ' αὐτῶν ὀλίγου ἐμαυτοῦ ἐπελαθόμην, οὕτω πιθανῶς ἔλεγον. καίτοι ἀληθές γε ὡς ἔπος εἰπεῖν οὐδὲν εἰρήκασιν.
Three sentences. Sentence 1 is the Plato Apology 17a sample that reappears in the period-by-period samples below; sentences 2-3 extend the opening period.
17a.1
οἶδα pred · prs.act.1sg
├ οὐκ adv · neg
└ πεπόνθατε xcomp · prf.act.2pl
  ├ ὑμεῖς sub · nom.pl
  │ └ ἄνδρες Ἀθηναῖοι voc · voc.pl
  ├ ὅ τι obj · acc.sg.n (rel)
  └ ὑπὸ τῶν ἐμῶν κατηγόρων obl · prep+gen.pl.m
17a.2
ἐπελαθόμην pred · aor.ind.mid.1sg
├ ἐγὼ sub · nom.sg.1
│ └ αὐτὸς atr · intensifier
├ ἐμαυτοῦ obj · gen.sg.1 (reflex.)
├ ὀλίγου adv · "almost"
└ ὑπ' αὐτῶν obl · agent
ἔλεγον parpred · impf.ind.act.3pl (parenthetical)
└ οὕτω πιθανῶς adv · manner
17a.3
εἰρήκασιν pred · prf.ind.act.3pl
├ οὐδὲν obj · acc.sg.n
│ └ ἀληθές γε atr · acc.sg.n
└ ὡς ἔπος εἰπεῖν adv · idiomatic
These three paragraphs illustrate the corpus's full-text annotation format. In v0.5 every sample on this page will scale to its full passage at this level of detail.
Period-by-period samples
The samples below use the compact dependency overview (section (iv) above) for screen economy. The full review format (synopsis + token table + generative projection + thematic-role flow) is available per sample via tools/render_sample.py in the source-code pack.
Archaic - Homer, Iliad 1.1
μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος "Sing, goddess, of the wrath of Peleus' son Achilles"
▸ ἄειδε pred · imp.act.2sg
├ μῆνιν obj · acc.sg.f
│ └ Ἀχιλῆος atr · gen.sg.m
│   └ Πηληϊάδεω atr · gen.sg.m
└ θεά voc · voc.sg.f
| verb | voice | aspect | subject | object | oblique |
|---|---|---|---|---|---|
| ἄειδε | active | imperfective | (2 sg implicit, voc. θεά) | μῆνιν (acc.) | - |
<sentence id="hom.il.1.1" status="annotated">
<token id="1" form="μῆνιν" lemma="μῆνις" pos="Nb-s---fa-" head-id="2" relation="obj" morphology="Nb-s---fa-"/>
<token id="2" form="ἄειδε" lemma="ἀείδω" pos="V--sma---i" head-id="" relation="pred" morphology="V--sma---i"/>
<token id="3" form="θεά" lemma="θεά" pos="Nb-s---fv-" head-id="2" relation="voc" morphology="Nb-s---fv-"/>
<token id="4" form="Πηληϊάδεω" lemma="Πηληϊάδης" pos="Nb-s---mg-" head-id="5" relation="atr" morphology="Nb-s---mg-"/>
<token id="5" form="Ἀχιλῆος" lemma="Ἀχιλλεύς" pos="Nb-s---mg-" head-id="1" relation="atr" morphology="Nb-s---mg-"/>
</sentence>
Classical - Plato, Apology 17a
ὅ τι μὲν ὑμεῖς, ὦ ἄνδρες Ἀθηναῖοι, πεπόνθατε ὑπὸ τῶν ἐμῶν κατηγόρων, οὐκ οἶδα "What you, men of Athens, have suffered from my accusers I do not know"
οἶδα pred · prs.act.1sg
├ οὐκ adv
└ πεπόνθατε xcomp · prf.act.2pl
  ├ ὑμεῖς sub · nom.pl
  ├ ὅ τι obj · acc.sg.n (rel.)
  └ ὑπὸ obl · agent
    └ κατηγόρων obj · gen.pl.m
<sentence id="pla.apol.17a.1" status="annotated">
<!-- Note: excerpt only; the tokens μέν, ὦ, ἄνδρες, Ἀθηναῖοι, τῶν and ἐμῶν are omitted, so token ids are not contiguous -->
<token id="1" form="ὅ τι" lemma="ὅστις" pos="Pr-s---na-" head-id="5" relation="obj" morphology="Pr-s---na-"/>
<token id="2" form="ὑμεῖς" lemma="σύ" pos="Pp2p---n--" head-id="5" relation="sub" morphology="Pp2p---n--"/>
<token id="5" form="πεπόνθατε" lemma="πάσχω" pos="V-2praia2p-" head-id="10" relation="xcomp" morphology="V-2praia2p-"/>
<token id="6" form="ὑπὸ" lemma="ὑπό" pos="R--------" head-id="5" relation="obl" morphology="R--------"/>
<token id="8" form="κατηγόρων" lemma="κατήγορος" pos="Nb-p---mg-" head-id="6" relation="obj" morphology="Nb-p---mg-"/>
<token id="9" form="οὐκ" lemma="οὐ" pos="Df--------" head-id="10" relation="adv" morphology="Df--------"/>
<token id="10" form="οἶδα" lemma="οἶδα" pos="V--sria1s-" head-id="" relation="pred" morphology="V--sria1s-"/>
</sentence>
Koine - New Testament, John 1:1
Ἐν ἀρχῇ ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν, καὶ θεὸς ἦν ὁ λόγος
ἦν pred · impf.ind.act.3sg
├ Ἐν ἀρχῇ obl · prep + dat.sg.f
└ ὁ λόγος sub · nom.sg.m
NT verse aligned cross-lingually to Latin (Vulgate), Gothic (Wulfila), and Old Church Slavonic (Marianus) via LaBSE sentence embedding + AwesomeAlign word alignment.
<sentence id="nt.john.1.1" status="annotated">
<token id="1" form="Ἐν" lemma="ἐν" pos="R--------" head-id="3" relation="obl" morphology="R--------"/>
<token id="2" form="ἀρχῇ" lemma="ἀρχή" pos="Nb-s---fd-" head-id="1" relation="obj" morphology="Nb-s---fd-"/>
<token id="3" form="ἦν" lemma="εἰμί" pos="V--siia3s-" head-id="" relation="pred" morphology="V--siia3s-"/>
<token id="4" form="ὁ" lemma="ὁ" pos="S-s---mn-" head-id="5" relation="aux" morphology="S-s---mn-"/>
<token id="5" form="λόγος" lemma="λόγος" pos="Nb-s---mn-" head-id="3" relation="sub" morphology="Nb-s---mn-"/>
</sentence>
Late Antique - Eusebius, Historia Ecclesiastica 1.1.1
Τὰς τῶν ἱερῶν ἀποστόλων διαδοχάς… γραφῇ παραδοῦναι προῄρημαι "I have chosen to commit to writing the succession of the holy apostles…"
προῄρημαι pred · prf.mid.1sg
├ παραδοῦναι xcomp · aor.act.inf
│ ├ διαδοχάς obj · acc.pl.f
│ │ └ τῶν ἀποστόλων atr · gen.pl.m
│ └ γραφῇ obl · dat.sg.f
└ (ἐγώ) sub · 1sg implicit
Byzantine - Anna Komnene, Alexias prologue 1.1
Ὁ χρόνος ῥέων ἀκάθεκτος καὶ ἀεί τι κινούμενος παρασύρει καὶ παραφέρει πάντα τὰ ἐν γενέσει "Time, flowing ungoverned and ever in motion, drags down and carries away all that is in becoming"
παρασύρει pred · prs.act.3sg
├ Ὁ χρόνος sub · nom.sg.m
│ ├ ῥέων atr · pres.act.ptcp
│ └ κινούμενος atr · pres.mid.ptcp
├ καὶ παραφέρει coord · prs.act.3sg
└ πάντα τὰ ἐν γενέσει obj · acc.pl.n
Modern - Cavafy, Ithaca (1911)
Σὰ βγεῖς στὸν πηγαιμὸ γιὰ τὴν Ἰθάκη, νὰ εὔχεσαι νὰ ἦναι μακρὺς ὁ δρόμος
εὔχεσαι pred · prs.mid.2sg.imp
├ Σὰ βγεῖς adv · temporal subord clause
└ νὰ ἦναι xcomp · prs.act.3sg
  ├ ὁ δρόμος sub · nom.sg.m
  └ μακρὺς pred-nom · adj.nom.sg.m
Cross-lingual NT alignment - John 1:1 in four languages
The PROIEL pipeline aligns the New Testament verse-by-verse across Greek + Latin (Vulgate) + Gothic (Wulfila) + Old Church Slavonic (Marianus). Below is John 1:1 in all four, with the same predicate-tree skeleton; the alignment edges run between the verbs ἦν / erat / was / бѣ. Classical Armenian ingestion is in progress.
Greek (Koine)
ἦν pred · impf.ind.act.3sg
├ Ἐν ἀρχῇ obl · prep + dat.sg.f
└ ὁ λόγος sub · nom.sg.m
Latin (Vulgate)
erat pred · impf.ind.act.3sg
├ In principio obl · prep + abl.sg.n
└ Verbum sub · nom.sg.n
Gothic (Wulfila)
was pred · pret.ind.3sg
├ in fruma obl · prep + dat.sg
└ waurd sub · nom.sg.n
OCS (Marianus)
бѣ pred · aor.3sg
├ искони adv · loc / adv
└ слово sub · nom.sg.n
Forthcoming v0.7 IE parallels: Sanskrit, Old English, Avestan, Old Persian, and Ukrainian (modern East-Slavic, via the Ostroh Bible 1581 and 20th-c. revisions) for continuity with the OCS witness. Classical Armenian is queued separately for v0.5.
Method: sentence-level alignment via LaBSE embeddings; word-level alignment via the AwesomeAlign procedure, which extracts alignments from mBERT contextual-embedding similarity. Phonetic cognate scoring via ASJP sound classes and LingPy edit distance.
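To illustrate the sentence-level step only: once each verse has an embedding, alignment reduces to picking the most similar target verse by cosine similarity. The sketch below uses tiny hand-written vectors as stand-ins for LaBSE embeddings (real ones have hundreds of dimensions), and a greedy best-match rule that is an assumption, not necessarily what the pipeline does. The verse labels and the `align` helper are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def align(source, target):
    """For each source verse, pick the target verse with the highest similarity."""
    return {
        sid: max(target, key=lambda tid: cosine(svec, target[tid]))
        for sid, svec in source.items()
    }


# Toy 3-dimensional vectors standing in for sentence embeddings.
GREEK = {"John 1:1": [0.9, 0.1, 0.0], "John 1:2": [0.1, 0.8, 0.3]}
LATIN = {"Ioh 1:1": [0.8, 0.2, 0.1], "Ioh 1:2": [0.2, 0.9, 0.2]}

print(align(GREEK, LATIN))
```

A greedy match like this can assign two source verses to the same target; a production aligner would normally add a one-to-one or monotonicity constraint.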
Open-access source provenance
Every source text in AthDGC is open-access: public domain or an open licence (CC-BY, CC-BY-SA, or equivalent). The annotation layer is AthDGC-original under CC-BY-4.0. The open-access chain is preserved through input -> annotation -> distribution. The map below records where each Greek period and each IE parallel draws its source text from.
Greek - per-period source map
| Period | Source archive | Licence | What we use |
|---|---|---|---|
| Archaic, Classical, Hellenistic | Perseus Digital Library (Tufts) | CC-BY-SA 4.0 | Homer, Hesiod, lyric, drama, Plato, Aristotle, Demosthenes, Lysias, Isocrates |
| Archaic, Classical, Hellenistic | Open Greek and Latin / First Thousand Years of Greek (Leipzig) | CC-BY-SA 4.0 | extends Perseus coverage; reference XML editions |
| Koine - NT | SBL Greek NT | SBL licence (CC-BY-equiv. scholarly) | NT backbone; verse-aligned to Latin / Gothic / OCS |
| Koine - NT | Tischendorf (1869-72), Westcott-Hort (1881) | public domain | cross-check editions; manuscript-tradition variants |
| Koine - LXX | Rahlfs (1935) via openscriptures.org | public domain | Septuagint backbone |
| Koine - documentary | Papyri.info / DDbDP | CC-BY 3.0 | papyrological Koine |
| Late Antique, Byzantine | Patrologia Graeca (Migne) via Documenta Catholica Omnia + Internet Archive | public domain | Athanasius, Basil, Chrysostom, Cyril, John of Damascus, Photios |
| Byzantine, Late Byzantine | Bibliotheca Augustana (Greek), pre-1928 Teubner | public domain | Psellos, Anna Komnene, Choniates, Akropolites, Pachymeres |
| Early Modern | Anemi (UoC) + Wikisource (el) | public domain | Cretan Renaissance (Kornaros, Chortatsis), Phanariot prose |
| Modern (19th c. - 1928) | Wikisource (el) + Anemi (UoC) | public domain | Solomos, Kalvos, Palamas, Cavafy, Papadiamantis |
| Modern (post-1928) | publisher-licensed editions | in copyright | annotation layer only; short quotation samples (fair use) |
IE parallels - per-language source map
| Language | Source archive | Licence | What we use |
|---|---|---|---|
| Latin (Vulgate) | Clementine Vulgate via Vulsearch + Latin Library | public domain | NT + Genesis-Psalms backbone |
| Gothic (Wulfila) | Wulfila Project (University of Antwerp) | CC-BY-SA | Codex Argenteus NT |
| OCS (Marianus) | TITUS (Frankfurt) | academic open access | Codex Marianus NT |
| Classical Armenian | TITUS + Digilib Armenian | academic open access | Zohrab NT (1805, public domain) |
| Sanskrit (Vedic, Brahmana, Upanisadic) | GRETIL (Goettingen) + SARIT + TITUS | academic open access | Rgveda, Brahmana prose, Upanisads (Brhad Aranyaka, Chandogya) |
| Old English (Wessex Gospels) | TEAMS + Bosworth-Toller + DOE corpus extracts | public domain + CC-BY-SA | Wessex Gospels (West Saxon) |
| Avestan (Yasna, Yashts) | TITUS (Frankfurt) | academic open access | Geldner edition (1886-95), public domain |
| Old Persian (Behistun) | TITUS + Kent (1953) PD transliteration | public domain + CC-BY (TITUS XML) | Behistun col. 1, DNa, DPe |
| Ukrainian (Ostroh 1581) | Ostroh Bible facsimile, National Library of Ukraine | public domain | NT + Genesis + Psalms |
| Ukrainian (20th-c. rev.) | Ohienko (1962), Khomenko (1963) | in copyright | annotation layer + short samples (fair use) |
What is republished and how
- Source text from any open-access archive above is republished verbatim in the PROIEL XML 2.0 token elements, with provenance recorded in the source header.
- PROIEL annotation layer (relation, head-id, morphology, lemma) is always AthDGC-original, released under CC-BY-4.0.
- In-copyright editions (post-1928 Loeb, post-1928 Teubner, post-1928 Modern Greek translations, 20th-c. Ukrainian) are never republished. We use the open-access antecedent (e.g. pre-1928 Teubner; SBL GNT instead of Nestle-Aland) or, where unavoidable, a short quotation sample under fair use with full attribution.
IE parallels - per-language showcase
Three of the four NT-aligned parallels in v0.4 (Latin, Gothic, OCS) ship as PROIEL XML 2.0 partitions; the fourth, Classical Armenian, is still in ingestion. The five v0.7 parallels (Sanskrit, Old English, Avestan, Old Persian, Ukrainian) are queued; the status column in the matrix below says what is annotated, sampled, in ingestion, or queued.
IE-parallel matrix at a glance
| Language | Family | Period | Stanza model | Status | Aligned verses (v0.4) |
|---|---|---|---|---|---|
| Greek (Koine) | IE / Hellenic | 1st-2nd c. AD | grc_proiel | annotated | 7,956 (NT backbone) |
| Latin (Vulgate) | IE / Italic | 4th c. AD | la_proiel | annotated | 7,956 |
| Gothic (Wulfila) | IE / Germanic East | 4th c. AD | got_proiel | annotated | 3,512 (NT extant) |
| OCS (Marianus) | IE / Slavic | 10th c. AD | cu_proiel | sampled | 6,861 |
| Classical Armenian | IE / Armenian | 5th c. AD | xcl_proiel (in dev.) | ingestion | 0 (queued for v0.5) |
| Sanskrit (Brahmana / Upanisadic) | IE / Indo-Iranian | 1000-500 BC | sa_vedic (queued) | queued (v0.7) | 0 |
| Old English (Wessex Gospels) | IE / Germanic West | 10th c. AD | ang_proiel (queued) | queued (v0.7) | 0 |
| Avestan (Yasna / Yashts) | IE / Indo-Iranian | 1000-500 BC | ae_proiel (queued) | queued (v0.7) | 0 |
| Old Persian (Behistun) | IE / Indo-Iranian | 6th-5th c. BC | peo_proiel (queued) | queued (v0.7) | 0 |
| Ukrainian (Ostroh + 20th-c. rev.) | IE / Slavic East | 1581 + 1962 | uk_dep (existing UD model adapted) | queued (v0.7) | 0 |
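As a quick reading aid for the last column, the aligned-verse counts can be expressed as a share of the Greek NT backbone. The figures are copied from the matrix above; the `coverage` helper is purely illustrative.

```python
# Aligned verses (v0.4), copied from the IE-parallel matrix.
ALIGNED_VERSES = {
    "Greek (Koine)": 7956,
    "Latin (Vulgate)": 7956,
    "Gothic (Wulfila)": 3512,
    "OCS (Marianus)": 6861,
    "Classical Armenian": 0,
}


def coverage(counts, backbone="Greek (Koine)"):
    """Return each language's aligned verses as a percentage of the backbone."""
    total = counts[backbone]
    return {lang: round(100 * n / total, 1) for lang, n in counts.items()}


for lang, pct in coverage(ALIGNED_VERSES).items():
    print(f"{lang}: {pct}%")
```

On these figures Gothic covers roughly 44% of the backbone and OCS roughly 86%.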
Sampled sentences per IE parallel
The four boxes below give one sample sentence per language already shipped or in ingestion at v0.4, each rendered as a compact AthDGC PROIEL tree. All four are taken from passages with an attested Koine Greek counterpart so the cross-lingual alignment edge runs verb-to-verb.
Latin (Vulgate) - Vulgate Matt. 5:3
Beati pauperes spiritu, quoniam ipsorum est regnum caelorum "Blessed are the poor in spirit, for theirs is the kingdom of heaven"
Beati pred-nom · nom.pl.m (predicative)
├ pauperes sub · nom.pl.m
│ └ spiritu obl · abl.sg.m (respect)
└ est coord · prs.ind.act.3sg
  ├ quoniam adv · sub-conj
  ├ ipsorum iobj · gen.pl.m (possessor)
  └ regnum sub · nom.sg.n
    └ caelorum atr · gen.pl.n
Gothic (Wulfila) - Mark 1:1
Anastodeins aiwaggeljons Iesuis Xristaus, sunaus gudis "Beginning of the Gospel of Jesus Christ, the Son of God"
Anastodeins pred-nom · nom.sg.f (predicative)
├ aiwaggeljons atr · gen.sg.f
├ Iesuis atr · gen.sg.m
│ └ Xristaus apos · gen.sg.m (apposition)
└ sunaus apos · gen.sg.m
  └ gudis atr · gen.sg.m
OCS (Marianus) - John 1:1
Искони бѣ слово, и слово бѣ оу бога "In the beginning was the Word, and the Word was with God"
бѣ pred · aor.act.3sg
├ слово sub · nom.sg.n
├ искони adv · loc / adv
└ бѣ coord · aor.act.3sg
  ├ слово sub · nom.sg.n
  └ оу бога obl · prep + gen.sg.m
Classical Armenian (ingestion) - Matt. 5:3
Erjankik' en alk'atk' hogwoy, zi noc'a ē ark'ayut'iwn erknits' "Blessed are the poor in spirit, for theirs is the kingdom of heaven"
Erjankik' pred-nom · nom.pl (predicative)
├ alk'atk' sub · nom.pl
│ └ hogwoy obl · instr.sg (respect)
└ ē coord · prs.ind.3sg
  ├ zi adv · sub-conj
  ├ noc'a iobj · gen.pl (possessor)
  └ ark'ayut'iwn sub · nom.sg
    └ erknits' atr · gen.pl
Armenian ingestion: editorial fix queued for v0.5; the surface tree above is the expected output of the xcl_proiel pipeline, which we are fine-tuning against the Zohrab edition.
v0.7 queued IE parallels - design previews
The five below are not in the v0.4 corpus, but their PROIEL frames have been scoped against the schema. Tree previews here are linguistically reviewed but not yet machine-generated.
Sanskrit (Vedic) - Brhad Aranyaka Upanisad 1.4.17 (preview)
atmaivedam agra asid eka eva "In the beginning this was the self alone, only one"
asid pred · impf.ind.act.3sg
├ atma sub · nom.sg.m (atman)
├ idam pred-nom · nom.sg.n (this world)
├ agra adv · loc.sg.n (beginning)
└ eka eva atr · nom.sg.m + emph
Status: queued for v0.7.
Old English (Wessex Gospels) - John 1:1 (preview)
On frymþe wæs Word, and þæt Word wæs mid Gode "In the beginning was the Word, and the Word was with God"
wæs pred · pret.ind.act.3sg
├ Word sub · nom.sg.n
├ On frymþe obl · prep + dat.sg.m
└ wæs coord · pret.ind.act.3sg
  ├ þæt Word sub · nom.sg.n + dem.
  └ mid Gode obl · prep + dat.sg.m
Status: queued for v0.7.
Avestan (Yasna 28.1) - preview
ahyā yāsā nemaŋhā ustānazastō rafedhrahyā "With this prayer of homage, with hands outstretched, I beseech the support"
yāsā pred · prs.subj.act.1sg
├ (1sg implicit) sub · 1sg implicit
├ ahyā nemaŋhā obl · gen + instr.sg (with this homage)
├ ustānazastō adv · bahuvrihi (outstretched-hands)
└ rafedhrahyā obj · gen.sg.m (support)
Status: queued for v0.7. Yasna 28 is the Gathic opening.
Old Persian (Behistun col. 1.1) - preview
adam Darayavaus xsayathiya vazraka "I am Darius, the great king"
adam sub · nom.sg.m (1sg pronoun)
├ (copula implicit) pred · null
├ Darayavaus pred-nom · nom.sg.m
├ xsayathiya apos · nom.sg.m
└ vazraka atr · nom.sg.m
Status: queued for v0.7.
Ukrainian (Ostroh Bible 1581 + 20th-c. rev.) - John 1:1 (preview)
На початку було Слово, і Слово було у Бога "In the beginning was the Word, and the Word was with God"
було pred · impf.ind.3sg.n
├ Слово sub · nom.sg.n
├ На початку obl · prep + loc.sg.m
└ було coord · impf.ind.3sg.n
  ├ Слово sub · nom.sg.n
  └ у Бога obl · prep + gen.sg.m
Status: queued for v0.7. The Ukrainian partition provides modern East-Slavic continuity with the OCS witness. Two diachronic stages will be included: 1581 Ostroh + a 20th-c. revision (Ohienko or Khomenko).
Why this matters cross-linguistically
The same pred + sub + obl skeleton recurs in the John 1:1 parallels above (Greek, Latin, Gothic, OCS, Old English, Ukrainian) and, with minor variation, in most of the other samples. What varies is what the schema treats as core argument vs adjunct (e.g. OCS искони glossed adv, Greek Ἐν ἀρχῇ glossed obl; the difference is whether a preposition is present). PROIEL's strict relation inventory makes those differences visible rather than hidden inside a generic dependency label. That is the property that lets LightSIDE-AthDGC train classifiers on argument-structure features cross-lingually.
Archaic - Hesiod, Theogony 1
Μουσάων Ἑλικωνιάδων ἀρχώμεθ' ἀείδειν "From the Muses of Helicon let us begin to sing"
ἀρχώμεθα pred · prs.subj.mid.1pl
├ (1pl implicit) sub · 1pl implicit
├ Μουσάων obl · gen.pl.f (source)
│ └ Ἑλικωνιάδων atr · gen.pl.f
└ ἀείδειν xcomp · prs.act.inf
Classical - Sophocles, Oedipus Tyrannus 1
Ὦ τέκνα, Κάδμου τοῦ πάλαι νέα τροφή "O children, latest brood of ancient Cadmus"
τροφή pred-nom · nom.sg.f (predicative)
├ τέκνα voc · voc.pl.n (addressee)
├ νέα atr · adj.nom.sg.f
└ Κάδμου atr · gen.sg.m
  └ τοῦ πάλαι atr · gen.sg.m + temporal adv
Koine - New Testament, John 1:14
καὶ ὁ λόγος σὰρξ ἐγένετο καὶ ἐσκήνωσεν ἐν ἡμῖν "And the Word became flesh and dwelt among us"
ἐγένετο pred · aor.ind.mid.3sg
├ ὁ λόγος sub · nom.sg.m
├ σὰρξ pred-nom · nom.sg.f
└ καὶ ἐσκήνωσεν coord · aor.ind.act.3sg
  └ ἐν ἡμῖν obl · prep + dat.pl
Late Antique - Basil the Great, Hexaemeron 1.1
Ἐν ἀρχῇ ἐποίησεν ὁ Θεὸς τὸν οὐρανὸν καὶ τὴν γῆν "In the beginning God made the heaven and the earth"
ἐποίησεν pred · aor.ind.act.3sg
├ ὁ Θεὸς sub · nom.sg.m
├ Ἐν ἀρχῇ obl · prep + dat.sg.f
└ τὸν οὐρανὸν obj · acc.sg.m
  └ καὶ τὴν γῆν coord · acc.sg.f
Byzantine - Michael Psellos, Chronographia 1.1
Βασιλεύσας Βασίλειος ἔτη πεντήκοντα ἑνὸς δέοντα "Basil [II] having reigned for fifty years less one"
Βασιλεύσας pred · aor.act.ptcp.nom.sg.m
├ Βασίλειος sub · nom.sg.m
└ ἔτη obl · acc.pl.n (duration)
  ├ πεντήκοντα atr · num
  └ ἑνὸς δέοντα atr · gen + ptcp
Late Byzantine - Niketas Choniates, Historia 1.1
Ὁ μὲν οὖν λόγος μοι περὶ τῶν Κομνηνῶν πρόεισιν "My discourse, then, proceeds concerning the Komnenoi"
πρόεισιν pred · prs.ind.act.3sg
├ Ὁ λόγος sub · nom.sg.m
│ └ μὲν οὖν adv · discourse ptcl
├ μοι iobj · dat.sg (possessor)
└ περὶ τῶν Κομνηνῶν obl · prep + gen.pl.m
The Late Byzantine partition is in active ingestion; Choniates' Χρονικὴ Διήγησις is the reference text. Expected at v0.6.
Early Modern - Cretan Renaissance: Erotokritos (Kornaros, c.1600), opening
Τοῦ Κύκλου τὰ γυρίσματα, ποὺ ἀνεβοκατεβαίνουν "The turnings of the Wheel, which rise and fall"
γυρίσματα pred-nom · nom.pl.n (head of nominal clause)
├ Τοῦ Κύκλου atr · gen.sg.m
└ ἀνεβοκατεβαίνουν atr · rel.cl · prs.ind.act.3pl
  └ ποὺ sub · rel.pron (subj of rel.cl)
The Early Modern partition is in active ingestion; Cretan Renaissance (Kornaros, Chortatsis) and the Phanariot prose tradition are the reference corpora. Expected at v0.6.
Additional canonical first lines
The six samples below expand the period coverage with further uncontroversial first-line parses. Each uses only the AthDGC compact dependency overview.
Classical - Aeschylus, Agamemnon 1
Θεοὺς μὲν αἰτῶ τῶνδ' ἀπαλλαγὴν πόνων "I ask the gods for release from these labours"
αἰτῶ pred · prs.ind.act.1sg
├ (1 sg implicit) sub · 1sg implicit
├ Θεοὺς obj · acc.pl.m
├ μὲν adv · discourse ptcl
└ ἀπαλλαγὴν obj · acc.sg.f
  └ τῶνδε πόνων atr · gen.pl.m
Classical - Thucydides, Historiae 1.1.1
Θουκυδίδης Ἀθηναῖος ξυνέγραψε τὸν πόλεμον τῶν Πελοποννησίων καὶ Ἀθηναίων "Thucydides the Athenian wrote up the war of the Peloponnesians and the Athenians"
ξυνέγραψε pred · aor.ind.act.3sg
├ Θουκυδίδης sub · nom.sg.m
│ └ Ἀθηναῖος atr · nom.sg.m
└ τὸν πόλεμον obj · acc.sg.m
  └ τῶν Πελοποννησίων καὶ Ἀθηναίων atr · gen.pl.m
Koine - Plutarch, Vita Alexandri 1.1
Τὸν Ἀλεξάνδρου τοῦ βασιλέως βίον γράφοντες "Writing the life of King Alexander"
γράφοντες pred · pres.act.ptcp.nom.pl.m
├ (1 pl implicit) sub · 1pl implicit
└ Τὸν βίον obj · acc.sg.m
  └ Ἀλεξάνδρου atr · gen.sg.m
    └ τοῦ βασιλέως atr · gen.sg.m
Note: Plutarch's Vita Alexandri opens with a participial clause (γράφοντες, nominative plural) that resolves in the following main clause. The full sentence-level annotation continues in the next sample id.
Late Antique - Procopius, De bellis 1.1.1 (De bello Persico 1.1.1)
Προκόπιος Καισαρεὺς τοὺς πολέμους ξυνέγραψεν "Procopius of Caesarea wrote up the wars"
ξυνέγραψεν pred · aor.ind.act.3sg
├ Προκόπιος sub · nom.sg.m
│ └ Καισαρεὺς atr · nom.sg.m
└ τοὺς πολέμους obj · acc.pl.m
Note: Procopius is consciously modelling Thucydides 1.1.1 above. Compare the two trees: both have the same pred[sub:nom, obj:acc] valency frame with atr:nom-nom ethnonym modifier. The retelling-chain explorer (v0.5) will surface such pairings automatically.
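The frame comparison described in this note can be sketched as a signature over the root's direct dependents. The data below is transcribed by hand from the Thucydides, Procopius and Photios trees on this page; `SLOT_ORDER`, `frame` and the restriction to sub / obj / obl are assumptions made for the example and say nothing about how the retelling-chain explorer will actually work.

```python
# Slots counted as part of the valency frame in this sketch, in display order.
SLOT_ORDER = ("sub", "obj", "obl")

# (relation, case) of each direct dependent of the root predicate.
THUCYDIDES = [("sub", "nom"), ("obj", "acc")]
PROCOPIUS = [("sub", "nom"), ("obj", "acc")]
PHOTIOS = [("sub", "nom"), ("iobj", "dat")]


def frame(dependents):
    """Return a signature such as pred[sub:nom, obj:acc]."""
    slots = [d for d in dependents if d[0] in SLOT_ORDER]
    slots.sort(key=lambda d: SLOT_ORDER.index(d[0]))
    return "pred[" + ", ".join(f"{rel}:{case}" for rel, case in slots) + "]"


print(frame(THUCYDIDES))
print(frame(THUCYDIDES) == frame(PROCOPIUS))
print(frame(THUCYDIDES) == frame(PHOTIOS))
```

A matching signature is only a first filter: it pairs Thucydides with Procopius here, but it would equally pair any two transitive clauses, so lexical overlap (ξυνέγραψε / ξυνέγραψεν) would still be needed to confirm a retelling.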
Byzantine - John of Damascus, De Fide Orthodoxa 1.1, opening
Θεὸν οὐδεὶς ἑώρακε πώποτε "No one has ever seen God"
ἑώρακε pred · prf.ind.act.3sg
├ οὐδεὶς sub · nom.sg.m
├ Θεὸν obj · acc.sg.m
└ πώποτε adv · temporal
John of Damascus opens with a direct quotation of John 1:18; see the retelling chain below.
Late Antique / Byzantine - Photios, Bibliotheca codex 1, opening
Ἀνέγνωσται ἡμῖν λόγος Θεοδώρου τοῦ Ἀντιοχέως "We read a treatise of Theodore of Antioch"
Ἀνέγνωσται pred · prf.ind.mp.3sg
├ λόγος sub · nom.sg.m
│ └ Θεοδώρου atr · gen.sg.m
│   └ τοῦ Ἀντιοχέως atr · gen.sg.m
└ ἡμῖν iobj · dat.pl
Larger paragraph samples - per-token annotation
Three further period-rich passages, each rendered as paragraph + colour-coded compact dependency overview + a per-token PROIEL annotation table (id, surface form, lemma, PROIEL POS, 10-character morphology, head id, relation). The table format is the AthDGC review format used on every annotated sentence in the v0.5 partitions.
Late Antique - Basil the Great, Hexaemeron 1.1.1-3
Ἐν ἀρχῇ ἐποίησεν ὁ Θεὸς τὸν οὐρανὸν καὶ τὴν γῆν. Πρέπουσα ἀρχὴ τῷ περὶ τῆς τοῦ κόσμου συστάσεως μέλλοντι διηγεῖσθαι, ἀρχὴν τῆς τῶν ὁρωμένων διακοσμήσεως προθεῖναι τοῦ λόγου.
Two sentences, annotated below as three units (Hex. 1.1.1-3): the opening verbatim Genesis citation, then a meta-commentary on the rhetorical propriety of opening with that citation.
Hex. 1.1.1 (the LXX-Genesis quotation) ἐποίησεν pred · aor.ind.act.3sg ├ ὁ Θεὸς sub · nom.sg.m ├ Ἐν ἀρχῇ obl · prep + dat.sg.f └ τὸν οὐρανὸν obj · acc.sg.m └ καὶ τὴν γῆν coord · acc.sg.f
Hex. 1.1.2 Πρέπουσα pred-nom · prs.ptcp.act.nom.sg.f ├ ἀρχὴ sub · nom.sg.f └ τῷ μέλλοντι iobj · ptcp.dat.sg.m (substantivised) ├ περὶ τῆς συστάσεως obl · prep + gen.sg.f │ └ τοῦ κόσμου atr · gen.sg.m └ διηγεῖσθαι xcomp · prs.inf.mp
Hex. 1.1.3 προθεῖναι pred · aor.inf.act (infinitival main) ├ ἀρχὴν obj · acc.sg.f │ └ τῆς διακοσμήσεως atr · gen.sg.f │ └ τῶν ὁρωμένων atr · ptcp.gen.pl (substantivised) └ τοῦ λόγου obl · gen.sg.m (partitive)
Per-token PROIEL annotation - Hex. 1.1.1
| id | form | lemma | pos | morph | head | rel |
|---|---|---|---|---|---|---|
| 1 | Ἐν | ἐν | R- | ---------n | 3 | obl |
| 2 | ἀρχῇ | ἀρχή | Nb | -s---fd--i | 1 | obl |
| 3 | ἐποίησεν | ποιέω | V- | 3saia----i | 0 | pred |
| 4 | ὁ | ὁ | S- | -s---mn--i | 5 | aux |
| 5 | Θεὸς | θεός | Nb | -s---mn--i | 3 | sub |
| 6 | τὸν | ὁ | S- | -s---ma--i | 7 | aux |
| 7 | οὐρανὸν | οὐρανός | Nb | -s---ma--i | 3 | obj |
| 8 | καὶ | καί | C- | ---------n | 7 | aux |
| 9 | τὴν | ὁ | S- | -s---fa--i | 10 | aux |
| 10 | γῆν | γῆ | Nb | -s---fa--i | 7 | coord |
Read it: ten tokens, one root predicate (ἐποίησεν = pred), one nominal subject (Θεός = sub), one locative oblique (Ἐν ἀρχῇ = obl), one accusative object (οὐρανόν = obj), one coordinated accusative (καὶ τὴν γῆν = coord). The morph column uses the standard 10-position PROIEL code: person, number, tense, mood, voice, gender, case, degree, strength, inflection.
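The positional code described above can be unpacked mechanically. The following is a minimal sketch, not part of the AthDGC pipeline: the function name `decode_morph` is hypothetical, and padding short tags with `-` is an assumption made here so that truncated tags still decode.

```python
# Field order follows the list given in the text above.
FIELDS = ("person", "number", "tense", "mood", "voice",
          "gender", "case", "degree", "strength", "inflection")


def decode_morph(tag: str) -> dict[str, str]:
    """Map a positional morphology tag to {field: code}, skipping '-' slots."""
    if len(tag) > len(FIELDS):
        raise ValueError(f"tag has more than {len(FIELDS)} positions: {tag!r}")
    # Assumption: missing trailing positions are treated as unspecified ('-').
    padded = tag.ljust(len(FIELDS), "-")
    return {name: code for name, code in zip(FIELDS, padded) if code != "-"}


if __name__ == "__main__":
    # Aorist indicative active 3sg, as for ἐποίησεν.
    print(decode_morph("3saia----i"))
```

Only the positions that carry a value are returned, so an uninflected token such as a preposition yields at most the inflection flag.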
Byzantine - Anna Komnene, Alexias prologue 1.1
Ὁ ῥέων χρόνος ἀκάθεκτός τις ὢν καὶ ἀεί τι κινούμενος παρασύρει καὶ παραφέρει πάντα τὰ γινόμενα, καὶ εἰς τὸν τῆς ἀφανείας βυθὸν καταβαπτίζει.
A single periodic sentence opening the Alexias. Two coordinate finite verbs (παρασύρει + παραφέρει) share a complex subject + object; a third coordinated verb (καταβαπτίζει) shifts to a directional obl.
παρασύρει pred · prs.ind.act.3sg ├ Ὁ ῥέων χρόνος sub · nom.sg.m (ptcp + N) │ ├ ἀκάθεκτός ὢν atr · pred-ptcp clause │ └ ἀεί τι κινούμενος atr · pred-ptcp clause ├ πάντα τὰ γινόμενα obj · acc.pl.n (ptcp substantive) └ παραφέρει coord · prs.ind.act.3sg └ (shared sub + obj with παρασύρει) καὶ καταβαπτίζει coord · prs.ind.act.3sg └ εἰς τὸν βυθὸν obl · prep + acc.sg.m (directional) └ τῆς ἀφανείας atr · gen.sg.f
Per-token PROIEL annotation - Alexias prologue 1.1 (12 selected tokens, up to the object NP)
| id | form | lemma | pos | morph | head | rel |
|---|---|---|---|---|---|---|
| 1 | Ὁ | ὁ | S- | -s---mn--i | 3 | aux |
| 2 | ῥέων | ῥέω | V- | -sppamn--i | 3 | atr |
| 3 | χρόνος | χρόνος | Nb | -s---mn--i | 9 | sub |
| 4 | ἀκάθεκτός | ἀκάθεκτος | A- | -s---mn--i | 5 | pred-nom |
| 5 | ὢν | εἰμί | V- | -sppamn--i | 3 | atr |
| 6 | καὶ | καί | C- | ---------n | 5 | aux |
| 7 | ἀεί | ἀεί | Df | ---------n | 8 | adv |
| 8 | κινούμενος | κινέω | V- | -sppmmn--i | 3 | atr |
| 9 | παρασύρει | παρασύρω | V- | 3spia----i | 0 | pred |
| 10 | πάντα | πᾶς | A- | -p---na--i | 12 | atr |
| 11 | τὰ | ὁ | S- | -p---na--i | 12 | aux |
| 12 | γινόμενα | γίγνομαι | V- | -pppmna--i | 9 | obj |
Read it: the participles ῥέων, ὢν and κινούμενος all attach as atr to χρόνος (id 3); the matrix verb is παρασύρει (id 9, root). The substantivised participle τὰ γινόμενα fills the obj slot under the matrix verb.
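The same reading can be checked mechanically from the head pointers alone. This is a small sketch under stated assumptions: the rows are transcribed by hand from the id, form, head and rel columns of the table above, head 0 is taken to mark the root, and the helper names are hypothetical.

```python
from collections import defaultdict

# (id, form, head, rel) transcribed from the Alexias table above.
ROWS = [
    (1, "Ὁ", 3, "aux"), (2, "ῥέων", 3, "atr"), (3, "χρόνος", 9, "sub"),
    (4, "ἀκάθεκτός", 5, "pred-nom"), (5, "ὢν", 3, "atr"), (6, "καὶ", 5, "aux"),
    (7, "ἀεί", 8, "adv"), (8, "κινούμενος", 3, "atr"), (9, "παρασύρει", 0, "pred"),
    (10, "πάντα", 12, "atr"), (11, "τὰ", 12, "aux"), (12, "γινόμενα", 9, "obj"),
]


def children_by_head(rows):
    """Index tokens by the id of their head."""
    index = defaultdict(list)
    for tok_id, form, head, rel in rows:
        index[head].append((tok_id, form, rel))
    return index


def roots(rows):
    """Ids of tokens whose head is 0 (assumed root marker)."""
    return [tok_id for tok_id, _, head, _ in rows if head == 0]


def dependents(rows, head_id, rel=None):
    """(id, form) pairs attached to head_id, optionally filtered by relation."""
    return [(tok_id, form)
            for tok_id, form, r in children_by_head(rows)[head_id]
            if rel is None or r == rel]


if __name__ == "__main__":
    print("root ids:", roots(ROWS))
    print("atr under χρόνος:", dependents(ROWS, 3, "atr"))
    print("arguments of παρασύρει:", dependents(ROWS, 9))
```

Filtering on `rel="atr"` under id 3 returns the three participles, and the unfiltered dependents of id 9 are the subject and object heads.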
Early Modern - Cretan Renaissance: Kornaros, Erotokritos 1.1-4
Τοῦ Κύκλου τὰ γυρίσματα, ποὺ ἀνεβοκατεβαίνουν, καὶ τοῦ Τροχοῦ, ποὺ ὥρες ψηλὰ κι ὥρες στὰ βάθη πάει, καὶ τοῦ Καιροῦ τ' ἀλλάματα, ποὺ ἀναπαημὸν δὲν ἔχουν, μὰ στὸ καλὸ κ' εἰς τὸ κακὸ περιπατοῦν καὶ τρέχουν,
Four lines. Three nominal subjects (γυρίσματα, τοῦ Τροχοῦ, τ' ἀλλάματα) coordinated under one main predicate (the implicit "are the subject of my song"); each carries a relative clause (ποὺ ...) with finite Modern Greek inflection.
γυρίσματα sub · nom.pl.n (head of subject NP) ├ Τοῦ Κύκλου atr · gen.sg.m └ ἀνεβοκατεβαίνουν atr · rel.cl · prs.ind.act.3pl └ ποὺ sub · rel.pron καὶ τοῦ Τροχοῦ coord · gen.sg.m └ πάει atr · rel.cl · prs.ind.act.3sg ├ ποὺ sub · rel.pron ├ ὥρες ψηλὰ adv · temporal + locative └ κι ὥρες στὰ βάθη coord · temporal + locative καὶ τ' ἀλλάματα coord · nom.pl.n ├ τοῦ Καιροῦ atr · gen.sg.m └ ἔχουν atr · rel.cl · prs.ind.act.3pl ├ ποὺ sub · rel.pron ├ ἀναπαημὸν obj · acc.sg.m └ δὲν adv · neg μὰ περιπατοῦν coord · prs.ind.act.3pl ├ στὸ καλὸ obl · prep + acc.sg.n └ εἰς τὸ κακὸ coord · prep + acc.sg.n καὶ τρέχουν coord · prs.ind.act.3pl
Per-token PROIEL annotation - Erotokritos 1.1 (first 8 tokens)
| id | form | lemma | pos | morph | head | rel |
|---|---|---|---|---|---|---|
| 1 | Τοῦ | ὁ | S- | -s---mg--i | 2 | aux |
| 2 | Κύκλου | κύκλος | Nb | -s---mg--i | 4 | atr |
| 3 | τὰ | ὁ | S- | -p---nn--i | 4 | aux |
| 4 | γυρίσματα | γύρισμα | Nb | -p---nn--i | 0 | sub |
| 5 | ποὺ | ποὺ | Pr | ---------n | 6 | sub |
| 6 | ἀνεβοκατεβαίνουν | ἀνεβοκατεβαίνω | V- | 3ppia----i | 4 | atr |
| 7 | καὶ | καί | C- | ---------n | 8 | aux |
| 8 | τοῦ | ὁ | S- | -s---mg--i | 9 | aux |
Read it: the head of the subject NP is γυρίσματα (pos Nb, case n); the relative pronoun ποὺ (pos Pr) takes sub in its own clause; the relative-clause verb ἀνεβοκατεβαίνουν attaches as atr to γυρίσματα. The verb keeps the Modern Greek 3pl present suffix -ουν.
Classical - Thucydides, Historiae 1.22.1-2 (methodological prologue)
Καὶ ὅσα μὲν λόγῳ εἶπον ἕκαστοι ἢ μέλλοντες πολεμήσειν ἢ ἐν αὐτῷ ἤδη ὄντες, χαλεπὸν τὴν ἀκρίβειαν αὐτὴν τῶν λεχθέντων διαμνημονεῦσαι ἦν ἐμοί τε ὧν αὐτὸς ἤκουσα καὶ τοῖς ἄλλοθέν ποθεν ἐμοὶ ἀπαγγέλλουσιν.
A two-clause periodic sentence opening Thucydides' canonical methodological prologue. The first clause is a relative-clause topic (ὅσα... εἶπον); the second clause predicates difficulty (χαλεπὸν ἦν) on the act of recall. The whole prologue runs to Histories 1.22.4 and is the locus classicus for Greek historiographical self-reflection.
1.22.1a (relative-clause topic) εἶπον pred · aor.ind.act.3pl ├ ἕκαστοι sub · nom.pl.m ├ ὅσα obj · acc.pl.n (rel.pron) ├ λόγῳ obl · dat.sg.m (manner) ├ μέλλοντες πολεμήσειν atr · ptcp + fut.inf └ ἐν αὐτῷ ὄντες coord · ptcp.nom.pl.m (in it)
1.22.1b (matrix predication) ἦν pred · impf.ind.act.3sg ├ χαλεπὸν pred-nom · nom.sg.n (predicative) ├ διαμνημονεῦσαι sub · aor.inf.act (clausal subject) │ ├ τὴν ἀκρίβειαν αὐτὴν obj · acc.sg.f │ └ τῶν λεχθέντων atr · ptcp.gen.pl.n ├ ἐμοί iobj · dat.sg (experiencer) └ τοῖς ἀπαγγέλλουσιν coord · ptcp.dat.pl.m (other reporters) ├ ἄλλοθέν ποθεν adv · loc + indef └ ἐμοὶ iobj · dat.sg
Per-token PROIEL annotation - 1.22.1b (matrix predication, 8 selected tokens)
| id | form | lemma | pos | morph | head | rel |
|---|---|---|---|---|---|---|
| 1 | χαλεπὸν | χαλεπός | A- | -s---nn--i | 4 | pred-nom |
| 2 | τὴν | ὁ | S- | -s---fa--i | 3 | aux |
| 3 | ἀκρίβειαν | ἀκρίβεια | Nb | -s---fa--i | 6 | obj |
| 4 | ἦν | εἰμί | V- | 3siia----i | 0 | pred |
| 5 | ἐμοί | ἐγώ | Pp | -s---md--i | 4 | iobj |
| 6 | διαμνημονεῦσαι | διαμνημονεύω | V- | --ana----i | 4 | sub |
| 7 | τῶν | ὁ | S- | -p---ng--i | 8 | aux |
| 8 | λεχθέντων | λέγω | V- | -pappng--i | 3 | atr |
Read it: matrix verb ἦν (id 4, root) takes χαλεπὸν as pred-nom and the infinitival clause headed by διαμνημονεῦσαι as clausal sub. The accusative ἀκρίβειαν and its participial atr λεχθέντων sit inside that clausal subject. Dative ἐμοί fills the experiencer iobj slot.
Late Antique - Eusebius, Historia Ecclesiastica 1.1.1-2
Τὰς τῶν ἱερῶν ἀποστόλων διαδοχὰς σὺν καὶ τοῖς ἀπὸ τοῦ σωτῆρος ἡμῶν καὶ εἰς ἡμᾶς διηνυσμένοις χρόνοις, ὅσα τε καὶ πηλίκα πραγματευθῆναι κατὰ τὴν ἐκκλησιαστικὴν ἱστορίαν λέγεται, καὶ ὅσοι ταύτης διαπρεπῶς ἐν ταῖς μάλιστα ἐπισημοτάταις παροικίαις ἡγήσαντό τε καὶ προέστησαν, γραφῇ παραδοῦναι πεπείραμαι.
The opening sentence of Eusebius' Ecclesiastical History. A single periodic sentence with three coordinated nominal objects (τὰς διαδοχὰς, ὅσα τε καὶ πηλίκα, ὅσοι), all governed by the infinitive παραδοῦναι. The matrix 1sg perfect mid-passive πεπείραμαι takes that infinitive as its clausal complement.
1.1.1-2 (matrix periphrastic predicate) πεπείραμαι pred · prf.ind.mp.1sg ├ (1sg implicit) sub · 1sg └ παραδοῦναι xcomp · aor.inf.act ├ γραφῇ obl · dat.sg.f (instrument) ├ τὰς διαδοχὰς obj · acc.pl.f │ ├ τῶν ἱερῶν ἀποστόλων atr · gen.pl.m │ └ σὺν τοῖς χρόνοις obl · prep + dat.pl.m │ ├ διηνυσμένοις atr · ptcp.dat.pl.m │ └ ἀπὸ τοῦ σωτῆρος εἰς ἡμᾶς obl · prep + gen / acc ├ ὅσα τε καὶ πηλίκα obj · acc.pl.n (coord) │ └ πραγματευθῆναι xcomp · aor.inf.mp │ └ κατὰ τὴν ἐκκλ. ἱστορίαν obl · prep + acc.sg.f └ ὅσοι obj · nom.pl.m (rel.pron) ├ ἡγήσαντό τε καὶ προέστησαν atr · rel.cl · aor.ind.mp + aor.ind.act.3pl ├ ταύτης atr · gen.sg.f ├ διαπρεπῶς adv · manner └ ἐν ταῖς ἐπισημοτάταις παροικίαις obl · prep + dat.pl.f
Per-token PROIEL annotation - matrix + first object (9 selected tokens, listed head-first)
| id | form | lemma | pos | morph | head | rel |
|---|---|---|---|---|---|---|
| 1 | πεπείραμαι | πειράω | V- | 1srie----i | 0 | pred |
| 2 | παραδοῦναι | παραδίδωμι | V- | --ana----i | 1 | xcomp |
| 3 | γραφῇ | γραφή | Nb | -s---fd--i | 2 | obl |
| 4 | τὰς | ὁ | S- | -p---fa--i | 5 | aux |
| 5 | διαδοχὰς | διαδοχή | Nb | -p---fa--i | 2 | obj |
| 6 | τῶν | ὁ | S- | -p---mg--i | 8 | aux |
| 7 | ἱερῶν | ἱερός | A- | -p---mg--i | 8 | atr |
| 8 | ἀποστόλων | ἀπόστολος | Nb | -p---mg--i | 5 | atr |
| 9 | σὺν | σύν | R- | ---------n | 11 | obl |
Read it: the matrix prf.mp 1sg πεπείραμαι (id 1, root) takes the aor.inf.act παραδοῦναι as clausal xcomp. The infinitive then takes γραφῇ as instrumental obl, and the three nominal objects (τὰς διαδοχὰς, ὅσα τε καὶ πηλίκα, ὅσοι) all hang off it as obj. The participial atr διηνυσμένοις under τοῖς χρόνοις preserves the classic Greek attributive-participle pattern.
Retranslation chain - NT John 3:16 across Greek periods
Same canonical verse rendered across three Greek diachronic stages. Three parallel PROIEL trees show that the central [pred + sub + obj + adv] skeleton survives the 2,000-year journey, while the result clause alternates between finite and infinitival realisation and an explicit dative recipient appears only in the Byzantine node.
Node A - Koine (1st c. AD)
Οὕτως γὰρ ἠγάπησεν ὁ θεὸς τὸν κόσμον, ὥστε τὸν υἱὸν τὸν μονογενῆ ἔδωκεν
pred ἠγάπησεν aor.ind.act.3sg ├ sub ὁ θεὸς nom.sg.m ├ obj τὸν κόσμον acc.sg.m ├ adv Οὕτως manner └ adv ὥστε result-clause └ pred ἔδωκεν aor.ind.act.3sg ├ sub (3sg implicit; θεὸς) └ obj τὸν υἱὸν τὸν μονογενῆ acc.sg.m
Source. Aor.ind.act + sub + obj + manner-adv + result-clause with elliptic 3sg subject.
Node B - Byzantine homiletic (10th c.)
Τοσοῦτον γὰρ ἠγάπησεν ὁ θεὸς τὸν κόσμον, ὥστε καὶ τὸν μονογενῆ αὐτοῦ υἱὸν δοῦναι αὐτῷ
pred ἠγάπησεν aor.ind.act.3sg ├ sub ὁ θεὸς nom.sg.m ├ obj τὸν κόσμον acc.sg.m ├ adv Τοσοῦτον degree └ adv ὥστε result-clause └ xcomp δοῦναι aor.inf.act ├ obj τὸν μονογενῆ αὐτοῦ υἱὸν acc.sg.m └ iobj αὐτῷ dat.sg.m
Byzantine. The result clause is now infinitival (δοῦναι) rather than finite, and the recipient αὐτῷ appears explicitly as iobj (dat.sg). Manner Οὕτως shifts to degree Τοσοῦτον.
Node C - Modern Greek (post-1976)
Τόσο πολύ αγάπησε ο Θεός τον κόσμο, ώστε έδωσε τον μονογενή Του Υιό
pred αγάπησε aor.ind.act.3sg ├ sub ο Θεός nom.sg.m ├ obj τον κόσμο acc.sg.m ├ adv Τόσο πολύ degree └ adv ώστε result-clause └ pred έδωσε aor.ind.act.3sg ├ sub (3sg implicit; Θεός) └ obj τον μονογενή Του Υιό acc.sg.m
Demotic. Same skeleton as Node A. The result clause returns to finite aor.ind.act (έδωσε); the recipient slot collapses back to elliptic; degree Τόσο πολύ preserved from Node B.
What the chain shows. The pred + sub + obj + adv(manner/degree) + adv(result-clause) skeleton is preserved across all three nodes; only the result-clause head varies (finite in Nodes A and C; infinitival in Node B). The recipient iobj slot appears only in the Byzantine middle node, alongside the infinitival reframing. Aspect is aorist throughout (aor.ind.act, with aor.inf.act for δοῦναι in Node B). The same retranslation pattern (finite -> infinitival -> finite) recurs across many NT verses; the v0.5 retranslation-pair browser will query this kind of structural alternation at scale.
Retelling chain - Iliad 1.1 across reception
Same opening, three reception nodes, three parallel PROIEL trees. Read the columns left to right to see which syntactic features persist across reception and which collapse.
Node 1 - Homeric (8th c. BC)
μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος
pred ἄειδε 2sg.impv.act ├ obj μῆνιν acc.sg.f │ └ atr Ἀχιλῆος gen │ └ atr Πηληϊάδεω gen └ voc θεά voc.sg.f
Invocation. 2 sg imperative + vocative addressee. Genitive attributive chain.
Node 2 - Tzetzes (12th c. Byz.)
Ὅμηρος ἀείδει τὴν μῆνιν τοῦ Πηληϊάδου Ἀχιλλέως
pred ἀείδει 3sg.ind.act ├ sub Ὅμηρος nom.sg.m └ obj μῆνιν acc.sg.f └ atr Ἀχιλλέως gen └ atr Πηληϊάδου gen
Narration. Mood flips imp -> ind, explicit subject Ὅμηρος fills sub slot, vocative collapses. Object NP + atr chain survive.
Node 3 - Kazantzakis-Kakridis (1955)
Τὴ μῆνη, θεά, ψάλε, τοῦ ξακουστοῦ Ἀχιλέα
pred ψάλε 2sg.impv.act ├ obj τὴ μῆνη acc.sg.f │ └ atr τοῦ Ἀχιλέα gen.sg.m │ └ atr τοῦ ξακουστοῦ gen └ voc θεά voc.sg.f
Translation. Imperative + vocative restored; verb lexicalises ψάλε (Modern); atr chain compresses; PROIEL frame returns to [obj:acc, voc:voc].
What the chain shows. The [obj:acc, voc:voc] argument frame is stable across 2,800 years (Nodes 1 and 3 match exactly); it collapses only in Tzetzes' narrative reframing (Node 2). The genitive attributive chain stays in all three nodes. The retelling-chain explorer (v0.5) will surface this kind of cross-period stability and divergence at scale.
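The stability claim can be restated as a set comparison over the relations attached directly to the root predicate in each node. The sketch below is illustrative only: the relation sets are read off the three trees above by hand, and the function names are hypothetical rather than part of the planned retelling-chain explorer.

```python
# Relations attached directly to the root predicate, read off the trees above.
CHAIN = {
    "Node 1": {"obj", "voc"},   # Homeric
    "Node 2": {"sub", "obj"},   # Tzetzes
    "Node 3": {"obj", "voc"},   # Kazantzakis-Kakridis
}


def stable_relations(chain):
    """Relations present under the root in every node of the chain."""
    return set.intersection(*chain.values())


def divergences(chain):
    """Per node, the relations that are not shared by all nodes."""
    stable = stable_relations(chain)
    return {node: rels - stable for node, rels in chain.items()}


def matching_nodes(chain):
    """Group node names that share exactly the same root frame."""
    groups = {}
    for node, rels in chain.items():
        groups.setdefault(frozenset(rels), []).append(node)
    return groups


if __name__ == "__main__":
    print("stable:", stable_relations(CHAIN))
    print("divergent:", divergences(CHAIN))
    print("identical frames:", list(matching_nodes(CHAIN).values()))
```

On these hand-entered sets, obj is the only relation shared by all three nodes, and Nodes 1 and 3 fall into the same group.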
Retranslation chain - Septuagint Psalm 1:1 across periods
Same canonical verse rendered into Greek across three diachronic stages, three parallel PROIEL trees.
Node A - LXX (3rd c. BC)
Μακάριος ἀνὴρ ὃς οὐκ ἐπορεύθη ἐν βουλῇ ἀσεβῶν
μακάριος pred-nom nom.sg.m ├ ἀνὴρ sub nom.sg.m └ ἐπορεύθη atr aor.ind.pass.3sg ├ οὐκ adv neg └ ἐν βουλῇ obl prep+dat └ ἀσεβῶν atr gen.pl
Predicative μακάριος. Negated aor.pass.3sg in the relative clause.
Node B - Byzantine (10th c.)
Μακάριος γὰρ ὑπάρχει ἀνὴρ ὃς οὐ συμπορεύεται τοῖς ἀσεβέσιν
ὑπάρχει pred prs.ind.act.3sg ├ ἀνὴρ sub nom.sg.m │ └ μακάριος atr nom.sg.m (depictive) └ συμπορεύεται atr prs.ind.mp.3sg ├ οὐ adv neg └ τοῖς ἀσεβέσιν iobj dat.pl
Explicit copula ὑπάρχει takes pred; μακάριος demoted to depictive atr. Voice + aspect shift: aor.pass -> prs.mp.
Node C - Modern Greek (post-1976)
Μακάριος είναι ο άνθρωπος που δεν πήγε στη βουλή των ασεβών
είναι pred prs.ind.act.3sg (copula) ├ ο άνθρωπος sub nom.sg.m ├ μακάριος pred-nom nom.sg.m └ πήγε atr aor.ind.act.3sg ├ δεν adv neg └ στη βουλή obl prep+acc └ των ασεβών atr gen.pl
Demotic. Voice flips Node A pass -> active; μακάριος is back as pred-nom.
What the chain shows. The sub + atr.rel-cl skeleton survives all three nodes, with μακάριος as pred-nom in Nodes A and C and demoted to atr under the explicit copula in Node B; voice on the embedded relative-clause predicate shifts twice (LXX pass -> Byz mid/pass -> Modern active); the negative particle takes three forms (οὐκ -> οὐ -> δεν). The retranslation-pair browser (v0.5) will query this kind of voice + negation co-variation across the corpus at scale.
Retelling chain - NT John 1:1 across patristic + Modern reception
The same opening verse reframed across the patristic and Modern Greek reception. Four nodes show how the copular skeleton ἦν + sub:λόγος + obl:ἐν ἀρχῇ survives, while the discursive surrounding shifts: from Johannine prose, to dogmatic citation, to homiletic expansion, to demotic translation.
Node 1 - John (1st c. AD)
Ἐν ἀρχῇ ἦν ὁ λόγος
pred ἦν impf.ind.act.3sg ├ sub ὁ λόγος nom.sg.m └ obl Ἐν ἀρχῇ prep + dat.sg.f
Source verse. Impf.ind copula + nominal sub + locative obl. Foundational predicate-tree skeleton.
Node 2 - John Chrysostom, In Joannem hom. 2 (4th c.)
Φησίν· ἐν ἀρχῇ ἦν ὁ λόγος
pred φησίν prs.ind.act.3sg └ xcomp ἦν impf.ind.act.3sg ├ sub ὁ λόγος nom.sg.m └ obl ἐν ἀρχῇ prep + dat.sg.f
Direct citation framed by φησίν. Original predicate becomes embedded xcomp; sub + obl preserved inside the embedded clause.
Node 3 - John of Damascus, De fide orth. 1.1 (8th c.)
Ὁ θεῖος Ἰωάννης φησίν· ἐν ἀρχῇ ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν
pred φησίν prs.ind.act.3sg ├ sub Ὁ θεῖος Ἰωάννης nom.sg.m └ xcomp ἦν impf.ind.act.3sg ├ sub ὁ λόγος nom.sg.m ├ obl ἐν ἀρχῇ prep + dat.sg.f └ coord ἦν impf.ind.act.3sg ├ sub ὁ λόγος nom.sg.m (repeated) └ obl πρὸς τὸν θεόν prep + acc.sg.m
Dogmatic expansion. Citation now has explicit sub (Ἰωάννης) and the source verse is extended by coord to John 1:1b. The skeleton recurses.
Node 4 - Modern Greek liturgical (post-1976)
Στην αρχή ήταν ο λόγος
pred ήταν impf.ind.act.3sg ├ sub ο λόγος nom.sg.m └ obl Στην αρχή prep + acc.sg.f
Demotic. Predicate verb survives (ἦν -> ήταν); the dat.sg.f ἀρχῇ shifts to acc.sg.f under the obl-prep, but the obl slot itself is preserved.
What the chain shows. The pred + sub + obl skeleton is stable across 2,000 years in Nodes 1 and 4; in the patristic Nodes 2 and 3 it nests under xcomp of a citation verb φησίν. The retelling-chain explorer (v0.5) renders this nesting and its embedding depth as a visual reception graph.
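Embedding depth, as used above, is simply the distance of the copula from the root of each tree. A minimal sketch, assuming a hand-transcribed (relation, form, children) tuple encoding of the four trees; this encoding and the function name `depth_of` are illustrative and not the explorer's actual data model.

```python
from typing import Optional

COPULA = "ἦν"
COPULA_MODERN = "ήταν"

# (relation, form, children), transcribed from the four trees above.
NODE_1 = ("pred", COPULA, [("sub", "ὁ λόγος", []), ("obl", "Ἐν ἀρχῇ", [])])
NODE_2 = ("pred", "φησίν", [
    ("xcomp", COPULA, [("sub", "ὁ λόγος", []), ("obl", "ἐν ἀρχῇ", [])]),
])
NODE_3 = ("pred", "φησίν", [
    ("sub", "Ὁ θεῖος Ἰωάννης", []),
    ("xcomp", COPULA, [
        ("sub", "ὁ λόγος", []),
        ("obl", "ἐν ἀρχῇ", []),
        ("coord", COPULA, [("sub", "ὁ λόγος", []), ("obl", "πρὸς τὸν θεόν", [])]),
    ]),
])
NODE_4 = ("pred", COPULA_MODERN, [("sub", "ο λόγος", []), ("obl", "Στην αρχή", [])])


def depth_of(tree, form, depth: int = 0) -> Optional[int]:
    """Depth of the first node (depth-first) whose form matches; None if absent."""
    _rel, node_form, children = tree
    if node_form == form:
        return depth
    for child in children:
        found = depth_of(child, form, depth + 1)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    for name, tree, form in [("Node 1", NODE_1, COPULA), ("Node 2", NODE_2, COPULA),
                             ("Node 3", NODE_3, COPULA), ("Node 4", NODE_4, COPULA_MODERN)]:
        print(name, "copula depth:", depth_of(tree, form))
```

Depth 0 means the copula is the root predicate; depth 1 means it sits directly under the citation verb.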
Retelling chain - Plato, Apology 17a across reception
The Socratic opening reframed across philosophical reception. From direct first-person prose, to Neoplatonic citation, to Byzantine philosophical commentary, to a Modern Greek translation.
Node 1 - Plato (4th c. BC)
Ὅ τι μὲν ὑμεῖς, ὦ ἄνδρες Ἀθηναῖοι, πεπόνθατε ὑπὸ τῶν ἐμῶν κατηγόρων, οὐκ οἶδα
pred οἶδα prs.ind.act.1sg ├ sub (1sg implicit) ├ adv οὐκ neg └ comp πεπόνθατε prf.ind.act.2pl ├ sub ὑμεῖς nom.pl ├ voc ὦ ἄνδρες Ἀθηναῖοι voc.pl.m └ obl ὑπὸ τῶν κατηγόρων prep + gen.pl.m
Source. Negated 1sg main verb takes a comp clause with raised 2pl sub + vocative.
Node 2 - Olympiodorus, In Plat. Gorg. (6th c.)
Ὁ Σωκράτης λέγει ὅτι οὐκ οἶδα τί πεπόνθατε
pred λέγει prs.ind.act.3sg ├ sub Ὁ Σωκράτης nom.sg.m └ xcomp οἶδα prs.ind.act.1sg ├ adv οὐκ neg └ comp πεπόνθατε prf.ind.act.2pl └ sub (2pl implicit)
Citation. Original 1sg main verb demoted to xcomp; new matrix sub Σωκράτης; the embedded comp + neg + perf survive.
Node 3 - Modern Greek (post-1976)
Δεν ξέρω τί έχετε πάθει εσείς από τους κατηγόρους μου
pred ξέρω prs.ind.act.1sg ├ sub (1sg implicit) ├ adv Δεν neg └ comp έχετε πάθει prf.ind.act.2pl ├ sub εσείς nom.pl └ obl από τους κατηγόρους prep + acc.pl.m
Demotic translation. Same skeleton as Node 1: negated 1sg main + comp with raised 2pl sub + obl prepositional phrase. Perfect now analytic (έχω + invariant non-finite form).
What the chain shows. The pred + adv:neg + comp[sub, obl] skeleton is preserved in Nodes 1 and 3; the Neoplatonic Node 2 wraps it under a citation verb. The perfect is preserved across all three nodes, realised synthetically in Nodes 1 and 2 (πεπόνθατε) and analytically in Node 3 (Modern έχετε πάθει).
Retranslation chain - LXX Genesis 1:1 across IE languages
Same canonical verse rendered across the four NT-aligned IE parallels in v0.4 (Greek Koine, Latin Vulgate, Gothic Wulfila, OCS Marianus). This is the project's most direct visual demonstration that PROIEL's relation inventory carries cross-lingually.
Node A - LXX Koine Greek (3rd c. BC)
Ἐν ἀρχῇ ἐποίησεν ὁ Θεὸς τὸν οὐρανὸν καὶ τὴν γῆν
pred ἐποίησεν aor.ind.act.3sg ├ sub ὁ Θεὸς nom.sg.m ├ obl Ἐν ἀρχῇ prep + dat.sg.f └ obj τὸν οὐρανὸν acc.sg.m └ coord καὶ τὴν γῆν acc.sg.f
Source. Aor.ind.act + sub + obl + coord-obj. Canonical Greek frame.
Node B - Vulgate Latin (4th c.)
In principio creavit Deus caelum et terram
pred creavit prf.ind.act.3sg ├ sub Deus nom.sg.m ├ obl In principio prep + abl.sg.n └ obj caelum acc.sg.n └ coord et terram acc.sg.f
Latin. Perfect indicative active (no aorist in Latin); same sub + obl + obj + coord skeleton. Article-less.
Node C - Gothic (Wulfila, 4th c.)
In fruma gaskop guþ himin jah airþa
pred gaskop pret.ind.act.3sg ├ sub guþ nom.sg.m ├ obl In fruma prep + dat.sg └ obj himin acc.sg.m └ coord jah airþa acc.sg.f
Gothic preterite (no aor/prf distinction); preverb ga- carries aspect. Skeleton identical.
Node D - OCS Marianus (10th c.)
Искони сътвори Богъ небо и землѭ
pred сътвори aor.act.3sg ├ sub Богъ nom.sg.m ├ adv Искони loc / adv └ obj небо acc.sg.n └ coord и землѭ acc.sg.f
OCS. Aorist active 3sg restored (matches Greek tense). Locative Искони glossed as adv rather than obl because OCS lacks the preposition slot here.
What the chain shows. The pred + sub + obl/adv + obj + coord skeleton is stable across all four IE languages. The only structural divergence is the obl-vs-adv choice in OCS (no preposition slot for искони); tense/aspect realisation varies (Greek aor., Latin perf., Gothic pret., OCS aor.) but PROIEL's pred label abstracts over that. This four-way alignment is the v0.4 backbone; v0.7 will extend it with Sanskrit, Avestan, Old English, Old Persian, Classical Armenian, and Ukrainian.
How to read these samples
Each sentence is annotated under the PROIEL XML 2.0 schema with the relation inventory established by Haug & Jøhndal (2008) and extended by Eckhoff et al. (2018):
- pred - root predicate
- sub / obj - core arguments
- obl - oblique (prepositional / instrumental / locative / ablative)
- atr - attributive (adnominal modifier)
- adv - adverbial modifier
- voc - vocative
- aux - auxiliary (article, copula component)
- xcomp / comp - open / closed complement clauses
- xobj / nonsub - raised arguments
- pred-nom - nominal predicate (with copula)
Every annotated token additionally carries full morphology (the 10-character PROIEL morphological tag) and lemma. The per-verb argument-structure frame is derived from these PROIEL relations directly - no UD remapping, no CoNLL-U export step.
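As a rough illustration of how a frame can be read straight off the PROIEL relations, the sketch below parses a hand-written token fragment for Hex. 1.1.1 and collects each verb's direct arguments. Several things here are assumptions: the fragment is not an excerpt from a released AthDGC file, the root is marked by the absence of `head-id`, the set of relations counted as arguments is a choice made for this example, and `valency_frames` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment using PROIEL-style token attributes (illustrative only).
SNIPPET = """
<sentence id="1">
  <token id="1" form="Ἐν" lemma="ἐν" part-of-speech="R-" morphology="---------n" head-id="3" relation="obl"/>
  <token id="2" form="ἀρχῇ" lemma="ἀρχή" part-of-speech="Nb" morphology="-s---fd--i" head-id="1" relation="obl"/>
  <token id="3" form="ἐποίησεν" lemma="ποιέω" part-of-speech="V-" morphology="3saia----i" relation="pred"/>
  <token id="4" form="ὁ" lemma="ὁ" part-of-speech="S-" morphology="-s---mn--i" head-id="5" relation="aux"/>
  <token id="5" form="Θεὸς" lemma="θεός" part-of-speech="Nb" morphology="-s---mn--i" head-id="3" relation="sub"/>
  <token id="6" form="τὸν" lemma="ὁ" part-of-speech="S-" morphology="-s---ma--i" head-id="7" relation="aux"/>
  <token id="7" form="οὐρανὸν" lemma="οὐρανός" part-of-speech="Nb" morphology="-s---ma--i" head-id="3" relation="obj"/>
  <token id="8" form="καὶ" lemma="καί" part-of-speech="C-" morphology="---------n" head-id="7" relation="aux"/>
  <token id="9" form="τὴν" lemma="ὁ" part-of-speech="S-" morphology="-s---fa--i" head-id="10" relation="aux"/>
  <token id="10" form="γῆν" lemma="γῆ" part-of-speech="Nb" morphology="-s---fa--i" head-id="7" relation="coord"/>
</sentence>
"""

# Assumption: which relations count as frame slots is a choice made for this sketch.
ARGUMENT_RELATIONS = {"sub", "obj", "iobj", "obl", "xcomp", "comp"}


def valency_frames(xml_text: str) -> dict[str, list[str]]:
    """Return {verb lemma: [relation or relation:case, ...]} in token order."""
    tokens = ET.fromstring(xml_text).findall("token")
    frames = {}
    for verb in tokens:
        if not verb.get("part-of-speech", "").startswith("V"):
            continue
        slots = []
        for tok in tokens:
            rel = tok.get("relation")
            if tok.get("head-id") != verb.get("id") or rel not in ARGUMENT_RELATIONS:
                continue
            case = tok.get("morphology", "")[6:7]  # position 7 of the tag is case
            slots.append(rel if case in ("", "-") else f"{rel}:{case}")
        frames[verb.get("lemma")] = slots
    return frames


if __name__ == "__main__":
    print(valency_frames(SNIPPET))
```

The preposition carries no case code, so its slot is emitted as a bare `obl`, while the subject and object slots are paired with their case letters.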
Get the full corpus
The samples above are a tiny illustrative slice. v0.4 is not yet a wide public release because Stanza-introduced annotation errors are still being corrected on the ARIS side; the full partitions will ship at v0.5. Cite the project via the concept DOI: 10.5281/zenodo.20439182.
Funding
Funded by the Hellenic Foundation for Research and Innovation (HFRI) under the 3rd Call for HFRI Research Projects to support Post-Doctoral Researchers, Project No. 20577; with complementary support from the Greece 2.0 National Recovery and Resilience Plan. Compute supplied by GRNET ARIS (Greek national HPC), allocation pa260305.