This shows you the differences between two versions of the page.

coptic_entities [2020/01/29 23:07]
coptic_entities [2020/05/12 19:41] (current)
amir [Body Parts]
Line 35: Line 35:
 ===== Specific entity guidelines =====
 +==== Appositions ====
 +Repeated mentions of the same entity in apposition are considered a single span, and do not contain more mentions of the same entity:
 +  * [ⲓⲱϩⲁⲛⲛⲉⲥ ⲡ ⲃⲁⲡⲧⲓⲥⲧⲏⲥ]
 +  * [ⲡ ⲣⲣⲟ ⲍⲏⲛⲱⲛ]
 +  * [ⲡⲉⲛ ⲡ ⲉⲧ ⲟⲩⲁⲁⲃ ⲕⲁⲧⲁ ⲥⲙⲟⲧ ⲛⲓⲙ ⲁⲡⲁ ⲕⲩⲣⲟⲥ ⲡ ⲉⲛⲧ ⲁ ϥ ...]
 +Although outwardly very similar, appositions must be distinguished from dislocations, in which a pronominal subject or object is repeated separately. For personal pronouns, the pronoun is simply left out of the nominal span:
 +  * [ⲡⲉϥ ⲉⲓⲱⲧ] ϥ ⲛⲁⲩ ⲉⲣⲟ ⲟⲩ - "[his father], he sees them"
 +  * ϥ ⲛⲁⲩ ⲉⲣⲟ ⲟⲩ ⲛϭⲓ [ⲡⲉϥ ⲉⲓⲱⲧ] - "he sees them, that is [his father]"
 +If the pronoun is a substitutive demonstrative (ⲡⲁⲓ, ⲧⲁⲓ, ⲛⲁⲓ), then two spans are annotated:
 +  * [ⲡⲉϥ ⲉⲓⲱⲧ] [ⲡⲁⲓ] ⲛⲁⲩ ⲉⲣⲟ ⲟⲩ - "[his father], [this one] sees them"
 +  * [ⲡⲁⲓ] ⲛⲁⲩ ⲉⲣⲟ ⲟⲩ ⲛϭⲓ [ⲡⲉϥ ⲉⲓⲱⲧ] - "[this one] sees them, that is [his father]"
 +But note that it is also possible for a substitutive demonstrative to stand in true apposition to a noun without dislocation, in which case a single span is annotated as for any apposition:
 +  * ⲁ ⲓ ⲛⲁⲩ ⲉ [ⲡⲉϥ ⲉⲓⲱⲧ , ⲡⲁⲓ ⲉⲧ ⲙⲉⲣⲓⲧ ⲥ] - "I saw [his father, the one who loves her]"
 +See the UD Coptic guidelines for more information on identifying dislocation vs. apposition.
 +==== Expanded Relative Constructions ====
 +The relative construction expanding an article is annotated as an entity:
 +  * [ⲡ ⲉⲧ ⲟⲩ ⲥⲱⲧⲙ ⲉⲣⲟ ϥ] "the one they listened to" (person)
 +However, if the ⲡ is tagged as a copula, that part of the construction is not part of the entity span, since it is part of a predication. In these instances, we view the predicate noun phrase as an entity, and the relative clause as a subject clause (compare the Universal Dependencies annotation guidelines):
 +  * [ⲡ ⲛⲟⲩⲧⲉ] ⲡ ⲉⲛⲧ ⲁ ϥ ⲁⲩⲝⲁⲛⲉ "It is God who made them grow"
 +In this example, "God" receives a span, but "who made them grow" is considered a subject clause (i.e. 'who made them grow is God'), which is not nominal and hence not annotated. Note that according to the tagging guidelines, the second ⲡ should be tagged as COP and lemmatized ⲡⲉ in this sentence.
 ==== Body Parts ====
-  * ϭⲓϫ - Mark body parts as objects.
-  * figurative body parts are also considered objects (e.g. "in my heart")
 +Most body parts are marked as objects, since they are tangible:
 +  * ⲟⲩ ϭⲓϫ - "a hand"
 +  * ⲡⲉϥ ⲃⲁⲗ - "his eye"
 +However, some referential body parts are considered abstract, notably ϩⲏⲧ "heart":
 +  * ϯ ⲛⲁ ⲧⲣⲉ ⲡⲟⲩ ϩⲏⲧ ⲙⲕⲁϩ - "I will make your [heart] suffer"
 +Other uses of body parts may be totally figurative or idiomatic, in which case they are not annotated.
 ==== Non-numeral modifiers ====
Line 51: Line 96:
   * ⲡⲁⲡⲁⲩⲗⲟⲥ - the ones belonging to Paul - Such constructions should take the possessor in the span. [ⲡⲁ[ⲡⲁⲩⲗⲟⲥ]]
 +==== Groups ====
 +Groups of entities are interpreted as the entity type of their constituents, for example, a herd of animals is of the type animal:
 +  * [ⲟⲩ ⲁⲅⲉⲗⲏ ⲛ ϣⲟϣ] - a herd of buffaloes. Note that there is no nested entity for 'buffalo' in this case.
 +An exception to this guideline is groups of people who form an organization, e.g. ⲥⲩⲛⲁⲅⲱⲅⲏ, ⲥⲧⲣⲁⲧⲉⲩⲙⲁ, etc., which are 'organization', not 'person'.
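The group typing rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `group_entity_type` and the lexicon `ORGANIZATION_NOUNS` are hypothetical, and the lexicon holds just the two lemmas named in the guideline, not a complete list.

```python
# Hypothetical lexicon of group nouns that denote organizations.
# Only the two lemmas mentioned in the guideline are listed here.
ORGANIZATION_NOUNS = {"ⲥⲩⲛⲁⲅⲱⲅⲏ", "ⲥⲧⲣⲁⲧⲉⲩⲙⲁ"}


def group_entity_type(head_lemma, constituent_type):
    """Return the entity type for a group mention.

    A group normally inherits the type of its constituents; groups of
    people that form an organization are the exception.
    """
    if head_lemma in ORGANIZATION_NOUNS:
        return "organization"
    return constituent_type


# A herd (ⲁⲅⲉⲗⲏ) of animals is typed 'animal'.
herd_type = group_entity_type("ⲁⲅⲉⲗⲏ", "animal")
# A synagogue consists of people but is typed 'organization'.
synagogue_type = group_entity_type("ⲥⲩⲛⲁⲅⲱⲅⲏ", "person")
```

In practice the decision depends on context rather than on the lemma alone, so a lookup like this could at most pre-label candidates for an annotator to confirm.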
 ==== No reference inside compounds ====
Line 59: Line 110:
   * ⲁ ϥ ϫⲓⲃⲁⲡⲧⲓⲥⲙⲁ - he received-baptism (baptism cannot be annotated as a markable, since it's part of an incorporated verb 'to baptize')
 +==== Coordination ====
 +We do not mark coordinate entities in addition to their constituents:​
 +  * [ⲓⲱϩⲁⲛⲛⲏⲥ] ⲙⲛ [ⲁⲛⲧⲱⲛⲓⲟⲥ] (but not also [ⲓⲱϩⲁⲛⲛⲏⲥ ⲙⲛ ⲁⲛⲧⲱⲛⲓⲟⲥ] as a third mentioned entity)
 +==== Container and substance ====
 +Container and substance form two entities, for example:
 +  * [ⲟⲩ ⲡⲩⲅⲏ ⲙ [ⲙⲟⲟⲩ]] - a fountain of water (the fountain is '​object',​ the water is '​substance',​ and the water can be referred to separately later on)
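The bracket notation used in these examples, including nested spans such as the container and its substance, can be read mechanically. The following Python sketch is an illustration only, assuming whitespace-separated tokens as in the examples; the function name `parse_bracketed` is hypothetical and not part of any annotation tool.

```python
def parse_bracketed(text):
    """Parse a bracketed example into tokens and entity spans.

    Returns (tokens, spans), where each span is a (start, end) pair of
    token indices with an exclusive end. Nested brackets give nested spans.
    """
    tokens = []
    spans = []
    open_starts = []  # stack of token indices where unclosed spans began
    # Pad brackets with spaces so they split off as separate items.
    for item in text.replace("[", " [ ").replace("]", " ] ").split():
        if item == "[":
            open_starts.append(len(tokens))
        elif item == "]":
            if not open_starts:
                raise ValueError("unbalanced ']'")
            spans.append((open_starts.pop(), len(tokens)))
        else:
            tokens.append(item)
    if open_starts:
        raise ValueError("unbalanced '['")
    return tokens, sorted(spans)


# The fountain spans all four tokens; the water is nested inside it.
tokens, spans = parse_bracketed("[ⲟⲩ ⲡⲩⲅⲏ ⲙ [ⲙⲟⲟⲩ]]")
```

For the fountain example this yields four tokens and two spans, an outer one covering tokens 0 to 3 and an inner one covering only the last token, which matches the two entities the guideline asks for.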
 +==== Peoples and demonyms ====
 +Pluralized demonyms indicating members of a people are labeled person:
 +  * [ⲛ ϩⲉⲗⲗⲏⲛ]
 +However, peoples mentioned as a people (not as a group of individuals) are labeled organization:
 +  * [ⲡⲉⲕ ⲗⲁⲟⲥ ⲓⲥⲣⲁⲏⲗ]
 +These cases are usually singular and involve a named people. This guideline does not apply to ad-hoc groups of people who do not form an organized entity, e.g. ⲙⲏⲏϣⲉ 'crowd' is still usually 'person'.
 ==== Predicate of unchanging identity ====
-  * ⲡ ⲟⲩⲁ ⲡ ⲟⲩⲁ - each man - Mark as two separate entities
 +  * ⲡ ⲟⲩⲁ ⲡ ⲟⲩⲁ - each man - Mark as a single entity
 ==== Singular noun referring to a group of people ====
Line 74: Line 147:
   * ⲡⲉ ⲭⲣⲓⲥⲧⲟⲥ... ⲧⲉⲓ ⲥⲛⲧⲉ - the Christ... this foundation - Mark each entity with its own type, e.g., Christ as person and Foundation as object
-==== interruption by copula ====
-Entity expressions interrupted by a copula are terminated at the copula. For example, the following span stops at the copula:
-  * [ⲛⲉϥ ⲁⲡⲟⲥⲧⲟⲗⲟⲥ] ⲛⲉ ⲉⲧⲟⲩⲁⲁⲃ
 +==== Interruption by copula or particle ====
 +Entity expressions interrupted by a copula or particle are spanned to **contain** the copula or particle. For example, the following spans include the intervening copula or particle:
 +  * [ⲛⲉϥ ⲁⲡⲟⲥⲧⲟⲗⲟⲥ ⲛⲉ ⲉⲧⲟⲩⲁⲁⲃ]
 +  * [ϥⲧⲟⲟⲩ ⲇⲉ ⲛ ϩⲟⲟⲩ]
 +  * [ⲟⲩ ϣⲃⲏⲣ · ϩⲱⲱ ⲕ ⲟⲛ ⲛⲧⲉ ⲡ ⲛⲟⲩⲧⲉ]
 +Non-adjacent relative clauses are included, unless the interruption contains the verb controlling the head noun (this prevents some possibly very long '​hermeneutical'​ relatives inside mentions):
 +  * [ⲣⲱⲙⲉ ⲛⲓⲙ ⲟⲛ ⲉⲧ ⲥⲱⲧⲙ] - and also [any man who hears] (note the interruption '​ⲟⲛ'​)
 +But not:
 +  * ⲉⲣϣⲁⲛ [ⲧ ⲃⲁϣⲟⲣ] ⲁϣⲕⲁⲕ ⲉⲃⲟⲗ ⲁⲛ ⲉⲧⲉ ⲛⲧⲟⲕ ⲡⲉ ... - it is not when [the fox] barks, which is you, ...
 +The interruption by the verb 'bark', which is the predicate of 'fox', triggers the guideline to omit the relative clause. Otherwise, the mention could potentially cover the entire clause '[ⲧ ⲃⲁϣⲟⲣ ⲁϣⲕⲁⲕ ⲉⲃⲟⲗ ... ]'.
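The rule for non-adjacent relative clauses can be stated as a small decision function. This Python sketch is an illustration under assumptions: the name `extend_over_relative` is hypothetical, spans are (start, end) token indices with an exclusive end, and whether the gap contains the verb controlling the head noun is supplied by the annotator or a parser rather than computed here.

```python
def extend_over_relative(head_span, relative_span, gap_has_controlling_verb):
    """Return the mention span for a head noun and a later relative clause.

    The relative clause is included, together with any intervening
    particle, unless the gap contains the verb that controls the head noun.
    """
    if relative_span[0] < head_span[1]:
        raise ValueError("relative clause must follow the head noun phrase")
    if gap_has_controlling_verb:
        return head_span
    return (head_span[0], relative_span[1])


# ⲣⲱⲙⲉ ⲛⲓⲙ ⲟⲛ ⲉⲧ ⲥⲱⲧⲙ: only the particle ⲟⲛ intervenes, so the
# mention runs from the head through the relative clause.
any_man = extend_over_relative((0, 2), (3, 5), gap_has_controlling_verb=False)

# ⲉⲣϣⲁⲛ ⲧ ⲃⲁϣⲟⲣ ⲁϣⲕⲁⲕ ⲉⲃⲟⲗ ⲁⲛ ⲉⲧⲉ ⲛⲧⲟⲕ ⲡⲉ: the verb 'bark' intervenes,
# so the mention stays at the head noun phrase ⲧ ⲃⲁϣⲟⲣ.
the_fox = extend_over_relative((1, 3), (6, 9), gap_has_controlling_verb=True)
```

With the token indices assumed in the comments, the first call returns the full span over all five tokens and the second returns the head span unchanged.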
 ===== Non-referring cases =====
Line 85: Line 172:
   * No annotations are needed for interrogatives (ⲛⲓⲙ, ⲟⲩ)
 +  * Complex interrogatives follow the same guidelines, but note the subject of a question predicate **can** be an entity. In the following example we have one (non-interrogative) person entity span:
 +    * ⲛⲓⲙ ⲅⲁⲣ ⲛ ⲣⲱⲙⲉ [ⲡ ⲉⲧ ⲥⲟⲟⲩⲛ ⲛ [ⲛⲁ ⲛ ⲣⲱⲙⲉ]] = "what human is [he who knows [those things which are human]]?" ​
 +Note that ⲣⲱⲙⲉ without an article functions adjectivally here, and is not an entity; the phrase with ⲛⲓⲙ is interrogative and therefore not an entity; but the '​p-et-...'​ phrase is still annotated, as none of these exceptions apply to it.
 +==== Figurative body parts and other fixed expressions ====
 +The following are considered idioms, in which the constituent nouns are not construed as referential:​
 +  * ⲁϩⲉ ⲣⲁⲧ ϥ - '​stand,​ set foot' - "​foot"​ is not an entity mention
 +  * ϯ ⲧⲟⲟⲧ ϥ - 'help, give a hand'
 +  * ⲉ ⲡ ⲉⲥⲏⲧ - '​down',​ lit. 'to the ground'​
 +  * ⲟⲩⲏⲣ ⲛ ⲟⲩⲟⲉⲓϣ - 'how long'
 +  * ⲛ ⲟⲩ ϩⲟⲩⲟ - '​more'​
 +  * (ϩⲱⲃ) ⲛ ϭⲓϫ - 'handiwork' - the whole phrase is 'abstract' or 'object' depending on context, but 'hand' is not a referent
 +  * ⲣ ϩⲛⲁ ϥ - 'want, do one's will' - the word ϩⲛⲁ / ϩⲛⲉ '​will'​ is figurative, this is a fixed expression for '​desire'​
 +  * ⲉⲡⲧⲏⲣϥ meaning 'at all' is not referential
 +  * ϩⲁ ⲉⲟⲟⲩ - '​glorious'​ - the ⲉⲟⲟⲩ is not referential
 +  * ⲛ ⲟⲩ ⲕⲟⲩⲓ - 'a little'​ (manner adverbial)
 +  * ⲛ ⲧ ϩⲉ - meaning '​like'​
 +  * ⲛ ϣⲟⲣⲡ - '​first'​
 +  * ϭⲟⲙ - meaning '​capable'​ in constructions like ⲛⲧⲕ ϭⲟⲙ ⲁⲛ 'you are not capable'​
 +  * ⲛ ⲟⲩⲱⲧ - 'together'
 +  * ϩⲓ ⲟⲩ ⲥⲟⲡ - 'at once'
 +  * ⲙ ⲙⲏⲛⲉ - 'daily'
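A list of fixed expressions like the one above can serve as a filter on candidate mentions. The sketch below is a simplified illustration, not an existing tool: the names `FIXED_EXPRESSIONS`, `fixed_expression_positions` and `drop_idiom_candidates` are hypothetical, only a handful of the listed idioms are included, and matching is on surface token sequences, which ignores the fact that some of these strings can be referential in other contexts.

```python
# Hypothetical subset of the fixed expressions listed above, as token tuples.
FIXED_EXPRESSIONS = [
    ("ⲁϩⲉ", "ⲣⲁⲧ"),
    ("ϯ", "ⲧⲟⲟⲧ"),
    ("ⲉ", "ⲡ", "ⲉⲥⲏⲧ"),
    ("ϩⲓ", "ⲟⲩ", "ⲥⲟⲡ"),
    ("ⲙ", "ⲙⲏⲛⲉ"),
]


def fixed_expression_positions(tokens):
    """Return the set of token indices covered by a listed fixed expression."""
    covered = set()
    for expr in FIXED_EXPRESSIONS:
        n = len(expr)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == expr:
                covered.update(range(i, i + n))
    return covered


def drop_idiom_candidates(tokens, candidate_spans):
    """Remove candidate spans that lie entirely inside a fixed expression."""
    covered = fixed_expression_positions(tokens)
    return [
        (start, end)
        for (start, end) in candidate_spans
        if not all(i in covered for i in range(start, end))
    ]


# ⲁ ϥ ⲁϩⲉ ⲣⲁⲧ ϥ: the pronoun at index 1 stays a candidate, while the
# noun 'foot' at index 3 is dropped because it is part of the idiom.
tokens = ["ⲁ", "ϥ", "ⲁϩⲉ", "ⲣⲁⲧ", "ϥ"]
kept = drop_idiom_candidates(tokens, [(1, 2), (3, 4)])
```

A filter of this kind would only flag likely non-referential nouns; the final decision still rests on the annotator's reading of the context.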
coptic_entities.1580339244.txt.gz · Last modified: 2020/01/29 23:07 by amir