gum:tei_markup_in_gum

Differences

This shows you the differences between two versions of the page.

gum:tei_markup_in_gum [2018/09/11 14:02]
127.0.0.1 external edit
gum:tei_markup_in_gum [2021/09/15 03:17] (current)
eez7
Line 3: Line 3:
  
 ^ tag@attribute ^ meaning ^ ^ tag@attribute ^ meaning ^
 +| add | text inserted by an editor, e.g. in interviews inside [] |
 | caption | caption for images in the text | | caption | caption for images in the text |
 +| caption@rend | a description of the appearance of the caption (e.g. bold) |
 +| cell | a table cell |
 +| cell@rend | a description of the appearance of the cell (e.g. bold, red background) |
 | date | date expressions | | date | date expressions |
 | date@from | starting date for a range of dates | | date@from | starting date for a range of dates |
Line 10: Line 14:
 | date@rend | formatting of a date expression (e.g. italics, color) | | date@rend | formatting of a date expression (e.g. italics, color) |
 | date@to | end date for a range of dates | | date@to | end date for a range of dates |
-| date@when | date in question, normalized to the format yyyy-mm-dd |+| date@when | date in question, normalized to the format yyyy-mm-dd (day and month can be omitted) |
 | figure | marks the position of a figure in the text | | figure | marks the position of a figure in the text |
-| figure@rend | a description of the appearance of the figure |+| figure@rend | a description of the appearance of the figure (e.g. "drawing of four slightly deflated balls") | 
 +| gap | a gap in the text (missing words) | 
 +| gap@reason | a gap with the reason (e.g. omitted) |
 | head | marks a heading | | head | marks a heading |
-| head@rend | a description of the appearance of the heading (e.g. bold) | +| head@rend | a description of the appearance of the heading (e.g. bold, large) | 
-| hi@rend | a highlighted section with a description of its appearance (e.g. color) | +| hi@rend | a highlighted section with a description of its appearance (e.g. bold, italic, green, small-caps, emphatic, lengthened, "space between 'U.' and 'S.'") | 
-| incident@who | an extralinguistic incident (e.g. coughing) and the person responsible |+| incident | an extralinguistic incident |
 +| incident@type | an incident with type (e.g. laughchuckle, whistle, graphic or text appearing on screen, "opens door") | 
 +| incident@who | the person responsible for an incident (e.g. #Angela) |
 | item | item or bullet point in a list | | item | item or bullet point in a list |
 | item@n | item number | | item@n | item number |
 +| l | a line in poetry |
 | l@n | a line in poetry with its number | | l@n | a line in poetry with its number |
 +| lg | a line group |
 | lg@n | a line group with the group's number | | lg@n | a line group with the group's number |
 | lg@type | line group type (e.g. stanza) | | lg@type | line group type (e.g. stanza) |
 | list | list of bullet points | | list | list of bullet points |
-| list@type | a list type (e.g. bulleted, ordered, etc.) |+| list@type | a list type (e.g. ordered, unordered, etc.) | 
 +| note | a note | 
 +| note@n | a note with number | 
 +| note@place | the place of the note (e.g. foot) |
 | p | a paragraph | | p | a paragraph |
-| p@rend | a description of the appearance of the paragraph |+| p@rend | a description of the appearance of the paragraph (e.g. bold, indent) |
 | q | quotation marks not marking a quotation (e.g. scare quotes; placed outside the quotes!) | | q | quotation marks not marking a quotation (e.g. scare quotes; placed outside the quotes!) |
 | quote | a quotation | | quote | a quotation |
 +| quote@rend | the appearance of the quotation (e.g. block, bold) |
 | ref | an external reference, usually a hyperlink | | ref | an external reference, usually a hyperlink |
-| ref@target | the target of the reference (usually a URL, if not ommitted) |+| ref@rend | the appearance of the reference (e.g. italic) | 
 +| ref@target | the target of the reference (usually a URL, if not omitted) |
 +| row | a table row |
 | s | a main sentence span | | s | a main sentence span |
 +| s@type | a sentence with type (e.g. decl, q, wh, frag, imp, ger, intj, sub, multiple, other) |
 | sic | a section containing an apparent language error, thus in the original | | sic | a section containing an apparent language error, thus in the original |
 | sp@who | a section uttered by a particular speaker with a reference to the speaker | | sp@who | a section uttered by a particular speaker with a reference to the speaker |
 +| sp@whom | a section uttered with a particular speaker as an addressee |
 +| table | a table |
 +| table@cols | a table with the number of columns |
 +| table@rows | a table with the number of rows |
 +| table@rend | the appearance of the table (e.g. bold, red background) |
 | time | time expressions | | time | time expressions |
 | time@from | starting time for a stretch of time | | time@from | starting time for a stretch of time |
 | time@to | end time for a stretch of time | | time@to | end time for a stretch of time |
-| time@when | time in question, normalized to the format HH:mm (e.g. 16:30) |+| time@when | time in question, normalized to the format HH:mm:ss (e.g. 16:30:00) |
 | w | tag to delimit a word, used when two tokens are spelled with no space, e.g. cannot | | w | tag to delimit a word, used when two tokens are spelled with no space, e.g. cannot |
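
As an illustration of the date and time attributes in the table above, the following sketch shows how normalized values might look. The text content and all values are invented for this example. Writing a partial date as yyyy-mm or yyyy is an assumption based on the note that day and month can be omitted, and using HH:mm:ss for time@from and time@to is likewise assumed:

<code xml>
<!-- hypothetical examples; values are not taken from the corpus -->
<date when="2021-09-15">September 15, 2021</date>
<!-- assumed forms for omitted day, and omitted day and month -->
<date when="2021-09">September 2021</date>
<date when="2021">2021</date>
<!-- a range of dates uses from and to -->
<date from="2018-09-11" to="2021-09-15">September 11, 2018 to September 15, 2021</date>
<time when="16:30:00">4:30 pm</time>
<!-- assumed: ranges of time use the same HH:mm:ss format -->
<time from="09:00:00" to="17:00:00">nine to five</time>
</code>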
  
Line 107: Line 129:
  
 == Quotation marks == == Quotation marks ==
-Literal quotes are surrounded by the <quote> tags, regardless of whether or not quotation marks are used. But other uses of quotation marks are surrounded by <q>. Compare the following two uses:+ 
 +Literal quotes are surrounded by 'quote' tags, regardless of whether or not quotation marks are used. Other uses of quotation marks are surrounded by 'q'. Compare the following two uses:
  
 <code xml> <code xml>
Line 113: Line 136:
 </code> </code>
  
 +
 +== Footnotes ==
 +
 +Footnotes with running text (not bibliographical references realized using numbers hyperlinked to the bibliography) are placed immediately after the paragraph that contains the numbered reference. The number is surrounded by **ref** tags, and the note is enclosed in **note**:
 +
 +<code xml>
 +<p>
 +Some long text.<ref>1</ref> Paragraph continues. At the end of this paragraph we'll insert the note.
 +</p>
 +<note place="foot" n="1">This is the footnote, which physically appeared at the bottom of the page, which was the middle of the next paragraph.</note>
 +<p>
 +Next paragraph. This one is split across pages, but the footnote does not appear in the middle of it, even though it was there graphically.
 +</p>
 +</code>
 +
 +== Reference to deleted speakers ==
 +
 +If a deleted comment on Reddit is not replied to within the context included in the document, it may be ignored. However, if the comment is part of a broken thread of responses, its existence can be encoded using an empty **sp** tag with the speaker set to DELETED, which can then be referred to in the reply:
 +
 +<code xml>
 +
 +<sp who="#DELETED"/>
 +<sp who="#kim" whom="#DELETED">
 +I agree with you.
 +</sp>
 +
 +</code>
 +
 +== Reference to multiple speakers ==
 +
 +If two characters in a work of fiction say the same thing at the same time, tag both speakers in alphabetical order, separated by a comma (without a space), in the sp@who attribute:
 +
 +<code xml>
 +
 +<p>
 +<sp who="#Fairy,#Narrator" whom="#Pete">
 +“No!”
 +</sp> 
 +we both said at once.
 +</p>
 +
 +</code>
 +
 +If there are multiple possible addressees and it is not clear who/which subset is being addressed, all possible addressees are included in sp@whom (usually everyone but the speaker). Speech uttered to no specific addressee is left without the @whom attribute.
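
A minimal sketch of the two addressee cases just described, reusing the speaker names from the example above. The utterances are invented, and listing several addressees separated by a comma without a space is an assumption carried over from the sp@who convention, since the format for multiple values in sp@whom is not stated here:

<code xml>
<!-- unclear addressee: all possible addressees are listed (format assumed to mirror sp@who) -->
<sp who="#Pete" whom="#Fairy,#Narrator">
“Who is there?”
</sp>
<!-- no specific addressee: the whom attribute is left out -->
<sp who="#Pete">
“What a day.”
</sp>
</code>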
  
 == Tokens with no intervening spaces == == Tokens with no intervening spaces ==
Line 119: Line 186:
 <code xml> <code xml>
   I <w>cannot</w> do this (5 tokens)   I <w>cannot</w> do this (5 tokens)
 +</code>
 +
 +== Section dividers ==
 +
 +If there are graphic section dividers, which separate different sections of the text but do not contain any words, tag them as in the following example:
 +
 +<code xml>
 +
 +<p>
 +<s>* * *</s>
 +</p>
 +
 </code> </code>
  
 ==== Structural markup ==== ==== Structural markup ====
 +
 +
 +**NOTE:** As of GUM2, the following tags are no longer used:
  
 ^ tag@attribute ^ meaning ^ ^ tag@attribute ^ meaning ^
Line 132: Line 214:
 | div3 | same as div1 for a third level nested section | | div3 | same as div1 for a third level nested section |
 | div3@n | | | div3@n | |
 +
 +Other deprecated tags:
 +
 +^ tag@attribute ^ meaning ^
 +| measure | span of a unit of measurement |
 +| measure@type | a measure type (e.g. currency) |
 +
gum/tei_markup_in_gum.1536674567.txt.gz · Last modified: 2021/02/11 17:03 (external edit)