Parts of speech tagging

Partial word

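For partial words, tag the target hypothesis, that is, the word the speaker appears to have intended. In the example below, the partial word dec is tagged VVD, the tag of its target decided.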

Example

So uhm <,> then we all <.> dec </.> they all decided they wanted to go to the disco like but I had no money

Token  So  uhm  then  we   all  dec  they  all  decided
Tag    RB  UH   RB    PRP  RB   VVD  PRP   RB   VVD

Sometimes the target hypothesis is difficult to determine. In these cases, see UNCLEAR.

Discourse markers

Null-to-low semantic value

Words that carry little or no semantic value are tagged as discourse markers (i.e. UH). These words are usually affirmative responses in which the word carries less meaning than it does in its other uses. For example, well in “oh well” no longer has the sense of well in “the child behaved well”.

Examples

Oh right

Token  Oh  right
Tag    UH  UH

Ah cool

Token  Ah  cool
Tag    UH  UH

He rang her alright

Token  He   rang  her  alright
Tag    PRP  VVD   PRP  UH

Clause-final 'like'

Function: “retroactive focusing power, but more importantly, […] they can be interpreted as countering potential inferences, objections, or doubts” (Miller & Weinert, 1995)

Clause-final 'like' is extremely common, and it neither (a) appears in the same distribution as other forms of 'like' nor (b) has the same function. It should therefore be tagged as UH.

All the people were out like.

Token  All  the  people  were  out  like
Tag    PDT  DT   NNS     VBD   IN   UH
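
The clause-final rule could be applied as a post-processing step over an already tagged utterance. The following is only a minimal sketch under stated assumptions: the function name `retag_clause_final_like` and the set `CLAUSE_END` are hypothetical, and a clause is assumed to end either at the end of the token list or before sentence punctuation. Clause boundaries that are not marked this way (for example 'like' followed by 'but') still need manual judgement.

```python
# Hypothetical helper; not part of any tagger's API.
# Assumption: these tokens mark the end of a clause.
CLAUSE_END = {".", "?", "!"}


def retag_clause_final_like(tokens, tags):
    """Return a copy of tags in which clause-final 'like' is tagged UH."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    new_tags = list(tags)
    for i, token in enumerate(tokens):
        if token.lower() != "like":
            continue
        is_last = i == len(tokens) - 1
        before_boundary = not is_last and tokens[i + 1] in CLAUSE_END
        if is_last or before_boundary:
            new_tags[i] = "UH"
    return new_tags


tokens = ["All", "the", "people", "were", "out", "like"]
tags = ["PDT", "DT", "NNS", "VBD", "IN", "IN"]  # 'like' initially mis-tagged as IN
print(retag_clause_final_like(tokens, tags))
# → ['PDT', 'DT', 'NNS', 'VBD', 'IN', 'UH']
```

A 'like' that is not clause-final, as in "I like tea", is left unchanged by this sketch.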

Misc.

ye

Example

Did she go out with ye.

Token  Did  she  go  out  with  ye
Tag    VVD  PRP  VV  IN   IN    PRP

UNCLEAR

Either use target hypothesis or the tag XX. N.B. XX is also used in the Switchboard Corpus for partial words and for unclear parts of speech. Here, we tag partial words using target hypothesis. If the target of a partial word cannot be determined, tag it as XX.

Example

Did you go UNCLEAR

Token  Did  you  go  UNCLEAR
Tag    VVD  PRP  VV  XX
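
The fallback from target hypothesis to XX can be summarised as a small decision rule. This is an illustrative sketch only: the function `tag_partial_or_unclear` and its `target_tag` parameter are hypothetical names, and deciding whether a target hypothesis exists remains the annotator's judgement.

```python
def tag_partial_or_unclear(target_tag=None):
    """Tag for a partial word or an UNCLEAR token.

    target_tag is the tag of the target hypothesis as chosen by the
    annotator, or None when no target hypothesis can be determined.
    """
    if target_tag is not None:
        return target_tag  # tag the intended word
    return "XX"  # no recoverable target


# 'dec' as a false start of 'decided': the target hypothesis gives VVD.
print(tag_partial_or_unclear("VVD"))  # → VVD
# UNCLEAR with no recoverable target.
print(tag_partial_or_unclear())  # → XX
```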
part_of_speech_tagging.txt · Last modified: 2018/09/11 10:02 (external edit)