Utterance Segmentation

Turn-based

The utterance should always end after a speaker's turn.

Example

Speaker A: <#> Went in shopping for a while

Speaker A's turn ends. End of utterance.

Speaker B: <#> Buy anything

Speaker B's turn ends. End of utterance.

Speaker A: <#> Met Nicole in town <#> No I didn't buy anything <#> I 've hardly no money <#> <{> <[> Broke <,>

Speaker A's turn ends. End of utterance.

Notice that in this example, Speaker B interrupted Speaker A, who was still listing the activities from their previous turn. The two turns by Speaker A should be annotated as distinct utterances even though they are closely related.
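
The turn-based rule can be sketched in code. This is a minimal illustration, not part of any annotation tool: it assumes the transcript is available as lines of the form "Speaker X: text", and the name segment_by_turn is hypothetical. It covers only the turn-final boundary. Boundaries inside a turn are not handled here.

```python
import re

# Assumed line format: "Speaker A: <#> Went in shopping for a while"
TURN_RE = re.compile(r"^Speaker (\w+):\s*(.*)$")


def segment_by_turn(lines):
    """Return one (turn_index, speaker, text) entry per speaker turn.

    Nothing is carried over from one turn to the next, so two related
    turns by the same speaker are never merged into one utterance.
    """
    segments = []
    for line in lines:
        match = TURN_RE.match(line.strip())
        if match is None:
            continue  # skip commentary lines that are not turns
        speaker, text = match.groups()
        segments.append((len(segments), speaker, text))
    return segments


transcript = [
    "Speaker A: <#> Went in shopping for a while",
    "Speaker B: <#> Buy anything",
    "Speaker A: <#> Met Nicole in town <#> No I didn't buy anything",
]
for turn_index, speaker, text in segment_by_turn(transcript):
    print(turn_index, speaker, text)
```

Speaker A's two turns come out as separate entries because Speaker B's turn falls between them.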

False Starts

False starts should be included in the utterance.

Example

<#> But uhm she 's she 's from Galway

Tokens     | But uhm she 's she 's from Galway |
Utterance  | UTTERANCE                         |

The exception is a false start at the beginning of a sentence whose lexical item differs significantly from what follows. Such a false start should be segmented as a distinct utterance.

Example

<#> <.> Sat </.> who else <,>

Tokens     | Sat       | who else , |
Utterance  | UTTERANCE | UTTERANCE  |
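
The exception can be approximated in code, with a caveat: whether a lexical item "differs significantly" is an annotator's judgment, and the prefix test below is only an assumed stand-in for it. The function name split_initial_false_start is hypothetical. The input is a token list in which an incomplete word is wrapped in <.> and </.>, as in the example above.

```python
def split_initial_false_start(tokens):
    """Split off a sentence-initial false start when it looks unrelated.

    ASSUMPTION: "differs significantly" is approximated by checking that
    the word after the false start does not begin with the abandoned
    fragment. A human annotator may decide differently.
    """
    if len(tokens) >= 4 and tokens[0] == "<.>" and "</.>" in tokens:
        close = tokens.index("</.>")
        fragment = tokens[1:close]
        rest = tokens[close + 1:]
        if fragment and rest:
            next_word = rest[0].lower()
            if not next_word.startswith(fragment[-1].lower()):
                return [fragment, rest]  # two distinct utterances
    # Default rule: the false start stays inside a single utterance.
    return [[t for t in tokens if t not in ("<.>", "</.>")]]


print(split_initial_false_start("<.> Sat </.> who else ,".split()))
print(split_initial_false_start("But uhm she 's she 's from Galway".split()))
```

The first call yields two utterances and the second yields one, matching the two examples in this section.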

Pauses

A pause at the end of an utterance (marked <,> in the transcript) should be included in that utterance.

Example

<#> Yeah <,> she was <{> <[> with her sister </[> <,> <#> She was going in shopping

Tokens     | Yeah , she was with her sister , | She was going in shopping |
Utterance  | UTTERANCE                        | UTTERANCE                 |

Sentence Boundaries

Pre-annotated sentence boundaries are a good indication of utterance boundaries.

Example

<#> So then uhm <,> what 'd I do Sunday then <#> Sunday I did nothing much

Tokens     | So then uhm , what 'd I do Sunday then | Sunday I did nothing much |
Sentence   | SENTENCE                               | SENTENCE                  |
Utterance  | UTTERANCE                              | UTTERANCE                 |
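
As a rough sketch of how the pre-annotated boundaries can seed the segmentation, the snippet below splits a transcript line at each <#> marker and treats every resulting unit as a candidate utterance. The function name candidate_utterances is hypothetical, and the handling of markup is an assumption: <,> is kept as a "," token, and all other angle-bracket tags are dropped. The candidates still need manual review, since a sentence may contain more than one utterance.

```python
import re

# Any single angle-bracket tag, e.g. <#>, <{>, </[>
MARKUP = re.compile(r"<[^>]+>")


def candidate_utterances(text):
    """Split transcript text at <#> and return one token list per unit."""
    units = []
    for chunk in text.split("<#>"):
        tokens = []
        for tok in chunk.split():
            if tok == "<,>":
                tokens.append(",")  # keep the pause as a token
            elif MARKUP.fullmatch(tok):
                continue  # drop other markup (assumed not to be tokens)
            else:
                tokens.append(tok)
        if tokens:
            units.append(tokens)
    return units


line = "<#> So then uhm <,> what 'd I do Sunday then <#> Sunday I did nothing much"
for unit in candidate_utterances(line):
    print(" ".join(unit))
```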

In some cases, sentences consist of multiple utterances.

Example

<#> <.> Sat </.> who else <,> I met Saoirse in town Saturday

Tokens     | Sat       | who else , | I met Saoirse in town Saturday |
Sentence   | SENTENCE (one sentence spanning all tokens)              |
Utterance  | UTTERANCE | UTTERANCE  | UTTERANCE                      |
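
One way to represent a sentence that consists of several utterances is as token-index spans, with a check that the utterance spans cover the sentence exactly. This is a sketch under assumptions: the (start, end) span format with an exclusive end, the function name utterances_tile_sentence, and the boundary positions used for the example are all illustrative, not taken from the annotation tool.

```python
def utterances_tile_sentence(sentence_span, utterance_spans):
    """Return True if the utterances cover the sentence with no gap or overlap.

    Spans are (start, end) token indices, with end exclusive.
    """
    start, end = sentence_span
    position = start
    for u_start, u_end in sorted(utterance_spans):
        if u_start != position or u_end <= u_start:
            return False  # gap, overlap, or empty utterance
        position = u_end
    return position == end


tokens = ["Sat", "who", "else", ",", "I", "met", "Saoirse", "in", "town", "Saturday"]
sentence = (0, len(tokens))
# Assumed boundaries: "Sat" | "who else ," | "I met Saoirse in town Saturday"
utterances = [(0, 1), (1, 4), (4, 10)]
print(utterances_tile_sentence(sentence, utterances))
```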
utterance_segmentation.txt · Last modified: 2018/09/11 10:02 (external edit)