Filler Type: Filled Pause (FP)

What is a Filled Pause?

Filled pauses are hesitation sounds that speakers employ to indicate uncertainty or to maintain control of a conversation while thinking of what to say next. Filled pauses do not add any new information to the conversation (other than to indicate the speaker's hesitation) and they do not alter the meaning of what is uttered. For instance,

Filled pauses can occur anywhere in the stream of speech. In English, the set of filled pauses includes the following five words:

            ah            uh
            eh            um
            er

Other sounds or non-lexemes can occasionally be used as a filled pause, and some speakers may adopt an idiosyncratic filled pause noise that does not appear on the above list. For the purposes of SimpleMDE annotation, we limit ourselves to the filled pauses listed above.

Note: The annotation tool pre-identifies filled pauses (limited to ah, eh, er, uh, um) and automatically pre-annotates them as filled pauses. Pre-identified filled pauses are displayed in blue font; annotated filled pauses are displayed with blue underlining. Annotators must verify that each pre-labeled filled pause candidate is actually acting as a filled pause, and must remove the annotation from any non-filled pauses.

In broadcast news, filled pauses are less common and are sometimes not pre-annotated. In those cases, annotators must add the filled pause annotation manually.
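The pre-identification step described in the note above can be pictured roughly as follows. This is only an illustrative sketch, not the annotation tool's actual implementation: the function name find_fp_candidates and the simple tokenization are assumptions made for the example. It flags every token that matches the five listed filled pauses; a human annotator still has to confirm that each candidate really signals hesitation.

```python
import re

# The five filled pauses the guidelines recognize.
FILLED_PAUSES = {"ah", "eh", "er", "uh", "um"}


def find_fp_candidates(transcript):
    """Return (token_index, token) pairs for tokens on the filled pause list.

    Hypothetical helper: matching is by spelling only, so the result is a
    list of candidates, not confirmed filled pauses.
    """
    # Assumed tokenization: runs of letters and apostrophes.
    tokens = re.findall(r"[A-Za-z']+", transcript)
    return [(i, tok) for i, tok in enumerate(tokens)
            if tok.lower() in FILLED_PAUSES]


candidates = find_fp_candidates("Um, uh well you see uh it's not that simple.")
print(candidates)
```

Because the match is purely lexical, a mistranscribed backchannel or a question response spelled "uh" would be flagged too, which is why manual verification is required.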

Other FP functions

Be aware that some tokens that can be used as FPs may have other functions, like question responses, elsewhere in the discourse. Label tokens as filled pauses only when they indicate a speaker's hesitation.

Look out for filled pauses that are actually mistranscribed backchannels. For example, a speaker says "mhm" but the transcription reads "uh" and has been automatically annotated as a filled pause. In this case, "uh" is a would-be backchannel. For the conventions on handling would-be backchannels, see Would-be backchannels.

FPs occupying a whole speaker turn

Sometimes a speaker's turn consists solely of a filled pause. You should annotate this as an incomplete SU:

FPs at the end of a speaker turn

See the guidelines in the introduction to fillers.

Strings of FPs

In order to save time during annotation, long strings of contiguous filled pauses may be labeled as a single multi-word filler rather than a series of separate filled pauses.

These will be separated into individual filled pause tokens as an automatic post-processing step. However, fillers of different types that occur in sequence should be annotated separately according to their type. For instance,

            {Um, uh} {well} {you see} {uh} it's not that simple.

Annotation: FPause Disc.Marker Disc.Marker FPause

Upon post-processing, this example will be rendered as follows:

            {Um} {uh} {well} {you see} {uh} it's not that simple.

Post-process: FP FP DM DM FP
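The post-processing step can be sketched as below. This is a minimal illustration under stated assumptions, not the actual post-processing script: the function name split_fp_strings and the representation of annotations as (text, label) pairs are hypothetical. Spans labeled FP are split into one token per filled pause, while other filler types are passed through unchanged.

```python
def split_fp_strings(annotations):
    """Split multi-word FP spans into individual FP tokens.

    annotations: list of (text, label) pairs, where label is "FP" for a
    filled pause span and any other string (e.g. "DM") for other fillers.
    This data layout is an assumption made for the example.
    """
    result = []
    for text, label in annotations:
        if label == "FP":
            # Drop commas, then emit one FP annotation per token.
            for token in text.replace(",", " ").split():
                result.append((token, "FP"))
        else:
            # Non-FP fillers keep their original span and label.
            result.append((text, label))
    return result


annotated = [("Um, uh", "FP"), ("well", "DM"), ("you see", "DM"), ("uh", "FP")]
print(split_fp_strings(annotated))
```

Note that the multi-word discourse marker "you see" stays a single span; only filled pause strings are split.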

For information on where to place filled pauses in relation to SU breaks, please consult the "Sentence Level SU Breaks" section of the Sentence-level vs. Sentence-internal SUs page.


