Package rst.audition¶

Not documented

Messages¶

Utterance
SoundChunk
PhonemeCollection
Phoneme


Message Utterance¶

class rst.audition.Utterance¶

Objects of this type represent a single utterance of speech.

The data describes a single utterance in three different forms:

phonemes describes the utterance as a list of phone symbols and durations (useful e.g. for lip animation).
audio is a SoundChunk that can be played back on audio devices; it contains the realization (e.g. by a TTS system) of the included phoneme list.
textual_representation is a textual description of the utterance for debugging purposes.
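To illustrate how the phoneme list could drive something like lip animation, the following Python sketch looks up which phone symbol is active at a given playback time. It is not part of the package: the function name phoneme_at and the plain (symbol, duration) tuples are hypothetical stand-ins, and the sketch assumes the phonemes follow each other without gaps, starting at time 0.

```python
from typing import List, Optional, Tuple


def phoneme_at(phonemes: List[Tuple[str, int]], time_ms: int) -> Optional[str]:
    """Return the symbol active at time_ms, or None outside the utterance.

    Assumption: phonemes are (symbol, duration in milliseconds) pairs that
    are played back to back, in list order, starting at time 0.
    """
    if time_ms < 0:
        return None
    end = 0
    for symbol, duration in phonemes:
        end += duration  # end time of the current phoneme
        if time_ms < end:
            return symbol
    return None


# Illustrative values only, not taken from a real TTS system.
timeline = [("h", 60), ("aI", 140)]
print(phoneme_at(timeline, 100))  # aI
```

An animation loop could call such a function once per frame with the current playback position of the audio.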

Code author: Simon Schulz <sschulz@techfak.uni-bielefeld.de>

phonemes¶

Type:	`rst.audition.PhonemeCollection`

A collection of phonemes. The phonemes are played back in the order in which they appear in the collection.

audio¶

Type:	`rst.audition.SoundChunk`

A chunk of playable audio data containing the realization (e.g. by a TTS system) of the included phoneme list.

textual_representation¶

Type:	`ASCII-STRING`

Textual representation of the utterance.

message Utterance {

    /**
     * A collection of phonemes. Will be played back in the same
     * ordering as given by @ref .Phoneme
     */
    required PhonemeCollection phonemes = 1;

    /**
     * A chunk of audio data that can be played back containing the
     * realization (e.g. by a TTS system) of the included phoneme list
     */
    required SoundChunk audio = 2;

    /**
     * Textual representation of the utterance.
     */
    required string textual_representation = 3;

}

Message SoundChunk¶

class rst.audition.SoundChunk¶

Constraint: len(.data) == 8 * .channels * .sample_count * TODO(.sample_type)

Objects of this type represent a chunk of an audio stream.

The audio information for one or more channels is stored in data as a sequence of sample_count encoded samples, the encoding of which is described by endianness and sample_type.

Depending on the sample rate (rate), such a chunk of audio corresponds to a certain amount of time during which its samples have been recorded.
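As a rough illustration of the relation between sample_count, rate and time, the following Python sketch computes the time span covered by a chunk. The function name chunk_duration_seconds is hypothetical, and the sketch assumes that sample_count counts samples per channel, which the constraint above suggests but the documentation does not state explicitly.

```python
def chunk_duration_seconds(sample_count: int, rate: int = 44100) -> float:
    """Return the time span covered by a chunk, in seconds.

    Assumption: sample_count is the number of samples per channel.
    The default of 44100 Hz mirrors the default of the rate field
    in the message definition.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return sample_count / rate


# 22050 samples at the default rate of 44100 Hz cover half a second.
print(chunk_duration_seconds(22050))  # 0.5
```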

Interpretation of RSB timestamps:

create:: Capture time of the audio buffer. More precisely, the timestamp should correspond to the first sample contained in the buffer.

Code author: David Klotz <dklotz@techfak.uni-bielefeld.de>

@create_collection

data¶

Type:	`OCTET-VECTOR`

The sequence of bytes representing the samples of this sound chunk.

The value of this field must be interpreted according to the values of the sample_count, channels, sample_type and endianness fields.
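How these fields work together can be sketched in Python as follows. This is not the package's API: the function decode_samples and the table SAMPLE_FORMATS are hypothetical, and the sketch makes two assumptions the documentation leaves open, namely that samples of different channels are interleaved and that 24-bit samples are packed into exactly three bytes.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup: sample_type name -> (bytes per sample, signed?)
SAMPLE_FORMATS: Dict[str, Tuple[int, bool]] = {
    "SAMPLE_S8": (1, True),
    "SAMPLE_U8": (1, False),
    "SAMPLE_S16": (2, True),
    "SAMPLE_U16": (2, False),
    "SAMPLE_S24": (3, True),   # assumption: packed into 3 bytes
    "SAMPLE_U24": (3, False),  # assumption: packed into 3 bytes
}

BYTE_ORDERS = {"ENDIAN_LITTLE": "little", "ENDIAN_BIG": "big"}


def decode_samples(data: bytes, channels: int, sample_type: str,
                   endianness: str) -> List[List[int]]:
    """Decode raw bytes into one list of integer samples per channel.

    Assumption: samples are tightly packed and interleaved by channel.
    """
    width, signed = SAMPLE_FORMATS[sample_type]
    byteorder = BYTE_ORDERS[endianness]
    frame_size = width * channels
    if channels < 1 or len(data) % frame_size != 0:
        raise ValueError("data length does not match channels and sample_type")
    result: List[List[int]] = [[] for _ in range(channels)]
    for offset in range(0, len(data), width):
        value = int.from_bytes(data[offset:offset + width], byteorder,
                               signed=signed)
        result[(offset // width) % channels].append(value)
    return result


# Two interleaved 16-bit little-endian frames for two channels.
raw = bytes([0x01, 0x00, 0xFF, 0xFF, 0x02, 0x00, 0xFE, 0xFF])
print(decode_samples(raw, 2, "SAMPLE_S16", "ENDIAN_LITTLE"))
# [[1, 2], [-1, -2]]
```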

sample_count¶

Type:	`UINT32`

Unit: number

The number of samples contained in data.

channels¶

Type:	`UINT32`

Unit: number

The number of channels for which samples are stored in data.

rate¶

Type:	`UINT32`

Unit: hz

The rate at which the samples stored in data have been recorded or should be played.

sample_type¶

Type:	`rst.audition.SoundChunk.SampleType`

The data type used for the representation of samples in data.

endianness¶

Type:	`rst.audition.SoundChunk.EndianNess`

The endianness used for the representation of samples in data.

message SoundChunk {

    /**
     * The possible data types for representing individual samples.
     */
    enum SampleType {

        /**
         * Signed 8-bit samples.
         */
        SAMPLE_S8 = 0;

        /**
         * Unsigned 8-bit samples.
         */
        SAMPLE_U8 = 1;

        /**
         * Signed 16-bit samples.
         */
        SAMPLE_S16 = 2;

        /**
         * Unsigned 16-bit samples.
         */
        SAMPLE_U16 = 4;

        /**
         * Signed 24-bit samples.
         */
        SAMPLE_S24 = 8;

        /**
         * Unsigned 24-bit samples.
         */
        SAMPLE_U24 = 16;

    }

    /**
     * The possible byte-orders for representing samples.
     */
    enum EndianNess {

        /**
         * Samples are represented with little Endian byte-order.
         */
        ENDIAN_LITTLE = 0;

        /**
         * Samples are represented with big Endian byte-order.
         */
        ENDIAN_BIG = 1;
    }

    /**
     * The sequence of bytes representing the samples of this sound
     * chunk.
     *
     * The value of this field must be interpreted according to the
     * values of the @ref .sample_count, @ref .channels, @ref
     * .sample_type and @ref .endianness fields.
     */
    required bytes data = 1;

    /**
     * The number of samples contained in @ref .data.
     */
    // @unit(number)
    required uint32 sample_count = 2;

    /**
     * The number of channels for which samples are stored in @ref
     * .data.
     */
    // @unit(number)
    optional uint32 channels = 3 [default = 1];

    /**
     * The rate at which the samples stored in @ref .data have been
     * recorded or should be played.
     */
    // @unit(hz)
    optional uint32 rate = 4 [default = 44100];

    /**
     * The data type used for the representation of samples in @ref
     * .data.
     */
    optional SampleType sample_type = 5 [default = SAMPLE_S16];

    /**
     * The Endianness used for the representation of samples in @ref
     * .data.
     */
    optional EndianNess endianness = 6 [default = ENDIAN_LITTLE];

    // TODO: interleaving type?

}

Message SampleType¶

class rst.audition.SoundChunk.SampleType¶

The possible data types for representing individual samples.

SAMPLE_S8¶
= 0: Signed 8-bit samples.

SAMPLE_U8¶
= 1: Unsigned 8-bit samples.

SAMPLE_S16¶
= 2: Signed 16-bit samples.

SAMPLE_U16¶
= 4: Unsigned 16-bit samples.

SAMPLE_S24¶
= 8: Signed 24-bit samples.

SAMPLE_U24¶
= 16: Unsigned 24-bit samples.
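Note that the numeric values are not consecutive (0, 1, 2, 4, 8, 16), so they cannot be used directly as list indices. The following Python sketch mirrors the values listed above in an IntEnum and derives bit width and signedness from the value names. The helper names bits_per_sample and is_signed are hypothetical, and the class is a stand-in, not the generated protobuf class.

```python
from enum import IntEnum


class SampleType(IntEnum):
    """Stand-in mirroring the numeric values of SoundChunk.SampleType."""
    SAMPLE_S8 = 0
    SAMPLE_U8 = 1
    SAMPLE_S16 = 2
    SAMPLE_U16 = 4
    SAMPLE_S24 = 8
    SAMPLE_U24 = 16


_PREFIX = "SAMPLE_"


def bits_per_sample(sample_type: SampleType) -> int:
    # The name ends with the bit width, e.g. "SAMPLE_S16" -> 16.
    return int(sample_type.name[len(_PREFIX) + 1:])


def is_signed(sample_type: SampleType) -> bool:
    # The letter after the prefix is "S" (signed) or "U" (unsigned).
    return sample_type.name[len(_PREFIX)] == "S"


received = SampleType(4)  # numeric value as it would arrive on the wire
print(received.name, bits_per_sample(received), is_signed(received))
# SAMPLE_U16 16 False
```

Constructing the enum from an unlisted number such as 3 raises a ValueError, which makes invalid values visible early.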

    enum SampleType {

        /**
         * Signed 8-bit samples.
         */
        SAMPLE_S8 = 0;

        /**
         * Unsigned 8-bit samples.
         */
        SAMPLE_U8 = 1;

        /**
         * Signed 16-bit samples.
         */
        SAMPLE_S16 = 2;

        /**
         * Unsigned 16-bit samples.
         */
        SAMPLE_U16 = 4;

        /**
         * Signed 24-bit samples.
         */
        SAMPLE_S24 = 8;

        /**
         * Unsigned 24-bit samples.
         */
        SAMPLE_U24 = 16;

    }

Message EndianNess¶

class rst.audition.SoundChunk.EndianNess¶

The possible byte-orders for representing samples.

ENDIAN_LITTLE¶
= 0: Samples are represented with little-endian byte order.

ENDIAN_BIG¶
= 1: Samples are represented with big-endian byte order.
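The effect of the byte order on a single multi-byte sample can be seen in this short Python sketch, which interprets the same two bytes both ways using the standard int.from_bytes function. The byte values are arbitrary illustration data.

```python
# The same two bytes, read as one unsigned 16-bit sample.
sample = bytes([0x34, 0x12])

little = int.from_bytes(sample, "little")  # least significant byte first
big = int.from_bytes(sample, "big")        # most significant byte first

print(hex(little), hex(big))  # 0x1234 0x3412
```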

    enum EndianNess {

        /**
         * Samples are represented with little Endian byte-order.
         */
        ENDIAN_LITTLE = 0;

        /**
         * Samples are represented with big Endian byte-order.
         */
        ENDIAN_BIG = 1;
    }

Message PhonemeCollection¶

class rst.audition.PhonemeCollection¶

Collection of Phoneme instances.

Auto-generated.

element¶

Type:	array of `rst.audition.Phoneme`

The individual elements of the collection.

Constraints regarding the empty collection, sorting, duplicated entries etc. are use-case specific.

message PhonemeCollection {

    /**
     * The individual elements of the collection.
     *
     * Constraints regarding the empty collection, sorting, duplicated
     * entries etc. are use case specific.
     */
    repeated Phoneme element = 1;

}

Message Phoneme¶

class rst.audition.Phoneme¶

Objects of this type represent a single phoneme-duration pair.

A list of elements of this type can be used to describe words or whole sentences of speech.

Code author: Simon Schulz <sschulz@techfak.uni-bielefeld.de>

@create_collection

symbol¶

Type:	`ASCII-STRING`

A single phone symbol (such as aI, E, C, R, _, ...).

For examples, see https://en.wikipedia.org/wiki/Phoneme or http://www.phon.ucl.ac.uk/home/sampa/german.htm (German).

duration¶

Type:	`UINT32`

Unit: millisecond

The duration of this symbol.
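As a minimal sketch of how a list of such pairs describes a stretch of speech, the following Python code models a phoneme as a small dataclass and sums the durations of a sequence. The dataclass is a hypothetical stand-in for the generated message class, total_duration_ms is an illustrative helper, and the symbols and durations are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:
    """Stand-in for rst.audition.Phoneme (not the generated class)."""
    symbol: str    # a single phone symbol
    duration: int  # duration in milliseconds


def total_duration_ms(phonemes: List[Phoneme]) -> int:
    """Sum the durations of all phonemes in the list."""
    return sum(p.duration for p in phonemes)


# Illustrative symbols and durations only, not taken from a real TTS run.
word = [Phoneme("_", 50), Phoneme("aI", 120), Phoneme("C", 80)]
print(total_duration_ms(word))  # 250
```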

message Phoneme {

    /**
     * A single phone symbol (such as aI, E, C, R, _, ...).
     *
     * e.g. see https://en.wikipedia.org/wiki/Phoneme
     *      or http://www.phon.ucl.ac.uk/home/sampa/german.htm (german)
     *      examples
     */
    required string symbol = 1;

    /**
     * The duration of this symbol.
     */
    // @unit(millisecond)
    required uint32 duration = 2;

}
