Package rst.audition¶

Audio signal processing, sometimes referred to as audio processing, is the intentional alteration of auditory signals, or sound.

This package contains data type definitions related to audio processing.

Messages¶

SoundChunkCollection
Utterance
SoundChunk
PhonemeCollection
Phoneme

Message SoundChunkCollection¶

class rst.audition.SoundChunkCollection¶

Collection of SoundChunk instances.

Auto-generated.

element¶

Type:	array of `rst.audition.SoundChunk`

The individual elements of the collection.

Constraints regarding the empty collection, sorting, duplicated entries etc. are use case specific.

message SoundChunkCollection {

    /**
     * The individual elements of the collection.
     *
     * Constraints regarding the empty collection, sorting, duplicated
     * entries etc. are use case specific.
     */
    repeated SoundChunk element = 1;

}

Message Utterance¶

class rst.audition.Utterance¶

Objects of this type represent a single utterance of speech.

The data describes a single utterance in three different forms:

phonemes describes the utterance as a list of phone symbols and durations (useful e.g. for lip animation).
audio is a SoundChunk that can be played back on audio devices. It contains the realization (e.g. by a TTS system) of the included phoneme list.
textual_representation is a textual description of the utterance for debugging purposes.

Code author: Simon Schulz <sschulz@techfak.uni-bielefeld.de>

phonemes¶

Type:	`rst.audition.PhonemeCollection`

A collection of phonemes. The phonemes will be played back in the order in which the Phoneme elements are given.

audio¶

Type:	`rst.audition.SoundChunk`

A chunk of audio data that can be played back, containing the realization (e.g. by a TTS system) of the included phoneme list.

textual_representation¶

Type:	`ASCII-STRING`

Textual representation of the utterance.

message Utterance {

    /**
     * A collection of phonemes. Will be played back in the same
     * ordering as given by @ref .Phoneme
     */
    required PhonemeCollection phonemes = 1;

    /**
     * A chunk of audio data that can be played back containing the
     * realization (e.g. by a TTS system) of the included phoneme list
     */
    required SoundChunk audio = 2;

    /**
     * Textual representation of the utterance.
     */
    required string textual_representation = 3;

}

Message SoundChunk¶

class rst.audition.SoundChunk¶

Constraint: len(.data) == 8 * .channels * .sample_count * TODO(.sample_type)

Objects of this type represent a chunk of an audio stream.

The audio information for one or more channels is stored in data as a sequence of sample_count encoded samples, the encoding of which is described by endianness and sample_type.
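The constraint stated above leaves the per-sample size open (the TODO term). As a rough illustration only, the following Python sketch computes an expected length of data in bytes. The helper name and the lookup table are hypothetical and not part of rst.audition. The sketch assumes that sample_count counts samples per channel and that 24-bit samples are packed into 3 bytes; neither is confirmed by the definition.

```python
# Hypothetical lookup table, not part of rst.audition: bytes per sample for
# each SampleType name. ASSUMPTION: 24-bit samples are packed into 3 bytes.
BYTES_PER_SAMPLE = {
    "SAMPLE_S8": 1,
    "SAMPLE_U8": 1,
    "SAMPLE_S16": 2,
    "SAMPLE_U16": 2,
    "SAMPLE_S24": 3,
    "SAMPLE_U24": 3,
}


def expected_data_length(sample_count: int, channels: int, sample_type: str) -> int:
    """Return the expected length of the data field in bytes.

    ASSUMPTION: sample_count counts samples per channel, so the total
    number of encoded samples is sample_count * channels.
    """
    return sample_count * channels * BYTES_PER_SAMPLE[sample_type]


# 1024 samples per channel, two channels, signed 16-bit samples.
print(expected_data_length(1024, 2, "SAMPLE_S16"))  # 4096
```

If sample_count instead counts samples across all channels, the channels factor would have to be dropped.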

Depending on the sample rate (rate), such a chunk of audio corresponds to a certain amount of time during which its samples have been recorded.
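The relation between sample count, rate and time can be sketched as follows. The function name is hypothetical, and the sketch assumes that sample_count counts samples per channel (the definition does not state this explicitly).

```python
def chunk_duration_seconds(sample_count: int, rate: int) -> float:
    """Return the time span covered by a chunk, in seconds.

    ASSUMPTION: sample_count counts samples per channel; rate is in Hz.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return sample_count / rate


# 4410 samples at the default rate of 44100 Hz cover a tenth of a second.
print(chunk_duration_seconds(4410, 44100))  # 0.1
```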

Interpretation of RSB timestamps:

create: Capture time of the audio buffer. More precisely, the timestamp should correspond to the first sample contained in the buffer.

Code author: David Klotz <dklotz@techfak.uni-bielefeld.de>

@create_collection

data¶

Type:	`OCTET-VECTOR`

The sequence of bytes representing the samples of this sound chunk.

The value of this field must be interpreted according to the values of the sample_count, channels, sample_type and endianness fields.
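A minimal sketch of such an interpretation in Python is shown here. The function and its parameters are hypothetical; a caller would derive bytes_per_sample and signed from sample_type, and big_endian from endianness. The sketch returns a flat list of integers and does not interpret the channel layout, since the definition leaves the interleaving open.

```python
from typing import List


def decode_samples(data: bytes, bytes_per_sample: int, signed: bool,
                   big_endian: bool) -> List[int]:
    """Split a byte sequence into integer samples.

    Hypothetical helper: the channel layout is NOT interpreted, the
    result is simply every encoded sample in storage order.
    """
    if len(data) % bytes_per_sample != 0:
        raise ValueError("data length is not a multiple of the sample size")
    byteorder = "big" if big_endian else "little"
    return [
        int.from_bytes(data[i:i + bytes_per_sample], byteorder, signed=signed)
        for i in range(0, len(data), bytes_per_sample)
    ]


# Two signed 16-bit little-endian samples with the values 1 and -2.
print(decode_samples(b"\x01\x00\xfe\xff", 2, signed=True, big_endian=False))  # [1, -2]
```

Using int.from_bytes rather than the struct module keeps the sketch usable for 3-byte samples, for which struct has no format code.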

sample_count¶

Type:	`UINT32`

Unit: number

The number of samples contained in data.

channels¶

Type:	`UINT32`

Unit: number

The number of channels for which samples are stored in data.

rate¶

Type:	`UINT32`

Unit: Hz

The rate at which the samples stored in data have been recorded or should be played.

sample_type¶

Type:	`rst.audition.SoundChunk.SampleType`

The data type used for the representation of samples in data.

endianness¶

Type:	`rst.audition.SoundChunk.EndianNess`

The endianness (byte order) used for the representation of samples in data.

message SoundChunk {

    /**
     * The possible data types for representing individual samples.
     */
    enum SampleType {

        /**
         * Signed 8-bit samples.
         */
        SAMPLE_S8 = 0;

        /**
         * Unsigned 8-bit samples.
         */
        SAMPLE_U8 = 1;

        /**
         * Signed 16-bit samples.
         */
        SAMPLE_S16 = 2;

        /**
         * Unsigned 16-bit samples.
         */
        SAMPLE_U16 = 4;

        /**
         * Signed 24-bit samples.
         */
        SAMPLE_S24 = 8;

        /**
         * Unsigned 24-bit samples.
         */
        SAMPLE_U24 = 16;

    }

    /**
     * The possible byte-orders for representing samples.
     */
    enum EndianNess {

        /**
         * Samples are represented with little Endian byte-order.
         */
        ENDIAN_LITTLE = 0;

        /**
         * Samples are represented with big Endian byte-order.
         */
        ENDIAN_BIG = 1;
    }

    /**
     * The sequence of bytes representing the samples of this sound
     * chunk.
     *
     * The value of this field must be interpreted according to the
     * values of the @ref .sample_count, @ref .channels, @ref
     * .sample_type and @ref .endianness fields.
     */
    required bytes data = 1;

    /**
     * The number of samples contained in @ref .data.
     */
    // @unit(number)
    required uint32 sample_count = 2;

    /**
     * The number of channels for which samples are stored in @ref
     * .data.
     */
    // @unit(number)
    optional uint32 channels = 3 [default = 1];

    /**
     * The rate at which the samples stored in @ref .data have been
     * recorded or should be played.
     */
    // @unit(hz)
    optional uint32 rate = 4 [default = 44100];

    /**
     * The data type used for the representation of samples in @ref
     * .data.
     */
    optional SampleType sample_type = 5 [default = SAMPLE_S16];

    /**
     * The Endianness used for the representation of samples in @ref
     * .data.
     */
    optional EndianNess endianness = 6 [default = ENDIAN_LITTLE];

    // TODO: interleaving type?

}

Message SampleType¶

class rst.audition.SoundChunk.SampleType¶

The possible data types for representing individual samples.

SAMPLE_S8¶
= 0: Signed 8-bit samples.

SAMPLE_U8¶
= 1: Unsigned 8-bit samples.

SAMPLE_S16¶
= 2: Signed 16-bit samples.

SAMPLE_U16¶
= 4: Unsigned 16-bit samples.

SAMPLE_S24¶
= 8: Signed 24-bit samples.

SAMPLE_U24¶
= 16: Unsigned 24-bit samples.

    enum SampleType {

        /**
         * Signed 8-bit samples.
         */
        SAMPLE_S8 = 0;

        /**
         * Unsigned 8-bit samples.
         */
        SAMPLE_U8 = 1;

        /**
         * Signed 16-bit samples.
         */
        SAMPLE_S16 = 2;

        /**
         * Unsigned 16-bit samples.
         */
        SAMPLE_U16 = 4;

        /**
         * Signed 24-bit samples.
         */
        SAMPLE_S24 = 8;

        /**
         * Unsigned 24-bit samples.
         */
        SAMPLE_U24 = 16;

    }

Message EndianNess¶

class rst.audition.SoundChunk.EndianNess¶

The possible byte-orders for representing samples.

ENDIAN_LITTLE¶
= 0: Samples are represented with little-endian byte order.

ENDIAN_BIG¶
= 1: Samples are represented with big-endian byte order.

    enum EndianNess {

        /**
         * Samples are represented with little Endian byte-order.
         */
        ENDIAN_LITTLE = 0;

        /**
         * Samples are represented with big Endian byte-order.
         */
        ENDIAN_BIG = 1;
    }

Message PhonemeCollection¶

class rst.audition.PhonemeCollection¶

Collection of Phoneme instances.

Auto-generated.

element¶

Type:	array of `rst.audition.Phoneme`

The individual elements of the collection.

Constraints regarding the empty collection, sorting, duplicated entries etc. are use case specific.

message PhonemeCollection {

    /**
     * The individual elements of the collection.
     *
     * Constraints regarding the empty collection, sorting, duplicated
     * entries etc. are use case specific.
     */
    repeated Phoneme element = 1;

}

Message Phoneme¶

class rst.audition.Phoneme¶

Objects of this type represent a single phoneme-duration pair.

A list of elements of this type can be used to describe words or whole sentences of speech.
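As a rough illustration, such a list can be modelled in plain Python as follows. The Phoneme dataclass is a hypothetical stand-in for the generated message class, and the symbols and durations are invented example values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:
    """Hypothetical stand-in for the generated rst.audition.Phoneme class."""
    symbol: str    # a single phone symbol, e.g. "aI"
    duration: int  # duration of the symbol in milliseconds


def total_duration_ms(phonemes: List[Phoneme]) -> int:
    """Sum the durations of all phonemes in the list."""
    return sum(p.duration for p in phonemes)


# Invented example values, for illustration only.
word = [Phoneme("h", 60), Phoneme("aI", 180), Phoneme("_", 40)]
print(total_duration_ms(word))  # 280
```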

Code author: Simon Schulz <sschulz@techfak.uni-bielefeld.de>

@create_collection

symbol¶

Type:	`ASCII-STRING`

A single phone symbol (such as aI, E, C, R, _, ...).

For examples, see https://en.wikipedia.org/wiki/Phoneme or http://www.phon.ucl.ac.uk/home/sampa/german.htm (German).

duration¶

Type:	`UINT32`

Unit: millisecond

The duration of this symbol.

message Phoneme {

    /**
     * A single phone symbol (such as aI, E, C, R, _, ...).
     *
     * e.g. see https://en.wikipedia.org/wiki/Phoneme
     *      or http://www.phon.ucl.ac.uk/home/sampa/german.htm (german)
     *      examples
     */
    required string symbol = 1;

    /**
     * The duration of this symbol.
     */
    // @unit(millisecond)
    required uint32 duration = 2;

}
