Package rst.audition
Not documented
Messages
Message Utterance
class rst.audition.Utterance
Objects of this type represent a single utterance of speech.
The data describes a single utterance in three different forms:
- phonemes describes the utterance as a list of phone symbols and durations (useful e.g. for lip animation).
- audio is a SoundChunk that can be played back on audio devices and contains the realization (e.g. by a TTS system) of the included phoneme list.
- textual_representation is a textual description of the utterance for debugging purposes.
Code author: Simon Schulz <sschulz@techfak.uni-bielefeld.de>
- phonemes
  Type: rst.audition.PhonemeCollection
  A collection of phonemes. Will be played back in the same ordering as given by Phoneme.
- audio
  Type: rst.audition.SoundChunk
  A chunk of audio data that can be played back containing the realization (e.g. by a TTS system) of the included phoneme list.
- textual_representation
  Type: ASCII-STRING
  Textual representation of the utterance.
message Utterance {
/**
* A collection of phonemes. Will be played back in the same
* ordering as given by @ref .Phoneme
*/
required PhonemeCollection phonemes = 1;
/**
* A chunk of audio data that can be played back containing the
* realization (e.g. by a TTS system) of the included phoneme list
*/
required SoundChunk audio = 2;
/**
* Textual representation of the utterance.
*/
required string textual_representation = 3;
}
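The relationship between the phoneme list and the audio can be illustrated with a small consistency check. The following is a minimal sketch, not part of the package: the dataclasses are hypothetical stand-ins for the generated message classes, the 50 ms tolerance is an arbitrary illustrative value, and sample_count is assumed to count samples per channel.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:  # hypothetical stand-in for rst.audition.Phoneme
    symbol: str
    duration: int  # milliseconds


@dataclass
class SoundChunk:  # hypothetical stand-in; only the fields needed here
    sample_count: int
    rate: int = 44100  # Hz, default taken from the message definition


@dataclass
class Utterance:  # hypothetical stand-in for rst.audition.Utterance
    phonemes: List[Phoneme]
    audio: SoundChunk
    textual_representation: str


def phoneme_duration_ms(utterance: Utterance) -> int:
    """Sum of all phoneme durations in milliseconds."""
    return sum(p.duration for p in utterance.phonemes)


def audio_duration_ms(utterance: Utterance) -> float:
    """Audio length in milliseconds.

    Assumption: sample_count counts samples per channel.
    """
    return 1000.0 * utterance.audio.sample_count / utterance.audio.rate


def durations_agree(utterance: Utterance, tolerance_ms: float = 50.0) -> bool:
    """True if phoneme and audio durations differ by at most tolerance_ms."""
    difference = abs(phoneme_duration_ms(utterance) - audio_duration_ms(utterance))
    return difference <= tolerance_ms
```

Such a check is only a plausibility test; whether the two durations are expected to match closely depends on the system that produced the utterance.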
Message SoundChunk
class rst.audition.SoundChunk
Constraint:
len(.data) == 8 * .channels * .sample_count * TODO(.sample_type)
Objects of this type represent a chunk of an audio stream.
The audio information for one or more channels is stored in data as a sequence of sample_count encoded samples, the encoding of which is described by endianness and sample_type.
Depending on the sample rate (rate), such a chunk of audio corresponds to a certain amount of time during which its samples have been recorded.
Interpretation of RSB timestamps:
- create: Capture time of the audio buffer. More precisely, the timestamp should correspond to the first sample contained in the buffer.
Code author: David Klotz <dklotz@techfak.uni-bielefeld.de>
@create_collection
- data
  Type: OCTET-VECTOR
  The sequence of bytes representing the samples of this sound chunk.
  The value of this field must be interpreted according to the values of the sample_count, channels, sample_type and endianness fields.
- rate
  Type: UINT32
  Unit: hz
  The rate with which the samples stored in data have been recorded or should be played.
- sample_type
  Type: rst.audition.SoundChunk.SampleType
  The data type used for the representation of samples in data.
- endianness
  Type: rst.audition.SoundChunk.EndianNess
  The Endianness used for the representation of samples in data.
message SoundChunk {
/**
* The possible data types for representing individual samples.
*/
enum SampleType {
/**
* Signed 8-bit samples.
*/
SAMPLE_S8 = 0;
/**
* Unsigned 8-bit samples.
*/
SAMPLE_U8 = 1;
/**
* Signed 16-bit samples.
*/
SAMPLE_S16 = 2;
/**
* Unsigned 16-bit samples.
*/
SAMPLE_U16 = 4;
/**
* Signed 24-bit samples.
*/
SAMPLE_S24 = 8;
/**
* Unsigned 24-bit samples.
*/
SAMPLE_U24 = 16;
}
/**
* The possible byte-orders for representing samples.
*/
enum EndianNess {
/**
* Samples are represented with little Endian byte-order.
*/
ENDIAN_LITTLE = 0;
/**
* Samples are represented with big Endian byte-order.
*/
ENDIAN_BIG = 1;
}
/**
* The sequence of bytes representing the samples of this sound
* chunk.
*
* The value of this field must be interpreted according to the
* values of the @ref .sample_count, @ref .channels, @ref
* .sample_type and @ref .endianness fields.
*/
required bytes data = 1;
/**
* The number of samples contained in @ref .data.
*/
// @unit(number)
required uint32 sample_count = 2;
/**
* The number of channels for which samples are stored in @ref
* .data.
*/
// @unit(number)
optional uint32 channels = 3 [default = 1];
/**
* The rate with which the samples stored in @ref .data have been
* recorded or should be played.
*/
// @unit(hz)
optional uint32 rate = 4 [default = 44100];
/**
* The data type used for the representation of samples in @ref
* .data.
*/
optional SampleType sample_type = 5 [default = SAMPLE_S16];
/**
* The Endianness used for the representation of samples in @ref
* .data.
*/
optional EndianNess endianness = 6 [default = ENDIAN_LITTLE];
// TODO: interleaving type?
}
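The time span covered by a chunk, and the capture time of an individual sample, follow from sample_count, rate and the create timestamp. The following is a minimal sketch under stated assumptions: the function names are hypothetical, sample_count is assumed to count samples per channel, and the create timestamp is assumed to be given in seconds as a float.

```python
def chunk_duration(sample_count: int, rate: int = 44100) -> float:
    """Duration of a chunk in seconds.

    Assumption: sample_count counts samples per channel.
    The default rate mirrors the default in the message definition.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return sample_count / rate


def sample_time(create_time: float, index: int, rate: int = 44100) -> float:
    """Capture time of the sample at position `index` (0-based).

    The create timestamp is documented to correspond to the first sample,
    so later samples are offset by index / rate seconds.
    Assumption: create_time is expressed in seconds.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return create_time + index / rate
```

For instance, chunk_duration(22050) evaluates to 0.5 with the default rate of 44100 Hz.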
Message SampleType
class rst.audition.SoundChunk.SampleType
The possible data types for representing individual samples.
- SAMPLE_S8 = 0
  Signed 8-bit samples.
- SAMPLE_U8 = 1
  Unsigned 8-bit samples.
- SAMPLE_S16 = 2
  Signed 16-bit samples.
- SAMPLE_U16 = 4
  Unsigned 16-bit samples.
- SAMPLE_S24 = 8
  Signed 24-bit samples.
- SAMPLE_U24 = 16
  Unsigned 24-bit samples.
enum SampleType {
/**
* Signed 8-bit samples.
*/
SAMPLE_S8 = 0;
/**
* Unsigned 8-bit samples.
*/
SAMPLE_U8 = 1;
/**
* Signed 16-bit samples.
*/
SAMPLE_S16 = 2;
/**
* Unsigned 16-bit samples.
*/
SAMPLE_U16 = 4;
/**
* Signed 24-bit samples.
*/
SAMPLE_S24 = 8;
/**
* Unsigned 24-bit samples.
*/
SAMPLE_U24 = 16;
}
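When interpreting data, each sample type has to be mapped to a byte width and a signedness. The following is a minimal sketch: the Python enum is a hypothetical mirror of the protobuf enum above (not the generated class), and 24-bit samples are assumed to be packed into 3 bytes, which the message definition does not state.

```python
from enum import IntEnum
from typing import Tuple


class SampleType(IntEnum):
    """Hypothetical mirror of rst.audition.SoundChunk.SampleType."""
    SAMPLE_S8 = 0
    SAMPLE_U8 = 1
    SAMPLE_S16 = 2
    SAMPLE_U16 = 4
    SAMPLE_S24 = 8
    SAMPLE_U24 = 16


# (bytes per sample, signed); assumption: 24-bit samples occupy 3 bytes.
_FORMATS = {
    SampleType.SAMPLE_S8: (1, True),
    SampleType.SAMPLE_U8: (1, False),
    SampleType.SAMPLE_S16: (2, True),
    SampleType.SAMPLE_U16: (2, False),
    SampleType.SAMPLE_S24: (3, True),
    SampleType.SAMPLE_U24: (3, False),
}


def sample_format(sample_type: int) -> Tuple[int, bool]:
    """Return (bytes per sample, signed) for a sample type value."""
    return _FORMATS[SampleType(sample_type)]


def value_range(sample_type: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) integer value of one sample."""
    width, signed = sample_format(sample_type)
    bits = 8 * width
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1
```

Note that the enum values are not consecutive (2, 4, 8, 16), so looking the type up by value is safer than using it as an index.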
Message EndianNess
enum EndianNess {
/**
* Samples are represented with little Endian byte-order.
*/
ENDIAN_LITTLE = 0;
/**
* Samples are represented with big Endian byte-order.
*/
ENDIAN_BIG = 1;
}
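The byte order determines how the bytes of each multi-byte sample are combined into an integer. The following is a minimal sketch of decoding a raw buffer: decode_samples is a hypothetical helper, not part of the package, and it returns the samples in stored order without separating channels, because the message definition leaves the interleaving open.

```python
from typing import List


def decode_samples(data: bytes, width: int, signed: bool,
                   little_endian: bool = True) -> List[int]:
    """Decode a raw byte buffer into integer samples.

    width:         bytes per sample (e.g. 2 for 16-bit samples)
    signed:        whether samples are signed integers
    little_endian: True for ENDIAN_LITTLE, False for ENDIAN_BIG
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if len(data) % width != 0:
        raise ValueError("buffer length is not a multiple of the sample width")
    byteorder = "little" if little_endian else "big"
    return [
        int.from_bytes(data[i:i + width], byteorder, signed=signed)
        for i in range(0, len(data), width)
    ]
```

Using int.from_bytes rather than the struct module keeps the helper usable for 3-byte samples, for which struct has no format code.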
Message PhonemeCollection
class rst.audition.PhonemeCollection
Collection of Phoneme instances.
Auto-generated.
- element
  Type: array of rst.audition.Phoneme
  The individual elements of the collection.
  Constraints regarding the empty collection, sorting, duplicated entries etc. are use case specific.
message PhonemeCollection {
/**
* The individual elements of the collection.
*
* Constraints regarding the empty collection, sorting, duplicated
* entries etc. are use case specific.
*/
repeated Phoneme element = 1;
}
Message Phoneme
class rst.audition.Phoneme
Objects of this type represent a single phoneme-duration pair.
A list of elements of this type can be used to describe words or whole sentences of speech.
Code author: Simon Schulz <sschulz@techfak.uni-bielefeld.de>
@create_collection
- symbol
  Type: ASCII-STRING
  A single phone symbol (such as aI, E, C, R, _, ...).
  For examples, see https://en.wikipedia.org/wiki/Phoneme or http://www.phon.ucl.ac.uk/home/sampa/german.htm (German).
- duration
  Type: UINT32
  Unit: millisecond
  The duration of this symbol.
message Phoneme {
/**
* A single phone symbol (such as aI, E, C, R, _, ...).
*
* e.g. see https://en.wikipedia.org/wiki/Phoneme
* or http://www.phon.ucl.ac.uk/home/sampa/german.htm (german)
* examples
*/
required string symbol = 1;
/**
* The duration of this symbol.
*/
// @unit(millisecond)
required uint32 duration = 2;
}
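Because each phoneme carries only a duration, a consumer such as a lip animation has to accumulate the durations to obtain start and end times. The following is a minimal sketch: phoneme_timeline is a hypothetical helper, and phonemes are passed as plain (symbol, duration) tuples rather than generated message objects.

```python
from typing import Iterable, List, Tuple


def phoneme_timeline(
    phonemes: Iterable[Tuple[str, int]]
) -> List[Tuple[str, int, int]]:
    """Turn (symbol, duration_ms) pairs into (symbol, start_ms, end_ms).

    Phonemes are assumed to follow each other without gaps,
    in the order in which they are given.
    """
    timeline = []
    start = 0
    for symbol, duration in phonemes:
        end = start + duration
        timeline.append((symbol, start, end))
        start = end
    return timeline


if __name__ == "__main__":
    for entry in phoneme_timeline([("aI", 200), ("E", 120), ("_", 50)]):
        print(entry)
```

The end time of the last entry equals the total duration of the phoneme list in milliseconds.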