# Talk Analytics API, gRPC: TalkService.Get
- gRPC request
- GetTalkRequest
- GetTalkResponse
- Talk
- Field
- Transcription
- Phrase
- PhraseText
- Word
- PhraseStatistics
- UtteranceStatistics
- AudioSegmentBoundaries
- DescriptiveStatistics
- Quantile
- RecognitionClassifierResult
- PhraseHighlight
- RecognitionClassifierLabel
- AlgorithmMetadata
- Error
- SpeechStatistics
- SilenceStatistics
- InterruptsStatistics
- InterruptsEvaluation
- ConversationStatistics
- SpeakerStatistics
- Points
- Quiz
- TextClassifiers
- ClassificationResult
- ClassifierStatistics
- Histogram
- Summarization
- SummarizationStatement
- SummarizationField
RPC method for bulk retrieval of talks.

## gRPC request

```
rpc Get (GetTalkRequest) returns (GetTalkResponse)
```
## GetTalkRequest

```json
{
"organizationId": "string",
"spaceId": "string",
"connectionId": "string",
"projectId": "string",
"talkIds": [
"string"
],
"resultsMask": "google.protobuf.FieldMask"
}
```
Field | Description
--- | ---
organizationId | **string** ID of the organization
spaceId | **string** ID of the space
connectionId | **string** ID of the connection to search data in
projectId | **string** ID of the project to search data in
talkIds[] | **string** IDs of the talks to return. Requesting too many talks may result in a "message exceeds maximum size" error.
resultsMask | **google.protobuf.FieldMask** If not set, all types of analysis are returned.
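As a sketch of how such a request might be issued from Python using stubs generated from the service `.proto` files. The endpoint, the stub module names, and the authorization header are illustrative assumptions, not part of this reference:

```python
# Hypothetical bulk Get call. Module names, endpoint, and auth scheme are
# assumptions; substitute the ones generated for your deployment.
import grpc
from google.protobuf.field_mask_pb2 import FieldMask

import talk_service_pb2         # assumed generated messages
import talk_service_pb2_grpc    # assumed generated service stub

channel = grpc.secure_channel("api.example.com:443", grpc.ssl_channel_credentials())
stub = talk_service_pb2_grpc.TalkServiceStub(channel)

request = talk_service_pb2.GetTalkRequest(
    organization_id="<organization ID>",
    space_id="<space ID>",
    connection_id="<connection ID>",
    project_id="<project ID>",
    talk_ids=["<talk ID 1>", "<talk ID 2>"],
    # Request only transcription results; leave results_mask unset
    # to receive all types of analysis.
    results_mask=FieldMask(paths=["transcription"]),
)

response = stub.Get(request, metadata=[("authorization", "Bearer <IAM token>")])
for talk in response.talk:
    print(talk.id, len(talk.transcription.phrases))
```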
## GetTalkResponse

```json
{
"talk": [
{
"id": "string",
"organizationId": "string",
"spaceId": "string",
"connectionId": "string",
"projectIds": [
"string"
],
"createdBy": "string",
"createdAt": "google.protobuf.Timestamp",
"modifiedBy": "string",
"modifiedAt": "google.protobuf.Timestamp",
"talkFields": [
{
"name": "string",
"value": "string",
"type": "FieldType"
}
],
"transcription": {
"phrases": [
{
"channelNumber": "int64",
"startTimeMs": "int64",
"endTimeMs": "int64",
"phrase": {
"text": "string",
"language": "string",
"normalizedText": "string",
"words": [
{
"word": "string",
"startTimeMs": "int64",
"endTimeMs": "int64"
}
]
},
"statistics": {
"statistics": {
"speakerTag": "string",
"speechBoundaries": {
"startTimeMs": "int64",
"endTimeMs": "int64",
"durationSeconds": "int64"
},
"totalSpeechMs": "int64",
"speechRatio": "double",
"totalSilenceMs": "int64",
"silenceRatio": "double",
"wordsCount": "int64",
"lettersCount": "int64",
"wordsPerSecond": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
},
"lettersPerSecond": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
}
}
},
"classifiers": [
{
"startTimeMs": "int64",
"endTimeMs": "int64",
"classifier": "string",
"highlights": [
{
"text": "string",
"offset": "int64",
"count": "int64"
}
],
"labels": [
{
"label": "string",
"confidence": "double"
}
]
}
]
}
],
"algorithmsMetadata": [
{
"createdTaskDate": "google.protobuf.Timestamp",
"completedTaskDate": "google.protobuf.Timestamp",
"error": {
"code": "string",
"message": "string"
},
"traceId": "string",
"name": "string"
}
]
},
"speechStatistics": {
"totalSimultaneousSpeechDurationSeconds": "int64",
"totalSimultaneousSpeechDurationMs": "int64",
"totalSimultaneousSpeechRatio": "double",
"simultaneousSpeechDurationEstimation": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
}
},
"silenceStatistics": {
"totalSimultaneousSilenceDurationMs": "int64",
"totalSimultaneousSilenceRatio": "double",
"simultaneousSilenceDurationEstimation": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
},
"totalSimultaneousSilenceDurationSeconds": "int64"
},
"interruptsStatistics": {
"speakerInterrupts": [
{
"speakerTag": "string",
"interruptsCount": "int64",
"interruptsDurationMs": "int64",
"interrupts": [
{
"startTimeMs": "int64",
"endTimeMs": "int64",
"durationSeconds": "int64"
}
],
"interruptsDurationSeconds": "int64"
}
]
},
"conversationStatistics": {
"conversationBoundaries": {
"startTimeMs": "int64",
"endTimeMs": "int64",
"durationSeconds": "int64"
},
"speakerStatistics": [
{
"speakerTag": "string",
"completeStatistics": {
"speakerTag": "string",
"speechBoundaries": {
"startTimeMs": "int64",
"endTimeMs": "int64",
"durationSeconds": "int64"
},
"totalSpeechMs": "int64",
"speechRatio": "double",
"totalSilenceMs": "int64",
"silenceRatio": "double",
"wordsCount": "int64",
"lettersCount": "int64",
"wordsPerSecond": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
},
"lettersPerSecond": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
}
},
"wordsPerUtterance": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
},
"lettersPerUtterance": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
},
"utteranceCount": "int64",
"utteranceDurationEstimation": {
"min": "double",
"max": "double",
"mean": "double",
"std": "double",
"quantiles": [
{
"level": "double",
"value": "double"
}
]
}
}
]
},
"points": {
"quiz": [
{
"request": "string",
"response": "google.protobuf.StringValue",
"id": "string"
}
]
},
"textClassifiers": {
"classificationResult": [
{
"classifier": "string",
"classifierStatistics": [
{
"channelNumber": "google.protobuf.Int64Value",
"totalCount": "int64",
"histograms": [
{
"countValues": [
"int64"
]
}
]
}
]
}
]
},
"summarization": {
"statements": [
{
"field": {
"id": "string",
"name": "string",
"type": "SummarizationFieldType"
},
"response": [
"string"
]
}
]
}
}
]
}
```
Field | Description
--- | ---
talk[] | **Talk**

## Talk

Field | Description
--- | ---
id | **string** Talk ID
organizationId | **string**
spaceId | **string**
connectionId | **string**
projectIds[] | **string**
createdBy | **string** Audit info
createdAt | **google.protobuf.Timestamp**
modifiedBy | **string**
modifiedAt | **google.protobuf.Timestamp**
talkFields[] | **Field** Key-value representation of talk fields with values
transcription | **Transcription** Various ML analysis results
speechStatistics | **SpeechStatistics**
silenceStatistics | **SilenceStatistics**
interruptsStatistics | **InterruptsStatistics**
conversationStatistics | **ConversationStatistics**
points | **Points**
textClassifiers | **TextClassifiers**
summarization | **Summarization**
## Field

A connection field and its value.

Field | Description
--- | ---
name | **string** Name of the field
value | **string** Field value
type | enum **FieldType** Field type
## Transcription

Field | Description
--- | ---
phrases[] | **Phrase**
algorithmsMetadata[] | **AlgorithmMetadata** There might be several algorithms that work on the talk transcription, for example SpeechKit and Translator.
## Phrase

Field | Description
--- | ---
channelNumber | **int64**
startTimeMs | **int64**
endTimeMs | **int64**
phrase | **PhraseText**
statistics | **PhraseStatistics**
classifiers[] | **RecognitionClassifierResult**
## PhraseText

Field | Description
--- | ---
text | **string**
language | **string**
normalizedText | **string**
words[] | **Word**
## Word

Field | Description
--- | ---
word | **string**
startTimeMs | **int64**
endTimeMs | **int64**
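Taken together, Transcription, Phrase, PhraseText, and Word are enough to reassemble a readable transcript. A minimal sketch, assuming `talk` is a Talk message from a GetTalkResponse and that field access follows the snake_case naming of generated Python code:

```python
def format_transcript(talk) -> str:
    """Render a talk's phrases as one timestamped line per phrase."""
    lines = []
    for phrase in talk.transcription.phrases:
        start_s = phrase.start_time_ms / 1000  # phrase start, in seconds
        lines.append(f"[{start_s:7.2f}s] ch{phrase.channel_number}: {phrase.phrase.text}")
    return "\n".join(lines)
```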
## PhraseStatistics

Field | Description
--- | ---
statistics | **UtteranceStatistics**
## UtteranceStatistics

Field | Description
--- | ---
speakerTag | **string**
speechBoundaries | **AudioSegmentBoundaries** Audio segment boundaries
totalSpeechMs | **int64** Total speech duration
speechRatio | **double** Speech ratio within the audio segment
totalSilenceMs | **int64** Total silence duration
silenceRatio | **double** Silence ratio within the audio segment
wordsCount | **int64** Number of words in recognized speech
lettersCount | **int64** Number of letters in recognized speech
wordsPerSecond | **DescriptiveStatistics** Descriptive statistics for the words-per-second distribution
lettersPerSecond | **DescriptiveStatistics** Descriptive statistics for the letters-per-second distribution
## AudioSegmentBoundaries

Field | Description
--- | ---
startTimeMs | **int64** Audio segment start time
endTimeMs | **int64** Audio segment end time
durationSeconds | **int64** Duration in seconds
## DescriptiveStatistics

Field | Description
--- | ---
min | **double** Minimum observed value
max | **double** Maximum observed value
mean | **double** Estimated mean of the distribution
std | **double** Estimated standard deviation of the distribution
quantiles[] | **Quantile** List of evaluated quantiles
## Quantile

Field | Description
--- | ---
level | **double** Quantile level in the range (0, 1)
value | **double** Quantile value
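Since the server returns only the quantile levels it evaluated, reading a specific statistic (say, the median words-per-second rate) means scanning `quantiles[]` for the matching level. A small sketch, with generated-code naming assumed:

```python
def quantile_value(stats, level: float, tol: float = 1e-9):
    """Return the value for a quantile level, or None if it was not evaluated."""
    for q in stats.quantiles:
        if abs(q.level - level) < tol:
            return q.value
    return None

# e.g. median speaking rate: quantile_value(utterance_stats.words_per_second, 0.5)
```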
## RecognitionClassifierResult

Field | Description
--- | ---
startTimeMs | **int64** Start time of the audio segment used for classification
endTimeMs | **int64** End time of the audio segment used for classification
classifier | **string** Name of the triggered classifier
highlights[] | **PhraseHighlight** List of highlights, i.e. the parts of the phrase that determine the classification result
labels[] | **RecognitionClassifierLabel** Classifier predictions
## PhraseHighlight

Field | Description
--- | ---
text | **string** Text transcription of the highlighted audio segment
offset | **int64** Offset, in characters, from the beginning of the whole phrase to where the highlight begins
count | **int64** Number of characters in the highlighted text
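Given that `offset` and `count` are character positions within the whole phrase, the highlighted span can be recovered by slicing. A sketch, assuming generated-code naming:

```python
def highlight_span(phrase_text: str, highlight) -> str:
    """Slice the highlighted characters out of the full phrase text."""
    return phrase_text[highlight.offset : highlight.offset + highlight.count]
```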
## RecognitionClassifierLabel

Field | Description
--- | ---
label | **string** Label of the class predicted by the classifier
confidence | **double** Prediction confidence
## AlgorithmMetadata

Field | Description
--- | ---
createdTaskDate | **google.protobuf.Timestamp**
completedTaskDate | **google.protobuf.Timestamp**
error | **Error**
traceId | **string**
name | **string**
## Error

Field | Description
--- | ---
code | **string**
message | **string**
## SpeechStatistics

Field | Description
--- | ---
totalSimultaneousSpeechDurationSeconds | **int64** Total simultaneous speech duration in seconds
totalSimultaneousSpeechDurationMs | **int64** Total simultaneous speech duration in ms
totalSimultaneousSpeechRatio | **double** Simultaneous speech ratio within the audio segment
simultaneousSpeechDurationEstimation | **DescriptiveStatistics** Descriptive statistics for the simultaneous speech duration distribution
## SilenceStatistics

Field | Description
--- | ---
totalSimultaneousSilenceDurationMs | **int64** Total simultaneous silence duration in ms
totalSimultaneousSilenceRatio | **double** Simultaneous silence ratio within the audio segment
simultaneousSilenceDurationEstimation | **DescriptiveStatistics** Descriptive statistics for the simultaneous silence duration distribution
totalSimultaneousSilenceDurationSeconds | **int64** Total simultaneous silence duration in seconds
## InterruptsStatistics

Field | Description
--- | ---
speakerInterrupts[] | **InterruptsEvaluation** Interrupt descriptions for every speaker
## InterruptsEvaluation

Field | Description
--- | ---
speakerTag | **string** Speaker tag
interruptsCount | **int64** Number of interrupts made by the speaker
interruptsDurationMs | **int64** Total duration of all interrupts in ms
interrupts[] | **AudioSegmentBoundaries** Boundaries for every interrupt
interruptsDurationSeconds | **int64** Total duration of all interrupts in seconds
## ConversationStatistics

Field | Description
--- | ---
conversationBoundaries | **AudioSegmentBoundaries** Audio segment boundaries
speakerStatistics[] | **SpeakerStatistics** Average statistics for each speaker
## SpeakerStatistics

Field | Description
--- | ---
speakerTag | **string** Speaker tag
completeStatistics | **UtteranceStatistics** Analysis of all phrases in the format of a single utterance
wordsPerUtterance | **DescriptiveStatistics** Descriptive statistics for the words-per-utterance distribution
lettersPerUtterance | **DescriptiveStatistics** Descriptive statistics for the letters-per-utterance distribution
utteranceCount | **int64** Number of utterances
utteranceDurationEstimation | **DescriptiveStatistics** Descriptive statistics for the utterance duration distribution
## Points

Field | Description
--- | ---
quiz[] | **Quiz**
## Quiz

Field | Description
--- | ---
request | **string**
response | **google.protobuf.StringValue**
id | **string**
## TextClassifiers

Field | Description
--- | ---
classificationResult[] | **ClassificationResult**
## ClassificationResult

Field | Description
--- | ---
classifier | **string** Classifier name
classifierStatistics[] | **ClassifierStatistics** Classifier statistics
## ClassifierStatistics

Field | Description
--- | ---
channelNumber | **google.protobuf.Int64Value** Channel number; null for the whole talk
totalCount | **int64** Total count for the classifier
histograms[] | **Histogram** Various histograms built on top of the classifier results
## Histogram

Field | Description
--- | ---
countValues[] | **int64** Histogram count values
## Summarization

Field | Description
--- | ---
statements[] | **SummarizationStatement**
## SummarizationStatement

Field | Description
--- | ---
field | **SummarizationField**
response[] | **string**
## SummarizationField

Field | Description
--- | ---
id | **string**
name | **string**
type | enum **SummarizationFieldType**