violet_rails
transcript parser plugin
given a JSON path that points to a string value (called the input), this plugin will scan the associated API Resources of the API Namespace and output the parsed value to a different attribute (pointed at by another JSON path)
properties
input_string_path: "api_namespace_slug.some_property.another_property"
output_string_path: "api_namespace_slug.some_property.a_different_property"
it should raise an error if the input path and output path are the same (indicates overwriting)
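A minimal Ruby sketch of that check, together with one way a dotted path could be resolved. The helper names `validate_paths!` and `read_path` are hypothetical, and the sketch assumes the first path segment is the API namespace slug while the remaining segments are keys inside a resource's properties hash:

```ruby
# Hypothetical helper: rejects missing or identical paths before any parsing runs.
def validate_paths!(input_path, output_path)
  if input_path.to_s.strip.empty? || output_path.to_s.strip.empty?
    raise ArgumentError, "input_string_path and output_string_path are both required"
  end

  if input_path == output_path
    raise ArgumentError, "input_string_path and output_string_path must differ (the output would overwrite the input)"
  end

  true
end

# Hypothetical helper: assumes the first segment is the namespace slug and the
# rest are string keys nested inside the resource's properties hash.
def read_path(properties, path)
  _namespace_slug, *keys = path.split(".")
  properties.dig(*keys)
end

properties = { "some_property" => { "another_property" => "raw text" } }
validate_paths!(
  "api_namespace_slug.some_property.another_property",
  "api_namespace_slug.some_property.a_different_property"
)
value = read_path(properties, "api_namespace_slug.some_property.another_property")
```

Raising before any resource is touched keeps a misconfigured plugin from partially overwriting data.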
given a transcript like:
1
00:00:00,300 --> 00:00:02,167
[sally]: i'm in los angeles and
2
00:00:01,920 --> 00:00:02,102
[bobby]: okay
3
00:00:03,151 --> 00:00:04,737
[sally]: maybe possibly
4
00:00:05,780 --> 00:00:13,613
[bobby]: oh maybe ye so nice an is a
supporter financial supporter of hollows
5
00:00:13,345 --> 00:00:15,510
[sally]: s yeah
output a text corpus without segments, timestamps and [names]:
i'm in los angeles and okay maybe possibly oh maybe ye so nice an is a supporter financial supporter of hollows s yeah
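One possible way to do this stripping, sketched in plain Ruby. The method name `strip_transcript` and the three regex constants are hypothetical, and the sketch assumes the SRT-like layout shown above (a numeric segment index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then one or more text lines that may start with `[name]:`). Remaining text lines are joined with single spaces:

```ruby
# Assumed line shapes, based only on the sample transcript in this issue.
TIMESTAMP_LINE     = /\A\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}\z/
SEGMENT_INDEX_LINE = /\A\d+\z/
SPEAKER_PREFIX     = /\A\[[^\]]+\]:\s*/

# Hypothetical helper: returns the spoken text only, as one space-joined string.
# Caveat: a spoken line consisting solely of digits would be treated as a
# segment index and dropped.
def strip_transcript(raw)
  raw.each_line
     .map(&:strip)
     .reject { |line| line.empty? || line.match?(SEGMENT_INDEX_LINE) || line.match?(TIMESTAMP_LINE) }
     .map { |line| line.sub(SPEAKER_PREFIX, "") }
     .join(" ")
end

sample = <<~SRT
  1
  00:00:00,300 --> 00:00:02,167
  [sally]: i'm in los angeles and
  2
  00:00:01,920 --> 00:00:02,102
  [bobby]: okay
SRT

puts strip_transcript(sample)
```

Working line by line rather than with one large regex keeps multi-line segments (such as segment 4 above) intact without special handling.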
Do the metadata properties have to be JSON paths? Since this plugin will run against the API namespace that it's connected to, we can just specify the names of the input and output properties in the metadata.
metadata: {
INPUT_STRING_PROPERTY: "raw_transcript",
OUTPUT_STRING_PROPERTY: "transcript"
}
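A rough sketch of how the plugin could consume property-name metadata like this. Everything here is an assumption for illustration: `parse_resources` is a hypothetical method, API resources are stood in for by plain hashes of properties with string keys, and the parsing step is passed in as a block rather than tied to a specific parser:

```ruby
# Hypothetical runner: reads the input property from each resource, applies the
# given block, and stores the result under the output property.
def parse_resources(resources, metadata)
  input_key  = metadata.fetch("INPUT_STRING_PROPERTY")
  output_key = metadata.fetch("OUTPUT_STRING_PROPERTY")

  # Same rule as for JSON paths: identical names would overwrite the input.
  raise ArgumentError, "input and output properties must differ" if input_key == output_key

  resources.each do |properties|
    raw = properties[input_key]
    next unless raw.is_a?(String) # skip resources without a string input

    properties[output_key] = yield(raw)
  end
end

metadata = {
  "INPUT_STRING_PROPERTY"  => "raw_transcript",
  "OUTPUT_STRING_PROPERTY" => "transcript"
}

# Plain hashes standing in for API resource properties.
resources = [
  { "raw_transcript" => "[sally]: hello" },
  { "unrelated" => 1 }
]

parse_resources(resources, metadata) { |raw| raw.sub(/\A\[[^\]]+\]:\s*/, "") }
```

With property names instead of JSON paths, nested attributes cannot be addressed, which may or may not matter for the namespaces this plugin targets.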
Do we need to add a boolean property to API resources to check if parsing is required?