transporter
Changing Namespaces doesn't work
I'm following and adapting the instructions from the namespaces post (here), specifically the part about changing the namespace of the transformed messages.
My pipeline.js file looks like the snippet below:
var source = mongodb({
  "uri": "mongodb://localhost:3001/meteor"
});
var sink = elasticsearch({
  "uri": "http://user:pass@host:port"
});
t.Source("source", source, "/^(files)$/")
  .Transform("sort", js({filename: "sort-files.js"}), "/.*/")
  .Save("sink", sink, "/.*/");
And the sort-files.js transformer sends each document to a different namespace according to some field, or discards it if it is not interesting.
function transform(msg) {
  if (msg.data.kind === "COVER") {
    msg.ns = "forms-covers.data";
  } else {
    msg.op = "skip";
  }
  return msg;
}
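For reference, the transformer can be sanity-checked on its own in plain Node.js, outside Transporter. This is only a sketch: the two sample messages (their `ns`, `op` and `kind` values) are made up for illustration, and real messages carry whatever Transporter puts in them.

```javascript
// Same transformer as above, run outside Transporter for a quick check.
function transform(msg) {
  if (msg.data.kind === "COVER") {
    msg.ns = "forms-covers.data"; // route covers to their own namespace
  } else {
    msg.op = "skip"; // mark everything else as not interesting
  }
  return msg;
}

// Hypothetical sample messages, not real Transporter output.
const cover = transform({ ns: "files", op: "insert", data: { kind: "COVER" } });
const other = transform({ ns: "files", op: "insert", data: { kind: "AVATAR" } });

console.log(cover.ns, cover.op); // forms-covers.data insert
console.log(other.ns, other.op); // files skip
```

So the function itself does what is intended; the surprise is in how the sink interprets the new namespace.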
System info:
- Transporter version: 0.4.0-rc.1-linux-amd64
- OS: Debian 9
- DB version(s):
  - mongodb 3.2.6
  - elasticsearch 5.2.2
Reproducible Steps:
- transporter run
What did you expect to happen?
According to the blog post, I expected to be able to rename each document namespace to one that follows the pattern "
What actually happened?
Transporter created an index named test and used the namespace of the documents as the type inside that index.
curl http://user:pass@host:port/test
{"test":{"aliases":{},"mappings":{"forms-covers.data":{...
How can I split a collection from the MongoDB source into different indices of the ES sink?
That article refers to an earlier version of Transporter, specifically 0.1.0. Namespace handling has changed since then, in version 0.3.0.
The way the ES sink handles incoming namespaces and chooses the index has also changed. To set the index, include it in the URI of the ES sink. The "test" index is created if no index is specified in the URI.
The namespace will be used as the type.
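A rough illustration of that rule in plain JavaScript, not Transporter's actual code. It assumes the index is given as the first path segment of the sink URI; the helper name `indexFromUri` and the `localhost:9200` URIs are made up for the example.

```javascript
// Hypothetical helper: pick the index name out of an ES sink URI,
// falling back to "test" when the URI has no path.
function indexFromUri(uri) {
  const segments = new URL(uri).pathname.split("/").filter(Boolean);
  return segments.length > 0 ? segments[0] : "test";
}

console.log(indexFromUri("http://localhost:9200/forms-covers")); // forms-covers
console.log(indexFromUri("http://localhost:9200")); // test
```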
To have two indices, create two sinks, place both in the pipeline and then set the namespace so it matches only one of them.
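The idea can be sketched in plain JavaScript as a simulation of the matching, not as Transporter's API. The sink names, index URIs and namespace values below are hypothetical, and the filters are ordinary regular expressions standing in for the namespace filter given to each sink.

```javascript
// Hypothetical sinks: each one has its own index in the URI
// and a namespace filter that should match only its documents.
const sinks = [
  { name: "covers", uri: "http://localhost:9200/forms-covers", filter: /^covers$/ },
  { name: "pages", uri: "http://localhost:9200/forms-pages", filter: /^pages$/ },
];

// Return the names of the sinks whose filter matches the message namespace.
function matchingSinks(msg) {
  return sinks.filter((s) => s.filter.test(msg.ns)).map((s) => s.name);
}

console.log(matchingSinks({ ns: "covers" })); // [ 'covers' ]
console.log(matchingSinks({ ns: "pages" })); // [ 'pages' ]
console.log(matchingSinks({ ns: "other" })); // []
```

Since the namespace is also used as the type, routing this way means each index ends up with a differently named type.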
That's a pity, it seems I cannot implement my use case easily then.
The problem is that the type name is the same for all indices. Thus, assuming it's possible to have predefined sinks, there's no way to filter the documents on save because all of them have the same namespace.
Here's a diagram of what I'm trying to do
I have a similar scenario: MongoDB (source) collections -> Elasticsearch (sink) indices,
much like mongo-connector's "namespace_mapping" support: index_name.* => *.type_name
I think we might support this with some chained transform functions. I'll see what I can come up with and get back to you soon.
Ok, thanks! Sounds great. I'm currently running multiple instances of Transporter to implement the use case, so it's not a big issue; at least it has been working quite well for a few days.
@diegonc after a fair amount of testing, this is not currently possible, but it should be. I've labeled it as a bug and will hopefully be able to fit it into the 0.5.0 release.