Docs - Multiple Classes in Jar, Custom Encoder, Package Class, Resubmit Conf, Debug and Absolute name of artifact & function.
- How to add multiple functions from the same jar. The packaged jar has multiple classes. How to write a conf for multiple functions pointing to the respective classes.
- How to write a custom encoder for a case class that is used in MistFn[CaseClass].
// Sample case class
case class CorrelationMatrix(headers: Array[String], values: Array[Array[Double]])
object CorrelationMatrix extends MistFn[CorrelationMatrix] {
...
}
- How to add a class which has a package in class-name. The class has a package like io.hydrosphere. Adding class-name = "io.hydrosphere.CorrelationMatrix$" doesn't work.
- How to re-submit the function after code changes.
After submitting the conf, there are a few more code changes. If we submit the conf again: Error: Artifact key xxx.jar has to be unique. How to overwrite the artifact without manually deleting the data/artifacts/xxx.jar and data/functions/yyy.conf.
- How to debug the Spark job code.
- How to prevent the current user from getting prefixed to the artifact and function names.
Thanks for the questions, they will help us improve our documentation. For a start, I'll try to answer here:
- Multiple functions: If your question was about mist-cli configuration, then you just need to create a conf file that points to the class-name of each function you want to deploy (there is also a Scala sketch of the jar side after this list). For example, for two functions A and B there should be two files:
  a.conf:
    model = Function
    name = a
    data {
      path = my_jar_0.0.1.jar
      class-name = "A$"
      context = default
    }
  b.conf:
    model = Function
    name = b
    data {
      path = my_jar_0.0.1.jar
      class-name = "B$"
      context = default
    }
- Custom encoders: We are going to add encoder derivation for case classes in future releases, so currently there is no other way except to write it manually:
  import mist.api._
  import mist.api.Encoder
  import mist.api.data._

  case class MyResponse(x: Int, y: String)

  object MyResponse {
    implicit val myResponseEncoder = new Encoder[MyResponse] {
      override def apply(rsp: MyResponse): JsLikeData =
        JsLikeMap("x" -> JsLikeNumber(rsp.x), "y" -> JsLikeString(rsp.y))
    }
  }

  object MyFn extends MistFn[MyResponse] {
    ..
  }
- Package: I can't reproduce that problem. Are you sure that the package you specified is correct and exists in the jar?
- Updating artifact: You can use mist-cli apply -f conf --validate true. But keep in mind that this action can affect in-progress functions. Also, there is an issue about artifact refreshing on workers (#437), so if you use the shared context type you need to manually stop the worker to apply changes, or use exclusive.
- Every job has logs - you can use them for debugging. In RC14 we improved them - now mist collects logs from Spark too. There is also a withMistExtras directive to obtain a logger inside the function body (see the sketch after this list).
- Passing an empty -u argument should work: mist-cli apply -f conf -u ''. I think we should reconsider the default behavior of building names in mist-cli. @blvp, do you have any thoughts?
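To make the multiple-functions answer above concrete, the jar side is simply several MistFn objects compiled into one artifact; the trailing $ in class-name refers to the compiled Scala object. A minimal skeleton (the package name com.example and the result types are made up for illustration; the handle bodies are elided):

// compiled into my_jar_0.0.1.jar
// if the objects live in a package, include it in class-name, e.g. class-name = "com.example.A$"
package com.example

import mist.api._

object A extends MistFn[Int] {
  // ... define the function here, e.g. withArgs(...).onSparkContext(...)
}

object B extends MistFn[String] {
  // ... second function, deployed separately via b.conf
}

And a rough sketch of the withMistExtras directive mentioned above, which exposes job metadata and a logger inside the function body. The exact imports, combinator shapes and MistExtras fields below are assumptions based on the RC-era mist-lib API and may differ between releases:

import mist.api._
import org.apache.spark.SparkContext

object LoggedFn extends MistFn[Long] {
  override def handle = {
    withArgs(arg[Int]("n"))
      .withMistExtras
      .onSparkContext((n: Int, extras: MistExtras, sc: SparkContext) => {
        // messages written via extras.logger end up in the job logs collected by mist
        extras.logger.info(s"counting up to $n in job ${extras.jobId}")
        sc.parallelize(1 to n).count()
      })
  }
}

An Encoder for the result type must be in implicit scope, as in the custom-encoder example above; mist-lib provides defaults for primitive types.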
Also, we have a Gitter room for questions.
@dos65, Thank you for the explanation. Really appreciate your time and consideration.
Package class works. The artifact wasn't refreshed when I added the package to the class. On restarting mist-master, it worked.
I was not able to get the artifact update to work. Using mist-1.0.0-RC13.
If I run mist-cli apply -f conf --validate true -u '', I get the error: Artifact key xxx.jar has to be unique.
If I run mist-cli apply -f conf/correlation-matrix.conf --validate true -u '', I get the error: Error: 400 Client Error: Bad Request for url: http://localhost:2004/v2/api/functions?force=False: class java.lang.IllegalStateException: Endpoint correlation-matrix already exists
With respect to debugging, I'm looking for a way to put a breakpoint in the code and debug, similar to this.
The last error with the function update was fixed in a new version of mist-cli; try to update it with the following command: pip install mist-cli --upgrade
After upgrading:
mist-cli apply -f conf/correlation-matrix.conf --validate true -u '' - works.
mist-cli apply -f conf --validate true -u '' - getting the same error message: Artifact key xxx.jar has to be unique
@gowravshekar About debugging - unfortunately, there is a bug with constructing the spark-submit command (#472), so currently it's impossible to pass driver-java-options correctly. If you really need it, you can implement a manual runner and add the following argument to spark-submit: --driver-java-options '-Xdebug -Xrunjdwp:transport=dt_socket,address=15000,server=y,suspend=y'
mist-cli apply -f conf --validate true -u '' - getting the same error message: Artifact key xxx.jar has to be unique
This is normal behavior, because updating the jar in place can break all functions that use it.
If you want to update a jar with validation enabled, you should change the version config value and then change it in the function. The reason behind this is that apply is used for both development and release, and this limitation is part of our vision of the release process.
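As a hedged illustration of that flow, reusing the artifact.conf / function.conf shapes shown elsewhere in this thread (the function name, class-name and the 0.0.2 version below are only examples), a manual version bump could look like:

artifact.conf
model = Artifact
name = test-artifact
version = 0.0.2
data.file-path = "./path/to/artifact.jar"

function.conf
model = Function
name = my-fn
data {
  path = test-artifact_0.0.2.jar
  class-name = "com.example.A$"
  context = default
}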
Some additional notes.
You can use environment variables to manage the artifact version. For example: artifact.conf
model = Artifact
name = test-artifact
version = ${ARTIFACT_VERSION}
data.file-path = "./path/to/artifact.jar"
function.conf
model = Function
data {
...
path = test-artifact_${ARTIFACT_VERSION}.jar
...
}
and then ARTIFACT_VERSION=0.0.1 mist-cli apply -f conf/
Oh, my mistake - --validate false instead of --validate true for an unsafe update.
@gowravshekar About debugging - unfortunately, there is a bug with constructing the spark-submit command (#472), so currently it's impossible to pass driver-java-options correctly. If you really need it, you can implement a manual runner and add the following argument to spark-submit: --driver-java-options '-Xdebug -Xrunjdwp:transport=dt_socket,address=15000,server=y,suspend=y'
Does this bug still exist? Is there a way now to debug the spark job?
@apoorv22 this one is fixed, you can use these options to debug the Spark job. Also, you need to be aware of the following things:
- your context should have the precreated=true and maxParallelJobs=1 settings (a hedged conf sketch follows below). Otherwise, it will be problematic to start several workers and connect a debugger to the desired process.
- breakpoints should suspend the current thread only, not the whole VM. When mist-master loses heartbeats from the worker process, it marks it as failed. For example, by default IntelliJ sets breakpoints that suspend the VM fully.
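A hedged sketch of a mist-cli context conf for that debug setup (the context name debug-ctx is made up, and the key spelling for these settings may differ between releases - e.g. max-parallel-jobs in HOCON files vs maxParallelJobs via the HTTP API - so check it against your mist version):

model = Context
name = debug-ctx
data {
  precreated = true
  max-parallel-jobs = 1
}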
@blvp, is there a way to use an environment variable or config value in data.file-path in artifact.conf?
Something similar to the below: data.file-path = "./path/to/artifact_${ARTIFACT_VERSION}.jar"
Yes, you can use an environment variable here in a similar manner: data.file-path="simple-name"${VERSION}".jar"