DataCleaner
Using external Groovy scripts and classes from Java jar files in Groovy advanced scripts
Is it possible to set the classpath in a Groovy advanced script to allow import and use of classes in jar files? I have tried adding the jar to DATACLEANER_JAVA_OPTS in datacleaner.cmd using -cp, but this seems to have no effect.
Also, is there any way to set the classpath in a DataCleaner Groovy script so that it can load and use classes and methods from another Groovy script? Thanks in advance, Tom
Hi Tom,
I don't think we have a ready-to-use fix for this, but building up the classpath would indeed be the right way to go about this. Currently the sh/cmd scripts use the java -jar command line option, which unfortunately is incompatible with the -cp parameter that you're talking about: when -jar is used, the JVM ignores other classpath settings. But maybe an alternative is for you to redefine the script. Something like this should be doable:
java -cp DataCleaner.jar:lib/*.jar org.datacleaner.Main
I'm just not sure that the wildcard lib/*.jar entry works ...
If you get this to work, please share the fix - we may want to simply change the official sh/cmd files included with DC.
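For anyone trying this out, a quick way to see which classpath the JVM actually ended up with is to print the java.class.path system property. The sketch below is only an illustration: the class name ShowClassPath is made up for this example and is not part of DataCleaner. When the JVM is started with -jar, this property normally holds just the JAR file itself.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of DataCleaner: lists the classpath entries the JVM was started with.
public class ShowClassPath {

    // Splits the java.class.path system property on the platform separator (";" on Windows, ":" elsewhere).
    public static List<String> entries() {
        String classPath = System.getProperty("java.class.path", "");
        return Arrays.asList(classPath.split(File.pathSeparator));
    }

    public static void main(String[] args) {
        for (String entry : entries()) {
            System.out.println("- " + entry);
        }
    }
}
```

Starting it with the same -cp value as the launcher script shows whether the entries were passed through as intended.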
I seem to remember that we changed it to JAR files (from classpath based) because it solved a problem, but I do not remember why.
Signed execution, maybe?
Oh, I don't recall that. But maybe. Or maybe because it allows for the main class name to be specified in the manifest rather than having to include it on the command line?
That could also very well be.
Thanks for the quick reply Kasper. I have been able to access code in both external JARs and classes by using your suggested fix as follows:
- Place ORMLITE (or any) custom jars in %DATACLEANER_HOME%\local-lib
- Place compiled Groovy script StrUtil in %DATACLEANER_HOME%\local-scripts\src
- Change the startup command in datacleaner.cmd to
call java %DATACLEANER_JAVA_OPTS% -cp "DataCleaner.jar;local-lib/*;local-scripts/src" org.datacleaner.Main
Then import and use the classes in the DC advanced Groovy script:
import com.j256.ormlite.support.ConnectionSource
import com.j256.ormlite.jdbc.JdbcConnectionSource
import StrUtil

class Transformer {
    ConnectionSource consrc
    // Uncomment only if loading the uncompiled Groovy script (see comments in initialize() below).
    // Left commented out here because a field named StrUtil can shadow the imported StrUtil class.
    // def StrUtil = null

    // utility method to print this script's working classpath to the DC console
    void printClassPath(classLoader) {
        println "$classLoader"
        // getURLs() exists only on URLClassLoader; other class loaders are printed by name only
        if (classLoader instanceof URLClassLoader) {
            classLoader.getURLs().each { url ->
                println "- ${url.toString()}"
            }
        }
        if (classLoader.parent) {
            printClassPath(classLoader.parent)
        }
    }

    void initialize() {
        // alternative way to load uncompiled Groovy script (without import), not required if script is compiled
        //File sourceFile = new File("local-scripts\\src\\StrUtil.groovy");
        //StrUtil = new GroovyClassLoader(getClass().getClassLoader()).parseClass(sourceFile);
        // don't need instance as all StrUtil methods are static
        //GroovyObject strutil = (GroovyObject) StrUtil.newInstance();

        // alternative way to add external jar to classpath (without import) and instantiate a class within:
        // this.class.classLoader.addURL(new URL("file:///D:/opt/pkg/DataCleaner-5.7.0/local-lib/ormlite-core-5.1.jar"))
        // this.class.classLoader.addURL(new URL("file:///D:/opt/pkg/DataCleaner-5.7.0/local-lib/ormlite-jdbc-5.1.jar"))
        //consrc = this.class.classLoader.loadClass("com.j256.ormlite.support.ConnectionSource")

        // print script's current classpath (for debugging class access issues only)
        printClassPath this.class.classLoader

        // create new class from jar
        consrc = new JdbcConnectionSource()
    }

    // use classes and instances in script main body
    void transform(map, outputCollector) {
        def fname = map.licenseeFirstName
        def validFname = "false"
        def cleanFname = null
        if (!StrUtil.isEmpty(fname)) {
            validFname = "true"
            if (StrUtil.isAlphanum(fname))
                cleanFname = StrUtil.capAllWords(fname, [" ", "'", "&", "-", "\\("])
            else
                cleanFname = fname.trim()
        }
        // etc..., e.g. consrc.someMethod()
    }
}
In regard to specifying classpaths with -cp, please note the following (from Oracle's Java documentation on setting the class path):
Class path entries can contain the basename wildcard character *, which is considered equivalent to specifying a list of all the files in the directory with the extension .jar or .JAR. For example, the class path entry foo/* specifies all JAR files in the directory named foo. A classpath entry consisting simply of * expands to a list of all the jar files in the current directory.
A class path entry that contains * will not match class files. To match both classes and JAR files in a single directory foo, use either foo;foo/* or foo/*;foo. The order chosen determines whether the classes and resources in foo are loaded before JAR files in foo, or vice versa.
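To make the wildcard rule quoted above concrete, here is a rough Java sketch of how an entry such as local-lib/* is expanded. It is an approximation written for illustration, not the launcher's real code; ClasspathWildcard, expandEntry and expand are hypothetical names, and the sorting is added only so the output is predictable.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the documented wildcard rule; not the JVM launcher's actual implementation.
public class ClasspathWildcard {

    // "dir/*" (or "*") becomes every .jar/.JAR file directly inside the directory;
    // any other entry (a directory of classes, a single jar) is kept unchanged.
    public static List<String> expandEntry(String entry) {
        List<String> result = new ArrayList<>();
        boolean wildcard = entry.equals("*")
                || entry.endsWith("/*")
                || entry.endsWith(File.separator + "*");
        if (!wildcard) {
            result.add(entry);
            return result;
        }
        String dirName = entry.equals("*") ? "." : entry.substring(0, entry.length() - 2);
        File[] files = new File(dirName).listFiles();
        if (files != null) {
            Arrays.sort(files); // assumption for this sketch: sorted for predictable output
            for (File file : files) {
                String name = file.getName();
                // class files and subdirectories are never matched by the wildcard
                if (file.isFile() && (name.endsWith(".jar") || name.endsWith(".JAR"))) {
                    result.add(file.getPath());
                }
            }
        }
        return result;
    }

    // Expands every entry of a classpath string, preserving the order of the entries.
    public static List<String> expand(String classpath) {
        List<String> result = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            result.addAll(expandEntry(entry));
        }
        return result;
    }

    public static void main(String[] args) {
        String classpath = args.length > 0 ? args[0] : System.getProperty("java.class.path", "");
        for (String entry : expand(classpath)) {
            System.out.println(entry);
        }
    }
}
```

This also shows why local-scripts/src is listed as a plain directory in the command above: compiled classes are only found through a directory entry, never through a wildcard.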
I'm amazed that you didn't have to put the lib directory in that classpath. But maybe because the manifest of DataCleaner.jar references all those files in lib, the -cp option is clever enough to include them too. It didn't use to be like that with "your grandfather's java", but I'm glad if it improved.
I think we'll need to test this on different OS and Java versions, but I like the idea of updating the sh/cmd files to allow easier inclusion of custom JAR files.
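One way to check the guess above about the manifest is to read the Class-Path attribute of the JAR directly. The sketch below assumes nothing about what DataCleaner.jar actually declares; ManifestClassPath and classPathEntries are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper: prints the Class-Path entries declared in a JAR's manifest, if any.
public class ManifestClassPath {

    // Returns the space-separated Class-Path entries from the main manifest section,
    // or an empty array when the JAR has no manifest or no Class-Path attribute.
    public static String[] classPathEntries(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return new String[0];
            }
            String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
            if (value == null || value.trim().isEmpty()) {
                return new String[0];
            }
            return value.trim().split("\\s+");
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ManifestClassPath <path-to-jar>");
            return;
        }
        for (String entry : classPathEntries(args[0])) {
            System.out.println(entry);
        }
    }
}
```

If the output lists the files under lib, that would explain why only DataCleaner.jar and the custom directories had to be named on -cp.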