Using external Groovy scripts and classes from Java jar files in Groovy advanced scripts

Open tjr50 opened this issue 5 years ago • 6 comments

Is it possible to set the classpath in a Groovy advanced script to allow importing and using classes from JAR files? I have tried adding the JAR to DATACLEANER_JAVA_OPTS in datacleaner.cmd using -cp, but this seems to have no effect.

Also, is there any way to set the classpath in a DataCleaner Groovy script so that it can load and use classes and methods from another Groovy script? Thanks in advance, Tom

tjr50 avatar May 01 '19 05:05 tjr50

Hi Tom,

I don't think we have a ready-to-use fix for this, but building up the classpath would indeed be the right way to go about this. Currently the sh/cmd scripts use the java -jar command-line option, which unfortunately is incompatible with the -cp parameter that you're talking about: with -jar, the classpath comes from the JAR's manifest and -cp is ignored. But maybe an alternative is for you to redefine the script. Something like this should be doable:

java -cp DataCleaner.jar:lib/*.jar org.datacleaner.Main

I'm just not sure that the wildcard lib/*.jar entry works ...

If you get this to work, please share the fix - we may want to simply change the official sh/cmd files included with DC.
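To check whether a changed startup command actually took effect, one option is to print the classpath the JVM ended up with. The following is a minimal sketch, not part of DataCleaner: the class name ShowClasspath is hypothetical, and it only reads the standard java.class.path system property and splits it on the platform separator (":" on Linux/macOS, ";" on Windows).

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic class (not shipped with DataCleaner).
class ShowClasspath {

    // Splits a classpath string on the platform's path separator.
    static List<String> entries(String classPath) {
        return Arrays.asList(classPath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the JVM was started with.
        for (String entry : entries(System.getProperty("java.class.path"))) {
            System.out.println(entry);
        }
    }
}
```

Running it once with java -jar and once with java -cp should show the difference between the two launch modes: with -jar, any -cp value passed on the command line does not appear in the output.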

kaspersorensen avatar May 02 '19 03:05 kaspersorensen

I seem to remember that we changed it to JAR files (from classpath based) because it solved a problem, but I do not remember why.

Signed execution, maybe?

LosD avatar May 02 '19 06:05 LosD

Oh, I don't recall that. But maybe. Or maybe because it allows for the main class name to be specified in the manifest rather than having to include it on the command line?

kaspersorensen avatar May 02 '19 21:05 kaspersorensen

That could also very well be.

LosD avatar May 02 '19 21:05 LosD

Thanks for the quick reply Kasper. I have been able to access code in both external JARs and classes by using your suggested fix as follows:

  • Place ORMLITE (or any) custom jars in %DATACLEANER_HOME%\local-lib
  • Place compiled Groovy script StrUtil in %DATACLEANER_HOME%\local-scripts\src
  • Change the startup command in datacleaner.cmd to call java %DATACLEANER_JAVA_OPTS% -cp "DataCleaner.jar;local-lib/*;local-scripts/src" org.datacleaner.Main

then import and use classes in DC advanced Groovy script:

import com.j256.ormlite.support.ConnectionSource
import com.j256.ormlite.jdbc.JdbcConnectionSource
import StrUtil

class Transformer {

    ConnectionSource consrc 
	def StrUtil = null  // needed only if loading uncompiled Groovy script (see comments below)
	
	// utility method to print this scripts working classpath to DC console
	void printClassPath(classLoader) {
        println "$classLoader"
        classLoader.getURLs().each {url->
            println "- ${url.toString()}"
        }
		if (classLoader.parent) {
            printClassPath(classLoader.parent)
        }
    }
	
	void initialize() {
		// alternative way to load uncompiled Groovy script (without import), not required if script is compiled
		//File sourceFile = new File("local-scripts\\src\\StrUtil.groovy");
		//StrUtil = new GroovyClassLoader(getClass().getClassLoader()).parseClass(sourceFile);
		// don't need instance as all StrUtil methods are static
		//GroovyObject strutil = (GroovyObject) StrUtil.newInstance();

		// alternative way to add external jar to classpath (without import) and instantiate a class within:
        // this.class.classLoader.addURL(new URL("file:///D:/opt/pkg/DataCleaner-5.7.0/local-lib/ormlite-core-5.1.jar"))
        //  this.class.classLoader.addURL(new URL("file:///D:/opt/pkg/DataCleaner-5.7.0/local-lib/ormlite-jdbc-5.1.jar"))
        //consrc = this.class.classLoader.loadClass("com.j256.ormlite.support.ConnectionSource")

		// print script's current classpath (for debugging class access issues only)
        printClassPath this.class.classLoader
		// create new class from jar
        consrc = new JdbcConnectionSource()
    } 
	
	// use classes and instances in script main body
	void transform(map, outputCollector) {
		
		def fname = map.licenseeFirstName;
		def validFname = "false";
		def cleanFname = null;

        if (! StrUtil.isEmpty(fname)){
            validFname = "true"
            if (StrUtil.isAlphanum(fname))
				cleanFname = StrUtil.capAllWords(fname,[" ","'","&","-","\\("])
            else
              	cleanFname = fname.trim()
        }
		// etc..., e.g. consrc.someMethod()
	}
}
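For readers who prefer not to touch the startup script, the commented-out addURL approach above can also be sketched in plain Java with a child URLClassLoader. This is an illustrative sketch under assumptions, not DataCleaner code: the class LocalLibLoader and its method names are hypothetical, and "local-lib" is simply the directory name used in this thread. Note that on Java 9 and later the application class loader is no longer a URLClassLoader, so the sketch checks the type before calling getURLs().

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of DataCleaner).
class LocalLibLoader {

    // Builds a class loader over every .jar file directly inside libDir.
    // Classes are looked up in the parent loader first, then in these JARs.
    static URLClassLoader forDirectory(File libDir, ClassLoader parent) throws MalformedURLException {
        List<URL> urls = new ArrayList<>();
        File[] jars = libDir.listFiles((dir, name) -> name.toLowerCase().endsWith(".jar"));
        if (jars != null) { // null when the directory is missing or unreadable
            for (File jar : jars) {
                urls.add(jar.toURI().toURL());
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }

    // Walks the loader chain, listing each loader and, where available, its URLs.
    static List<String> describe(ClassLoader loader) {
        List<String> lines = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            lines.add(cl.toString());
            if (cl instanceof URLClassLoader) {
                for (URL url : ((URLClassLoader) cl).getURLs()) {
                    lines.add("- " + url);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        File libDir = new File(args.length > 0 ? args[0] : "local-lib");
        try (URLClassLoader loader = forDirectory(libDir, LocalLibLoader.class.getClassLoader())) {
            describe(loader).forEach(System.out::println);
        }
    }
}
```

A class from one of the JARs could then be obtained with loader.loadClass("com.j256.ormlite.jdbc.JdbcConnectionSource"), provided the ORMLite JARs are in that directory.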

In regard to specifying classpaths with -cp, please note the following (from the Oracle Java documentation on setting the class path):

Class path entries can contain the basename wildcard character *, which is considered equivalent to specifying a list of all the files in the directory with the extension .jar or .JAR. For example, the class path entry foo/* specifies all JAR files in the directory named foo. A classpath entry consisting simply of * expands to a list of all the jar files in the current directory.

A class path entry that contains * will not match class files. To match both classes and JAR files in a single directory foo, use either foo;foo/* or foo/*;foo. The order chosen determines whether the classes and resources in foo are loaded before JAR files in foo, or vice versa.
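The wildcard rule quoted above can be illustrated with a small Java sketch. This is only an imitation of the documented behaviour, not the launcher's actual code: the class ClasspathWildcard and its expand method are hypothetical, and the result is sorted purely to make the output predictable (the real expansion order is not something to rely on).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper that imitates the documented "*" expansion rule.
class ClasspathWildcard {

    // Expands one classpath entry into the concrete entries it stands for.
    static List<String> expand(String entry) {
        List<String> result = new ArrayList<>();
        boolean wildcard = entry.equals("*")
                || entry.endsWith("/*")
                || entry.endsWith(File.separator + "*");
        if (!wildcard) {
            // Plain directories and single JAR files are used as they are.
            result.add(entry);
            return result;
        }
        // "*" alone means the current directory; "foo/*" means directory foo.
        File dir = entry.equals("*")
                ? new File(".")
                : new File(entry.substring(0, entry.length() - 2));
        File[] files = dir.listFiles();
        if (files == null) {
            return result; // directory missing or unreadable: nothing matches
        }
        for (File f : files) {
            String name = f.getName();
            // Only .jar / .JAR files match; .class files and subdirectories do not.
            if (f.isFile() && (name.endsWith(".jar") || name.endsWith(".JAR"))) {
                result.add(f.getPath());
            }
        }
        Collections.sort(result); // sorted only for predictable output
        return result;
    }

    public static void main(String[] args) {
        for (String entry : args) {
            System.out.println(entry + " -> " + expand(entry));
        }
    }
}
```

This is why the working command earlier in the thread lists both local-lib/* (for JARs) and local-scripts/src (a plain directory for compiled classes): a wildcard entry never picks up .class files.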

tjr50 avatar May 03 '19 03:05 tjr50

I'm amazed that you didn't have to put the lib directory in that classpath. But maybe because the manifest of DataCleaner.jar references all those files in lib, the -cp option is clever enough to include them too. It didn't used to be like that with "your grandfather's java", but I'm glad if it improved.

I think we'll need to test this on different OS and Java versions, but I like the idea of updating the sh/cmd files to allow easier inclusion of custom JAR files.

kaspersorensen avatar May 03 '19 13:05 kaspersorensen