testcontainers-scala

Contributing new containers?

Open miketwo opened this issue 6 years ago • 2 comments

I'd like to contribute back to this project, but I'm not really sure where to put the code. I've got a working Hadoop HDFS container, implemented as a trait. See below:

import java.time.Duration
import java.time.temporal.ChronoUnit.SECONDS

import com.dimafeng.testcontainers.GenericContainer
import org.testcontainers.containers.wait.{LogMessageWaitStrategy, Wait, WaitAllStrategy}
import java.util.function.{Predicate => JPredicate}

/** Mix this container in to integration tests.
  *
  * Usage:
  * After the container is up (in afterStart()), you can get a handle to the filesystem with
  *
  *   val hdfsuri = s"hdfs://${container.containerIpAddress}:${container.mappedPort(9000)}"
  *   val hadoopConfig = new Configuration()
  *   hadoopConfig.set("fs.defaultFS", hdfsuri)
  *   fs = FileSystem.newInstance(new URI(hdfsuri), hadoopConfig)
  *
  */
trait HadoopContainer {

  /** Helper method for writing Java Predicates
    * See testForSafeMode for example usage.
    * */
  implicit def toJavaPredicate[A](f: A => Boolean): JPredicate[A] =
    new JPredicate[A] {
      override def test(a: A): Boolean = f(a)
    }

  private val WaitForNameNode = new LogMessageWaitStrategy().withRegEx(".*localhost: starting nodemanager.*\n")
  private val testForSafeMode: String => Boolean = (s: String) => !s.contains("Safe mode is ON.")
  private val WaitForSafeModeOff = Wait.forHttp("/dfshealth.jsp").forResponsePredicate(testForSafeMode)

  lazy val container = GenericContainer(
    "sequenceiq/hadoop-docker:2.6.0",
    command = Seq("/etc/bootstrap.sh", "-d"),
    exposedPorts = Seq(50070, 50010, 50020, 50075, 50090, 8020, 9000),
    waitStrategy = new WaitAllStrategy()
      .withStrategy(WaitForNameNode)
      .withStrategy(WaitForSafeModeOff)
      .withStartupTimeout(Duration.of(60, SECONDS))
  )
}

and example usage:

class IntegrationTestsHadoop
    extends FunSpec
    with ForEachTestContainer
    with Logging
    with HadoopContainer {

  var fs: FileSystem = _

  /** Setup the container with the proper fixture */
  override def afterStart: Unit = {
    logger.info("Starting...")
    val hdfsuri = s"hdfs://${container.containerIpAddress}:${container.mappedPort(9000)}"
    val hadoopConfig = new Configuration()
    hadoopConfig.set("fs.defaultFS", hdfsuri)
    fs = FileSystem.newInstance(new URI(hdfsuri), hadoopConfig)

    // Setup all the data needed for testing
    logger.info("Making directory...")
    fs.mkdirs(new Path("/whatever/newdirectory"))
    logger.info("Copying fixture to HDFS...")
    fs.copyFromLocalFile(new Path("src/test/resources/fixtures/data"),
                         new Path("/"))
...

I think it might be useful to have additional "pre-baked" images for people to use, but this isn't really implemented in the same way as the others (MySQL, Postgres, Selenium...). Thoughts?
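
A side note on the toJavaPredicate helper in the trait above: assuming Scala 2.12 or later, a Scala lambda can be assigned directly to a Java functional interface such as java.util.function.Predicate, so the implicit conversion may not be needed at all. Here is a minimal sketch using only the standard library (the object name SafeModePredicate is made up for illustration, and the check is the same string test as testForSafeMode):

```scala
import java.util.function.{Predicate => JPredicate}

object SafeModePredicate {
  // Same check as testForSafeMode, typed directly as a Java Predicate.
  // Assumes Scala 2.12+, where lambdas convert to single-abstract-method interfaces.
  val safeModeOff: JPredicate[String] =
    (body: String) => !body.contains("Safe mode is ON.")

  def main(args: Array[String]): Unit = {
    println(safeModeOff.test("Safe mode is ON.")) // false
    println(safeModeOff.test("Safe mode is off.")) // true
  }
}
```

On older Scala versions the explicit helper is still the way to go.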

miketwo, Apr 20 '18 16:04

Hi @miketwo, sorry for the late response. Thank you for sharing this! It's a great idea to have "pre-baked" containers. I'm not sure that having them as a trait is generic enough for all users, because in this case:

  • It's a bit hard to combine this mixin with other containers (container field would be taken)
  • It's going to be harder to tweak some parameters
  • There would be two ways of using containers, the current one and mixins, which could make the library harder to use (not sure)

Looking at your example, it seems to me that it would be great to have a prepared container as a new type of container that extends GenericContainer:

class HadoopContainer extends GenericContainer("sequenceiq/hadoop-docker:2.6.0",
                             command = Seq("/etc/bootstrap.sh", "-d"),
                              ...) {
...
def hdfsuri = s"hdfs://${containerIpAddress}:${mappedPort(9000)}"
def hadoopConfig: Configuration = {
    // Build a fresh Configuration that points at the container's HDFS endpoint
    val conf = new Configuration()
    conf.set("fs.defaultFS", hdfsuri)
    conf
}
lazy val fileSystem: FileSystem = FileSystem.newInstance(new URI(hdfsuri), hadoopConfig)
...
}

In this case, usage would be:

class IntegrationTestsHadoop
    extends FunSpec
    with ForEachTestContainer
    with Logging {

  val container = new HadoopContainer()

  /** Setup the container with the proper fixture */
  it("to do") {
    val fs = container.fileSystem

    // Setup all the data needed for testing
    logger.info("Making directory...")
    fs.mkdirs(new Path("/whatever/newdirectory"))
    logger.info("Copying fixture to HDFS...")
    fs.copyFromLocalFile(new Path("src/test/resources/fixtures/data"),
                         new Path("/"))
  }
...

I see a few benefits of such an approach:

  1. More consistent API
  2. Common methods hdfsuri, hadoopConfig and fileSystem are provided out of the box and you don't need to define them in your spec.
  3. You have some room to configure the container
  4. Easy to test this container as part of the testcontainers-scala test suite 😃
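
To make points 2 and 3 a bit more concrete, here is a rough sketch of how a class-based container could expose configuration and helper methods. It is only an illustration: StubGenericContainer is a made-up stand-in so the snippet runs without Docker or the library, and the constructor parameters tag and nameNodePort are hypothetical, not an existing API.

```scala
// Stand-in for the library's GenericContainer (hypothetical, for illustration only).
// The real class resolves the host and mapped port from the running container.
class StubGenericContainer(
    val image: String,
    val command: Seq[String],
    val exposedPorts: Seq[Int]
) {
  def containerIpAddress: String = "localhost"
  def mappedPort(port: Int): Int = port
}

// Hypothetical parameters with defaults taken from the example in this thread.
class HadoopContainerSketch(tag: String = "2.6.0", nameNodePort: Int = 9000)
    extends StubGenericContainer(
      image = s"sequenceiq/hadoop-docker:$tag",
      command = Seq("/etc/bootstrap.sh", "-d"),
      exposedPorts = Seq(50070, nameNodePort)
    ) {
  // Helper provided by the container itself, so specs do not have to rebuild it.
  def hdfsuri: String = s"hdfs://$containerIpAddress:${mappedPort(nameNodePort)}"
}

object HadoopContainerSketchDemo {
  def main(args: Array[String]): Unit = {
    val default = new HadoopContainerSketch()
    println(default.hdfsuri) // hdfs://localhost:9000

    val custom = new HadoopContainerSketch(nameNodePort = 8020)
    println(custom.hdfsuri) // hdfs://localhost:8020
  }
}
```

With the real GenericContainer as the parent, the same shape would let a spec tweak the image tag or port through the constructor while still getting hdfsuri for free.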

What do you think about this approach? Would you like to prepare a PR?

dimafeng, Apr 22 '18 15:04

Sounds good. I'll raise a PR when I get some time.

miketwo, Apr 25 '18 18:04