-- after-save-hook: (org-html-export-to-html org-babel-tangle); pyvenv-activate: "./env3"; --

#+TITLE: Reimplementing a deep network architecture for single-image rain removal #+AUTHOR: Jonathan Jin #+DATE: <2018-04-28 Sat> #+EMAIL: [email protected]

#+EXPORT_FILE_NAME: index

#+SETUPFILE: theme.setup

#+PROPERTY: header-args:python :eval never-export

Note: This project is a work-in-progress. As such, there are bound to be some rough edges and/or missing functionality here and there. For now, I encourage readers and followers to take what's currently here at face value and follow along for updates as the project progresses. Thanks for the interest! It means a lot to me.

Abstract

We present a re-implementation of the [[https://arxiv.org/pdf/1609.02087v2.pdf][DerainNet method for single-image rain removal]] in the [[https://www-cs-faculty.stanford.edu/~knuth/lp.html][literate programming]] style. Our approach here deviates slightly from the authors' implementation in a variety of ways, most notably of which is the use of [[https://www.tensorflow.org][TensorFlow]] for all stages in the workflow: data preparation; model definition; model training; and so on.

For our purposes, we will use a mirror of the original authors' dataset hosted at [[https://github.com/jinnovation/rainy-image-dataset][github.com/jinnovation/rainy-image-dataset]].
TODO [1/2] Development

We'll split the development process roughly into the following stages:
- Data importing/preparation;
- Model training.

** DONE Data importing and preparation :PROPERTIES: :header-args: :tangle rainy_image_input.py :results none :END:

For starters, we'll need to implement the gateway into our training and evaluation data. We'll do so using the =tf.data.Dataset= API; see [[https://www.tensorflow.org/programmers_guide/datasets][TensorFlow

Develop > Programmer's Guide > Importing Data]] for more details.

#+NAME: get-tangled-fname #+BEGIN_SRC emacs-lisp :exports none :tangle no (let* ((header-args (org-entry-get-multivalued-property (point) "header-args")) (i-tangle (seq-position header-args ":tangle"))) (file-name-base (elt header-args (+ i-tangle 1)))) #+END_SRC

These contents will comprise our call_get-tangled-fname() module.

#+BEGIN_SRC python import glob import os import tensorflow as tf #+END_SRC

The dataset directory is organized into two folders:

=ground truth= images; and
=rainy= images.

#+BEGIN_SRC python GROUND_TRUTH_DIR = "ground truth" RAINY_IMAGE_DIR = "rainy image" #+END_SRC

These correspond to our model's expected outputs, as well as the inputs, respectively. Likewise, each "ground truth" image corresponds to one or more "rainy" images according to some "index" value -- =ground truth/1.jpg= to =rainy image/1_{1,2,3}.jpg=, =ground truth/100.jpg= to =rainy image/100_{1,2,3,4,5}=, and so on.

With this topology in mind, we can define a function to retrieve all index values within the data directory:

#+BEGIN_SRC python def _get_indices(data_dir): """Get indices for input/output association.

     Args:
     data_dir: Path to the data directory.

     Returns:
     indices: List of numerical index values.

     Raises:
     ValueError: If no data_dir or no ground-truth dir.
     """

     if not tf.gfile.Exists(os.path.join(data_dir, GROUND_TRUTH_DIR)):
         raise ValueError("Failed to find ground-truth directory.")

     return [
         os.path.splitext(os.path.basename(f))[0]
         for f in glob.glob(os.path.join(data_dir, GROUND_TRUTH_DIR, "*.jpg"))
     ]

#+END_SRC

#+BEGIN_SRC python :exports none :results none :tangle rainy_image_input_test.py from unittest import mock

 def test__get_indices():
     with mock.patch("glob.glob") as MockGlob, \
         mock.patch("tensorflow.gfile.Exists") as MockExists:
         MockGlob.return_value = ["../foo/ground truth/1.jpg", "../foo/ground truth/2.jpg"]
         MockExists.return_value = True
         assert _get_indices("../foo") == ["1", "2"]

#+END_SRC

Likewise, we can define functions to retrieve the filenames of all input and output files corresponding to a given set of indices.

#+BEGIN_SRC python def _get_input_files(data_dir, indices): """Get input files from indices.

     Args:
     data_dir: Path to the data directory.
     indices: List of numerical index values.

     Returns:
     Dictionary, keyed by index value, valued by string lists containing
     one or more filenames.

     Raises:
     ValueError: If no rainy-image dir.
     """

     directory = os.path.join(data_dir, RAINY_IMAGE_DIR)
     if not tf.gfile.Exists(directory):
         raise ValueError("Failed to find rainy-image directory.")

     return {
         i: glob.glob(os.path.join(directory, "{}_[0-9]*.jpg".format(i)))
         for i in indices
     }

#+END_SRC

#+BEGIN_SRC python def _get_output_files(data_dir, indices): """Get output files from indices.

     Args:
     data_dir: Path to the data directory.
     indices: List of numerical index values.

     Returns:
     outputs: Dictionary, keyed by index value, valued by stsring lists
     containing one or more filenames.

     Raises:
     ValueError: If no ground-truth dir.
     """

     directory = os.path.join(data_dir, GROUND_TRUTH_DIR)
     if not tf.gfile.Exists(directory):
         raise ValueError("Failed to find ground-truth directory.")

     return {
         i: os.path.join(directory, "{}.jpg".format(i)) for i in indices
     }

#+END_SRC

Now we arrive at our first deviation from the author's implementation. The authors, in their paper, don't seem to perform any standardized cropping on their images, relying on the inputs to all be identical in resolution or, at most, flipped, e.g. =384x512= and =512x384=. Here, to ease integration with TensorFlow, we settle on a common length to truncate inputs and outputs by along both dimensions.

#+BEGIN_SRC python IMAGE_SIZE = 384 #+END_SRC

Finally, we have our canonical dataset-creation interface, concluding the first part of our project.

#+BEGIN_SRC python def dataset(data_dir, indices=None): """Construct dataset for rainy-image evaluation.

     Args:
     data_dir: Path to the data directory.
     indices: The input-output pairings to return. If None (the default), uses
     indices present in the data directory.

     Returns:
     dataset: Dataset of input-output images.
     """

     if not indices:
         indices = _get_indices(data_dir)

     fs_in = _get_input_files(data_dir, indices)
     fs_out = _get_output_files(data_dir, indices)

     ins = [
         fname for k, v in iter(sorted(fs_in.items()))
         for fname in v if k in indices
     ]

     outs = [v for sublist in [
         [fname] * len(fs_in[k])
         for k, fname in iter(sorted(fs_out.items()))
         if k in indices
     ] for v in sublist]

     def _parse_function(fname_in, fname_out):
         def _decode_resize(fname):
             f = tf.read_file(fname)
             contents = tf.image.decode_jpeg(f)
             resized = tf.image.resize_image_with_crop_or_pad(
                 contents, IMAGE_SIZE, IMAGE_SIZE,
             )
             casted = tf.cast(resized, tf.float32)
             return casted

         return _decode_resize(fname_in), _decode_resize(fname_out)

     dataset = tf.data.Dataset.from_tensor_slices(
         (tf.constant(ins), tf.constant(outs)),
     ).map(_parse_function)

     return dataset

#+END_SRC

** TODO Model Training :PROPERTIES: :header-args: :tangle train.py :results none :session :END:

*** Dependencies

#+BEGIN_SRC python
  import tensorflow as tf
  import logging
  from rainy_image_input import dataset, IMAGE_SIZE
#+END_SRC

*** Execution Environment

#+BEGIN_SRC python
  tf.app.flags.DEFINE_string("checkpoint_dir", "/tmp/derain-checkpoint",
                             """Directory to write event logs and checkpointing
                             to.""")

  tf.app.flags.DEFINE_string("data_dir",
                             "/tmp/derain_data",
                             """Path to the derain data directory.""")

  tf.app.flags.DEFINE_integer("batch_size",
                              128,
                              """Number of images to process in a batch.""")

  tf.app.flags.DEFINE_integer("max_steps",
                              1000000,
                              """Number of training batches to run.""")

  LEVEL = tf.logging.DEBUG
  FLAGS = tf.app.flags.FLAGS

  LOG = logging.getLogger("derain-train")
#+END_SRC

*** Defining the model

#+BEGIN_SRC python
  MODEL_DEFAULT_PARAMS = {
      "learn_rate": 0.01,
  }


  def model_fn(features, labels, mode, params):
      inputs = features
      expected_outputs = labels
      tf.summary.image("inputs", inputs)
      tf.summary.image("expected_outputs", expected_outputs)
      global_step = tf.train.get_or_create_global_step()

      params = {**MODEL_DEFAULT_PARAMS, **params}

      l = tf.keras.layers
      model = tf.keras.Sequential([
          l.Conv2D(
              3,
              (16, 16),
              input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
              use_bias=True,
              activation=tf.nn.tanh,
              padding="same",
          )
          # # 512 kernels of 16x16x3
          # l.Conv2D(
          #     512,
          #     (16, 16),
          #     input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
          #     use_bias=True,
          #     activation=tf.nn.tanh,
          #     padding="same",
          # ),
          # # 512 kernels of 1x1x512
          # l.Conv2D(
          #     512,
          #     (1, 1),
          #     use_bias=True,
          #     activation=tf.nn.tanh,
          # ),
          # # 3 kernels of 8x8x512 (one for each color channel)
          # l.Conv2D(
          #     3,
          #     (8, 8),
          #     use_bias=True,
          #     padding="same",
          # ),
      ])

      # TODO: handle each of ModeKeys.{EVAL,TRAIN,PREDICT}
      if mode == tf.estimator.ModeKeys.TRAIN:
          predictions = model(inputs, training=True)

          norm = tf.norm(expected_outputs - predictions, ord="fro", axis=[-2, -1])
          loss = tf.reduce_mean(norm, 1)
          # tf.summary.scalar("loss", loss)

          optimizer = tf.train.GradientDescentOptimizer(params["learn_rate"])

          train_op = optimizer.minimize(loss, global_step=global_step)

          return tf.estimator.EstimatorSpec(
              mode,
              loss=loss,
              train_op=train_op,
          )

      raise NotImplementedError
#+END_SRC

*** The Rest of the Owl

#+BEGIN_SRC python
  def train():
      regressor = tf.estimator.Estimator(
          model_fn=model_fn,
          model_dir=FLAGS.checkpoint_dir,
          # TODO
          config=None,
          params={},
      )

      regressor.train(
          input_fn=lambda: dataset(FLAGS.data_dir, range(1, 20)).batch(1),
          max_steps=FLAGS.max_steps,
      )


  def main(argv=None):
      if tf.gfile.Exists(FLAGS.checkpoint_dir):
          LOG.debug("Emptying checkpoint dir")
          tf.gfile.DeleteRecursively(FLAGS.checkpoint_dir)

      LOG.debug("Creating checkpoint dir")
      tf.gfile.MakeDirs(FLAGS.checkpoint_dir)

      train()
#+END_SRC

#+BEGIN_SRC python
  if __name__ == "__main__":
      tf.logging.set_verbosity(LEVEL)
      tf.app.run(main)
#+END_SRC

References
- http://smartdsp.xmu.edu.cn/derainNet.html
- X. Fu, J. Huang, D. Zeng, Y. Huang, X. Ding and J. Paisley. ¡°Removing Rain from Single Images via a Deep Detail Network¡±, CVPR, 2017.
- X. Fu, J. Huang, X. Ding, Y. Liao and J. Paisley. ¡°Clearing the Skies: A deep network architecture for single-image rain removal¡±, IEEE Transactions on Image Processing, vol. 26, no. 6, pp. 2944-2956, 2017.
Appendix

** The Dataset

The authors' rainy image dataset can be found [[http://smartdsp.xmu.edu.cn/memberpdf/fuxueyang/cvpr2017/rainy_image_dataset.zip][here]]. Unfortunately, that page was, at the start of this project, unreliable at best; it is now, as of 04/28/2018, entirely unavailable. As such, at the start of this project, I took the liberty of cloning the authors' dataset; it is available at [[https://github.com/jinnovation/rainy-image-dataset][github.com/jinnovation/rainy-image-dataset]].

** The Report

This article is written with Emacs and [[https://orgmode.org/manual/HTML-export.html][Org mode]] in the [[https://www-cs-faculty.stanford.edu/~knuth/lp.html][literate programming]] style.

** The Site

The source for this write-up, as well as all tangled source files, is hosted at [[https://github.com/jinnovation/derain-net][jinnovation/derain-net]]. This write-up is generated using Org-mode's [[https://orgmode.org/manual/HTML-export.html][HTML export functionality]], as well as the [[https://github.com/fniessen/org-html-themes][ReadTheOrg theme]]. All resulting source files were tangled directly from the write-up document.

derain-net
derain-net copied to clipboard

Metadata

-- after-save-hook: (org-html-export-to-html org-babel-tangle); pyvenv-activate: "./env3"; --

← Metadata

Owner

Metadata

derain-net derain-net copied to clipboard

Metadata

-- after-save-hook: (org-html-export-to-html org-babel-tangle); pyvenv-activate: "./env3"; --

← Metadata

Owner

Metadata

derain-net
derain-net copied to clipboard