better-files

More than one event generated for a file modification

Open shivam-880 opened this issue 5 years ago • 6 comments

Hi,

I am trying to watch a config file for modifications with the following piece of code:

object WatchTest extends App {
  implicit val system: ActorSystem = ActorSystem("mySystem")
  val watcher: ActorRef = File(conf_dir + "/application.conf").newWatcher(recursive = false)

  watcher ! on(EventType.ENTRY_MODIFY) { file =>
    println(s"$file got modified")
  }
}

I can watch a single file instead of a directory, right?

This seems to work, but the problem is that every time I modify the file, more than one event is generated. This is problematic because the side effects in the callback must not run more than once per modification.

Please suggest how I can achieve this, or tell me if I am doing something wrong.

shivam-880 • May 07 '19 07:05

I think the file is actually modified twice - once for the content and once for the update time. See: https://stackoverflow.com/questions/16777869/java-7-watchservice-ignoring-multiple-occurrences-of-the-same-event

If that is not the case, please reopen this issue.

pathikrit • May 08 '19 19:05

I have come up with this workaround based on your input. What do you think about it? If you deem it fine, why not mention it in the README with this or a similar example? Knowing up front that this always needs to be handled would save a lot of needless debugging.

trait ConfWatcher {

  implicit def actorSystem: ActorSystem

  private val confPath = "/home/codingkapoor/application.conf"
  private val appConfFile = File(confPath)
  private var appConfLastModified = appConfFile.lastModifiedTime

  val watcher: ActorRef = appConfFile.newWatcher(recursive = false)

  watcher ! on(EventType.ENTRY_MODIFY) { file =>
    if (appConfLastModified.compareTo(file.lastModifiedTime) < 0) {
      // TODO
      appConfLastModified = file.lastModifiedTime
    }
  }

}

shivam-880 • May 09 '19 06:05

> I think the file is actually modified twice - once for the content and once for the update time. See: https://stackoverflow.com/questions/16777869/java-7-watchservice-ignoring-multiple-occurrences-of-the-same-event
>
> If that is not the case, please reopen this issue.

It is definitely happening more than twice. Three times, in fact:

$CONF_DIR/application.conf got modified at 2019-05-07T07:58:55Z
$CONF_DIR/application.conf got modified at 2019-05-07T07:58:55Z
$CONF_DIR/application.conf got modified at 2019-05-07T07:58:55Z

Not sure if that's a problem and whether the issue needs to be reopened!

shivam-880 • May 09 '19 06:05

@codingkapoor: Yes, that's strange. I need some more information:

  1. Can you verify this is not an issue with the akka watcher? You can do this by running:

val watcher = new FileMonitor(CONF_DIR, recursive = true) {
  override def onCreate(file: File, count: Int) = println(s"$file got created $count times")
  override def onModify(file: File, count: Int) = println(s"$file got modified $count times")
  override def onDelete(file: File, count: Int) = println(s"$file got deleted $count times")
  override def onUnknownEvent(event: WatchEvent[_]) = println(s"Unknown event (${event.context()}: $event) got triggered ${event.count} times")
}
watcher.start()
Thread.sleep(60 * 1000) // watcher.start() monitors asynchronously, so keep the JVM alive for a minute

  2. Are you watching the directory or the file? Try watching just the file and see if the issue still occurs:

val watcher = new FileMonitor(CONF_DIR / "application.conf", recursive = true) {

  3. What OS are you on?

pathikrit • May 09 '19 13:05

I'm guessing this is a problem with the JDK WatchService on whatever OS and filesystem you're using, so if that's the case it's not a problem with better-files, since better-files doesn't try to do anything with the events beyond what the native WatchService does. If the native WatchService sends multiple events while the file is being written, that's not something better-files can easily control. In your case it might be better to buffer those events and wait for a short time period before taking action on the changed files.
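The buffering idea above could be sketched roughly as follows. This is only an illustration, not something better-files provides: `Debouncer`, `quietMillis`, and `trigger` are hypothetical names, and the sketch assumes it is acceptable to delay the side effect until no further event has arrived for a short quiet period.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ScheduledFuture, TimeUnit}

// Hypothetical helper (not part of better-files): collapses a burst of
// trigger() calls into a single run of `action`, executed once no further
// call has arrived for `quietMillis` milliseconds.
final class Debouncer(quietMillis: Long)(action: () => Unit) {
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor()
  private var pending: Option[ScheduledFuture[_]] = None

  // Intended to be called from the watcher callback on every event.
  def trigger(): Unit = synchronized {
    // Cancel the run scheduled by the previous event, if it has not started yet.
    pending.foreach(_.cancel(false))
    val task = new Runnable { def run(): Unit = action() }
    pending = Some(scheduler.schedule(task, quietMillis, TimeUnit.MILLISECONDS))
  }

  // Stops the background thread; already scheduled runs are dropped or finish
  // according to the executor's default shutdown policy.
  def shutdown(): Unit = scheduler.shutdown()
}
```

With something like this in place, the callback registered via `on(EventType.ENTRY_MODIFY)` would only call `trigger()`, and the real side effect would live in `action`. The right quiet period depends on how the file is written, so any concrete value is a guess that needs tuning.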

In general, though, this is a hard problem to solve. I've tried to solve it in https://github.com/gmethvin/directory-watcher, but there are still some pretty significant trade-offs. Some OSes and filesystems have modification dates up to the millisecond or nanosecond, while others only provide second-level precision, so modification dates aren't universally reliable. It's often better to hash the actual file contents to see what changed, but that can be a bad idea if your directory contains very large files. Ultimately it depends a lot on your particular use case.
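A minimal sketch of the content-hashing approach mentioned above, using only the JDK. `ContentChangeDetector` and `changed` are hypothetical names, not part of better-files or directory-watcher. The sketch assumes the file exists when the detector is created and is small enough to read fully into memory, which is usually true for a config file but not for the very large files the caveat above is about.

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

// Hypothetical helper: remembers the SHA-256 digest of the file content and
// reports a change only when the digest differs from the last one seen.
final class ContentChangeDetector(path: Path) {

  // Reads the whole file and hashes it; assumes the file fits in memory.
  private def digest(): Seq[Byte] =
    MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(path)).toSeq

  // Baseline taken at construction time, so the file must already exist.
  private var lastDigest: Seq[Byte] = digest()

  // Returns true only if the content changed since the previous call
  // (or since construction, for the first call).
  def changed(): Boolean = synchronized {
    val current = digest()
    val isDifferent = current != lastDigest
    lastDigest = current
    isDifferent
  }
}
```

Called from a modify callback, `changed()` would return `true` for the first event after a real edit and `false` for follow-up events that see identical content. It does not protect against reading a file that is still being written, so it may need to be combined with a short delay.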

gmethvin • May 10 '19 09:05

Thanks for the answer, @gmethvin. I still want to rule out a better-files bug, because better-files does attach watchers recursively when watching directories: https://github.com/pathikrit/better-files/blob/master/core/src/main/scala/better/files/FileMonitor.scala#L54
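One possible way to separate the two hypotheses is to watch the directory with the plain JDK `WatchService`, bypassing better-files entirely, and count the events per modification. The sketch below is an assumption about how such a check could look, not code from this thread. `RawWatch` and `collectModifyEvents` are hypothetical names, and the directory is registered non-recursively.

```scala
import java.nio.file.{FileSystems, Path, StandardWatchEventKinds}
import java.util.concurrent.TimeUnit
import scala.collection.mutable.ListBuffer

object RawWatch {
  // Watches `dir` (non-recursively) with the JDK WatchService for about
  // `seconds` seconds and returns one description per ENTRY_MODIFY event.
  def collectModifyEvents(dir: Path, seconds: Long): List[String] = {
    val service = FileSystems.getDefault.newWatchService()
    val seen = ListBuffer.empty[String]
    try {
      dir.register(service, StandardWatchEventKinds.ENTRY_MODIFY)
      val deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(seconds)
      var remaining = deadline - System.nanoTime()
      while (remaining > 0) {
        // poll returns null if no event arrives before the timeout.
        val key = service.poll(remaining, TimeUnit.NANOSECONDS)
        if (key != null) {
          val events = key.pollEvents().iterator()
          while (events.hasNext) {
            val event = events.next()
            seen += s"${event.kind()} ${event.context()} count=${event.count()}"
          }
          key.reset() // re-arm the key so later events are delivered
        }
        remaining = deadline - System.nanoTime()
      }
    } finally service.close()
    seen.toList
  }
}
```

If a single save of `application.conf` still yields several entries here, the duplicates come from the JDK and the OS rather than from better-files. If it yields one, the recursive registration linked above becomes the more likely suspect.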

pathikrit • May 10 '19 12:05