fscrawler
Elasticsearch File System Crawler (FS Crawler)
Thank you very much for helping with the file permissions on Linux. Now, unfortunately, with success come additional requests. Currently we are focusing on indexing Linux file systems, but now...
This PR fixes #1003 by letting users bring their own complex document processing. Instead of trying to make a fancy configurable pipeline, it is much simpler to just let users...
Workplace Search supports many other auth methods: https://www.elastic.co/guide/en/workplace-search/7.13/workplace-search-api-authentication.html For now, we just support basic auth. We should also support the other ones.
**Is your feature request related to a problem? Please describe.** The current (2.7-SNAPSHOT, 20210505) program's response to a syntactically invalid _settings.yaml is to say that the job does not exist,...
I've tried to upload a simple pdf using fscrawler REST document upload API. It successfully converted pdf contents into plain text and indexed it in Elasticsearch. But when I checked...
**Describe the bug** Crawling a large directory, FS crawler appears to stop sometimes, with no error or stop message. There doesn't seem to be any particular file that causes it,...
**Describe the bug** I'll preface this by saying this may not be a bug, just my understanding of how fscrawler works. I have made an alias for my index as...
**Describe the bug** Architecture: Linux server crawling over NFS directory, pushing metadata to Workplace Search. Expected: having fields like owner, permissions, etc. when attributes_support: true Result: No owner or...
pf4j is lightweight and should bring a lot of stuff OOTB without having to reinvent the wheel. https://pf4j.org/doc/getting-started.html And maybe https://github.com/pf4j/pf4j-update
According to https://docs.oracle.com/javase/tutorial/essential/io/notification.html and the fact that the OS supports this (inotify on Linux, FSEvents on macOS, ReadDirectoryChangesW on Windows), we can try to replace the current and super buggy implementation...
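The Oracle tutorial linked in the last item covers the JDK's `java.nio.file.WatchService` API. As a rough illustration of what directory watching with that API could look like, here is a minimal sketch. The class name `DirWatchSketch` and the helper `pollEvents` are hypothetical and are not part of FS Crawler. The sketch makes no assumption about which native mechanism the JDK uses on a given platform, and it watches a single directory only (no recursion).

```java
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not FS Crawler code.
class DirWatchSketch {

    /**
     * Waits up to timeoutMillis for the first signalled key and returns its
     * events as "KIND:relative-path" strings. Returns an empty list on timeout.
     */
    static List<String> pollEvents(WatchService watcher, long timeoutMillis)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            return seen; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost; a real crawler would rescan here
            }
            seen.add(event.kind().name() + ":" + event.context());
        }
        key.reset(); // re-arm the key so later events are still delivered
        return seen;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("watch-sketch");
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_MODIFY,
                    StandardWatchEventKinds.ENTRY_DELETE);
            Files.createFile(dir.resolve("sample.txt"));
            System.out.println(pollEvents(watcher, 15_000));
        }
    }
}
```

A crawler built this way would still need a periodic full scan as a fallback, since `OVERFLOW` signals that some events may have been dropped.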