dataall
In Dataset Crawler status should be displayed
Is your feature request related to a problem? Please describe. After a user creates a crawler for a dataset, the crawler's status is not shown anywhere. The user may therefore trigger it again and again, which restarts the crawler each time.
Describe the solution you'd like Once the crawler is triggered, either the start crawler button should be disabled or the status of the crawler should be displayed.
Describe alternatives you've considered Once the crawler is triggered, either the start crawler button should be disabled or the status of the crawler should be displayed.
Additional context Add any other context or screenshots about the feature request here.
P.S. Please don't attach files. Add code snippets directly in the message body instead.
Hi @sandeephs1, thanks for opening an issue, this is definitely a feature of interest. It might look like an easy enhancement, but it is a bit trickier than it looks. There are 2 alternatives: polling the status or pushing the status.
- polling: the data.all backend retrieves the Glue Crawler status at a fixed, predefined interval.
- pushing: a certain event (GlueCrawlerX) triggers an action in the data.all backend, which updates the info in the UI.
Polling: developing a function that retrieves the Glue Crawler status is relatively easy. The problem is that, to keep the information up to date, we need to execute the function constantly. It is doable, but we need to be careful not to exceed Glue service quotas (which would cause throttling) and to minimize the number of boto3 calls.
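To make the polling trade-off concrete, here is a minimal sketch, not data.all code. `CrawlerStatusCache`, `ttl_seconds` and `can_start` are hypothetical names; the only real API assumed is the boto3 Glue client's `get_crawler(Name=...)`, whose response carries the crawler state (`READY`, `RUNNING` or `STOPPING`) under `Crawler.State`. Caching the state for a short time keeps repeated UI requests from each becoming a Glue call.

```python
import time


class CrawlerStatusCache:
    """Caches Glue crawler state so repeated requests do not each call Glue.

    Hypothetical helper: `glue_client` is expected to behave like
    boto3.client("glue"), i.e. expose get_crawler(Name=...).
    """

    def __init__(self, glue_client, ttl_seconds=30.0, clock=time.monotonic):
        self._glue = glue_client
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # crawler name -> (fetched_at, state)

    def get_state(self, crawler_name):
        now = self._clock()
        cached = self._cache.get(crawler_name)
        # Serve the cached state while it is younger than the TTL.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # Otherwise make exactly one Glue call and remember the result.
        response = self._glue.get_crawler(Name=crawler_name)
        state = response["Crawler"]["State"]
        self._cache[crawler_name] = (now, state)
        return state

    def can_start(self, crawler_name):
        # The start button would only be enabled while the crawler is idle.
        return self.get_state(crawler_name) == "READY"
```

With a real client this would be created as `CrawlerStatusCache(boto3.client("glue"))`. The TTL is the knob that trades freshness of the displayed status against the number of boto3 calls.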
Pushing: definitely a more elegant solution, but it requires additional development. Communication in data.all always starts from the central account backend and goes to the environment accounts. This feature would introduce communication in the opposite direction, from the environment to the backend. We could implement this pattern using AWS CloudTrail and an event bus, as shown by the proposal of @SofiaSazonova in https://github.com/data-dot-all/dataall/issues/922 (at the end of the issue).
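As a rough illustration of the receiving side of the push option, the sketch below assumes that crawler events reach the backend in the shape Glue itself publishes to EventBridge (`source` of `aws.glue`, `detail-type` of `Glue Crawler State Change`, with `crawlerName` and `state` in `detail`). That is an assumption for illustration; the linked proposal is based on CloudTrail, and the cross-account forwarding and the actual status update in data.all are not shown. `EVENT_PATTERN` and `crawler_update_from_event` are hypothetical names.

```python
# Event pattern a rule in the environment account could match on
# (assumption: Glue's native crawler state-change events are used).
EVENT_PATTERN = {
    "source": ["aws.glue"],
    "detail-type": ["Glue Crawler State Change"],
}


def crawler_update_from_event(event):
    """Return (crawler_name, state) for a crawler state-change event, else None."""
    if event.get("source") != "aws.glue":
        return None
    if event.get("detail-type") != "Glue Crawler State Change":
        return None
    detail = event.get("detail") or {}
    name = detail.get("crawlerName")
    state = detail.get("state")  # e.g. "Started", "Succeeded", "Failed"
    if not name or not state:
        return None
    return name, state
```

A handler in the central account could call `crawler_update_from_event` on each incoming event and store the returned state for the UI to read, so no Glue API calls are needed to display the status.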
We are happy to explore both options. Let us know how important this feature is for you and we will try to prioritize accordingly. Best