Import table data and display some statistics
Currently Azimutt focuses mainly on the database schema and completely ignores the data inside. That is easier in terms of performance and privacy, and we already have a lot to do ^^
But having access to the data would allow Azimutt to offer new features:
- search can look into the values, making it easier to find the table/column you want
- data statistics can give you interesting information:
  - for strings: cardinality (enum values for example), % of null values, some samples
  - for numbers: min, max, average
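To make the statistics above concrete, here is a minimal sketch of how they could be computed from a list of sampled values for one column. The `column_stats` function and its `max_samples` parameter are hypothetical names used for illustration, not Azimutt's implementation; `None` stands in for SQL NULL.

```python
from collections import Counter
from statistics import mean


def column_stats(values, max_samples=3):
    """Summarise one column from sampled values (None represents NULL)."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    # Percentage of NULL values applies to every column type.
    stats = {"null_pct": 100.0 * (total - len(non_null)) / total if total else 0.0}

    is_numeric = bool(non_null) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if is_numeric:
        # Numbers: min, max, average.
        stats.update(min=min(non_null), max=max(non_null), average=mean(non_null))
    else:
        # Strings (and anything else): cardinality plus the most frequent values.
        counts = Counter(str(v) for v in non_null)
        stats.update(
            cardinality=len(counts),
            samples=[value for value, _ in counts.most_common(max_samples)],
        )
    return stats
```

A low cardinality relative to the number of values is a hint that the column behaves like an enum.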
As this information can be huge and everything lives in the browser, Azimutt can only keep a sample, for example 100 random values for each table. This will not give perfect information, but it can still help you a lot.
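One way to keep a fixed-size random sample without loading a whole table is reservoir sampling. The sketch below is only an assumption about how such sampling could work, not what Azimutt does; `reservoir_sample` is a hypothetical helper and `k=100` mirrors the example figure above.

```python
import random


def reservoir_sample(rows, k=100, rng=None):
    """Keep k uniformly random items from an iterable of unknown length."""
    rng = rng or random.Random()
    sample = []
    for index, row in enumerate(rows):
        if index < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Replace an existing entry with probability k / (index + 1).
            slot = rng.randint(0, index)  # both bounds inclusive
            if slot < k:
                sample[slot] = row
    return sample
```

Memory use stays bounded by `k` however large the table is, which fits the constraint of keeping everything in the browser.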
Such data can be found in the schema.sql if you choose to export it with the values (as INSERT INTO commands), or it can be read from the database in the case of the CLI (https://github.com/azimuttapp/azimutt/issues/25).
If you want this feature, please leave a :+1: reaction on this issue.
I am not your target audience, I haven't used or seen an ERD in 10 years. I need to see the actual data: what is really stored in a varchar? What possible values does it have? How are the values distributed? An ERD has too little information for me. A data explorer within an ERD tool would be interesting, but that would be a really large task. Data exploration is a very big topic, especially in statistics and machine learning. But to my knowledge it's still mostly a manual task or very expensive (SPSS). If it has data exploration one day, I would be happy if you ping me ;) Something like Excel's smart filters for seeing values and the data distribution could be a good starting point. With the ability to filter on some values and see everything change, it could be an MVP.
from @tobias_petry
I feel like this should not be a focus of Azimutt development. It would be better to concentrate on making a great ERD tool than to muddy the target with these kinds of things (perhaps a separate tool but in the same "stable"?)
Hi @stacy-rendall, thanks for your comment, indeed it's very important for Azimutt to stay focused on its goal. But its goal is not to make an ERD tool (there are already plenty) but to make the best database exploration tool. Of course it starts with schema diagrams (so ERD-like) but this is only a first step. Do you see missing things on the ERD side? To me, apart from table grouping, it's almost done. Next steps will be to focus on integrations/adoption (like the Heroku plugin, already done, but also IDE plugins and a desktop app) and then data exploration (started in #154 for diagrams with a db connection) or documentation (depending on user needs). What do you think?
I think that Azimutt is an awesome ERD tool, I haven't found anything else that comes close in terms of flexibility and usability.
I also think that a lot of the work you have done recently has fixed a lot of the annoyances/bugs I had noticed (I wasn't expecting things to move so quickly), so I can't see much else "urgent" :)
Thanks for your kind words :) Indeed we are very active on this project and expect to make it grow quite fast in 2023 :D Don't hesitate to post issues if you think of more improvements!
Mostly done in #92: when connecting to a database source, you see samples and statistics in the table and column details.
This can still be improved:
- [ ] More advanced statistics depending on column type
- [ ] Data search (still have to figure out how, doing `SELECT ... LIKE '%query%'` in every table/column seems too much)
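To illustrate why the brute-force approach seems too much, here is a sketch that runs one `LIKE` query per text column. It uses Python's built-in `sqlite3` with SQLite catalog queries purely as a stand-in; `search_everywhere` is a hypothetical function, and the way text columns are detected is an assumption, not how Azimutt works.

```python
import sqlite3


def search_everywhere(conn, query, limit=5):
    """Brute force: one LIKE query per text column of every table."""
    hits = []
    pattern = f"%{query}%"
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    )]
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        text_columns = [
            row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')
            if "CHAR" in row[2].upper() or "TEXT" in row[2].upper()
        ]
        for column in text_columns:
            # A leading wildcard generally prevents index use, so each query scans the table.
            sql = f'SELECT "{column}" FROM "{table}" WHERE "{column}" LIKE ? LIMIT ?'
            for (value,) in conn.execute(sql, (pattern, limit)):
                hits.append((table, column, value))
    return hits
```

The number of queries grows with the number of text columns in the schema, and each one is a full scan, which is the cost the item above is worried about.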
Re-wrote this issue with more details: https://github.com/azimuttapp/azimutt/issues/237 So closing it ;)