SQLite-backed history for advanced capabilities
Prerequisites
- [X] Write a descriptive title.
Description of the new feature/enhancement
I work in a Linux/Windows hybrid environment where I try to minimize cognitive load while keeping security and usability at a reasonable level. I recently started using Bashhub and atuin to improve my daily terminal workflows. Even though I do not use the cloud synchronization, I found that a database-backed history provides alternatives to heavy grep or Select-String usage.
However, fully benefiting from an SQLite backend would require either adding a Find-History or Search-History command (I prefer the latter), or improving the Get-History command.
Currently, we handle basic search requirements with Get-History | Select-String -Pattern..., or with the parameters of Invoke-History and Clear-History, but to make use of a database backend, a dedicated command makes more sense.
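For illustration, here is today's workaround next to what a dedicated command could look like. The Search-History name is the one suggested above; its parameters here are purely illustrative assumptions, not an existing or proposed API.

```powershell
# Today's workaround: filter session history by converting each entry to text
Get-History | Select-String -Pattern 'git rebase'

# Hypothetical database-backed equivalent; parameters are illustrative only
Search-History -Pattern 'git rebase' -After (Get-Date).AddDays(-7) -ExitCode 0
```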
Proposed technical implementation details (optional)
Almost all of these can already be achieved using existing PowerShell modules. Yet a database backend might provide a smoother experience and open the way to further improvements.
- Cloud synchronization via OneDrive: Even though this can also be provided via text file synchronization, database synchronization can be handled with more advanced tools, since records can have unique IDs, minimizing the impact of merges.
- Fuzzy search: It can make use of SQLite extensions to provide fuzzy search at a lower level for performance and convenience.
- Search by the system that ran the command: When cloud synchronization exists, this can work by saving the computer name.
- Encrypted history: SQLite encryption extensions can provide encryption for privacy. If history needs to be recorded for security reasons, especially in corporate environments, the current log system can provide the data needed when configured.
- Search by exit code/execution status: Find failed commands, commands that raised exceptions, etc. without using Select-Object -Property...
- Search by directory: Find commands run in a specific directory (requires another field).
- Search by date: Find commands by StartExecutionTime and EndExecutionTime without using Select-Object -Property...
- Pipe support: Even though it would be complicated to accept pipeline input, it would be helpful if the result could be piped to Invoke-History or Clear-History.
- Deduplicate search results: If needed, duplicated search results can be ignored by providing a -Unique parameter.
- Ignore invalid lines: Due to accidents, it is possible for a user to paste hundreds of lines of text, either a piece of code or the content of a JSON file. These can be filtered out during search.
- Extended pipeline: Currently, PSReadLine captures some keywords and defaults to not writing them to history. This could be extended as an add-on object applied before flushing into the database, making it possible to add filtering and/or enrichment before writing. For instance, encoded commands could be decoded before being written to the database while setting a field like Encoded: $true (see the sketches after this list).
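To make the list above more concrete, here is a rough sketch of a backing table that could cover those fields. Everything in it is an assumption for illustration (column names, types, the database path, and the use of the sqlite3 CLI); it is not a proposed schema.

```powershell
# Illustrative only: a possible backing table for the capabilities listed above.
# Assumes the sqlite3 CLI is on PATH; column names and types are guesses.
$schema = @"
CREATE TABLE IF NOT EXISTS command (
    id           TEXT PRIMARY KEY,   -- GUID, stable across synchronized copies
    commandline  TEXT NOT NULL,
    computername TEXT,               -- which system ran the command
    directory    TEXT,               -- working directory at invocation
    starttime    TEXT,               -- StartExecutionTime
    endtime      TEXT,               -- EndExecutionTime
    exitcode     INTEGER,            -- exit code / execution status
    encoded      INTEGER DEFAULT 0   -- 1 if decoded from an encoded command
);
-- Optional: FTS5 gives fast full-text search; true fuzzy matching would need
-- a separate extension
CREATE VIRTUAL TABLE IF NOT EXISTS command_fts USING fts5(commandline);
"@
$schema | sqlite3 "$HOME/ps_history.db"   # assumed database location
```

And a minimal sketch of the "extended pipeline" idea, assuming a pre-write hook of this shape existed (it does not today). The only real detail used is that encoded commands are Base64-encoded UTF-16LE.

```powershell
# Hypothetical pre-write hook: filter/enrich a record before it is flushed to
# the database. The hook mechanism and record shape are assumptions.
$beforeWriteHook = {
    param($record)   # assumed properties: CommandLine, Encoded, ...
    if ($record.CommandLine -match '-EncodedCommand\s+(\S+)') {
        $bytes = [Convert]::FromBase64String($Matches[1])
        $record.CommandLine = [Text.Encoding]::Unicode.GetString($bytes)
        $record.Encoded = $true
    }
    $record
}
```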
Really love the ideas here! Related to #3339 and #1886. This, #1808, and #3472 would depend on adding more detail to the history entries.
atuin works fairly well. I didn't need to make any changes to it; I just import this module, which replaces the PSConsoleReadLine function with the necessary hooks to call atuin. It's not perfect, but I'm finding it useful.
I would like to add another possibility: it is possible to store more fields than are displayed. Therefore, the Get-History command could be customized by enabling and disabling fields.
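Get-History already carries more properties than its default view shows, which gives a feel for what toggling fields could look like; the database-backed fields discussed above would simply extend that set.

```powershell
# Properties that exist on HistoryInfo today but are hidden by the default view
Get-History |
    Select-Object Id, CommandLine, ExecutionStatus, StartExecutionTime, EndExecutionTime |
    Format-Table -AutoSize
```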
I tried an experiment with the OneDrive synchronization. Name conflicts and overwrites would create issues, so the databases would need a unique identifier. I tried to use a console app as a mediator: a folder stores the SQLite databases, each with a unique name (GUID.db). The console app creates an in-memory database and runs:
```sql
ATTACH 'path/GUID.db' AS localDb;
BEGIN;
INSERT INTO command SELECT * FROM localDb.command;
COMMIT;
DETACH localDb;
```
Therefore, each system keeps its own persistent copy, while the in-memory database contains the data from all systems and is queryable.
But this is an ugly solution, as I assumed no concurrent writes. Proper synchronization would require continuous reading and deduplication.
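A minimal sketch of that aggregation step, assuming the sqlite3 CLI is available and that every synced GUID.db contains a command table with the same schema (the folder path and column names are placeholders, not part of the original experiment):

```powershell
# Hypothetical aggregation: attach every synced GUID.db and merge its rows into
# one in-memory database for a single query session.
$historyDir = Join-Path $env:OneDrive 'PSHistory'   # assumed sync folder
$dbs = Get-ChildItem -Path $historyDir -Filter '*.db'

$sql = @()
$i = 0
foreach ($db in $dbs) {
    $sql += "ATTACH '$($db.FullName)' AS local$i;"
    if ($i -eq 0) {
        # copy schema and rows from the first database
        $sql += "CREATE TABLE command AS SELECT * FROM local0.command;"
    } else {
        $sql += "INSERT INTO command SELECT * FROM local$i.command;"
    }
    $i++
}
# example query against the merged data (column name is a placeholder)
$sql += "SELECT commandline FROM command LIMIT 20;"

# ':memory:' keeps the merged copy in RAM only for the lifetime of this call
$sql -join "`n" | sqlite3 ':memory:'
```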
@zbalkan this has been discussed as a potential solution in issue #1886. @StevenBucher98 / @daxian-dbw, I think we can close this as a duplicate.
@kilasuit, as @StevenBucher98 mentioned above, it is related but not a duplicate.