Add options for case-insensitivity, blank lines, output file, counts, and backup to anew
Hi Maintainers,
This PR enhances the anew uniqueness tool by adding several commonly requested features to improve its flexibility and usability in various scenarios. The goal was to add practical options without significantly increasing the tool's complexity.
Motivation:
The base anew tool is useful for ensuring unique lines, but real-world use cases often require more nuanced handling:
- Ignoring case differences (e.g., 'apple' vs 'Apple').
- Skipping blank lines from input.
- Directing unique output to a separate file instead of modifying the input file.
- Getting feedback on how many lines were processed/added.
- Safeguarding the original file when modifying it in place.
Changes Introduced:
This PR adds the following command-line options to anew:
- -i (Ignore Case): Performs case-insensitive comparisons when checking for existing lines and duplicates from stdin.

      # 'Apple' from stdin won't be added if 'apple' exists in existing.txt
      echo "Apple" | anew -i existing.txt

- -B (Ignore Blank Lines): Skips processing (and potentially adding) blank lines received from standard input.

      printf "line1\n\nline2\n" | anew -B existing.txt

- -o <outfile> (Output File): Specifies a different file to append the new unique lines to. If omitted, behavior remains the same (appends to the [input_filename] if provided). This allows merging unique lines into a new destination.

      # Read existing lines from check.txt, append new unique lines from stdin to unique_lines.txt
      cat new_stuff.txt | anew check.txt -o unique_lines.txt
      # Read only stdin, append unique lines to a new file
      cat new_stuff.txt | anew -o unique_only_from_stdin.txt

- -c (Counts): Prints statistics (lines read, duplicates found, blanks skipped, lines output/written) to stderr upon completion.

      cat new_stuff.txt | anew -c existing.txt

- --backup[=<SUFFIX>] (Backup): Creates a backup copy of the [input_filename] before modification. This only takes effect if output is being written back to the same file specified as [input_filename] (i.e., -o is not used or -o points to the same file).
  - If --backup is used without a value, the suffix .bak is used.
  - If --backup=<SUFFIX> is used, the specified SUFFIX is appended to the filename (e.g., --backup=.orig).

      # Creates existing.txt.bak before appending
      cat new_stuff.txt | anew --backup existing.txt
      # Creates existing.txt.timestamp before appending
      cat new_stuff.txt | anew --backup=.timestamp existing.txt
Internal Improvements:
- Refactored flag handling into a Config struct.
- Added a Stats struct for collecting counts.
- Introduced a normalizeLine helper function to handle trimming and case-folding consistently.
- Improved error handling around file operations (distinguishing ErrNotExist, checking scanner errors).
- Used bufio.Writer for potentially more efficient file appends.
- Added basic argument count validation.
- Updated usage information.
Testing:
Manual testing was performed with various combinations of flags, input files (existing, non-existing), stdin content (with duplicates, blanks, case variations), and output scenarios (in-place, -o, dry-run).
Request for Review:
Please review the changes for correctness, adherence to project style, and potential edge cases. Particular attention to the logic for -o, --backup, and the interaction between -i, -t, and -B would be appreciated.
Thanks for considering this contribution!
Good stuff, thanks. Hoping it gets merged soon.
thanks !
yeah i also hope it gets merged. maybe he is busy. i am thinking of continuing a separate fork if it doesn't get merged in 4 months
@NullifiedSec are you planning a separate fork?
@noob6t5, I'm thinking of forking this project since tomnomnom seems inactive. I plan to keep it updated and add new features. Would you like to contribute? I'd also appreciate your opinion on something: given that the code is now almost unrecognizable from the original due to a complete restructure, should I create a new repository or stick with a fork?
@NullifiedSec I have some updated features for it that I use personally. If that's okay with everyone, I'll open a PR against your fork or new repo. I think both are fine, but a fork carries more weight as a sign of respect to the author than creating a new tool.
But fully dedicating myself to maintaining it isn't possible right now, as I'm swamped with some tools of my own 😅
@noob6t5 I continued this as a separate repository, since maintaining a fork is kinda complex for me, though I mentioned the original repo.
here is my repo if you want to contribute
https://github.com/NullifiedSec/onew/