zed
zed copied to clipboard
Unique lines (a.k.a. remove duplicates)
Check for existing issues
- [X] Completed
Describe the feature
I wish there was a way to keep only one copy of repeating text in a buffer’s all lines (or a selected portion of it).
There may be different operating modes. For example, in one mode;
Foo
Bar
Foo
Foo
Bar
… may be converted into:
Foo
Bar
… (i.e., removing all repeating occurrences of each line). Note, however, if this was implemented as “first sort [internally], then remove dups” then one must be thoughtful how the sorting works, because, for example, if sorting was making decisions about “what comes first” (e.g., case-sensitive or insensitive, numbers first VS letters first, special characters, similar accented chars to behave as one «e.g., ôöòóœø», etc.), this would impact on the final output of this feature.
In another mode the same input above might become:
Foo
Bar
Foo
Bar
… (i.e., only the last repeating Foos are removed, because other repeating occurrences weren’t continuous.)
And maybe even in a multi-cursor selection like this:
The end result, after the dups are removed, might look like this:
So the words wrapped around ~ in the screenshot below are removed from the initial input, because they’re appearing for the n-th time:
What do you think?
(P.S.: This may be a sub-feature of https://github.com/zed-industries/zed/issues/5438.)
If applicable, add mockups / screenshots to help present your vision of the feature
No response
I really think this ought to be a builtin considering #5788 got added 🙏 Then I can finally stop my little getaway dance with sublime..
This looks doable, see https://github.com/zed-industries/zed/pull/2786 for the reference.
One needs to create another Editor::manipulate_lines wrapper and another action for that + add a test into editor_tests.rs
One decision that we probably need to make here is whether or not this command should
- remove all duplicates, or
- remove only consecutive duplicates
The Vec::dedup method only works on consecutive duplicates. You get a bit more flexibility with the latter option, since you can always sort your lines first before deduplicating, but something tells me I'd never want to have the behavior of removing only consecutive duplicates.
A single data point here, but I've never needed to only remove consecutive duplicates ;)
I hear both of you, but if ever someone needed "remove only consecutive dups", only one of the two approaches would allow it, while for anyone who would like to "sort & dedup", well, they can do that 😉
My :two-cents: , I'd totally dig having multiple commands:
- dedup all
- dedup consecutive
Each enabling any combinations of the following modes:
- case-aware
- variant-aware
Also, I'd expect these to work with:
- entire file, it nothing is selected
- on single selection
- on multiple selections
- cross-selections
And last, but not least 😅, all this should also work on multi-file-buffers (e.g., project search results window.)
Easy, right?! 🙂
I disagree with the tendency to overcomplicate and believe that a simple HashSet::contains-like, case-aware, deduplication is all we need — this is exactly what VSCode and Sublime provide, the original issue required for, and exactly what we should start here with.
The rest of the related wishes could go into separate issues/PRs and I suspect those would not be as popular.
Can't argue with that 🙂
All the additional features, if any and if ever, can be a plug-in one day.
I disagree with the tendency to overcomplicate and believe that a simple
HashSet::contains-like, case-aware, deduplication is all we need — this is exactly what VSCode and Sublime provide, the original issue required for, and exactly what we should start here with.The rest of the related wishes could go into separate issues/PRs and I suspect those would not be as popular.
Yeah, using a hashset seems like the right approach to me. :)
Hi! Looking forward to make my first contribution! Is it ok to try this one ? I did manage to craft a solution that changes Editor::manipulate_lines to allow lines insertion or removal. Here's the approach I'm thinking about
fn manipulate_lines<Fn>(&mut self, cx: &mut ViewContext<Self>, mut callback: Fn)
where
Fn: FnMut(&mut Vec<&str>),
{
...
// Mark start and end row for each selection. The edited buffer will be used to calculate offsets
let start_row = start_point.row + added_lines - removed_lines;
new_selections.push(Selection {
id: selection.id,
start: start_row,
end: start_row + (lines.len() as u32) - 1,
goal: SelectionGoal::None,
reversed: selection.reversed,
});
if lines.len() > lines_len {
added_lines += (lines.len() - lines_len) as u32
} else {
removed_lines += (lines_len - lines.len()) as u32;
}
}
self.transact(cx, |this, cx| {
let buffer = this.buffer.update(cx, |buffer, cx| {
buffer.edit(edits, None, cx);
buffer.snapshot(cx)
});
let new_selections = new_selections
.iter()
.map(|s| {
let start_point = Point::new(s.start, 0);
let end_point = Point::new(s.end, buffer.line_len(s.end));
let start_anchor = buffer.anchor_after(start_point);
let end_anchor = buffer.anchor_before(end_point);
Selection {
id: s.id,
start: start_anchor.to_offset(&buffer),
end: end_anchor.to_offset(&buffer),
goal: SelectionGoal::None,
reversed: s.reversed,
}
})
.collect();
this.change_selections(Some(Autoscroll::fit()), cx, |s| {
s.select(new_selections);
});
This is how the unique action would look like:
pub fn unique_lines_case_sensitive(
&mut self,
_: &UniqueLinesCaseSensitive,
cx: &mut ViewContext<Self>,
) {
self.manipulate_lines(cx, |lines| {
let mut seen = HashSet::default();
lines.retain(|line| seen.insert(*line));
})
}
@dalton-oliveira welcome to submit a PR. I think any good-first-issue labeled issue that has an approach description that more or less matches yours can safely get a PR straight away, without the discussion.
But what you did in unique_lines_case_sensitive looks exactly what I've imagined the fix would be.
Could not comprehend the diff for manipulate_lines but that does not look bad either — as long as the PR has a test for this, it's fine.
@dalton-oliveira landed unique lines in this PR, which will be in preview next Wed. 🎉
- https://github.com/zed-industries/zed/pull/7526
Let's close this issue out and open new ones for any extra line/text manipulation commands.