LineSplit trait
- [x] I agree to follow the project's code of conduct.
- [ ] I added an entry to
CHANGES.mdif knowledge of this change could be valuable to users. - [ ] ran
cargo fmt --all -- --check - [ ] ran
cargo clippy --all-features --all-targets -- -Dwarnings
Split a Line or LineString based on one or two fractional distances along the length using proposed trait methods LineSplit::line_split(fraction) and LineSplit::line_split_twice(fraction_start, fraction_end)
Previously discussed in #378 and #986
Needed for #935 see this comment
Still to do / discuss
- [x] write some docs
- [x] ~~Instead of~~ In addition to
line_split_twice, considerline_split_many. Being able to split atnlocations is desirable as discussed in #985 and #986- [ ] Consider removing
line_split_twice? - [x] Add more tests for
line_split_many - [ ] The
LineSplittrait's default implementation forline_split_manyandline_split_twiceare failing tests due to precision issues which arise because these use a lazy implementation which just repeatedly applies theline_splitfunction. The error seems to compound much faster than I expected 😞 For the same reasons, these methods are also super inefficient.
- [ ] Consider removing
- [ ] Decide what happens when the user calls 'line_split_twice' with a reversed range. Should it return
None? Or just automatically swap thefraction_start&fraction_endarguments? Currently it does the latter. - [x] ~~Consider supporting
parameterandlength_alongin addition tofraction(see discussion below)~~
Perhaps in a future PR - [x] ~~Consider breaking
measure_line_string()into a separate trait?~~
Perhaps in a future PR- [x] ~~Consider optimization using dedicated prepared geometry types
LineStringMeasuredandLineMeasured(see discussion below)~~
Perhaps in a future PR
- [x] ~~Consider optimization using dedicated prepared geometry types
Example Usage
Click to expand code examples
Line LineSplit::line_split()
let line = Line::new(
coord! {x: 0.0_f32, y:0.0_f32},
coord! {x:10.0_f32, y:0.0_f32},
);
let result = line.line_split(0.6);
assert_eq!(
result,
Some(LineSplitResult::FirstSecond(
Line::new(
coord! {x: 0.0_f32, y:0.0_f32},
coord! {x: 6.0_f32, y:0.0_f32},
),
Line::new(
coord! {x: 6.0_f32, y:0.0_f32},
coord! {x:10.0_f32, y:0.0_f32},
)
))
);
LineString LineSplit::line_Split()
let line_string: LineString<f32> = line_string![
(x:0.0, y:0.0),
(x:1.0, y:0.0),
(x:1.0, y:1.0),
];
assert_eq!(
line_string.line_split(0.75),
Some(LineSplitResult::FirstSecond(
line_string![
coord! {x:0.0, y:0.0},
coord! {x:1.0, y:0.0},
coord! {x:1.0, y:0.5}],
],
line_string![
coord! {x:1.0, y:0.5},
coord! {x:1.0, y:1.0}
]
))
);
Line LineSplit::line_split_twice()
// TODO write example
LineString LineSplit::line_split_twice()
let line_string: LineString<f32> = line_string![
(x:0.0, y:0.0),
(x:1.0, y:0.0),
(x:1.0, y:1.0),
(x:2.0, y:1.0),
(x:2.0, y:2.0),
];
let result = line_string.line_split_twice(0.25, 0.5).unwrap();
assert_eq!(
result,
LineSplitTwiceResult::FirstSecondThird(
line_string![
(x: 0.0, y:0.0_f32),
(x: 1.0, y:0.0_f32),
],
line_string![
(x: 1.0, y:0.0_f32),
(x: 1.0, y:1.0_f32),
],
line_string![
(x: 1.0, y:1.0_f32),
(x: 2.0, y:1.0_f32),
(x: 2.0, y:2.0_f32),
],
)
);
Line LineSplit::line_split_many
let line = Line::new(
coord!{x: 0.0_f32, y:0.0_f32},
coord!{x:10.0_f32, y:0.0_f32},
);
let result = line.line_split_many(
&vec![0.1, 0.2, 0.5]
);
assert_eq!(
result,
Some(vec![
Some(Line::new(
coord! { x: 0.0, y: 0.0 },
coord! { x: 1.0, y: 0.0 },
)),
Some(Line::new(
coord! { x: 1.0, y: 0.0 },
coord! { x: 2.0, y: 0.0 },
)),
Some(Line::new(
coord! { x: 2.0, y: 0.0 },
coord! { x: 5.0, y: 0.0 },
)),
Some(Line::new(
coord! { x: 5.0, y: 0.0 },
coord! { x: 10.0, y: 0.0 },
))
])
);
Implementation Notes
Return Types
At first I tried implementing the return types like Option<(Option<Line>, Option<Line>)> but this causes issues in the implementation (non exhaustive match statements) and uncertainty for the user; The user may expect to possibly receive Some((None, None)) which is never possible.
Therefore the following types were implemented:
pub enum LineSplitResult<T> { // where T:Line<_> or T:LineString<_>
First (T ),
Second ( T),
FirstSecond (T, T),
}
pub enum LineSplitTwiceResult<T> { // where T:Line<_> or T:LineString<_>
First (T ),
Second ( T ),
Third ( T),
FirstSecond (T, T ),
SecondThird ( T, T),
FirstThird (T, T),
FirstSecondThird (T, T, T),
}
line_split returns Option<LineSplitResult<T>> and line_split_twice returns Option<LineSplitTwiceResult<T>>
So that we get the best of both worlds I implemented LineSplitResult::as_tuple()->(Option<&T>, Option<&T>) and a set of single item accessors like LineSplitResult::first() and LineSplitResult::second() which return Option<&T>
I also added .into_ variants, eg into_first and into_tuple to consume the result and discard other returned data.
This works great for line_split() but for the line_split_twice() function I am still wondering if these return types are the best option. For example the LineSplitTwiceResult::FirstThird variant is pretty confusing, although from an implementation point of view it makes total sense; it happens when there is no 'Second' line segment because the fraction_start==fraction_end.
Further Thoughts
Fraction vs Length-Along vs Parameter
A point on a line string can be identified in three ways; as a fraction of length along, as a length along the line, or as something that I call 'parameter'
A parameter is just a float where the integer part represents the segment index and the fractional part represents the fraction along the next line segment. In my previous tinkering I found that many algorithms are naturally faster when using parameters internally. This is because they do not need to know the length of every segment, and the total length of the linestring. Only the segment with the fractional part needs to be measured.
It seems natural to me that line intersection algorithms and line split algorithms should have the option to use 'parameter' instead of fraction.
LineStringMeasured and LineMeasured
In previous projects I have implemented 'measured' types which might look something like struct LineStringMeasure<T>{(coordinates:Vec<Coord<T>>, length_segments:Vec<T>, length_total:T}. For repeated queries on the same line, these can massively reduce the number of repeated length calculations. No idea if it would result in real performance improvements because storing the extra data may reduce cache locality etc.
Hi @urschrei, @michaelkirk, it seems like every time I push changes to my fork it is triggering all the checks on this draft PR. Should I be concerned about the usage?
Hi @urschrei, @michaelkirk, it seems like every time I push changes to my fork it is triggering all the checks on this draft PR. Should I be concerned about the usage?
CI is configured to run tests on all PR's (even drafts). Maybe it's possible to turn CI off for draft PR's, but I prefer it as it is. In fact I actually sometimes use draft PR's exactly for this purpose: as a way to see if CI passes before I'm ready to mark it as ready and waste a living human's time on it.
I'm not overly concerned about the usage. If you don't want it to run, you could close the PR until you're ready.
CI is configured to run tests on all PR's (even drafts). Maybe it's possible to turn CI off for draft PR's, but I prefer it as it is. In fact I actually sometimes use draft PR's exactly for this purpose: as a way to see if CI passes before I'm ready to mark it as ready and waste a living human's time on it.
I'm not overly concerned about the usage. If you don't want it to run, you could close the PR until you're ready.
Thanks @michaelkirk I'll try to limit unnecessary pushes for now anyway. I had understood an early draft PR as a way of letting others know I'm having a go at some feature. Perhaps I will open them a bit later in the process in future.
I thought this PR would be a two or three quick commits but it's turning out to be tricky :/
I'm late here, but let me drop 2 cents. I've implemented something like this for myself in Python and Rust, also working with OSM streets data.
The best way to use was to have "line-substring" method with "from" and "to" fractions or distances. If I were to code this, I'd have 2 separate methods, to avoid confusion. The "parameter" mode makes sense, and I'd make it separate as well.
Other options were:
- Use a split function twice -- very unintuitive and awkward. Easy to forget how to use it after some time off the code.
- When I indeed needed a split in the middle, I usually needed just 1 piece, so constructing and returning both was an extra waste of processing power.
- It also didn't ease making double-split, like "take the part from fraction 0.2 to 0.75". It's confusing to use and code inside.
I've implemented something like this for myself in Python and Rust, also working with OSM streets data.
Is your implementation public? Would love to check it out if so. (Currently looking through https://github.com/culebron/rust_routing/ -- cool stuff!)
Is your implementation public? Would love to check it out if so. (Currently looking through https://github.com/culebron/rust_routing/ -- cool stuff!)
Thanks, actually the credit goes to you, who inspired me to try Rust. I implemented that before learning of fast_paths crate, and it turned out not so useful, because ALT algorithm doesn't work as advertized (no speedup, nodes visited reduced x2, not x20).
Here's the Python and Rust code, free to use. Actually, it turned out I did use cutting in Python, but had to add a convenient wrapper around it.
Python code
def line_substring(line, start=None, end=None):
if start is not None:
start = clip_position(line.length, start)
if end is not None:
end = clip_position(line.length, end)
if end is not None and start is not None:
if end < start:
start, end = end, start
end -= start
return cut_line(cut_line(line, start)[1], end)[0]
def clip_position(length, position):
if position is None:
return None
if position < -length:
return 0
if position > length:
return length
if position < 0:
return position + length
return position
def cut_line(line, distance):
# Cuts a line in two at a distance from its starting point
empty_line = LineString()
if distance is None:
return line, line
if line is None or line.is_empty:
return empty_line, empty_line
distance = clip_position(line.length, distance)
if distance == 0:
return empty_line, line
if distance >= line.length:
return line, empty_line
coords = list(line.coords)
points = [Point(p) for p in coords]
distances = [[line.project(p), p] for p in points]
distances[-1][0] = line.length
if not any(d == distance for d, p in distances):
distances += [[distance, line.interpolate(distance)]]
distances = sorted(distances, key=lambda v: v[0])
line1 = [p for d, p in distances if d <= distance]
line2 = [p for d, p in distances if d >= distance]
line1 = LineString(line1) if len(line1) > 1 else empty_line
line2 = LineString(line2) if len(line2) > 1 else empty_line
return line1, line2
Rust code
use geo::{Coord, LineString, EuclideanDistance, LineInterpolatePoint, EuclideanLength};
pub trait LineSubstring {
fn sub_frac(&self, start: f32, end: f32) -> Option<LineString<f32>>;
}
impl LineSubstring for LineString<f32> {
fn sub_frac(&self, mut start: f32, mut end: f32) -> Option<Self> {
if start.is_nan() || end.is_nan() || start > end { // probably should be Err, but that's too much wrapping, Greta Thunberg won't approve.
return None;
}
start = start.max(0.0).min(1.0);
end = end.max(0.0).min(1.0);
let tot_dist = self.euclidean_length();
let mut prev : Option<&Coord<f32>> = None;
let mut cur_dist = 0.0f32;
let mut result = Self(vec![]);
self.line_interpolate_point(start).map(|p| result.0.push(p.0));
for i in 0..self.0.len() {
prev.map(|p| cur_dist += self.0[i].euclidean_distance(p));
if cur_dist / tot_dist > end { break; }
if cur_dist / tot_dist > start { result.0.push(self.0[i].clone()); }
prev = Some(&self.0[i])
}
if end < 1.0 {
self.line_interpolate_point(end).map(|p| result.0.push(p.0));
}
Some(result)
}
}
#[cfg(test)]
mod line_substring_tests {
use geo::{LineString, Coord};
use crate::assert_delta;
use super::LineSubstring;
fn cmp(ls1: &LineString<f32>, start: f32, end: f32, vec2: Vec<(f32, f32)>) {
let ls1 = ls1.sub_frac(start, end).unwrap();
println!("expected {:?}, got: {:?}", vec2, ls1);
for (Coord { x: x1, y: y1}, (x2, y2)) in ls1.0.iter().zip(vec2.iter()) {
assert_delta!(x1, x2, 0.001);
assert_delta!(y1, y2, 0.001);
}
}
#[test]
fn half1_works() {
let l1 = LineString::from(vec![(0.0, 0.0), (100.0, 0.0)]);
cmp(&l1, 0.0, 0.5, vec![(0.0, 0.0), (50.0, 0.0)]);
cmp(&l1, -100.0, 0.5, vec![(0.0, 0.0), (50.0, 0.0)]);
cmp(&l1, 0.0, 1.0, vec![(0.0, 0.0), (100.0, 0.0)]);
cmp(&l1, 0.0, 2.0, vec![(0.0, 0.0), (100.0, 0.0)]);
assert!(l1.sub_frac(0.75, 0.25).is_none());
assert!(l1.sub_frac(f32::NAN, f32::NAN).is_none());
}
#[test]
fn half1_real_geom() {
let ls: LineString<f32> = LineString::from(vec![
(8564258.0, 5351193.5),
(8564267.0, 5351172.5),
(8564277.0, 5351118.0),
(8564288.0, 5351049.5),
(8564292.0, 5351002.0)]);
let result = ls.sub_frac(0.0, 0.5).unwrap();
assert_eq!(result.0.len(), 4);
}
}
Edits
edit (RobWalt): Sorry in advance for editing this comment 🙈 I just added the language highlighting to the code examples to make it easier to read.
Closing this for the time being. If you find the time to work on this, please reopen!