
Allow filtering by max sequence length

Open justheuristic opened this issue 11 months ago • 3 comments

Problem: if some (but not all) servers support a longer sequence length, running inference at that length would be very inefficient, because the client will constantly bump into servers that only support shorter sequences.

Suggested solution: if we ask servers to report their max sequence length to the DHT, a client will be able to filter servers by sequence length as it reads DHT entries.
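
A rough sketch of what the client-side filter could look like, assuming each DHT entry is parsed into a record that carries an optional `max_length` field. `ServerRecord`, `max_length`, and `filter_by_max_length` are hypothetical names used for illustration only; they are not the existing Petals or hivemind API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServerRecord:
    """Hypothetical stand-in for one server's DHT entry (not the real schema)."""
    peer_id: str
    max_length: Optional[int]  # None means the server did not report a limit


def filter_by_max_length(
    records: List[ServerRecord],
    required_length: int,
    keep_unreported: bool = False,
) -> List[ServerRecord]:
    """Keep only servers whose reported limit covers required_length."""
    selected = []
    for record in records:
        if record.max_length is None:
            # Older servers may not report a limit; the caller decides
            # whether to trust them.
            if keep_unreported:
                selected.append(record)
        elif record.max_length >= required_length:
            selected.append(record)
    return selected


records = [
    ServerRecord("peer-a", 2048),
    ServerRecord("peer-b", 8192),
    ServerRecord("peer-c", None),
]
usable = filter_by_max_length(records, required_length=4096)
print([r.peer_id for r in usable])  # ['peer-b']
```

How to treat servers that do not report a limit is an open design choice. The sketch drops them by default, which is the conservative option for long-sequence requests.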

justheuristic, Jul 20 '23 21:07