Take lessons from WikiSort and QuickXSort
WikiSort and QuickXSort have in common the fact that they don't always need to allocate memory to merge big runs: sometimes they swap values with an « internal buffer »: a delimited area in the sequence where data is known not to be sorted (yet).
Since the big advantage of vergesort is to somehow adapt to Runs, using the memory that hasn't been inspected yet is a no-go since potentially sorted runs would end up being shuffled and we don't want that. However, if we delay the calls to pdqsort and remember where the big runs start and stop (already sorted or yet to be sorted), we might use the big unsorted ones as free swap buffer space for the merge operations.
Moreover, if some big runs are already sorted, but some other ones aren't, it probably means that the unsorted ones are mostly sorted (that's how it should be with real-world data). Quicksort isn't the best sorting algorithm in the world with mostly sorted sequences and pdqsort starts with a quicksort. Using the unsorted (and probably mostly sorted) sequences a swap buffers could help to break their patterns and generate a data distribution more suitable for quicksort.
Of course storing the start and end of the runs means that more memory is used, but that shouldn't be a problem since it would be O(log n) thanks to how vergesort already works. Moreover, the merge strategy isn't optimal (see #8), and better merging strategy often means that we need to store runs information too.
Since this issue was opened, the random-access overload of vergesort was changed to store the runs and use a k-way merge that merges runs pairwise until there are none left. While this means that we track the different runs and that makes it easier to tag, find and use the unsorted ones during the merge step (assuming a delayed sorting), it will only be possible to use internal buffering during the first pass, and memory will have to be allocated for the next merge passes.
It could be worked around a bit by identifying the biggest unsorted run and to merge everything but that one before sorting and merging it too. We would still need to allocate extra memory, but only once or twice if we are lucky.