pickhardtpayments icon indicating copy to clipboard operation
pickhardtpayments copied to clipboard

Adding a min-cost flow library

Open Lagrang3 opened this issue 3 years ago • 20 comments

Hi @renepickhardt. I have prepared a draft for the C-API. I still owe you an example, I'll work on that.

I'm also proposing an example that illustrates the use of the Python-API from my point of view. Very soon I will work on the specifics of the Python-API.

Just for the record: we have already a MCF solver in C++ in my repository https://github.com/Lagrang3/mincostflow and we have a prototype of the uncertainty network and payment session already in this repository. We are defining the C and Python APIs before implementation.

Comments are welcomed!

Lagrang3 avatar Jun 21 '22 16:06 Lagrang3

One high level comment before I go into the details of the API calls. There is a PR #18 by @sebulino that already changes the API of the Payment Session quite a bit and basically moves a lot over to the newly introduced pay class. While there are some differences I think it is also converging to the python example provided by @Lagrang3. I know that the link to @sebulino 's branch is currently being refactored a bit but I think we can already have a discussion about the similarities and differences:

I think the major difference is that in sebulinos's approach The Payment class not only generates the attemps (in initialize() via the min cost flow solver on the uncertainty net) but also has the logic of making the attempts. I am uncertain which is better. From the perspective of having a really small c-lib it seems that generating candidate sets would be sufficient. However as the c-lib needs to be feeded with the new information it certainly seems plausible to also pass the Oracle network and let it handle the making of the attempts....

The main reason that speaks against sebulinos approach of doing all in Payment is that it currently is syncronous in Payment and not concurrently. So the fact that a User should be able to decide how to do that speak for letting the user do the logic and just feed Payment.

Looking forward for a good discussion and other / better arguments and thoughts than the ones that I have provided so far.

renepickhardt avatar Jun 27 '22 10:06 renepickhardt

@renepickhardt. At this point I have integrated my min-cost flow solver into Pickhardtpayments.

How is it done?

I am using a git submodule to pull my mincostflow repository from here. Thus for the moment I am keeping these two projects separate. I am anyways willing to put all my files here at some point to merge both projects.

My mincostflow is written in C++, but I've just recently added a minimal working C-API and then a Python-API on top of that. So that you can call it from the python sources in Pickhardtpayments. We have discussed the possibility to write the Uncertainty Network in C++ as well, but for the moment we only have the Min-cost Flow solver in C++ and nothing beyond that.

How to test this?

Meson will take care to pull the submodule from github and compile my mincostflow library in two steps. For example:

git clone https://github.com/Lagrang3/pickhardtpayments.git
cd pickhardtpayments
git checkout eduardo-mcf

we create a build directory

mkdir build && cd build
meson ..
ninja
cd ..

At this point my mincostflow is built into build/subprojects/MinCostFlow, and the python API is found in the source directory subprojects/MinCostFlow/python. Then we build the pickhardtpayment python binaries

python -m build .

And we can run the example

LD_LIBRARY_PATH="build/subprojects/MinCostFlow" PYTHONPATH=".:subprojects/MinCostFlow/python" python examples/basicexample.py

Additional comments

My project still depends on Ortools library for benchmarking, but that dependency is optional. It will compile even if it doesn't find it and you can still run benchmarks without ortools, it will just not run the ortools solver.

Since Lagrang3/mincostflow appears as a git submodule, this project contains a link to a certain commit in that submodule. Therefore if something changes in Lagrang3/mincostflow, this project still will see the old commit, and in order to update it one needs to

git submodule update --remote --merge
git commit

that will update the link to the latest commit in Lagrang3/mincosflow.

The solver has been tested thoroughly, I have no doubts about its correctness. But once you run the example you will notice the runtime is too long. It takes about 10 to 15 seconds to perform a single min-cost flow in the lightning network.

Next steps

  1. improve the python-API of the MCF solver to be able to update channels' capacities and costs instead of the entire network between payment iterations. We could solve this little detail in the present PR.
  2. improve the performance of the MCF solver.
  3. unify in a single build system pickhardtpayments and Lagrang3/mincostflow.
  4. merge Lagrang3/mincostflow into pickhardtpayments.

Lagrang3 avatar Aug 15 '22 10:08 Lagrang3

I did a profile of the runtime of Pickhardpayments, running the basicexample.py with random.seed(64) for reproducibility. Using Ortools I get:

runtime for graph initialization: 3.109 sec
total runtime for MCF solve: 1.444 sec
total runtime (including inefficient memory managment): 4.554 sec

and using my MCF library I get

runtime for graph initialization: 2.779 sec
total runtime for MCF solve: 9.432 sec
total runtime (including inefficient memory managment): 12.214 sec

In this particular test case there where 4 iterations of the MCF. My solver had to initialize the graph only once and it did in 2.7 seconds. While the Ortools solver needed to build the graph data 4 times resulting in 3.1 seconds. The bottleneck of our solver is in the computation of the MCF, which is still slower than Ortools.

Lagrang3 avatar Aug 18 '22 14:08 Lagrang3

The results for running simpleMCFNetwork solver:

runtime for graph initialization: 2.582 sec
total runtime for MCF solve: 4.058 sec
total runtime (including inefficient memory managment): 6.642 sec

Lagrang3 avatar Aug 18 '22 18:08 Lagrang3

At this point there are some observations:

  1. initializing my solvers is very expensive.
  2. even the simpleMCFNetwork is slower than Ortool at solving the MCF.
  3. the MCFNetwork solver uses the same algorithms that simpleMCFNetwork but its data structures make the computations heavier.

Some ideas to solve the 1st issue: Transfer data from python to C and into the graph initializer in batches, we can first profile the time for initialization to find the bottleneck, if it is in the python side deserializing the channels and nodes, or if it is in the C/Python interface because of the repeated call of an inefficient functions, or if it is in the C++ side.

To tackle the 2nd issue: We can profile again simpleMCFNetwork, for example by measuring the time it takes to build the static adjacency data structure, and count the calls to relabel and push, compare those numbers to Ortools. Then we can also improve the implementation of the look-ahead-push heuristic as described in Bunnagel 1998. We can also try to implement the price-refinement heuristic also described in Bunnagel 1998.

For the 3rd issue: We need to profile mainly the time it takes to extract adjacency from nodes with the MCFNetwork solver and compare against simpleMCFNework. Try changing the type of node_index_t and arc_index_t, use primitive types if necessary, check also by moving from 64bit to 32bit integers.

Lagrang3 avatar Aug 18 '22 19:08 Lagrang3

@renepickhardt !!! Latest test:

runtime for graph initialization: 2.519 sec
total runtime for MCF solve: 0.363 sec
total runtime (including inefficient memory managment): 2.884 sec

The pseudo-polynomial solver by successive shortest paths is faster than the Cost-scaling for the LN graph. This was one of the possibilities we foresaw at the beginning of the Summer of Bitcoin. The fact that different solvers could have different performance depending on the topology of the graph.

Lagrang3 avatar Aug 18 '22 19:08 Lagrang3

Primal dual:

runtime for graph initialization: 2.745 sec
total runtime for MCF solve: 1.255 sec
total runtime (including inefficient memory managment): 4.001 sec

Lagrang3 avatar Aug 18 '22 19:08 Lagrang3

I will paste here the entire output for the aforementioned run of basicexample.py with random.seed(64) using MCFNetwork with the augmenting-path back-end:

Round number:  1
Try to deliver 10000000 satoshi:

Statistics about 3 candidate onions:

successful attempts:
--------------------
 p =  74.59% amt:    289114 sats  hops: 7 ppm:  3251

failed attempts:
----------------
 p =  44.97% amt:   6710886 sats  hops: 3 ppm:  2036 
 p =  60.34% amt:   3000000 sats  hops: 5 ppm:  2580 

Attempt Summary:
=================

Tried to deliver   10000000 sats
expected to deliver    5044042 sats 	(50.44%)
actually deliverd     289114 sats 	(2.89%)
deviation: 0.06
planned_fee: 22350.270 sat
paid fees:  940.196 sat
Runtime of flow computation: 0.10 sec 

================================================================

Round number:  2
Try to deliver 9710886 satoshi:

Statistics about 4 candidate onions:

successful attempts:
--------------------
 p =  65.74% amt:   2162227 sats  hops: 4 ppm:  2496
 p =  53.82% amt:   3296177 sats  hops: 7 ppm:  3251

failed attempts:
----------------
 p =  86.00% amt:   1252482 sats  hops: 3 ppm:  1973 
 p =  54.88% amt:   3000000 sats  hops: 6 ppm:  5318 

Attempt Summary:
=================

Tried to deliver    9710886 sats
expected to deliver    5919095 sats 	(60.95%)
actually deliverd    5458404 sats 	(56.21%)
deviation: 0.92
planned_fee: 34544.642 sat
paid fees: 16118.244 sat
Runtime of flow computation: 0.12 sec 

================================================================

Round number:  3
Try to deliver 4252482 satoshi:

Statistics about 3 candidate onions:

successful attempts:
--------------------
 p =  77.31% amt:   1252482 sats  hops: 3 ppm:  2036
 p =  78.73% amt:    200000 sats  hops: 4 ppm: 21787

failed attempts:
----------------
 p =  65.07% amt:   2800000 sats  hops: 3 ppm:  1897 

Attempt Summary:
=================

Tried to deliver    4252482 sats
expected to deliver    2947654 sats 	(69.32%)
actually deliverd    1452482 sats 	(34.16%)
deviation: 0.49
planned_fee: 12220.305 sat
paid fees: 6908.705 sat
Runtime of flow computation: 0.08 sec 

================================================================

Round number:  4
Try to deliver 2800000 satoshi:

Statistics about 1 candidate onions:

successful attempts:
--------------------
 p =  77.15% amt:   2800000 sats  hops: 4 ppm: 21787

Attempt Summary:
=================

Tried to deliver    2800000 sats
expected to deliver    2160219 sats 	(77.15%)
actually deliverd    2800000 sats 	(100.00%)
deviation: 1.30
planned_fee: 61003.600 sat
paid fees: 61003.600 sat
Runtime of flow computation: 0.07 sec 

================================================================

SUMMARY:
========
Rounds of mcf-computations:  4
Number of onions sent:  11
Number of failed onions:  5
Failure rate: 45.45% 
runtime for graph initialization: 2.581 sec
total runtime for MCF solve: 0.363 sec
total runtime (including inefficient memory managment): 2.947 sec
Learnt entropy: 14.93 bits
Fees for successfull delivery: 84970.745 sat --> 8497 ppm
used mu: 0

Lagrang3 avatar Aug 18 '22 19:08 Lagrang3

Just adding a comment here as it's somewhat relevant: I have been working on an alternative min cost flow implementation as well for the Lightning Network. It's based on negative cycle cancellation on the derivative of the cost function (as opposed to linearisation of the problem), and it's already competitive with OrTools in every metric. Here's the implementation:

https://github.com/adamritter/pickhardtpayments/

I also started integrating it to the Rust Lightning Development Kit (LDK), but it takes a lot more work to finish (and I would need help as well if anybody wants to help with productionizing MCF):

https://github.com/lightningdevkit/rust-lightning/pull/1668

adamritter avatar Aug 19 '22 15:08 adamritter

Just adding a comment here as it's somewhat relevant: I have been working on an alternative min cost flow implementation as well for the Lightning Network. It's based on negative cycle cancellation on the derivative of the cost function (as opposed to linearisation of the problem), and it's already competitive with OrTools in every metric. Here's the implementation:

https://github.com/adamritter/pickhardtpayments/

I also started integrating it to the Rust Lightning Development Kit (LDK), but it takes a lot more work to finish (and I would need help as well if anybody wants to help with productionizing MCF):

lightningdevkit/rust-lightning#1668

Thanks @adamritter. Right now I am busy polishing the details on this little MCF project for Summer of Bitcoin. When I am done with that, I'd really like to collaborate on your solver. It is very exciting to see that you are already working to implement it into the Rust Lightning Development Kit!

Lagrang3 avatar Aug 25 '22 07:08 Lagrang3

I am uploading here the latest benchmarks I obtained. The test cases were generated by the file benchmark/benchmark.py. Each test cases consist of a random pair of nodes from the Lightning Network (A,B) and a certain amount s of satoshis to be sent from A to B. The pairs are selected from a population of pairs for which a feasible solution exists. Several values of s were tested from 2^10 until 2^24, and for each value of s we have generated 100 pairs of nodes.

The first plot shows the time spent in computing the MCF for three backends: Google's OR-tools (the underlying algorithm is cost-scaling) and two MCF solvers from my library using cost-scaling and augmenting-path (maybe better known as successive shortest path, sorry for the confusing naming). The lines indicate the mean value of the runtime and the shaded regions indicate the maximum and minimum runtime of the testcases. It is noteworthy that OR-tools and my cost-scaling solver have the same curve shape, which is expected since they both consist of the same algorithm, however OR-tool runtime is roughly 3 times faster than my own implementation. On the other hand, my implementation of augmenting-path can be almost a factor of 10 faster than OR-tools for small transactions, though the runtime increases faster than cost-scaling with the amount, going above OR-tools curve for transactions of 16M sats.

plot_time

The second plot is the most important result of this PR. Our MCF solver, being tailored specifically for the Lightning Network, has the advantage that between iterations of Pickhardtpayments when the uncertainty network is updated the MCF data structure MCFNetwork needs only to update the corresponding arcs and not build the entire graph as it was done with the external OR-tools library. For that reason, the initialization of the object MCFNetwork, which is a resource expensive operation, only takes place when the "server" starts and between payments it only needs to be updated. Combining this optimization with the augmenting-path solver we are able to ensure that the computation of the payments falls below 1 second for transactions up to 1M sats.

plot_totaltime

Lagrang3 avatar Aug 25 '22 09:08 Lagrang3

Thanks @adamritter. Right now I am busy polishing the details on this little MCF project for Summer of Bitcoin. When I am done with that, I'd really like to collaborate on your solver. It is very exciting to see that you are already working to implement it into the Rust Lightning Development Kit!

Sounds great, it will be awesome to improve lightning payments in real world. Right now I'm planning to port the simulator to use LDK payments / retries, as testing large payments in real life is really hard and scary (the possibility of losing thousands of dollars because of some software bug).

adamritter avatar Aug 26 '22 09:08 adamritter

Hi, this is just a question. Would it be hard to put in a feature in your fastest algorithm to make the fees part of the channel capacity? As I'm implementing the integration with LDK I have to take care of this just as much as min HTLC and base fees.

I use satoshis and drop channels with bigger than 1 satoshi min HTLC and >0 base fees right now, but I wish I had implemented pct fees being part of the channel capacity as a part of the min cost algorithm, as it's not that hard to implement.

What's important is to provide the pct fees for the edges of the graph and whenever a channel is updated, the capacity shouldn't be bigger than the routed flow / (1 - proportional forwarding fee). Another way to write it is that the incoming flow to the target node of an edge is equal to the outgoing flow * (1-proportional forwarding fee).

adamritter avatar Sep 19 '22 17:09 adamritter

Hi @adamritter. My first impression is that the Min. Cost Flow solver does just that. Any adjustment of costs and capacities should be performed in the following layer.

Lagrang3 avatar Sep 20 '22 10:09 Lagrang3

I tried to solve this problem in a higher layer with hacks, but I couldn’t.

For example if we take probabilities out of the problem statement, let’s suppose a path of 100, 100, 100, 100 capacities and 1% fee on each edge. A standard min cost algorithm would think that it can route 100 units for 4% fee, but in reality at most 1000.990.990.990.99 units can arrive to the target node.

Eduardo Quintana @.***> (időpont: 2022. szept. 20., K, 11:07) ezt írta:

Hi @adamritter https://github.com/adamritter. My first impression is that the Min. Cost Flow solver does just that. Any adjustment of costs and capacities should be performed in the following layer.

— Reply to this email directly, view it on GitHub https://github.com/renepickhardt/pickhardtpayments/pull/29#issuecomment-1252130135, or unsubscribe https://github.com/notifications/unsubscribe-auth/AN5SWAHDKKYANK635JVTZNDV7GEFRANCNFSM5ZM4Y6GQ . You are receiving this because you were mentioned.Message ID: @.***>

adamritter avatar Sep 20 '22 10:09 adamritter

Just to elaborate more, I solved the problem by increasing the target amount by 1% and modifying the resulting paths with hacks, which is fine for most cases in reality, but if it´s possible to move this problem inside the min cost algorithm without sacrificing run-time, that means that less hacks have to be around the algorithm.

Adam Ritter @.***> (időpont: 2022. szept. 20., K, 11:20) ezt írta:

I tried to solve this problem in a higher layer with hacks, but I couldn’t.

For example if we take probabilities out of the problem statement, let’s suppose a path of 100, 100, 100, 100 capacities and 1% fee on each edge. A standard min cost algorithm would think that it can route 100 units for 4% fee, but in reality at most 1000.990.990.990.99 units can arrive to the target node.

Eduardo Quintana @.***> (időpont: 2022. szept. 20., K, 11:07) ezt írta:

Hi @adamritter https://github.com/adamritter. My first impression is that the Min. Cost Flow solver does just that. Any adjustment of costs and capacities should be performed in the following layer.

— Reply to this email directly, view it on GitHub https://github.com/renepickhardt/pickhardtpayments/pull/29#issuecomment-1252130135, or unsubscribe https://github.com/notifications/unsubscribe-auth/AN5SWAHDKKYANK635JVTZNDV7GEFRANCNFSM5ZM4Y6GQ . You are receiving this because you were mentioned.Message ID: @.***>

adamritter avatar Sep 20 '22 10:09 adamritter

There should be then a correct algorithmic solution, which is not linear or convex MCF. It is a very interesting proposition, but we should, in my opinion, move this discussion to a dedicated thread/issue.

Lagrang3 avatar Sep 20 '22 10:09 Lagrang3

Of course, no problem, I understand that it’s outside the problem space of replacing Google’s min-cost flow algorithm. I just see that it’s harder for me to work on improving the algorithm now that I moved to the integration of the algorithm with the lightning network.

Eduardo Quintana @.***> (időpont: 2022. szept. 20., K, 11:28) ezt írta:

There should be then a correct algorithmic solution, which is not linear or convex MCF. It is a very interesting proposition, but we should, in my opinion, move this discussion to a dedicated thread/issue.

— Reply to this email directly, view it on GitHub https://github.com/renepickhardt/pickhardtpayments/pull/29#issuecomment-1252154866, or unsubscribe https://github.com/notifications/unsubscribe-auth/AN5SWAA4OF2HOZGBR6GYIR3V7GGVLANCNFSM5ZM4Y6GQ . You are receiving this because you were mentioned.Message ID: @.***>

adamritter avatar Sep 20 '22 10:09 adamritter

@adamritter if you look at section 2.4 of our paper addressed the issue and noted that generalized min cost flow problems exist but that it seems not worthwhile during optimizsation to go for the exact fees as we approximate anyway

renepickhardt avatar Sep 20 '22 10:09 renepickhardt

Thanks, to tell you the truth, most of the problems I got were from trying to pass the LDK routing unit tests, in whose test cases the fees can be 200% of the amount sent over the network. Most of the unit tests seemed to test unrealistic cases though, so I ended up disabling the ones that were too hard to pass.

Rene Pickhardt @.***> (időpont: 2022. szept. 20., K, 11:44) ezt írta:

@adamritter https://github.com/adamritter if you look at section 2.4 of our paper https://arxiv.org/abs/2107.05322 addressed the issue and noted that generalized min cost flow problems exist but that it seems not worthwhile during optimizsation to go for the exact fees as we approximate anyway

— Reply to this email directly, view it on GitHub https://github.com/renepickhardt/pickhardtpayments/pull/29#issuecomment-1252170582, or unsubscribe https://github.com/notifications/unsubscribe-auth/AN5SWAH2NDXCE3ZHEVXYTQ3V7GIQPANCNFSM5ZM4Y6GQ . You are receiving this because you were mentioned.Message ID: @.***>

adamritter avatar Sep 20 '22 10:09 adamritter