Model-Free-Episodic-Control
Speeding up Episodic control
Hey! Really great work with the episodic control agent. I put it to run on my NVIDIA Tesla and left it overnight; the next morning only 7 epochs had completed. Is there any way we could speed it up?
I ran a profiler for 2 epochs; here are the results, ordered by internal time:
| ncalls | tottime | percall | cumtime | percall | filename:lineno(function) |
|---|---|---|---|---|---|
| 577059602 | 1077.328 | 0.000 | 1077.328 | 0.000 | {method 'reduce' of 'numpy.ufunc' objects} |
| 149385847 | 1067.267 | 0.000 | 2702.390 | 0.000 | numeric.py:2340(within_tol) |
| 149385847 | 633.005 | 0.000 | 4695.321 | 0.000 | numeric.py:2281(isclose) |
| 298771703 | 422.120 | 0.000 | 921.010 | 0.000 | numeric.py:2478(seterr) |
| 448157543 | 345.168 | 0.000 | 1792.992 | 0.000 | fromnumeric.py:1968(all) |
| 298771703 | 303.658 | 0.000 | 333.810 | 0.000 | numeric.py:2578(geterr) |
| 133794 | 236.010 | 0.002 | 5984.444 | 0.045 | EC_functions.py:48(estimate) |
| 128898286 | 229.696 | 0.000 | 654.351 | 0.000 | EC_functions.py:76(_calc_distance) |
| 448161306 | 222.306 | 0.000 | 272.530 | 0.000 | numeric.py:476(asanyarray) |
| 896398980 | 218.985 | 0.000 | 219.419 | 0.000 | {numpy.core.multiarray.array} |
| 448157556 | 199.122 | 0.000 | 1175.297 | 0.000 | {method 'all' of 'numpy.ndarray' objects} |
| 149385847 | 194.654 | 0.000 | 5588.565 | 0.000 | numeric.py:2216(allclose) |
| 149385847 | 171.426 | 0.000 | 235.043 | 0.000 | numeric.py:1970(isscalar) |
| 298772806 | 158.011 | 0.000 | 158.011 | 0.000 | {abs} |
| 448157556 | 136.889 | 0.000 | 976.175 | 0.000 | _methods.py:40(_all) |
| 298771704 | 116.438 | 0.000 | 116.438 | 0.000 | {numpy.core.umath.seterrobj} |
| 128898286 | 110.123 | 0.000 | 424.655 | 0.000 | fromnumeric.py:1737(sum) |
| 149385851 | 106.513 | 0.000 | 522.222 | 0.000 | numeric.py:2874(exit) |
| 278396649/278396642 | 105.359 | 0.000 | 105.361 | 0.000 | {isinstance} |
| 149385851 | 99.092 | 0.000 | 604.393 | 0.000 | numeric.py:2869(enter) |
| 149385853 | 97.069 | 0.000 | 97.069 | 0.000 | {numpy.core.multiarray.result_type} |
| 149385851 | 94.841 | 0.000 | 115.454 | 0.000 | numeric.py:2865(init) |
| 597543408 | 78.794 | 0.000 | 78.794 | 0.000 | {numpy.core.umath.geterrobj} |
| 128898286 | 65.159 | 0.000 | 65.159 | 0.000 | EC_functions.py:16(init) |
| 118009 | 51.381 | 0.000 | 65.375 | 0.001 | {_heapq.nsmallest} |
| 128898287 | 34.789 | 0.000 | 272.828 | 0.000 | _methods.py:31(_sum) |
| 149387544 | 20.613 | 0.000 | 20.613 | 0.000 | {method 'pop' of 'dict' objects} |
| 80842 | 15.486 | 0.000 | 15.486 | 0.000 | ale_python_interface.py:140(act) |
| 128898286 | 13.995 | 0.000 | 13.995 | 0.000 | EC_functions.py:65( |
| 20000 | 10.066 | 0.001 | 656.611 | 0.033 | EC_functions.py:81(update) |
| 19888 | 9.805 | 0.000 | 5994.359 | 0.301 | EC_agent.py:95(_choose_action) |
| 129045706 | 9.425 | 0.000 | 9.425 | 0.000 | {method 'append' of 'list' objects} |
How about using sklearn's KNN instead?
You are asking really incisive questions. Thank you very much, Sudeep.
The first question is about the KNN algorithm. Because of the high dimensionality of the states and the dynamic LRU Q_EC buffer, a KD tree or ball tree (sklearn) did not seem feasible, so I chose a naive implementation using a heap. (Time complexity: O(D·N·log K), where D is the dimensionality of the states, N is the buffer size for a single action, and K is the number of nearest neighbors.)
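The heap-based estimate described above can be sketched as follows. This is a minimal illustration, not the repository's actual code; the names `knn_estimate`, `buffer_states`, and `buffer_values` are hypothetical.

```python
import heapq
import numpy as np

def knn_estimate(buffer_states, buffer_values, query, k):
    """Average the values of the k buffer states closest to `query`.

    Computing all squared L2 distances costs O(D*N); heapq.nsmallest
    keeps a heap of size k while scanning, costing O(N*log K) -- together
    the O(D*N*log K) bound mentioned above.
    """
    dists = [np.sum((s - query) ** 2) for s in buffer_states]
    nearest = heapq.nsmallest(k, zip(dists, buffer_values))
    return sum(v for _, v in nearest) / k
```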
The second question is why episodic control is much slower than the original DQN. This is because, at every step, the return for each action is estimated via KNN, so the time complexity is approximately O(num_actions · N), and according to the paper N = 1,000,000.
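The per-step cost can be seen in a sketch of the action-selection loop: one k-NN estimate per action over that action's buffer. Names here (`choose_action`, `per_action_buffers`) are illustrative stand-ins, not the repository's API.

```python
import numpy as np

def choose_action(per_action_buffers, query, k):
    """Pick the action with the highest k-NN return estimate.

    per_action_buffers: list of (states, values) pairs, one per action.
    Each action costs O(N*D) for distances, so a single step is
    O(num_actions * N * D) -- the bottleneck discussed above.
    """
    best_action, best_value = 0, -np.inf
    for action, (states, values) in enumerate(per_action_buffers):
        d = np.sum((states - query) ** 2, axis=1)        # O(N*D)
        idx = np.argpartition(d, min(k, len(d)) - 1)[:k]  # k smallest, unordered
        est = float(np.mean(np.asarray(values)[idx]))
        if est > best_value:
            best_action, best_value = action, est
    return best_action
```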
I have tried to contact the authors, Charles Blundell and Benigno Uria, about the above issues, but I have not yet received a reply.
Hey! I modified the code in EC_functions.py. You can check it out here
There are two major changes:
- I used NumPy arrays whose size equals the maximum size of the LRU buffer. This makes searching for the minimum easier and avoids dynamic list appends, which are slower than writing into a preallocated NumPy array. I did implement a proper LRU cache using Python's `OrderedDict`, but found it slower than this lazy implementation.
- I used KD trees from scikit-learn both for finding the k nearest neighbors and for testing membership.
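The first change can be sketched roughly as below, assuming a "lazy LRU": preallocate fixed-size arrays and overwrite the slot with the oldest access time instead of maintaining a true LRU ordering. The class and field names are hypothetical.

```python
import numpy as np

class LazyBuffer:
    """Fixed-size buffer: preallocated arrays, evict the least-recently
    stamped slot. A sketch of the 'lazy' LRU idea, not the repo's code."""

    def __init__(self, capacity, dim):
        self.states = np.zeros((capacity, dim))
        self.values = np.zeros(capacity)
        self.times = np.zeros(capacity)   # last-access time per slot
        self.size = 0
        self.capacity = capacity
        self._clock = 0.0

    def add(self, state, value):
        self._clock += 1.0
        if self.size < self.capacity:
            i = self.size                  # fill an empty slot first
            self.size += 1
        else:
            # overwrite the slot with the smallest time stamp (oldest)
            i = int(np.argmin(self.times[:self.size]))
        self.states[i] = state
        self.values[i] = value
        self.times[i] = self._clock
```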
Now the first epoch runs in under a minute on my Core i5 laptop. There is still scope for a few changes:
- Currently I build a new tree every time the LRU cache is updated; according to my cProfile run this is now the primary bottleneck. A dynamic tree structure supporting insertion, deletion, quick membership testing, and quick k-nearest-neighbor queries would do the job, but I couldn't find one that fits the task.
- A proper LRU cache used along with the dynamic data structure mentioned above.
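The rebuild-then-query pattern described above looks roughly like this with scikit-learn's `KDTree` (the tree is immutable, hence the rebuild on every update). The function name and the exact-match membership check are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KDTree

def rebuild_and_query(states, values, query, k):
    """Rebuild the KD tree from scratch, then answer one k-NN query.

    Rebuilding costs O(N log N) per buffer update -- the bottleneck
    noted above -- while each query is roughly O(log N) afterwards.
    """
    tree = KDTree(states)                       # rebuilt on every update
    dist, ind = tree.query(query.reshape(1, -1), k=k)
    estimate = float(np.mean(values[ind[0]]))   # mean value of k neighbors
    is_member = bool(dist[0, 0] == 0.0)         # exact-match membership test
    return estimate, is_member
```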
Wow, thank you very much. I am on vacation now and will be back at school in two days. I was about to tell you that I heard back from the authors: they used an approximate KNN algorithm. I will check your code soon; again, thank you very much!
Hello. I checked your code and saw you were using a KD tree. I have pushed a new commit in which I implemented approximate KNN. Please have a look if possible. Thanks again.