Update packages for macOS
Unfreeze matplotlib and add tensorflow-metal to the macOS install for Apple silicon.
I tried the install (main branch with your updated files) on a MacBook Air M1. The install works, but training is strange. Here is a fragment from the last epoch; note that the loss is astronomical. Are you getting valid results?
178/178 [==============================] - ETA: 0s - loss: 119965049405247305285632.0000 - n_outputs0_loss: 80119361630094294515712.0000 - n_outputs1_loss: 39845651746355991805952.0000
Epoch 6: val_loss did not improve from 107208061654925312.00000
178/178 [==============================] - 21s 117ms/step - loss: 119965049405247305285632.0000 - n_outputs0_loss: 80119361630094294515712.0000 - n_outputs1_loss: 39845651746355991805952.0000 - val_loss: 134251115941612859949056.0000 - val_n_outputs0_loss: 15944611674736914595840.0000 - val_n_outputs1_loss: 118306526784874082205696.0000
INFO:donkeycar.parts.keras:////////// Finished training in: 0:02:25.759302 //////////
I then uninstalled tensorflow-metal and, though training was slow, I got the expected results:
Epoch 76: val_loss did not improve from 0.04335
178/178 [==============================] - 50s 282ms/step - loss: 0.0500 - n_outputs0_loss: 0.0365 - n_outputs1_loss: 0.0135 - val_loss: 0.0438 - val_n_outputs0_loss: 0.0316 - val_n_outputs1_loss: 0.0122
Epoch 77/100
178/178 [==============================] - ETA: 0s - loss: 0.0483 - n_outputs0_loss: 0.0350 - n_outputs1_loss: 0.0133
Epoch 77: val_loss did not improve from 0.04335
178/178 [==============================] - 75s 420ms/step - loss: 0.0483 - n_outputs0_loss: 0.0350 - n_outputs1_loss: 0.0133 - val_loss: 0.0446 - val_n_outputs0_loss: 0.0323 - val_n_outputs1_loss: 0.0123
INFO:donkeycar.parts.keras:////////// Finished training in: 1:24:04.925587 //////////
I have been trying to get this to work since version 5.0+. The setup running TensorFlow 2.9 with the matching Python 3.9 and tensorflow-metal 0.5.0 works. I have read that others have problems with the newer versions, and there have been no updates from Apple since September of last year.
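For reference, a diverging run like the one reported above can be spotted from the Keras log text alone. The following is a minimal sketch, not part of donkeycar: the function names `parse_losses` and `looks_diverged` and the threshold of `1e6` are assumptions chosen for illustration.

```python
import math
import re

# Matches the overall "loss: <number>" field only. The lookbehind rejects
# fields such as "val_loss" or "n_outputs0_loss", where "loss" follows "_".
LOSS_RE = re.compile(r"(?<!\w)loss: ([0-9.eE+-]+|nan|inf)")


def parse_losses(log_text):
    """Return every overall training loss found in the log text, as floats."""
    return [float(value) for value in LOSS_RE.findall(log_text)]


def looks_diverged(losses, threshold=1e6):
    """Flag a run if any loss is NaN, infinite, or above the (assumed) threshold."""
    return any(math.isnan(x) or math.isinf(x) or x > threshold for x in losses)
```

Applied to the two fragments in this comment, the tensorflow-metal run would be flagged, while the CPU-only run with losses around 0.05 would not.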
Thanks @cfox570 for helping to test the code. Training for me works without any problems on my MB Pro M3. Can you please check which versions of tensorflow, tensorflow-metal and python are getting installed in your environment? I am seeing the following:
tensorflow 2.15.1
tensorflow-metal 1.1.0
python 3.11.9
Note that when you check out the version of the PR, you should see version v5.2.dev3 when loading donkeycar. Also, are you perhaps training from a transfer model? I could imagine that the formats of the saved models differ when using tensorflow-metal, in which case you might need to start from scratch.
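One way to collect these versions in a single step is to query the active environment from Python. This is a minimal sketch using only the standard library; the helper name `report_versions` is hypothetical, and the package names passed to it are assumed to match the installed distribution names.

```python
import platform
from importlib import metadata


def report_versions(packages=("tensorflow", "tensorflow-metal", "donkeycar")):
    """Return a dict of installed versions for the interpreter and given packages."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is absent from the currently active environment.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name:20s} {version}")
```

Running it inside the activated conda environment lists what is actually importable there, which may differ from what the install files request.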
tensorflow 2.15.1
tensorflow-estimator 2.15.0
tensorflow-metal 1.1.0
python 3.11.11
Conda 24.9.2
Note that I tried many other Python versions this past summer.
We obviously have different data. Are you using donkeycar or simulation data?
Craig
On Dec 28, 2024, at 2:41 PM, DocGarbanzo @.***> wrote:
tensorflow 2.15.1 tensorflow-metal 1.1.0 python 3.11.9
I am using Ed's circuit launch data from the donkeycar_dataset repo.
Can you send me the link?
On Jan 15, 2025, at 3:23 AM, DocGarbanzo @.***> wrote:
I am using Ed's circuit launch data from the donkeycar_dataset repo.
Can you send me the link?
Yeah sure, sorry for the late reply: https://github.com/autorope/donkey_datasets/tree/master/circuit_launch_20210716
I am in the process of trying a new install with Miniconda3-py311_24.11.1-0-MacOSX-arm64.pkg. What version of Miniconda are you using?
I bought a new M4 MacBook Air, installed the software with the changes, and confirmed that the install works. Using the dataset, training ran for 10 epochs at 12s per epoch (59ms/step), with val_loss = 0.20352 and an elapsed time of 0:02:02.513858. Of course I can't test since this is for a course I do not have. I am giving my old M1 Mac to my daughter!