Merging YOLOv5 with LSTM for Human Activity Recognition
Search before asking
- [X] I have searched the YOLOv5 issues and discussions and found no similar questions.
Question
Hi, I need your help in merging YOLOv5 with an LSTM to recognize certain human activities.
First, my YOLOv5 model succeeded in detecting the objects, but I need the system to recognize the way a human acts when holding an object. I searched for the best approach and found it is to merge YOLOv5 with an LSTM, but I don't know how, so that it can work as a pipeline: first detect the objects the person is holding, then have the LSTM decide the human activity of the person holding the object. Please advise me.
Thank you
Additional
No response
@glenn-jocher please help, thank you
@moahaimen LSTM models must be trained on video datasets. So, as I tell everybody, the first step is to have labelled data of the type you want to be able to predict.
Why would you need temporal information for this, though? I thought it was pretty clear whether a person is holding an object from just one frame. Is this for cases where the object cannot be seen?
Thank you for answering. I have a video dataset for the specific actions I need, but I need to understand how to combine an LSTM with YOLO, so that when I run detection on a YouTube video, for example, YOLO gives me the bounding box of the object and the LSTM gives me, on the same video, the human action. @glenn-jocher
I need to recognize a certain pose of a human carrying an object.
@moahaimen
Hi, I also have the same use case. Can you please share any ideas, or did you get any answer regarding this?
I didn't get an answer that helped me. Up to this moment I am still trying to find a solution but haven't found one. I would appreciate any help.
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Access additional YOLOv5 🚀 resources:
- Wiki – https://github.com/ultralytics/yolov5/wiki
- Tutorials – https://docs.ultralytics.com/yolov5
- Docs – https://docs.ultralytics.com
Access additional Ultralytics ⚡ resources:
- Ultralytics HUB – https://ultralytics.com/hub
- Vision API – https://ultralytics.com/yolov5
- About Us – https://ultralytics.com/about
- Join Our Team – https://ultralytics.com/work
- Contact Us – https://ultralytics.com/contact
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcome!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
@moahaimen Hi, I'm working on the same project: passing YOLO output to LSTM input. Have you implemented this? If yes, can you share the source code? I'm stuck on passing the YOLO output as input to the LSTM.
Unfortunately I couldn't. If you know how to do that, I would be grateful if you could show me how.
@developer-gurpreet
Thank you for reaching out. While YOLO and LSTM can be used together for certain applications like video analysis, directly passing YOLO output to an LSTM can be a bit challenging. Integration between different models usually requires some pre-processing and data formatting.
Here's a general approach you can follow (a rough sketch of steps 1-4 appears after the list):

1. Obtain YOLO detections: Use YOLOv5 or any other YOLO implementation to detect objects and obtain their bounding boxes in each frame of the video.
2. Extract features: Extract features or representations from the detected objects using YOLO. These features can be the bounding box coordinates, class probabilities, or other relevant information.
3. Pre-process the features: Before passing the features to the LSTM, you might need to pre-process them based on the requirements of your LSTM model. For example, you may need to normalize or scale the features, reshape them, or convert them to a time-sequence format.
4. Format data for LSTM: Your LSTM model expects a certain input format, typically a sequence of time steps. You will need to organize the pre-processed features into sequential data, with each time step representing a frame.
5. Train the LSTM model: Once the data is properly formatted, you can feed it into your LSTM model for training. You may need to adjust the architecture and hyperparameters of the LSTM model based on your specific use case.
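As a rough illustration of steps 1-4, here is a minimal sketch (not an official Ultralytics example) that loads a pretrained YOLOv5 model via torch.hub, runs it frame by frame over a video with OpenCV, and turns the top detection in each frame into a fixed-length feature vector, stacking them into one sequence for an LSTM. The feature layout (normalized box coordinates plus confidence and class index) and the "activity.mp4" input path are assumptions for illustration:

```python
# Sketch: per-frame YOLOv5 features -> one sequence tensor per clip.
# Assumes: pip install torch opencv-python, and a local clip "activity.mp4".
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pretrained COCO weights

def frame_feature(results, img_w, img_h):
    """Return a 6-dim feature for the highest-confidence detection, or zeros."""
    det = results.xyxy[0]  # (N, 6) tensor: x1, y1, x2, y2, conf, cls
    if det.shape[0] == 0:
        return torch.zeros(6)  # no detection in this frame
    x1, y1, x2, y2, conf, cls = det[det[:, 4].argmax()].tolist()
    # Normalize coordinates so the feature is resolution-independent.
    return torch.tensor([x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h, conf, cls])

cap = cv2.VideoCapture("activity.mp4")  # hypothetical input clip
features = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # YOLOv5 expects RGB
    features.append(frame_feature(results, w, h))
cap.release()

# Shape (1, T, 6): one clip of T frames, ready as batch_first LSTM input.
sequence = torch.stack(features).unsqueeze(0)
print(sequence.shape)
```

In practice you would extract one such sequence (plus an activity label) per training clip; richer features such as class probabilities or person keypoints usually work better than a single box.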
Unfortunately, I do not have a specific official code example for passing YOLO output to an LSTM; the sketches in this reply are only rough illustrations, so I also recommend referring to research papers, tutorials, or example projects that combine object detection and activity recognition to get a better understanding of the implementation details.
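That said, for step 5 a minimal PyTorch sketch of the LSTM side might look like the following; the hidden size, number of classes, and the dummy training batch are placeholders you would replace with the sequences and activity labels from your own clips:

```python
# Sketch: LSTM classifier over per-frame YOLO feature sequences.
import torch
import torch.nn as nn

class ActivityLSTM(nn.Module):
    def __init__(self, feat_dim=6, hidden=64, num_classes=4):  # placeholder sizes
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):           # x: (batch, time, feat_dim)
        _, (h_n, _) = self.lstm(x)  # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])   # one logit vector per clip

model = ActivityLSTM()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch standing in for real labelled clips: 8 clips, 30 frames each.
x = torch.randn(8, 30, 6)
y = torch.randint(0, 4, (8,))

for epoch in range(5):  # toy training loop
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```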
If you have any more specific questions or issues during the process, feel free to ask. Best of luck with your project!