Multimodal Large Model Joint Learning Algorithm: Reproduction Based on KubeEdge-Ianvs
What would you like to be added/modified: A benchmark suite for multimodal large language models deployed at the edge using KubeEdge-Ianvs:
- Modify and adapt the existing edge-cloud data collection interface to meet the requirements of multimodal data collection;
- Implement a Multimodal Large Language Model (MLLM) benchmark suite based on Ianvs;
- Reproduce mainstream multimodal joint learning (training and inference) algorithms and integrate them into Ianvs single-task learning;
- (Advanced) Test the effectiveness of multimodal joint learning in at least one of Ianvs' advanced paradigms (lifelong learning, incremental learning, federated learning, etc.).
Why is this needed: KubeEdge-Ianvs currently focuses on edge-cloud collaborative learning (training and inference) for a single modality of data. However, edge devices, such as those in autonomous vehicles, often capture multimodal data, including GPS, LIDAR, and Camera data. Single-modal learning can no longer meet the precise inference requirements of edge devices. Therefore, this project aims to integrate mainstream multimodal large model joint learning algorithms into KubeEdge-Ianvs edge-cloud collaborative learning, providing multimodal learning capabilities.
Recommended Skills: TensorFlow/Pytorch, LLMs, KubeEdge-Ianvs
Useful links:
- KubeEdge-Ianvs
- KubeEdge-Ianvs Benchmark Test Cases
- Building Edge-Cloud Synergy Simulation Environment with KubeEdge-Ianvs
- Artificial Intelligence - Pretrained Models Part 2: Evaluation Metrics and Methods
- Example LLMs Benchmark List
- awesome-multimodal-ml
- Awesome-Multimodal-Large-Language-Models
@CreativityH Is there any recommended community channel to connect and discuss more about the project?
Hi @CreativityH,
I'm excited about the opportunity to contribute to the "Multimodal Large Model Joint Learning Algorithm" project. My background in edge computing and machine learning, particularly with TensorFlow/PyTorch, aligns well with the project's goals. Here’s my proposed approach:
Proposed Approach
- Multimodal Data Collection Interface:
I will modify and adapt the existing edge-cloud data collection interface to handle multimodal data, including GPS, LIDAR, and Camera inputs. This will involve creating a unified data schema and preprocessing modules for each data type to ensure compatibility and consistency (see the schema sketch after this list).
- Multimodal Large Language Model (MLLM) Benchmark Suite:
I will develop a benchmark suite for multimodal LLMs based on Ianvs. This will involve identifying suitable multimodal LLMs and defining relevant performance metrics, such as accuracy, latency, and resource utilization, to evaluate their effectiveness when deployed at the edge.
- Multimodal Joint Learning Algorithms:
I will reproduce mainstream multimodal joint learning algorithms (training and inference) and integrate them into Ianvs’ single-task learning framework. This step will ensure the system can effectively handle the complexities of multimodal data.
- Advanced Testing and Optimization:
I will test the effectiveness of multimodal joint learning in at least one of Ianvs' advanced paradigms (lifelong learning, incremental learning, federated learning). I will benchmark the system to ensure performance improvements without compromising accuracy. I will explore possible optimizations to enhance the efficiency of the edge-cloud collaborative learning setup, focusing on resource usage and latency reduction.
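To make the first item concrete, here is a minimal sketch of what I mean by a unified data schema with per-modality preprocessing. All names here (MultimodalSample, PREPROCESSORS, preprocess) are hypothetical illustrations, not existing Ianvs interfaces:

```python
# Hypothetical sketch: a unified schema plus per-modality preprocessing.
# MultimodalSample, PREPROCESSORS, and preprocess() are illustrative names,
# not existing Ianvs interfaces.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional

import numpy as np


@dataclass
class MultimodalSample:
    """One time-aligned record gathered from an edge device."""
    timestamp: float                                   # shared clock for alignment
    gps: Optional[np.ndarray] = None                   # e.g., (lat, lon, alt)
    lidar: Optional[np.ndarray] = None                 # point cloud, shape (N, 4)
    camera: Optional[np.ndarray] = None                # image, shape (H, W, 3)
    meta: Dict[str, Any] = field(default_factory=dict)


# One preprocessing function per modality keeps inputs consistent before fusion.
PREPROCESSORS: Dict[str, Callable[[np.ndarray], np.ndarray]] = {
    "gps": lambda x: x.astype(np.float32),
    "lidar": lambda x: x[:, :3],                       # keep xyz, drop intensity
    "camera": lambda x: x.astype(np.float32) / 255.0,  # normalize pixel values
}


def preprocess(sample: MultimodalSample) -> MultimodalSample:
    """Apply the registered preprocessor to every modality that is present."""
    for name, fn in PREPROCESSORS.items():
        value = getattr(sample, name)
        if value is not None:
            setattr(sample, name, fn(value))
    return sample
```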
Looking forward to your feedback and to contributing!
Best Regards, Sargam
Please share any suggestions you have on how to start this project. I will be sharing my findings and research here.
Hi @SargamPuram, what a great proposal! Furthermore, I'm curious to know how you would alter the data collection interface so that new data formats can be added without changing the original data collection content. Flowcharts and other ways of showing your thinking are welcome. Please feel free to contact me if you have any questions. Looking forward to your amazing ideas!
Hello @AryanNanda17,
From your introduction I see that you are an active code contributor and community member, and that you have earned a lot of certifications, which is awesome.
I learned that you have experience collecting radar and camera data in Evo-Borne. So I think you could start by familiarizing yourself with the Ianvs platform, finding its data collection interface, and then thinking about how to modify that interface for multimodality.
Looking forward to your amazing ideas!
Hi, are there any pre-tests for this project?
Hello @CreativityH, I'm Aryan Yadav, and I'm excited to contribute to the Multimodal Large Model Joint Learning Algorithm project with KubeEdge-Ianvs. I have extensive experience in ML, PyTorch, LLMs, and multimodal AI, have worked on some strong LLM-related projects, and won goodies for that. Looking forward to collaborating on this!
Here is a potential solution. Upgrade the data collection interface: the way the system collects information from edge devices needs to change to accommodate several streams at a time, such as GPS, LiDAR, or camera images. This involves a flexible setup for data collection so that it can seamlessly process varied data types.
Create a Multimodal Benchmark Suite: next, I will set up a number of tests to see how well the system manages and integrates various types of data. This will include testing how the system handles, and makes sense of, combined data types.
Integrate Joint Learning Algorithms: I will integrate algorithms that train on mixed data types, which is necessary for making accurate predictions when the data inputs are complex. I will ensure that these algorithms work well with the system's current learning methods (a minimal fusion sketch follows).
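As a sketch of what training on mixed data types could look like, here is a minimal late-fusion model in PyTorch: each modality gets its own encoder, and the embeddings are concatenated before a shared head. This is purely illustrative, not an Ianvs interface:

```python
# Illustrative late-fusion model: one encoder per modality, embeddings
# concatenated into a shared classification head. Not an Ianvs interface.
import torch
import torch.nn as nn


class LateFusionNet(nn.Module):
    def __init__(self, gps_dim=3, lidar_dim=64, img_dim=128, n_classes=10):
        super().__init__()
        self.gps_enc = nn.Sequential(nn.Linear(gps_dim, 16), nn.ReLU())
        self.lidar_enc = nn.Sequential(nn.Linear(lidar_dim, 32), nn.ReLU())
        self.img_enc = nn.Sequential(nn.Linear(img_dim, 32), nn.ReLU())
        self.head = nn.Linear(16 + 32 + 32, n_classes)  # fused classifier

    def forward(self, gps, lidar, img):
        fused = torch.cat(
            [self.gps_enc(gps), self.lidar_enc(lidar), self.img_enc(img)],
            dim=-1,
        )
        return self.head(fused)


# Usage: one fused prediction from three synchronized modality tensors.
model = LateFusionNet()
logits = model(torch.randn(8, 3), torch.randn(8, 64), torch.randn(8, 128))
print(logits.shape)  # torch.Size([8, 10])
```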
Advanced Testing: I will further test the system in more advanced learning scenarios, such as continuous or federated learning, where the system adapts over time and learns from data across different devices without sharing the raw data (see the federated-averaging sketch below).
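A minimal federated-averaging sketch of that idea, where only model weights, never raw data, leave each device (all names are illustrative):

```python
# Illustrative federated averaging (FedAvg): devices train locally and share
# only weights; raw data never leaves local_update(). Names are made up.
from typing import List

import numpy as np


def local_update(weights: np.ndarray, local_data: np.ndarray,
                 lr: float = 0.01) -> np.ndarray:
    """One on-device training step; local_data stays on the device."""
    # Toy gradient of 0.5 * ||X w||^2 / n, standing in for real training.
    grad = local_data.T @ (local_data @ weights) / len(local_data)
    return weights - lr * grad


def federated_round(global_weights: np.ndarray,
                    device_datasets: List[np.ndarray]) -> np.ndarray:
    """Each device trains locally; the server averages the returned weights."""
    updates = [local_update(global_weights.copy(), data)
               for data in device_datasets]
    return np.mean(updates, axis=0)


rng = np.random.default_rng(0)
w = rng.normal(size=3)
devices = [rng.normal(size=(32, 3)) for _ in range(4)]  # raw data per device
for _ in range(10):
    w = federated_round(w, devices)
print("weight norm after 10 rounds:", np.linalg.norm(w))
```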
By designing a modular and pluggable data collection system, one can integrate new data formats without modifying the existing content flow. This approach allows for flexibility and scalability in handling diverse types of data (a minimal plugin-registry sketch follows). I have tried to explain it using a simple flowchart :)
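A minimal sketch of the pluggable idea: each data format registers itself as a collector plugin, so adding a new format never edits the existing flow. The names (register_collector, COLLECTORS, the collect_* stubs) are hypothetical, not Ianvs APIs:

```python
# Illustrative plugin registry: each data format registers a collector, so
# adding a new format never edits the existing flow. register_collector and
# COLLECTORS are hypothetical names, not Ianvs APIs.
from typing import Any, Callable, Dict

COLLECTORS: Dict[str, Callable[[], Any]] = {}


def register_collector(modality: str):
    """Decorator that plugs a new modality into the collection pipeline."""
    def wrap(fn: Callable[[], Any]) -> Callable[[], Any]:
        COLLECTORS[modality] = fn
        return fn
    return wrap


@register_collector("gps")
def collect_gps() -> tuple:
    return (37.77, -122.42, 12.0)  # stub GPS reading


@register_collector("camera")
def collect_camera() -> bytes:
    return b"\x00" * 16  # stub camera frame


def collect_all() -> Dict[str, Any]:
    """The existing flow just iterates the registry; it never changes."""
    return {name: fn() for name, fn in COLLECTORS.items()}


# Adding LIDAR later is a single new registration, with no edits above:
@register_collector("lidar")
def collect_lidar() -> list:
    return [[0.0, 0.0, 0.0]]  # stub point cloud
```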
Hello, @staru09
I think three pre-tests are needed to de-risk the idea:
- successfully run Ianvs on your device;
- figure out what kinds of multimodal data you want to collect and use;
- test your selected multimodal data and corresponding algorithm.
After these three steps, I guess you will have a way to handle the issue.
Hello @aryan0931, nice job!
You have made a great flowchart which clearly shows your idea and design. What I am wondering now is whether you might incorporate Ianvs into your flowchart. I think you could run Ianvs on your device first to get to know Ianvs better.
Looking forward to your enhanced design!
Sure sir, @CreativityH, I will update you on this in some time.
@CreativityH I am interested in working on this under LFX this term. Can you point me to the relevant docs and pre-tests?
Hello @MooreZheng @CreativityH
I'm interested in the project focused on developing a benchmark suite for Multimodal Large Language Models using KubeEdge-Ianvs. The integration of multimodal joint learning into edge-cloud collaborative systems is crucial, and I'd love to contribute. I have experience with TensorFlow/PyTorch, LLMs, and KubeEdge-Ianvs, and I'm eager to be part of this effort.
Looking forward to discussing this further!
Hi @octonawish-akcodes, here are the relevant docs:
- KubeEdge-Ianvs
- KubeEdge-Ianvs Benchmark Test Cases
- Building Edge-Cloud Synergy Simulation Environment with KubeEdge-Ianvs
- Artificial Intelligence - Pretrained Models Part 2: Evaluation Metrics and Methods
- Example LLMs Benchmark List
- awesome-multimodal-ml
- Awesome-Multimodal-Large-Language-Models
Showing your creative ideas is just fine!
While testing the Ianvs quickstart there were lots of dependency issues and YAML file path irregularities. I somehow fixed them, but I'm not sure how to fix this error; can you have a look? @CreativityH
Is FPN_TensorFlow a custom module? Because I couldn't find such a module on pip :/
Hello @CreativityH @MooreZheng, thanks for your feedback! I applied your recommendation to the flowchart and have included Ianvs; this integration is now clearly visible in the approach below. I would really like to know what you think about the redesign and whether you have any other ideas to make it even better.
Upgrade Data Collection Interface
- Change Collection Approach
- Adapt to Multiple Streams (GPS, Lidar, Camera Images)
- Flexible Setup for Varied Data Types

Create Multimodal Benchmark Suite
- Set Up Tests for Data Integration
- Test Handling of Combined Data Types

Integrate Joint Learning Algorithms with Ianvs
- Train on Mixed Data Types
- Ensure Compatibility with Current Learning Methods using Ianvs

Advanced Testing in Ianvs Environment
- Test in Advanced Scenarios
- Continuous Learning with Ianvs

Federated Learning with Ianvs Integration
- Adapt and Learn from Data Across Devices Without Sharing Raw Data

Design Modular and Pluggable System Compatible with Ianvs
- Integrate New Data Formats with Ianvs
- Maintain Existing Content Flow
- Ensure Flexibility and Scalability using Ianvs
This integration leverages Ianvs for key components like joint learning algorithms, advanced testing, federated learning, and continuous learning. It also aligns with our focus on cloud-edge collaboration, ensuring that the system remains scalable, flexible, and ready for future challenges.
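For the benchmark-suite branch above, a test metric could be registered the way existing Ianvs example metrics appear to be (via sedna's ClassFactory) and then referenced by name from the test environment configuration, alongside latency and resource-utilization metrics. A sketch under that assumption; the metric body and the alias are illustrative:

```python
# Sketch of a custom test metric, assuming the ClassFactory registration
# pattern used by existing Ianvs example metrics; the metric body and the
# alias "multimodal_accuracy" are illustrative.
from sedna.common.class_factory import ClassType, ClassFactory

__all__ = ["multimodal_accuracy"]


@ClassFactory.register(ClassType.GENERAL, alias="multimodal_accuracy")
def multimodal_accuracy(y_true, y_pred, **kwargs):
    """Fraction of samples whose fused multimodal prediction matches the label."""
    total = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / total if total else 0.0
```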
Hey @CreativityH, Siddhant here. I would like to take part in this project under your guidance in the LFX Mentorship. I am a beginner in open source contribution, but I am interested in LLMs and Kubernetes, and I am very excited about working on this project and with the KubeEdge-Ianvs framework. Currently I am trying to learn more about Kubernetes, LLMs, and multimodal ML. If there are any resources you could provide to help me prepare for this project, it would be of much help. Also, are there any prerequisite tasks I must complete other than setting up Ianvs on my local machine? Thanks again for considering this request.
Hello @CreativityH, I followed the instructions in the Quick Start guide. I faced the same issues as @octonawish-akcodes (a bunch of compatibility issues: it says to use Python 3.6.9, but when this is used, other packages give an error saying that they need a higher version of Python, and so on). The quick-start guide needs an update.
Also, should I write a proposal for this project and do a PR?
The command `sudo apt-get install libgl1-mesa-glx -y` is creating an error in the quick start guide. Never mind, it's not the only one; there are a lot of dependencies, like yaml, pandas, colorbar, etc., that are missing.
Hello @CreativityH, I think there are some issues with the quick start guide. The command `sudo apt-get install libgl1-mesa-glx -y` is causing some issues, and there are also some issues related to the Python version; can you provide a quick solution for these? I am working on the issues and will update you if I find a feasible solution.
I raised an RTM PR for the YAML path inconsistencies in the pcb-aoi example in the quickstart: https://ianvs.readthedocs.io/en/latest/guides/quick-start.html#step-3-ianvs-execution-and-presentation
cc @CreativityH
Here is the PR https://github.com/kubeedge/ianvs/pull/133
Is FPN_TensorFlow a custom module? Because I couldn't find such a module on pip :/
I found this dependency here. Maybe try to pip install this wheel.
Nicely done! Maybe you can list which specific multimodal learning (training/inference) algorithms you want to use, and what improvement each algorithm can achieve. Next, specify the relevant Ianvs function names (e.g., the data collection interface) and show where your modified functions would be located.
@CreativityH Thanks for the comment, it worked. Also, I have shared my proposal on CNCF Slack; can you have a look and give feedback?
Maybe the following links are useful:
- KubeEdge-Ianvs
- KubeEdge-Ianvs Benchmark Test Cases
- Building Edge-Cloud Synergy Simulation Environment with KubeEdge-Ianvs
- Artificial Intelligence - Pretrained Models Part 2: Evaluation Metrics and Methods
- Example LLMs Benchmark List
- awesome-multimodal-ml
- Awesome-Multimodal-Large-Language-Models
Maybe you can display your idea via a flowchart, like @aryan0931 did.
@MooreZheng There are some common installation issues encountered by @AryanNanda17 and @staru09.
Sure, it is my honor.
Hello @CreativityH,
Why does it show that the application is closed?
According to this, one more day is pending:
I had in mind that the last day is the 15th and that I have to fill out the application before that day. So why is it closed?
@AryanNanda17 You're looking at the wrong document; it clearly states the 2023 mentorship session. The correct link is https://github.com/cncf/mentoring/blob/main/programs/lfx-mentorship/2024/03-Sep-Nov/README.md, which has the 2024 timelines.
ugh, looks like I missed it. :(