langchain
Refactor TelegramChatLoader and FacebookChatLoader classes and add tests
This PR includes two main changes:
- Refactor the TelegramChatLoader and FacebookChatLoader classes by removing the dependency on pandas and simplifying the message filtering process.
- Add test cases for the TelegramChatLoader and FacebookChatLoader classes. These tests ensure that the classes correctly load and process the example chat data, providing better test coverage for this functionality.
I noticed that the FacebookChatLoader uses the same pattern and rewrote it too - added changes that follow the same logic as with TelegramChatLoader (I wasn't sure if it was worth a separate PR).
- Refactor the TelegramChatLoader and FacebookChatLoader classes by removing the dependency on pandas and simplifying the message filtering process.
Why is this desirable? Presumably, the filtering done by pandas was adding some functionality that is now gone.
We could add a parameter to NOT do the filtering, if you really want a way to avoid it, but I hesitate to delete it entirely.
I believe the original pandas code and the refactored version without pandas work the same way (the code passes tests, at least).
- With pandas, the code:
  - created a dataframe from the 'messages' list
  - then discarded irrelevant items (keeping only items of type "message" whose content is a string)
  - then formatted the rows into a string
- The refactored version builds the formatted string without pandas, discarding irrelevant items with the same filters.
If the functionality is the same (and unless I've missed something, it is), then why ask users to install pandas when it can be done without it?
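For illustration, the pandas-free filtering described above could look roughly like the sketch below. It is not the PR's actual code: the function names `format_message` and `build_chat_text` are hypothetical, and the field names (`type`, `text`, `from`, `date`) and the output format are assumptions about a Telegram-style export.

```python
from typing import Any, Dict


def format_message(message: Dict[str, Any]) -> str:
    """Format one message as a line of text (hypothetical helper; format is assumed)."""
    return f"{message['from']} on {message['date']}: {message['text']}\n\n"


def build_chat_text(export: Dict[str, Any]) -> str:
    """Join all relevant messages into one string, without pandas.

    Keeps only items whose type is "message" and whose text is a plain
    string, mirroring the two filters described in the comment above.
    """
    return "".join(
        format_message(message)
        for message in export.get("messages", [])
        if message.get("type") == "message"
        and isinstance(message.get("text"), str)
    )


if __name__ == "__main__":
    # Made-up example data, assuming a Telegram-like export structure.
    example = {
        "messages": [
            {"type": "message", "date": "2020-01-01T00:00:00", "from": "Alice", "text": "Hi"},
            {"type": "service", "date": "2020-01-01T00:00:30", "from": "Alice", "text": "joined"},
            {"type": "message", "date": "2020-01-01T00:01:00", "from": "Bob", "text": [{"type": "link", "text": "x"}]},
        ]
    }
    print(build_chat_text(example))
```

With this example data, only the first item passes both filters: the second is not of type "message", and the third has non-string text.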