Add unit tests for the parsers

Open joweich opened this issue 2 years ago • 2 comments

Goal:

Test the single parsers using unit tests.

Jobs to be done for the minimum viable solution:

  • [ ] Create dummy chatlog with the same formatting as the respective native export files
  • [ ] Define the target output of the parser for the dummy chatlog
  • [ ] Test the target and the actual output for equality
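
The three steps above could be sketched roughly as follows. This is only an illustration: `parse_line`, the regular expression, and the sample messages are hypothetical stand-ins, not chat-miner's actual parser API, and the line format is assumed to follow an Android WhatsApp export (`MM/DD/YY, H:MM AM - Author: message`).

```python
import re
from datetime import datetime

# Assumed line format, modeled on an Android WhatsApp export:
# "MM/DD/YY, H:MM AM - Author: message"
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2}), (\d{1,2}:\d{1,2} [AP]M) - ([^:]+): (.*)$"
)


def parse_line(line):
    """Hypothetical stand-in for a parser; not chat-miner's actual API."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    date, time, author, message = match.groups()
    timestamp = datetime.strptime(f"{date} {time}", "%m/%d/%y %I:%M %p")
    return {"timestamp": timestamp, "author": author, "message": message}


# 1. Dummy chatlog in the export's formatting (made-up content).
DUMMY_CHATLOG = [
    "10/15/22, 9:10 AM - Alice Example: Good morning!",
    "10/15/22, 9:12 PM - Bob Example: See you tomorrow.",
]

# 2. Target output, written by hand for the dummy chatlog.
TARGET = [
    {
        "timestamp": datetime(2022, 10, 15, 9, 10),
        "author": "Alice Example",
        "message": "Good morning!",
    },
    {
        "timestamp": datetime(2022, 10, 15, 21, 12),
        "author": "Bob Example",
        "message": "See you tomorrow.",
    },
]


# 3. Compare the target with the actual parser output.
def test_parser_matches_target():
    actual = [parse_line(line) for line in DUMMY_CHATLOG]
    assert actual == TARGET
```

Keeping the dummy chatlog and the target side by side makes it easy to see which input line produced which expected record when a comparison fails.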

Next steps

  • [ ] Extend the dummy chat logs with edge cases (e.g. logged system notifications in WhatsApp export files)
  • [ ] Create GitHub workflow running the test
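
As an illustration of the system-notification edge case, the sketch below separates authored messages from notifications by checking for an `Author: ` separator after the timestamp prefix. The helper `split_author` and the prefix pattern are assumptions made for this example, not chat-miner code.

```python
import re

# Assumed timestamp prefix of an Android WhatsApp export line.
PREFIX_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2}, \d{1,2}:\d{1,2} [AP]M - ")


def split_author(line):
    """Return (author, message); author is None for system notifications.

    Hypothetical helper for illustration only.
    """
    body = PREFIX_RE.sub("", line, count=1)
    author, separator, message = body.partition(": ")
    if not separator:
        # No "Author: " part, so treat the line as a system notification.
        return None, body
    return author, message


def test_system_notifications_have_no_author():
    notification = '06/23/91, 7:49 PM - You created group "Test group"'
    message = "02/27/14, 7:31 AM - Richard Steele: Vote catch situation."
    assert split_author(notification) == (None, 'You created group "Test group"')
    assert split_author(message) == ("Richard Steele", "Vote catch situation.")
```

This heuristic would misclassify a notification whose text happens to contain `: `, which is the kind of case the extended dummy chat logs could cover.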

Contribution

The unit tests for the different parsers may be created separately, although they should be aligned with each other.

joweich avatar Oct 15 '22 09:10 joweich

Regarding the first subtask, here is a dummy chatlog with the same formatting I got when exporting a WhatsApp chat from an Android smartphone. The Google Colab notebook that creates this mock chat using Python and the Faker library can be reached via this bitly link: https://bit.ly/3MxeYdO

11/22/96, 2:20 AM - Messages and calls are end-to-end encrypted. No one outside of this chat, not even WhatsApp, can read or listen to them. Tap to learn more.
06/23/91, 7:49 PM - You created group "Blau Klasse klein"
07/15/15, 9:21 PM - You changed the group description
02/27/14, 7:31 AM - Richard Steele: Vote catch situation few seem yard main cover. Treatment seek see own right have its benefit.
09/21/95, 8:3 AM - Herr Karl-August Hamann B.Eng.: Perspiciatis minima officiis cum. Iste quasi quia tenetur laborum.
04/26/83, 8:49 PM - Pierpaolo Paoletti-Ferrante: Ducimus tenetur tempore nihil. Ad reprehenderit quis quam dolore fugiat. Deleniti quaerat odit libero.
08/30/75, 11:53 AM - Gianmarco Traversa: Pariatur eaque qui asperiores consequatur quae cupiditate. Consectetur nemo debitis doloribus nobis.

bsenst avatar Oct 15 '22 21:10 bsenst

@bsenst do you also have a target output to compare the parser's result with?

joweich avatar Oct 16 '22 14:10 joweich

@joweich Since it seems that the rest of the subtasks have already been solved, this issue should be resolved by adding tests for the rest of the parsers, right?

alfonso46674 avatar Nov 09 '22 16:11 alfonso46674

@alfonso46674 exactly! I set up the test infrastructure (workflow, directory) and the tests for the WhatsApp parser. To close this issue, we would need to add corresponding tests for the other parsers.

joweich avatar Nov 09 '22 23:11 joweich

Np, I'll work on that in the coming days

alfonso46674 avatar Nov 09 '22 23:11 alfonso46674

Has this issue been taken already? I would like to contribute.

Abutahir12 avatar Nov 28 '22 15:11 Abutahir12

I created a PR that adds test coverage for Facebook Messenger chats, so Telegram, iMessage and Signal are still missing. Feel free to rework the Facebook Messenger chat tests as well if you want to.

alfonso46674 avatar Nov 28 '22 16:11 alfonso46674

@Abutahir12 yes please! Feel free to do so! As @alfonso46674 mentioned, we have a lot of test coverage still open.

joweich avatar Nov 28 '22 19:11 joweich

@joweich great!

Abutahir12 avatar Nov 29 '22 04:11 Abutahir12

@joweich between the end of December and the end of January next year I might find time to work on setting up both Telegram parsers. My question is: wouldn't it be better to use something other than simple asserts? Something like unittest or pytest?

massimopavoni avatar Dec 11 '22 10:12 massimopavoni

To add to that, I can use my local SonarQube instance to check line and condition coverage. I don't know whether any contributor had other solutions in mind to make sure the quality is good.

massimopavoni avatar Dec 11 '22 10:12 massimopavoni

@massimopavoni cool, looking forward to it! Please note that we already use pytest in our GitHub workflows (.github/workflows/test.yml), which itself leverages the assert statements in the test modules. That being said, test coverage is currently one weak spot of this project, and there is definitely a more elegant solution to test the parsers.

joweich avatar Dec 11 '22 21:12 joweich
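
To illustrate how pytest builds on plain assert statements, here is a minimal sketch. pytest collects functions whose names start with `test_` from files named `test_*.py` or `*_test.py` and reports any failing `assert`. The helper `normalize_author` is hypothetical and only exists so the test has something to check; it is not part of chat-miner.

```python
def normalize_author(raw):
    """Hypothetical helper under test: collapse runs of whitespace in a name."""
    return " ".join(raw.split())


def test_normalize_author():
    # Table of (input, expected) pairs checked with plain asserts.
    cases = [
        ("Richard  Steele", "Richard Steele"),
        ("  Gianmarco Traversa ", "Gianmarco Traversa"),
    ]
    for raw, expected in cases:
        assert normalize_author(raw) == expected
```

The same table of cases could instead be passed to `pytest.mark.parametrize`, which reports each case as a separate test.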

I see, my bad I didn't check the workflows. I will work on the tests following the same structure which is in place, and refer to this issue.

massimopavoni avatar Dec 11 '22 21:12 massimopavoni

@joweich is the purpose of the tests to verify that the entirety of the parsers' code is covered and working in un/expected conditions or is the example of the WhatsApp parser test function enough?

massimopavoni avatar Jan 16 '23 10:01 massimopavoni

@massimopavoni the entirety of the parsers' code. They all use somewhat different input formats, which is why we can't draw conclusions from one about the others.

joweich avatar Jan 20 '23 21:01 joweich

@joweich I'm sorry if this is a silly question, but where did the Telegram HTML Parser go? I missed the removal of that parser, right?

massimopavoni avatar Mar 16 '23 13:03 massimopavoni

@massimopavoni yes, it got removed with c1383d6. There's still the JSON Telegram Parser, and dropping the HTML parser let us remove a dependency on an external package.

joweich avatar Mar 18 '23 17:03 joweich

@massimopavoni @alfonso46674 do you have the unit tests for the respective parsers still on your radar? Would love to see your contribution there! 🙂

joweich avatar Jul 12 '23 14:07 joweich

@joweich I do. I had started working on it, but university and life got in the way. I would still like to put my effort into testing the parsers, but unfortunately I am finding very little time for everything. I want to say I'll be freer to contribute during the next 3 weeks, but it might be some more time before I can help, and I apologize if that turns out to be the case. If those tests are really needed and anyone else wants to take on the task, I wouldn't have anything to argue.

massimopavoni avatar Jul 12 '23 14:07 massimopavoni

@massimopavoni no worries, university and life always come first! I'm glad if this project gives you the chance to train your skills hands-on, and there's no time pressure! 🙂

joweich avatar Jul 12 '23 14:07 joweich

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

github-actions[bot] avatar Jan 04 '24 20:01 github-actions[bot]