
FIX: Refactoring of OpenPhish Commercial parser

Open gethvi opened this issue 2 years ago • 1 comments

This is a refactoring of some of my old code, including its tests. The old parser had a bug: the feed field 'host' was mapped to 'source.fqdn', which failed whenever 'host' contained an IP address. The source feed has also added some new fields over the years.
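To illustrate the 'host' problem described above, here is a minimal sketch of one way to pick the target field. It is not the code from this PR; the helper name `map_host` is hypothetical, and it assumes the only two candidate fields are 'source.ip' and 'source.fqdn'.

```python
import ipaddress


def map_host(host: str) -> dict:
    """Hypothetical helper: choose the event field for the feed's 'host' value."""
    try:
        # ip_address() raises ValueError for anything that is not an IPv4/IPv6 literal.
        ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat the value as a domain name.
        return {"source.fqdn": host}
    return {"source.ip": host}


print(map_host("213.123.230.105"))
print(map_host("example.com"))
```

Checking with the standard-library `ipaddress` module avoids a hand-written regular expression and also covers IPv6 literals.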

gethvi, Mar 22 '22 15:03

Tests are failing:

================================================================================================================= FAILURES ==================================================================================================================
________________________________________________________________________________________________ TestOpenPhishCommercialParserBot.test_event ________________________________________________________________________________________________

self = <intelmq.tests.bots.parsers.openphish.test_parser_commercial.TestOpenPhishCommercialParserBot testMethod=test_event>

    def test_event(self):
        self.run_bot()
>       self.assertMessageEqual(0, OUTPUT_1)

intelmq/tests/bots/parsers/openphish/test_parser_commercial.py:100: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
intelmq/lib/test.py:562: in assertMessageEqual
    self.assertDictEqual(expected, event_dict)
E   AssertionError: {'classification.type': 'phishing', 'source[1877 chars]ent'} != {'source.asn': 2856, 'time.source': '2022-0[1877 chars]ent'}
E     {'__type': 'Event',
E      'classification.type': 'phishing',
E   -  'extra.asn_name': 'British Telecommunications PLC',
E      'extra.brand': 'Deutsche Telekom',
E      'extra.emails': [],
E      'extra.family_id': '3550555f398b4fa6c66b2654af3113fb',
E      'extra.page_language': 'de:0.999996133811',
E      'extra.sector': 'Telecommunications',
E      'raw': 'W3sic2VjdG9yIjogIlRlbGVjb21tdW5pY2F0aW9ucyIsICJzc2xfY2VydF9pc3N1ZWRfYnkiOiBudWxsLCAic2NyZWVuc2hvdCI6ICJodHRwczovL29wZGF0YS5zMy5hbWF6b25hd3MuY29tL3NjcmVlbnNob3RzL2ltZy1mMWIzYTVjYjFjYzU0ODY0YTYwNjdlMzcwODZhODUwOC5qcGc/QVdTQWNjZXNzS2V5SWQ9QUtJQTIzU1ZGV1lYS1RRTDZFUEkmRXhwaXJlcz0xNjQ4MzU3NTg1JlNpZ25hdHVyZT1TUmJndDIlMkJmRnB3SmRyQTZocUNRZ0szeTFSUSUzRCIsICJ1cmwiOiAiaHR0cDovLzIxMy4xMjMuMjMwLjEwNS93b3JkcHJlc3MvcHJlc3MvP2VtYWlsPXh4eEB0LW9ubGluZS5kZSIsICJpcCI6ICIyMTMuMTIzLjIzMC4xMDUiLCAiYnJhbmQiOiAiRGV1dHNjaGUgVGVsZWtvbSIsICJpc290aW1lIjogIjIwMjItMDMtMjJUMDU6MDY6MjVaIiwgImFzbl9uYW1lIjogIkJyaXRpc2ggVGVsZWNvbW11bmljYXRpb25zIFBMQyIsICJkaXNjb3Zlcl90aW1lIjogIjIyLTAzLTIwMjIgMDU6MDY6MjUgVVRDIiwgImVtYWlscyI6IFtdLCAic3NsX2NlcnRfaXNzdWVkX3RvIjogbnVsbCwgImZhbWlseV9pZCI6ICIzNTUwNTU1ZjM5OGI0ZmE2YzY2YjI2NTRhZjMxMTNmYiIsICJob3N0IjogIjIxMy4xMjMuMjMwLjEwNSIsICJzc2xfY2VydF9zZXJpYWwiOiBudWxsLCAiY291bnRyeV9jb2RlIjogIkdCIiwgInRsZCI6ICIiLCAiY291bnRyeV9uYW1lIjogIlVuaXRlZCBLaW5nZG9tIG9mIEdyZWF0IEJyaXRhaW4gYW5kIE5vcnRoZXJuIElyZWxhbmQiLCAicGhpc2hpbmdfa2l0IjogbnVsbCwgInBhZ2VfbGFuZ3VhZ2UiOiAiZGU6MC45OTk5OTYxMzM4MTEiLCAiYXNuIjogIkFTMjg1NiJ9XQ==',
E      'screenshot_url': 'https://opdata.s3.amazonaws.com/screenshots/img-f1b3a5cb1cc54864a6067e37086a8508.jpg?AWSAccessKeyId=AKIA23SVFWYXKTQL6EPI&Expires=1648357585&Signature=SRbgt2%2BfFpwJdrA6hqCQgK3y1RQ%3D',
E   +  'source.as_name': 'British Telecommunications PLC',
E      'source.asn': 2856,
E      'source.geolocation.cc': 'GB',
E      'source.geolocation.country': 'United Kingdom of Great Britain and Northern '
E                                    'Ireland',
E      'source.ip': '213.123.230.105',
E      'source.url': 'http://213.123.230.105/wordpress/press/?email=xxx@t-online.de',
E      'time.source': '2022-03-22T05:06:25+00:00'}
------------------------------------------------------------------------------------------------------------- Captured log call -------------------------------------------------------------------------------------------------------------
INFO     test-bot:bot.py:160 OpenPhishCommercialParserBot initialized with id test-bot and intelmq 3.1.0.alpha1 and python 3.10.5 (main, Jun 06 2022, 22:34:44) [GCC] as process 12920.
DEBUG    test-bot:bot.py:160 Library path: '/home/sebastianw/dev/intelmq/intelmq/lib/bot.py'.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'destination_pipeline_broker' loaded with value 'pythonlist'.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'logging_handler' loaded with value 'stream'.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'logging_path' loaded with value None.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'logging_level' loaded with value 'DEBUG'.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'rate_limit' loaded with value 0.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'retry_delay' loaded with value 0.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'error_retry_delay' loaded with value 0.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'error_max_retries' loaded with value 0.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'redis_cache_host' loaded with value 'localhost'.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'redis_cache_port' loaded with value 6379.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'redis_cache_db' loaded with value 4.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'redis_cache_ttl' loaded with value 10.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'redis_cache_password' loaded with value 'HIDDEN'.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'source_pipeline_broker' loaded with value 'pythonlist'.
DEBUG    test-bot:bot.py:160 Defaults configuration: parameter 'testing' loaded with value True.
INFO     test-bot:bot.py:163 Bot is starting.
DEBUG    test-bot:bot.py:760 Loading runtime configuration from '/opt/intelmq/etc/runtime.yaml'.
DEBUG    test-bot:bot.py:831 System configuration: parameter 'description' loaded with value 'Instance of a bot for automated unit tests.'.
DEBUG    test-bot:bot.py:831 System configuration: parameter 'group' loaded with value 'Parser'.
DEBUG    test-bot:bot.py:831 System configuration: parameter 'module' loaded with value 'intelmq.bots.parsers.openphish.parser_commercial'.
DEBUG    test-bot:bot.py:831 System configuration: parameter 'name' loaded with value 'Test Bot'.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'destination_pipeline_broker' loaded with value 'pythonlist'.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'logging_handler' loaded with value 'stream'.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'logging_path' loaded with value None.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'logging_level' loaded with value 'DEBUG'.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'rate_limit' loaded with value 0.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'retry_delay' loaded with value 0.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'error_retry_delay' loaded with value 0.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'error_max_retries' loaded with value 0.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'redis_cache_host' loaded with value 'localhost'.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'redis_cache_port' loaded with value 6379.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'redis_cache_db' loaded with value 4.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'redis_cache_ttl' loaded with value 10.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'redis_cache_password' loaded with value 'HIDDEN'.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'source_pipeline_broker' loaded with value 'pythonlist'.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'testing' loaded with value True.
DEBUG    test-bot:bot.py:831 Runtime configuration: parameter 'destination_queues' loaded with value {'_default': 'test-bot-output'}.
DEBUG    test-bot:bot.py:831 Environment configuration: parameter 'paths_opt' loaded with value 1.
DEBUG    test-bot:bot.py:831 Environment configuration: parameter 'manager_controller_cmd' loaded with value 'sudo -u intelmq /usr/local/bin/intelmqctl'.
DEBUG    test-bot:bot.py:836 Loading Harmonization configuration from '/opt/intelmq/etc/harmonization.conf'.
DEBUG    test-bot:bot.py:565 Loading source pipeline and queue 'test-bot-queue'.
DEBUG    test-bot:bot.py:575 Connected to source queue.
DEBUG    test-bot:bot.py:578 Loading destination pipeline and queues {'_default': 'test-bot-output'}.
DEBUG    test-bot:bot.py:587 Connected to destination queues.
INFO     test-bot:bot.py:240 Bot initialization completed.
DEBUG    test-bot:bot.py:648 Waiting for incoming message.
DEBUG    test-bot:bot.py:679 Received message {'raw': 'eyJzZWN0b3IiOiAiVGVsZWNvbW11bmljYXRpb25zIiwgInNzbF9jZXJ0X2lzc3VlZF9ieSI6IG51bGwsICJzY3JlZW5zaG90IjogImh0dHBzOi8vb3BkYXRhLnMzLmFtYXpvbmF3cy5jb20vc2NyZWVuc2hvdHMvaW1nLWYxYjNhNWNiMWNjNTQ4NjRhNjA2N2UzNzA4NmE4NTA4LmpwZz9BV1NBY2Nlc3NLZXlJZD1BS0lBMjNTVkZXWVhLVFFMNkVQSSZFeHBpcmVzPTE2NDgzNTc1ODUmU2lnbmF0dXJlPVNSYmd0MiUyQmZGcHdKZHJBNmhxQ1FnSzN5MVJRJTNEIiwgInVybCI6ICJodHRwOi8vMjEzLjEyMy4yMzAuMTA1L3dvcmRwc...'}.
DEBUG    test-bot:bot.py:619 Sending message to path '_default'.
DEBUG    test-bot:bot.py:619 Sending message to path '_default'.
DEBUG    test-bot:bot.py:619 Sending message to path '_default'.
INFO     test-bot:bot.py:1079 Sent 3 events and found 0 problem(s).
DEBUG    test-bot:bot.py:441 Testing environment detected, returning now.
INFO     test-bot:bot.py:522 Processed 3 messages since last logging.
DEBUG    test-bot:bot.py:596 Disconnected from source pipeline.
DEBUG    test-bot:bot.py:600 Disconnected from destination pipeline.
INFO     test-bot:bot.py:533 Bot stopped.
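
Reading the assertion diff above: the expected event in the test carries 'extra.asn_name', while the bot emitted 'source.as_name'; no other key differs. A minimal sketch for spotting such key mismatches follows. The two dictionaries are abbreviated stand-ins, not the full test fixtures, and `key_diff` is a hypothetical helper rather than part of IntelMQ.

```python
def key_diff(expected: dict, actual: dict) -> tuple:
    """Return (keys only in expected, keys only in actual)."""
    return set(expected) - set(actual), set(actual) - set(expected)


# Abbreviated stand-ins for OUTPUT_1 and the event produced by the bot.
expected = {"extra.asn_name": "British Telecommunications PLC", "source.asn": 2856}
actual = {"source.as_name": "British Telecommunications PLC", "source.asn": 2856}

only_expected, only_actual = key_diff(expected, actual)
print("only in expected:", only_expected)
print("only in actual:", only_actual)
```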

wagner-intevation, Jul 25 '22 13:07

Codecov Report

Merging #2160 (dfb0daa) into develop (4032f72) will decrease coverage by 0.01%. The diff coverage is 77.08%.

@@             Coverage Diff             @@
##           develop    #2160      +/-   ##
===========================================
- Coverage    76.32%   76.30%   -0.02%     
===========================================
  Files          454      454              
  Lines        23978    23991      +13     
  Branches      3787     3782       -5     
===========================================
+ Hits         18301    18307       +6     
- Misses        4931     4939       +8     
+ Partials       746      745       -1     
| Impacted Files | Coverage Δ |
|---|---|
| ...ntelmq/bots/parsers/openphish/parser_commercial.py | 74.41% <72.50%> (-14.87%) :arrow_down: |
| ...s/bots/parsers/openphish/test_parser_commercial.py | 100.00% <100.00%> (ø) |
| intelmq/bots/experts/ripe/expert.py | 77.55% <0.00%> (+1.02%) :arrow_up: |

codecov-commenter, Oct 31 '22 16:10

I created https://github.com/certtools/intelmq/pull/2252 to fix the failing codespell action

sebix, Oct 31 '22 16:10

I rebased onto develop to pick up the fix from https://github.com/certtools/intelmq/pull/2252, which resolves the failing check.

sebix, Nov 09 '22 08:11