get_body_text() returning different body text format since 20th May

Open sandeepbgsk opened this issue 4 years ago • 4 comments

I am using get_body_text() to get the body of the email. Once I have the body, I use regexes to parse the email and extract dates and a few other entities.

Since 20th May, my regexes have been failing. When I looked at the body of the email returned by get_body_text(), the format had changed. (The email looks the same when viewed in Outlook, but the parsed body returned by the method is different.)

I am attaching a file below which shows the before and after samples. Before-After.txt

From the text file you can see that the newlines which were present before are now gone.

sandeepbgsk avatar Jun 01 '20 07:06 sandeepbgsk

Versions are below:

  • beautifulsoup4==4.8.2
  • o365==2.0.6

sandeepbgsk avatar Jun 01 '20 07:06 sandeepbgsk

Nothing has changed in the method get_body_text.

I don't know what may be causing this.

alejcas avatar Jun 04 '20 13:06 alejcas

I have the same issue. Everything is joined together.

kbatal avatar Sep 20 '22 01:09 kbatal

I would suggest adding a few debug prints before and after the bs4 call in _get_body_text(). There could be several reasons for this change, but unless you have modified your local copy, the function in the repo is still the same.

https://github.com/O365/python-o365/blob/2d094c745692492f12d8ca90896a28e007323f20/O365/message.py#L1011

^ You can add a few prints in that function to see the body before and after the bs4 parser runs. Also, if you have the message in Outlook, you can confirm whether what you see there matches the output at each stage (before, during, after).
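Something like the following could serve as those debug prints. It is only a rough sketch using the standard library, not code from the repo: `describe_body` is a hypothetical helper name, and the sample HTML is made up for illustration. It counts newlines and the `<br>` / `<p>` tags in whatever string you pass it, so you can compare the raw body with the text that comes back from the method.

```python
import re


def describe_body(label, text):
    """Print and return simple counts for a body string.

    Hypothetical debugging helper (not part of python-o365): it only
    counts characters and tags so two versions of a body can be compared.
    """
    stats = {
        "length": len(text),
        "newlines": text.count("\n"),
        # <br>, <br/>, <br /> in any letter case
        "br_tags": len(re.findall(r"<br\s*/?>", text, flags=re.IGNORECASE)),
        # opening <p> tags, with or without attributes
        "p_tags": len(re.findall(r"<p[\s>]", text, flags=re.IGNORECASE)),
    }
    print(f"{label}: {stats}")
    # repr() makes \n, \r and non-breaking spaces visible
    print(repr(text[:200]))
    return stats


# Made-up sample body, for illustration only.
raw_html = "<html><body><p>Start: 01/06/2020</p><p>End: 02/06/2020</p></body></html>"
raw_stats = describe_body("raw", raw_html)

# With a real message (assuming `message` is an already fetched O365 Message):
# describe_body("raw", message.body)
# describe_body("text", message.get_body_text())
```

If the raw body has no newline characters and the line breaks exist only as tags, that would at least show the difference is already present in the HTML coming from the server rather than being introduced during parsing.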

BlueSideStrongSide avatar Sep 20 '22 05:09 BlueSideStrongSide