Reading large confluence page

Open hemaswapnika1 opened this issue 1 year ago • 3 comments

confluence.get_page_by_id(page_id, expand='body.storage') does not return all of the data when the Confluence page is very large (1500 KB). How should such large Confluence pages be read?

hemaswapnika1 avatar Jun 12 '24 05:06 hemaswapnika1

Hi @hemaswapnika1, just increase the timeout when initializing the client; that should be enough.
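A minimal sketch of what raising the timeout at initialization might look like, assuming token authentication. The URL and token are placeholders, and `client_kwargs` and `make_client` are hypothetical helpers rather than part of atlassian-python-api; only the `timeout` keyword (in seconds) is what the suggestion above refers to.

```python
def client_kwargs(url, token, timeout=150):
    """Collect constructor arguments for atlassian.Confluence (hypothetical helper)."""
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return {"url": url, "token": token, "timeout": timeout}


def make_client(url, token, timeout=150):
    """Build a Confluence client with a longer request timeout."""
    # Imported lazily so client_kwargs can be used without the package installed.
    from atlassian import Confluence

    return Confluence(**client_kwargs(url, token, timeout))
```

Usage would be along the lines of `confluence = make_client("https://confluence.example.com/", "my-token", timeout=300)`, followed by the usual `confluence.get_page_by_id(page_id, expand="body.storage")` call.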

gonchik avatar Sep 15 '24 17:09 gonchik

I've bumped the timeout from the default of 75 (seconds, presumably) in version 3.41.16 up to 150, and in my case no more of the page is returned than before.

I've also noticed the same behaviour when calling the Confluence API itself, which leads me to think this isn't necessarily a problem with atlassian-python-api.

eoinmarron avatar Oct 14 '24 12:10 eoinmarron

On further exploration with the .get_page_as_pdf(page_id) method, I could see the whole page contents being written to the PDF. I then tried writing the .get_page_by_id(page_id) output to a file, and that worked too (the whole page contents were in the file). This points to PyCharm as the problem: it does not render the full page contents when the value is inspected in debugger mode.

System: Python 3.10.4, PyCharm 2024.2.1, atlassian-python-api 3.41.16

Working code for me:

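from atlassian import Confluence

# Connect to the Confluence instance using a token.
conf = Confluence(
    url="https://confluence.foobar.com/",
    username="svc-crops-okta-confluence",
    token="foobar",
)
# Look up the page ID from the space key and the page title.
page_id = conf.get_page_id(
    "foobar",
    "foobar_page"
)
# Fetch the page together with its rendered HTML body.
page_content = conf.get_page_by_id(page_id, expand="body.view")
html_body = page_content["body"]["view"]["value"]
# Write the body to a file so it can be inspected outside the debugger.
# An explicit encoding avoids errors on non-ASCII page content.
with open("test_file.txt", "w", encoding="utf-8") as stream:
    stream.write(html_body)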
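
One way to confirm that the whole body came back, without relying on how a debugger displays long strings, is to measure the string directly. This is only a sketch; `body_stats` is a hypothetical helper, not part of atlassian-python-api.

```python
def body_stats(html_body, tail_length=80):
    """Summarize a page body: character count, UTF-8 size, and its last characters."""
    return {
        "chars": len(html_body),
        "utf8_bytes": len(html_body.encode("utf-8")),
        # The tail shows whether the markup ends where the page is expected to end.
        "tail": html_body[-tail_length:],
    }
```

Calling `print(body_stats(html_body))` on the value fetched above gives a size that can be compared with the expected page size (around 1500 KB in the original report), and the tail can be checked against the end of the page.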

eoinmarron avatar Oct 14 '24 12:10 eoinmarron