Exporting markdown fails

Open tkrajina opened this issue 2 years ago • 5 comments

Hi @kjk

We're using your excellent library for one reason -- it has the option of exporting pages with markdown. The official API has no markdown exporting.

This used to work with your notionapi, but a few weeks ago Notion changed something on their backend and it now fails.

We're using client.ExportPages(), which internally uses enqueueTask and getTasks and then returns the download URL.

Example code to reproduce the problem:

	client := &notionapi.Client{}

	client.AuthToken = "..."
	client.DebugLog = true

	url, err := client.RequestPageExportURL("...", notionapi.ExportTypeMarkdown, false)
	panicIfErr(err)
	fmt.Println(url)

	res, err := client.DownloadURL(url)
	panicIfErr(err)

	fmt.Println("res:", res.Data)

...and it errors in client.DownloadURL().

We tried to inspect the browser to see why downloading the file there works, and this is the final "download file" request (stripped of all the other unneeded headers):

curl 'https://file.notion.so/....zip?id=...&table=user_export&expirationTimestamp=...&signature=...&download=true&downloadName=....zip' \
  -H 'cookie: file_token=...;' \
  --compressed

So it looks like they now require a file_token cookie for requests to file.notion.so. Having the download URL alone is no longer enough.
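
For reference, here is a minimal Go sketch of the same request as the curl command above. It assumes a file_token value copied by hand from the browser session; how to obtain that value programmatically is still the open question. The helper names (newExportRequest, downloadExport) and the environment variables are made up for illustration and are not part of notionapi.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// newExportRequest builds a GET request for an export download URL and
// attaches the file_token cookie that the browser was observed to send.
// (Hypothetical helper, not part of notionapi.)
func newExportRequest(downloadURL, fileToken string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, downloadURL, nil)
	if err != nil {
		return nil, err
	}
	req.AddCookie(&http.Cookie{Name: "file_token", Value: fileToken})
	return req, nil
}

// downloadExport sends the request and returns the response body.
// (Hypothetical helper, not part of notionapi.)
func downloadExport(client *http.Client, downloadURL, fileToken string) ([]byte, error) {
	req, err := newExportRequest(downloadURL, fileToken)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Assumption: both values are copied manually from the browser.
	downloadURL := os.Getenv("NOTION_EXPORT_URL")
	fileToken := os.Getenv("NOTION_FILE_TOKEN")
	if downloadURL == "" || fileToken == "" {
		fmt.Println("set NOTION_EXPORT_URL and NOTION_FILE_TOKEN to try the download")
		return
	}
	data, err := downloadExport(http.DefaultClient, downloadURL, fileToken)
	if err != nil {
		fmt.Println("download failed:", err)
		return
	}
	fmt.Printf("downloaded %d bytes\n", len(data))
}
```

This only shows where the cookie goes in the request; it does not solve the problem of getting the file_token in the first place.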

Any idea how the file_token is retrieved/calculated?

tkrajina avatar Mar 27 '23 10:03 tkrajina

Sorry, no idea. I currently don't have time to investigate this so you're on your own.

I'm guessing file_token is returned either by an existing API (and isn't reflected in the Go structs because it wasn't there when I originally wrote the code) or by a new API they added to get it.

Be happy to merge a PR if you figure it out.

As far as figuring it out: you can follow the same process as I did originally https://blog.kowalczyk.info/article/88aee8f43620471aa9dbcad28368174c/how-i-reverse-engineered-notion-api.html

Basically: invoke the action from the browser and see what API calls the browser makes.

Write a Puppeteer script to record API calls for easier analysis.

kjk avatar Mar 27 '23 17:03 kjk

Unfortunately, that's exactly what I did (checked the browser logs), and file_token isn't in any of the API responses. I think it's somehow calculated in the (obfuscated) JavaScript. I'll keep investigating; thank you for your work anyway.

tkrajina avatar Mar 28 '23 05:03 tkrajina

@tkrajina Did you resolve this problem?

nisanthchunduru avatar May 01 '24 09:05 nisanthchunduru

Found this alternative to Notion's undocumented Export to Markdown API

kjkNotionApiClient := &notionapi.Client{
	AuthToken: tokenV2CookieString,
}
childPage, err := kjkNotionApiClient.DownloadPage(childPageId)
if err != nil {
	printErrorAndExit(err)
}
markdown := tomarkdown.NewConverter(childPage).ToMarkdown()
fmt.Println(string(markdown))

nisanthchunduru avatar May 01 '24 11:05 nisanthchunduru

In case it helps anyone, I authored a Go package to export any Notion page to Markdown: https://github.com/nisanthchunduru/notion2markdown. It accepts a Notion integration token instead of Notion's token_v2 cookie string.

nisanthchunduru avatar May 05 '24 04:05 nisanthchunduru