Performance and other issues with a large garden (>7k notes)

Open flancian opened this issue 1 year ago • 12 comments

Hi there! Thank you for developing Silverbullet, it's awesome!

It had been working very well for me until recently, when I started noticing issues that might have to do with the scale of my garden, as initially discussed in #934.

  • Sync sometimes fails.
  • Sometimes creating a new page takes very long.
    • I usually do this by creating a [[wikilink]] to something that does not exist and then visiting it with ctrl+enter.
    • It can occasionally take ~10 seconds to navigate away from the current note, during which Silverbullet gives the user no feedback.
  • Additionally, today I got some write failures that look like the following; Silverbullet was pegged at ~100% CPU while this was happening. Next time this happens, should I collect logs in any particular way? Would strace help? That was my first impulse, but I didn't manage to capture a trace this time around.

[Image: 2024-08-01T13:32:37,108088376+02:00]

My garden contains 7299 markdown notes and ~300 image files as of the time of writing. I have few folders, it is quite flat. It does contain a .git directory within it, bringing the file total within the space to 12.5k files; I'm unsure if Silverbullet will ignore those extra files or any procedures might be affected by their presence. The garden is public and can be found at https://github.com/flancian/garden.

Thanks again and have a great week!

flancian avatar Aug 01 '24 18:08 flancian

Thanks for making your garden publicly available, the size of it looks like it'd be useful to test against.

I'm wondering if you have a similar issue as me - I don't have quite as many notes, but have been getting similar errors due to how I host silverbullet (on a network file system that's kinda slow).

Can you check your browser's network requests and see if you're getting any timeout errors? The console will probably have a separate error if so too.

justyns avatar Aug 03 '24 22:08 justyns

Thank you @justyns for your response! Glad it might be useful for testing; Flancia is all Creative Commons for writing, Apache for code. Let me know if I can help in any way.

I am proxying Silverbullet with nginx and hosting it at https://edit.anagora.org. It currently runs in Docker, but I might move to a dev instance to (more easily?) get better logs. Thanks for your suggestion to watch out for timeouts; I will start running my main editor window with developer tools open.

flancian avatar Aug 10 '24 16:08 flancian

Update: right now Silverbullet is unresponsive when trying to navigate from the current page (index) to 'open daily note', even though the console is clean and the request log is all 200s except for a 401 for manifest.json, which looks fishy. What could a failed manifest load cause in the client, do you know?

Oh, after 30s a request to https://edit.anagora.org/.rpc/index/datastore.query timed out so that might be the problem?

This was the query's payload FWIW:

[Image: 2024-08-10T18:58:22,003948311+02:00]

flancian avatar Aug 10 '24 16:08 flancian

I just did a fresh reload of my garden on HEAD, running with deno task watch-server ~/garden and others, and logs like these scrolled by for several minutes while the index page hung:

Outside file change: reindexing levenshtein distance.md
Detected file change agora development.md undefined 1716842238739
Outside file change: reindexing agora development.md
Detected file change social distributed knowledge graph.md undefined 1626617722490
Outside file change: reindexing social distributed knowledge graph.md
Detected file change light.md undefined 1641752833529
Outside file change: reindexing light.md
Detected file change chorus one.md undefined 1626617722450
Outside file change: reindexing chorus one.md
Detected file change the productivists.md undefined 1626617722506
Outside file change: reindexing the productivists.md
Detected file change refusing to merge unrelated histories.md undefined 1653584285371
Outside file change: reindexing refusing to merge unrelated histories.md
Detected file change steve blank.md undefined 1626617722490
Outside file change: reindexing steve blank.md
Detected file change paprika.md undefined 1642893863036
Outside file change: reindexing paprika.md
Detected file change eigeneigen.md undefined 1628696105099
Outside file change: reindexing eigeneigen.md
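
To get a feel for how much reindexing a startup like this triggers, the log can be summarized with a small script. This is only a sketch: the two line formats are inferred from the excerpt above, the function and field names are made up for illustration, and reading the trailing number as a last-modified timestamp in epoch milliseconds is an assumption.

```typescript
// Sketch only: line formats are inferred from the log excerpt in this thread,
// not taken from SilverBullet's source.
interface ReindexSummary {
  // Names from "Outside file change: reindexing <name>" lines, in order.
  reindexed: string[];
  // Trailing number from "Detected file change <name> <x> <number>" lines
  // (assumed to be a last-modified time in epoch milliseconds).
  detectedMtimes: Record<string, number>;
}

function summarizeReindexLog(log: string): ReindexSummary {
  const summary: ReindexSummary = { reindexed: [], detectedMtimes: {} };
  for (const rawLine of log.split("\n")) {
    const line = rawLine.trim();
    const reindex = /^Outside file change: reindexing (.+)$/.exec(line);
    if (reindex) {
      summary.reindexed.push(reindex[1]);
      continue;
    }
    // File names may contain spaces, so anchor on the last two fields.
    const detected = /^Detected file change (.+) \S+ (\d+)$/.exec(line);
    if (detected) {
      summary.detectedMtimes[detected[1]] = Number(detected[2]);
    }
  }
  return summary;
}

const sampleLog = [
  "Outside file change: reindexing levenshtein distance.md",
  "Detected file change agora development.md undefined 1716842238739",
  "Outside file change: reindexing agora development.md",
].join("\n");

const sampleSummary = summarizeReindexLog(sampleLog);
console.log(`${sampleSummary.reindexed.length} files reindexed`);
```

If the count approaches the total number of notes in the space, that would suggest the whole space is being treated as changed on startup rather than a handful of files.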

In dev tools, a request to http://localhost:3000/index.json is hanging.

flancian avatar Aug 10 '24 18:08 flancian

FWIW if this is hard to fix, things would already be much improved by rendering a scrollbar or even a (cool) spinner :) Wdyt?
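
Roughly what I have in mind, as a sketch: show an indicator only when an operation runs past some threshold, so fast navigations are unaffected. Everything here (the helper name, the callbacks, the threshold) is hypothetical and not based on Silverbullet's actual client code.

```typescript
// Hypothetical helper: wraps any pending operation and reports when it is
// slow. onSlow fires once if the work is still pending after thresholdMs;
// onDone fires afterwards only if onSlow fired.
async function withSlowNotice<T>(
  work: Promise<T>,
  thresholdMs: number,
  onSlow: () => void,
  onDone: () => void,
): Promise<T> {
  let shown = false;
  const timer = setTimeout(() => {
    shown = true;
    onSlow();
  }, thresholdMs);
  try {
    return await work;
  } finally {
    // Runs on success and on failure, so the indicator never gets stuck.
    clearTimeout(timer);
    if (shown) {
      onDone();
    }
  }
}

// Example wiring (showSpinner, hideSpinner, and navigateTo are placeholders):
// await withSlowNotice(navigateTo(page), 300, showSpinner, hideSpinner);
```

Keeping the threshold in the caller's hands means quick page loads never flash a spinner, while a 10-second navigation gets feedback almost immediately.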

flancian avatar Aug 10 '24 18:08 flancian

Update: Unfortunately I am unable to use Silverbullet anymore :( I don't quite understand how it degraded so fast, but it went from 'slow, quirky' to 'I can't get any page to load' in a few days.

My garden grew a bit in the meantime but 'only' up to 7323 Markdown files, 12736 files overall. .silverbullet.db is 129M which seems... large? Given that my whole garden is only ~200M of actual data (once you discount .git overhead).

flancian avatar Aug 12 '24 13:08 flancian

Well, the size is expected, since every single paragraph is indexed and there is a full-text index as well. There are some obviously heavyweight indexes that could be disabled to make things more efficient:

  • the paragraph and item indexers
  • the full text indexer

First we'd have to validate that this is indeed the source of the slowdown, though. I haven't had time to look into this yet; I'm focused on other things right now. Let me mark this as help wanted.

zefhemel avatar Aug 13 '24 06:08 zefhemel

Ahoy there! I could try to help :) Do you have a primer on how I could debug this kind of issue? I'm more of a Python person but I could try to take a look.

Code pointers e.g. to where indexes are built/kept/updated could help.

Thanks in any case!

flancian avatar Aug 13 '24 20:08 flancian

@flancian I'm doing some testing with the repo you linked and the first thing I noticed was this error when trying to load the index page:

TypeError: name.startsWith is not a function
    at parsePageRef (file:///Users/justyns/dev/silverbullet/query-perf/plug-api/lib/page_ref.ts:48:12)
    at file:///Users/justyns/dev/silverbullet/query-perf/server/http_server.ts:250:11
    at dispatch (https://deno.land/x/[email protected]/compose.ts:45:23)
    at https://deno.land/x/[email protected]/compose.ts:18:12
    at https://deno.land/x/[email protected]/hono-base.ts:338:31
    at Hono.dispatch (https://deno.land/x/[email protected]/hono-base.ts:348:6)
    at fetch (https://deno.land/x/[email protected]/hono-base.ts:361:17)
    at ext:deno_http/00_serve.ts:364:24
    at ext:deno_http/00_serve.ts:553:29
    at eventLoopTick (ext:core/01_core.js:168:7)

This is because your SETTINGS file has indexPage: {{today}}, where the unquoted {{today}} is parsed as a YAML object (curly braces are YAML's flow-mapping syntax) instead of a string. I don't think a template can be used here either, but it would probably be a useful feature. For now, I opened a PR to fail a little more gracefully instead of just returning a 500 error.
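
The general shape of such a guard, as a sketch and not the code from the PR: treat the configured value as unknown and only accept it if it is really a non-empty string. The function name, the "index" fallback, and the stand-in object below are assumptions for illustration; the exact object a YAML parser produces for {{today}} depends on the library.

```typescript
// Sketch of a defensive lookup, not SilverBullet's actual implementation.
// Calling .startsWith() on a non-string is what raises the TypeError above,
// so the value is checked before any string method is used.
function resolveIndexPage(configured: unknown, fallback: string = "index"): string {
  if (typeof configured === "string" && configured.trim() !== "") {
    return configured.trim();
  }
  // Anything else (object, number, null, undefined) falls back to the default.
  return fallback;
}

// A plain string setting is used as-is.
console.log(resolveIndexPage("Journal/Today"));

// Stand-in for whatever object the YAML parser yields for an unquoted {{today}}.
console.log(resolveIndexPage({ today: null }));
```

Note that quoting the value in SETTINGS would make it a string, but per the comment above the template would presumably still not be expanded.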

I don't have any insight into the other issues right now, but one thing I noticed is that switching to sync mode with your space repo seems to work a lot better than online mode. You might try that as a temporary workaround if you haven't already.

justyns avatar Aug 14 '24 06:08 justyns

I noticed that error among the logs but thought it was unrelated, thank you for calling attention to it; setting indexPage makes sense as a trigger because that was a recent change :) I'll drop that and retry! Same with switching to sync mode. Will report back.

Update: after removing indexPage from SETTINGS, I'm rolling again :) Thank you! Will report back on performance but it feels snappier so far (running HEAD).

flancian avatar Aug 15 '24 21:08 flancian

I did a tiny bit of investigation here and your space hits a lot of interesting edge cases as well as some bugs.

Some findings:

  1. Since you use the Core library, you get the Linked Tasks as well as Linked Mentions widgets "for free". However, both of these are going to be super expensive on a large space like yours (they rely on expensive queries), so you probably want to exclude them.
  2. There are a few places that rely on the link object now (e.g. the page picker and page completion). In your space there appear to be 46k links. I would somehow not expect this to be too extreme, but querying this seems to be super slow. I ran client.clientSystem.queryObjects("link").then(console.log).catch(console.error) in my JS console and it took AGES to complete. To be investigated why this would be so slow.
  3. Basically the page picker shows no results and I'm not sure why.
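
To put a number on the slow query in finding 2, a small timing wrapper can be used in place of the bare console call. This is a sketch: timeAsync and its injectable clock are made up for illustration, and the only SilverBullet call involved is the client.clientSystem.queryObjects("link") expression quoted above, which exists only in the app's browser console.

```typescript
// Hypothetical helper: measures how long an async operation takes.
// The clock is injectable so the helper can be exercised without real delays.
async function timeAsync<T>(
  label: string,
  fn: () => Promise<T>,
  now: () => number = () => performance.now(),
): Promise<{ result: T; ms: number }> {
  const start = now();
  const result = await fn();
  const ms = now() - start;
  console.log(`${label}: ${ms.toFixed(0)} ms`);
  return { result, ms };
}

// In the SilverBullet JS console (plain JavaScript, so no type annotations needed):
// timeAsync("link query", () => client.clientSystem.queryObjects("link"))
//   .then(({ result }) => console.log(result.length, "links"))
//   .catch(console.error);
```

Running it a few times, and against other object types, would help show whether the cost scales with the number of links or is dominated by something else.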

zefhemel avatar Aug 18 '24 08:08 zefhemel

Thanks for your response! Over here things have been running quite acceptably as of late. Linked mentions don't seem to take too long to generate somehow.

I agree ~50k links should be supported, I've only been gardening for 4-5 years so I'd expect this number to keep growing. I do use a lot of 'opportunistic linking' where I make up links and then backfill them later, a la wiki red links, so the attached #1044 makes sense (also I am rendering my garden using a project in the social knowledge graph space that allows people to 'fill in' each other's dangling links, motivating this pattern).

TLDR: if something is broken after the workarounds it's not something I miss a lot, things are chugging along. 'Open' lags when I type though (I have to wait for each keystroke).

flancian avatar Oct 23 '24 23:10 flancian