Extract Navigation (Table of Contents)

Open aymanosman opened this issue 11 months ago • 3 comments

It would be nice for this library to extract the navigational elements from the EPUB file. It will probably need to support both EPUB 2 and EPUB 3 to cover most EPUB files out there.

I've created a gist https://gist.github.com/aymanosman/960a16a9cb5324a474fb541fe0feacbb that demonstrates what that might look like.
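To make the request concrete, here is a minimal sketch of what EPUB 3 extraction could involve. It is not the gist's code and not part of BUPE: the module name `NavSketch` and its functions are hypothetical, and it assumes the navigation document is well-formed XHTML that OTP's `:xmerl` can scan. It only handles the EPUB 3 `<nav epub:type="toc">` list with a single `epub:type` value; EPUB 2 files keep their table of contents in an NCX file (`navMap`/`navPoint`), which would need a separate walker.

```elixir
defmodule NavSketch do
  @moduledoc "Hypothetical helper: reads the toc nav of an EPUB 3 navigation document."
  require Record

  # Record definitions that ship with OTP's xmerl.
  Record.defrecordp(:xmlElement, Record.extract(:xmlElement, from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecordp(:xmlAttribute, Record.extract(:xmlAttribute, from_lib: "xmerl/include/xmerl.hrl"))
  Record.defrecordp(:xmlText, Record.extract(:xmlText, from_lib: "xmerl/include/xmerl.hrl"))

  # Returns a nested list of %{label, href, children} maps.
  def toc(xhtml) when is_binary(xhtml) do
    {root, _rest} = xhtml |> String.to_charlist() |> :xmerl_scan.string()

    case find(root, &toc_nav?/1) do
      nil -> []
      nav -> nav |> children(:ol) |> List.first() |> items()
    end
  end

  defp items(nil), do: []

  defp items(ol) do
    for li <- children(ol, :li) do
      # A list item is labelled by a link, or by a span when it is only a heading.
      label_el = List.first(children(li, :a)) || List.first(children(li, :span))

      %{
        label: label_el |> text() |> String.trim(),
        href: label_el && attr(label_el, :href),
        children: li |> children(:ol) |> List.first() |> items()
      }
    end
  end

  defp toc_nav?(el), do: xmlElement(el, :name) == :nav and attr(el, :"epub:type") == "toc"

  # Depth-first search for the first element satisfying pred.
  defp find(node, pred) do
    cond do
      not element?(node) -> nil
      pred.(node) -> node
      true -> Enum.find_value(xmlElement(node, :content), &find(&1, pred))
    end
  end

  defp element?(node), do: Record.is_record(node, :xmlElement)

  # Direct child elements with the given tag name.
  defp children(el, name) do
    for c <- xmlElement(el, :content), element?(c), xmlElement(c, :name) == name, do: c
  end

  defp attr(el, name) do
    Enum.find_value(xmlElement(el, :attributes), fn a ->
      if xmlAttribute(a, :name) == name, do: List.to_string(xmlAttribute(a, :value))
    end)
  end

  defp text(nil), do: ""

  defp text(node) do
    cond do
      Record.is_record(node, :xmlText) -> List.to_string(xmlText(node, :value))
      element?(node) -> Enum.map_join(xmlElement(node, :content), &text/1)
      true -> ""
    end
  end
end

# Made-up sample input, shaped like an ExDoc-style nav.xhtml.
xhtml = """
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <nav epub:type="toc">
      <ol>
        <li><a href="changelog.xhtml">Changelog</a></li>
        <li><span>Getting started</span>
          <ol>
            <li><a href="introduction.xhtml">Introduction</a></li>
          </ol>
        </li>
      </ol>
    </nav>
  </body>
</html>
"""

toc = NavSketch.toc(xhtml)
IO.inspect(toc)
```

Unlike the gist's output, this sketch returns labels and hrefs as binaries rather than charlists.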

aymanosman, May 06 '25 11:05

The current parser extracts the navigation. For example, if I parse Elixir.epub, I get the following:

iex(1)> BUPE.parse "Elixir.epub"
%BUPE.Config{
  title: "Elixir - 1.18.3",
  creator: nil,
  contributor: nil,
  date: nil,
  identifier: "urn:uuid:4f88c473-7742-7960-977e-8651832447a5",
  unique_identifier: "project-Elixir",
  source: nil,
  type: nil,
  modified: "2025-03-06T10:06:03Z",
  description: nil,
  format: nil,
  coverage: nil,
  publisher: nil,
  relation: nil,
  rights: nil,
  subject: nil,
  logo: nil,
  language: "en",
  version: "3.0",
  pages: [
    %BUPE.Item{
      duration: nil,
      fallback: nil,
      href: "nav.xhtml",
      id: "nav",
      media_overlay: nil,
      media_type: "application/xhtml+xml",
      description: nil,
      properties: "nav scripted",
      content: "<!DOCTYPE html>\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\"\n      xmlns:epub=\"http://www.idpf.org/2007/ops\">\n  <head>\n    <meta charset=\"utf-8\" />\n    <title>Table Of Contents - Elixir v1.18.3</title>\n    <meta name=\"generator\" content=\"ExDoc v0.37.2\" />\n    <link type=\"text/css\" rel=\"stylesheet\" href=\"dist/epub-elixir-FNUUKFP7.css\" />\n    <script src=\"dist/epub-4WIP524F.js\"></script>\n\n  </head>\n  <body class=\"content-inner\">\n\n    <h1>Table of contents</h1>\n    <nav epub:type=\"toc\">\n      <ol>\n\n\n\n      <li><a href=\"changelog.xhtml\">Changelog for Elixir v1.18</a></li>\n\n\n\n\n    <li><span>Getting started</span>\n      <ol>\n\n\n      <li><a href=\"introduction.xhtml\">Introduction</a></li>\n\n      <li><a href=\"basic-types.xhtml\">Basic types</a></li>\n\n      <li><a href=\"lists-and-tuples.xhtml\">Lists and tuples</a></li>\n\n      <li><a href=\"pattern-matching.xhtml\">Pattern matching</a></li>\n\n      <li><a href=\"case-cond-and-if.xhtml\">case, cond, and if</a></li>\n\n      <li><a href=\"anonymous-functions.xhtml\">Anonymous functions</a></li>\n\n      <li><a href=\"binaries-strings-and-charlists.xhtml\">Binaries, strings, and charlists</a></li>\n\n      <li><a href=\"keywords-and-maps.xhtml\">Keyword lists and maps</a></li>\n\n      <li><a href=\"modules-and-functions.xhtml\">Modules and functions</a></li>\n\n      <li><a href=\"recursion.xhtml\">Recursion</a></li>\n\n      <li><a href=\"enumerable-and-streams.xhtml\">Enumerables and Streams</a></li>\n\n      <li><a href=\"processes.xhtml\">Processes</a></li>\n\n      <li><a href=\"io-and-the-file-system.xhtml\">IO and the file system</a></li>\n\n      <li><a href=\"alias-require-and-import.xhtml\">alias, require, import, and use</a></li>\n\n      <li><a href=\"module-attributes.xhtml\">Module attributes</a></li>\n\n      <li><a href=\"structs.xhtml\">Structs</a></li>\n\n      <li><a 
href=\"protocols.xhtml\">Protocols</a></li>\n\n      <li><a href=\"comprehensions.xhtml\">Comprehensions</a></li>\n\n      <li><a href=\"sigils.xhtml\">Sigils</a></li>\n\n      <li><a href=\"try-catch-and-rescue.xhtml\">try, catch, and rescue</a></li>\n\n      <li><a href=\"writing-documentation.xhtml\">Writing documentation</a></li>\n\n      <li><a href=\"optional-syntax.xhtml\">Optional syntax sheet</a></li>\n\n      <li><a href=\"erlang-libraries.xhtml\">Erlang libraries</a></li>\n\n      <li><a href=\"debugging.xhtml\">Debugging</a></li>\n\n\n      </ol>\n    </li>\n\n\n\n    <li><span>Cheatsheets</span>\n      <ol>\n\n\n      <li><a href=\"enum-cheat.xhtml\">Enum cheatsheet</a></li>\n\n\n      </ol>\n    </li>\n\n\n\n    <li><span>Anti-patterns</span>\n      <ol>\n\n\n      <li><a href=\"what-anti-patterns.xhtml\">What are anti-patterns?</a></li>\n\n      <li><a href=\"code-anti-patterns.xhtml\">Code-related anti-patterns</a></li>\n\n      <li><a href=\"design-anti-patterns.xhtml\">Design-related anti-patterns</a></li>\n\n      <li><a href=\"process-anti-patterns.xhtml\">Process-related anti-patterns</a></li>\n\n      <li><a href=\"macro-anti-patterns.xhtml\">Meta-programming anti-patterns</a></li>\n\n\n      </ol>\n    </li>\n\n\n\n    <li><span>Meta-programming</span>\n      <ol>\n\n\n      <li><a href=\"quote-and-unquote.xhtml\">Quote and unquote</a></li>\n\n      <li><a href=\"macros.xhtml\">Macros</a></li>\n\n      <li><a href=\"domain-specific-languages.xhtml\">Domain-Specific Languages (DSLs)</a></li>\n\n\n      </ol>\n    </li>\n\n\n\n    <li><span>Mix &amp; OTP</span>\n      <ol>\n\n\n      <li><a href=\"introduction-to-mix.xhtml\">Introduction to Mix</a></li>\n\n      <li><a href=\"agents.xhtml\">Simple state management with agents</a></li>\n\n      <li><a href=\"genservers.xhtml\">Client-server communication with GenServer</a></li>\n\n      <li><a href=\"supervisor-and-application.xhtml\">Supervision trees and applications</a></li>\n\n      <li><a 
href=\"dynamic-supervisor.xhtml\">Supervising dynamic children</a></li>\n\n      <li><a href=\"erlang-term-storage.xhtml\">Speeding up with ETS</a></li>\n\n      <li><a href=\"dependencies-and-umbrella-projects.xhtml\">Dependencies and umbrella projects</a></li>\n\n      <li><a href=\"task-and-gen-tcp.xhtml\">Task and gen_tcp</a></li>\n\n      <li><a href=\"docs-tests-and-with.xhtml\">Doctests, patterns, and wit" <> ...
    },
    %BUPE.Item{duration: nil, fallback: nil, href: "debugging.xhtml", ...},
    %BUPE.Item{duration: nil, fallback: nil, ...},
    %BUPE.Item{duration: nil, ...},
    %BUPE.Item{...},
    ...
  ],
  nav: [
    %{idref: "cover"},
    %{idref: "nav"},
    %{idref: "changelog"},
    %{idref: "introduction"},
    %{idref: "basic-types"},
    %{idref: "lists-and-tuples"},
    %{idref: "pattern-matching"},
    %{idref: "case-cond-and-if"},
    %{idref: "anonymous-functions"},
    %{idref: "binaries-strings-and-charlists"},
    %{idref: "keywords-and-maps"},
    %{idref: "modules-and-functions"},
    %{idref: "recursion"},
    %{idref: "enumerable-and-streams"},
    %{idref: "processes"},
    %{idref: "io-and-the-file-system"},
    %{idref: "alias-require-and-import"},
    %{idref: "module-attributes"},
    %{idref: "structs"},
    %{idref: "protocols"},
    %{idref: "comprehensions"},
    %{idref: "sigils"},
    %{idref: "try-catch-and-rescue"},
    %{idref: "writing-documentation"},
    %{idref: "optional-syntax"},
    %{idref: "erlang-libraries"},
    %{idref: "debugging"},
    %{idref: "enum-cheat"},
    %{...},
    ...
  ],
  styles: [
    %BUPE.Item{
      duration: nil,
      fallback: nil,
      href: "dist/epub-elixir-FNUUKFP7.css",
      id: "epub-elixir-fnuukfp7-css",
      media_overlay: nil,
      media_type: "text/css",
      description: nil,
      properties: nil,
      content: ":root{--main: hsl(250, 68%, 69%);--mainDark: hsl(250, 68%, 59%);--mainDarkest: hsl(250, 68%, 49%);--mainLight: hsl(250, 68%, 74%);--mainLightest: hsl(250, 68%, 79%);--searchBarFocusColor: #8E7CE6;--searchBarBorderColor: rgba(142, 124, 230, .25);--link-color: var(--mainDark);--link-visited-color: var(--mainDarkest)}body.dark{--link-color: var(--mainLightest);--link-visited-color: var(--mainLight)}:root{--content-width: 949px;--content-gutter: 60px;--borderRadius-lg: 14px;--borderRadius-base: 8px;--borderRadius-sm: 3px;--navTabBorderWidth: 2px;--sansFontFamily: \"Lato\", system-ui, Segoe UI, Roboto, Helvetica, Arial, sans-serif, \"Apple Color Emoji\", \"Segoe UI Emoji\";--monoFontFamily: ui-monospace, SFMono-Regular, Consolas, Liberation Mono, Menlo, monospace;--baseLineHeight: 1.5em;--gray25: hsl(207, 43%, 98%);--gray50: hsl(207, 43%, 96%);--gray100: hsl(212, 33%, 91%);--gray200: hsl(210, 29%, 88%);--gray300: hsl(210, 26%, 84%);--gray400: hsl(210, 21%, 64%);--gray450: hsl(210, 21%, 49%);--gray500: hsl(210, 21%, 34%);--gray600: hsl(210, 27%, 26%);--gray700: hsl(212, 35%, 17%);--gray750: hsl(214, 46%, 14%);--gray800: hsl(216, 52%, 11%);--gray800-opacity-0: hsla(216, 52%, 11%, 0%);--gray850: hsl(216, 63%, 8%);--gray900: hsl(218, 73%, 4%);--gray900-opacity-50: hsla(218, 73%, 4%, 50%);--gray900-opacity-0: hsla(218, 73%, 4%, 0%);--coldGrayFaint: hsl(240, 5%, 97%);--coldGrayLight: hsl(240, 5%, 88%);--coldGray-lightened-10: hsl(240, 5%, 56%);--coldGray: hsl(240, 5%, 46%);--coldGray-opacity-10: hsla(240, 5%, 46%, 10%);--coldGrayDark: hsl(240, 5%, 28%);--coldGrayDim: hsl(240, 5%, 18%);--yellowLight: hsl(43, 100%, 95%);--yellowDark: hsl(44, 100%, 15%);--yellow: hsl(60, 100%, 43%);--green-lightened-10: hsl(90, 100%, 45%);--green: hsl(90, 100%, 35%);--white: hsl(0, 0%, 100%);--white-opacity-50: hsla(0, 0%, 100%, 50%);--white-opacity-10: hsla(0, 0%, 100%, 10%);--white-opacity-0: hsla(0, 0%, 100%, 0%);--black: hsl(0, 0%, 0%);--black-opacity-10: hsla(0, 0%, 0%, 
10%);--black-opacity-50: hsla(0, 0%, 0%, 50%);--orangeDark: hsl(30, 90%, 40%);--orangeLight: hsl(30, 80%, 50%);--text-xs: .75rem;--text-sm: .875rem;--text-md: 1rem;--text-lg: 1.125rem;--text-xl: 1.25rem;--transition-duration: .15s;--transition-timing: cubic-bezier(.4, 0, .2, 1);--transition-all: all var(--transition-duration) var(--transition-timing);--transition-colors: color var(--transition-duration) var(--transition-timing), background-color var(--transition-duration) var(--transition-timing), border-color var(--transition-duration) var(--transition-timing), text-decoration-color var(--transition-duration) var(--transition-timing), fill var(--transition-duration) var(--transition-timing), stroke var(--transition-duration) var(--transition-timing);--transition-opacity: opacity var(--transition-duration) var(--transition-timing)}@media screen and (max-width: 768px){:root{--content-width: 100%;--content-gutter: 20px}}option{background-color:var(--sidebarBackground)}:root{--background: var(--white);--contrast: var(--black);--textBody: var(--gray800);--textHeaders: var(--gray900);--textDetailAccent: var(--mainLight);--textDetailBackground: var(--coldGrayFaint);--iconAction: var(--coldGray);--iconActionHover: var(--gray800);--blockquoteBackground: var(--coldGrayFaint);--blockquoteBorder: var(--coldGrayLight);--tableHeadBorder: var(--gray100);--tableBodyBorder: var(--gray50);--warningBackground: hsl( 33, 100%, 97%);--warningHeadingBackground: hsl( 33, 87%, 64%);--warningHeading: var(--black);--errorBackground: hsl( 7, 81%, 96%);--errorHeadingBackground: hsl( 6, 80%, 60%);--errorHeading: var(--white);--infoBackground: hsl(206, 91%, 96%);--infoHeadingBackground: hsl(213, 92%, 62%);--infoHeading: var(--white);--neutralBackground: hsl(212, 29%, 92%);--neutralHeadingBackground: hsl(220, 43%, 11%);--neutralHeading: var(--white);--tipBackground: hsl(142, 31%, 93%);--tipHeadingBackground: hsl(134, 39%, 36%);--tipHeading: var(--white);--fnSpecAttr: 
var(--coldGray);--fnDeprecated: var(--yellowLight);--blink: var(--yellowLight);--codeBackground: var(--gray25);--codeBorder: var(--gray100);--codeScroll" <> ...
    }
  ],
  scripts: [],
  images: [
    %BUPE.Item{
      duration: nil,
      fallback: nil,
      href: "assets/kv-observer.png",
      id: "kv-observer-png",
      media_overlay: nil,
      media_type: "image/png",
      description: nil,
      properties: nil,
      content: <<137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82,
        ...>>
    },
  ],
  cover: true,
  audio: nil,
  fonts: nil,
  toc: nil
}

As you can see, one of the keys is nav, and there you can see the navigation elements.
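As a rough illustration of what the nav key holds, each entry is only an idref, so turning it into a file path means looking it up among the parsed items. The sketch below assumes the `id` and `href` fields shown in the output above; `SpineSketch` is a hypothetical name, and plain maps stand in for `%BUPE.Item{}` structs so the snippet runs without the library.

```elixir
defmodule SpineSketch do
  # Pairs each nav entry (idref) with the href of the item whose id matches,
  # or nil when no item in the given list has that id.
  def resolve(nav, pages) do
    hrefs = Map.new(pages, fn page -> {page.id, page.href} end)
    Enum.map(nav, fn %{idref: idref} -> {idref, Map.get(hrefs, idref)} end)
  end
end

# Stand-in data; with the library, these would be config.nav and config.pages.
pages = [
  %{id: "nav", href: "nav.xhtml"},
  %{id: "debugging", href: "debugging.xhtml"}
]

nav = [%{idref: "cover"}, %{idref: "nav"}, %{idref: "debugging"}]

resolved = SpineSketch.resolve(nav, pages)
IO.inspect(resolved)
```

The result carries file references only. Nothing in it supplies a human-readable label or a nesting level.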

Is this what you're expecting?

milmazz, May 16 '25 19:05

What you parse as nav is not enough. From reading the code, I think that is just the "spine" of the package document, i.e. the reading order of the manifest items, not the table of contents.

Take Moby Dick, for example.

When parsed with bupe, you get a nav section like this:

[
  %{idref: "coverpage-wrapper"},
  %{idref: "pg-header"},
  %{idref: "item5"},
  %{idref: "item6"},
  %{idref: "item7"},
  %{idref: "item8"},
  %{idref: "item9"},
  %{idref: "item10"},
  %{idref: "item11"},
  %{idref: "item12"},
  %{idref: "item13"},
  %{idref: "pg-footer"}
]

When parsed with my gist, it looks like this:

%{
  toc: [
    %{
      label: ~c"MOBY-DICK; or, THE WHALE.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00000",
      children: []
    },
    %{
      label: [79, 114, 105, 103, 105, 110, 97, 108, 32, 84, 114, 97, 110, 115, 99, 114, 105, 98,
       101, 114, 8217, 115, 32, 78, 111, 116, 101, 115, 58],
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00001",
      children: []
    },
    %{
      label: ~c"ETYMOLOGY.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00002",
      children: [
        %{
          label: ~c"(Supplied by a Late Consumptive Usher to a Grammar School.)",
          href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00003",
          children: []
        }
      ]
    },
    %{
      label: ~c"EXTRACTS. (Supplied by a Sub-Sub-Librarian).",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00004",
      children: [
        %{
          label: ~c"EXTRACTS.",
          href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00005",
          children: []
        }
      ]
    },
    %{
      label: ~c"CHAPTER 1. Loomings.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00006",
      children: []
    },
    %{
      label: ~c"CHAPTER 2. The Carpet-Bag.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00007",
      children: []
    },
    %{
      label: ~c"CHAPTER 3. The Spouter-Inn.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00008",
      children: []
    },
    %{
      label: ~c"CHAPTER 4. The Counterpane.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00009",
      children: []
    },
    %{
      label: ~c"CHAPTER 5. Breakfast.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00010",
      children: []
    },
    %{
      label: ~c"CHAPTER 6. The Street.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00011",
      children: []
    },
    %{
      label: ~c"CHAPTER 7. The Chapel.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00012",
      children: []
    },
    %{
      label: ~c"CHAPTER 8. The Pulpit.",
      href: ~c"8921354174505514122_2701-h-0.htm.xhtml#pgepubid00013",
      children: []
    },
    %{
      label: ~c"CHAPTER 9. The Sermon.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00014",
      children: []
    },
    %{
      label: ~c"CHAPTER 10. A Bosom Friend.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00015",
      children: []
    },
    %{
      label: ~c"CHAPTER 11. Nightgown.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00016",
      children: []
    },
    %{
      label: ~c"CHAPTER 12. Biographical.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00017",
      children: []
    },
    %{
      label: ~c"CHAPTER 13. Wheelbarrow.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00018",
      children: []
    },
    %{
      label: ~c"CHAPTER 14. Nantucket.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00019",
      children: []
    },
    %{
      label: ~c"CHAPTER 15. Chowder.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00020",
      children: []
    },
    %{
      label: ~c"CHAPTER 16. The Ship.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00021",
      children: []
    },
    %{
      label: ~c"CHAPTER 17. The Ramadan.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00022",
      children: []
    },
    %{
      label: ~c"CHAPTER 18. His Mark.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00023",
      children: []
    },
    %{
      label: ~c"CHAPTER 19. The Prophet.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00024",
      children: []
    },
    %{
      label: ~c"CHAPTER 20. All Astir.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00025",
      children: []
    },
    %{
      label: ~c"CHAPTER 21. Going Aboard.",
      href: ~c"8921354174505514122_2701-h-1.htm.xhtml#pgepubid00026",
      children: []
    },
    %{
      label: ~c"CHAPTER 22. Merry Christmas.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00027",
      children: []
    },
    %{
      label: ~c"CHAPTER 23. The Lee Shore.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00028",
      children: []
    },
    %{
      label: ~c"CHAPTER 24. The Advocate.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00029",
      children: []
    },
    %{
      label: ~c"CHAPTER 25. Postscript.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00030",
      children: []
    },
    %{
      label: ~c"CHAPTER 26. Knights and Squires.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00031",
      children: []
    },
    %{
      label: ~c"CHAPTER 27. Knights and Squires.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00032",
      children: []
    },
    %{
      label: ~c"CHAPTER 28. Ahab.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00033",
      children: []
    },
    %{
      label: ~c"CHAPTER 29. Enter Ahab; to Him, Stubb.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00034",
      children: []
    },
    %{
      label: ~c"CHAPTER 30. The Pipe.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00035",
      children: []
    },
    %{
      label: ~c"CHAPTER 31. Queen Mab.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00036",
      children: []
    },
    %{
      label: ~c"CHAPTER 32. Cetology.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00037",
      children: []
    },
    %{
      label: ~c"CHAPTER 33. The Specksnyder.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00038",
      children: []
    },
    %{
      label: ~c"CHAPTER 34. The Cabin-Table.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00039",
      children: []
    },
    %{
      label: ~c"CHAPTER 35. The Mast-Head.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00040",
      children: []
    },
    %{
      label: ~c"CHAPTER 36. The Quarter-Deck.",
      href: ~c"8921354174505514122_2701-h-2.htm.xhtml#pgepubid00041",
      children: []
    },
    %{
      label: ~c"CHAPTER 37. Sunset.",
      href: ~c"8921354174505514122_2701-h-3.htm.xhtml#pgepubid00042",
      children: []
    },
    %{
      label: ~c"CHAPTER 38. Dusk.",
      href: ~c"8921354174505514122_2701-h-3.htm.xhtml#pgepubid00043",
      children: []
    },
    %{
      label: ~c"CHAPTER 39. First Night-Watch.",
      href: ~c"8921354174505514122_2701-h-3.htm.xhtml#pgepubid00044",
      children: []
    },
    %{
      label: ~c"CHAPTER 40. Midnight, Forecastle.",
      href: ~c"8921354174505514122_2701-h-3.htm.xhtml#pgepubid00045",
      children: []
    },
    %{
      label: ~c"CHAPTER 41. Moby Dick.",
      href: ~c"8921354174505514122_2701-h-3.htm.xhtml#pgepubid00046",
      children: []
    },
    %{
      label: ~c"CHAPTER 42. The Whiteness of the Whale.",
      href: ~c"8921354174505514122_2701-h-3.htm.xhtml#pgepubid00047",
      children: []
    },
    %{
      label: ~c"CHAPTER 43. Hark!",
      href: ~c"8921354174505514122_2701-h-3.htm.xhtml#pgepubid00048",
      ...
    },
    %{label: ~c"CHAPTER 44. The Chart.", ...},
    %{...},
    ...
  ]
}

You can see that multiple chapters can map to the same manifest item. This results in an href like some-page.xhtml#some-chapter-id.
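A small sketch of that mapping: splitting each href at the first "#" separates the manifest item from the in-page anchor, and grouping by the file part shows which chapters share an item. `HrefSketch` is a hypothetical name, and the entries are assumed to have been converted from charlists to binaries first.

```elixir
defmodule HrefSketch do
  # Splits "page.xhtml#anchor" into {"page.xhtml", "anchor"};
  # an href without a fragment yields {href, nil}.
  def split(href) do
    case String.split(href, "#", parts: 2) do
      [file, fragment] -> {file, fragment}
      [file] -> {file, nil}
    end
  end

  # Maps each file to the labels of the toc entries that point into it.
  def group_by_file(entries) do
    Enum.group_by(entries, fn e -> elem(split(e.href), 0) end, fn e -> e.label end)
  end
end

# Three entries taken from the Moby Dick output above, as binaries.
entries = [
  %{label: "CHAPTER 8. The Pulpit.", href: "8921354174505514122_2701-h-0.htm.xhtml#pgepubid00013"},
  %{label: "CHAPTER 9. The Sermon.", href: "8921354174505514122_2701-h-1.htm.xhtml#pgepubid00014"},
  %{label: "CHAPTER 10. A Bosom Friend.", href: "8921354174505514122_2701-h-1.htm.xhtml#pgepubid00015"}
]

grouped = HrefSketch.group_by_file(entries)
IO.inspect(grouped)
```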

The table of contents also allows nesting.
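To show how a consumer might work with that nesting, here is a minimal sketch that walks the `label`/`children` shape from the output above and flattens it into `{depth, label}` pairs. `TocSketch` is a hypothetical name. It also converts the charlist labels to binaries with `List.to_string/1`, which is why the label printed above as a list of integers (it contains the non-ASCII code point 8217) comes out as ordinary text.

```elixir
defmodule TocSketch do
  # Depth-first walk: each entry is emitted before its children,
  # tagged with its nesting depth (0 for top-level entries).
  def flatten(entries, depth \\ 0) do
    Enum.flat_map(entries, fn entry ->
      [{depth, List.to_string(entry.label)} | flatten(entry.children, depth + 1)]
    end)
  end
end

# Entries copied from the Moby Dick output above (hrefs omitted for brevity).
toc = [
  %{
    label: [79, 114, 105, 103, 105, 110, 97, 108, 32, 84, 114, 97, 110, 115, 99, 114, 105, 98,
            101, 114, 8217, 115, 32, 78, 111, 116, 101, 115, 58],
    children: []
  },
  %{
    label: ~c"ETYMOLOGY.",
    children: [
      %{label: ~c"(Supplied by a Late Consumptive Usher to a Grammar School.)", children: []}
    ]
  }
]

flat = TocSketch.flatten(toc)
IO.inspect(flat)
```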

I found this guide useful as a high-level overview of the contents of an EPUB file: https://help.apple.com/itc/booksassetguide/#/itc0f175a5b9.

aymanosman, May 16 '25 22:05

@aymanosman I see now. I will try to tackle the issue you mentioned next week. Alternatively, feel free to send a PR and I will review it.

milmazz, May 24 '25 19:05

@aymanosman After PR #93 (contributed by @neodejack), I think we cover the basics. I believe we should let consumers of this library parse the contents as they please, as we currently do with scripts, CSS, images, and other XHTML files.

I will close this issue as resolved, but I'm open to continuing the discussion and even reopening this issue.

milmazz, Sep 14 '25 03:09

Thanks, and I'm glad there is progress. Anybody interested in extracting the toc content can reference my gist.

aymanosman, Sep 14 '25 21:09