Explore what doc tooling we use in the SDK and how it deals with dataclass docstrings
Let's consider the following example:
    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass
    class MemorySnapshot:
        """A snapshot of memory usage.

        Args:
            total_bytes: Total memory available in the system.
            current_bytes: Memory usage of the current Python process and its children.
            max_memory_bytes: The maximum memory that can be used by `AutoscaledPool`.
            max_used_memory_ratio: The maximum acceptable ratio of `current_bytes` to `max_memory_bytes`.
            created_at: The time at which the measurement was taken.
        """

        total_bytes: int
        current_bytes: int
        max_memory_bytes: int
        max_used_memory_ratio: float
        created_at: datetime = field(default_factory=lambda: datetime.now(tz=timezone.utc))

        @property
        def is_overloaded(self) -> bool:
            """Returns whether the memory is considered as overloaded."""
            return (self.current_bytes / self.max_memory_bytes) > self.max_used_memory_ratio
Can our doc tooling (perhaps the one we use in the SDK) handle the `Args:` section of a dataclass docstring properly?
This is based on the discussion here: https://github.com/apify/crawlee-py/pull/20#discussion_r1521198126.
Paging @barjin - could you provide some details about how we generate docs for the Python SDK?
From https://stackoverflow.com/questions/51125415/how-do-i-document-a-constructor-for-a-class-using-python-dataclasses it seems that Sphinx can indeed handle `Args:` in the class docstring in a reasonable fashion.
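For reference, here is a rough, standard-library-only sketch of the kind of check a doc tool would need to do: read the `Args:` section of the class docstring and match it against the dataclass fields. It does not use Sphinx or pydoc-markdown, and `documented_args` is a hypothetical helper written for this comment, not part of any of our tooling. It assumes Google-style docstrings with one line per argument.

```python
import inspect
import re
from dataclasses import dataclass, field, fields
from datetime import datetime, timezone


@dataclass
class MemorySnapshot:
    """A snapshot of memory usage.

    Args:
        total_bytes: Total memory available in the system.
        current_bytes: Memory usage of the current Python process and its children.
        max_memory_bytes: The maximum memory that can be used by `AutoscaledPool`.
        max_used_memory_ratio: The maximum acceptable ratio of `current_bytes` to `max_memory_bytes`.
        created_at: The time at which the measurement was taken.
    """

    total_bytes: int
    current_bytes: int
    max_memory_bytes: int
    max_used_memory_ratio: float
    created_at: datetime = field(default_factory=lambda: datetime.now(tz=timezone.utc))


def documented_args(cls) -> dict:
    """Hypothetical helper: map names in a Google-style `Args:` section to descriptions."""
    doc = inspect.getdoc(cls) or ""  # getdoc() strips the common indentation
    args = {}
    in_args = False
    for line in doc.splitlines():
        if line.strip() == "Args:":
            in_args = True
            continue
        if in_args:
            # Assumes single-line entries of the form "    name: description".
            match = re.match(r"\s+(\w+):\s*(.+)", line)
            if not match:
                break  # first non-entry line ends the section
            args[match.group(1)] = match.group(2)
    return args


# Dataclass fields are introspectable, so undocumented ones are easy to spot.
field_names = [f.name for f in fields(MemorySnapshot)]
missing = [name for name in field_names if name not in documented_args(MemorySnapshot)]
print(missing)
```

So the information is all there in the class docstring and in `dataclasses.fields`; the open question is only whether our generator surfaces it on the class or expects it on an explicit `__init__`.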
Following the in-office discussion, I'm sharing this here, so we can refer to it:
The current API reference for Python projects is an amalgamation of pydoc-markdown, existing tools we have for JS projects and one very ugly python-syntax-tree to javascript-syntax-tree conversion script (plus a pinch of bash scripts).
It's by no means a good solution. Having something new, clean and cool in this project would be very nice, as we could then port it from here to all the other repos. Sorry to let you all down :(
Closing this one, it has already been explored and discussed with @barjin and @janbuchar, and we have https://github.com/apify/crawlee-python/issues/324.