Karl Lorey
Some examples contain weird strings as words that cannot be found in the READMEs themselves. For example:
- https://github.com/lorey/github-stars-by-topic/tree/master/example/addyosmani/utility-belt-icagicagicagicagicagicagicagicagicagicagicagicagicagicagicag
- https://github.com/lorey/github-stars-by-topic/tree/master/example/jakewharton/apps-deprecated-icagtgljzw5zzwqgdw5kzxigdghliefwywnozsbmawnlbnnllcbwzxjzaw9u
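A possible explanation, offered as a guess rather than a confirmed diagnosis: both suffixes look like lowercased Base64. The GitHub API returns README contents Base64-encoded, so an undecoded payload may have reached the tokenizer. The check below shows that Base64-encoding an assumed README opening line (an indented license header, not taken from the repository) and lowercasing it reproduces the second suffix; the first suffix is consistent with a run of spaces, since three spaces encode to `ICAg`.

```python
import base64

# Suffix copied from the second example URL above.
suffix = "icagtgljzw5zzwqgdw5kzxigdghliefwywnozsbmawnlbnnllcbwzxjzaw9u"

# Assumption: the README starts with an indented Apache license header.
guess = b"   Licensed under the Apache License, Version"

encoded = base64.b64encode(guess).decode("ascii")
matches_suffix = encoded.lower() == suffix

# Three spaces encode to "ICAg", which lowercases to the repeated "icag".
spaces_encoded = base64.b64encode(b"   ").decode("ascii")

print(matches_suffix, spaces_encoded)
```

If this is the cause, decoding the payload with `base64.b64decode` before tokenizing would be the natural fix.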
Currently all values are strings. The following values should be encoded in their respective format:
- area: int
- geoname_id: int
- languages: list
- neighbours: list
- numeric: int...
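A minimal sketch of such a conversion, assuming each record is a dict of strings, that list fields are comma-separated, and that integer fields hold digit-only strings. The function name `convert_country`, the separator, and the sample values are illustrative assumptions, not part of the library.

```python
def convert_country(raw, list_separator=","):
    """Return a copy of raw with known fields converted from strings."""
    int_fields = ("area", "geoname_id", "numeric")
    list_fields = ("languages", "neighbours")

    converted = dict(raw)
    for key in int_fields:
        value = raw.get(key)
        if value not in (None, ""):  # leave missing or empty values untouched
            converted[key] = int(value)
    for key in list_fields:
        value = raw.get(key)
        if value is not None:
            # Split on the assumed separator and drop empty parts.
            parts = (part.strip() for part in value.split(list_separator))
            converted[key] = [part for part in parts if part]
    return converted


# Illustrative record; the values are sample data only.
sample = {
    "name": "Andorra",
    "area": "468",
    "geoname_id": "3041565",
    "languages": "ca",
    "neighbours": "ES,FR",
    "numeric": "020",
}
typed = convert_country(sample)
```

Note that converting `numeric` to an int drops leading zeros (`"020"` becomes `20`), which may matter if the zero-padded code is needed elsewhere.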
Rather too narrow than too broad, e.g. `text=` of bs4 could cause trouble with older versions.
Specifically for text matching, something fuzzy would be great to reduce errors, e.g. checking long texts for similarity to avoid whitespace-based errors. Options:
* generic fuzzy matching for...
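One way the generic option could look, sketched with the standard library's `difflib`. The helper names and the `0.9` threshold are assumptions chosen for illustration, not existing project code.

```python
import difflib


def normalize_whitespace(text):
    """Collapse all runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())


def is_similar(expected, actual, threshold=0.9):
    """Return True if both texts are close after whitespace normalization."""
    a = normalize_whitespace(expected)
    b = normalize_whitespace(actual)
    ratio = difflib.SequenceMatcher(None, a, b).ratio()
    return ratio >= threshold


same = is_similar("Hello   world\n", "Hello world")
different = is_similar("Hello world", "Goodbye")
```

Normalizing first removes pure whitespace differences entirely, so the similarity threshold only has to absorb real textual deviations.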
Currently, we just use the next best selector we find, going from generic to specific. But overly generic selectors are bad, e.g. `div` most likely has no meaning, and on...
Showing progress is not easy, but it should somehow be enabled to show users how long a run might take.
Often, users do not want to match full attributes or the full text of nodes, but specific substrings. Solutions:
* generate extractors that use appropriate rules to transform node.text to the desired outcome.
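A rough sketch of what a rule-based extractor might look like, using a regular expression as the rule. The class `RegexTextExtractor` and its `extract` method are hypothetical names for illustration and do not reflect the project's actual extractor interface.

```python
import re


class RegexTextExtractor:
    """Hypothetical extractor that pulls a substring out of a node's text."""

    def __init__(self, pattern):
        self.regex = re.compile(pattern)

    def extract(self, text):
        match = self.regex.search(text)
        if match is None:
            return None
        # Prefer the first capture group when the pattern defines one.
        return match.group(1) if self.regex.groups else match.group(0)


price = RegexTextExtractor(r"Price:\s*(\S+)")
value = price.extract("Price: 19.99 EUR")
missing = price.extract("no price here")
```

Generating the pattern from samples is the harder part and is left open here; the sketch only covers applying a rule once it exists.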
People want to extract proper integers. A straightforward way would be to implement item and extractors that return integers.
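As a starting point, an integer extractor could be as small as the sketch below. The function name `extract_int` is hypothetical, and treating commas as thousands separators is an assumption that will not hold for every locale.

```python
import re


def extract_int(text):
    """Return the first integer found in text, or None if there is none."""
    match = re.search(r"-?\d[\d,]*", text)
    if match is None:
        return None
    # Assumption: commas are thousands separators and can be dropped.
    return int(match.group(0).replace(",", ""))


stars = extract_int("1,234 stars")
negative = extract_int("-5 points")
nothing = extract_int("no numbers")
```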
Maybe remove samples altogether and just let Matches deal with extraction? Or just use samples on the surface level?