Karl Lorey issues

Results: 35 issues of Karl Lorey

Weird tokens found among words

Some examples contain weird strings as words that cannot be found in the readmes themselves. For example:
- https://github.com/lorey/github-stars-by-topic/tree/master/example/addyosmani/utility-belt-icagicagicagicagicagicagicagicagicagicagicagicagicagicagicag
- https://github.com/lorey/github-stars-by-topic/tree/master/example/jakewharton/apps-deprecated-icagtgljzw5zzwqgdw5kzxigdghliefwywnozsbmawnlbnnllcbwzxjzaw9u
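One possible explanation, which the issue itself does not confirm: the odd tokens look like lowercased Base64. Three spaces encode to `ICAg`, and the second token begins with what the lowercased encoding of indented license text would produce. The sketch below only checks that observation; the helper name `lowercased_base64` is hypothetical and not part of the project.

```python
import base64


def lowercased_base64(text):
    """Base64-encode text (as UTF-8) and lowercase the result."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii").lower()


# Token copied from the second example URL in the issue.
token = "icagtgljzw5zzwqgdw5kzxigdghliefwywnozsbmawnlbnnllcbwzxjzaw9u"

# Three spaces account for the repeated "icag" in the first example.
print(lowercased_base64("   "))

# The second token starts like indented license text would.
print(token.startswith(lowercased_base64("   Licensed ")))
```

Lowercasing is lossy, so the tokens cannot be decoded back directly. If this guess is right, the cause would be Base64-encoded readme content being tokenized without decoding it first.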

bug

Add dependencies with used versions

enhancement

Use correct data types

Currently all values are strings. The following values should be encoded in their respective format:
- area: int
- geoname_id: int
- languages: list
- neighbours: list
- numeric: int...
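The conversion listed above could be sketched as a small field-to-converter mapping. This is a minimal sketch under assumptions: it assumes list values arrive as comma-separated strings and that empty values should become `None` or an empty list. The function names and the sample record are made up for illustration.

```python
def parse_int(value):
    """Convert a numeric string to int; empty strings become None."""
    value = value.strip()
    return int(value) if value else None


def parse_list(value):
    """Split a comma-separated string into a list, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Field names come from the issue; the converters are assumptions.
CONVERTERS = {
    "area": parse_int,
    "geoname_id": parse_int,
    "numeric": parse_int,
    "languages": parse_list,
    "neighbours": parse_list,
}


def convert_record(record):
    """Return a copy of record with known fields converted to typed values."""
    converted = dict(record)
    for field, convert in CONVERTERS.items():
        if field in converted:
            converted[field] = convert(converted[field])
    return converted


# Illustrative record, not real data.
raw = {
    "name": "Exampleland",
    "area": "1234",
    "geoname_id": "42",
    "languages": "en, de",
    "neighbours": "",
    "numeric": "7",
}
print(convert_record(raw))
```

Fields without a converter, such as `name` here, stay strings.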

enhancement

Improve version pinning

Rather too narrow than too broad, e.g. the `text=` argument of bs4 could cause trouble with older versions.

Fuzzy text matching

Specifically for text matching something fuzzy would be great to reduce errors, e.g. checking for similarity of long texts to avoid whitespace-based errors, etc. Options:
* generic fuzzy matching for...
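One way to approach whitespace-tolerant matching is to normalize whitespace first and then compare with a similarity ratio. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The function names and the 0.9 threshold are assumptions for illustration, not the project's implementation.

```python
from difflib import SequenceMatcher


def normalize_whitespace(text):
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())


def similarity(a, b):
    """Return a ratio in [0, 1] for two texts after whitespace normalization."""
    return SequenceMatcher(
        None, normalize_whitespace(a), normalize_whitespace(b)
    ).ratio()


def fuzzy_equal(a, b, threshold=0.9):
    """Treat two texts as matching if their similarity reaches the threshold."""
    return similarity(a, b) >= threshold


# Texts that differ only in whitespace are treated as equal.
print(fuzzy_equal("Hello   world\n", " Hello world"))
```

`SequenceMatcher` can get slow on long inputs, so very long texts might need a cheaper check first, such as comparing the normalized strings for equality.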

enhancement

Find better selectors

Currently, we just use the next best selector we find, starting from generic to specific. But too generic selectors are bad, e.g. `div` most likely has no meaning, and on...

Show progress during training

Showing progress is not easy, but it should be supported somehow so that users can see how long training might take.

Match substrings

Often, users do not want to match the full attributes or text of nodes, but specific substrings. Solutions:
* generate extractors that use appropriate rules to transform node.text to the desired outcome.
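One way such a rule could look: derive a regular expression from a single example by treating the text before and after the desired substring as fixed context. This is a minimal sketch; `learn_substring_rule` and `apply_rule` are hypothetical names, and fixed context is only one of several possible rule types.

```python
import re


def learn_substring_rule(node_text, desired):
    """Build a regex that captures whatever sits between the example's
    prefix and suffix around the desired substring."""
    start = node_text.find(desired)
    if start == -1:
        raise ValueError("desired substring not found in node text")
    prefix = node_text[:start]
    suffix = node_text[start + len(desired):]
    return re.compile(re.escape(prefix) + "(.+?)" + re.escape(suffix) + "$")


def apply_rule(rule, node_text):
    """Return the captured substring, or None if the text does not fit."""
    match = rule.match(node_text)
    return match.group(1) if match else None


# Learn from one sample, then apply the rule to a different node's text.
rule = learn_substring_rule("Price: 12.99 EUR", "12.99")
print(apply_rule(rule, "Price: 8.50 EUR"))
```

The rule only generalizes when the surrounding text stays the same across nodes. Varying context would need looser rules.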

enhancement

Integer Matching

People want to extract proper integers. A straightforward way would be to implement items and extractors that return integers.
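An integer-returning extractor could be sketched as a function that finds the first integer in a node's text. The name `extract_int` is hypothetical, and the sketch assumes commas are thousands separators, which does not hold in every locale.

```python
import re

# Optional sign, digits, then optional comma-separated groups of three digits.
_INT_PATTERN = re.compile(r"[-+]?\d+(?:,\d{3})*")


def extract_int(text):
    """Return the first integer found in text, or None if there is none."""
    match = _INT_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(0).replace(",", ""))


print(extract_int("1,234 stars"))
print(extract_int("no numbers here"))
```

Returning `None` instead of raising lets the caller decide whether a missing number counts as an error.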

Re-think relationship between samples and matches

Maybe remove samples altogether and just let Matches deal with extraction? Or just use samples on the surface level?