html-extraction topic
List of html-extraction repositories

sumy
Stars: 3.4k
Forks: 523
Module for automatic summarization of text documents and HTML pages.

breadability
Stars: 203
Forks: 26
Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative).

hext
Stars: 51
Forks: 3
Domain-specific language for extracting structured data from HTML documents.