Inactive

Commits: Listings

Analyzed about 22 hours ago, based on code collected 1 day ago.
Apr 29, 2023 — Apr 29, 2024
Commit Message | Date
Update versions for 0.1.12 | almost 11 years ago
Set urllib useragent string. | almost 11 years ago
Create console scripts with python version suffix | almost 11 years ago
Write readable content into temp file in binary mode | almost 11 years ago
Use py3k compatible urllib with own User-Agent header | almost 11 years ago
Added string representation for empty scored node | about 11 years ago
Added missing empty line | about 11 years ago
Updated list of similar tools | about 11 years ago
Renamed '_py3k.py' -> '_compat.py' | about 11 years ago
Fixed named argument name 'fragment' | about 11 years ago
Removed file with version number | about 11 years ago
Added new test article | about 11 years ago
Cleanups for function 'clean_document' | about 11 years ago
Don't remove h1/h2 elements from readable article | about 11 years ago
Cleanups | about 11 years ago
Better log messages while scoring candidates | about 11 years ago
Added scored nodes into candidates | about 11 years ago
1 pt for 100 inner text chars is computed as float | about 11 years ago
Updated docstring for 'get_link_density' [ci skip] | about 11 years ago
Added simple test for parser of annotated text | about 11 years ago
Load articles/snippets as binary strings | about 11 years ago
Link density is computed with normalized whitespace | about 11 years ago
Use groupby for to group annotated texts | about 11 years ago
Changed representation of annotated text | about 11 years ago
Convert <hr> tag into paragraphs | about 11 years ago
Added string utils for handling whitespace | about 11 years ago
Test for changing multiple <br> into <p> | about 11 years ago
Renamed property of 'OriginalDocument': 'html' -> 'dom' | about 11 years ago
Cleaned class 'Article' | about 11 years ago
Drop unlikely candidates as soon as you can | about 11 years ago