
News

Posted about 20 years ago by Brian Burton
This release moves all of the improvements from the latest experimental release into the stable release. Major improvements include a newer, more robust email parser and an alternative database format for people who can't use either PBL or BDB.

This version includes a brand new email parsing algorithm that substantially improves parsing speed. The new parser captures more meaningful terms than the old one and holds the entire message in memory, where it can be manipulated in a variety of ways. This release also adds an auto-train command, which can be used to build a better starting database than the spam/good or train-spam/train-good commands produce.
Posted over 20 years ago by Brian Burton
This release replaces the old hash data file implementation with a faster, more reliable one. The hash data file implementation uses a fixed-size data file and performs I/O at twice the speed of the ISAM implementations (PBL and BDB).

The hash data file does not store the text of each term. Instead, it stores only a 32-bit hash code computed from the term. As a result, the format may, rarely, confuse one term for another if the two terms have the same hash value. Also, since terms are not stored as text, users cannot use the dump command to explore the terms in the database and see which words are spammy.

These issues are relatively minor compared to the fixed file size and high performance that hash files offer. Some users may be quite happy with this tradeoff.

This is a beta release of the hash file code. I am using this format for my own email now. If anyone else can try it out and report any problems, I'd be most grateful.
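To make the tradeoff concrete, here is a minimal sketch of a fixed-size table that keys counts by a 32-bit hash code instead of the term text. Everything in it is an assumption for illustration: the class name HashCounts, the use of CRC-32 as the hash, the slot layout, and the linear probing are hypothetical and are not taken from the actual hash data file code.

```python
import zlib


def term_hash(term):
    """Return a 32-bit hash code for a term.

    CRC-32 is only a stand-in here; the release notes do not say which
    hash function the real data file uses.
    """
    return zlib.crc32(term.encode("utf-8")) & 0xFFFFFFFF


class HashCounts:
    """Fixed-capacity table of (hash code, good count, spam count) entries.

    The term text is never stored, so the table cannot list its words,
    and two terms with the same hash code share one entry.
    """

    def __init__(self, slots=1024, hash_fn=term_hash):
        self.hash_fn = hash_fn
        self.table = [None] * slots  # size is fixed at construction

    def _find(self, code):
        """Linear probe for the slot holding code, or the first free slot."""
        n = len(self.table)
        i = code % n
        for _ in range(n):
            entry = self.table[i]
            if entry is None or entry[0] == code:
                return i
            i = (i + 1) % n
        return -1  # every slot is taken by some other hash code

    def add(self, term, good=0, spam=0):
        code = self.hash_fn(term)
        i = self._find(code)
        if i < 0:
            raise RuntimeError("table is full")
        if self.table[i] is None:
            self.table[i] = [code, 0, 0]
        self.table[i][1] += good
        self.table[i][2] += spam

    def counts(self, term):
        i = self._find(self.hash_fn(term))
        if i < 0 or self.table[i] is None:
            return (0, 0)
        return (self.table[i][1], self.table[i][2])
```

Passing a deliberately weak hash, for example `HashCounts(slots=8, hash_fn=len)`, makes "free" and "tree" share one entry, which is the rare confusion described above made easy to see. With a real 32-bit hash such collisions are far less likely.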
Posted almost 21 years ago by Brian Burton
Fast, intelligent, automatic spam detector using Paul Graham style Bayesian analysis of word counts in spam and non-spam emails. Filtering adapts to personal tastes automatically. No manual rule creation required.

This release adds the final missing pieces to the new parser code. MBX files and Content-Length headers are now supported. Database cleanup when signals are caught has also been improved. I would like to move 1.1 into the stable branch fairly soon, so if folks would test out this release and report any problems it would be a big help!

The latest release can be found here: https://sourceforge.net/project/showfiles.php?group_id=61201&package_id=73566
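For readers unfamiliar with the approach, the following is a rough sketch of word-count scoring in the style of Paul Graham's essay "A Plan for Spam". It follows the essay's formulas (good counts doubled, probabilities clamped to 0.01 and 0.99, the 15 most extreme words combined) and is not taken from this project's source, which may differ in its details. The function names are hypothetical, and giving rarely seen words the essay's 0.4 value for unseen words is a simplification made here.

```python
from math import prod


def word_prob(good, bad, ngood, nbad):
    """Estimate the probability that a message containing this word is spam.

    good, bad: how often the word appeared in good and spam mail.
    ngood, nbad: number of good and spam messages seen (assumed > 0).
    """
    g = 2 * good  # good counts are doubled to bias against false positives
    b = bad
    if g + b < 5:
        return 0.4  # too little evidence; treated like an unseen word here
    gf = min(1.0, g / ngood)
    bf = min(1.0, b / nbad)
    return max(0.01, min(0.99, bf / (gf + bf)))


def spam_score(words, good_counts, bad_counts, ngood, nbad, top=15):
    """Combine the most extreme word probabilities into one message score."""
    probs = [
        word_prob(good_counts.get(w, 0), bad_counts.get(w, 0), ngood, nbad)
        for w in set(words)
    ]
    # Keep the words whose probability lies farthest from a neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    probs = probs[:top]
    s = prod(probs)
    h = prod(1.0 - p for p in probs)
    return s / (s + h)
```

Because the counts come from the user's own spam and non-spam mail, the scores shift as more mail is classified, which is what lets this kind of filter adapt without hand-written rules.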