Return-Path: skip@pobox.com
Delivery-Date: Mon Sep 9 20:35:04 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 9 Sep 2002 14:35:04 -0500
Subject: [Spambayes] deleting "duplicate" spam before training? good idea or bad?
In-Reply-To: <20020909192542.GB2002@cthulhu.gerg.ca>
References: <15740.52432.861148.597750@12-248-11-90.client.attbi.com>
 <20020909192542.GB2002@cthulhu.gerg.ca>
Message-ID: <15740.63464.611324.2220@12-248-11-90.client.attbi.com>

    Greg> OTOH, look into DCC (Distributed Checksum Clearinghouse,
    Greg> http://www.rhyolite.com/anti-spam/dcc/), which uses fuzzy
    Greg> checksums. It's quite likely that DCC's checksumming scheme is
    Greg> better than something any of us would throw together for personal
    Greg> use (no offense, Skip!).

None taken. I wrote my little script before I was aware DCC existed. Even
now, it seems like overkill for my use.

Skip