44 lines
1.7 KiB
Plaintext
44 lines
1.7 KiB
Plaintext
Return-Path: guido@python.org
|
|
Delivery-Date: Fri Sep 6 16:05:22 2002
|
|
From: guido@python.org (Guido van Rossum)
|
|
Date: Fri, 06 Sep 2002 11:05:22 -0400
|
|
Subject: [Spambayes] Deployment
|
|
In-Reply-To: Your message of "Fri, 06 Sep 2002 11:02:11 EDT."
|
|
<3D788B92.22739.1D9E0FD1@localhost>
|
|
References: "Your message of Fri, 06 Sep 2002 10:39:48 EDT."
|
|
<3D788653.9143.1D8992DA@localhost>
|
|
<3D788B92.22739.1D9E0FD1@localhost>
|
|
Message-ID: <200209061505.g86F5MM14762@pcp02138704pcs.reston01.va.comcast.net>
|
|
|
|
> > What's an auto-ham?
|
|
>
|
|
> Automatically marking something as ham after a given
|
|
> timeout.. regardless of how long that timeout is, someone is going
|
|
> to forget to submit the message back as spam.
|
|
|
|
OK, here's a refinement. Assuming very little spam comes through, we
|
|
only need to pick a small percentage of ham received as new training
|
|
ham to match the new training spam. The program could randomly select
|
|
a sufficient number of saved non-spam msgs and ask the user to
|
|
validate this selection. You could do this once a day or week (config
|
|
parameter).
|
|
|
|
> How many spams-as-hams can be accepted before the f-n rate gets
|
|
> unacceptable?
|
|
|
|
Config parameter.
|
|
|
|
> I view IMAP as a stop-gap measure until tighter integration with
|
|
> various email clients can be achieved.
|
|
>
|
|
> I still feel it's better to require classification feedback from the
|
|
> recipient, rather than make any assumptions after some period of
|
|
> time passes. But this is an end-user issue and we're still at the
|
|
> algorithm stage.. ;-)
|
|
|
|
I'm trying to think about the end-user issues because I have nothing
|
|
to contribute to the algorithm at this point. For deployment we need
|
|
both!
|
|
|
|
--Guido van Rossum (home page: http://www.python.org/~guido/)
|