.\"if n .pl +(135i-\n(.pu)
.Id $Id: procmailsc.5,v 1.1 2003/06/16 17:06:43 motoki Exp $
.TH PROCMAILSC 5 \*(Dt BuGless
.SH NAME
procmailsc \- procmail weighted scoring technique
.SH DESCRIPTION
In addition to the traditional true or false conditions you can specify
on a recipe, you can use a weighted scoring technique to decide whether
a certain recipe matches or not.  When weighted scoring is used in a
recipe, the final score for that recipe must be positive for it
to match.
.PP
A condition can contribute to the score if you allocate it
a `weight'
.RB ( w )
and an `exponent'
.RB ( x ).
You do this by preceding the condition (on the same line) with:
.RS
.B w^x
.RE
Both
.B w
and
.B x
are real numbers between \-2147483647.0 and 2147483647.0 inclusive.
.SH "Weighted regular expression conditions"
The first time the regular expression is found, it will add
.I w
to the score.  The second time it is found,
.I w*x
will be added.  The third time it is found,
.I w*x^2
will be added.  The fourth time
.I w*x^3
will be added.  And so forth.
.PP
This can be described by the following concise formula:
.RS
.nf
w * Sum(k=1..n) x^(k\-1) = w * (x^n \- 1) / (x \- 1)
.fi
.RE
It represents the total added score for this condition if
.B n
matches are found.
.PP
Note that the following case distinctions can be made:
.TP 8
x=0
Only the first match will contribute w to the score.  Any subsequent
matches are ignored.
.TP
x=1
Every match will contribute the same w to the score.  The score grows
linearly with the number of matches found.
.TP
0<x<1
Every match will contribute less to the score than the previous one.
The score will asymptotically approach a certain value (see the
.B NOTES
section below).
.TP
1<x
Every match will contribute more to the score than the previous one.
The score will grow exponentially.
.TP
x<0
Can be utilised to favour an odd or even number of matches.
.PP
If the regular expression is negated (i.e., matches if it isn't found),
then
.B n
can obviously only be zero or one.
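.PP
The case distinctions above can be checked numerically.  The following
is a minimal Python sketch, not procmail code: the function name
regex_score is hypothetical, and it assumes the score is the plain
geometric sum w + w*x + w*x^2 + ... over n matches, ignoring the
clamping procmail applies at its `infinity' bounds.
.nf
```python
def regex_score(w: float, x: float, n: int) -> float:
    """Score added by a weighted regex condition after n matches.

    Assumption: match number k (counting from 1) adds w * x**(k-1).
    """
    # Python evaluates 0 ** 0 as 1, so with x=0 only the first match counts.
    return sum(w * x ** k for k in range(n))


if __name__ == "__main__":
    for x in (0, 1, 0.75, 2, -1):
        # Running totals after 1..5 matches with w=100
        print(x, [regex_score(100, x, n) for n in range(1, 6)])
```
.fi
With x=\-1 the running total alternates between w and 0, which is how a
negative exponent can favour an odd or even number of matches.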
.SH "Weighted program conditions"
If the program returns an exitcode of EXIT_SUCCESS (=0), then the total
added score will be
.BR w .
If it returns any other exitcode (indicating failure), the total added
score will be
.BR x .
.PP
If the exitcode of the program is negated, then the exitcode will
be considered as if it were a virtual number of matches.  Calculation
of the added score then proceeds as if it had been a normal regular
expression with
.BR n= `exitcode'
matches.
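.PP
As an illustration of the two program-condition modes, here is a small
Python sketch.  It is a model, not procmail's implementation: the name
program_score is hypothetical, and it assumes that a non-negated
condition adds w on exit code 0 and x on any other exit code, while a
negated condition treats the exit code as a number of regular
expression matches.
.nf
```python
def program_score(w: float, x: float, exitcode: int,
                  negated: bool = False) -> float:
    """Score added by a weighted program condition (illustrative model)."""
    if negated:
        # Exit code acts as a virtual match count n: w + w*x + ... (n terms)
        return sum(w * x ** k for k in range(exitcode))
    # Assumed rule: success contributes w, failure contributes x
    return w if exitcode == 0 else x


if __name__ == "__main__":
    print(program_score(5, -3, 0))                # success
    print(program_score(5, -3, 1))                # failure
    print(program_score(2, 1, 3, negated=True))   # exit code 3 as 3 matches
```
.fi
In the negated form a program can thus report a count through its exit
code and have it weighted like repeated matches.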
.SH "Weighted length conditions"
If the length of the actual mail is
.B M
then:
.RS
.nf
* w^x  > L
.fi
.RE
will generate an additional score of:
.RS
.nf
w * (M / L)^x
.fi
.RE
And:
.RS
.nf
* w^x  < L
.fi
.RE
will generate an additional score of:
.RS
.nf
w * (L / M)^x
.fi
.RE
.PP
In both cases, if L=M, this will add w to the score.  In the former case,
however, larger mails will be favoured; in the latter case, smaller
mails will be favoured.  Although x can be varied to fine-tune the
steepness of the function, typical usage sets x=1.
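.PP
The effect of a length condition can be tried out with a short Python
sketch.  The function name length_score and its parameters are
hypothetical; it assumes the added score is w*(M/L)^x for a `> L'
condition and w*(L/M)^x for a `< L' condition, which agrees with the
statement that L=M adds exactly w.
.nf
```python
def length_score(w: float, x: float, mail_len: int, limit: int,
                 op: str = ">") -> float:
    """Score added by a weighted length condition (illustrative model).

    mail_len is M, limit is L; op is ">" or "<".
    """
    if op == ">":
        ratio = mail_len / limit   # grows with larger mails
    elif op == "<":
        ratio = limit / mail_len   # grows with smaller mails
    else:
        raise ValueError("op must be '>' or '<'")
    return w * ratio ** x


if __name__ == "__main__":
    # Assumed example weights: w=-100, x=3, L=2000
    for size in (1000, 2000, 4000):
        print(size, length_score(-100, 3, size, 2000, ">"))
```
.fi
With a negative w and a `> L' condition, larger mails are penalised
increasingly steeply as x grows.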
.SH MISCELLANEOUS
You can query the final score of all the conditions on a recipe from the
environment variable
.BR $= .
This variable is set
.I every
time just after procmail has parsed all conditions on a recipe (even if the
recipe is not being executed).
.SH EXAMPLES
The following recipe will ditch all mails having more than 150 lines in the
body.
The first condition contains an empty regular expression which, because
it always matches, is used to give our score a negative offset.
The second condition then matches every line in the mail and uses
up the negative offset we gave (one point per line).  In the end,
the score will only be positive if the mail contained more than 150 lines.
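.PP
The arithmetic behind this recipe can be sketched in Python.  This is
only a model: the names line_count_score and is_ditched are
hypothetical, and the weights are assumptions read off the description
(an always-matching condition worth \-150 with exponent 0, and a
per-line condition worth 1 with exponent 1).
.nf
```python
def line_count_score(body: str, offset: float = -150.0) -> float:
    """Score of the assumed two-condition recipe for a given mail body."""
    score = offset                     # empty regex: matches once, x=0
    score += len(body.splitlines())    # every line adds 1 (w=1, x=1)
    return score


def is_ditched(body: str) -> bool:
    # The recipe only matches when the final score is positive
    return line_count_score(body) > 0


if __name__ == "__main__":
    print(is_ditched("line\n" * 150))
    print(is_ditched("line\n" * 151))
```
.fi
A body of exactly 150 lines ends at a score of 0, which is not
positive, so only longer mails are ditched.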
.PP
Suppose you have a priority folder which you always read first.  The next
recipe picks out the priority mail and files it in this special folder.
The first condition is a regular one, i.e., it doesn't contribute to the
score, but simply has to be satisfied.  The other conditions describe things
like: john and claire usually have something important to say, meetings
are usually important, replies are favoured a bit, mails about Elvis
(this is merely an example :\-) are favoured (the more he is mentioned, the
more the mail is favoured, but the maximum extra score due to Elvis will
be 4000, no matter how often he is mentioned), lots of quoted lines are
disliked, smileys are appreciated (the score for those will reach a maximum
of 3500), boss, jane and henry usually don't send
interesting mails, and the mails should preferably be small (e.g., 2000\-byte
mails will score \-100, 4000\-byte mails \-800).
As you see, if one of the uninteresting people sends mail, the mail
still has a chance of landing in the priority folder, e.g., if it is about
a meeting, or if it contains at least two smileys.
.nf
* !^Precedence:.*(junk|bulk)
* 2000^0 ^From:.*(john@home|claire@work)
* 2000^0 ^Subject:.*meeting
* 300^0 ^Subject:.*Re:
* 1000^.75 elvis|presley
* \-500^0 ^From:.*(boss|jane|henry)@work
.fi
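.PP
To see how these weights combine, the weighted conditions shown above
can be modelled in Python.  This is a rough sketch under several
assumptions: Python's re module stands in for procmail's internal
egrep, matching is taken to be case-insensitive, only the five weighted
conditions visible above are included, and the names CONDITIONS and
score are hypothetical.
.nf
```python
import re

# (weight, exponent, pattern) for the weighted conditions listed above
CONDITIONS = [
    (2000, 0, r"^From:.*(john@home|claire@work)"),
    (2000, 0, r"^Subject:.*meeting"),
    (300, 0, r"^Subject:.*Re:"),
    (1000, 0.75, r"elvis|presley"),
    (-500, 0, r"^From:.*(boss|jane|henry)@work"),
]


def score(mail: str) -> float:
    """Sum the geometric contribution of every condition over the mail."""
    total = 0.0
    for w, x, pattern in CONDITIONS:
        # Number of non-overlapping matches found in the whole mail
        n = len(re.findall(pattern, mail, re.IGNORECASE | re.MULTILINE))
        total += sum(w * x ** k for k in range(n))
    return total


if __name__ == "__main__":
    mail = "From: boss@work\nSubject: meeting at noon\n\nSee you there.\n"
    print(score(mail))
```
.fi
In this model a mail from boss about a meeting still ends up with a
positive score, because the meeting bonus outweighs the sender penalty.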
.PP
If you are subscribed to a mailing list and would just like to read
the quality mails, then the following recipes could do the trick.
First we make sure that the mail is coming from the mailing list.
Then we check if it is from certain persons whose opinion we value,
or about a subject we absolutely want to know everything
about.  If it is, file it.  Otherwise, check if the ratio of quoted lines
to original lines is at most 1:2.  If it exceeds that, ditch the mail.
Everything that survived the previous test is filed.
.nf
^From mailinglist-request@some.where
* ^(From:.*(paula|bill)|Subject:.*skiing)
.fi
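.PP
The 1:2 ratio test described above can be expressed with two linear
(x=1) weights of opposite sign.  The Python sketch below illustrates
the idea only: the weights +2 per quoted line and \-1 per original line
are assumed values chosen to give the 1:2 threshold, and the names
quote_score and too_much_quoting are hypothetical.
.nf
```python
def quote_score(body: str, w_quoted: float = 2.0,
                w_original: float = -1.0) -> float:
    """Linear score: quoted lines add w_quoted, other lines w_original."""
    quoted = original = 0
    for line in body.splitlines():
        if line.startswith(">"):
            quoted += 1
        elif line:                 # non-empty line not starting with '>'
            original += 1
    return w_quoted * quoted + w_original * original


def too_much_quoting(body: str) -> bool:
    # Positive score means quoted:original exceeds 1:2
    return quote_score(body) > 0


if __name__ == "__main__":
    print(too_much_quoting("> old\nnew one\nnew two\n"))
    print(too_much_quoting("> old\n> older\nnew one\nnew two\n"))
```
.fi
Because both exponents are 1, the score is a plain weighted difference
of the two line counts, and its sign encodes the ratio test.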
.PP
For further examples you should look in the
.BR procmailex (5)
man page.
.SH CAVEATS
Because this speeds up the search by an order of magnitude,
the procmail internal egrep will always search for the leftmost
.I shortest
match, unless it is determining what to assign to
.BR MATCH ,
in which case it searches the leftmost
.I longest
match.
.PP
E.g. for the leftmost
.I shortest
match, by itself, the regular expression:
.TP
.B .*
will always match a zero length string at the same spot.
.TP
.B .+
will always match one character (except newlines of course).
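.PP
The difference between shortest and longest matching can be seen with
Python's lazy quantifiers.  This is only an analogy: Python's re module
is not procmail's egrep, and the lazy forms .*? and .+? are used here
to imitate the leftmost shortest behaviour described above.
.nf
```python
import re

text = "abc"

# Lazy quantifiers imitate a leftmost shortest search
shortest_star = re.match(r".*?", text).group()   # zero-length match
shortest_plus = re.match(r".+?", text).group()   # exactly one character

# Greedy quantifiers give the leftmost longest match instead
longest_star = re.match(r".*", text).group()

if __name__ == "__main__":
    print(repr(shortest_star), repr(shortest_plus), repr(longest_star))
```
.fi
This matters for scoring because the number of matches found depends on
how much text each individual match consumes.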
.SH BUGS
If, in a length condition, you specify an
.B x
that causes an overflow, procmail is at the mercy of the
.BR pow (3)
function in your mathematical library.
.PP
Floating point numbers in `engineering' format (e.g., 12e5) are not accepted.
.SH MISCELLANEOUS
As soon as `plus infinity' (2147483647) is reached, any subsequent
.I weighted
conditions will simply be skipped.
.PP
As soon as `minus infinity' (\-2147483647) is reached, the condition will
be considered as `no match' and the recipe will terminate early.
.SH NOTES
If in a regular expression weighted formula
.BR 0<x<1 ,
the total added score for this condition will asymptotically approach:
.RS
.nf
w / (1 \- x)
.fi
.RE
In order to reach half the maximum value you need
.RS
.nf
n = \- ln 2 / ln x
.fi
.RE
matches.
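.PP
Both quantities follow from the geometric series: for 0<x<1 the sum
w*(1\-x^n)/(1\-x) tends to w/(1\-x), and it reaches half of that when
x^n = 1/2.  A minimal Python sketch, with hypothetical function names
score_limit and matches_for_half:
.nf
```python
import math


def score_limit(w: float, x: float) -> float:
    """Asymptotic score of a weighted regex condition, valid for 0 < x < 1."""
    if not 0 < x < 1:
        raise ValueError("the limit only exists for 0 < x < 1")
    return w / (1 - x)


def matches_for_half(x: float) -> float:
    """Number of matches n at which half the limit is reached (x**n == 1/2)."""
    if not 0 < x < 1:
        raise ValueError("x must lie strictly between 0 and 1")
    return -math.log(2) / math.log(x)


if __name__ == "__main__":
    # Weights taken from the Elvis condition: w=1000, x=.75
    print(score_limit(1000, 0.75), matches_for_half(0.75))
```
.fi
For w=1000 and x=.75 the limit is 4000, matching the maximum quoted for
the Elvis condition, and half of it is passed between the second and
third match.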
.SH AUTHORS
Stephen R. van den Berg
.br
Philip A. Guenther
.RS
<guenther@sendmail.com>
.RE
.\".if n .pl -(\n(.tu-1i)