Muli Ben-Yehuda's journal

June 22, 2003

spam filtering with bogofilter

Filed under: Uncategorized — Muli Ben-Yehuda @ 2:53 PM

Yesterday evening, I saw Orna tweaking her spam rules manually.
I opined that tweaking the rules by hand is rather inefficient,
and she opined right back that my setup is even less efficient.
My setup, in case you’re wondering, can be summed up as
“manually delete spam whenever it reaches my inbox”.

The reason I haven’t set up spam filtering so far is that I only
get a dozen or so spams every 24 hours. Since I read my email
compulsively (some would say obsessively), I rarely have more
than a few messages to deal with at a time, and so what if one
of them is a spam? I just delete it, just like I delete most of
the rest.

But yesterday I decided that enough is enough, and that I’d install a
Bayesian spam filter. I googled for a bit, and settled on
bogofilter. apt-get install bogofilter took care of downloading and
installing, and then I set out to read the man page and configure it.
At first I was put off by the fact that ESR wrote it (cf. the CML2
debacle), but then I decided to give it a try anyway.

Bottom line: it works, once you train it. I use the following two
pieces of configuration that might prove useful, both of which are
adapted from the bogofilter man page:

In my .procmailrc:

# filter mail through bogofilter, tagging it as spam and
# updating the word lists

# Since I built the good list from all of my saved mail, and since
# bogofilter has a bug (which I will soon report to the appropriate
# place) when called from procmail to update my very large good list,
# unlike the example in the bogofilter man page, I first run the
# incoming message through bogofilter, without registering its
# words. Then, if it's spam, the bad list gets updated.
:0fw
| /usr/bin/bogofilter -e -p

# file the mail to spam-bogofilter if it's spam, and update the bad
# list
:0
* ^X-Bogosity: Yes, tests=bogofilter
{
  :0 c
  | /usr/bin/bogofilter -s

  :0
  spam-bogofilter
}

And in my .muttrc, to ‘mark as spam and delete’ spam that makes it into my inbox:

# bogofilter integration, taken from the bogofilter man page
macro index \eD "<enter-command>unset wait_key\n\
<pipe-entry>bogofilter -s -l\n\
<enter-command>set wait_key\n\
<delete-message>" "delete message as spam"

A couple of other useful command lines:

# register every mail in the file mbox-name as spam
$ bogofilter -M -I mbox-name -s -v

# register every mail in the file mbox-name as good (ham)
$ bogofilter -M -I mbox-name -n -v

I used these two to build my good list and bad list quickly.

June 17, 2003

Annual Book Fair Bounty

Filed under: Uncategorized — Muli Ben-Yehuda @ 3:23 PM

Yesterday’s bounty from Shvua Hasefer (“book week” in Hebrew, an annual book fair):

1. Membership in the Israeli Science Fiction and Fantasy NPO, which I’ve been meaning to get for ages.
2. Niccolò Machiavelli’s The Prince.

Orna bought three books I’ll read as well:

1. A book on Irish mythology, in preparation for our trip to Ireland in August.
2. Lewis Carroll’s Through the Looking Glass – annotated.
3. Giovanni Boccaccio’s The Decameron.

I love buying books almost as much as I love reading them. Unfortunately, reading takes a lot more time than buying…

June 16, 2003

Hacker’s Delight

Filed under: Uncategorized — Muli Ben-Yehuda @ 5:56 PM

Just ordered from work Henry S. Warren, Jr.’s book Hacker’s Delight, which uses a delightful combination of arithmetic, logical and bitwise operations to do things like “count leading zeros in a 32-bit word” and “rotate left” efficiently. [I actually had to implement these two last week, which renewed my interest in the book.] Hacker’s Delight website, Amazon page, IBM Systems Journal book review.
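For the curious, here is a minimal sketch in C of the two operations mentioned above. It is not taken from the book and is not the code I wrote at work; the function names clz32 and rotl32 are made up for illustration, and the leading-zero count is a plain binary search over the word rather than anything clever.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros in a 32-bit word by binary search:
 * at each step, if the top half of the remaining window is empty,
 * add its width to the count and shift the lower half up.
 * Returns 32 when x is 0. (Hypothetical helper name.) */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Rotate a 32-bit word left by r bits. The rotation count is
 * reduced modulo 32, and the right-shift amount is masked so that
 * r == 0 never produces an undefined shift by 32.
 * (Hypothetical helper name.) */
static uint32_t rotl32(uint32_t x, unsigned r)
{
    r &= 31;
    return (x << r) | (x >> ((32 - r) & 31));
}
```

For example, clz32(1) is 31 and rotl32(0x12345678, 8) is 0x34567812. Many compilers and CPUs offer a single instruction or builtin for each of these; the portable versions are mostly useful when you can't rely on those.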

Fight Club and Chili

Filed under: Uncategorized — Muli Ben-Yehuda @ 1:26 PM

On Saturday night, Orna, nyh, Ranny and I met at my place to watch Fight Club and eat nyh’s chili. The movie was superb, as usual, and the chili was very good as well. Somehow A Beautiful Mind[1] came up in the conversation, and Ranny mentioned that he hadn’t watched it, so I guess we already know what the next movie is going to be 🙂

[1] Reversing the usual roles, it’s one of the worst books I’ve ever read, and a pretty good movie. You don’t see that happening very often.

You win some, you lose some – spectacularly

Filed under: Uncategorized — Muli Ben-Yehuda @ 1:16 PM

Yesterday started off fine, and then took a turn for the worse. I had a technical report to write on the work we did for the last few days, and the words just wouldn’t flow. A writer’s block of the worst kind. Eventually, today, I finished it and got it all written down, but I labored over every sentence of the 4 measly pages.

In the evening, while depressed over the lack of progress of the paper, I wolfed down six(!) hot dog buns. Since I hadn’t eaten much during the day, it didn’t carry me too badly over the daily food intake limit, but it certainly fucked up my mood, to have transgressed so badly in my diet. Gaaargh!

I went home, rested a bit, read a bit, did NOT eat anything else, and went to sleep.

Nice things that did happen today: I agreed to give a talk on Linux kernel hacking at a workshop at Tel Aviv University in July. I compiled and installed oprofile, and was quite impressed with its profiling prowess. I read another chapter of Turnbull’s The Great Mathematicians – I’m up to Descartes now, methinks.

June 14, 2003

Linux kernel

Filed under: Uncategorized — Muli Ben-Yehuda @ 11:00 AM

2.4.21 is out, and I have a small patch in, to fix a couple of netfilter memory leaks in the error path. So what?

Weekend In Progress

Filed under: Uncategorized — Muli Ben-Yehuda @ 10:51 AM

Yesterday, Friday, was pretty miserable. I had the last “Automata and Formal Languages” lesson in the morning, which was a review before the exam. Unfortunately, the teacher couldn’t solve the most interesting questions from real exams… nothing like that to give you a warm, fuzzy feeling.

Yesterday (or the day before? the river of time flows and I no longer remember) I finished reading Jourdain’s essay in The World of Mathematics, on The Nature of Mathematics. I have to admit it was quite a struggle to get through; Jourdain puts great emphasis on notation and its meaning, which is completely obvious to a modern-day reader. Now I’m reading the second essay, Herbert Westren Turnbull’s The Great Mathematicians.

The plan for today includes lunch, a work out, a BBQ at Orna’s parents in the afternoon and a showing of Fight Club in the evening to a select group of friends. All attending will finally have a chance to taste nyh’s legendary Chilli!

backlog – Thursday

Filed under: Uncategorized — Muli Ben-Yehuda @ 10:33 AM

Made a nice and welcome breakthrough with my current project, and immediately started working on the next logical step (can’t say much more here, as it’s all very much IBM confidential. Sucks, but that’s life). Listened to an interesting distributed systems seminar by Richard Golding, from IBM Almaden, on A Survivable, Scalable Distributed Storage System.

In the evening, had a pretty good work out, and then headed back to work for another conference call, scheduled 2200-2300. The joys of collaborating across the globe.

backlog – Wednesday

Filed under: Uncategorized — Muli Ben-Yehuda @ 10:12 AM

Had a weight watchers meeting in the evening where I found I lost 3.5kgs this week. Woohoo! That’s a total of 6.5 so far.

After the weight watchers meeting, Orna and I went for dinner at our favorite restaurant, Spargo. Orna decided that maybe she would like to go with me to OLS after all.

June 10, 2003

Haifa University Law Faculty Workshop on Open Source

Filed under: Uncategorized — Muli Ben-Yehuda @ 8:44 PM

The Haifa University Law Faculty workshop on Open Source and Peer Production took place today. When I registered, I didn’t have any high hopes. After all, a bunch of lawyers talking about Open Source, how interesting could it be? As it turned out, plenty.

I got there in the morning, and met gby in the foyer. We were joined by Alon Altman, had a few adventures looking for the right room, and eventually found it. Pretty soon, it turned into a semi Haifux meeting – Oron Peled showed up, and orrd (who wrote his account of the event on advogato).

First up was an opening lecture by the workshop’s moderator, Prof. Yochai Benkler. Prof. Benkler’s lecture showed a deep and impressive understanding of hacker culture and open source and free software. He approached these subjects, which I’m intimately familiar with, using economic and social theories. While he didn’t have any amazing insights, his explanations and models were illuminating. I’m sorry but I don’t remember specifics – perhaps the slides he used will show up on his website eventually. The abstract is available here.

Then we had a short break, and then Gilad took the stage and responded. He talked about one of his favorite themes, the similarities between lawyer culture and free software, which I hope he will one day write up for Hamakor. Then someone else, a Dr. from the Interdisciplinary Center in Herzliya took the stage, and then the audience had a chance to respond. Up until now, the audience was inquisitive wrt Open Source, if not downright supportive. But the next person to speak, a Prof. from Tel Aviv University (?) spread the usual unsubstantiated FUD – Linux is not secure, anyone can change it, it has no commercial backing, no “parents”, bla bla bla. orrd answered part of his allegations (quite well, IMHO, although not perfectly), and then a high ranking officer from the IDF (Israeli Defence Forces) computer center took the stage and responded brilliantly. He countered the TAU Prof. on a point by point basis, and did it so well that when he was through, we applauded.

Then Prof. Benkler spoke again, on the dangers of Intellectual Property. His main thesis was that when information or knowledge can be considered property, those who have it in large amounts keep benefiting from restricting its use (and thus have continuing incentive to hoard more of it, a cycle that feeds on itself), and those who don’t have it suffer. He mentioned the DMCA, UCITA and friends as prime examples, and Orr concluded the workshop with a passionate plea to attendees to help prevent such laws in .il.
