Spam
Diary
By ucblockhead (Mon May 15, 2006 at 01:15:36 PM EST) spam, beans, eggs, ham, bacon (all tags)
Dealing with spam.


I get all my email through my own domain. I have my prime address, $FIRSTNAME@$LASTNAME.NET and then also get all other email addressed to the domain (with the exception of a few accounts set up for relatives.) In the past, the guy who hosts it for me had SpamAssassin running. Unfortunately, it was taking up so much CPU lately that he had to turn it off.

I attempted to use Thunderbird's junk filter to deal with it, but it didn't work well, for two reasons. First, it doesn't always flag spam correctly. Second, because it runs on the client, it doesn't save bandwidth the way a server-based approach would.

My spam comes in a few categories:

  1. Compromised addresses. These are addresses that were given out to websites, companies, etc. that fell into the hands of spammers. I only ever get spam on these addresses.
  2. Bogus addresses. These are addresses that have never been on my domain.
  3. Spam sent to my actual main email.

Since it seemed that the bulk of my spam could be detected with a set of fixed rules rather than the sort of Bayesian filter used by Thunderbird's junk email system, I dove into procmail.

Now, all email sent to my domain goes through a series of filters in this order:

  1. Any email sent to a compromised address is sent to /dev/null.
  2. Any email sent to one of a few special addresses is let through.
  3. Any email sent from someone I've specifically whitelisted is put in my main account. This includes both specific accounts that have been whitelisted and entire domains. (For example, any email from my work domain gets through.)
  4. Email that doesn't match the whitelist and is not addressed to a known, non-compromised address is moved to a folder called spam. (These are mostly bounces from messages for which a spammer forged a random address on my domain.) Once I'm sure it is working, this will change to /dev/null.
  5. Everything else is dumped in a folder called "unknown-sender" that I troll through periodically.
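
The five rules above can be sketched in Python to make the ordering concrete. This is only an illustration of the decision order, not the actual procmail recipes, which the post does not show. Every address and set name below is a made-up placeholder.

```python
from email.utils import parseaddr

# Hypothetical example data; none of these addresses come from the post.
COMPROMISED = {"leaked@example.net"}
SPECIAL = {"billing@example.net"}
WHITELISTED_SENDERS = {"friend@example.org"}
WHITELISTED_DOMAINS = {"work.example.com"}
KNOWN_GOOD = {"me@example.net", "forum@example.net"}


def classify(to_header: str, from_header: str) -> str:
    """Return the destination for one message, checking the rules in list order."""
    to_addr = parseaddr(to_header)[1].lower()
    from_addr = parseaddr(from_header)[1].lower()
    from_domain = from_addr.rpartition("@")[2]

    if to_addr in COMPROMISED:                       # rule 1
        return "/dev/null"
    if to_addr in SPECIAL:                           # rule 2
        return "inbox"
    if from_addr in WHITELISTED_SENDERS or from_domain in WHITELISTED_DOMAINS:
        return "inbox"                               # rule 3
    if to_addr not in KNOWN_GOOD:                    # rule 4
        return "spam"
    return "unknown-sender"                          # rule 5


print(classify("a1b2c3d4@example.net", "mailer-daemon@elsewhere.test"))
```

The order matters: a message to a compromised address is discarded even when the sender is whitelisted, because rule 1 runs first.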

I started logging the results at 6:54 AM PST on Saturday 5/13/2006. It is now 3:04 PM PST on Monday, 5/15/2006. In that time, I've gotten:

  • 468 emails deleted out of hand.
  • 110 emails dumped in the spam bucket. (All spam)
  • 2 emails marked as "unknown-sender" (both spam.)
  • 7 legitimate emails.

For the moment, it seems to be working far better than even SpamAssassin ever did.
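
The counts above can be turned into rates with a few lines of Python. The numbers are copied from the list. Reading "deleted out of hand" as the mail caught by rule 1 (compromised addresses) is an assumption.

```python
# Counts copied from the list above.
deleted_outright = 468   # assumed to be mail to compromised addresses (rule 1)
spam_folder = 110        # all spam
unknown_sender = 2       # both spam
legitimate = 7

total = deleted_outright + spam_folder + unknown_sender + legitimate
spam = deleted_outright + spam_folder + unknown_sender

share_spam = spam / total
share_by_address_alone = deleted_outright / spam
share_needing_review = unknown_sender / spam

print(f"total messages: {total}")
print(f"spam share of all mail: {share_spam:.1%}")
print(f"spam caught by address alone: {share_by_address_alone:.1%}")
print(f"spam left for manual review: {share_needing_review:.1%}")
```

Of 587 messages, 580 were spam, and roughly four fifths of that spam was discarded purely on the address it was sent to.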

Spam | 52 comments (52 topical, 0 hidden) | Trackback
Could someone explain to me the idea by edward (2.00 / 0) #1 Mon May 15, 2006 at 01:55:46 PM EST
behind catch-all addressing on a domain? Why would you want this? It's like every mis-addressed piece of mail that goes through the postal system gets rerouted to some poor guy who has to sift through it all anyway, even though the addresser wasn't smart enough to proofread their envelope. A piece of mail that isn't addressed to a real address probably shouldn't be delivered to anyone.

explanation by theantix (2.00 / 0) #2 Mon May 15, 2006 at 02:02:51 PM EST
It used to be that spammers would in fact only target real addresses and only do massive spam attacks on major domains.  When that was the case, it was nice to have unlimited addresses for various purposes, so you could use 'site@mydomain.com' when you registered for an email at a place you didn't quite trust.

This is of course no longer the case, and now there is the extra complication of spammers using your domain in a forged "From" address, which sends all the bounces to random addresses on your domain. As you can imagine, that is quite annoying.

So there is little reason now to use catch-all addresses, but before there was -- and that was the idea.

____________________________________
You sir, are worse than Hitler.

[ Parent ]
That explanation made no sense. by edward (2.00 / 0) #3 Mon May 15, 2006 at 02:13:22 PM EST
I want my money back.

[ Parent ]
I wouldn't say that by ucblockhead (2.00 / 0) #4 Mon May 15, 2006 at 02:51:02 PM EST
4/5ths of the spam I get is sent to one-time addresses I gave out to various places on teh intraweb. If I had given those places my real, actual address, filtering out those 200 spams a day would be *MUCH* harder.

Now suppose some site needs an email and I don't trust them...I give them "bogus@$DOMAIN.net" and then add "bogus@$DOMAIN.net" to my whitelist. Then, if I start getting spam on it, I move it to the blacklist and I never, ever see spam from anyone the untrustworthy site sold my email to.

It's a little more work than it used to be...I can't just give out a random address...I also have to add it to my list of given out addresses. But it's better than the alternative.
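
A minimal sketch of that bookkeeping, assuming a simple in-memory record rather than whatever list is actually kept. The class and method names are hypothetical.

```python
class AddressBook:
    """Tracks throwaway addresses: handed out (whitelisted) or burned (blacklisted)."""

    def __init__(self, domain):
        self.domain = domain
        self.active = {}   # address -> site it was given to
        self.burned = {}   # address -> site that leaked it

    def hand_out(self, site):
        """Create an address for a site and whitelist it."""
        local = site.lower().split(".")[0]
        address = f"{local}@{self.domain}"
        self.active[address] = site
        return address

    def burn(self, address):
        """Move an address to the blacklist; return the site it was given to."""
        site = self.active.pop(address)
        self.burned[address] = site
        return site

    def verdict(self, address):
        if address in self.burned:
            return "discard"
        if address in self.active:
            return "accept"
        return "unknown"


book = AddressBook("example.net")
addr = book.hand_out("acm.org")
print(addr, book.verdict(addr))
print("leaked by:", book.burn(addr), "->", book.verdict(addr))
```

Because each address maps back to one site, burning it also names the site that let it leak.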
---
[ucblockhead is] useless and subhuman

[ Parent ]
no it's not by martingale (2.00 / 0) #6 Mon May 15, 2006 at 03:04:35 PM EST
It's not better than the alternative, it's just that you prefer allocating your effort at one place rather than another, which is your prerogative.

Trainable filters are just as accurate as your homegrown method, and require just as much work from you as your homegrown method. The difference is that the learning curve is higher if you use somebody else's method than something you worked out yourself. It always is.

That said, Thunderbird's Bayesian filter isn't one of the best. The different filters all look at slightly different combinations of features in an email, so some are more accurate than others on any one dataset, and when your mail is sufficiently different from what a filter was developed against, you won't get optimal results. But before blaming the tools, I find that it's usually operator error, just like in programming. Do you truly understand how a Bayesian filter (is supposed to) work(s)?
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
I would argue in ucblockheads favour by Dr Thrustgood (2.00 / 0) #10 Mon May 15, 2006 at 03:29:32 PM EST
Trainable is great 'n' all, and having spamassassin monitor my spam and ham folders worked wonders.

However - this method takes, oh, all of five minutes out of my life if I'm having a particularly bad week and no spam reaches my "core" addresses.

Final sarky bonus: I never worry about ham being delivered to /dev/null



[ Parent ]
I'll have to disagree on "no" by martingale (2.00 / 0) #14 Mon May 15, 2006 at 03:43:36 PM EST
Every system, be it your homegrown one or one developed by somebody else, makes mistakes. It's an inherent property of classification problems.

The only question that is worth asking is what needs to happen for your system (or any system) to produce the mistakes. Some systems produce mistakes in some cases, other systems produce mistakes in different circumstances. When you understand both, you can combine the systems for hedging, etc.

Suppose you take a simple whitelisting system, e.g. I accept everything from my company domain. Then you can still get spam and viruses sent to you from people you know whose machines have been zombified, and your rules are useless to stop this. Nearly all spam sent today is from zombie armies.
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
Every system does have mistakes by Dr Thrustgood (2.00 / 0) #17 Mon May 15, 2006 at 03:49:16 PM EST
In this case, the mistake is, say, someone signing up to a gambling website and me not being telepathic and therefore not blocking the address they're using before the email reaches my server.

I don't use whitelists, I don't use blacklists; I simply use a virus scanner and addresses that are easy to switch off.

But anyway, this seems rapidly to be developing into a holy war. Agree to disagree?



[ Parent ]
ah ok by martingale (2.00 / 0) #21 Mon May 15, 2006 at 03:58:11 PM EST
Don't want to start a holy war :) I agree that your system is work intensive.

If you want to simplify your life, you may wish at some point to try out a content filter, or else try out a DNSBL filter. The content filter doesn't depend on only the From address (but uses it too) so you don't have to keep adding new info manually, because the language of the spam is also used. So once you mark a couple of casino spams, afterwards the casino spams will be recognized even if the From address changes.

The DNSBLs are realtime databases updated by thousands of people who get spam. Together, they essentially do exactly what you do by hand, namely they submit spams and the From addresses and other salient IP information are marked. Your filter looks it up on the fly, etc. The point is you don't keep adding to a personal list.
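
For illustration, here is roughly how a DNSBL lookup name is built for an IPv4 address. The zone `dnsbl.example` is a placeholder, not a real list. DNSBLs are normally keyed by the sending IP address: a listed IP resolves to an address in 127.0.0.0/8, and an unlisted one returns no such name. The sketch only builds the query name and does no network lookup.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up when checking an IPv4 address against a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)            # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 192.0.2.99 is a documentation address; dnsbl.example is a hypothetical zone.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example"))
```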
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
Just to muddy the waters somewhat by Dr Thrustgood (4.00 / 1) #22 Mon May 15, 2006 at 04:00:05 PM EST
I also integrate some DNSBL using razor/razor2 - best of both worlds!

;-)



[ Parent ]
actually... by ucblockhead (2.00 / 0) #29 Mon May 15, 2006 at 04:26:47 PM EST
Since virtually all spam uses fake From: addresses, and since the company domain I have whitelisted is not public facing, this is extremely unlikely, even if a machine at work becomes zombified...with faked From: addresses, it's not what machine becomes zombified that is important, but rather what domain spammers have decided to use.

As such, I consider the risk low and worth the ease of whitelisting the domain. If, by some unlikely chance, it ever does happen, I'll whitelist those specific people I need to hear from.
---
[ucblockhead is] useless and subhuman

[ Parent ]
Well, that and... by ajf (2.00 / 0) #24 Mon May 15, 2006 at 04:03:08 PM EST
Trainable filters are just as accurate as your homegrown method, and require just as much work from you as your homegrown method. The difference is that the learning curve is higher if you use somebody else's method than something you worked out yourself.

The other difference is that it costs an assload more CPU time, which is what started this discussion in the first place.

"I am not buying this jam, it's full of conservatives"

[ Parent ]
no by martingale (2.00 / 0) #27 Mon May 15, 2006 at 04:09:06 PM EST
That depends on the filter. Content filtering is very cheap if the filter is implemented with performance in mind. What _is_ true is that SpamAssassin's CPU usage is a semi-trailer carrying assembled Boeing 747 parts. SA does a lot, and it does it in a terribly inefficient way. At least try to use the spamc/spamd mode if you use SA.
--
$E(X_t|F_s) = X_s,\quad t > s$
[ Parent ]
Yes, I understand what a bayesian filter is by ucblockhead (2.00 / 0) #28 Mon May 15, 2006 at 04:22:46 PM EST
You know, people tell me all the time that "Trainable filters are just as accurate as your homegrown method"...and yet...my homegrown method is currently doing better than either Spam Assassin or Thunderbird ever did. It's like a knee-jerk thing...a belief that a bayesian filter can always do better than any user could ever manage.

It is certainly true that my situation is not the normal case, and that I am also better equipped to deal with procmail scripts than the average user. I'm not saying that bayesian filters are wrong for everybody, or even that they are wrong for the vast majority of people. I am only saying that in my specific circumstances, I can do better with procmail scripts.

See, the thing is, I *know* what features can be used to spot the majority of my spam...80% of my spam can be spotted simply by ignoring anything to certain addresses. If I know this, why should I try to spend time trying to train some filter to do what I already know how to do?
---
[ucblockhead is] useless and subhuman

[ Parent ]
sure by martingale (2.00 / 0) #33 Mon May 15, 2006 at 05:02:34 PM EST
Sure, I accept your point. And no, I'm not one of those who claim Bayesian filters are more accurate than humans. A good Bayesian filter is currently about 99% accurate when trained, based on real testing (NIST TREC 2005 / spam track).

See, the thing is, I *know* what features can be used to spot the majority of my spam...80% of my spam can be spotted simply by ignoring anything to certain addresses. If I know this, why should I try to spend time trying to train some filter to do what I already know how to do?
You don't. Your system is already in place, and has cost you some effort. There is new effort in training a spam filter, and there is effort in maintaining your existing system. You should change if the new effort is lower than the one for maintaining your existing system.

A Bayesian filter needs clean data (ie existing mail archives) to extract the rules, and it will easily recognize anything obvious like which combinations of addresses are good and which are bad. You can fully train some filters in a minute, or you can train some filters whenever they make a mistake (which happens less and less over time). What you gain is no need to hide your mail address, and no need to keep your brain or an external list updated with all the different little partial addresses you've worked out.
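
To make "extract the rules" concrete, here is a toy naive Bayes classifier with add-one smoothing. It is a bare-bones sketch of the general technique, not how bogofilter, dbacl, or Thunderbird are implemented. The training messages are invented, and it assumes at least one message of each class has been trained.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.tokens[label].update(text.lower().split())
        self.messages[label] += 1

    def _log_score(self, words, label):
        counts = self.tokens[label]
        vocab = len(set(self.tokens["spam"]) | set(self.tokens["ham"]))
        total = sum(counts.values())
        # Prior from message counts; needs both classes trained (log(0) otherwise).
        score = math.log(self.messages[label] / sum(self.messages.values()))
        for word in words:
            score += math.log((counts[word] + 1) / (total + vocab))
        return score

    def classify(self, text):
        words = text.lower().split()
        if self._log_score(words, "spam") > self._log_score(words, "ham"):
            return "spam"
        return "ham"


f = TinyBayes()
f.train("win money at our casino now", "spam")
f.train("casino bonus win big", "spam")
f.train("meeting notes for the project", "ham")
f.train("lunch on friday with the team", "ham")
print(f.classify("casino win"), f.classify("project meeting friday"))
```

Note that the score for each class depends on the counts of both, which is why training only on spam gives the filter nothing to compare against.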

If you're a procmail user, it's trivial to add a Bayesian filter like bogofilter or dbacl to your rulesets. They don't cost anywhere near as much CPU as SpamAssassin does, and they're a *lot* faster than SA. Moreover, you can simply let them tag your mail at first with a procmail rule, and you can see how well they do over time compared with your other rulesets. Then you can turn off your rulesets when and if you're confident. At that point, you will have turned off a lot of maintenance work for yourself.

But in the end, it's up to you to know how much work you're willing to put in to keep your inbox spam free.
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
Theory and practice by ucblockhead (2.00 / 0) #40 Mon May 15, 2006 at 05:51:24 PM EST
See...the thing is, these simple rules that cut out 99% of my spam haven't really changed over time. Bayesian filters, on the other hand, need to be continually trained, which means they will always let something through. In my experience, it isn't "less and less over time" at all. Thunderbird's has gotten worse and worse over the two years I've "trained" it. (Yes, I know...I should use this one or that one or the other one or whatever one is better...bah. Wasting time trying out system after system isn't saving me effort!)

Honestly, the reason I dumped Bayesian filters is because doing it by hand was easier, and gave better results. I'll be damned if I'm going to spend time with yet another program when what I'm doing works fine.

And please don't tell me it's trivial to add procmail rules...blah, blah, blah. I know. I've done it before with Spam Assassin. Please understand this: I am not someone who woke up yesterday and decided to deal with spam. I've been trying to cut this crap down for years now, off and on.

I'll make you a deal...if my level of spam ever reaches what it did with Spam Assassin, I'll look into other filters. Otherwise, I'm not going to fix what isn't broke because someone tells me that in theory it works better despite my experience being that it doesn't.
---
[ucblockhead is] useless and subhuman

[ Parent ]
that's why by martingale (2.00 / 0) #42 Mon May 15, 2006 at 06:12:36 PM EST
I thought you might not understand what Bayesian filtering does. See, the thing is, there's no difference between some of what you are doing and what a Bayesian filter does at the lowest level. So when you say that TB's filter is getting worse over time it's a signal that somewhere you're misunderstanding the operation of it.

But I've written a lot in this diary already, and my intention really isn't to proselytize. Your position is perfectly reasonable and I wish you luck with it.
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
The thing is by ucblockhead (2.00 / 0) #44 Mon May 15, 2006 at 06:21:17 PM EST
What constitutes spam changes over time. Spammers are constantly working to get through Bayesian filters. Because of this, there's no guarantee at all that they will get better and better. You can't guarantee you'll find the signal if the signal is being deliberately changed in hopes of hiding it, especially when they have access to the same filters.

There are also a couple differences between what I am doing and what a Bayesian filter does. For one, I deliberately arrange it so that spam comes in with certain markers that I can then check for. (By giving out throwaway addresses.)

But we can just agree to disagree.
---
[ucblockhead is] useless and subhuman

[ Parent ]
there is by martingale (2.00 / 0) #46 Mon May 15, 2006 at 07:12:09 PM EST
Because not everything about messages changes randomly over time. Don't take this as a new argument since I'm quite happy to leave it with a disagreement on this. I do believe you know your setup and make the right choices for yourself.

The spam training data is part legitimate messages, part spam. The former changes much less over time than the latter, and should account for roughly half your training data. Even the spam doesn't change so much as uses templates filled with randomness. A typical spam campaign runs for three to six months on a single template. Figuring all these bits out is possible, but it simply pops out from frequency data, so long as the frequency data is available.

What people don't often understand with Bayesian filters is that training must be consistent and double sided. It doesn't mean anything to (for example) keep training examples of spam that got through ever more desperately, if you don't also train the ham that got through, the spam that didn't get through, and the ham that didn't get through. It's positively rumsfeldian :)

Frequency data only makes sense when you treat all possibilities equally. It's like doing coverage analysis of a program. Has each line been executed? When training data for Bayes, has every contingency been covered by an example?

What is true is that, because/if/when half your training messages are ham which changes very little, and half your messages are spam which doesn't change that much at the meta level either, then learning truly does happen, and the truly random parts of spam and the random parts of ham account for typically 20%-40% of message content, which is small enough to be swamped by the stable learned features. So when you've got several thousand training messages under your belt, you can go some time before there is enough new material around to seriously impact the Bayesian database, ie you train less and less frequently.
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
Not randomly by ucblockhead (2.00 / 0) #48 Mon May 15, 2006 at 08:11:27 PM EST
Deliberately. By people with access to the filters. And in my experience, the spammers are doing enough of a job to get at least some of their signal past. You're relying on people not being smart enough to outsmart the algorithm.
---
[ucblockhead is] useless and subhuman
[ Parent ]
yes by martingale (2.00 / 0) #50 Mon May 15, 2006 at 09:05:17 PM EST
All algorithms have weaknesses, and the Bayesian algorithm is no exception. It relies on you keeping your training data secret from the bad guys, just like you keep your logins and passwords secret. If that's compromised, anything goes.
--
$E(X_t|F_s) = X_s,\quad t > s$
[ Parent ]
my situation is not the normal case by wiredog (4.00 / 1) #34 Mon May 15, 2006 at 05:08:47 PM EST
I dunno. A combination of whitelisting and blacklisting takes care of 95%+ of the spam I get.

Earth First!
(We can strip mine the rest later.)

[ Parent ]
It goes like this by ucblockhead (2.00 / 0) #5 Mon May 15, 2006 at 02:54:50 PM EST
A site, say, "The Washington Monthly", wants an email address. I don't entirely trust them. I give them "washmon@$DOMAIN.net". Then, when I start getting spam to "washmon@$DOMAIN.net", I change my filter to dump anything sent to "washmon@$DOMAIN.net" to /dev/null, mostly undoing any damage.

(And yes, my email really did somehow go from that site to fuckhead spammers.)
---
[ucblockhead is] useless and subhuman

[ Parent ]
Huh? by edward (2.00 / 0) #7 Mon May 15, 2006 at 03:20:59 PM EST
Why does each questionable site need its own address? It seems like you are committing a category error. An address can either have mostly legitimate mail or mostly illegitimate mail coming in to it. Why do you insist on fragmenting all of your illegitimate mail onto multiple accounts? It makes no sense.

[ Parent ]
There are no multiple accounts by Dr Thrustgood (2.00 / 0) #9 Mon May 15, 2006 at 03:26:16 PM EST
Merely multiple addresses. You dig?



[ Parent ]
Oh right, sorry. by edward (2.00 / 0) #12 Mon May 15, 2006 at 03:38:29 PM EST
The tenants on the other side of the wall are fucking and the chick is really loud. It's distracting. Actually now it sounds like they're laughing. What are they doing over there? I'll be happy when they're gone.

[ Parent ]
It makes perfect sense by ajf (2.00 / 0) #20 Mon May 15, 2006 at 03:55:16 PM EST
An address can either have mostly legitimate mail or mostly illegitimate mail coming in to it.

Well, that won't be true if you use the same email address to sign up with a forum site that sends you email that you want, and another web site that sells or otherwise exposes your email address to spammers. You'll start receiving a harder-to-filter mix of legitimate and dodgy mail at that same address.

But if you gave the two sites different addresses, you'd be able to (1) send all mail addressed to the bad email address to /dev/null; and (2) know which site fucked you over.

"I am not buying this jam, it's full of conservatives"

[ Parent ]
Yes by ucblockhead (2.00 / 0) #30 Mon May 15, 2006 at 04:28:44 PM EST
I tend to give sites an address of $SITEDOMAIN@$MYDOMAIN.net. Makes it real easy to see who the fuckers are.
---
[ucblockhead is] useless and subhuman
[ Parent ]
My domains set-up by Dr Thrustgood (2.00 / 0) #8 Mon May 15, 2006 at 03:23:42 PM EST
Much like yours. Whenever I give out an email address to a website (hell, even my bank accounts) I use the form $COMPANYNAME@$DOMAIN.$TLD

I run a patched version of qmail on my mailserver that uses a file called badrcptto. Any email sent to an address in this file is rejected at the SMTP level with an "address no longer exists" error. One day, I hope this will actually cause a spammer to take it off his list.
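
The idea of rejecting at the SMTP level can be sketched as a check made when the sending server issues RCPT TO. This is not the qmail patch itself. The one-address-per-line file format and the function names are assumptions for illustration; 550 and 250 are the standard SMTP reply codes for a rejected and an accepted recipient.

```python
def load_bad_recipients(lines):
    """Parse a list of retired addresses, assumed to be one per line."""
    return {line.strip().lower() for line in lines if line.strip()}


def rcpt_reply(recipient, bad_recipients):
    """Return the SMTP reply line for one RCPT TO address."""
    if recipient.strip().lower() in bad_recipients:
        return "550 5.1.1 mailbox unavailable"
    return "250 2.1.5 ok"


# Hypothetical file contents.
bad = load_bad_recipients(["Casino@example.net\n", "\n", "old@example.net\n"])
print(rcpt_reply("casino@example.net", bad))
print(rcpt_reply("me@example.net", bad))
```

Rejecting here means the message body is never transferred, which is where the bandwidth saving over a client-side filter comes from.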

I have only two annoyances at the moment, and both are due to people being cunts. For some reason, people quite like using one of my domains as a made-up email address when registering for gambling websites. Easy enough to block, but still irritating.

However, much more twatty are some of those that set up spam-bait websites. So, they auto-generate a list of supposedly non-existent domains and addresses for said domains, except several of the fuckers chose one of my domains, which is very much alive! Hence receiving a load of email to $RANDOM_EIGHT_HEX_DIGITS@$DOMAIN.$TLD. I was less than impressed. Fortunately, their ISPs have been more so.



spf by martingale (2.00 / 0) #11 Mon May 15, 2006 at 03:35:20 PM EST
SPF is designed to help you with your email address being hijacked by others. In a nutshell, you set up a DNS record which lists the IP addresses of the computers who are authorized to send email with your email address. Then recipients can check your DNS records to see if what they got is really from you. Of course, its biggest problem is that not everybody checks SPF records, and most people don't create them. Ironically, spammers are the most consistent users of SPF...
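
As a rough illustration of the recipient-side check, the sketch below evaluates a published SPF record against a connecting IP. It is deliberately partial: it handles only the `ip4`, `ip6` and `all` terms and ignores `a`, `mx`, `include` and the rest, and it takes the record as a string instead of fetching the DNS TXT record. The record and addresses are invented.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_result(record: str, sender_ip: str) -> str:
    """Evaluate only the ip4/ip6 and 'all' terms of an SPF record."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"                      # default qualifier is pass
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
    return "neutral"                         # nothing matched


record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.25 -all"
print(spf_result(record, "192.0.2.17"), spf_result(record, "203.0.113.9"))
```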
--
$E(X_t|F_s) = X_s,\quad t > s$
[ Parent ]
Hijacking isn't the problem per se by Dr Thrustgood (2.00 / 0) #13 Mon May 15, 2006 at 03:43:05 PM EST
It's people entering crap into web forms and/or listing so-called fake addresses on websites that spammers can then send mail to that is the problem.



[ Parent ]
Sorry, you'll have to clarify. by edward (2.00 / 0) #15 Mon May 15, 2006 at 03:45:52 PM EST
I seem to be under the impression that the only way to get most spam is by making your address public in the wrong way.

[ Parent ]
Heh by Dr Thrustgood (2.00 / 0) #16 Mon May 15, 2006 at 03:49:09 PM EST
I remember the happy, happy days when I believed that too.



[ Parent ]
Sorry, let me make it even more clear. by edward (2.00 / 0) #25 Mon May 15, 2006 at 04:05:30 PM EST
Unless you moronically configure a catch-all on your domain, the only way to get spam is by making your address public in the wrong way OR by being targeted by that guy you knew in college who was secretly jealous that your family was more well-off than his, or by a vindictive ex, or by a co-worker who you don't get along with, etc etc etc.

Sorry, but spam is a social problem first and foremost. A random person isn't going to sign you up for a newsletter simply because he or she happened to put your e-mail address in a form. Unless your address is john@doe.com or something ridiculous like that of course.

And as others have said: spam is remarkably easy to identify using bayesian methods. I get almost NO spam in my actual inbox (maybe about one-two a week-- and they get correctly flagged by my mail client) across multiple accounts AND multiple addresses because I have very good filtering mechanisms in place. My filters regularly block over 100 spam messages a day and those messages are all addressed to aliases that were made public in the wrong way.

So, again, explain to me how it is that you are suggesting that I could create a random alias on one of my domains and receive spam to it, even if I never tell anyone about the alias. Because that seems to be what you're claiming. Wait-- unless I set my domain to deliver all messages sent to any alias, breaking SMTP RFCs in the process.

[ Parent ]
-1 Must try harder by Dr Thrustgood (4.00 / 1) #26 Mon May 15, 2006 at 04:08:15 PM EST
Not moronicalisTic



[ Parent ]
Unfortunately by ucblockhead (4.00 / 1) #32 Mon May 15, 2006 at 04:36:15 PM EST
In this world, you often have to give out your email address to websites, and sometimes those companies are shady. I mean...Hulver has your email...what if he decided to sell the husi mailing list to a spammer?

Sure, we all trust him...but I find it much, much easier to give any site that wants an email something random than worry about whether a particular site is trustworthy. I don't have to rely on trustworthiness at all.

A catch-all, judiciously used, is a tool for preventing spammers from ever getting your real email.
---
[ucblockhead is] useless and subhuman

[ Parent ]
what if he decided to sell the husi mailing list by wiredog (4.00 / 2) #35 Mon May 15, 2006 at 05:13:48 PM EST
We'd hunt him down and force him into a sack of hungry kittens, that's what'd happen. And he knows it!

Earth First!
(We can strip mine the rest later.)

[ Parent ]
hulver -> pie by martingale (4.00 / 1) #38 Mon May 15, 2006 at 05:34:10 PM EST
It's like Chicken run but with a different ending!

(In truth, I doubt he'd get enough cold hard cash for the husi address list to actually buy a meat pie)
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
Indeed by Cloaked User (2.00 / 0) #52 Tue May 16, 2006 at 06:05:02 AM EST
I have a catch-all set up on my domain. I get a small amount of spam to a long-since retired "site@" account, and to various accounts such as "info@", "sales@", etc. I get a metric fuckload of spam and bounces to "$random@" since some low-life motherfucking shit-sucking spammer scum¹ decided to use my domain in their forged From: addresses. At its height, I was getting a couple of thousand mails per day. I've stopped counting, but I think it's dropped to around a thousand a day, on average.

1 not that I'm bitter

However, I get exactly zero spam to my real address. One day, I'm going to implement a filtering system that automatically whitelists addresses as I give them out², and /dev/nulls anything coming in on any other address...

2 as far as possible, that is


--
This is not a psychotic episode. It is a cleansing moment of clarity.

[ Parent ]
why is that a problem? by martingale (2.00 / 0) #18 Mon May 15, 2006 at 03:50:01 PM EST
It's going to look like normal spam to you, whether it's a rolex ad or a casino ad. The fact that people put your address in a form by chance on a dodgy site and spammers get it this way, rather than buying your address on some crawler list or whatever has no bearing on the spam content. Do you have real contact yourself with some dodgy sites and expect to get the occasional newsletter or why else might you consider this fact significant?
--
$E(X_t|F_s) = X_s,\quad t > s$
[ Parent ]
To clarify by Dr Thrustgood (2.00 / 0) #19 Mon May 15, 2006 at 03:51:50 PM EST
They don't put my entire email address, merely a domain with a made-up address in front of it.

Why does my drink-addled brain feel like it's being trollorised?



[ Parent ]
apologies by martingale (2.00 / 0) #23 Mon May 15, 2006 at 04:00:49 PM EST
You've explained your system in the other thread, and I've replied there. I'm not trolling you, but I have pointed you to some other options. I'm not trying to convince you to change your system and use those, just giving you some info in case you've never heard of them.
--
$E(X_t|F_s) = X_s,\quad t > s$
[ Parent ]
better by ucblockhead (2.00 / 0) #31 Mon May 15, 2006 at 04:30:33 PM EST
To whitelist addresses, not IPs. The chance of a spammer forging the exact address of someone you know is low and the chance that someone you know might email you from a new IP is high.
---
[ucblockhead is] useless and subhuman
[ Parent ]
*cough* *cough* by yankeehack (2.00 / 0) #36 Mon May 15, 2006 at 05:17:18 PM EST
One other thing which trips up spammers is rate limiting.

I see this at zee place of zee employ all of the time.  The specially crafted, unique email that you wanted from us can pass the content filters, but we get smacked for rate limiting if we have a good number of emails to send to a particular domain.

Also, the concept of greylisting intrigues me too, since it punishes those who just want to funnel as many emails as they can: when the receiver says "oh wait, I can't take this", the spammer will just move on to another target, while a properly config'ed mailserver will wait to resend at a later time (aka act like it's a soft bounce).
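
A minimal sketch of the greylisting idea, assuming the common design where the server remembers each (client IP, sender, recipient) triplet and temporarily rejects it until a minimum delay has passed. The class name, the 300-second delay and the reply texts are illustrative choices, not taken from any particular mail server.

```python
class Greylist:
    """Temporarily reject unknown (ip, sender, recipient) triplets; accept retries after a delay."""

    def __init__(self, min_delay_seconds=300.0):
        self.min_delay = min_delay_seconds
        self.first_seen = {}   # triplet -> time of first delivery attempt

    def check(self, client_ip, sender, recipient, now):
        key = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 ok"
        return "451 4.7.1 try again later"   # temporary failure (soft bounce)


g = Greylist(300)
triplet = ("192.0.2.10", "a@example.org", "me@example.net")
print(g.check(*triplet, now=0))      # first attempt: deferred
print(g.check(*triplet, now=301))    # retry after the delay: accepted
```

A sender that never retries simply never gets through, while a real mail server queues the message and succeeds on a later attempt.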
"...she dares to indulge in the secret sport. You can't be a MILF with the F, at least in part because the M is predicated upon it."-CBB

[ Parent ]
spf isn't about whitelisting per se by martingale (2.00 / 0) #37 Mon May 15, 2006 at 05:29:58 PM EST
SPF and related systems are really designed to authenticate the path of the email. The bottleneck for authentication is the sending SMTP client, and there should be only a few. The idea is that anyone in an org who is authorized to send mail from that org must use one of a few authorized SMTP servers, who will do the actual handshake with foreign SMTP servers. So you'll only have a few IPs to keep up to date in the DNS record.

While receivers can use SPF as a whitelisting tool, it's really designed so that mail messages which spoof an organization in the From field are trivial to recognize. Think mail from a bank or Paypal. Your MUA can just check the mail headers to see if it's really from Paypal or if it's from some spammer in Nigeria.

It's also useful for bounces. Think what happens when a spammer pretends to be you, and their mail bounces. Then you get thousands of notifications in your inbox. If you use SPF, and if the SMTP server at the other end knows about SPF, it will figure out that you shouldn't be getting those bounces in the first place.
--
$E(X_t|F_s) = X_s,\quad t > s$

[ Parent ]
yeah but by yankeehack (4.00 / 1) #39 Mon May 15, 2006 at 05:46:25 PM EST
SPF solves a problem (id'ing fraudulent email) that can be solved in a number of ways.

I think that's why SPF is considered a good idea, but it doesn't really accomplish enough. As an earlier poster noted, the most frequent users of SPF are spammers.
"...she dares to indulge in the secret sport. You can't be a MILF with the F, at least in part because the M is predicated upon it."-CBB

[ Parent ]
yes by martingale (2.00 / 0) #43 Mon May 15, 2006 at 06:16:17 PM EST
It also suffers from several technical flaws, as well as the usual "most people have to use it to be useful". I mentioned it when Dr T was complaining about his domains being used for made-up email, but his problem turns out to be different.
--
$E(X_t|F_s) = X_s,\quad t > s$
[ Parent ]
Makes sense by ucblockhead (4.00 / 1) #41 Mon May 15, 2006 at 06:00:47 PM EST
But on the other hand, just giving acm.org acm@$DOMAIN.net solves the problem regardless. In order for it to fail, a spammer would have to either hack acm.org's mail server or be interested enough in spamming me that they went to the effort of guessing the address. Which seems highly unlikely, and if it failed, it'd take less than a minute to give acm.org a new throwaway.
---
[ucblockhead is] useless and subhuman
[ Parent ]
Why does a spammer have to guess an email by edward (2.00 / 0) #45 Mon May 15, 2006 at 06:52:57 PM EST
you've used? A spammer can spam you simply by sending an email to your domain name. It doesn't matter what the alias is. You use a catch-all. In fact, since you don't use a bayesian filter, you will have to manually delete the message, won't you?

[ Parent ]
No. by ucblockhead (2.00 / 0) #47 Mon May 15, 2006 at 08:01:48 PM EST
They can send it...and it'll get dumped to /dev/null because they aren't on my whitelist.

According to my logs, around 500 emails have been sent to bogus addresses on my domain...not one has hit my inbox.
---
[ucblockhead is] useless and subhuman

[ Parent ]
Wait... by edward (2.00 / 0) #49 Mon May 15, 2006 at 08:33:13 PM EST
so every time you sign up for a new website/publicise a new alias, you have to go and whitelist it manually?

[ Parent ]
yes by ucblockhead (2.00 / 0) #51 Tue May 16, 2006 at 06:01:54 AM EST
It's worth it to keep the main address safe.
---
[ucblockhead is] useless and subhuman
[ Parent ]