You don't understand the scale of this problem. It's such a fad to "fire" someon...

gm · on Oct 25, 2019

I think it should suffice to know that after one repeatedly trains the google spam filter with "report not spam" and "mark as spam", it should learn that well enough, no?

As an example: Some dumb ass Verizon customer gave them my email address as their own. I got a ton of communications from Verizon that I could not turn off (bill past due, thank you for your promise to pay, and all sorts of really annoying messages). I called Verizon and was told that neither they nor I could do anything about it because I did not own the Verizon account, and I did not have the account PIN. Their official solution for this problem: Mark all the Verizon emails as spam and don't call them again (I kid you not).

The result: I marked about 100 Verizon emails as spam and Gmail learned that really well; now I do not see Verizon emails in my inbox. Why TF can't Gmail do this with all the other emails I mark as spam and not spam?

I just got in the habit of firing up the hideous Gmail app on my phone to go through my spam folder every day and pull the non-spam out of there. I usually go through about 250 emails per day. I end up looking at all the spam, but it's far faster to fish out the good messages from spam than if I turned spam detection off altogether.

No matter how complex the problem to solve is, Google is failing at a level where even I could program it better, which is not a high bar at all.

shadowgovt · on Oct 25, 2019

> Google is failing at a level where even I could program it better

You should do so. You can set up filters to hook into keywords, domain names, etc., and route to spam. If the stuff you're getting is that obvious in structure, you should be able to kill it with a handful of such filters.
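To make the idea concrete, here is a minimal sketch of keyword and domain rules in Python. It is not Gmail's filter syntax or API; the `Rule` class, the `route` function, and the example domains are all hypothetical, and the first matching rule wins so an allow rule placed early can protect a sender from later spam rules.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical filter rule: every condition that is set must match."""
    action: str                          # "spam" or "inbox"
    sender_domain: Optional[str] = None  # exact domain of the sender address
    keyword: Optional[str] = None        # case-insensitive substring of the subject

    def matches(self, sender: str, subject: str) -> bool:
        if self.sender_domain is not None:
            domain = sender.rsplit("@", 1)[-1].lower()
            if domain != self.sender_domain.lower():
                return False
        if self.keyword is not None:
            if self.keyword.lower() not in subject.lower():
                return False
        # A rule with no conditions set matches every message.
        return True


def route(sender: str, subject: str, rules: Iterable[Rule], default: str = "inbox") -> str:
    """Return the action of the first matching rule, or the default."""
    for rule in rules:
        if rule.matches(sender, subject):
            return rule.action
    return default


# Example rule set (all addresses and domains are made up).
RULES = [
    Rule(action="inbox", sender_domain="example.org"),    # trusted sender, checked first
    Rule(action="spam", sender_domain="verizon.example"),
    Rule(action="spam", keyword="past due"),
]

if __name__ == "__main__":
    print(route("billing@verizon.example", "Your bill", RULES))
    print(route("friend@example.org", "Invoice past due", RULES))
```

Ordering the allow rules ahead of the block rules is one way to limit false positives, though it does nothing about whatever the provider's own spam detection decides before or after these rules run.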

gm · on Oct 25, 2019

Do you by chance know if user-defined filters take precedence over Google's spam detection? If so, I might go for it.

I say "might" because no matter what I program into it, I'll never be sure there's not an important false positive that got trapped, and so I'll forever be checking my spam anyway. It might be all for naught.

lonelappde · on Oct 26, 2019

Why don't you build your solution and show it to HN and Google or Microsoft? I'm sure someone at a big company would convince an exec to pay you at least $10M for your improved spam filter.

natch · on Oct 26, 2019

Not at Google. The Google one is so bad there’s pretty much no explanation I can think of for it other than nepotism. Which means it isn’t getting displaced any time soon.

natch · on Oct 26, 2019

Scale is Google’s go-to excuse for some things but in this case it doesn’t hold up to scrutiny.

Having in my sent mail and inbox a traceable record of me, authenticated with a Google login, subscribing to a mailing list, then getting a subscription request acknowledgment back, and then sending a further confirmation email, should be enough to tell the spam system that yes dammit I am interested in mail from this list and it is not to be considered spam, assuming it is on topic (a determination that is like ML 101 first quarter stuff and has been for some time, even across languages, like going back to, say, Damashek). That heuristic is so basic that I would be embarrassed for Google if I had to explain it to them. And yet there it is. Just one of many possible examples of how lame their spam filtering system is.
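A rough sketch of that heuristic in Python, under heavy assumptions: the `Message` record, the `kind` labels, and both function names are hypothetical, the mailbox history is assumed to be already classified into subscribe/ack/confirm/post messages in chronological order, and the on-topic check is left out entirely. It only illustrates the three-step handshake, not how any real spam system works.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Message:
    direction: str  # "sent" or "received", from the user's point of view
    peer: str       # mailing list address on the other end
    kind: str       # assumed label: "subscribe", "ack", "confirm", or "post"


# The handshake described above: the user asks to subscribe, the list
# acknowledges, and the user sends a confirmation.
HANDSHAKE = (
    ("sent", "subscribe"),
    ("received", "ack"),
    ("sent", "confirm"),
)


def confirmed_lists(history: Iterable[Message]) -> Set[str]:
    """Return the list addresses for which the full handshake appears in order."""
    progress = {}  # list address -> number of handshake steps seen so far
    for msg in history:
        step = progress.get(msg.peer, 0)
        if step < len(HANDSHAKE) and (msg.direction, msg.kind) == HANDSHAKE[step]:
            progress[msg.peer] = step + 1
    return {peer for peer, step in progress.items() if step == len(HANDSHAKE)}


def is_wanted(msg: Message, confirmed: Set[str]) -> bool:
    """Treat incoming mail from a confirmed list as wanted (topic check omitted)."""
    return msg.direction == "received" and msg.peer in confirmed


if __name__ == "__main__":
    history = [
        Message("sent", "list-a@lists.example", "subscribe"),
        Message("received", "list-a@lists.example", "ack"),
        Message("sent", "list-a@lists.example", "confirm"),
        Message("received", "list-b@lists.example", "ack"),  # never requested
    ]
    confirmed = confirmed_lists(history)
    print(is_wanted(Message("received", "list-a@lists.example", "post"), confirmed))
    print(is_wanted(Message("received", "list-b@lists.example", "post"), confirmed))
```

Requiring the steps in order means an unsolicited "ack" from a list the user never wrote to does not count toward confirmation.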

natch · on Oct 26, 2019

Solving it perfectly is very hard, you are right on that.

But solving it better than Google has? Not hard at all. A high school programming class could do it better.

Unlike other parts of Google, which are doing amazing things like beating human masters at Go, something is seriously fishy in the spam filter tool department.

mav3rick · on Oct 26, 2019

If it was this easy, please write your own?

natch · on Oct 27, 2019

What a facile response. First, I said it would be hard, not easy. More importantly it would be pointless since Google controls which system they choose to use and their choice is clearly not driven by quality, but by something else.