Date: | September 30, 2009 / year-entry #311 |
Tags: | other |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20090930-01/?p=16543 |
Comments: | 7 |
Summary: | It's that time again: Sending some link love to my colleagues. Peter Torr explains why the anti-phishing filter operates on the original URL instead of a hash. Jamie Buckley from the MSN Search team explains why not every possible instant answer is offered. From our Microsoft Research Cambridge comes SenseCam, a wearable camera that takes... |
It's that time again: Sending some link love to my colleagues.
Comments (7)
Comments are closed.
(Posting anonymously because the spam filter apparently hates me)
He lost me there. It seems to me that he’s blowing off the possibility that Microsoft employees with access to the original URLs will use or release that data even though it is against Microsoft policy.
http://en.wikipedia.org/wiki/AOL_search_data_scandal
I don’t find any of the "but we wanna use the original URL" arguments compelling.
Off topic I know, but I met Betsy at Oz TechEd 2005, not that she’d remember me. Classy lady.
@Maurits, did you read the article? An employee acting against Microsoft policy is not "Microsoft itself", and he specifically mentions the possibility of malicious insiders.
Since you didn’t read the article, I don’t find your opinion on the arguments presented in the article compelling.
Maurits, if you had understood the article, you would know that hashing doesn’t protect privacy. Figuring out who you are based on what URLs you visit is not likely to be easier than figuring out what URLs you visit based on the hashes you send.
Furthermore, any phisher could just add a random component to the URL and be able to make unique hashes to evade detection. So not only does sending hashes not prevent abuse of the information, but it makes the whole system useless by being trivial to circumvent.
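The evasion the comment describes can be sketched in a few lines of Python (the URLs and the use of SHA-256 here are purely illustrative): an exact-hash blocklist catches the known URL, but a random per-victim token makes every hash unique.

```python
import hashlib
import secrets

def url_hash(url: str) -> str:
    """Hash a full URL, as an exact-match hash blocklist would."""
    return hashlib.sha256(url.encode()).hexdigest()

# Blocklist of hashes of known phishing URLs (hypothetical example URL).
blocklist = {url_hash("http://phish.example.com/login")}

# The known URL is caught by an exact hash match...
print(url_hash("http://phish.example.com/login") in blocklist)  # True

# ...but appending a random token per victim yields a hash the
# blocklist has never seen, so the check always misses.
evasive = "http://phish.example.com/login?x=" + secrets.token_hex(8)
print(url_hash(evasive) in blocklist)  # False
```

Since the phisher controls the URLs in the mail they send out, generating a fresh token per message costs them nothing.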
MS is making up excuses for retrieving the full URL; this is not an unsolvable problem IF YOU WANT to solve it. At least the protocol and the domain/site/IP-number part of the URL could be hashed.
What’s the point of hashing?
So instead of reporting you went to http://www.google.com, it reports you went to 0x7e1b567a?
Since anti-phishing requires known sites, hashing requires the bad URL. Well, it’s trivial when storing the hashes to store the original URL that goes with each one (say, in case a more efficient lookup method is found). Oh wait, when we log the hash, we can translate it to the real URL too! It’s just another column in the database, so it comes "for free".
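The "just another column" point can be made concrete with a small Python sketch (the URLs and table layout are illustrative, not how any real service is built): whoever compiles the hash blocklist already has the original URLs, so a reported hash translates straight back.

```python
import hashlib

def url_hash(url: str) -> str:
    return hashlib.sha256(url.encode()).hexdigest()

# Server-side table built while compiling the blocklist: the original
# URL sits right next to its hash, so it comes "for free".
hash_to_url = {url_hash(u): u for u in [
    "http://www.google.com/",            # illustrative entries
    "http://phish.example.com/login",
]}

# When a client reports a hash, the server can translate it back.
reported = url_hash("http://www.google.com/")
print(hash_to_url[reported])  # http://www.google.com/
```

So the hash hides the URL only from someone who does not hold the table, which the service operator by definition does.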
Also makes it hard to do similarity matches. If bankofamerica.example.com is bad, maybe bankofamerica.example.net is too?
Of course, the real problem with the argument is that the sort of attacker most people are worried about isn’t necessarily the kind that needs to "confirm that you’ve been to a specific site," as he assumes.
Another possibility would be to have the client download a massive list of bad URLs (or a section of it – maybe all the "g"’s) every time you visit a website, or as part of a periodic update.
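The sharded-download idea in the previous comment can be sketched as follows (a hypothetical scheme, with made-up URLs; the shard key here is simply the first letter of the hostname, matching the "all the 'g's" example): the client fetches only the shard it needs and checks locally, so the exact URL never leaves the machine.

```python
from urllib.parse import urlparse

# Hypothetical server-side list of known-bad URLs.
BAD_URLS = [
    "http://g00gle.example.net/login",
    "http://bankofamerica.example.com/verify",
]

def shard_key(url: str) -> str:
    """Shard by the first letter of the hostname."""
    return urlparse(url).hostname[0]

# Server side: partition the list into shards once.
shards: dict[str, set[str]] = {}
for url in BAD_URLS:
    shards.setdefault(shard_key(url), set()).add(url)

# Client side: download one shard (modeled here as a dict lookup)
# and test membership locally.
def is_bad(url: str) -> bool:
    shard = shards.get(shard_key(url), set())
    return url in shard

print(is_bad("http://g00gle.example.net/login"))  # True
print(is_bad("http://www.google.com/"))           # False
```

The trade-off the comment hints at is real: the list is large and changes constantly, so the client pays in bandwidth and staleness for what it gains in privacy.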
Of course, another possibility is that we shouldn’t all be having this discussion on Raymond Chen’s blog when he didn’t write the article.