2009 Q3 link clearance: Microsoft blogger edition

Date: September 30, 2009 / year-entry #311
Tags: other
Orig Link: https://blogs.msdn.microsoft.com/oldnewthing/20090930-01/?p=16543
Comments: 7
Summary: It's that time again: Sending some link love to my colleagues. Peter Torr explains why the anti-phishing filter operates on the original URL instead of a hash. Jamie Buckley from the MSN Search team explains why not every possible instant answer is offered. From our Microsoft Research Cambridge comes SenseCam, a wearable camera that takes...

It's that time again: Sending some link love to my colleagues.


Comments (7)
  1. Maurits says:

    (Posting anonymously because the spam filter apparently hates me)

    "why the anti-phishing filter operates on the original URL instead of a hash"

    "I will start by assuming that Microsoft itself is not malicious"

    He lost me there.  It seems to me that he’s blowing off the possibility that Microsoft employees with access to the original URLs will use or release that data even though it is against Microsoft policy.

    http://en.wikipedia.org/wiki/AOL_search_data_scandal

    I don’t find any of the "but we wanna use the original URL" arguments compelling.

  2. Dale says:

    Off topic I know, but I met Betsy at Oz TechEd 2005, not that she’d remember me.  Classy lady.

  3. Random832 says:

    @Maurits, did you read the article? An employee acting against Microsoft policy is not "Microsoft itself", and he specifically mentions the possibility of malicious insiders.

    Since you didn’t read the article, I don’t find your opinion on the arguments presented in the article compelling.

  4. Gabe says:

    Maurits, if you had understood the article, you would know that hashing doesn’t protect privacy. Figuring out who you are based on what URLs you visit is not likely to be easier than figuring out what URLs you visit based on the hashes you send.

    Furthermore, any phisher could just add a random component to the URL and be able to make unique hashes to evade detection. So not only does sending hashes not prevent abuse of the information, but it makes the whole system useless by being trivial to circumvent.

  5. 640k says:

    MS is making up excuses for retrieving the full URL; this is not an unsolvable problem IF YOU WANT. At least the protocol and the domain/site/IP number part of the URL could be hashed.

  6. Worf says:

    What’s the point of hashing?

    So instead of reporting you went to http://www.google.com, it reports you went to 0x7e1b567a?

    Since anti-phishing requires known sites, then hashing requires the bad URL. Well, it’s trivial when storing the hashes to store the original URL that goes with it (say, a more efficient lookup method is found). Oh wait, when we log the hash, we can translate it to the real URL too! It’s just another column in the database, so it comes "for free".

    Also makes it hard to do similarity matches. If bankofamerica.example.com is bad, maybe bankofamerica.example.net is too?

  7. Random832 says:

    Of course, the real problem with the argument is that the sort of attacker that most people are worried about isn’t necessarily going to be the kind that needs to "confirm that you’ve been to a specific site" as he assumes.

    Another possibility would be to have the client download a massive list of bad urls (or a section of it – maybe all the "g"’s) every time you visit a website, or as part of a periodic update.

    Of course, another possibility is that we shouldn’t all be having this discussion on Raymond Chen’s blog when he didn’t write the article.

Comments are closed.


*DISCLAIMER: I DO NOT OWN THIS CONTENT. If you are the owner and would like it removed, please contact me. The content herein is an archived reproduction of entries from Raymond Chen's "Old New Thing" Blog (most recent link is here). It may have slight formatting modifications for consistency and to improve readability.
