Date: | March 31, 2006 / year-entry #116 |
Tags: | other |
Orig Link: | https://blogs.msdn.microsoft.com/oldnewthing/20060331-15/?p=31703 |
Comments: | 133 |
Summary: | Okay, there were an awful lot of comments yesterday and it will take me a while to work through them all. But I'll start with some more background on the problem and clarifying some issues that people had misinterpreted. As a few people surmised, the network file server software in question is Samba, a version... |
Okay, there were an awful lot of comments yesterday and it will take me a while to work through them all. But I'll start with some more background on the problem and clarifying some issues that people had misinterpreted.

As a few people surmised, the network file server software in question is Samba, a version of which comes with most Linux distributions. (I'll have to do a better job next time of disguising the identities of the parties involved.) Samba is also very popular as the network file server for embedded devices such as network-attached storage. The bug in question is fixed in the latest version of Samba, but none of the major distributions have picked up the fix yet. Not that that helps the network-attached storage scenario any.

It appears that a lot of people thought the buggy driver was running on the Windows Vista machine, since they started talking about driver certification and blocking its installation. The problem is not on the Windows Vista machine; the problem is on the file server, which is running Linux. WHQL does not certify Linux drivers, it can't stop you from installing a driver on some other Linux machine, and it certainly can't download an updated driver and somehow upgrade your Linux machine for you. Remember, the bug is on the server, which is another computer running some other operating system. Asking Windows to update the driver on the remote server makes about as much sense as asking Internet Explorer to upgrade the version of Apache running on slashdot.org. You're the client; you have no power over the server.

Some people lost sight of the network-attached storage scenario, probably because they weren't familiar with the term. A network-attached storage device is a self-contained device consisting of a large hard drive, a tiny computer, and a place to plug in a network cable. The computer has an operating system burned into its ROMs (often a cut-down version of Linux with Samba), and when you turn it on, the device boots the computer, loads the operating system, and acts as a file server on your network. Since everything is burned into ROM, claiming that the driver will get upgraded and the problem will eventually be long forgotten is wishful thinking. It's not like you can download a new Samba driver and install it into your network-attached storage device. You'll have to wait for the manufacturer to release a new ROM.

As for detecting a buggy driver, the CIFS protocol doesn't really give the client much information about what's running on the server, aside from a "family" field that identifies the general category of the server (OS/2, Samba, Windows NT, etc.) All that a client can tell, therefore, is "Well, the server is running some version of Samba." It can't tell whether it's a buggy version or a fixed version. The only way to tell that you are talking to a buggy server is to wait for the bug to happen. (Which means that people who said, "Windows Vista should just default to the slow version," are saying that they want Windows Vista to run slow against Samba servers and fast against Windows NT servers. This plays right into the hands of the conspiracy theorists.)

My final remark for today is explaining how a web site can "bloat the cache" of known good/bad servers and create a denial of service if the cache did not have a size cap: First, set up a DNS server that directs all requests for *.hackersite.com to your Linux machine. On this Linux machine, install one of the buggy versions of Samba.
Now serve up this web page:

<IFRAME SRC="\\a1.hackersite.com\b" HEIGHT=1 WIDTH=1></IFRAME>
<IFRAME SRC="\\a2.hackersite.com\b" HEIGHT=1 WIDTH=1></IFRAME>
<IFRAME SRC="\\a3.hackersite.com\b" HEIGHT=1 WIDTH=1></IFRAME>
<IFRAME SRC="\\a4.hackersite.com\b" HEIGHT=1 WIDTH=1></IFRAME>
...
<IFRAME SRC="\\a10000.hackersite.com" HEIGHT=1 WIDTH=1></IFRAME>
Each of those IFRAMEs refers to a different host name, so each one adds another entry to the cache; a single visit to this page bloats the cache with ten thousand entries. Even worse, if you proposed preserving this cache across reboots, then you're going to have to come up with a place to save this information. Whether you decide that it goes in a file or in the registry, the point is that an attacker can use this "bloat attack" and cause the poor victim's disk space/registry usage to grow without bound until they run out of quota. And once they hit quota, be it disk quota or registry quota, not only do bad things start happening, but they don't even know what file or registry key they have to delete to get back under quota.

Next time, I'll start addressing some of the proposals that people came up with, pointing out disadvantages that they may have missed in their analysis.
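To make the size-cap point concrete, here is a minimal sketch of what a bounded cache might look like (the class and names are invented for illustration; this is not the actual redirector code): once the cap is reached, the least recently used entry is evicted, so the bloat attack above can churn the cache but never grow it.

#include <list>
#include <string>
#include <unordered_map>

// Hypothetical bounded cache of servers known to need the slow enumeration.
// Capped at maxEntries; the least recently used entry is evicted when full,
// so a hostile web page cannot grow it without bound.
class SlowServerCache {
public:
    explicit SlowServerCache(size_t maxEntries) : maxEntries_(maxEntries) {}

    void MarkSlow(const std::string& server) {
        auto it = index_.find(server);
        if (it != index_.end()) {
            lru_.splice(lru_.begin(), lru_, it->second);   // refresh recency
            return;
        }
        if (lru_.size() >= maxEntries_) {                   // evict the oldest
            index_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(server);
        index_[server] = lru_.begin();
    }

    bool IsKnownSlow(const std::string& server) const {
        return index_.find(server) != index_.end();
    }

private:
    size_t maxEntries_;
    std::list<std::string> lru_;    // front = most recently used
    std::unordered_map<std::string, std::list<std::string>::iterator> index_;
};

An attacker with ten thousand host names can still flush legitimate entries out of a cache like this, which merely costs a few slow enumerations later; the unbounded disk and registry quota problems described above go away.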
Comments (133)
So store the IP address of the server in the cache instead of the hostname. Samba only runs on one port, so you can’t have more than one server on the same IP, and you can’t do the "billion hostnames on a single machine" trick, unless you have a billion IP addresses.
(Yes, I know Vista will support IPv6. You got me there :)
On top of that, this is only an attack on the local network, so you’ve already got an insider running a rogue server. And as I’d hope that IFrames to UNC files wouldn’t be loaded from pages served over HTTP, someone would have to convince you to open an actual HTML file on the SMB server for the attack to work. You can’t use index.html to autoload the attack file when someone’s just browsing directories.
With all the weird compatibility hacks the Samba project has had to make over the years to be compatible with the various versions of Windows, I think MS can deal gracefully with this bug.
Would Jeremy be willing to call the fixed version of the protocol a new dialect (or add a compatibility flag) to the negotiate SMB?
That way the redir could detect the fixed servers.
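Roughly what that detection might look like on the client side, assuming such a flag existed — the capability bit and structure here are invented for illustration; nothing like it is defined in the protocol today:

#include <cstdint>

// Hypothetical capability bit a fixed server could set in its negotiate
// response. Old servers (including buggy Samba builds) never set it, so
// they automatically get the slow, XP-compatible enumeration.
const uint32_t CAP_FIXED_FAST_ENUM = 0x00800000;   // invented value

struct NegotiateResponse {
    uint32_t capabilities;
    // ... other negotiated fields elided ...
};

bool UseFastEnumeration(const NegotiateResponse& response) {
    return (response.capabilities & CAP_FIXED_FAST_ENUM) != 0;
}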
I still feel that storing a cache of broken addresses is a bad idea. Seems like an awful lot of trouble to go through, considering the ways this may be exploited. In addition, checks would have to be done periodically to see if a box is still behaving badly — since it might be upgraded / replaced / whatever. Doing such a re-check seems close to impossible or at least really impractical, as the client would have to wait till a folder of more than 100 files is re-opened, as well as knowing that it’s been so-and-so long since the last attempt.. Horrid.
It’s great to hear that Microsoft tests OS compatibility with Linux on some level:)
Well with background everything becomes more clear!
Knowing this I think I would favor this solution:
– For Explorer, default to slow mode with an option to use fast mode (option or reg_entry). A ‘slow mode servers’ list would be used to still allow for a few exceptions, either ‘dynamic’ (discovered by the OS) or ‘static’. The list shall be bounded to N entries and survive reboots.
– For the API, as I don’t have much experience with it, maybe I’m blatantly wrong, but I would say that a ‘use fast mode’ flag would quickly become a burden. My proposal would therefore be to default to fast mode (if not in the ‘slow mode servers’ list presented earlier) and issue an EAGAIN error when the problem arises. From then on the server would be added to the ‘slow mode servers’ list.
I used two cases (Explorer and API) since some people seemed to insinuate that Explorer didn’t use the standard API. If that’s not the case, then Explorer should try again on the API’s ‘EAGAIN’ and, if that fails again (meaning that it is not the ‘fast mode’ bug), display a ‘server disconnected’ error (or whatever the error is when a server suddenly becomes unreachable).
Eventually the code to detect the special EAGAIN case could be thrown away without any bad side effects, except that when a ‘new’ Explorer (with no workaround) accesses an old Samba, the user would see a ‘disconnected’ message, scratch his head, dismiss it, try again, and from then on it would work.
Maybe I would even use this solution from the beginning, but it would depend on the rarity of the ‘bad’ servers and the impression I would want Vista to make when it launches! ;-)
DO NOTHING Wins again!!!
Make them fix their buggy driver and all is right again.
I agree with Larry, if the OS knows an SMB server is being mounted default to slow mode unless a specific (new) flag or something along those lines is present to identify the ‘fixed’ version of the server.
so much for open source being tested more better.
Is it wise to allow UNC paths to be embedded in web pages? Shouldn’t a browser make some reasonable attempt at keeping protocols separate?
In particular: IE already prevents iframes from using file: URLs to access local files for obvious reasons. I think this is a restriction that should be extended to UNC paths anyway.
Perhaps the samba folks could update the "Family" field on the fixed version, so you could tell the old from the new? The people with the in-between versions (works fast but has the old family code value) would get slow when fast was possible, but they are the ones upgrading.
Is it even possible to make a random user connect to a random SMB server over the Internet? AFAIK, most ISPs filter the SMB ports (even those ISPs which do not filter anything else) for security reasons, and have been doing so for about ten years.
Heh. I worked for one of the cheap crap network storage vendors, doing embedded Linux until I was lucky and they laid me off. My boss stayed, and they really hung him out to dry when they finally went bankrupt. Heck, I may have helped design this server.
Anyway, the best response is still displaying a "this server is borked" dialog to force the user to fix his server. It’s broken and needs to be fixed.
On the other hand, MS could save itself trouble like this in the future by documenting protocols. Most of SAMBA is based on reverse-engineering what comes across the wire. And make future protocols a hell of a lot simpler than Server Message Block.
The Internet protocols interop well because they’re simple and well documented.
It seems that the bad-list cache can be limited to a smallish number of entries, at some cost in performance to neighboring servers (a rough sketch follows below):
if a server reports failure, use its IP (filter out DNS attacks).
if more than ‘x’ instances from that IP range occur, then put a ‘network block’ record. From then on, all accesses to that network block use slow accesses.
If more than ‘x’ of network blocks are hit, then use a larger granularity (multi-network block).
The attack would need to have a botnet with addresses in many different address ranges. And even then, the only impact is that all accesses are slow.
Advantage – Keeps size of cache bounded. Tuneable for common scenario.
Disadvantages:
all other ‘blacklist’ disadvantages (how to remove entries, etc).
Powerful attacker can force slow accesses on same subnet/network/etc
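Here is a rough sketch of that escalation under the assumptions above (the class, threshold, and the choice of /24 blocks are invented for illustration): individual addresses collapse into a network-block record once enough hosts in the same range misbehave, which keeps the record count bounded.

#include <cstdint>
#include <set>

// Hypothetical escalating blacklist: when more than kHostThreshold hosts in
// the same /24 are marked bad, the individual records are replaced by one
// block record. A further level (/16, etc.) would follow the same pattern.
class EscalatingBlacklist {
public:
    void ReportFailure(uint32_t ipv4) {
        hosts_.insert(ipv4);
        uint32_t block = ipv4 & 0xFFFFFF00;                // /24 containing it
        size_t inBlock = 0;
        for (uint32_t ip : hosts_)
            if ((ip & 0xFFFFFF00) == block) ++inBlock;
        if (inBlock > kHostThreshold) {
            blocks_.insert(block);                         // escalate granularity
            for (auto it = hosts_.begin(); it != hosts_.end(); )
                if ((*it & 0xFFFFFF00) == block) it = hosts_.erase(it);
                else ++it;
        }
    }

    bool UseSlowMode(uint32_t ipv4) const {
        return hosts_.count(ipv4) != 0 ||
               blocks_.count(ipv4 & 0xFFFFFF00) != 0;
    }

private:
    static const size_t kHostThreshold = 8;                // invented threshold
    std::set<uint32_t> hosts_;                             // individual bad hosts
    std::set<uint32_t> blocks_;                            // bad /24 blocks
};

The worst an attacker with addresses in many ranges can do is force slow mode on whole ranges, which is exactly the disadvantage listed above.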
Is it feasible to get the Samba team to change the name of their server in the Family field of the "fixed" version and all future versions moving forward to something like "FastSamba"? That way you can blacklist "Samba" and use the slow protocol safely with the legacy Samba servers without punishing the new/future versions that work correctly.
The only casualty would be the current version of Samba that is technically fixed but doesn’t include the changed name. Vista would use slow mode with that, even though fast mode would have worked. Since that code is not in wide distribution at the moment, it shouldn’t be much of an issue.
Also you could provide a Registry/Policy setting to control the fast mode blacklist.
"Your argument that OSS is not tested as well is therefore worthless"
How’s his argument worthless? Even you are saying that they released untested code. Sounds like his argument is pretty damn solid to me.
After careful consideration, I think the best answer would be just to disable the "fast mode" by default, with the additional proviso that there be a registry switch somewhere to re-enable it.
My reasoning is: If Samba’s the only SMB server that exhibits this problem, and a fix is already in the codebase, it’s only a matter of time before that fix gets everywhere it needs to be. By the time the next version of Windows comes out, the fix should be pretty much everywhere, and you can flip the switch. In the meantime, having the switch there allows folks that care to perform interoperability testing. (I might also drop a polite note to the NAS vendors in question, saying "you might want to consider upgrading the Samba version you use, and here’s why.")
OK, addressing disadvantage number 1: "the fast mode may as well not exist." True enough, but you’re also not shipping code that engages in buggy behavior by default, either, whether the source of the buggy behavior is your code or someone else’s. As you’ve pointed out before, if people upgrade Windows and something breaks, they’re going to blame the Windows upgrade, no matter whose "fault" the bug was. There’s certainly no shame in a mistake that isn’t made. And you have an "out" later for when things DO get better.
Addressing disadvantage number 2: "People will accuse Microsoft of unfair business practices…" People accuse Microsoft of unfair business practices about every damned thing in the world, sometimes rightly, sometimes wrongly, but it happens. One little detail about one teensy corner of the operating system isn’t going to change that by any significant amount in either direction. The people making the accusations would probably expect no better of Microsoft, and they probably weren’t "your" customers to begin with.
Yeah, it’s an imperfect solution. However, as far as I can see, all the other solutions are worse. This at least puts you in a position to take advantage and do what you originally wanted once "outside conditions" improve.
Follow-up to my previous post: if you use the "whitelist" idea, don’t store it. It should be a session flag, discarded as soon as the session goes down. Having a list of IP addresses or NetBIOS name ignores the fact that such things are transitory.
If it’s not the SMB_FIND_ID_* bug, and is not a bug in FindNext which can be worked around, I’d second the API flag for programs that are aware of the error, along with a delayed-write-failure style balloon to report the error in non-aware programs. Just like removing a USB drive, the enumeration should simply fail, as there’s no real way of recovering. This is not just an IShellFolder problem: what if it’s a kernel driver asking for the directory listing?
The balloon should link directly to a tab in System Properties’ ‘Performance Options’ (or the equivalent in Vista) that will stop the redirector using fast directory listings, so users have a chance to ignore the problem, and time to stop whatever they’re doing if they want to fix it. Einar’s "might be puzzled for a minute or two" is no good if the user is in a rush to get something done!
I like Neil’s idea to provide this as an optional update for Windows (so only eager people get it), but what if an eager admin deploys it, and the user takes their laptop home?
Cheers,
Mark
Shipping with slow enabled and fast on request is a horrible idea.
Scenario:
You check for the buggy Samba implementation on your network and find it doesn’t affect you so you deploy Vista and switch the default from slow to fast.
A year from now they purchase a NAS that ships with the faulty Samba version. What are the chances you’ll remember to check for the bug? Or by then someone else could be managing the network.
To avoid this, Vista has to auto-check for the bug even if it’s being told to use fast, so you’re better off starting off with fast and degrading down to slow as soon as the error pops up once. If the problem gets fixed, the admin can delete the setting and Vista goes back to the default until it encounters another error.
It seems that the best solution is to simply contact the Samba team and ask what they would want. You can’t prevent conspiracy theories but to the sane ones the fact that Microsoft took the trouble to work with the Samba team to resolve the situation will show that whatever the outcome is, it was not done for malicious reasons.
Brian Reiter: "Is it feasible to get the Samba team to change the name of their server in the Family field of the "fixed" version and all future versions moving forward to something like "FastSamba"? That way you can blacklist "Samba" and use the slow protocol safely with the legacy Samba servers without punishing the new/future versions that work correctly."
Unfortunately this wouldn’t work, as the family returned by Samba is a user-configurable setting.
J: "How’s his argument worthless? Even you are saying that they released untested code. Sounds like his argument is pretty damn solid to me. "
The argument is useless because they didn’t test against a case that was impossible to test for at time of release, as no version of Windows actually used the fast mode and available documentation wasn’t specific about how the server should behave. Saying "OSS projects don’t test against bugs that are impossible to detect with currently available knowledge" gets you nowhere fast.
I still think the best thing is to notify Samba, or even better publish the problem somewhere. That way everyone who does SMB servers can also address it.
It is their problem, and they are the best judges of how to fix this with their community. And they will have roughly 7 months to do it.
You will end up with no workaround code (which would have to be maintained forever) and they would be fixed.
Just for the record, where do you want reports of the Vista SMB1 implementation being less than optimal? It is mostly evident just running Network Monitor/Ethereal and doing various operations. I also applaud only using one stream for document properties rather than the 15 or so that earlier versions looked for.
Gosh, it’s almost as if Microsoft’s secrecy about the SMB protocol and making life hard for anyone (http://www.linuxelectrons.com/article.php/20050607122215586) who tries to implement compatibility and interoperability with it has come back to bite them in the ass!
Shame.
Now I’ve got my snipe out of the way, it’s impressive how many people are still ignoring Raymond’s basic point with comments like:
"Anyway, the best response is still displaying a "this server is borked" dialog to force the user to fix his server. It’s broken and needs to be fixed. "
Yes, it is broken, but NO, it does not NEED to be fixed. It works fine with XP etc, as Raymond said.
You have to look at it from the user’s point of view: they just bought shiny new Vista, and their lovely network hard drive that they’ve had purring along just fine for 3 years suddenly won’t work because Vista tells them it’s "broken".
The vast majority of users (i.e., the ones that matter) will actually conclude that Vista itself is broken, despite what the error message says. This is a bad thing for Microsoft.
And no, I’m not going to flash the ROMs of my Linksys NAS unit – for a start, how would I do this?
Raymond is dead on: it’s a problem, it’s annoying for MS, but ignoring the problem and shifting the blame is the worst thing they can do.
This same compatibility point has come up so often on this blog that I’m surprised people still don’t get it :-)
If it’s impossible to detect the bug in the remote server without causing a local failure, then your only options are to run predictably slow or unreliably fast. That’s really an ethical and a marketing decision based on the number of users expected to be affected, not really a technical one. Surely there are uncounted users out there with drives mapped to ten separate NAS devices whose company requires they shut down their PCs every night. Locking up their explorer for 30 seconds ten times a day is probably best avoided.
However, if it’s practical to detect the server’s capabilities in advance, or to attempt fast mode and then fall back to slow mode without causing a local failure, then there are some other options. An evaluation of the user benefits of fast mode relative to the user pain of the fallback scheme will suggest how much development effort this issue deserves.
Figuring that any speed detection or fallback scheme will incur significant per-server costs at runtime (or else you probably wouldn’t have posted about it), some form of persisting the info across reboots would probably be beneficial.
However, the cache doesn’t have to consume a lot of resources, and it doesn’t have to be particularly vulnerable to DOS. You just need an appropriate data structure.
What information needs to be stored? Any given server falls in one of three groups: "Known Slow", "Known Fast", or "Not Known".
If an occasional fallback event isn’t too painful, then we can proceed optimistically. This allows us to only consider the "Probably Fast" and "Known Slow" cases, and take advantage of heuristics. For example, you could persist a bit vector indexed by a simple hash function of the remote server name. The vector gets initialized with 0, and gets set to 1 to represent "Known Slow". Periodically, some percentage of the vector gets zeroed to allow upgraded servers to eventually be re-negotiated. A hash collision (or DOS attack at worst-case) produces a temporary revert to the (currently acceptable) "slow" behavior. At first glance, I’d guess that a table small enough for a single disk block would still be large enough to make false positives due to hash collisions extremely rare.
However, if fallback isn’t practical and server speed detection has to be done in advance, then any cache must use complete server identifiers in the persistent cache. A size-limited circular file could be a simple way to implement "least recently loaded" expiration. If lookup needed to be faster than linear search, a fixed-size hashtable could live at the start of the file.
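A sketch of the bit-vector variant under those assumptions (the hash, table size, and decay rate are all invented for illustration): a set bit means "probably slow", a false positive merely costs a harmless slow enumeration, and periodic decay lets upgraded servers be retried in fast mode.

#include <bitset>
#include <cstdint>
#include <string>

// Hypothetical fixed-size "probably slow" filter: 4096 bits (512 bytes),
// small enough to persist in a single disk block.
class SlowServerFilter {
public:
    static const size_t kBits = 4096;

    void MarkSlow(const std::string& server)           { bits_.set(Slot(server)); }
    bool ProbablySlow(const std::string& server) const { return bits_.test(Slot(server)); }

    // Periodically clear the next 1/16th of the table so that servers which
    // have been upgraded eventually get retried in fast mode.
    void Decay() {
        const size_t chunk = kBits / 16;
        for (size_t i = 0; i < chunk; ++i)
            bits_.reset((cursor_ + i) % kBits);
        cursor_ = (cursor_ + chunk) % kBits;
    }

private:
    static size_t Slot(const std::string& server) {
        uint64_t h = 1469598103934665603ull;            // FNV-1a style hash
        for (unsigned char c : server) { h ^= c; h *= 1099511628211ull; }
        return static_cast<size_t>(h % kBits);
    }

    std::bitset<kBits> bits_;
    size_t cursor_ = 0;
};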
I know I’m late with my suggestion, but here goes:
Always use fast. If an error is returned (maybe any error, maybe just this samba one), throw the results away and start over with slow, just this one time. Add an error to the log. Any additional requests simply keep doing this. Keep no history about this server having this problem.
The end result is that folders with 100+ items are REALLY slow on broken servers, but small folders on them are fast. Non-broken servers work fast as well.
Users are not burdened by error popups, admins don’t need to hack the registry, and the Windows code base isn’t burdened by complicated fixes that are no longer needed once Samba is patched.
I like Brian’s idea with "FastSamba"
I wanted to propose the same, but he was faster :-)
And I think the Samba team would have nothing against it. They also want compatibility and their server not to be perceived as buggy/slow :-)
Also,
I don’t think I’ve seen this question asked upthread: has anyone from Microsoft either
1) commented on the bug in Samba’s tracker
2) spoken to one of the core developers
or
3) posted a message to a mailing list
asking the Samba team what they think about the problem and possible solutions?
Raymond,
in this story I miss the precision you’re addicted to elsewhere.-)
I was a little bit disappointed then!
Some points already have been addressed, so repetition might happen:
1. UNC paths ain’t (valid) URLs.
There’s only one (so called.-) web browser out there that accepts them.
The correct notation is file://some.server/… and should work in other browsers running on Windows too (Netscape did it right many many years ago).
2. Overflowing a table of hostnames with wildcard DNS addresses can easily be avoided if the DNS client requests the canonical DNS name and only stores that (a rough sketch follows point 3 below).
C.f. RFC 1123 chapter 5.2.2:
| Canonicalization: RFC-821 Section 3.1
3. The problem is not really in Explorer or its IEnumIDList::Next but in the redirector.
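As a sketch of the canonicalization step point 2 describes, using Winsock's getaddrinfo with AI_CANONNAME (this only collapses names when the attacker's many host names are aliases of one canonical host; WSAStartup must already have been called and ws2_32.lib linked):

#include <winsock2.h>
#include <ws2tcpip.h>
#include <string>

// Resolve a host name and return the canonical DNS name, falling back to
// the name as given if resolution fails or no canonical name is reported.
// A cache keyed on this value holds one entry per real host, not per alias.
std::string CanonicalName(const char* host) {
    addrinfo hints = {};
    hints.ai_flags = AI_CANONNAME;
    hints.ai_family = AF_UNSPEC;

    addrinfo* result = nullptr;
    std::string canonical = host;
    if (getaddrinfo(host, nullptr, &hints, &result) == 0) {
        if (result != nullptr && result->ai_canonname != nullptr)
            canonical = result->ai_canonname;
        freeaddrinfo(result);
    }
    return canonical;
}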
Now the constructive part:
0. When detecting the error write an event log entry referring to an MSKB article.
1. add a registry setting to force slow mode and document that in said MSKB article:
there are so many SAMBA installations out there, not just on different Linux distributions (and home-brew LFS setups) but also on arbitrary Unix systems you won’t even have heard of, that it’s impossible to address every vendor or system administrator.
Many SAMBA installations are run on fairly old distributions where the backport of a SAMBA fix would be necessary (this is especially true for the longer living RHEL or SLES installations in companies).
Sometimes the host system for SAMBA can’t be upgraded because the main application is certified for exactly that current host system version, or won’t be upgraded since SAMBA ain’t the main application running there, it just happens to be installed and used too.
I’ve two SVR4 Unices here still running SAMBA 2.2.12 because newer versions of SAMBA won’t compile (I’d have to repair many header files and the vendor’s compiler, or port GCC).
BTW: does the error show in all SAMBA versions?
2. "Be liberal in what you accept, and conservative in what you send".
There are quite some hacks in SAMBA working around flaws in the Windows CIFS/SMB/NetBIOS implementation to keep both interoperable.
Now it’s Microsoft’s turn.
I’ve no doubt that SAMBA will fix it, but: see 2.
Stefan
Raymond, maybe it is still possible to continue after receiving a strange error? This would require that 1) the file names need not be returned in any particular order and 2) the client remember the last query, and when the error indicating the old Samba is received, issue a slow query and filter out the entries which have been returned already. And of course log an event for the administrator etc. Then it won’t be necessary to remember a list of servers, or waste a lot of memory, or return incomplete lists (unless the directory is modified between two queries, in which case the results probably wouldn’t be consistent anyway).
"On the other hand, MS could save itself trouble like this in the future by documenting protocols. Most of SAMBA is based on reverse-engineering what comes across the wire. And make future protocols a hell of a lot simpler than Server Message Block. "
Nuf said.
I agree with A Tykhyy.
While it’s not pretty, the best way to not break, and still work for all clients, would be to make a note of the filenames that are coming back from the request, and only discard that list when the request is completed.
Thus, if the strange error occurs, the request could be re-sent using the slow method, and the filenames already returned could be simply ignored.
It adds memory overhead for keeping track of the filenames, but it’s a temporary use of memory, and since it sounds like it’s usually around 100 files, it doesn’t sound like it would typically result in a very large spike in memory usage.
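A sketch of that bookkeeping, with invented stand-ins for the real fast and slow directory queries: names already handed back are remembered only for the lifetime of the enumeration, and on the tell-tale error the listing restarts in slow mode with duplicates filtered out.

#include <functional>
#include <set>
#include <string>
#include <vector>

enum class QueryStatus { Ok, Done, BuggySambaError };

// fastNext/slowNext produce one name per call; slowRestart re-opens the
// enumeration in slow mode. All three are stand-ins for redirector calls.
std::vector<std::string> EnumerateDirectory(
    const std::function<QueryStatus(std::string&)>& fastNext,
    const std::function<void()>& slowRestart,
    const std::function<QueryStatus(std::string&)>& slowNext) {
    std::vector<std::string> results;
    std::set<std::string> seen;     // discarded when the enumeration ends

    std::string name;
    QueryStatus status;
    while ((status = fastNext(name)) == QueryStatus::Ok) {
        results.push_back(name);
        seen.insert(name);
    }
    if (status == QueryStatus::BuggySambaError) {
        // Fall back: re-enumerate slowly, skipping names already returned.
        slowRestart();
        while (slowNext(name) == QueryStatus::Ok)
            if (seen.insert(name).second)
                results.push_back(name);
    }
    return results;
}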
There is a lack of information regarding whether the failure is a specific one, "random", or whether only a certain number of entries (100) are returned (with no hint that there should be more).
I am going to assume the error is such that upon encounter the server can be assumed to be the broken Samba one.
Thus I suggest:
A Twist to the popular suggestion:
Upon the particular failure, retry with slow mode. The user won’t be notified of this failure since there is little (s)he can do about it in many cases. If the slow list succeeds, indicating that it really is the broken Samba, then try, for example, to create a directory on the Samba server with a name that is known not to result in a directory actually being created, but that will put a failure in the log with the directory name, such as mkdir "Broken_Samba_Please_UpdateNULL". However, if this approach won’t create an entry in a log, then a second approach is to try to create a real directory in the root of the Samba server, possibly a hidden one: mkdir .Broken_Samba. This will cause the admin to take notice, but won’t clutter the server with dirs and won’t really ruin anyone’s day since it is invisible by default.
What’s the measured difference between fast and slow mode for large directories?
Stefan: re: UNC paths ain’t (valid) URLs.
[[ There’s only one (so called.-) web browser out there that accepts them. The correct notation is file://some.server/… and should work in other browsers running on Windows too (Netscape did it right many many years ago). ]]
You may not realize this, but Explorer and IE are pretty much the same code. Do you expect users to stop typing \\server\share in Explorer? Or for Explorer to translate?
Jen, you can already type things in the address bar of IE that aren’t URLs. For example, you can type "www.microsoft.com" instead of "http://www.microsoft.com/".
Having the ability to type "\\server\share" in the address bar doesn’t mean you also have to accept such syntax where real URLs are required (like "a" tags in HTML documents).
Here’s something that you may have missed:
Work with Samba to build a solution that works with old versions of their stuff and allows Fast-mode without any config goofiness.
Jen:
believe me: I knew that!
I expect Explorer.exe to be able to browse to \\server\share, and I don’t mind that IExplore.exe does it too.
In HTML code (not interactive) UNC paths are wrong and MUST yield an error!
Stefan
Raymond,
This may be a stupid question, but: we are assuming these are at least upgradable flash ROMs, right? That way, at least in theory, the machine could eventually be upgraded. If even *one* vendor isn’t selling flash ROMs, that would possibly change the fix.
(While it seems that even TI’s have flash ROM in them, I’ve seen a lot of cheap network storage–presumably for home use–hit the market lately, which may or may not be flashable.)
I suppose if you really wanted to get into left field, you could have Vista do something when exactly 100 files are returned. This of course would be based on the idea that the bug always returns exactly 100 files when the number is greater than 100 and always returns the correct number when the file count is less than or equal to 100. Furthermore you’d be assuming that the chance of a file server with the bug having exactly 100 files on it is, well, small. Based on that you might be able to do something. Of course it’s convoluted, and pretty much useless…
(I never said the idea was good!)
I really hope that this cache and the even wilder ideas of this crowd won’t find their way into Vista.
Just make a group policy/reg key to disable the fast queries and have it default to off. (Thus breaking Samba)
People who use Samba willingly are used to it breaking when updating Windows to the newest version. Documenting this thing is a must of course.
zzz: what if the folder gets copied somewhere else? What if you don’t have write access to the share? I think this may be an archetypal method of warning about the error on Unix, but it doesn’t transfer to other architectures.
Then bugs like this would not happen as much. As it stands right now, the CIFS license is not open enough for the use of the Samba project, so they still have to clean-room it. Maybe if they did not have to waste so much time reversing undocumented stuff then they would have more time and bugs like this would not show up.
Raymond, your HTTP attack scenario will *only* work if the user is running IE. No other browser (…well, I’m not sure about Opera, but not FF, not Lynx, and not wget) will follow UNC paths, especially not in iframes. Not that this necessarily helps the end-user, but it is an instance of other browsers being better in some cases. ;-)
Also, Samba is not a "driver" in the general sense of that word. It’s done in userspace (so much the better, because upgrading it doesn’t require a reboot!), not in kernel space. Not that that actually changes anything, but referring to it as a driver seemed to cause some of the issues in yesterday’s posts. Since that’s an inaccurate description anyway, referring to it as the "server software" (or something), which would have been more accurate, would have helped clear up the confusion a bit.
Brian — there’s a problem with that statement. Samba *CAN’T* have been tested against this fast-query thing, because XP doesn’t do it. AFAIK 2K also doesn’t do it, but I’m not sure on that. Your argument that OSS is not tested as well is therefore worthless — it would not have been possible, short of writing your own SMB/CIFS client.
BillK — that sounds like a valid option, actually. If Samba already has a different value in there than Windows, and it’s the *only* other SMB server with this problem, that sounds like a decent fix. Not great, as my post yesterday on DOM detection went into, but perhaps passable.
I just *knew* that link pointed to my post!
Anyway, I stand by my original assessment. The only reasonable course of action is to present an error (returning partial results without any indication is completely unacceptable). Furthermore, one should definitely provide an option to switch to "slow mode", although I don’t know what difficulty there might be in remembering which servers have been selected as slow or not.
I’m not terribly familiar with the SMB protocol, but is there a way to detect the specific version of SAMBA in use on the remote server? (If so, I hope every vendor doesn’t change this to their own value.) If one could reasonably detect it, I would try to make slow mode the default for that server version.
Even the "best" developers/testers who have access to the CIFS specs produce bugs and don’t find them in time: see MSKB article 896427!
Stefan
Damn – I was just about to add a comment to the last one, when I noticed the pingback!
In the SMB_FIND_ID bug, the difference between "fast" and "slow" is that the "fast" call returns an ID (which I assume it can then use in the future for faster calls). So running the "faster" one in the background is pointless. Smbd’s FindFileFirst happily retrieves the first 100 or so files (this number can be set by the client), but FindFileNext is missing the code for returning ID’s, and fails as soon as it’s called.
If that’s the case, a quick look at smbd.c suggests that if there are more than 100 files, the server itself re-queries the directory to find the next one, so has the race condition present. It also looks like the client could simply switch over to the non-ID listing if it detects the server can’t cope, and then get the IDs as required.
Also, if it knows it’s talking to a Samba server, the redirector could request only one file in the FindFirst transaction, and then issue a FindNext. This would allow the whitelisting to happen without an enormous directory. If there’s only one file, there really is no performance penalty.
So it looks like most of the suggested workarounds would work, and the user doesn’t need to be brought into it. Assuming this *is* the problem Raymond has in mind, the only question is which fix is most resilient.
Whatever is decided, it seems the best thing is (as Larry suggested) to create a new SMB dialect for the new fixed Samba, so that eventually the fix can be deprecated.
Mark
In pseudocode (alas, I’m not familiar with IEnumIDList::Next):

Next(LPXY *list)
{
    hRes = GetNextFromServer(list);
    if (hRes == kObscureSambaError)
    {
        SortOutSambaError(list);
        // Remember to use slow for this server
        UseSlowForServer();
        UseGeneratedListForThisDirectory();
    }
}

SortOutSambaError(LPXY *list)
{
    // This will return files, e.g. ACE
    GetListFastFromServer(fast);
    // This will return files, e.g. ABCDE
    GetListSlowFromServer(slow);
    // This will generate a new file list, e.g. ACEBD
    *list = AppendFromSlowToFast(slow, fast);
    // Save it for this server and this directory
    // so we can use it again for the next Next()
    this->generatedlist = *list;
}
____________
OK, there is more code needed, but I hope you catch my drift.
The fixed (for bug #3526) Samba is in Fedora Core 5, Debian Testing, and various other bleeding edge distros. It will be in all the new commercial supported offerings over the coming months and of course the newer version or a backport will be pushed into the update queue for existing systems like RHEL.
Bugs happen, people trying to pin "blame" on someone are missing Raymond’s point, with the possible exception of those pointing at an overcomplicated and underdocumented protocol as the cause.
J: *Everyone* releases untested code. Perhaps you are laboring under the assumption that it’s possible to completely test every single code path in a program? If so, you might want to shed that assumption right now, because it’s not actually possible.
The difference with open source isn’t that it’s better tested, it’s that fixes for the breakage found in testing get integrated into the main software tree much more quickly.
(I don’t know how many bugs I’ve found in third party closed source software that I use at work that will never get fixed, because basically the vendor doesn’t care anymore. That includes Microsoft stuff as well; in particular, there’s a huge problem with BITS 1.1 and proxies that’s bitten me several times, where the BITS 2.0 update won’t even download via Windows Update when an authenticating proxy is set up. Taking the issue up with the Windows Update PSS group resulted in a fix of basically "don’t do that then, just set them up as a NAT client instead", which is not at all a good fix. The WU activex control should work with both BITS 1.1 and 2.0 — after all, it worked with 1.1 until WUv6, and BITS 2.0 is a required update before WUv6 will work anyway.)
For the comments that say the following: "You can apparently throw out all the suggestions about retrying the operation, because they would fail too:
The next NtQueryDirectoryFile [fast] call after getting the first 128 directory entries returns STATUS_INVALID_LEVEL.
If we switch to using the FileBothDirectoryInformation [slow] after getting this error, that call returns STATUS_NO_MORE_FILES. This does not happen when using the FileBothDirectoryInformation [slow] level right from the start. "
****
IF you SWITCH to using the slow version after getting the STATUS_INVALID_LEVEL error.
How about closing the connection, and then reopening it in slow mode? (Starting completely over after seeing the error.)
If "closing the connection" isn’t the right term, then start over and do whatever you would have done if fast mode didn’t exist.
Why hasn’t anyone mentioned that possibility?
David W
"You could copyright the key too stop the Samba people from distributing it if they extracted it from the Vista server binaries."
Perhaps you’ve not heard of the case of Sega vs Accolade? Interoperability is a legitimate excuse for effectively violating copyright, even if it seems to cause endorsement by trademark, when the copyrighted work itself is used merely as an obstacle for interoperability–ie, Samba could copy said key but say Compaq had to reverse engineer the IBM BIOS, since the BIOS actually did work; I’m not sure what would happen if the code itself was the key, though I’d imagine it’d be kosher so long as they used it merely as a key and wrote their own code.
And just to point out how the DeCSS case doesn’t apply: the DeCSS case only shows that this doesn’t apply towards protecting specific works. But reasonably, if one needed to CSS-encrypt a movie for it to work in a DVD player, then one would be perfectly allowed to encrypt one’s own movie. The DMCA, after all, is about preventing the cracking of encryption for a work, not preventing the encryption of something one made. Go figure.
"*Everyone* releases untested code. Perhaps you are laboring under the assumption that it’s possible to completely test every single code path in a program?"
Stay within the context of the argument, please. If I release software that supports a fast mode and a slow mode, you can bet I’ll test both modes. Of course I won’t be able to test every possible branch, but I’ll be damned if I release a major feature without testing it.
But for that matter, you don’t even know whether they tested this feature, you’re just making stuff up for the sake of arguing. They probably did test it, and they probably just didn’t find this bug.
>> If Samba’s the only SMB server that exhibits this problem, and a fix is already in the codebase, it’s only a matter of time before that fix gets everywhere it needs to be. By the time the next version of Windows comes out, the fix should be pretty much everywhere, and you can flip the switch. <<
Now think this through – the big problem is not:
1) whether or not (or when) the NAS vendors fix it; or
2) whether the kind of people who read Raymond’s blog will be able to handle the problems they might run into – they probably could handle it (even if it pisses them off when things stop working and they have to figure out how to fix it – I know I wouldn’t be thrilled);
The big problem for Microsoft in this situation is that there are probably thousands (hundreds of thousands?) of these defective servers out there being run in environments like your dentist’s office, the bar and grill where you’re getting lunch today, maybe even your neighbor’s mother’s house. None of these people know anything about Linux, Samba, or SMB. They probably don’t even really know what a server is. All that thing is to them is "a box where my files are". Period.
If I’m one of those people and I install Vista and suddenly most of my files have vanished or I get these incomprehensible error messages (Error 0x81439003: Your SMB server device is running incompatible software, please upgrade it – Yes/No/Cancel), then Vista goes out the window. As a bonus, they get someone telling everyone they know that Vista is crap.
*That’s* why Microsoft cannot simply keep the OS ‘pure’ and just let these servers break.
I think the suggestion about having the affected server (if there’s other SMB products out there, they could conceivably be affected too) update its "family" field to reflect compatibility is one of the best.
I’d like to suggest this alternative:
Set the mode to "slow" (or "compatible") and make the user enable "fast" in Vista. In the next release of Windows, default to fast. This will allow time for people to upgrade their NAS devices as they break or run out of space. If you can upgrade the NAS for adding new hard drives, you should (hopefully) be able to upgrade it for OS updates.
Drawbacks:
1) Less incentive for immediate upgrades.
2) People accuse Microsoft of intentionally slowing the computer. (I’ve seen programs that made it sound like the 0.4 second delay for the start menu to pop up and go away was a huge conspiracy.)
3) The benefit of the fast mode is not realized.
4) You’re time shifting the problem, hoping that it goes away.
With respect to cache trashing, (and not that I think it’s the Right way to do it), caches can be (and usually *should be*) limited in size.
Better idea:
Set the default on a server to slow. If you encounter a folder with more than 100 files, send the fast command and, if it succeeds, tag the server as fast, else known slow.
This works with all servers, takes advantage of fast behavior when large folders are encountered, and requires no user interaction.
Default to fast, but fall back to slow if there is an error. Provide a registry setting to force the use of the slow method (thereby avoiding making two requests each time to buggy servers).
Advantages:
* Users see correct results in explorer regardless of whether the server has this bug. IMO this should be a requirement for any solution, not merely an advantage.
* Servers that correctly implement this feature are not disadvantaged (with the default configuration).
* Users are not presented with error messages which, for most of them, would probably be useless anyway.
Disadvantages:
* Vista will be somewhat slower than XP for this operation when communicating with buggy servers, since it will perform two requests at least some of the time (depending on how clever you want to be about "remembering" which servers are buggy).
* Additional code complexity to implement and maintain this workaround.
As a developer, I can imagine that this might require a fair amount of additional complexity, but from the user’s perspective it definitely seems like the best solution.
If the additional complexity is deemed unacceptable, it seems like the only viable option is to disable fast mode. The alternative is for Vista to display incorrect file listings to some customers, bearing in mind that many of these customers will (for various reasons) simply be *unable* to address the root cause of the problem. For them, the only way to avoid this problem would be to not use Vista.
Throwing up an error or warning dialog is worse than useless for most users, since most people either won’t know how to fix the problem or won’t care that it’s a little slower. If someone really cares about the difference in performance, then they’re more likely to be motivated to do a little research. Provide a KB article for the problem, and from there they can learn how to disable fast mode and at least get the same performance they had in XP.
This is a bug someone reported to us 19th feb 2006. I fixed it the same day (it was an error in my code, missing a couple of entries in a switch statement). The bug – here :
https://bugzilla.samba.org/show_bug.cgi?id=3526
was fixed in 3.0.21c. By the time Vista ships I expect most vendors to have moved to at least this version. If the Microsoft engineers came to the CIFS conference along with all the other CIFS engineers, this problem would have been found and fixed in earlier versions of Samba. I urge Microsoft’s engineers to communicate directly with the Samba Team when they find problems like this. We have good relations with all our vendors and have the ability to push expedited bugfixes to people who are shipping Samba code.
To quote Steve McQueen, This is simply a failure to communicate :-). Let’s hope we all do a better job in future.
Jeremy Allison,
Samba Team.
After reading about this, I went to check whether Samba registers its version when joining a domain, but sadly found out that they just leave the OS version information blank.
And by right-click -> Properties, it just shows that it’s a "Windows NT 4.9 Server", which in this case isn’t helpful either.
Suggestion:
1) Store a state per SMB connection: Untested, FastCompatible, SlowCompatible. All connections start Untested. Use "slow" mode for Untested and SlowCompatible, "fast" mode for FastCompatible.
2) If you come across a large directory in the normal course of using an Untested server, save its name. On an idle thread, use this directory to identify whether the Untested server has the bug, and set its state to FastCompatible or SlowCompatible accordingly.
3) Change the SMB protocol to eliminate the need for this compatibility test in the long term. (A rough sketch of the bookkeeping follows.)
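In code, the per-connection bookkeeping those three points describe might look roughly like this (all the names are invented; the actual probe would run the fast query against the saved directory from an idle thread):

#include <string>

enum class ServerSpeed { Untested, FastCompatible, SlowCompatible };

struct SmbConnection {
    ServerSpeed speed = ServerSpeed::Untested;
    std::string probeDirectory;     // a known-large directory, once one is seen
};

// Stand-in for actually running the fast query against the directory.
bool TryFastEnumeration(SmbConnection& conn, const std::string& directory);

// Called whenever a directory with more than ~100 entries is enumerated.
void NoteLargeDirectory(SmbConnection& conn, const std::string& directory) {
    if (conn.speed == ServerSpeed::Untested && conn.probeDirectory.empty())
        conn.probeDirectory = directory;    // remember it for the idle probe
}

// Run from an idle thread: probe once and record the verdict.
void ProbeServer(SmbConnection& conn) {
    if (conn.speed != ServerSpeed::Untested || conn.probeDirectory.empty())
        return;
    conn.speed = TryFastEnumeration(conn, conn.probeDirectory)
                     ? ServerSpeed::FastCompatible
                     : ServerSpeed::SlowCompatible;
}

// Everyday listing code: only FastCompatible connections use fast mode.
bool ShouldUseFastMode(const SmbConnection& conn) {
    return conn.speed == ServerSpeed::FastCompatible;
}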
Oh, and I forgot — when the count of filenames returned by a fast query exceeds say 200 (more than the maximum necessary for Samba to produce the bug) then the filename buffer may be discarded. And of course no buffer at all for slow queries.
Well there you go Raymond. Leave it as fast mode, document older Samba/NAS devices may have issues, and link to a quote from Jeremy saying that said devices have had the code for a while and should be fixed :)
If you are feeling nice, provide a RegKey to force it down to SLOW, for those who cannot upgrade.
"My bug equals your failure to communicate."
Gotta love accountability. Hey, you’re reverse engineering software–that’s life in the big city. Feel free to come up with your own network service, it has been done (Microsoft reverse engineered the Netware client way back when).
I have an idea. It’s practical and user friendly, and it will save time testing. Additionally, it will give the slashdot crowd aneurisms. I consider this to be a feature.
Add a new transaction to SMB, "Check server trust level". The client would send a random challenge, and the server would sign the challenge with a private key and return the digital signature. The client could then check this signature and know if it was a trusted implementation, i.e. one that was verified in house, or some untested junk like Samba written by the great unwashed with a user configurable and hence untrusted version string.
You could copyright the key to stop the Samba people from distributing it if they extracted it from the Vista server binaries.
If the server were trusted, you could use new protocol stuff like fast queries. Untrusted servers would run in compatibility mode, i.e. whatever works on *all* untrusted servers.
That way, you can innovate freely, and you only need to test clients against servers running your code. When the client is used with an untrusted server it goes into legacy mode. And you can stop the thieving samba hippies from impersonating a real Windows machine if you get the security/legal stuff right.
You could even have a ‘CIFS+’ licensing program, where OEMs would license a known-good implementation of the latest server code, and it would be tested in the MS labs. They’d get a copyright license for the cryptographic keys once it had been tested, and would be considered ‘trusted’ by Windows clients. It would be a bit like a WHQL lab for server code. Given that 64-bit Vista requires tested and signed drivers, I see no reason why it shouldn’t require tested and signed servers.
Hell, I’ll buy an extra copy of Vista if you pull it off, because I have an irrational hatred for those open source guys. I only hope that the real solution is different enough from this that you can patent it too.
It’s the AARD code for the 21st Century. Bwahahaha!
J: You’re right, I went off on a huge tangent there. So let me recap what happened up until that point.
Brian asserted that this specific case "proves" OSS isn’t tested any "more better" [sic] than closed source. (Never mind the fallacy of the excluded middle in that argument, at least for the moment.) I responded, saying that there was no way they could have tested it with any known-good client. Remember, the protocol spec is *closed*; Samba does *not* have any way to know what legal client behavior is. (Or if they do, I don’t know where it would have come from; certainly not Microsoft.) The only thing they can go on is the behavior of released OSes, i.e. XP and 2K. And XP doesn’t do fast enumerations.
You said, basically "so this shows he’s right, that they didn’t test this feature before releasing it". That’s not the point, though — the point is, they *couldn’t* test it. Even if they had wanted to, they couldn’t.
Now, it may have been a mistake for them to include that feature (or enable it) without testing, I will admit that. (If it’s even possible to advertise support for "fast enumeration". If not, then they don’t have much of a choice.) But that still doesn’t mean that OSS is tested less than closed source, which is what Brian was saying!
(Back to the fallacy of the excluded middle: One OSS program has a bug, OK; that does not preclude the possibility that in general, open-source is less buggy than closed. One counterexample in Samba would disprove the statement "all OSS programs are less buggy than closed source stuff, all the time", but nobody I know of has ever made that assertion.)
It’s articles like this that show the real problem. I think most vendors would be happy to work with Microsoft to fix these kinds of issues. I mean, come on, Windows makes up 99% of the desktop market. Rather than adding more hacks they could use the strength of the 5000-pound gorilla to say "There is a bug here, please fix this, we have already tracked down most of the issue for you". It seems to me the cost in QA time would be less than the cost of adding all of these hacks and going through the process to get them into the source tree.
I can understand the point of needing hacks for third-party buggy code back in the Win9x days when no one was on the internet and getting an application update was not possible. Those days are past. As for the custom in-house application argument: if Microsoft stopped adding hacks then maybe developers would get the idea that they cannot do undocumented stuff, and a lot of the problem would go away, leaving the majority of the compatibility issues being bugs like this.
So, I don’t have a solution to problem, but how long before hackersite.com now gets bought up and transformed into a porn site and someone says "Look, MS are now advocating porn in the blogs, they really are the spawn of the devil…"?
As I said before… No errors; no new APIs; run in slow mode unless a fast mode server can be identified.
If Samba is always slow, tough. The long-term solution is for Samba to change the identity (e.g. "Sambb") presented to Vista.
Ideally Vista should detect fast mode Samba now, even if by looking at odd side effects such as unused bytes in packets or timing.
So you knew the ‘SMB2’ negotiation you were adding to Vista would break Samba-based devices since at least late 2005, but didn’t even /notify/ the Samba team? Instead, you did nothing while they had to reverse engineer what you had changed and add support for it. Why?
Is that your policy for how to deal with backwards compatibility issues created by pre-release software, or is this a special case due to Samba being a competitor? This sounds a lot like DR-DOS.
I forgot to mention. The bug was open in our bug db for a grand total of *three* minutes before I had the fix committed into the SVN tree. I don’t think we could have reacted faster in getting a fix done than that.
As I said, if it’s causing a problem with Vista deployments let us know and we’ll poke our vendors with a stick to make sure the fix gets widely updated. It’s two extra lines in a switch statement so I don’t think it’s a problem for people to review it for correctness :-).
Jeremy Allison,
Samba Team.
Christian:
It seems to me your example of IT departments switching every 10 years makes it less trouble for issues like this. That’s more than enough time for third-party vendors to release updates and be Vista compatible. In 10 years, if someone is running the version of Samba in question they should go find another field to work in due to poor understanding of how to manage infrastructure. Just calculate the number of bugs found for any product over any length of time.
What’s sad about issues like this is that the amount of time it takes to report a bug like this is the amount of time it takes me to log on to irc.freenode.net and join #samba to talk to one of the developers. They have always been quick to help me in the past. I understand that for older applications, where the vendor is out of business or it’s a custom application, there needs to be more room for compatibility hacks, but for an open source project there is no excuse.
Trying to use the argument…"if we make it slower then people will say we are trying to break Samba" is totally bogus. Go talk to the samba developers. Show a public record of trying to work with them to fix the problems. If the result is that a hack has to be added and it makes Vista run like crap when talking to Samba then you have a public record showing your co-operation with the third party. That should make the EU, US and any other court and the paranoid happy.
I didn’t find anyone mentioning the following solution:
Forget about "Fast Mode" protocol completely. Implement a totally new protocol "Fast Mode 2" which could act just like "Fast Mode". Any server not supporting "Fast Mode 2" will fail immediately on the first call and you can simply fallback to "Slow Mode" then. Any server implementing "Fast Mode 2" can be assumed to not have the 100+ files bug. You can even tell the SAMBA people how to correctly implement that new protocol and you can backport it down to 2003/2000 server.
This way no user ever has problems with that broken "Fast Mode" protocol. No cache, no hacks…
Unfortunately I think this solution is quite expensive to implement :-)
I will follow up my last comment with this… you have a vendor with an embedded device. I have a router here that runs Linux and from time to time gets flash updates. If there is a problem, I flash it. I am sure the Samba project has a good relationship with its vendors. I am sure Microsoft has an OK relationship with them as well. Simple communication will get a fix released and deployed.
Jeremy,
The problem is that fixing the bug isn’t the issue. Raymond acknowledged that the bug was fixed quickly.
The problem is that people have deployed servers with the bug in them, and now the question is how do we work around those broken servers. Unless you can guarantee that the population of SAMBA servers in the wild with the bug is exactly 0, it doesn’t change Raymond’s problem.
This is the problem with compatibility issues, especially when they’re cross-platform. Microsoft can blacklist drivers that are known to be bad and let the user know how to get a new one. We can’t do the same thing for software running on some other box (which may not have a mechanism for updating).
I run 5 samba servers, this is going to break me. Fortunately, I will have upgraded all my (non-embedded) systems way before Vista ships, though the version embedded into VMWare is out of my control.
The best fix in this situation is: never let it happen again.
1. Document your protocols, the way the EU has been telling you to do for months. Source (with restrictions) is the wrong answer; all people need is honest documentation.
2. Discuss changes with other users of the protocol. That includes the OSS implementation.
3. Recognise that you no longer own SMB, just as Novell doesn't own NetWare and Sun doesn't own NFS. It is so embedded that you cannot change things willy-nilly.
It's unfortunate that SMB is an ugly dog of a filesystem protocol. NFS is a lot cleaner and much easier to implement, and is well documented, with open interop festivals to find problems early. Of course, NFS has different flaws: it was never written for a world where a laptop could 0wn the LAN.
Adam,
you look at file URIs only from the Windows side and the implementation chosen there!
In general, file://server/path says "use the local filesystem access methods to fetch //server/path".
Just try it.
The other notation needed in Windows for access to files on local drives is a PITA!
Stefan
I doubt I am adding value to this, but thinking about the idea of "remembering" features of any connections, I wonder if this is a good strategy at all. There is the "cache pollution/DoS" idea, plus what if I replace my xxx server with yyy with the same name and IP? What about load balancers and such that mask what the real target is? Also, unless I misunderstand, Raymond is saying that it is too late to do anything anyway because you are already 100 files in. So, much as I hate leaving performance on the table, I think that we will let this one go, leave it in slow mode for now, and put fast mode in CIFS 1.1 in Longhorn Server. At least it's not slower than XP today. A shame, though, the lowest common denominator holding us back… but compat is king in the Windows space unless there is a security risk.
kuwanger wrote
"Interoperability is a legitimate excuse for effectively violating copyright"
With my solution, Samba would still be interoperable. Fast mode is a performance thing, Microsoft don't have to support fast mode on a third party server if they don't want to. So the Samba guys wouldn't have interoperability as an excuse for violating copyright/patents.
In fact, SMB is in an ideal state for this. There’s the current stuff which is de facto open. All the new stuff doesn’t have to be. Users would be free to choose cheap NAS boxes based on samba, or expensive ones based on a licensed/tested Microsoft protocol implementation with slightly better performance and more features.
Steve Loughran
"3. Recognise that you no longer own SMB, just as Novell doesn't own NetWare and Sun doesn't own NFS. It is so embedded that you cannot change things willy-nilly."
Seems to me that they do recognise this. The whole point of this is to preserve the user experience with broken third party servers, just like they do with broken third party applications.
The problem with Samba is that you can’t tell which version you are dealing with. If it’s a NAS box, you have no way of knowing if it has a particular bug or not. And this bug looks like you can’t even switch back to slow mode if fast mode fails.
Incidentally, is it possible to restart the connection after a fast mode fail? Maybe you could disconnect, set the "slow mode only" flag for the server and reconnect, and that would be sufficient to reset the server’s internal state.
Vista isn't done till Samba won't run. Just another example of why Microsoft S.U.C.K.S
Stefan:
No, I’m looking at this from a multi-system point of view.
"In general, file://server/path says "use the local filesystem access methods to fetch //server/path"."
Um. How does the "file" protocol work remotely? What port does it run over? If you can point to any kind of discussion on how this protocol might work, anywhere, I’d be glad to read about it.
The "file" schema makes no sense remotely. You have to have a protocol to access a remote file. There is no way to access a remote file with the "file" URI schema, unless that file has been mapped to part of your local filesystem’s namespace.
Windows just does this automatically for SMB. It effectively "mounts" all the SMB shares on your local network to the "\\" part of your local filesystem. All the files are treated, by all applications, exactly the same as local files.
So, the following absolute paths to files exist under Windows, both as names in your computer's filesystem.
c:\path\to\file.txt
\\server\share\path\to\file.txt
For that reason, they need to be treated the same as parts of URIs. They both go after the "file:///"
I’d like to see you try "file://server/share/path/to/file" on a UNIX system and see what happens. Yeah. That’s really going to work.
Not.
If you want a remote file that’s not been mounted to your local FS, you need to specify an actual protocol.
smb://server/share/path/to/file
nfs://server/export/path/to/file
ftp://server/path/to/file
http://server/path/to/file
Will work (protocol handlers permitting) as all of those are defined protocols.
"file" is not a network protocol!
I hope that Raymond does not get ANY trouble for posting this!
Just imagine things like this coming up in the current EU court case!
@Raymond: Did you get permission to post this? Would you have posted it if you had known people would find out that it is Samba?
Jeremy Allison: don’t brag about the bug being open for 3 minutes, the bug report clearly stated the solution!
You should be thankful to have such nice testers!
And by the way, I think Raymond's ultimate point is to prove to you that you *can't* just say "break stuff, force developers to fix it!" the way many people do in some of his blog entries.
You *too* would want to release a painless product, and you would end up tainting your code with ugly patches…
Wow, you get everything in this blog! Oh well, at least Jeremy’s post should stop the politics, although he may have pre-empted Raymond’s usual denouement ("in fact, here’s what we were able to do…").
Apologies if this reply is made redundant by queued posts.
Adam,
take Netscape on Unix, enter file://localhost/ and watch your ftp server's DIR listing appear (given that you have an ftpd running .-)
Netscape on Windows will just show the appropriate file share, like IExplorer does.
(I’m writing this from memory, I don’t have a Netscape running here any more and did not test it with Mozilla*).
file:// is a kludge.
Since Windows "automounts" remote shares this local access protocol works remote too.
That’s all, nothing more.
What do you expect on a Unix system with automountd running when you enter file://net/remotehost/path/ (in a browser that does not map file: to ftp:)?
Does file:// now really mean nfs://?
Yes, I could have written file://\\server\share in the first place, but the client used here is Windows, so I chose the somewhat sloppy but nevertheless working notation.
Stefan
Can somebody pander to my ignorance and tell me how long Samba has supported fast queries?
Jeremy:
> If the new info level doesn’t work right then applications may fail when running against a Samba server, and we of course get the blame.
Haven’t you been listening to Raymond’s back-compat posts? Microsoft is the one that gets blamed for this kind of problem, since it’s their client that fails. Not you! ;-)
(Not that I entirely disagree with Raymond; certainly a nontrivial subset of people that don’t have a clue about how network protocols actually work will blame any problem on the most recent change. That doesn’t mean they’re right, but when the only thing holding your company afloat is the number of people using your products (and buying new licenses for upgrades), it becomes important to prevent inaccurate "this OS has bugs!!!!" claims from surfacing.)
Anyway, thanks for the post; I bet I’m not the only one that thinks that from time to time, it’s good to hear exactly how a certain bug got into any program.
Stefan Kanthak: "1. UNC paths ain’t (valid) URLs.
The correct notation is file://some.server/… and should work in other browsers running on Windows too"
No, the correct notation for local files is
file:///c:\path\to\file.txt
Because the file protocol does have a "server" component, but it must be either a valid name of your own computer, or left blank in which case "localhost" is assumed.
Similarly, the correct notation for UNC paths in Windows is:
file:///\\server\share\path\to\file.txt
because you're still talking to the local filesystem when opening a file via a UNC path. Yes, the local filesystem is doing some network access to get the file for you, but UNC paths are part of the local filesystem namespace, and therefore need the "file:///" before the "\\server\share" to be valid.
What you want is:
smb://server/share/path/to/file.txt
which accesses the remote computer directly over SMB without talking to the local filesystem.
Now, it has to be said that Internet Explorer (and Windows Explorer, which might be the same application, but I can never tell) are very lax here and do do their best to autocorrect invalid file URIs for you. So yes, if you limit yourself to IE, what you said will work.
But you *can’t* generally use file URIs like that to open remote files with other browsers (or on other OSs).
(Remember, URIs have strict definitions that are not tied to Windows)
Jeremy, "To quote Steve McQueen, This is simply a failure to communicate :-)." Actually, it was in a Paul Newman film and Paul’s character didn’t say it, it was the guy who ran the chain gang, a recognizable character actor whose name I never learned (but evidently was Strother Martin). Cool Hand Luke was the film.
Meanwhile, I love your report on what you go through to maintain Samba interoperability with Microsoft products.
Raymond: can you exploit Internet zone information (local, trusted internet, other internet, etc.) in determining an appropriate response, at least in the local fileserver case?
Stefan:
file://localhost/ returns a list of my root directory.
ftp://localhost/ returns "Connection was refused"
http://localhost/ returns the default apache page.
Running Firefox on Linux 2.6.15. Konqueror mostly does the same, but isn’t picking up http for some reason :-/
On a unix system with automountd running, I’d expect file://remotehost/path/to/file.txt to return an error, as it makes no sense. It does not map to nfs, it does not map to smb, it is just an error.
If you want to use remote smb or nfs (or ftp, or http, or scp, or …) files without mounting them to the local filesystem, you need to use an explicit protocol handler to talk to the relevant server. The way to do this is to specify is as the schema of the URI, using smb://server/… or nfs://server/… (or ftp://server/…, or ….)
(I actually have no idea if userspace nfs clients actually exist, but you get the idea)
Btw, it's not file://\\server\share, it's file:///\\server\share, and you did claim that file://server/share was "the correct notation". Also, as I pointed out, I am also aware that file://server/share works, it's just not correct.
</pedant>
:)
So, let me explain *exactly* how this bug occurred, maybe it will illuminate the situation.
Microsoft commonly adds new info levels with each Windows release or service pack. They don’t document these, they just appear.
Tridge has a protocol info level scanner as part of the smbtorture suite. This detects new info levels in the trans2 and nttrans SMB calls.
When we find a new info level we work out what it does and implement it as soon as possible – a new info level appearing usually means that the client corresponding to the Windows server version we tested against will start using that info level, so it's very important to get it implemented ASAP, as new Windows clients will expect this info level to work against a server and the downgrading code in the client doesn't always work right. (We've seen that before in older versions of the Windows redirector.) If the new info level doesn't work right then applications may fail when running against a Samba server, and we of course get the blame.
Tridge detected the new info levels and worked out what their internal structure was. He added test code to smbtorture to ensure that querying this info level returned what we expected against a W2K3 server. Once we were sure that our analysis was correct we added it into the server code. We tested the code using the smbtorture analyser to ensure we were returning the correct data structure (so much for the claims that we release untested code).
The bug occurred when I only added the switch statements to field the incoming info level values into the SMBfindfirst code path, and forgot to add them into the SMBfindnext code path. The torture tester didn’t find this case because it didn’t test more than 100 files on this particular code path (it normally does when testing the directory scan code, but not specific info levels).
Since we've become aware of this problem, Tridge is adding such coverage to smbtorture so we won't get this problem again.
We’re also communicating with some of the Windows engineers and tridge has given a suggestion on how to fix this in the Windows implementation.
Jeremy Allison,
Samba Team.
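To make the shape of that mistake concrete, here is a toy sketch in Python (emphatically not Samba's actual code; the constants are invented): a new info level dispatched in the find-first path but forgotten in the find-next path, so nothing fails until the listing has to be continued past the first batch.

    # Toy illustration of the class of bug described above, not Samba's real code.
    INFO_STANDARD, INFO_ID_BOTH_DIRECTORY = 1, 0x105      # illustrative constants only

    def find_first(info_level, entries):
        if info_level in (INFO_STANDARD, INFO_ID_BOTH_DIRECTORY):   # new level handled
            return entries[:100]
        raise ValueError("invalid info level")

    def find_next(info_level, entries, offset):
        if info_level == INFO_STANDARD:                              # new level forgotten
            return entries[offset:offset + 100]
        raise ValueError("invalid info level")

    files = ["file%04d.txt" % i for i in range(250)]
    listing = find_first(INFO_ID_BOTH_DIRECTORY, files)              # first 100 names arrive fine
    try:
        listing += find_next(INFO_ID_BOTH_DIRECTORY, files, 100)     # the continuation call
    except ValueError as err:
        print("enumeration broke at file 101:", err)                 # what the fast-mode client sees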
There are some good ideas here. The bad ones are more fun though, and it’s still 04/01 here, so I’ll add a bad one…
– If it’s a low-memory machine, force IEnumXxx::Next to use the slow-method. This sucks for them, but oh well, it works.
– If it’s not a low-mem machine, send 2 queries for each call from IEnumXxx::Next, a slow and a fast. You wait for both to finish, compare the results to make sure they’re the same and return to the caller. When you get to 100+ with no error, drop the slow result set. If you do have an error after 100+, drop the fast set and move on with the slow set. If you end up with different results, drop the fast set and move on with the slow set. You chew up extra memory for the first few hundred records then things even out. You maybe even chew up 2 connections if the server can’t do both query types simultaneously on 1 connection. For a server that’s updated frequently enough to always cause the 2 queries to return different results, sucks for that user. And, you can totally hammer that server to death with your DOS scenario.
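A rough sketch of that dual-query idea, with query_fast and query_slow as hypothetical stand-ins for the two enumeration paths:

    # Run both enumerations, compare, and only trust the fast set when it matches.
    def enumerate_both(query_fast, query_slow, path):
        try:
            fast = list(query_fast(path))
        except OSError:
            fast = None                      # fast path failed (e.g. after entry 100)
        slow = list(query_slow(path))
        if fast is not None and fast == slow:
            return fast                      # server handled fast mode correctly
        return slow                          # failure or mismatch: trust the slow set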
My idea:
-> User connects to SMB
–> Popup comes up "The server you have connected to may have an error in it. If files are missing, please switch to slow mode."
This gives the user the ultimate choice, which evidently I am in favour of :-)
Microsoft can't be blamed, since it's not their error, and the 3rd parties will be lickety-split in updating when they find out their hardware is coming up with this error on every first connect.
It depends on how well you can detect the SMB version being run on the server, and unfortunately no new updates to SMB will solve that problem, due to the firmware being unable to be updated.
Jeremy Allison said: "I forgot to mention. The bug was open in our bug db for a grand total of *three* minutes before I had the fix committed into the SVN tree. I don’t think we could have reacted faster in getting a fix done than that."
I’ve never worked on a project where I can fix, compile, *test* and check-in within a 3 minute period of time.
I'd really love to see a "MS Vista development procedures vs [pick any other major project, OSS or otherwise] development procedures" analysis. It'd be interesting — I suspect Raymond has more hoops to jump through than many projects, but I may well be wrong.
Jeremy Borschen: "send 2 queries for each call from IEnumXxx::Next, a slow and a fast"
Sounds like it would work, but I doubt doubling server/network usage would be worth it. Most admins I know like to keep their bandwidth empty rather than using it.
John: a popup message? we can’t assume there is any kind of GUI attached to the process, nor that the process is interactive (i.e. there’s no user behind it watching what happens).
There seems to be a way that Samba can give its version. If I query my Samba server for the available shares I get this result:
Domain=[MYDOMAIN] OS=[Unix] Server=[Samba 3.0.21c]
Sharename    Type    Comment
---------    ----    -------
software     Disk    software dir
This clearly includes a software version. I don’t know whether a samba specific extension is used for this, but the information seems to be there.
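Assuming that banner really is visible to the client at connection time, extracting and comparing the version could look like this sketch; the cutoff release is a placeholder, and since the string is configurable it can only ever serve as a hint:

    import re

    # Sketch only: FIRST_FIXED is a placeholder, not the actual fixed release.
    BANNER = "Domain=[MYDOMAIN] OS=[Unix] Server=[Samba 3.0.21c]"

    def samba_version(banner):
        m = re.search(r"Server=\[Samba (\d+)\.(\d+)\.(\d+)", banner)
        return tuple(int(g) for g in m.groups()) if m else None

    FIRST_FIXED = (3, 0, 22)                         # hypothetical cutoff
    version = samba_version(BANNER)
    needs_slow_mode = version is not None and version < FIRST_FIXED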
"With my solution, Samba would still be interoperable. Fast mode is a performance thing, Microsoft don't have to support fast mode on a third party server if they don't want to. So the Samba guys wouldn't have interoperability as an excuse for violating copyright/patents."
You should read the Sega vs Accolade case. In short, Sega couldn't forbid 3rd parties from interoperating software with their hardware through the use of copyrights or trademarks, even if they have a licensing program available to allow people to develop for the hardware. There has yet to be a case, that I'm aware of, of a company producing a low end and high end solution and claiming that free interoperability with the former somehow negates any validity of attempts at interoperability with the latter. Given the Sega vs Accolade case, I don't see why they'd suddenly decide it's okay for a copyright/trademark holder to abuse the copyright/trademark system.
Terrific issue, as usual.
I haven’t seen many posters comment on how much of the final decision depends on details around the assumptions:
1) When was support for fast queries added?
2) How big a difference is there in the timing between slow and fast?
3) Where are you in the release cycle?
4) How likely is it that servers that have been released and have this flaw are going to be fixed, etc.
5) And there are certainly several more considerations…
If one believes that there are going to be lots of servers with the bug out there once the new OS is released, and that the difference between slow and fast is not too great, then the answer has got to be that when Windows gets the 'strange error code' back, it reissues the query in 'slow' mode.
That means that you get correct functionality all of the time and that as servers are upgraded, they still get the faster queries working. This solution means that the new client will be slower in some cases, but I suspect that it is a minority of cases. Caching could be applied, but I would only have it be dynamic to avoid all of the weird questions about where it goes, what it means, etc.
I once ran an interview for a programming job and we asked a few "which do you prefer" questions: windows/unix, vim/emacs, STL/boost, borland/ms, etc.
After a couple of these he said in his deep Russian accent, "Ah, you like religious debate".
My point is, I think a few people who have contributed should step back and consider the issue from a slightly more abstract point-of-view.
[as a regular reader I’m also waiting for Norman’s Marvin-the-robot-like comment anxiously]
I think that, both in the KB article and in the readme file, we should define a file server as "any device or computer that users can connect to in order to access files on it, including NASes and computers running file server software, whether it is commonly called a server or not".
Adam:
I wrote file://*net*/remotehost/… (assuming that automountd uses /net/).
Stefan
PS: the missing third slash was just a typo.-)
OK, it's not quite original. The user Meikel already suggested it, but I thought of it by myself.
Imagine if many SMB servers have bugs in 'Fast Mode' (not unlikely, because WinXP doesn't use it and nobody has tested 'Fast Mode' with a serious client). Then you will have to give up using 'Fast Mode'. Just forget it.
Create a new mode, called for example 'Fast Mode 2', that will be supported by Vista. It can even be the same as 'Fast Mode' but with a different identifier. In this way new servers supporting 'Fast Mode 2' won't have the bug in question, and probably not other bugs either, because when server developers add 'Fast Mode 2' they will test it with Windows Vista.
Disadvantage: You have to change the SMB protocol spec.
Advantages: No workarounds, no bugs, Vista remains pure.
"…certainly a nontrivial subset of people that don’t have a clue about how network protocols actually work will blame any problem on the most recent change." (BryanK)
People with a clue think that, too. In the absence of any other information, starting the troubleshooting process at the most recent change is an eminently reasonable thing to do.
It is quite humorous–I wonder how long MS knew about this bug, but Samba didn’t? If "everything gets blamed on Microsoft", wouldn’t it be to MS’s advantage to at least report the bug so it’d get fixed and everyone’s happy?
And don’t get me started on the lack of SMB documentation…
Stefan: Are you sure you’re not missing a slash again, and don’t mean:
file:///net/remotehost/
If you’re not missing a slash, what browser are you using?
Adam
8: Yes, I know that "smb:" works, but "smb:" is not "file:"
"smb:", by definition, works remotely. It’s a remote access protocol.
"file:", by definition, does not, unless you have the remote file mapped to your local FS tree in some way.
How much clearer can I be here?
Seems to me there are a number of requirements for an ideal fix.
User should not have to do anything, and they always get a complete file listing.
Windows should be able to detect bad servers and use the good old slow mode on them.
Windows should use the shiny new fast mode on servers that support it properly.
The fix should not involve too much housekeeping on the part of windows.
The fix should only come into operation when bad servers are detected. Those good little boys who keep their NAS/Linux servers up to date should not have to pay any performance penalty for those who don't.
The fix should not hammer the server trying to determine if it’s good/bad
If Samba was changed today to say it supports fast mode then windows can check this first.
It is not possible to start off with fast mode and then if it borks switch to slow. Files may have changed so we can’t just start over.
We could start off in slow mode; if a read involves more than 100 files, then silently in the background we could do a fast read and see if we get this error when expected. If we don't get this error, add the server to the fast list.
If we do get this error, add it to a slow list with a reminder to check again in x amount of time, and continue using slow mode until then.
Obviously, as Raymond said, list will need to be capped to prevent DoS attacks.
Advantages:
Users always get complete file listings.
On good servers, Windows will ultimately use fast mode.
On bad servers, Windows will never return a bad file list to a user/application.
Disadvantages:
Maintaining good/bad lists is a PITA. As with everything, it's a compromise between keeping enough of a list that we don't hammer servers, but not so much that the DoS attack has an impact.
File servers would have to service these extra background requests, which is extra load they shouldn't have to handle. These extra requests may have other undesirable effects I haven't thought of. They are just a hack to find out if the server is good/bad.
This still has the problem of a server that is marked as good, and then I get naughty and downgrade the firmware of that server to a bad version. If Windows doesn't check again then it will get to file 100 and bork. We can't force a re-check every time, as we default to a slow read.
Just makes you wonder how many other compatibility hacks are in Windows already. It's ugly, I don't like it, and I'd be tempted to say screw you, buggy servers. But then I'm not MS :) I'd only do something like this as a temporary measure, but that's not an option for Windows. Once this fix is in, it'll be there forever.
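For what it's worth, the capped good/bad list that several comments describe could be as small as this sketch: a fixed-size least-recently-used map, so a hostile site minting endless hostnames can only evict entries, never grow the table.

    from collections import OrderedDict

    # Minimal sketch of a capped good/bad cache: server name -> "fast" or "slow".
    class ServerModeCache:
        def __init__(self, cap=256):
            self.cap = cap
            self.entries = OrderedDict()

        def get(self, server):
            mode = self.entries.get(server)
            if mode is not None:
                self.entries.move_to_end(server)      # keep recently used servers alive
            return mode

        def set(self, server, mode):
            self.entries[server] = mode
            self.entries.move_to_end(server)
            if len(self.entries) > self.cap:
                self.entries.popitem(last=False)      # evict the least recently used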
Adam: You can’t. You’re right.
jg: yes, whitelisting with your method is better, or just put a flag in the connection state data and have no list, but indeed this behaviour makes the new fast mode a possible performance penalty.
What did Tridge suggest?
No problem. Samba should of course follow the published and freely-available SMB interoperability specifications from Microsoft.
What…? Oh, nevermind…
How slow can "slow" be if that’s all we have now?
When in doubt, use slow and reliable.
Raymond,
The Samba command-line tool "smbclient" prints exact version information from a remote SMB server when you ask it for a remote share listing using "-L", for example:
[17:17:13] [cambodia:8:~]$ smbclient -L //cambodia
Password:
Anonymous login successful
Domain=[WORKGROUP] OS=[Unix] Server=[Samba 3.0.20b-Debian]
On investigation with Ethereal, this information is passed from the server to the client as a multibyte string, apparently multiple times, during the course of communication. At least one of these times is what Ethereal calls an NTLMSSP_CHALLENGE response – the Samba version string is clearly part of the response frame.
It seems to indicate to me that Windows has a chance right at the start of authenticating to a new server, to note whether or not its version string matches one of the known bad versions.
What is stopping you using this information?
David.
Adam: "On top of that, this is only an attack on the local network"
a1.hackersite.com is a valid Internet hostname.
AC: "AFAIK, most ISPs filter the SMB ports"
The attacker knows it and doesn’t use such an ISP.
mph: "Having the ability to type "\\server\share" in the address bar doesn't mean you also have to accept such syntax where real URLs are required (like "a" tags in HTML documents)."
Great. Some guy reports an error with his NAS setup and as a result the RTM version of Vista will not support hypertext links to file shares. Who do you think will complain now?
Adam: "But you *can’t* generally use file URIs like that to open remote files with other browsers (or on other OSs)."
It (smb://) works in Konqueror (KDE) and Nautilus (Gnome).
Amos: "you can stop the thieving samba hippies from impersonating a real Windows machine [because] I have an irrational hatred for those open source guys"
Oh boy, RMS watch out!
Based on all this, I want to propose a fix that combines several solutions: add a static variable to the FindNextFile function that’s incremented each call if it’s >0. If it’s <0 use FileBothDirectoryInformation. Otherwise, use NtQueryDirectoryFile (if the server supports it). If it returns STATUS_NO_MORE_FILES, then if it’s the 129th call, and FileBothDirectoryInformation returns STATUS_INVALID_LEVEL, negate the variable so it’s negative, disconnect, reconnect and fetch the first 128 files again, possibly using the scheduler and a mutex to make the API more responsive. Return EAGAIN.
Advantages:
* Windows auto-detects the problem and works around it.
Disadvantages:
* Windows doesn’t remain pure and unsullied by compatibility hacks
I thought about it a little more while walking the dogs:
a) My reasoning was incomplete: this is a stateless method of detecting "slow only" servers, and could cause the fast codepaths to proxy on to the slow codepaths in the case of some flag in the connection structure being set.
It requires zero bandwidth wastage, and avoids all the silly tricks other people have been suggesting (the mkdir one was my personal favourite).
b) Embedding vendors may deliberately change the version string, either appending a suffix (as my Debian install does above), or in its entirety.
Perhaps a "known good" list could be used to match the server version against, instead. That might give future embedders more motivation for not masquerading the real software versions they are using.
Having said this, though, I would still assume Microsoft's testing division might be large enough to build a "known bad" list big enough in practice to cover 95%+ of NASes on the market, or those that have been discontinued.
If some popular/obscure NAS is forgotten, a small hotfix for the "known bad" list could be released through Windows Update, or via MSKB.
I am assuming this is a valid fix; I know next to nothing about the layering of the code in question.
Thanks again,
David.
Default to slow (for Samba), on a network driver level.
Upon first connect, put in list with (Unknown).
Every time a query is done, see if 100 or more records are returned. If so, do a dummy, identical fast check in the background. If no error is returned, mark as (Fast). If an error is returned, mark as (Slow).
Advantages:
– Transparent to user
– Never incorrect results
– Doesn’t matter if server rotates out of cache
Disadvantages:
– Perf hit on small queries against Samba servers until a big one is done
– Perf hit on the first large query against a Samba server
– Have to store server list
If you store the list in the registry, fast mode can be forced for known-good boxes.
My suggestion is suspiciously like jg’s.
Sorry, didn’t mean to look like a thief.
A variation on the "FastSamba" suggestion that’s been made already. I’m assuming that there is at least a little bit of room for this string to grow. So how about now, retroactively, defining a protocol for versioning that uses this string? I’m sure this won’t be the last case where you want to work around bugs in "all versions older than X" of a particular family of clients, and that applies to Windows too.
So how about saying that if the string contains a "#", for example (you can pick a different character if "#" is in use a lot already, perhaps something weird and unprintable (or unicodey, if the protocol allows)), the part of the string after the "#" is a version number.
Then the Samba folks can keep their configurable family string but *whatever* it’s configured to, add a "#4.x" or whatever to the end of it.
This allows you to address this particular problem (if the version string is present at all, the bug isn’t) and also provides a framework for future such workarounds without this kind of angst.
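A small sketch of that convention; the '#' marker and the version format are the commenter's proposal, not anything the protocol defines today:

    # Anything after '#' in the family string is treated as a version stamp;
    # its mere presence is taken to imply the find-next fix is in.
    def family_and_version(family_string):
        name, sep, version = family_string.partition("#")
        return name, (version if sep else None)

    name, version = family_and_version("Samba#4.0.1")
    assume_fast_mode_ok = version is not None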
Having said that, though, more importantly than any specific suggestion is the fact that you should be discussing this issue *with the Samba team* rather than with random people on your blog. How do *they* want you to address it?
(And I have to agree with the irony of the situation, when the Samba team has struggled for years – more than a decade? – trying to find and work around the bugs and undocumented behaviors of SMB in all the different versions of Windows. Without any cooperation from Microsoft. One bug in Samba, in a feature that's never been used by any Windows version…? One could even call it a "taste of your own medicine", except that clearly the Samba team have been more than cooperative already in fixing the bug in current versions.)
David Wilson: "a small hotfix for the "known bad" list could be released through Windows Update, or via MSKB"
I don’t think that’ll ever happen unless it’s profitable. They could call it something else though instead of hotfix.
Adam: sorry, of course file:///net/
First I want to admit that I didn’t read anyone else’s responses so I’m sorry if this has already been accepted or rejected. Here’s my solution:
1) Create a cache of known file servers. More than 16… maybe 256 of them. This cache would store the server info plus fast/slow preference for this server.
2) Whenever a server is accessed first check for a known server in the cache.
3) If found, use the preferred settings (fast/slow). If fast is used and an error occurs, then reset the preference to slow.
4) If not found, create a dummy folder, throw more than 100 dummy files into it, and read it back (see the sketch below). If an error occurs, then cache the server with a preference for slow access. If no error, cache it with fast access. I know this is slow, but it only happens the first time a server is ever accessed.
5) Live happily accessing the good servers in fast mode and not losing files from the bad servers.
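A sketch of the probe in step 4, assuming the share can be reached as an ordinary writable path and with fast_listing standing in for the fast-mode enumeration:

    import os, tempfile

    # Write-probe: create >100 dummy files on the share, then try a fast listing.
    def probe_server(share_root, fast_listing, count=120):
        probe_dir = tempfile.mkdtemp(prefix="probe_", dir=share_root)
        try:
            for i in range(count):
                open(os.path.join(probe_dir, "dummy%03d.tmp" % i), "w").close()
            try:
                return len(list(fast_listing(probe_dir))) >= count   # survived past 100 entries
            except OSError:
                return False                                         # the find-next bug bit us
        finally:
            for name in os.listdir(probe_dir):
                os.remove(os.path.join(probe_dir, name))
            os.rmdir(probe_dir)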
Thanks for a great blog. It’s a hoot!
Richard
Adam,
You wrote:
> Btw, it’s not file://\servershare, it’s file:///\servershare, and you did claim that file://server/share was "the correct notation". Also, as I pointed out, I am also aware that file://server/share works, it’s just not correct.
May I ask where you are getting your information on the file URI?
I’d suggest you read the following excellent blog post by ZekeL describing the problems with the file URI form you’re promoting:
http://blogs.msdn.com/freeassociations/archive/2005/05/19/420059.aspx
The following is section 3.10 of RFC 1738 "Uniform Resource Locators (URL)" <http://www.ietf.org/rfc/rfc1738.txt> that describes file URIs.
Thanks,
Dave
3.10 FILES
The file URL scheme is used to designate files accessible on a particular host computer. This scheme, unlike most other URL schemes, does not designate a resource that is universally accessible over the Internet.
A file URL takes the form:
file://<host>/<path>
where <host> is the fully qualified domain name of the system on which the <path> is accessible, and <path> is a hierarchical directory path of the form <directory>/<directory>/…/<name>.
For example, a VMS file
DISK$USER:[MY.NOTES]NOTE123456.TXT
might become
<URL:file://vms.host.edu/disk$user/my/notes/note12345.txt>
As a special case, <host> can be the string "localhost" or the empty string; this is interpreted as `the machine from which the URL is being interpreted’.
The file URL scheme is unusual in that it does not specify an Internet protocol or access method for such files; as such, its utility in network protocols between hosts is limited.
DavRiS:
Yes – the portion between the 2nd and 3rd / is a hostname.
And like the portion of the RFC you quoted:
"This scheme, unlike most other URL schemes, does not designate a resource that is universally accessible over the Internet."
So, any file that is on another host is not a resource that is accessible over the internet.
The only way to reliably access files with the "file:" schema is when "<host> [is] the string "localhost" or the empty string; this is interpreted as `the machine from which the URL is being interpreted’."
So "file:" URIs, in order to be accessible, must have <host> as the local host name, "localhost", or empty. i.e. "file:///"
After that, you put the absolute path of any file in your local filesystem’s namespace. Which, on Windows, is either of the form:
c:\path\to\file.txt
\\server\share\path\to\file.txt
Which gives us the following URLs:
file:///c:\path\to\file.txt
file:///\\server\share\path\to\file.txt
I don't see where the post you mention points out any problems with the file:///\\server\share… schema, or gives any justification for why file://server/share/… is better, considering that, as the RFC points out:
"[The file protocol] does not designate a resource that is universally accessible over the Internet. […] its utility in network protocols between hosts is limited."
The "file:" schema is used to access files in your local filesystem’s namespace.
All smb shares on your network are visible in your local filesystem’s namespace.
In order to access a file in your local filesystem’s namespace, including remote files that are mapped into it, you specify "file:///" followed by the path in your local filesystem’s namespace for the file you want to access.
That’s the clearest I can put it.
If you can’t understand that then answer this:
If the "file:" schema can be used to access files on remote hosts, then what transport protocol does the "file:" protocol run over (e.g. TCP/IP), what "port" (or equivalent) does it run on (e.g. 138, 139, 445), and where can I see a single piece of documentation about this remote "file:" protocol? I’d like to implement it for UNIX.
Hint: "file:" is not "smb:", and any answer that claims "file:" is "smb:" is wrong. I already have "smb://server/share/path/to/file.txt" on my computer. Now I want this remote "file:" protocol you speak of.
Remember what the "U" in "URI" stands for.
Stefan:
Right – I was confused because you put "net" in the "hostname" part of a URL.
Well, I’d expect both of the following equivalent URIs:
file:///net/server/….
file://localhost/net/server…
to access whatever exists at:
/net/server/…
in the filesystem on your computer. Three slashes and then whatever exists in your local filesystem. That’s entirely my point.
On UNIX, remote files can be part of your local filesystem if mounted there. To access them with a file URI is just:
file:///mnt/remote-mount/path/to/file.txt
On Windows, remote files shared over smb are automatically part of your local filesystem via an automatic mounting at //. Under windows, the following are both paths in your local filesystem:
c:\path\to\file.txt
\\server\share\path\to\file.txt
even though one is not a local file, just as the following are both paths in your local filesystem under unix:
/etc/passwd
/mnt/some-remote-nfs-share/path/to/file.txt
even though (again) one is not a local file.
"file" URIs made from those four local paths are therefore:
file:///c:\path\to\file.txt
file:///\\server\share\path\to\file.txt
file:///etc/passwd
file:///mnt/some-remote-nfs-share/path/to/file.txt
Is this making sense yet?
Could one solution to this problem be to modify FindFirst to read ahead at least 101 entries on file shares? Most of the time FindFirst is called, FindNext is called shortly after, so read-ahead would be an optimization in most cases.
This problem gets worse because of the lack of an API to get the whole dir in one call.
FindFirst/FindNext are very unoptimized for large dirs.
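A sketch of that read-ahead, with query_fast and query_slow as stand-ins for the two enumeration modes; because nothing is handed to the caller until the first 101 entries are in hand, falling back never produces a partial or inconsistent listing:

    # Buffer the first 101 entries before yielding anything to the caller.
    def read_ahead_listing(query_fast, query_slow, path, threshold=101):
        try:
            it = query_fast(path)
            buffered = []
            for entry in it:
                buffered.append(entry)
                if len(buffered) >= threshold:        # made it past the danger zone
                    break
        except OSError:
            yield from query_slow(path)               # nothing yielded yet: safe restart
            return
        yield from buffered
        yield from it                                 # keep streaming in fast mode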
Adam: just a nitpick, but URLs require the directory separator to be a forward slash, so yours should be:
file:///c:/path/to/file.txt
file://///server/share/path/to/file.txt
I'm not sure if the latter is actually a valid file URL, syntactically speaking, but it is likely the most correct way of expressing it. The syntax calls for the part following the third slash to be a slash-separated directory path followed by a filename. The '//' at the start of the UNC name doesn't really follow this pattern.
Adam,
There are a couple of problems with putting a Windows file path after 'file://' and calling it a file URI (although I think (2) provides the only real-world problems).
(1) Windows file paths may contain characters that aren't allowed in URIs. This includes '\', '<', ' ', all int'l characters outside of US-ASCII, etc.
(2) URIs use percent-encoding to encode characters that aren't allowed in a URI or that would interfere with the parsing of a URI, where Windows file paths don't have a concept of encoding characters. This is a problem if your Windows file path contains a '%', '#', or '?'. If you have a file named 'C:\tmp\My Phone #s.txt' the corresponding file URI should be 'file:///C:/tmp/My%20Phone%20%23s.txt'. If you simply take the Windows file path and stick it after 'file://' you end up with 'file://C:\tmp\My Phone #s.txt'. If you were to parse this URI (ignoring the spaces, which technically aren't allowed in a URI) the 's.txt' at the end of the URI ends up as the fragment rather than part of the path. Similar problems occur if you have a '%' in your Windows file path.
Problem (2) is what Zeke was describing in the blog post to which I linked previously.
Additionally, file URIs aren’t intended to only access files from your local filesystem. As noted in RFC 1738:
"The file URL scheme is used to designate files accessible on a particular host computer."
The method of access is left up to the resolving system, and resolving the URI is very much dependent on what system does the resolving; however, file URIs weren't intended only for the local file system. The lack of specification here does limit the use of the file URI when going between multiple systems. I'm aware that the U in URI stands for Uniform; however, as the RFC points out, the file scheme is unusual in this fashion.
-Dave
Sorry, that should be a ‘{‘ not a ‘<‘ under (1).
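As a concrete illustration of Dave's point (2), Python's pathlib happens to apply these percent-encoding rules when it builds file URIs from Windows paths (this is offered only as an illustration of the encoding, not as a claim about what any particular browser does):

    from pathlib import PureWindowsPath

    PureWindowsPath(r"C:\tmp\My Phone #s.txt").as_uri()
    # -> 'file:///C:/tmp/My%20Phone%20%23s.txt'   (the space and '#' get escaped)

    PureWindowsPath(r"\\server\share\path\to\file.txt").as_uri()
    # -> 'file://server/share/path/to/file.txt'   (the UNC host lands in the authority slot)

Notably, for the UNC case it emits the file://server/share form rather than file:///\\server\share, which is essentially the notation the linked post argues for.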
Adam: in your answer to DavRis you omitted the "universally" twice before "accessible". "file:// does not need to be universally accessible over the internet" does not mean that it’s not accessible at all.
{Windows,Internet} Explorer both handle file://remotehost/path/to/file.txt transparently, thanks to the redirector, mapping file:// to either
– local accesses
– NetBIOS over NetBEUI
– NetBIOS over IPX
– NetBIOS over TCP
– CIFS/SMB
– any third party add-on to the redirector.
This implementation is correct with regard to RFC1738 (and its updates), RFC1739 and RFC2151, the latter two using the same (informal) description:
file://host/directory/file-name
Identifies a specific file. E.g., the file htmlasst in the edu
directory at host ftp.cs.da would be denoted, using the full URL
form: <URL:file://ftp.cs.da/edu/htmlasst>.
(I should have taken a closer look into the RFCs earlier instead of using my memory only to avoid this side-thread here altogether:-(
Sorry Raymond!
Using "ftp" as DNS hostname in the example was a bad or ambiguous choice, since file:// clearly does not denote ftp:// here!
Before you start to implement it for UNIX: http://curl.haxx.se/ exists and supports file:// already, although not in the way you want or the way the above-referenced RFCs describe it.
(Even on Windows cURL only supports "localhost", discarding ANY other hostname given silently, retrieving files from local and UNC paths.)
It’s up to you to come up with a RFC-compliant implementation on UNIX now.-) Maybe you supply a correction to cURL for both UNIX and Windows? <bg>
Jules: see RFC1738 and its updates,
file://server/path is perfectly right.
Stefan
Someone mentioned that Samba can't change the family string for the fixed versions because it's a user-configurable setting; that makes things even better. Vista could look for some indication in the family string that the server will support fast mode, and then Samba can tell their users to use the appropriate configuration if they don't have a broken version (either setting a specific string or having a new configuration option that will add something). If Vista does have to default to slow mode, it should do it in the most restrictive way possible; if this means all Samba servers by default, at least others can still take advantage of the fast mode.
Another solution would be to make the blacklist process manual; when the error is encountered, the user is given the choice to always use slow mode with that server. This could get annoying though, and make the spamming problem even worse.
This problem should be resolved by management, not code monkeys.
Saturday, April 01, 2006 8:24 AM by Adam
> Stefan Kanthak:
>>
>> The correct notation is
>> file://some.server/… and should work in
>> other browsers running on Windows too
>
> No, the correct notation for local files is
> file:///c:\path\to\file.txt
Microsoft and at least one hardware vendor disagree with both of you. At the moment I’m looking at Pocket Internet Explorer’s display of an unnameable vendor’s built-in home page:
file://\windows\default_HP.htm
> Now, it has to be said that Internet Explorer
> (and Windows Explorer, which might be the
> same application, but I can never tell)
If you were using Windows Explorer and didn’t even have Internet Explorer open, and Windows puts up a message box (which no one ever reads) saying that Internet Explorer crashed, then you can tell that they’re the same application.
Sunday, April 02, 2006 9:56 PM by steveg
> [as a regular reader I’m also waiting for
> Norman’s Marvin-the-robot-like comment
> anxiously]
What an incredibly stupid machine.
(Sorry, the wording is probably inaccurate because a few decades have passed since I read it. Anyway, thanks for the memory.)
Monday, April 03, 2006 4:42 AM by Rich
> In the absence of any other information,
> starting the troubleshooting process at the
> most recent change is an eminently reasonable
> thing to do.
Oops. I agree with that, and Microsoft agrees with that (they say so in white fixed-width text on a background which is turning gray), which means I now agree with Microsoft.
Monday, April 03, 2006 7:30 AM by 8
> AC: "AFAIK, most ISPs filter the SMB ports"
> The attacker knows it and doesn’t use such
> an ISP.
True. My firewall logs often include attacks on SMB ports. With my former ISP, my firewall logs included several attacks each day coming from US military sites. Hope this makes someone feel secure.