How WinXP can make non-MS files invisible

Put that paranoia away - it's a notabug

Updated again: Windows XP's search system includes a bizarre feature that appears, under some conditions, to exclude files with non-Microsoft file extensions. It is, however, so odd that it's surely got to be a bug rather than monkey business. But you could go as far as saying it's one of those MS things that inconvenience other companies if they don't do things the new way we're doing them in Redmond.

But in this case, it's largely just a minor inconvenience, albeit one that can easily baffle users, and did this one.

Here's how you can verify it. Go to an XP directory where you know you've got files with both Microsoft and non-Microsoft extensions. Search for *.doc, or another Microsoft extension of your choice, and it will show up. Search for a non-Microsoft extension in the same way, and it'll show up too. Obviously.

Now, search by the extension and also for a string that you know is going to be in the file. For documents, "the" or "and" would be a pretty good bet; for C++ files (*.cpp) you're inevitably going to get a "for". Search still finds the file with the Microsoft extension, but magically, it can't find the one with the non-Microsoft extension.
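If you'd like to be sure the string really is in the files before blaming XP, a rough cross-check that doesn't go anywhere near the built-in search might look like the sketch below. It is our own illustration, not anything Microsoft ships: the function name files_containing is made up, and it does a raw byte match, which is fine for plain-text source files such as *.cpp, *.java or *.js but can miss text inside binary formats such as *.doc.

```python
import fnmatch
import os


def files_containing(root, pattern, needle):
    """Return sorted paths under root whose name matches pattern
    and whose raw bytes contain needle."""
    needle_bytes = needle.encode("utf-8")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare case-insensitively, as Windows file names are.
            if not fnmatch.fnmatch(name.lower(), pattern.lower()):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    if needle_bytes in handle.read():
                        hits.append(path)
            except OSError:
                # Unreadable file (locked, no permission): skip it.
                continue
    return sorted(hits)
```

Calling files_containing with a directory, "*.cpp" and "for" should list every matching file that holds the string, whatever XP's own search says about it.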

Thanks to the reader who pointed us at this one. He checked with *.java and *.wpd, and we've just checked it with *.ddf (Musicmatch) and *.js (JavaScript). But does it apply to every non-MS extension, and if so, why?

Later

We've a lot of mail coming in about this one. Changing associations and extensions seems not to change the result. Changing a .txt file to a .cpp while maintaining the Notepad association doesn't make it findable. Associating a .isu file with Notepad does not make it findable, and changing its extension to .txt still does not make it findable.

And it doesn't seem to be anything to do with the capabilities of the indexing service, because that is switched off on the machine we're trying it on.

Or is it? In Windows 2000 the search defaulted to treating everything as .txt, so it'd crunch through everything it didn't understand. WinXP (our thanks to Alex Feinman and the reader who pointed us at his explanation) doesn't default to .txt, and ignores everything it doesn't have a filter for. This, clearly, is nothing to do with whether you've got the indexing service switched on or off, because on the machine we're using here the service has never been switched on in the first place.

So that thought was a red herring, and with hindsight a dumb thing to think in the first place. XP has its own batch of Microsoft filters that install with the software, and Office XP comes with a few more. If search finds a file extension it doesn't have a filter for, then it skips the file. Developers who need the search system to be able to search inside their files therefore now have to produce filters for them, or to register them with one of the Microsoft filters.
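To make the difference between the two behaviours concrete, here is a toy model in Python. It is only a sketch under our own assumptions: the FILTERS table, the filter names and both function names are hypothetical, and none of it reflects how XP actually stores or looks up its filters.

```python
import os

# Hypothetical filter table: extension -> name of the filter that handles it.
# The entries are illustrative, not a list of what XP really installs.
FILTERS = {".txt": "plain-text", ".doc": "office", ".htm": "html"}


def filter_for(filename, filters, default_filter=None):
    """Return the filter that would read filename, or None if it is skipped."""
    ext = os.path.splitext(filename)[1].lower()
    return filters.get(ext, default_filter)


def content_search_candidates(filenames, filters, default_filter=None):
    """Keep only the files whose contents would actually be searched."""
    return [
        name for name in filenames
        if filter_for(name, filters, default_filter) is not None
    ]
```

With default_filter left as None the model behaves the way XP is described above, so Main.java and letter.wpd simply drop out of a content search. Pass default_filter="plain-text" and you get the Windows 2000 behaviour, where everything unrecognised is read as text.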

This doesn't, as far as we can see, explain how you can change the extension of a file with no filter for it to .txt, change its association to Notepad for good measure, and still not be able to find it. So there's surely something more in there, but it does seem to be related to filters.

Ah, but we seem to be there now. Thanks to Ami for telling us: "The filters are probably invoked based on file-association, but ignore files that either don't have the correct extension or don't match the format (most file formats have a 'magic' number at the beginning that confirms the format is what the user says it is)." Gotcha, we think.
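The format half of Ami's suggestion, a filter declining a file whose leading bytes don't match what it expects, can be sketched like this. The two signatures are well-known ones (the OLE2 compound-file header used by .doc, and the "%PDF" marker), but the filter names and the function are our own invention, and we are not claiming this is how XP's filters really decide.

```python
# Well-known leading signatures; the filter names are hypothetical.
SIGNATURES = {
    "office": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",  # OLE2 compound file (.doc)
    "pdf": b"%PDF",
}


def filter_accepts(filter_name, data):
    """True if data starts with the signature the named filter expects."""
    magic = SIGNATURES.get(filter_name)
    if magic is None:
        # No signature recorded for this filter, so there is nothing to check.
        return True
    return data.startswith(magic)
```

Feed it the first few bytes of a file: a text file merely renamed to .doc fails the "office" check, while a genuine Word document passes.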

Why did Microsoft make the switch? Skipping huge MP3s that don't need to be searched would speed things up considerably and, if you were using the indexing service, keep the index size down. Alex Feinman has a fuller explanation here, plus a routine that will change the behaviour if you want to do so. ®
