What is in the Index.dat files?

As already mentioned, the index.dat files are binary files. Their content can be viewed only with a binary (hex) editor. We will examine an index.dat file from the Temporary Internet Files folder (the Internet Explorer cache). First, let's take a look at the index.dat file header:

[Figure: binary (hex) dump of the index.dat file header]

The index.dat header is actually much larger, but this is its most important part. The first item is the version of the index.dat file format (Client UrlCache MMF Ver 4.7). This particular file is from Internet Explorer version 4, but the index.dat file format is very similar in Internet Explorer 5, 6, 7 and 8.
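The version string can also be read programmatically. The following is only a minimal sketch: it assumes the file begins with the ASCII text "Client UrlCache MMF Ver" followed by a major.minor version number, as in the sample above. The function name cache_format_version is made up for this illustration.

```python
import re

# Assumption: the file starts with an ASCII signature such as
# b"Client UrlCache MMF Ver 4.7", as seen in the sample header.
SIGNATURE = re.compile(rb"Client UrlCache MMF Ver (\d+)\.(\d+)")


def cache_format_version(header: bytes):
    """Return (major, minor) from the signature, or None if it is absent."""
    match = SIGNATURE.match(header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Synthetic header bytes, not taken from a real file.
sample = b"Client UrlCache MMF Ver 4.7\x00" + bytes(4)
print(cache_format_version(sample))
```

On a real file you would pass the first few dozen bytes read in binary mode, for example open(path, "rb").read(64).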

The next important item in the header is the list of names of the four subfolders that hold the cached files from the Internet (these names are not present in the headers of the index.dat files for cookies and history). The subfolders are located in the same folder as the index.dat file, and in this case their names are 49EDE5UVC, GHIZ8LMVB, EBWNUZWLB and G48NSH4S. On your PC there may be more than four such folders (depending on the size of the index.dat file), and their names will be different.

The real content of the index.dat files usually starts at byte offset 4000h or 5000h from the beginning of the file. The index.dat file is composed of many records of four different types: HASH, URL, LEAK and REDR.
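To get a feel for this structure, you can count how many records of each type a file contains. The sketch below rests on assumptions that are not stated in this article: that each record begins with a four-byte ASCII tag (with URL padded by a trailing space) and that records start on 128-byte boundaries. The helper name count_record_tags is hypothetical.

```python
from collections import Counter

# Assumption: each record starts with one of these 4-byte ASCII tags.
RECORD_TAGS = (b"HASH", b"URL ", b"LEAK", b"REDR")
# Assumption: records begin on 128-byte (80h) boundaries.
BLOCK_SIZE = 0x80


def count_record_tags(data: bytes, start: int = 0x4000) -> Counter:
    """Count record tags found at block boundaries from `start` onward."""
    counts = Counter()
    for offset in range(start, len(data) - 3, BLOCK_SIZE):
        tag = data[offset:offset + 4]
        if tag in RECORD_TAGS:
            counts[tag.decode("ascii").strip()] += 1
    return counts


# Synthetic data: 4000h bytes of padding followed by four one-block records.
blocks = [b"HASH", b"URL ", b"URL ", b"REDR"]
sample = bytes(0x4000) + b"".join(tag + bytes(BLOCK_SIZE - 4) for tag in blocks)
print(count_record_tags(sample))
```

Because a record can span several blocks, its data may by chance contain a tag at a block boundary, so treat the counts as approximate. Starting the scan at 4000h is harmless for files whose records begin at 5000h, since the blocks in between simply will not match.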

HASH records are the largest, but they don't contain any private information. They are just hash indexes of the contents of the index.dat file. A larger file can contain many such records.

The vast majority of the index.dat records are of types URL, LEAK and REDR, and they have a fairly similar layout. Look at this sample URL record.

[Figure: binary (hex) dump of the index.dat URL record]

As you can see, there is a lot of information here. First, there is the encoded date and time at which this picture (icon_hardware.gif) was loaded from the Internet. The date and time are encoded in binary format in the second row of the hex dump. Next comes http://www.aceshardware.com/site/images/icon_hardware.gif, which is the full URL of the loaded file. The name of the local copy of the file (which is stored in one of the four subfolders next to the index.dat file) is icon_hardware.gif. After that comes the full HTTP header of the Web server's response:

HTTP/1.0 200 OK
ETag: "AAAAOl01l7Q"
Content-Type: image/gif
Content-Length: 1234
X-Cache: MISS from proxy.office.devolti.com
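Once the header text has been extracted from a record, it is easy to split it into a status line and individual fields. Here is a minimal sketch that uses the header above as sample input. The function name parse_http_header is made up for this illustration, and the code assumes one "Name: value" pair per line.

```python
def parse_http_header(text: str):
    """Split stored HTTP header text into (status_line, {name: value})."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "", {}
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return lines[0], headers


sample = (
    "HTTP/1.0 200 OK\r\n"
    'ETag: "AAAAOl01l7Q"\r\n'
    "Content-Type: image/gif\r\n"
    "Content-Length: 1234\r\n"
    "X-Cache: MISS from proxy.office.devolti.com\r\n"
)
status, fields = parse_http_header(sample)
print(status)
print(fields["Content-Type"], fields["X-Cache"])
```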

Last but not least, the record contains the name of the user account: Administrator. Obviously, all this information is potentially dangerous because it reveals who accessed a given Internet page, when they accessed it, and what the Web server's response was. If you clean the Internet cache (Temporary Internet Files), the local copies of the cached files are deleted, but most of the index.dat file records are left almost untouched. The same is true for the history and cookies.
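The binary date and time mentioned earlier can be turned into a readable value. This article does not name the encoding, so the sketch below assumes the common Windows FILETIME format: a 64-bit little-endian count of 100-nanosecond intervals since January 1, 1601 (UTC). The position of the timestamp inside the record is not covered here, and the function name filetime_to_datetime is hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are Windows FILETIME values (100 ns ticks
# counted from 1601-01-01 UTC, stored as a little-endian 64-bit integer).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Convert 8 raw bytes holding a FILETIME value to a UTC datetime."""
    (ticks,) = struct.unpack("<Q", raw)
    # Ten ticks make one microsecond; finer precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Synthetic value: the FILETIME for midnight on January 1, 1970 (UTC).
raw = struct.pack("<Q", 116444736000000000)
print(filetime_to_datetime(raw).isoformat())
```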

The empty space in index.dat files is filled with junk (most often zeros, but it can also be various meaningless sequences) or, in some areas, with the "magic" sequence 0BADF00Dh (BAD FOOD). Obviously Microsoft developers are not without a sense of humor. The BAD FOOD parts of the file are deleted records, and they aren't a privacy threat.
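If you are curious how much of a file consists of this fill pattern, you can count its occurrences. The sketch below assumes the value is stored as a little-endian 32-bit number on 4-byte boundaries, so it appears in a hex dump as the bytes 0D F0 AD 0B. The function name count_fill_words is made up for this illustration.

```python
import struct

# Assumption: 0BADF00Dh is stored as a little-endian 32-bit value,
# which gives the byte sequence 0D F0 AD 0B in the file.
FILL = struct.pack("<I", 0x0BADF00D)


def count_fill_words(data: bytes) -> int:
    """Count 4-byte-aligned occurrences of the fill pattern."""
    return sum(
        1
        for offset in range(0, len(data) - 3, 4)
        if data[offset:offset + 4] == FILL
    )


# Synthetic data: three fill words, eight zero bytes, one more fill word.
sample = FILL * 3 + bytes(8) + FILL
print(count_fill_words(sample))
```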

You can use Mil Shield to clean the content of index.dat files along with history, cookies, cache and many other tracks.

Copyright © 2003-2014 Mil Incorporated. All rights reserved.