What is in the Index.dat files?
As already mentioned, the index.dat files are binary files. Their content can be viewed only
with a binary (hex) editor. We will examine an index.dat file from the Temporary Internet Files folder (the Internet Explorer cache). First, let's take a look at the index.dat file header:
Actually, the index.dat header is much larger, but this is the most important part of it.
The first item is the version of the index.dat file (Client UrlCache MMF Ver 4.7). This
particular file is from Internet Explorer version 4, but the index.dat file format is very similar in Internet Explorer
5, 6, 7 and 8.
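The version check described above can be sketched in a few lines of Python. This is only an illustration: it assumes the signature is a zero-terminated ASCII string at the very start of the file, and the function name read_cache_version is made up for this example.

```python
import re
from typing import Optional


def read_cache_version(data: bytes) -> Optional[str]:
    """Return the version from the signature (e.g. '4.7'), or None."""
    # Assumption: the signature is an ASCII string at offset 0,
    # terminated by a zero byte within the first 64 bytes.
    end = data.find(b"\x00", 0, 64)
    raw = data[:end] if end != -1 else data[:64]
    text = raw.decode("ascii", errors="replace")
    match = re.fullmatch(r"Client UrlCache MMF Ver (\d+\.\d+)", text)
    return match.group(1) if match else None
```

To try it on a real file, read the first 64 bytes in binary mode (open(path, "rb").read(64)) and pass them to the function. A result of None means the file does not start with the expected signature.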
The next important item in the header is the list of names of the four subfolders in which
the cached files from the Internet are stored (these names are not present in the headers of the index.dat files
for cookies and history). These subfolders
are located in the same folder as the index.dat file, and in this case their names are
49EDE5UVC, GHIZ8LMVB, EBWNUZWLB and G48NSH4S. On your PC there
may be more than four such folders (depending on the size of the index.dat file), and their names
will be different.
The real content of the index.dat files usually starts at byte offset 4000h or 5000h from
the beginning of the file. The index.dat file is composed of many records of four different types:
HASH, URL, LEAK and REDR.
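A rough way to see how the four record types are distributed in a file is to walk it block by block and count the signatures. The sketch below rests on two assumptions that are not stated in this article: that records are aligned on 128-byte (80h) blocks, and that each record begins with a 4-byte tag ("URL " is padded with a space) followed by a 32-bit little-endian count of the blocks it occupies. The name count_records is hypothetical.

```python
from collections import Counter

BLOCK_SIZE = 0x80  # assumed record alignment (128 bytes)
SIGNATURES = (b"HASH", b"URL ", b"LEAK", b"REDR")


def count_records(data: bytes, start: int = 0x4000) -> Counter:
    """Count records of each type, scanning from offset `start`."""
    counts = Counter()
    offset = start
    while offset + 8 <= len(data):
        tag = data[offset:offset + 4]
        if tag in SIGNATURES:
            # Assumed layout: the tag is followed by the record length
            # expressed as a number of 128-byte blocks.
            blocks = int.from_bytes(data[offset + 4:offset + 8], "little")
            counts[tag.decode("ascii").strip()] += 1
            offset += max(blocks, 1) * BLOCK_SIZE
        else:
            # Not a record start: move on to the next block.
            offset += BLOCK_SIZE
    return counts
```

Scanning from 4000h also covers files whose content starts at 5000h, because the blocks in between simply do not match any signature and are skipped.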
HASH records are the largest, but
they don't contain any private information. They are just hash indexes of the contents
of the index.dat file. In larger files there can be many such records.
The vast majority of the index.dat records are of types URL, LEAK and REDR.
They have a fairly similar layout. Look at this sample URL record.
As you can see, there is a lot of information here. First, there is the encoded date and time
at which this picture (icon_hardware.gif) was loaded from the Internet. The date and
time are encoded in binary format in the second row of the hex dump. Next comes
http://www.aceshardware.com/site/images/icon_hardware.gif, which is the full URL
of the loaded file. The
name of the local copy of the file (which is stored in one of the four subfolders of the index.dat
folder) is icon_hardware.gif. Next is the full HTTP header of the Web server's
response:
HTTP/1.0 200 OK
ETag: "AAAAOl01l7Q"
Content-Type: image/gif
Content-Length: 1234
X-Cache: MISS from proxy.office.devolti.com
Last but not least, the record contains the name of the user account: Administrator. Obviously, all
this information is potentially dangerous because it reveals who accessed a given Internet page, when they did so, and
what the Web server's response was. If you clean the Internet cache (Temporary Internet Files), the
local copies of the cached files are deleted, but most of the index.dat records are left almost untouched. The same
is true for the history and cookies.
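The binary date and time mentioned in the sample record can be turned into a readable value with a short conversion. This sketch assumes the timestamp is stored as a Windows FILETIME, that is, a 64-bit little-endian count of 100-nanosecond intervals since January 1, 1601 (UTC). The article does not give the exact position of the field inside the record, so the function simply takes the 8 raw bytes. The name filetime_to_datetime is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this moment (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Convert 8 little-endian FILETIME bytes to a UTC datetime."""
    ticks = int.from_bytes(raw[:8], "little")
    # 10 ticks make one microsecond; the remainder is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

If the decoded value lands in a plausible year, the bytes were most likely a real timestamp. A result far in the past or future usually means the wrong bytes were picked from the hex dump.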
The empty space in index.dat files is filled with junk (most often zeros, but sometimes various meaningless sequences) or, in some areas, with the
"magic" sequence 0BADF00Dh (BAD FOOD). Obviously, Microsoft developers are not without a sense of
humor. The BAD FOOD parts of the file are deleted records, and they aren't a privacy threat.
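To get a feeling for how much of a file consists of BAD FOOD filler, one can count the 4-byte words that hold the pattern. The sketch below assumes the value 0BADF00Dh is stored as a little-endian 32-bit word, so it appears in a hex dump as 0D F0 AD 0B. The name badfood_ratio is hypothetical.

```python
# Assumed on-disk form of 0BADF00Dh: bytes 0D F0 AD 0B.
BADFOOD = (0x0BADF00D).to_bytes(4, "little")


def badfood_ratio(data: bytes) -> float:
    """Fraction of 4-byte-aligned words that hold the fill pattern."""
    words = len(data) // 4
    if words == 0:
        return 0.0
    hits = sum(
        1 for i in range(0, words * 4, 4) if data[i:i + 4] == BADFOOD
    )
    return hits / words
```

A high ratio only shows that much of the file is filler. It says nothing about the URL, LEAK and REDR records that remain elsewhere in the file.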
You can use Mil Shield to clean the content of index.dat files along with history,
cookies, cache and many other tracks.