First of all, what is a binary file?
I am referring to a file which is not meant to be read directly by humans, only by computers. In other words, not plain ASCII text files but machine-readable files, although some binary files may contain portions of ASCII text. Binary files are basically made by computers, for computers... usually.
What do binary files contain and what are they used for?
All files just contain a sequence of bits, but we will be reading 8 bits (1 byte) at a time, as this is the unit in which most files are written, especially executable files (such as Windows EXEs, ROMs, etc.).
Binary executable files often contain operation codes (opcodes) which will be executed by your CPU. For example, an assembler might translate the following mnemonics into the following opcodes (on a 32-bit x86 architecture):
Mnemonic: INC EAX              Opcode: 0x40
Mnemonic: MOV EAX, [ESP+4]     Opcode: 0x8B 0x44 0x24 0x04
This is not always the case, though. "Binary file" can mean any file that is not plain text, so its bytes may just be data (sound, image data, etc.) rather than opcodes.
On a side note, it's worth mentioning that 1 byte holds 8 bits (numbered 0-7), but its value can be treated as signed or unsigned. An unsigned byte only represents non-negative numbers (0 to 255), while a signed byte can be either positive or negative (-128 to 127). For signed numbers, the most significant bit tells you the sign. The usual encoding is called "two's complement", but I'm not going to go into that here.
What's the difference between ASCII and binary files?
Although this has been partially explained, I'll explain further. Have you ever opened up an ".exe" file in a text editor and seen a bunch of garbage, much like this?

If not, then you are missing something. Try it.
What you are seeing is the ASCII representation of each byte contained within that file, which wasn't created to be viewed as text. Every character you see is a representation of some byte (a number between 0 and 255) stored in that file, and your apathetic text editor is translating those values into characters for your viewing pleasure, using the ASCII (American Standard Code for Information Interchange) encoding system or an extension of it (ASCII itself only defines values 0-127).
You could look at this the other way round: if you view a text file with a hex editor (I advise using the free Hex Editor Neo), you will see that each character has its own value. Take a look at this chart to see what I mean.
So are we going to program or what?!?
Yes, yes.
There are quite a few ways to open a file in binary mode in C/C++, but my preferred method is to use fopen() and fread(), as these give you some great features that others don't, but I'm sure you believe me.
Anyway, this code will simply open a file in binary mode and output the first 100 bytes in hexadecimal. (By the way, the reason we use hexadecimal is that 4 bits (1 nybble) = 1 hex digit, so 8 bits all set to 1 = FF. It's a perfect numeral system imho.)
#include <stdio.h>
#include <iostream>
using namespace std;

// An unsigned char can store 1 byte (8 bits) of data (0-255)
typedef unsigned char BYTE;

// Get the size of a file, leaving the read position where it was
long getFileSize(FILE *file)
{
    long lCurPos, lEndPos;
    lCurPos = ftell(file);
    fseek(file, 0, SEEK_END);       // Jump to the end of the file
    lEndPos = ftell(file);          // The position there is the size in bytes
    fseek(file, lCurPos, SEEK_SET); // Go back to where we started
    return lEndPos;
}

int main()
{
    const char *filePath = "C:\\Users\\UrName\\Desktop\\testFile.bin";
    BYTE *fileBuf;     // Pointer to our buffered data
    FILE *file = NULL; // File pointer

    // Open the file in binary mode using the "rb" mode string
    // fopen returns NULL if the file does not exist or cannot be opened for reading
    if ((file = fopen(filePath, "rb")) == NULL)
    {
        cout << "Could not open specified file" << endl;
        return 1; // Nothing to read, so stop here
    }
    cout << "File opened successfully" << endl;

    // Get the size of the file in bytes
    long fileSize = getFileSize(file);

    // Allocate space in the buffer for the whole file
    fileBuf = new BYTE[fileSize];

    // Read the file into the buffer; fread returns how many bytes it actually read
    size_t bytesRead = fread(fileBuf, 1, fileSize, file);

    // Now that we have the entire file buffered, we can take a look at some binary information
    // Let's look at up to the first 100 bytes in hexadecimal
    for (size_t i = 0; i < bytesRead && i < 100; i++)
        printf("%X ", fileBuf[i]);

    cin.get();
    delete[] fileBuf;
    fclose(file); // Almost forgot this
    return 0;
}
NOTE: You will need to update the file path to suit your needs. Use double backslashes, as C++ treats a single backslash in a string literal as the start of an escape sequence. Use whatever file you want.
I won't go into too much detail on the code because the comments are pretty self-explanatory, although I'll mention a few things.
- In this code, I used the C++ way of allocating/deallocating memory (new and delete) just to stay "up to date", but malloc() and free() are just as good, if not sometimes more suitable.
- Note how I check whether fopen() returned NULL before I allow the program to continue.
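- The fread function simply copies a given number of items of a given size from the file's current position into the buffer. If you want the data to land at an offset inside the buffer, pass a pointer to that position, e.g. fread(&fileBuf[0x100], 1, fileSize, file); and the data will be copied into the buffer starting from index 0x100. The buffer then has to be at least fileSize + 0x100 bytes long, otherwise you write past its end. To start reading from an offset inside the file instead, call fseek first. This flexibility is part of the reason fread is so great.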
- Always deallocate the memory that you allocate, otherwise the world might implode!! (Or you will get a memory leak, which you do not want.)
- I used the format string "%X " in printf. The uppercase X tells printf to output each byte of the buffer in uppercase hexadecimal. Try "%c" instead and you will see the ASCII again.
A byte in the buffer is just a number; whether you see it as a character (ASCII) or as a numeral depends on the type and format specifier used to print it. Streaming an unsigned char to cout prints a character, while printf with "%X" overrides this and shows the hexadecimal value, as if you had opened the file in a hex editor.
Why would I open a file in binary?
Opening files in binary mode can have many purposes, such as reading image files, audio files, video files or even ROM files. Most data files have headers which give information about the file. Try creating an image header reader or a ROM header reader (NES ROM?).
Anyway, if you're still here, thanks for reading. I made this tutorial because, when I first started to write emulators, I couldn't find a simple tutorial on how to read binary files and what to do with them. I really hope this helps someone.
Peace
This post has been edited by Aphex19: 26 April 2010 - 07:49 AM






