
Create a simple configuration file parser.

#1 sarmanu

Posted 26 July 2010 - 03:56 AM


Hello everyone. In this tutorial, you will learn how to create a configuration file parser using the C++ Standard Library. You can find many (more advanced) configuration file parsers on the Internet, but I wanted to create a small, easy-to-understand one for beginners. That's why I'm going to explain the code step by step.

First of all, what is a configuration file?
A configuration file is a file which contains the initial settings for your program. It is nothing more than a text file with a specific structure. That structure usually looks like this:
key = value


We call the structure "key = value" a parameter. In more advanced config files, parameters can be grouped in sections, but I'm not going to talk about that now.

What is this parser capable of?
The parser that I'm going to present can parse simple configuration files with a basic structure like this:
key1 = value1
key2 = value2  
key3 = value3


It also removes leading & trailing whitespace from keys & values, ignores blank lines, and supports comments. A comment starts with a semicolon (;), and everything from that semicolon to the end of the line is ignored. Example of a line with leading whitespace, trailing whitespace & a comment:
    key1   =          value1   ; I'm a comment


The parser will remove that comment & whitespace, and the resulting structure will look like this:
key1=value1
No leading spaces, no trailing spaces, no comments.
The parser also detects keys with the same name. If the file contains multiple keys with the same name, the parser prints an error message and terminates the program.
The parser also accepts values made of multiple words. Example:
car = toyota corolla


The value of car will be toyota corolla, and not only toyota.
The same does not apply to the keys themselves, which cannot consist of multiple words. Example:
car 1 = toyota corolla


The parser ignores the 1, so the key is car, with the value toyota corolla.

Setting up the project:
This step is very easy. Create an empty Console Application, and add one source file (.cpp/.cxx) to your project. I named it ConfigFile.cpp.

Create the configuration file parser:
We got here. This is, of course, the most important part of this tutorial. Open ConfigFile.cpp. Right now, you have a blank file. Start by adding the required headers at the top of the file:
#include <iostream>
#include <string>
#include <sstream>
#include <map>
#include <fstream>
#include <cstdlib>
#include <typeinfo>


I think <iostream> and <string> are pretty self-explanatory. <sstream> is needed for conversion between std::string and primitive types, and vice versa. <map> is needed for holding the key-value pairs, and <fstream> is of course needed for file handling. <cstdlib> provides exit() and EXIT_FAILURE, and <typeinfo> is required for typeid; both are used in the code below.

Now, we are going to create a class which contains only two functions, needed for converting a std::string to primitive types (int/float/double/...), and vice versa. I have called it Convert. This is the code for it:
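// Note: exitWithError() (shown further down) must be declared before this class.
class Convert
{
public:
	// Convert T, which should be a primitive, to a std::string.
	template <typename T>
	static std::string T_to_string(T const &val)
	{
		std::ostringstream ostr;
		ostr << val;

		return ostr.str();
	}

	// Convert a std::string to T. Exits with an error message if the
	// text cannot be read as a T.
	template <typename T>
	static T string_to_T(std::string const &val)
	{
		std::istringstream istr(val);
		T returnVal;
		if (!(istr >> returnVal))
			exitWithError("CFG: Not a valid " + (std::string)typeid(T).name() + " received!\n");

		return returnVal;
	}
};

// Specialization for std::string: return the text unchanged, so values
// that contain spaces are not cut at the first whitespace.
// Standard C++ requires an explicit specialization to be declared at
// namespace scope, so it lives outside the class body.
template <>
inline std::string Convert::string_to_T<std::string>(std::string const &val)
{
	return val;
}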


You can find the functions encapsulated in the Convert class all over the Internet. It's the classic stringstream technique for performing conversions. I just want to point out one thing: I have specialized the string_to_T function for std::string. Why? Well, take a look at this:
if (!(istr >> returnVal))


If the function parameter val were a string containing whitespace, like:
toyota corolla


then the generic string_to_T would return only "toyota", since operator>> on an istringstream stops extracting at the first whitespace. The specialization returns the whole string unchanged instead.
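The difference can be seen in isolation with a small sketch. The helper names firstToken and wholeLine are hypothetical and are not part of the tutorial's classes; they only contrast operator>> with reading the entire text.

```cpp
#include <sstream>
#include <string>

// Extracts with operator>>, which stops at the first whitespace character.
std::string firstToken(const std::string &text)
{
    std::istringstream istr(text);
    std::string token;
    istr >> token;
    return token;
}

// Reads the whole remaining line instead, so inner spaces are preserved.
std::string wholeLine(const std::string &text)
{
    std::istringstream istr(text);
    std::string line;
    std::getline(istr, line);
    return line;
}
```

Calling firstToken("toyota corolla") yields "toyota", while wholeLine("toyota corolla") yields the full text, which is the behaviour the std::string specialization is meant to provide.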
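Now, you may wonder about the exitWithError function. It prints a message on the console, waits for the user to press Enter, and then terminates the program. Because Convert calls it, it must be defined (or at least declared) above the Convert class. The function looks like this:
void exitWithError(const std::string &error) 
{
	std::cout << error;
	// Wait for input, so the message stays visible before the program exits.
	std::cin.ignore();
	std::cin.get();

	exit(EXIT_FAILURE);
}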



Now, let's create the main class, which contains functions needed to parse the configuration file. I have called it ConfigFile. Copy-paste this into your file:
class ConfigFile
{
private:
public:
};


Now, let's deal with the private section of the class. As member variables, we will only have a std::string, which holds the name of the configuration file, and a std::map<std::string, std::string>, which holds the key-value pairs. Let's add them to the class:
std::map<std::string, std::string> contents;
std::string fName;


Done.
Right now, we will create a function that removes the comment from an individual line. It looks like this (copy-paste it to private section of class):
void removeComment(std::string &line) const
{
	size_t pos = line.find(';');
	if (pos != line.npos)
		line.erase(pos);
}


So, what does it do? It checks if the line contains a semicolon, and if it does, it removes everything from the semicolon (including it) to the end of the line. If the line contains nothing but a comment, then, after comment removal, the line will be empty or contain only whitespace. That's why I created a separate function which checks this:
bool onlyWhitespace(const std::string &line) const
{
      return (line.find_first_not_of(" \t") == line.npos);
}


Basically, the function returns false if a character other than a space or a tab was found, true otherwise. The function is "const" because it does not alter any class member variables.
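As a rough standalone illustration of these two steps, the sketch below uses free functions with the hypothetical names stripComment and isBlank. Unlike removeComment, stripComment returns a copy instead of modifying its argument, which is only a choice made here to keep the example easy to test.

```cpp
#include <string>

// Returns a copy of the line with everything from the first ';' removed.
std::string stripComment(const std::string &line)
{
    const std::string::size_type pos = line.find(';');
    return pos == std::string::npos ? line : line.substr(0, pos);
}

// True when the line is empty or holds only spaces and tabs.
bool isBlank(const std::string &line)
{
    return line.find_first_not_of(" \t") == std::string::npos;
}
```

For a line such as "   ; only a comment", stripComment leaves three spaces and isBlank then reports true, so the line would be skipped.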
Now, a very important function is on its way. This function checks if an individual line has the correct structure of a config file (key = value). It looks like this:
bool validLine(const std::string &line) const
{
	std::string temp = line;
	temp.erase(0, temp.find_first_not_of("\t "));
	if (temp[0] == '=')
		return false;

	for (size_t i = temp.find('=') + 1; i < temp.length(); i++)
		if (temp[i] != ' ' && temp[i] != '\t')
			return true;

	return false;
}


Let's take it step by step. First of all, the function accepts as its parameter a std::string, which is an individual line (with the comment removed) from the config file. Let's take a look at this part:
std::string temp = line;
temp.erase(0, temp.find_first_not_of("\t "));
if (temp[0] == '=')
	return false;


The .erase() call removes every character from position 0 up to the first character that is neither a tab nor a space. After removal, if the first character is '=', it means that we do not have a key. Something like this:
; Oops? Missing the key from the line below!
  =  someValue;


Now, let's analyze the second part of the function:
for (size_t i = temp.find('=') + 1; i < temp.length(); i++)
    if (temp[i] != ' ' && temp[i] != '\t')
	return true;

return false;


The for loop starts at the character right after the '=' and runs to the end of the line. If it finds a character that is neither a space nor a tab, the key has a value and the function returns true. If the "if" never executes, the function returns false, because the key doesn't have a value. An example in which the function also returns false:
; Oops. No value for the key in the line below:
key =    ; no value!

Done with that too. Now, we will create a function that extracts the key from the pair of key = value. It looks like this:
void extractKey(std::string &key, size_t const &sepPos, const std::string &line) const
{
	// Take everything before the '=' ...
	key = line.substr(0, sepPos);
	// ... then cut at the first space or tab, keeping only the first word.
	size_t wsPos = key.find_first_of("\t ");
	if (wsPos != std::string::npos)
		key.erase(wsPos);
}


sepPos represents the position of the '=' in line (we will discuss it in another function). Let's take an example and see what the function assigns to "key". Example:
   car = ford


The line in the file starts with three spaces, but extractContents (shown later) strips leading whitespace before calling extractKey, so the function receives "car = ford". The value of key will be "car". Why? Because .substr() creates a substring that starts with the character at position 0 and ends with the character just before the '=', which gives "car " (with a trailing space). Then everything from the first space or tab character onward is removed.
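To see the key extraction on its own, here is a small sketch that returns the key instead of writing it through a reference. The name keyOf is hypothetical, and the function assumes the line has already had its leading whitespace stripped.

```cpp
#include <string>

// Returns the first word before the '=' located at position sepPos.
// Assumes leading whitespace has already been stripped from the line.
std::string keyOf(const std::string &line, std::string::size_type sepPos)
{
    std::string key = line.substr(0, sepPos);
    const std::string::size_type ws = key.find_first_of(" \t");
    if (ws != std::string::npos)
        key.erase(ws);                // drop the first blank and all that follows
    return key;
}
```

With "car 1 = toyota corolla" the result is "car", which matches the rule that a key is a single word.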
Now, since we created a function that extracts the key, let's create one that extracts the value of the key. It looks like this:
void extractValue(std::string &value, size_t const &sepPos, const std::string &line) const
{
	value = line.substr(sepPos + 1);
	value.erase(0, value.find_first_not_of("\t "));
	value.erase(value.find_last_not_of("\t ") + 1);
}


Again, sepPos is the position of the '=', and line is the individual line with the comment removed. Let's take an example and see what "value" will be assigned:
car =   toyota corolla


value will be assigned "toyota corolla". First of all, .substr() creates a substring starting from the position of '=' + 1 and running to the end of the line. Then, value.erase(0, value.find_first_not_of("\t ")); removes the leading whitespace, and value.erase(value.find_last_not_of("\t ") + 1); removes the trailing whitespace, that is, everything after the last character that is neither a tab nor a space.
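The two erase calls can be tried in isolation with the sketch below. The name valueOf is hypothetical; the body follows the same steps as extractValue but returns the result.

```cpp
#include <string>

// Returns the text after the '=' at position sepPos, with leading and
// trailing spaces and tabs removed. Inner whitespace is kept.
std::string valueOf(const std::string &line, std::string::size_type sepPos)
{
    std::string value = line.substr(sepPos + 1);
    value.erase(0, value.find_first_not_of(" \t"));   // leading whitespace
    value.erase(value.find_last_not_of(" \t") + 1);   // trailing whitespace
    return value;
}
```

If nothing but whitespace follows the '=', both calls still behave well: the first one empties the string, and find_last_not_of then returns npos, so npos + 1 wraps to 0 and the second erase is a no-op on the empty string.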
Now, all we need to do is create some functions which call the above functions. Copy-paste these, again, into the private section of the class:
void extractContents(const std::string &line) 
{
	std::string temp = line;
        // Erase leading whitespace from the line.
	temp.erase(0, temp.find_first_not_of("\t "));
	size_t sepPos = temp.find('=');

	std::string key, value;
	extractKey(key, sepPos, temp);
	extractValue(value, sepPos, temp);

	if (!keyExists(key))
		contents.insert(std::pair<std::string, std::string>(key, value));
	else
		exitWithError("CFG: Can only have unique key names!\n");
}

// lineNo = the current line number in the file.
// line = the current line, with comments removed.
void parseLine(const std::string &line, size_t const lineNo)
{
	if (line.find('=') == line.npos)
		exitWithError("CFG: Couldn't find separator on line: " + Convert::T_to_string(lineNo) + "\n");

	if (!validLine(line))
	        exitWithError("CFG: Bad format for line: " + Convert::T_to_string(lineNo) + "\n");

	extractContents(line);
}


I don't think these functions need much more explanation. The only new thing is this call:
if (!keyExists(key))


keyExists() is a function which checks whether the key given as a parameter already exists in the std::map (contents). I will present it later.
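As a side note, std::map::insert itself reports whether the key was already present, so the uniqueness check and the insertion can be combined. The sketch below is an alternative to the keyExists-then-insert pattern, not the tutorial's code, and the name addUnique is hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>

// Inserts key/value and reports whether the key was new.
// std::map::insert leaves the map unchanged when the key already exists.
bool addUnique(std::map<std::string, std::string> &contents,
               const std::string &key, const std::string &value)
{
    return contents.insert(std::make_pair(key, value)).second;
}
```

A caller could treat a false return value the same way extractContents treats an existing key, by reporting the duplicate.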
Now, the only thing left to do in the private section of the class is to add a function that opens the configuration file, then extracts & parses its contents. It looks like this:
void ExtractKeys()
{
	std::ifstream file;
	file.open(fName.c_str());
	if (!file)
		exitWithError("CFG: File " + fName + " couldn't be found!\n");

	std::string line;
	size_t lineNo = 0;
	while (std::getline(file, line))
	{
		lineNo++;
		std::string temp = line;

		if (temp.empty())
			continue;

		removeComment(temp);
		if (onlyWhitespace(temp))
			continue;

		parseLine(temp, lineNo);
	}

	file.close();
}


So what does the function do? It opens the configuration file. Then, the while loop keeps extracting lines until EOF is reached. We check whether each line is empty, and if it is, we skip it. Comments are removed, and if the line then contains only whitespace, we skip it as well. Lastly, parseLine is called, and the line's contents are added to our map.
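The whole loop can also be sketched as a single free function that reads from any std::istream, which makes it easy to try out with an in-memory string. This is a condensed illustration, not the class from this tutorial: the name parseStream is hypothetical, and it throws std::runtime_error instead of calling exitWithError, which is an assumption made here so that failures can be observed without ending the program.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Parses "key = value" lines from any input stream.
// Throws std::runtime_error (an assumption of this sketch) on a bad line.
std::map<std::string, std::string> parseStream(std::istream &in)
{
    const std::string ws = " \t";
    std::map<std::string, std::string> contents;
    std::string line;
    std::size_t lineNo = 0;

    while (std::getline(in, line))
    {
        ++lineNo;
        line = line.substr(0, line.find(';'));          // drop the comment
        if (line.find_first_not_of(ws) == std::string::npos)
            continue;                                   // blank or comment only

        std::ostringstream where;
        where << "line " << lineNo;

        const std::string::size_type sep = line.find('=');
        if (sep == std::string::npos)
            throw std::runtime_error("no separator on " + where.str());

        // Key: first word before '='.
        std::string key = line.substr(0, sep);
        key.erase(0, key.find_first_not_of(ws));
        key = key.substr(0, key.find_first_of(ws));

        // Value: text after '=', trimmed on both sides.
        std::string value = line.substr(sep + 1);
        value.erase(0, value.find_first_not_of(ws));
        value.erase(value.find_last_not_of(ws) + 1);

        if (key.empty() || value.empty())
            throw std::runtime_error("bad format on " + where.str());
        if (!contents.insert(std::make_pair(key, value)).second)
            throw std::runtime_error("duplicate key on " + where.str());
    }
    return contents;
}
```

Passing a std::istringstream built from a few lines of text, or a std::ifstream opened on a file, both work, since the function only depends on std::istream.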

We have finished adding functions to the private section of the class. Now, let's deal with the public section. Let's start by adding the class constructor, which sets the name of the configuration file, then calls ExtractKeys to perform the extraction:
ConfigFile(const std::string &fName)
{
	this->fName = fName;
	ExtractKeys();
}


Done. Now, let's create a function which checks if a specific key exists in the configuration file. Since the key-value pairs have been extracted into our map, all we have to do is use the std::map::find function to look for the key:
bool keyExists(const std::string &key) const
{
	return contents.find(key) != contents.end();
}


And lastly, let's create the function that retrieves the value of a specific key. It looks like this:
template <typename ValueType>
ValueType getValueOfKey(const std::string &key, ValueType const &defaultValue = ValueType()) const
{
	if (!keyExists(key))
		return defaultValue;

	return Convert::string_to_T<ValueType>(contents.find(key)->second);
}


The function returns the default value if the key couldn't be found. Unless you pass one explicitly, the default is a value-initialized ValueType (ValueType()), for example 0 for numeric types or an empty std::string. Otherwise, it returns the key's value, converted from string to ValueType. Let's now see how to use this function.
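The lookup-with-default idea can be tried on a plain std::map with the sketch below. The name valueOr is hypothetical, and unlike getValueOfKey it also falls back to the default when the text cannot be converted, rather than exiting. It uses operator>> directly, so for std::string it would return only the first word.

```cpp
#include <map>
#include <sstream>
#include <string>

// Looks up key and converts its text to T; returns fallback when the key
// is missing or the text cannot be read as a T.
template <typename T>
T valueOr(const std::map<std::string, std::string> &contents,
          const std::string &key, const T &fallback = T())
{
    const std::map<std::string, std::string>::const_iterator it = contents.find(key);
    if (it == contents.end())
        return fallback;

    std::istringstream istr(it->second);
    T result;
    return (istr >> result) ? result : fallback;
}
```

For a map holding "port" -> "8080", valueOr<int>(m, "port") gives 8080, and valueOr<int>(m, "missing", 7) gives 7.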

Create a sample configuration file that can be used with this parser:
I have called it config.cfg.
; This is a comment
; Another comment ;
color=red ; comment
fruit =   apple     ; some whitespace + comment
car =   toyota corolla     ; key value more than one word 
double =3.1223   ;    a double             



How to use the ConfigFile class:
It is extremely easy. Everything you have to do is this:
ConfigFile cfg("config.cfg");


Of course, "config.cfg" can be replaced with the name of your configuration file.

Check if a key exists: everything you need to do is to use ConfigFile::keyExists function:
// Check if car key exists. It does, in our case.
if (cfg.keyExists("car"))
   std::cout << "car key exists!\n";

// Check if fruits key exists. It doesn't, in our case.
if (cfg.keyExists("fruits"))
   std::cout << "fruits key exists!\n";



Retrieve the value of a specific key:
// Retrieve the value of key "car":
// If car key doesn't exist, an empty string is returned.
// Value type is std::string.
// In our case it returns "toyota corolla"
std::string carValue = cfg.getValueOfKey<std::string>("car");

// Retrieve the value of key "double":
// We directly retrieve it as a double.
// If key "double" is not found, the return value will be 1.
// In our case it returns 3.1223.
double doubleVal = cfg.getValueOfKey<double>("double", 1);



And that's pretty much everything you need to know about how to use the ConfigFile functions. You may also wonder why I didn't use separate header files for ConfigFile / Convert classes, and separate source files. Well, I should have done that, but I wanted to keep the tutorial as short as I could. You are free to add separate header files / source files to keep your project cleaner.

I have also attached the whole source code presented in this tutorial.

Additional references:
Wiki article about Configuration files


This post has been edited by sarmanu: 26 July 2010 - 03:58 AM



Replies To: Create a simple configuration file parser.

#2 Guest_Wolfgang*



Posted 29 July 2010 - 12:00 AM

Why not just use "Boost program options"?

#3 c0d3prada

Posted 17 November 2010 - 08:27 PM

Nice tutorial. Just the other day I was designing a class to read INI files, but I hit a snag: it took a while to completely read large INI files. I'll try this class and see how fast it works.

This post has been edited by c0d3prada: 17 November 2010 - 08:27 PM

