7 Replies - 3095 Views - Last Post: 16 August 2008 - 08:26 PM

#1 LowWaterMark

Integer limits and ASCII

Post icon  Posted 14 August 2008 - 02:53 AM

I'm quite baffled trying to reconcile the number of character codes ASCII provides with the range of values a char can store in C. Consider:
#include <stdio.h>
#include <limits.h>

int main()
{
   printf ("char:  [%d, %d]\n", CHAR_MIN, CHAR_MAX);
   return 0;
}
You get: char: [-128, 127]

OK, now I get confused. When you initialize literals, specifically integers in ASCII, ASCII encodes from 0 to 255 all decimal, all positive. This of course is the same quantity of integers as is available in the code above (>= -128, <=127), but without the negative integers.

Consider the two simple lines of (ASCII) code defining two character variables.
char c2 = 197;
- is essentially identical to -
char c = '+';
as in ASCII, the plus sign is described in base 10 by the integer 197.

How does ASCII resolve what to plop into c2 given: (197 > 127)?

I know I have made a simple question more complicated than it needs to be but I'm sure this question reflects some larger misunderstanding on my part.

This post has been edited by LowWaterMark: 14 August 2008 - 03:04 AM



Replies To: Integer limits and ASCII

#2 born2c0de

Re: Integer limits and ASCII

Posted 14 August 2008 - 03:33 AM

CHAR_MIN and CHAR_MAX hold the limits of plain char. Whether plain char is signed or unsigned is up to the compiler; yours treats it as signed, so you see the signed range. (SCHAR_MIN, SCHAR_MAX and UCHAR_MAX give the limits of the explicitly signed and unsigned types.)

In a signed data type, the most significant bit (MSB) is used to represent the sign.
Since a char here is 8 bits, that leaves 7 bits for the magnitude.

Hence a signed char can only store 2^7 = 128 non-negative values (from 0 to 127). With the MSB set, the same 7 bits cover the negative values, so on a two's complement machine the range for a signed char is -128 to 127.

But unsigned chars use all 8 bits to represent a character. Hence they can store 2^8 = 256 values (from 0 to 255).

Hope this helps.
:)

#3 LowWaterMark

Re: Integer limits and ASCII

Posted 14 August 2008 - 05:53 AM

born2c0de, thank you for taking time to respond and yes, it does help. Unfortunately it raises one more question. The output for my first code above yields:

char: [-128, 127]

Doesn't this suggest that CHAR_MIN is utilizing 9 bits (using the MSB in the signed data type as you explained)? Or, if the MSB holds the sign, - , does the negative range omit the zero and therefore run from -1 to -128?

This post has been edited by LowWaterMark: 14 August 2008 - 05:57 AM


#4 OliveOyl3471

Re: Integer limits and ASCII

Posted 14 August 2008 - 06:09 AM

This may not answer your question, but when I run this:
   char c = '+';
   char c2 = 197;
   cout<<c<<" "<<c2;


the output is:
+

This will tell you what the output will be when you declare your char as a number (after a certain number, 255 I think, it repeats):
for (int x = 1; x <= 255; x++)
{
    char c = x;
    cout << x << ": " << c << ", ";
}


Might come in handy if you ever want to display one of those symbols. :)
I don't really know if the outcome of this program is compiler-dependent, but you could still run it to see what you get.
Don't worry if it beeps. Mine did that when x=7.

If I am not mistaken, the outcome for 1 to 127 should be the ASCII table; what shows up above 127 depends on the character set your console uses.

This post has been edited by OliveOyl3471: 14 August 2008 - 09:55 PM


#5 NickDMax

Re: Integer limits and ASCII

Posted 14 August 2008 - 07:22 AM

So a char here is 8 bits... one bit is used to represent the sign, so we have 7 bits of data. With 7 bits we can represent 0 - 127 (thus the CHAR_MAX).

But what happens once we add in the sign? Well, negative numbers in binary are actually very strange. You would think that -1 would be 10000001, but it isn't: -1 is 11111111 in binary, -2 is 11111110, and -128 is 10000000.

So the negative numbers still have 7 bits, but they also have 10000000, which is the positive 0 with the sign bit set.

Computers use the two's complement to represent negative numbers. This is calculated as:

-a = (~a) + 1

So note that ~a depends upon your storage size. If you use 8 bits then -1 = 11111111, but if you use 4 bits it is 1111, and if you used 12 bits it would be 1111 1111 1111. So the binary values of the negative numbers depend upon your storage size.

the nice thing about the two's complement is that it leaves addition intact:
6 - 2 = 4
0110 - 0010 = 0110 + 1110 = 1 0100 (drop the overflow bit) 0100

but that is why the negatives always have 1 more value than the positives.

another way to look at it is 0 is a positive value so 0-127 are the positives, and -1 through -128 are the negatives. So the negatives don't really have 1 MORE value, it's just 0 - 127 with the MSB set.

197 = 11000101, which when we take the 2's complement becomes 00111011, which is 0x3B = 3*16+11 = 59.
So 197 is -59 when stored in a signed 8 bit character... SO when dealing with signed 8 bit chars, 197 < 127!!!

Note that the 2's complement is specific to your storage size so for example using 16 bit short ints:
-59 is the 2's complement of 0000 0000 0011 1011
which is 1111 1111 1100 0101 (which is 65477 in an unsigned short int).

so in my little example above I was using only 4 bits (not 8) which is why -2 was 1110 and not 11111110

This post has been edited by NickDMax: 14 August 2008 - 07:46 AM


#6 perfectly.insane

Re: Integer limits and ASCII

Posted 14 August 2008 - 03:29 PM

Two's complement simplifies the implementation of arithmetic. There is no difference between signed and unsigned addition and subtraction. For example:

-1 + 1 -> 0

11111111 + 00000001 -> 0

0 - 1 -> -1

00000000 - 00000001 = 11111111 (note the rollaround effect).

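The same is true for multiplication, as long as you keep only as many result bits as the operands have. It is not true for division, which needs separate signed and unsigned operations.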

3 * -2 = -6

00000011 * 11111110 = 1011111010. Since the data type is 8 bits, the result is 11111010 (-6).

The main aspect to consider is data type conversion, for example when you convert a (char) -1 to an (int) -1.

11111111 should not be converted to 00000000000000000000000011111111, but to 11111111111111111111111111111111.

On Intel systems, the instruction that does this properly is Move with Sign-Extension (MOVSX).

This post has been edited by perfectly.insane: 14 August 2008 - 03:30 PM


#7 NickDMax

Re: Integer limits and ASCII

Posted 14 August 2008 - 06:59 PM

By the way, you described '+' as char 197, but it is not. '+' is char 43.

(see the ASCII table). Char 197 (or -59 in a signed 8 bit char) is in the extended character codes, which sometimes have box drawing features -- and the single line cross is char 197. This is the character that you would use to represent the point where 4 cells meet. So while it looks like "+", it isn't -- it's longer. There is no "space" between it and the next character, so they form a continuous line, unlike the regular characters in +------+.

The problem with extended ASCII is that the codes were often redefined (even more so once Windows came around). So people would use other codepages (character sets) and often char 197 would be a capital A with a little circle over it. On a Mac this falls into the math symbols and was an "approximately equal" sign.

So in the end, we can see why Unicode became popular... now we can have more characters that don't match up!!

#8 NickDMax

Re: Integer limits and ASCII

Posted 16 August 2008 - 08:26 PM

Interesting that I just did a long spiel on two's complement and then missed this question in "How Not to Program in C++".

What's the bug:
#include <iostream>

int main() {
    char c = 0xFF;
    if (c == 0xFF) {
        std::cout << "Yes!!!" << std::endl;
    } else {
        std::cout << "No!!!" << std::endl;
    }
    return 0;
}


Spoiler

This post has been edited by NickDMax: 16 August 2008 - 10:47 PM
