14 Replies - 2834 Views - Last Post: 18 October 2010 - 03:04 AM

#1 kapitalist

  • New D.I.C Head

Reputation: 0
  • Posts: 6
  • Joined: 15-October 10

fixed width "compression" string to binary

Posted 15 October 2010 - 06:52 PM

Hello,

I am having a hard time figuring out what I am doing wrong. I am in grad school and have to take an intro to programming class, yay me, and it's in Python. I'm learning some decent stuff, but I am more frustrated than I have EVER been in my life!

Please see the example of what I am supposed to do and what I have so far:

>>> compress( '11111' )
'10000101'

So I am supposed to define a "compress" function that takes any string of 1s and 0s and "compresses" it. The first bit in each block of 8 represents the digit, and the remaining 7 bits represent, in binary, the number of times that digit repeats. Hence the answer above: the digit 1, followed by 0000101 (5 in binary).

This is what I have so far:

def compress (s):
    if len (s) ==0: return 0
    f=s[0]
    i=0
    while i < len (s):
        if s[i] != f:
            return i
        i+=1                                    #gives us how many of the first number appear in the string
    print f,binaryof (i)



But there are 2 issues: the output is not represented in a block of 8, and there is a space I can't seem to get rid of.

Please help, thanks in advance!

This post has been edited by kapitalist: 15 October 2010 - 06:54 PM



Replies To: fixed width "compression" string to binary

#2 Guest_c.user*

Posted 15 October 2010 - 07:30 PM

How should it compress 11110111101010001010?

#3 kapitalist

Posted 15 October 2010 - 07:37 PM

c.user, on 15 October 2010 - 06:30 PM, said:

How should it compress 11110111101010001010?


It would compress it like this:

10000100, 00000001, 10000100, etc...

The four 1s compress to the first block of 8. Then the single 0 compresses to the second block, then the next four 1s compress to the third.

Remember: the digit comes first, then the number of times that digit appears, in binary.

000

would "compress" to

00000011. Again, the first digit represents the repeated digit "0", and the rest represents the binary of how many times it repeats. The binary of 3 is 011, but I have to show the solution in blocks of 8, so it is padded to 0000011. Also, no commas; I just put the commas in so that it's easier to read.

This post has been edited by kapitalist: 15 October 2010 - 07:37 PM


#4 Guest_c.user*

Posted 16 October 2010 - 02:35 AM

>>> s = 101
>>> '1' + str(s).rjust(7, '0')
'10000101'
>>>
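The same padding can be done starting from the run length itself rather than from a hand-typed 101. This is only a minimal sketch; the helper name count_to_bits is hypothetical and not part of the assignment:

```python
def count_to_bits(count):
    """Return count as a 7-character binary string, zero-padded on the left."""
    # bin(5) is the string '0b101'; slicing off two characters drops the '0b' prefix
    digits = bin(count)[2:]
    return digits.rjust(7, '0')

# the digit '1' repeated 5 times becomes one 8-character block
block = '1' + count_to_bits(5)
print(block)  # 10000101
```

The built-in call format(5, '07b') should give the same '0000101' in one step, if format specifiers are allowed in the class.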



#5 Nallo

  • D.I.C Regular

Reputation: 163
  • Posts: 255
  • Joined: 19-July 09

Posted 16 October 2010 - 02:57 AM

Edit: c.user was faster on the two issues you mentioned.

Unfortunately your program has more issues than the two you mentioned. It only turns the beginning of s into a block: '11100' would result in '10000011' for the leading '111', but the '00' is ignored. You have to loop over the whole string instead of only looking at the beginning. I am not showing code for that, as it is your assignment.

This post has been edited by Nallo: 16 October 2010 - 03:02 AM


#6 Guest_c.user*

Posted 16 October 2010 - 07:37 PM

>>> s = "11110111101010001010"
>>> import re
>>> def f(s):
...   octets = []
...   for i in re.findall('0+|1+', s):
...     binstr = str(bin(len(i)))[2:]
...     octets.append(
...       '{}{}'.format(i[0], binstr.rjust(7, '0'))
...     )
...   return ''.join(octets)
... 
>>> s
'11110111101010001010'
>>> f(s)
'100001000000000110000100000000011000000100000001100000010000001110000001000000011000000100000001'
>>> re.findall('.{8,8}', f(s))
['10000100', '00000001', '10000100', '00000001', '10000001', '00000001', '10000001', '00000011', '10000001', '00000001', '10000001', '00000001']
>>>



#7 kapitalist

Posted 17 October 2010 - 12:45 AM




Umm, that's great, but the problem is that you are using things in your code that have not been taught in class, so I do not know how to read it. Can you simplify it? I would really appreciate it.

#8 Nallo

Posted 17 October 2010 - 10:20 AM

Using re, Python's regular expression module, makes for a nice solution. Using itertools.groupby would be nice too. But neither is going to help kapitalist.
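For readers curious about the itertools.groupby variant mentioned here, a minimal sketch follows. It goes beyond what the class has covered, the function name compress_groupby is made up for illustration, and it assumes no run is longer than 127 digits (the largest count that fits in 7 bits):

```python
from itertools import groupby

def compress_groupby(s):
    """Encode each run of identical digits as digit + 7-bit run length."""
    blocks = []
    for digit, run in groupby(s):
        # groupby yields an iterator over the run, so count its items
        length = len(list(run))
        blocks.append(digit + format(length, '07b'))
    return ''.join(blocks)

print(compress_groupby('11111'))  # 10000101
```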

When you have a task that seems overwhelming, try to break it down into simple parts.

You have probably understood that you need to loop over groups of identical digits in your string s,
and get the length and the digit of each group.

So let's first write some helper functions that do this (only docstrings in there, no actual code):
def block(digit, group_length):
    """
    returns a string of length 8
    first character being digit
    remaining 7 chars being the binary representation of group_length
    """
    # your job to write it
    # (that's the thing you asked about and c.user answered)
    
def consume_group(s, group_start):
    """
    returns tuple (digit, group_length)
    where digit is the digit at s[group_start]
    and group_length is the number of same digits as digit following
    digit in s (+1 for the original digit)
    """
    # your job to write it
    # (you almost wrote that in your first version)



The advantage of those simple functions is that you know exactly what they are supposed to do (especially if you always write your docstrings!). Compare that to a convoluted, monolithic function of 100+ lines with multiple levels of nesting that you never want to read or touch again.

Once you have that, the main compress function is a piece of cake. So even if you haven't worked out how to write those helper functions yet, you can write your main function:
def compress(s):
    blocks = ''
    index = 0
    while index < len(s): #loop over the groups in s
        group_digit, group_length = consume_group(s, index)
        blocks += block(group_digit, group_length)
        index += group_length #move index to next group
    return blocks



I hope I didn't take away your joy of solving the problem yourself. And I hope I didn't violate the rules of dreamincode here too much.
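For later readers, here is one way the two helpers might be filled in so that the compress function above runs end to end. This is a sketch under the docstrings' stated contracts, not the assignment's official answer, and it assumes every run is at most 127 digits long:

```python
def block(digit, group_length):
    """Return an 8-character string: digit followed by group_length in 7-bit binary."""
    return digit + bin(group_length)[2:].rjust(7, '0')

def consume_group(s, group_start):
    """Return (digit, group_length) for the run of equal digits starting at s[group_start]."""
    digit = s[group_start]
    end = group_start
    # walk forward until the digit changes or the string ends
    while end < len(s) and s[end] == digit:
        end += 1
    return digit, end - group_start

def compress(s):
    blocks = ''
    index = 0
    while index < len(s):  # loop over the groups in s
        group_digit, group_length = consume_group(s, index)
        blocks += block(group_digit, group_length)
        index += group_length  # move index to next group
    return blocks

print(compress('11100'))  # 1000001100000010
```

Note that consume_group returns two values at once; in Python that pair is called a tuple, and the line in compress simply unpacks it into two variables.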

#9 baavgai

  • Dreaming Coder

Reputation: 5805
  • Posts: 12,644
  • Joined: 16-October 07

Posted 17 October 2010 - 12:16 PM

There's really no need to resort to regular expressions. I doubt an instructor would thank you for it. If you don't need them, don't use them.

def compress (s):
	if len (s)==0:
		return '' # don't you want to return a string?
	
	f = s[0]
	i = 0
	while i < len (s):
		if s[i] != f: # i starts out as 0, you're counting the first one twice
			return i # so the first time you hit a change, you return an int?  huh?
		i+=1  # BIG BUG WARNING!!! You've just confused an accumulator with a loop counter
		
	# didn't we want to return something?
	# print f,binaryof (i)



Let's make an assumption, you want to return a string. Starting there, how far can we get...
def compress (s):
	result = '' # no matter what, we return this
	lastBit = None # you were correct, we do need to know what the last one was
	count = 0 # you're counting, remember?
	
	for bit in s: # now we loop through each of the digits
		if bit==lastBit:
			# this one is easy, though it's a special case when count==127
			# (the largest value that fits in 7 bits)
			count = count + 1 
		else:
			# time to print something
			# or in this case add it to the result
			pass # placeholder so the skeleton parses; your code goes here
	
	# we've escaped the loop, chances are you have something left, though
	if count>0:
		# process what's left
		pass # placeholder; your code goes here
	
	return result # you've been filling a string, return it



Hope that helps.

#10 kapitalist

Posted 17 October 2010 - 02:10 PM

Nallo, on 17 October 2010 - 09:20 AM, said:

I hope I didn't take away your joy of solving the problem yourself.


NO, because I don't find this fun. I find it frustrating, especially loops. I understand the math and the logic; I just can't WRITE it in code. I am not looking for someone to give me the answer, but at the same time, without looking at an answer and a detailed explanation of why it works, I don't see how many people can jump into programming.

#11 Guest_c.user*

Posted 17 October 2010 - 04:12 PM

kapitalist, on 17 October 2010 - 05:45 PM, said:

Umm, that's great, but the problem is that you are using things in your code that have not been taught in class, so I do not know how to read it. Can you simplify it? I would really appreciate it.


The main thing: re.findall('0+|1+', s) creates a list of runs of identical digits:

>>> import re
>>> re.findall("0+|1+", "0000111110011")
['0000', '11111', '00', '11']
>>>



We put this list into the loop and take each item in turn.

Then binstr = str(bin(len(i)))[2:] prepares the number: it takes the length of the item, translates it to binary, and keeps only the part after the '0b' prefix. (bin() already returns a string, so the str() call is redundant but harmless.)

So we have a string such as '111' for a run of seven.

Now we can build the block.
We take the first character of the item and append the binary value, justified to the right.
By default, justifying pads with spaces, like this: '    111', so we specify zero as the fill character to get '0000111'.

Then '{}{}'.format(...) puts the two pieces into the string at the {} places: the first character of the item, and the justified binary value filled with zeroes.

This string goes into the list; we append it to the end.
So after the loop we have a list like ['10000001', '10000101', ...].

Then we join all the elements of the list, which returns a single string.

So there are no unnecessary steps, and I don't know how to make it shorter. We don't cram all the constructions into one line, because such code is difficult to extend later.

As for the choice of which method to use: at first you write your own implementation. It's very hard, because you have no experience, and doing it is how you gain that experience.
Once you have experience, you use everything that is available, and the re module will be shorter and faster for most tasks. So write your own implementation if you want to be a programmer.
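The steps described above can be traced for a single run, one intermediate value per line. This is only an illustrative sketch; the variable names run, length, padded and octet are made up here, and the '{}' placeholders assume Python 2.7 or 3.1 and later:

```python
import re

run = re.findall('0+|1+', '0000111110011')[1]  # the second run: '11111'
length = len(run)                              # 5
binstr = bin(length)[2:]                       # '101', since bin(5) is the string '0b101'
padded = binstr.rjust(7, '0')                  # '0000101'
octet = '{}{}'.format(run[0], padded)          # '10000101'
print(octet)
```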

#12 kapitalist

Posted 17 October 2010 - 04:25 PM

I think this forum is for advanced users only. I thank you for your attempts in helping but I need a more novice forum. The explanation is great (if I could get it) but again, I am at a basic level and we are not using 're' and when somebody mentions tuple, I am thinking database records...

I am more confused than when I first originally posted. This must be the epic misunderstanding between pure IT people and business/sales/management people. Communications. lol. Off to shoot myself.

#13 baavgai

Posted 17 October 2010 - 05:31 PM

The trick is, no one wants to just up and give you the answer. You've gotten a number of valid answers, but we may have journeyed off on a regex tangent. Computer people tend to enjoy clever answers to simple questions. While it may not seem so, your problem, counting runs of a repeated bit with a loop and a counter, is pretty basic.

The answer I offered attempted to lay out enough framework to let you take it from there. It's not an obtuse solution and requires you to fill in a print and a count. I'm willing to offer a little more.

def compress(s):
	def getResultValue(bit, count):
		if bit:
			# place holder
			return "%s-%d:" % (bit, count)
		return ''
	
	(result, lastBit, count) = ('', None, 0)
	for bit in s: # now we loop through each of the digits
		if bit==lastBit:
			if count==127: # 7 bits can hold at most 127, so start a new block
				result += getResultValue(lastBit, count)
				count = 1
			else:
				count += 1
		else:
			result += getResultValue(lastBit, count)
			(lastBit, count) = (bit, 1)
	
	if count>0:
		result += getResultValue(lastBit, count)
	
	return result



The function getResultValue takes two parameters, bit and count, and returns a string. You need only implement that with basically the code you've already offered.

I can't in good conscience offer any more. Good luck.
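For later readers, one way the placeholder in getResultValue might be filled in is sketched below as a standalone function. It is an assumption about the intended result, based on the 8-character block format from the first post, and it expects bit to be the string '0' or '1' (or None before the first digit has been seen):

```python
def getResultValue(bit, count):
    """Return one 8-character block, or '' if no digit has been seen yet."""
    if bit:  # lastBit starts as None, so the very first call contributes nothing
        # '0' and '1' are non-empty strings, so both count as true here
        return "%s%s" % (bit, bin(count)[2:].rjust(7, '0'))
    return ''

print(getResultValue('1', 5))  # 10000101
print(getResultValue('0', 3))  # 00000011
```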

#14 kapitalist

Posted 17 October 2010 - 07:13 PM




I understand you cannot give me the answer, NOR do I want to be given the answer. I am in graduate school, so I do have a lot invested in learning what I am studying. There is a disconnect between programming and my background, so what seems like a basic problem to you seems that way because of your vast experience in this field. Telling somebody "the answer is right there" when they don't really know what they are looking at is not the same as them really understanding this stuff. Thanks for taking extra time out of your day. It's really discouraging for someone like myself to struggle at something, as I consider myself an intelligent person, but alas, frustration with the learning curve is chopping me down. Especially because after this class I "may" have one or two more programming classes.

my sentences could be structured much better but I am very tired : (

This post has been edited by kapitalist: 17 October 2010 - 07:14 PM


#15 Guest_c.user*

Posted 18 October 2010 - 03:04 AM

>>> s = "111110110001110"
>>> def get_sequence(s):
...   seq = ''
...   first = s[0]
...   for c in s:
...     if c != first:
...       break
...     seq += c
...   return seq
... 
>>> seq = ''
>>> offset = 0
>>> while offset < len(s):
...   seq = get_sequence(s[offset:])
...   offset += len(seq)
...   if seq:
...     print(seq)
... 
11111
0
11
000
111
0
>>>
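The loop above only prints each run. A hedged sketch of the remaining step, turning each run into an 8-character block, could look like this; it reuses get_sequence unchanged, the name compress matches the assignment, and runs longer than 127 digits are assumed not to occur:

```python
def get_sequence(s):
    """Return the leading run of identical characters in s."""
    seq = ''
    first = s[0]
    for c in s:
        if c != first:
            break
        seq += c
    return seq

def compress(s):
    result = ''
    offset = 0
    while offset < len(s):
        seq = get_sequence(s[offset:])
        offset += len(seq)
        # the digit, then the run length as 7 binary digits
        result += seq[0] + bin(len(seq))[2:].rjust(7, '0')
    return result

print(compress('111110110001110'))
```

Only plain loops, slicing and string methods are used here, so nothing depends on the re module.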

