Extracting subsequences from a list or string

Page 1 of 1

11 Replies - 18667 Views - Last Post: 05 July 2013 - 01:36 PMRate Topic: //<![CDATA[ rating = new ipb.rating( 'topic_rate_', { url: 'https://www.dreamincode.net/forums/index.php?app=forums&module=ajax&section=topics&do=rateTopic&t=323968&amp;s=451dc7b5e79adfefa216d72ef0f5212f&md5check=' + ipb.vars['secure_hash'], cur_rating: 0, rated: 0, allow_rate: 0, multi_rate: 1, show_rate_text: true } ); //]]>

#1 jon.kiparsky

• Beginner

Reputation: 11093
• Posts: 18,982
• Joined: 19-March 11

Extracting subsequences from a list or string

Posted 30 June 2013 - 08:19 PM

So I'm working on a problem that requires me to operate on each n-length subsequence of a string, and of course I want to find a clever way to get that list. The obvious way is with a comprehension like this:

```def subsequences(l,n):
return [l[i:i+n] for i in range (len(l)-n+1)]

```

This works, but it is not especially pretty or graceful. I was just curious to know if anyone can come up with a more congenial way to express this.

(Yeah, this is just a chance for you smart kids to show off. That's always fun...)

Is This A Good Question/Topic? 0

Replies To: Extracting subsequences from a list or string

#2 Nallo

• D.I.C Regular

Reputation: 165
• Posts: 258
• Joined: 19-July 09

Re: Extracting subsequences from a list or string

Posted 03 July 2013 - 03:06 PM

I don't have a beautiful way either. Though if you only need sequential access to the subsequences you could use a generator to avoid memory overhead:
```def subsequences(l,n):
for i in range (len(l)-n+1):
yield [l[i:i+n]]

#sequential access
for subs_5 in subsequences(l, 5):
pass #do something

```

or more consciese:
```for subs_5 in (l[i:i+5] for i in range (len(l)-5+1)):
pass

```

#3 sepp2k

• D.I.C Lover

Reputation: 2627
• Posts: 4,182
• Joined: 21-June 11

Re: Extracting subsequences from a list or string

Posted 04 July 2013 - 04:39 AM

What you have looks fine to me, but I want to point out that what you're producing are substrings/sublists, not subsequences.

#4 ConciselyVerbose

• D.I.C Regular

Reputation: 91
• Posts: 315
• Joined: 05-July 13

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 08:36 AM

sepp2k, on 04 July 2013 - 04:39 AM, said:

What you have looks fine to me, but I want to point out that what you're producing are substrings/sublists, not subsequences.

How so? A subsequence, per your link, is a sequence you can make by removing elements from the original. He is theoretically removing the elements on either side of the desired string.

Actually on topic, I would be interested if there is a more graceful way, but sometimes the first thing that pops into your head is the right answer. Python was not my first language, but it is my favorite precisely because of how elegantly it handles things like list comprehensions. Compare writing that line in other languages before being too hard on yourself for not being graceful.

#5 sepp2k

• D.I.C Lover

Reputation: 2627
• Posts: 4,182
• Joined: 21-June 11

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 08:45 AM

ConciselyVerbose, on 05 July 2013 - 05:36 PM, said:

How so? A subsequence, per your link, is a sequence you can make by removing elements from the original. He is theoretically removing the elements on either side of the desired string.

Yes, but never anything else. For example "abc" is a subsequence of "ac", but subsequences("abc", 2) will not include "ac". Therefore the list returned by subsequences("abc", 2) is not the list of subsequences of length 2 of "abc" - it's the list of substrings of length 2.

#6 ConciselyVerbose

• D.I.C Regular

Reputation: 91
• Posts: 315
• Joined: 05-July 13

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 08:50 AM

Fair enough. He is referring to a specific set of subsequences and not all subsequences. He is still creating subsequences, though. It is just a specific set, not a comprehensive one.

#7 baavgai

• Dreaming Coder

Reputation: 7197
• Posts: 15,004
• Joined: 16-October 07

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 09:24 AM

Nah, sorry, there's just not a lot you can do with that.

Same thing, with more cryptic syntax:
```def subsequences(s,n):
return map(lambda i: s[i:i+n], range(len(s)-n+1))

```

And, for something completely different:
```def subsequences(s,n):
def f(s):
while len(s)>=n:
yield s[:n]
s = s[1:]
return [ i for i in f(s) ]

```

I kind of found myself wanting a car, cdr and tail end recursion for this one...

#8 jon.kiparsky

• Beginner

Reputation: 11093
• Posts: 18,982
• Joined: 19-March 11

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 09:29 AM

ConciselyVerbose, on 05 July 2013 - 10:36 AM, said:

sepp2k, on 04 July 2013 - 04:39 AM, said:

What you have looks fine to me, but I want to point out that what you're producing are substrings/sublists, not subsequences.

How so? A subsequence, per your link, is a sequence you can make by removing elements from the original. He is theoretically removing the elements on either side of the desired string.

No, sepp's totally right. The term would be a reasonable etymological coinage if it were novel, but it's already got a meaning, which slipped my mind. They are sub-sequences, in the sense that they are sequences taken from a larger sequences, but they are not what a mathematician considers a subsequence, so the term is the wrong one.

Quote

Actually on topic, I would be interested if there is a more graceful way, but sometimes the first thing that pops into your head is the right answer. Python was not my first language, but it is my favorite precisely because of how elegantly it handles things like list comprehensions.

It's not that I'm being hard on myself, I'd just like to find a better way to do this sort of thing. As far as I can tell, this is the best way to do this in python, and it's pretty horrible.

Quote

Compare writing that line in other languages before being too hard on yourself for not being graceful.

The Java version would be a perfectly straightforward loop, and it would be infinitely more readable - for java and python programmers alike - even though it would be a few lines longer.
Comprehensions are great, but I don't know where we get this conviction that they are universally easier to read than other forms. In this case, they're plainly not.

#9 baavgai

• Dreaming Coder

Reputation: 7197
• Posts: 15,004
• Joined: 16-October 07

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 10:05 AM

Wait, you don't think the comprehension is easy to read? WTF?!?

Well, hell, you can just use more lines:
```def subsequences(s,n):
def getSub(i): return s[i:i+n]
size = len(s)-n+1
iterations = range(size)
return map(getSub, iterations)

```

There, all nicely labeled.

Or, just dissect the comprehension:
```def subsequences(s,n):
result = []
for i in range (len(l)-n+1):
result.append(s[i:i+n])
return result

```

This post has been edited by baavgai: 05 July 2013 - 10:06 AM

#10 ConciselyVerbose

• D.I.C Regular

Reputation: 91
• Posts: 315
• Joined: 05-July 13

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 10:53 AM

I see where you're coming form, but I don't really agree. You could write a more java-esque version if you wanted to, but there is no need. I started with java and have far more experience with java, and have no problem reading the list comprehension as provided. I am not claiming that list comprehensions are the end all, be all, better than anything else anyone can do answer, but the ability to put what would be several lines in other languages in one here, and in a fairly easy to read form (provided you understand list comprehensions, which aren't overly difficult) makes the code much easier to maintain, at least for me.

The generator suggested is also a nice solution, especially if you intend to use the code multiple times.

#11 jon.kiparsky

• Beginner

Reputation: 11093
• Posts: 18,982
• Joined: 19-March 11

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 01:28 PM

baavgai, on 05 July 2013 - 12:05 PM, said:

Wait, you don't think the comprehension is easy to read? WTF?!?

What I said was the java version would be more readable, and I don't think that's very weird at all. The comprehension makes me think more about the mechanics of the language and less about the problem I'm trying to solve - this is the opposite of what python generally does well.

Comprehensions are great when the problems are very small, but they're like regex - the complexity of the solution grows much faster than the complexity of the problem. This one hasn't got to that "write-only" point, but you can see how it's getting ready to blow up.

Quote

the ability to put what would be several lines in other languages in one here, and in a fairly easy to read form (provided you understand list comprehensions, which aren't overly difficult) makes the code much easier to maintain, at least for me.

Some day I hope someone will make a reasonable case for the bizarre assumption that "fewer lines is better lines". They haven't yet.

("what, you think more lines are better????"
No, I didn't actually say that)

#12 ConciselyVerbose

• D.I.C Regular

Reputation: 91
• Posts: 315
• Joined: 05-July 13

Re: Extracting subsequences from a list or string

Posted 05 July 2013 - 01:36 PM

It's not about less lines as much as it is not using several lines for something simple. As what you are looking for is "all n length substrings of s", it doesn't make sense to me to use several lines to explain.

I'm not one to try to boil large concepts into overly convoluted single lines, but if what it represents is short I want the code to be as well.