
Answer by sastanin for How can I partition (split up, divide) a list based on a condition?

My take on it. I propose a lazy, single-pass partition function, which preserves relative order in the output subsequences.

1. Requirements

I assume that the requirements are:

  • maintain elements' relative order (hence, no sets and dictionaries)
  • evaluate condition only once for every element (hence not using (i)filter or groupby)
  • allow for lazy consumption of either sequence (if we can afford to precompute them, then the naïve implementation is likely to be acceptable too)
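
For comparison, the naïve eager version mentioned in the last bullet might look like the sketch below. The name partition_eager is made up for illustration and is not part of any library. It meets the first two requirements but not the third, because it consumes the whole input before returning anything.

```python
def partition_eager(condition, sequence):
    """Eagerly split `sequence` into (matching, non_matching) lists.

    Hypothetical helper for comparison only.
    """
    goods, bads = [], []
    for item in sequence:
        # The condition is evaluated exactly once per element, and
        # appending keeps the relative order within each list.
        (goods if condition(item) else bads).append(item)
    return goods, bads


evens, odds = partition_eager(lambda n: n % 2 == 0, range(10))
print(evens)  # [0, 2, 4, 6, 8]
print(odds)   # [1, 3, 5, 7, 9]
```

This is fine for short lists, but it cannot handle an infinite input and holds both results in memory at once.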

2. split library

My partition function (introduced below) and other similar functions have made it into a small library, split.

It can be installed from PyPI in the usual way:

pip install --user split

To split a list based on a condition, use the partition function:

>>> from split import partition
>>> files = [ ('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi') ]
>>> image_types = ('.jpg','.jpeg','.gif','.bmp','.png')
>>> images, other = partition(lambda f: f[-1] in image_types, files)
>>> list(images)
[('file1.jpg', 33, '.jpg')]
>>> list(other)
[('file2.avi', 999, '.avi')]

3. partition function explained

Internally we need to build two subsequences at once, so consuming only one output sequence will force the other one to be computed too. And we need to keep state between user requests (store processed but not yet requested elements). To keep state, I use two double-ended queues (deques):

from collections import deque
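
As a quick reminder of why a deque suits this buffering role, here is a minimal sketch. The variable name pending is only illustrative. Elements are appended on the right and taken from the left, so they come back in arrival order, and both operations are O(1).

```python
from collections import deque

pending = deque()          # elements processed but not yet requested
pending.append('a')        # enqueue on the right
pending.append('b')
first = pending.popleft()  # dequeue from the left (first in, first out)

print(first)          # a
print(list(pending))  # ['b']
print(bool(pending))  # True (an empty deque is falsy)
```

The truthiness of a deque is what makes a plain `if these:` check enough to tell whether anything is buffered.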

The SplitSeq class takes care of the housekeeping:

class SplitSeq:
    def __init__(self, condition, sequence):
        self.cond = condition
        self.goods = deque([])
        self.bads = deque([])
        self.seq = iter(sequence)

Magic happens in its .getNext() method. It is almost like the .next() method of iterators, but lets us specify which kind of element we want this time. Behind the scenes it doesn't discard the rejected elements, but instead puts them in one of the two queues:

    def getNext(self, getGood=True):
        if getGood:
            these, those, cond = self.goods, self.bads, self.cond
        else:
            these, those, cond = self.bads, self.goods, lambda x: not self.cond(x)
        if these:
            # serve an element buffered during an earlier request for the other kind
            return these.popleft()
        else:
            while True:  # exit on StopIteration
                n = next(self.seq)  # builtin next() works on Python 2.6+ and 3
                if cond(n):
                    return n
                else:
                    those.append(n)

The end user is supposed to use the partition function. It takes a condition function and a sequence (just like map or filter), and returns two generators. The first generator builds a subsequence of elements for which the condition holds, the second one builds the complementary subsequence. Iterators and generators allow for lazy splitting of even long or infinite sequences.

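def partition(condition, sequence):
    cond = condition if condition else bool  # evaluate as bool if condition is None
    ss = SplitSeq(cond, sequence)
    def goods():
        try:
            while True:
                yield ss.getNext(getGood=True)
        except StopIteration:
            # end cleanly; a StopIteration escaping a generator is an error on Python 3.7+ (PEP 479)
            return
    def bads():
        try:
            while True:
                yield ss.getNext(getGood=False)
        except StopIteration:
            return
    return goods(), bads()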
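
To see the laziness and the evaluate-once property in action, here is a condensed, self-contained sketch of the same two-queue idea, written for Python 3. The name lazy_partition, the dictionary of queues, and the is_even and calls helpers are illustrative assumptions, not the split library's code.

```python
from collections import deque
from itertools import count, islice


def lazy_partition(condition, sequence):
    """Condensed restatement of the two-queue idea (illustrative only)."""
    source = iter(sequence)
    queues = {True: deque(), False: deque()}

    def stream(wanted):
        while True:
            if queues[wanted]:
                # Serve elements buffered by the other stream first.
                yield queues[wanted].popleft()
                continue
            try:
                item = next(source)
            except StopIteration:
                return  # input exhausted: end this stream cleanly
            verdict = bool(condition(item))  # evaluated once per element
            if verdict == wanted:
                yield item
            else:
                queues[verdict].append(item)

    return stream(True), stream(False)


calls = []

def is_even(n):
    calls.append(n)  # record every evaluation of the condition
    return n % 2 == 0

evens, odds = lazy_partition(is_even, count())  # infinite input
first_evens = list(islice(evens, 3))
first_odds = list(islice(odds, 3))
print(first_evens)  # [0, 2, 4]
print(first_odds)   # [1, 3, 5]
print(calls)        # [0, 1, 2, 3, 4, 5]
```

Pulling three even numbers buffers the odd numbers 1 and 3 along the way, so the second stream serves those from its queue and evaluates the condition only for 5.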

I chose the test function to be the first argument to facilitate partial application in the future (similar to how map and filter have the test function as the first argument).
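
A small sketch of what that partial application could look like with functools.partial. The partition defined here is only an eager stand-in with the same (condition, sequence) argument order, so that the example runs on its own; split_images is a made-up name.

```python
from functools import partial


def partition(condition, sequence):
    """Eager stand-in with the same (condition, sequence) argument order."""
    tagged = [(bool(condition(x)), x) for x in sequence]  # one test per element
    return ([x for ok, x in tagged if ok],
            [x for ok, x in tagged if not ok])


image_types = ('.jpg', '.jpeg', '.gif', '.bmp', '.png')

# Fix the first argument once, then reuse the resulting splitter.
split_images = partial(partition, lambda name: name.endswith(image_types))

images, other = split_images(['a.jpg', 'b.avi', 'c.png'])
print(images)  # ['a.jpg', 'c.png']
print(other)   # ['b.avi']
```

With the condition in first position, partial needs no keyword arguments, which is the same convenience map and filter offer.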

