Intervals (bx-python)
This module requires the bx.intervals package, which is part of bx-python. Installation instructions are at https://bitbucket.org/james_taylor/bx-python/wiki/Home, or try pip install bx-python.
- petlx.interval.intervaljoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, lprefix=None, rprefix=None)
Join two tables by overlapping intervals. E.g.:
>>> from petl import look
>>> from petlx.interval import intervaljoin
>>> look(left)
+---------+-------+--------+
| 'begin' | 'end' | 'quux' |
+=========+=======+========+
|       1 |     2 | 'a'    |
+---------+-------+--------+
|       2 |     4 | 'b'    |
+---------+-------+--------+
|       2 |     5 | 'c'    |
+---------+-------+--------+
|       9 |    14 | 'd'    |
+---------+-------+--------+
|       9 |   140 | 'e'    |
+---------+-------+--------+
|       1 |     1 | 'f'    |
+---------+-------+--------+
|       2 |     2 | 'g'    |
+---------+-------+--------+
|       4 |     4 | 'h'    |
+---------+-------+--------+
|       5 |     5 | 'i'    |
+---------+-------+--------+
|       1 |     8 | 'j'    |
+---------+-------+--------+
>>> look(right)
+---------+--------+---------+
| 'start' | 'stop' | 'value' |
+=========+========+=========+
|       1 |      4 | 'foo'   |
+---------+--------+---------+
|       3 |      7 | 'bar'   |
+---------+--------+---------+
|       4 |      9 | 'baz'   |
+---------+--------+---------+
>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result)
+---------+-------+--------+---------+--------+---------+
| 'begin' | 'end' | 'quux' | 'start' | 'stop' | 'value' |
+=========+=======+========+=========+========+=========+
|       1 |     2 | 'a'    |       1 |      4 | 'foo'   |
+---------+-------+--------+---------+--------+---------+
|       2 |     4 | 'b'    |       1 |      4 | 'foo'   |
+---------+-------+--------+---------+--------+---------+
|       2 |     4 | 'b'    |       3 |      7 | 'bar'   |
+---------+-------+--------+---------+--------+---------+
|       2 |     5 | 'c'    |       1 |      4 | 'foo'   |
+---------+-------+--------+---------+--------+---------+
|       2 |     5 | 'c'    |       3 |      7 | 'bar'   |
+---------+-------+--------+---------+--------+---------+
|       2 |     5 | 'c'    |       4 |      9 | 'baz'   |
+---------+-------+--------+---------+--------+---------+
|       2 |     2 | 'g'    |       1 |      4 | 'foo'   |
+---------+-------+--------+---------+--------+---------+
|       4 |     4 | 'h'    |       3 |      7 | 'bar'   |
+---------+-------+--------+---------+--------+---------+
|       5 |     5 | 'i'    |       3 |      7 | 'bar'   |
+---------+-------+--------+---------+--------+---------+
|       5 |     5 | 'i'    |       4 |      9 | 'baz'   |
+---------+-------+--------+---------+--------+---------+
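The matching rule behind the example above can be sketched in plain Python. This is only an illustration, not the petlx implementation (which builds on bx.intervals); the names overlaps and naive_intervaljoin are hypothetical, and the strict comparison is inferred from the documented output, where intervals that merely touch (such as 2..4 and 4..9) are not joined.

```python
def overlaps(lstart, lstop, rstart, rstop, proximity=0):
    # Assumption inferred from the documented output: the overlap must be
    # non-zero, so both comparisons are strict.
    return rstart < lstop + proximity and lstart - proximity < rstop


def naive_intervaljoin(left, right, proximity=0):
    """Nested-loop join; rows on both sides are (start, stop, payload)."""
    out = []
    for ls, le, lv in left:
        for rs, re_, rv in right:
            if overlaps(ls, le, rs, re_, proximity):
                out.append((ls, le, lv, rs, re_, rv))
    return out


left = [(1, 2, 'a'), (2, 4, 'b'), (2, 5, 'c'), (9, 14, 'd'), (9, 140, 'e'),
        (1, 1, 'f'), (2, 2, 'g'), (4, 4, 'h'), (5, 5, 'i'), (1, 8, 'j')]
right = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]

joined = naive_intervaljoin(left, right)
for row in joined:
    print(row)
```

The first ten rows agree with the table printed above. The sketch also yields three further rows for 'j' (1..8), which the printed table does not show, presumably because look() displays only a limited number of rows.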
An additional key comparison can be added, e.g.:
>>> from petl import look
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  |       1 |     2 |
+----------+---------+-------+
| 'apple'  |       2 |     4 |
+----------+---------+-------+
| 'apple'  |       2 |     5 |
+----------+---------+-------+
| 'orange' |       2 |     5 |
+----------+---------+-------+
| 'orange' |       9 |    14 |
+----------+---------+-------+
| 'orange' |      19 |   140 |
+----------+---------+-------+
| 'apple'  |       1 |     1 |
+----------+---------+-------+
| 'apple'  |       2 |     2 |
+----------+---------+-------+
| 'apple'  |       4 |     4 |
+----------+---------+-------+
| 'apple'  |       5 |     5 |
+----------+---------+-------+
>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+--------+---------+
| 'orange' |       4 |      9 | 'baz'   |
+----------+---------+--------+---------+
>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop', lfacet='fruit', rfacet='type')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  |       1 |     2 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     4 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     4 | 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     5 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     5 | 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' |       2 |     5 | 'orange' |       4 |      9 | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     2 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       4 |     4 | 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       5 |     5 | 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' |       5 |     5 | 'orange' |       4 |      9 | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
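The effect of lfacet and rfacet can be sketched as one extra equality test on top of the overlap test. Again this is an illustration under assumptions, not the petlx implementation: naive_facet_intervaljoin is a hypothetical name, and rows are plain tuples with the facet in the first position.

```python
def naive_facet_intervaljoin(left, right):
    """left rows: (facet, begin, end); right rows: (facet, start, stop, value)."""
    out = []
    for lfacet, begin, end in left:
        for rfacet, start, stop, value in right:
            # Rows join only if the facet values are equal AND the
            # intervals overlap by a non-zero amount (strict comparisons,
            # as inferred from the documented output).
            if lfacet == rfacet and start < end and begin < stop:
                out.append((lfacet, begin, end, rfacet, start, stop, value))
    return out


left = [('apple', 1, 2), ('apple', 2, 4), ('apple', 2, 5), ('orange', 2, 5),
        ('orange', 9, 14), ('orange', 19, 140), ('apple', 1, 1),
        ('apple', 2, 2), ('apple', 4, 4), ('apple', 5, 5)]
right = [('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar'), ('orange', 4, 9, 'baz')]

faceted = naive_facet_intervaljoin(left, right)
for row in faceted:
    print(row)
```

With the ten left rows shown above the sketch produces nine rows, matching the first nine rows of the documented result. The final documented row ('orange', 5, 5) has no counterpart among the left rows displayed, so it is not reproduced here.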
New in version 0.2.
- petlx.interval.intervalleftjoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, missing=None, lprefix=None, rprefix=None)
Like intervaljoin() but rows from the left table without a match in the right table are also included. E.g.:
>>> from petl import look
>>> from petlx.interval import intervalleftjoin
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  |       1 |     2 |
+----------+---------+-------+
| 'apple'  |       2 |     4 |
+----------+---------+-------+
| 'apple'  |       2 |     5 |
+----------+---------+-------+
| 'orange' |       2 |     5 |
+----------+---------+-------+
| 'orange' |       9 |    14 |
+----------+---------+-------+
| 'orange' |      19 |   140 |
+----------+---------+-------+
| 'apple'  |       1 |     1 |
+----------+---------+-------+
| 'apple'  |       2 |     2 |
+----------+---------+-------+
| 'apple'  |       4 |     4 |
+----------+---------+-------+
| 'apple'  |       5 |     5 |
+----------+---------+-------+
>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+--------+---------+
| 'orange' |       4 |      9 | 'baz'   |
+----------+---------+--------+---------+
>>> result = intervalleftjoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  |       1 |     2 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     4 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     4 | 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     5 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     5 | 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  |       2 |     5 | 'orange' |       4 |      9 | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' |       2 |     5 | 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' |       2 |     5 | 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' |       2 |     5 | 'orange' |       4 |      9 | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' |       9 |    14 | None     | None    | None   | None    |
+----------+---------+-------+----------+---------+--------+---------+
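The left-join behaviour in the example above can be sketched as follows. This is a plain-Python illustration, not the petlx implementation; naive_intervalleftjoin is a hypothetical name, and no facet comparison is made because the documented call passes neither lfacet nor rfacet.

```python
def naive_intervalleftjoin(left, right, missing=None):
    """left rows: (fruit, begin, end); right rows: (type, start, stop, value)."""
    out = []
    for lrow in left:
        begin, end = lrow[1], lrow[2]
        # Strict comparisons: a non-zero overlap is required (inferred
        # from the documented output).
        matches = [r for r in right if r[1] < end and begin < r[2]]
        if matches:
            out.extend(lrow + r for r in matches)
        else:
            # Unmatched left rows are kept and padded with the missing
            # value, one per right-hand field (four in this example).
            out.append(lrow + (missing,) * 4)
    return out


left = [('apple', 1, 2), ('apple', 2, 4), ('apple', 2, 5), ('orange', 2, 5),
        ('orange', 9, 14), ('orange', 19, 140), ('apple', 1, 1),
        ('apple', 2, 2), ('apple', 4, 4), ('apple', 5, 5)]
right = [('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar'), ('orange', 4, 9, 'baz')]

leftjoined = naive_intervalleftjoin(left, right)
for row in leftjoined[:10]:
    print(row)
```

Because facets are ignored, 'apple' rows join with the 'orange' interval and vice versa, which is why the documented result contains mixed rows such as ('apple', 2, 5, 'orange', 4, 9, 'baz').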
New in version 0.2.
- petlx.interval.intervalantijoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, missing=None)
Return rows from the left table with no overlapping rows from the right table.
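As a rough illustration of the anti-join, the following plain-Python sketch keeps only the left rows that overlap nothing on the right. It is not the petlx implementation; naive_intervalantijoin is a hypothetical name, and the strict overlap test is an assumption carried over from the intervaljoin examples.

```python
def naive_intervalantijoin(left, right, proximity=0):
    """Rows on both sides are (start, stop, payload) tuples."""
    out = []
    for ls, le, lv in left:
        has_match = any(
            rs < le + proximity and ls - proximity < re_
            for rs, re_, _ in right
        )
        if not has_match:
            out.append((ls, le, lv))
    return out


left = [(1, 2, 'a'), (2, 4, 'b'), (2, 5, 'c'), (9, 14, 'd'), (9, 140, 'e'),
        (1, 1, 'f'), (2, 2, 'g'), (4, 4, 'h'), (5, 5, 'i'), (1, 8, 'j')]
right = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]

unmatched = naive_intervalantijoin(left, right)
print(unmatched)
```

Under these assumptions the rows 'd', 'e' and 'f' remain: exactly the left rows that are absent from the intervaljoin result for the same data.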
New in version 0.16.
- petlx.interval.intervaljoinvalues(left, right, value, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0)
Convenience function to join the left table with values from a specific field in the right hand table.
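One way to picture this is a left table that gains one extra field holding the matching right-hand values. The sketch below is a guess at the general shape, not the petlx implementation: naive_intervaljoinvalues is a hypothetical name, and representing the joined values as a list per row is an assumption.

```python
def naive_intervaljoinvalues(left, right, proximity=0):
    """left rows: (start, stop, label); right rows: (start, stop, value)."""
    out = []
    for ls, le, label in left:
        # Collect the value field of every overlapping right-hand row
        # (strict comparisons, i.e. non-zero overlap, are assumed).
        values = [rv for rs, re_, rv in right
                  if rs < le + proximity and ls - proximity < re_]
        out.append((ls, le, label, values))
    return out


left = [(1, 2, 'a'), (2, 5, 'c'), (9, 14, 'd')]
right = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]

with_values = naive_intervaljoinvalues(left, right)
for row in with_values:
    print(row)
```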
New in version 0.5.3.
- petlx.interval.intervalsubtract(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0)
Subtract intervals in the right hand table from intervals in the left hand table.
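Interval subtraction can be sketched for a single left interval as a sweep over the sorted right-hand intervals. This is a generic illustration of the idea, not the petlx implementation; subtract_interval is a hypothetical helper, and how petlx treats touching endpoints or facets is not confirmed here.

```python
def subtract_interval(start, stop, cuts):
    """Return the parts of (start, stop) not covered by any (cstart, cstop) in cuts."""
    pieces = []
    cursor = start  # left edge of the part not yet accounted for
    for cstart, cstop in sorted(cuts):
        if cstop <= cursor:
            continue          # cut lies entirely before the cursor
        if cstart >= stop:
            break             # this and all later cuts start past the interval
        if cstart > cursor:
            pieces.append((cursor, cstart))  # uncovered gap before the cut
        cursor = max(cursor, cstop)
        if cursor >= stop:
            break
    if cursor < stop:
        pieces.append((cursor, stop))        # uncovered tail
    return pieces


print(subtract_interval(1, 10, [(2, 4), (3, 6), (8, 12)]))
print(subtract_interval(2, 4, [(1, 9)]))
print(subtract_interval(0, 5, []))
```

A left interval can therefore yield zero, one or several output intervals, depending on how the right-hand intervals cover it.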
New in version 0.5.4.
- petlx.interval.intervallookup(table, start='start', stop='stop', value=None, proximity=0)
Construct an interval lookup for the given table. E.g.:
>>> from petlx.interval import intervallookup
>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop')
>>> lkp.find(1, 2)
[(1, 4, 'foo')]
>>> lkp.find(2, 4)
[(1, 4, 'foo'), (3, 7, 'bar')]
>>> lkp.find(2, 5)
[(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]
>>> lkp.find(9, 14)
[]
>>> lkp.find(19, 140)
[]
>>> lkp.find(1)
[]
>>> lkp.find(2)
[(1, 4, 'foo')]
>>> lkp.find(4)
[(3, 7, 'bar')]
>>> lkp.find(5)
[(3, 7, 'bar'), (4, 9, 'baz')]
Note that there must be a non-zero overlap between the query and the interval for the interval to be retrieved, hence lkp.find(1) returns nothing. Use the proximity keyword argument to find intervals within a given distance of the query.
Some examples using the proximity and value keyword arguments:
>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop', value='value', proximity=1)
>>> lkp.find(1, 2)
['foo']
>>> lkp.find(2, 4)
['foo', 'bar', 'baz']
>>> lkp.find(2, 5)
['foo', 'bar', 'baz']
>>> lkp.find(9, 14)
['baz']
>>> lkp.find(19, 140)
[]
>>> lkp.find(1)
['foo']
>>> lkp.find(2)
['foo']
>>> lkp.find(4)
['foo', 'bar', 'baz']
>>> lkp.find(5)
['bar', 'baz']
>>> lkp.find(9)
['baz']
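The lookup semantics shown in the two examples above, including the effect of proximity and value, can be mimicked with a linear scan. This is only a sketch for reasoning about the results, not the petlx implementation (which is backed by an interval tree); NaiveIntervalLookup is a hypothetical class, value is taken as a tuple index rather than a field name, and the strict comparisons are inferred from the documented outputs.

```python
class NaiveIntervalLookup:
    """Linear-scan stand-in for an interval lookup over (start, stop, ...) rows."""

    def __init__(self, rows, value=None, proximity=0):
        self.rows = list(rows)
        self.value = value          # index of the field to return, or None for whole rows
        self.proximity = proximity  # how far the query is widened on each side

    def find(self, start, stop=None):
        if stop is None:
            stop = start            # a single position is a zero-length query
        hits = [r for r in self.rows
                if r[0] < stop + self.proximity and start - self.proximity < r[1]]
        if self.value is None:
            return hits
        return [r[self.value] for r in hits]


rows = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]

exact = NaiveIntervalLookup(rows)
print(exact.find(1, 2))   # [(1, 4, 'foo')]
print(exact.find(1))      # []

near = NaiveIntervalLookup(rows, value=2, proximity=1)
print(near.find(9, 14))   # ['baz']
print(near.find(1))       # ['foo']
```

With proximity=0 a query at position 1 only touches the start of the interval 1..4, so nothing is returned; widening the query by 1 on each side turns that contact into a real overlap.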
New in version 0.2.
- petlx.interval.intervallookupone(table, start='start', stop='stop', value=None, proximity=0, strict=True)
Construct an interval lookup for the given table, returning at most one result for each query. If strict=True is given, queries returning more than one result will raise a DuplicateKeyError. If strict=False is given, and there is more than one result, the first result is returned.
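The strict and non-strict behaviours described above can be sketched with a small helper. This is an illustration only: find_one is a hypothetical function, the DuplicateKeyError class defined here is a local stand-in for the petl exception of the same name, and returning None when nothing matches is an assumption.

```python
class DuplicateKeyError(Exception):
    """Local stand-in for petl's DuplicateKeyError (assumption for this sketch)."""


def find_one(rows, start, stop=None, strict=True):
    """Return at most one (start, stop, value) row overlapping the query."""
    if stop is None:
        stop = start
    hits = [r for r in rows if r[0] < stop and start < r[1]]
    if not hits:
        return None                 # assumed result when nothing matches
    if len(hits) > 1 and strict:
        raise DuplicateKeyError((start, stop))
    return hits[0]                  # the only hit, or the first of several


rows = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]

print(find_one(rows, 1, 2))                 # single match
print(find_one(rows, 2, 4, strict=False))   # first of two matches
try:
    find_one(rows, 2, 4)                    # two matches, strict
except DuplicateKeyError as exc:
    print('ambiguous query:', exc)
```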
See also intervallookup().
New in version 0.2.
- petlx.interval.intervalrecordlookup(table, start='start', stop='stop', proximity=0)
As intervallookup() but return records.
New in version 0.2.
- petlx.interval.intervalrecordlookupone(table, start='start', stop='stop', proximity=0, strict=True)
As intervallookupone() but return records.
New in version 0.2.
- petlx.interval.facetintervallookup(table, facet, start='start', stop='stop', value=None, proximity=0)
Construct a faceted interval lookup for the given table. E.g.:
>>> from petl import look
>>> from petlx.interval import facetintervallookup
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+--------+---------+
| 'orange' |       4 |      9 | 'baz'   |
+----------+---------+--------+---------+
>>> lkp = facetintervallookup(table, facet='type', start='start', stop='stop')
>>> lkp['apple'].find(1, 2)
[('apple', 1, 4, 'foo')]
>>> lkp['apple'].find(2, 4)
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['apple'].find(2, 5)
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['orange'].find(2, 5)
[('orange', 4, 9, 'baz')]
>>> lkp['orange'].find(9, 14)
[]
>>> lkp['orange'].find(19, 140)
[]
>>> lkp['apple'].find(1)
[]
>>> lkp['apple'].find(2)
[('apple', 1, 4, 'foo')]
>>> lkp['apple'].find(4)
[('apple', 3, 7, 'bar')]
>>> lkp['apple'].find(5)
[('apple', 3, 7, 'bar')]
>>> lkp['orange'].find(5)
[('orange', 4, 9, 'baz')]
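A faceted lookup can be pictured as a dictionary that maps each facet value to its own independent interval lookup. The sketch below follows that picture with a linear scan; it is not the petlx implementation, and NaiveLookup and naive_facetintervallookup are hypothetical names.

```python
from collections import defaultdict


class NaiveLookup:
    """Linear-scan lookup over (facet, start, stop, value) rows of one facet."""

    def __init__(self):
        self.rows = []

    def find(self, start, stop=None):
        if stop is None:
            stop = start
        # Strict comparisons: non-zero overlap required (inferred from
        # the documented outputs).
        return [r for r in self.rows if r[1] < stop and start < r[2]]


def naive_facetintervallookup(rows):
    """Group rows by their first field and build one lookup per group."""
    lkp = defaultdict(NaiveLookup)
    for row in rows:
        lkp[row[0]].rows.append(row)
    return lkp


rows = [('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar'), ('orange', 4, 9, 'baz')]
facet_lkp = naive_facetintervallookup(rows)

print(facet_lkp['apple'].find(2, 5))
print(facet_lkp['orange'].find(2, 5))
```

The same query (2, 5) returns different rows per facet because each facet only ever sees its own intervals.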
New in version 0.2.
- petlx.interval.facetintervallookupone(table, facet, start='start', stop='stop', value=None, proximity=0, strict=True)
Construct a faceted interval lookup for the given table, returning at most one result for each query, e.g.:
>>> from petl import look
>>> from petlx.interval import facetintervallookupone
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  |       1 |      4 | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  |       3 |      7 | 'bar'   |
+----------+---------+--------+---------+
| 'orange' |       4 |      9 | 'baz'   |
+----------+---------+--------+---------+
>>> lkp = facetintervallookupone(table, facet='type', start='start', stop='stop', value='value')
>>> lkp['apple'].find(1, 2)
'foo'
>>> lkp['apple'].find(2, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'].find(2, 5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'].find(4, 5)
'bar'
>>> lkp['orange'].find(4, 5)
'baz'
>>> lkp['apple'].find(5, 7)
'bar'
>>> lkp['orange'].find(5, 7)
'baz'
>>> lkp['apple'].find(8, 9)
>>> lkp['orange'].find(8, 9)
'baz'
>>> lkp['orange'].find(9, 14)
>>> lkp['orange'].find(19, 140)
>>> lkp['apple'].find(1)
>>> lkp['apple'].find(2)
'foo'
>>> lkp['apple'].find(4)
'bar'
>>> lkp['apple'].find(5)
'bar'
>>> lkp['orange'].find(5)
'baz'
>>> lkp['apple'].find(8)
>>> lkp['orange'].find(8)
'baz'
If strict=True is given, queries returning more than one result will raise a DuplicateKeyError. If strict=False is given, and there is more than one result, the first result is returned.
See also facetintervallookup().
New in version 0.2.
- petlx.interval.facetintervalrecordlookup(table, facet, start='start', stop='stop', proximity=0)
As facetintervallookup() but return records.
New in version 0.2.
- petlx.interval.facetintervalrecordlookupone(table, facet, start, stop, proximity=0, strict=True)
As facetintervallookupone() but return records.
New in version 0.2.
- petlx.interval.collapsedintervals(tbl, start='start', stop='stop', facet=None)
Utility function to collapse intervals in a table.
If no facet key is given, returns an iterator over (start, stop) tuples.
If facet key is given, returns an iterator over (key, start, stop) tuples.
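Collapsing intervals is commonly done by sorting them and merging each interval into the previous one when they overlap. The sketch below shows that general technique and the two output shapes described above; it is not the petlx implementation, collapse and collapse_by_facet are hypothetical names, and merging intervals that merely touch is an assumption.

```python
def collapse(intervals):
    """Yield merged (start, stop) tuples from an iterable of (start, stop)."""
    current = None
    for start, stop in sorted(intervals):
        if current is None:
            current = [start, stop]
        elif start <= current[1]:
            # Overlapping (or touching, by assumption): extend the run.
            current[1] = max(current[1], stop)
        else:
            yield tuple(current)
            current = [start, stop]
    if current is not None:
        yield tuple(current)


def collapse_by_facet(rows):
    """Yield (key, start, stop) tuples from (key, start, stop) rows."""
    groups = {}
    for key, start, stop in rows:
        groups.setdefault(key, []).append((start, stop))
    for key in sorted(groups):
        for start, stop in collapse(groups[key]):
            yield (key, start, stop)


print(list(collapse([(1, 4), (3, 7), (4, 9)])))
print(list(collapse_by_facet([('apple', 1, 4), ('apple', 3, 7), ('orange', 4, 9)])))
```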
New in version 0.5.5.
- petlx.interval.tupletree(table, start='start', stop='stop', value=None)
Construct an interval tree for the given table, where each node in the tree is a row of the table.
- petlx.interval.tupletrees(table, facet, start='start', stop='stop', value=None)
Construct faceted interval trees for the given table, where each node in the tree is a row of the table.