Intervals (bx-python)

These functions require the bx.intervals package, which is provided by bx-python. Installation instructions are at https://bitbucket.org/james_taylor/bx-python/wiki/Home; alternatively, try pip install bx-python.

petlx.interval.intervallookup(table, start='start', stop='stop', valuespec=None, proximity=0)

Construct an interval lookup for the given table. E.g.:

>>> from petlx.interval import intervallookup
>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop')
>>> lkp[1:2]
[(1, 4, 'foo')]
>>> lkp[2:4]
[(1, 4, 'foo'), (3, 7, 'bar')]
>>> lkp[2:5]
[(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]
>>> lkp[9:14]
[]
>>> lkp[19:140]
[]
>>> lkp[1]
[]
>>> lkp[2]
[(1, 4, 'foo')]
>>> lkp[4]
[(3, 7, 'bar')]
>>> lkp[5]
[(3, 7, 'bar'), (4, 9, 'baz')]

Note that an interval is retrieved only if it overlaps the query by a non-zero amount, which is why lkp[1] returns nothing. To also find intervals within a given distance of the query, use the proximity keyword argument.
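
The results above are consistent with a strict overlap test: an interval matches when its start is less than the query stop and its stop is greater than the query start. The following is a minimal pure-Python sketch of that rule, not the bx-python implementation. The helper name find_overlaps is hypothetical, and treating lkp[i] as the query i:i is an assumption inferred from the outputs.

```python
# Illustrative sketch only: a linear scan that mirrors the outputs shown
# above. The real lookup is backed by bx.intervals; find_overlaps is a
# hypothetical name, not part of petlx.
table = [['start', 'stop', 'value'],
         [1, 4, 'foo'],
         [3, 7, 'bar'],
         [4, 9, 'baz']]


def find_overlaps(rows, qstart, qstop):
    """Return rows whose interval strictly overlaps the query interval."""
    return [tuple(r) for r in rows if r[0] < qstop and r[1] > qstart]


rows = table[1:]  # skip the header row
print(find_overlaps(rows, 1, 2))  # compare with lkp[1:2]
print(find_overlaps(rows, 2, 5))  # compare with lkp[2:5]
print(find_overlaps(rows, 1, 1))  # compare with lkp[1] (assumed to mean 1:1)
print(find_overlaps(rows, 4, 4))  # compare with lkp[4] (assumed to mean 4:4)
```

A linear scan is used here only for readability; an interval tree such as the one in bx.intervals answers the same kind of query without visiting every row.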

Some examples using the proximity and valuespec keyword arguments:

>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop', valuespec='value', proximity=1)
>>> lkp[1:2]
['foo']
>>> lkp[2:4]
['foo', 'bar', 'baz']
>>> lkp[2:5]
['foo', 'bar', 'baz']
>>> lkp[9:14]
['baz']
>>> lkp[19:140]
[]
>>> lkp[1]
['foo']
>>> lkp[2]
['foo']
>>> lkp[4]
['foo', 'bar', 'baz']
>>> lkp[5]
['bar', 'baz']
>>> lkp[9]
['baz']
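
One way to read these results is that proximity widens the query by that distance on each side before the overlap test, and valuespec selects which field is returned for each match. A rough sketch under those assumptions follows. The names find_values and value_index are hypothetical, and the strict comparison is inferred from the outputs rather than taken from the library source.

```python
# Sketch only, not the petlx implementation.
rows = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]


def find_values(rows, qstart, qstop, proximity=0, value_index=2):
    """Widen the query by `proximity`, then return one field per match."""
    lo, hi = qstart - proximity, qstop + proximity
    return [r[value_index] for r in rows if r[0] < hi and r[1] > lo]


print(find_values(rows, 1, 2, proximity=1))   # compare with lkp[1:2]
print(find_values(rows, 9, 14, proximity=1))  # compare with lkp[9:14]
print(find_values(rows, 4, 4, proximity=1))   # compare with lkp[4] (assumed 4:4)
```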

New in version 0.2.

petlx.interval.intervallookupone(table, start='start', stop='stop', valuespec=None, proximity=0, strict=True)

Construct an interval lookup for the given table, returning at most one result for each query. If strict=True (the default), a query that matches more than one interval raises a DuplicateKeyError. If strict=False and there is more than one result, the first result is returned.
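
The strict behaviour can be pictured as a small post-processing step applied to the list of matches. The sketch below is only an illustration: pick_one is a hypothetical helper, DuplicateKeyError is a local stand-in for the petl exception, and returning None when nothing matches is inferred from the facetintervallookupone() examples on this page.

```python
class DuplicateKeyError(Exception):
    """Local stand-in for petl's DuplicateKeyError (sketch only)."""


def pick_one(matches, strict=True):
    """Reduce a list of matches to at most one result."""
    if not matches:
        return None  # assumption: no match yields None
    if len(matches) > 1 and strict:
        raise DuplicateKeyError(matches)
    return matches[0]  # the single match, or the first of several


print(pick_one(['foo']))
print(pick_one(['foo', 'bar'], strict=False))
try:
    pick_one(['foo', 'bar'])
except DuplicateKeyError:
    print('more than one result')
```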

See also intervallookup().

New in version 0.2.

petlx.interval.intervalrecordlookup(table, start='start', stop='stop', proximity=0)

Like intervallookup(), but returns records (dictionaries of values indexed by field name).
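
To illustrate what a record looks like, the sketch below pairs each matching row with the header fields. It is a pure-Python approximation that assumes the same strict overlap rule as the intervallookup() examples; as_record is a hypothetical helper, not a petlx function.

```python
table = [['start', 'stop', 'value'],
         [1, 4, 'foo'],
         [3, 7, 'bar'],
         [4, 9, 'baz']]

header, rows = table[0], table[1:]


def as_record(row):
    """Map field names to the values of one row."""
    return dict(zip(header, row))


# Rows overlapping the query 1:2, returned as dictionaries.
matches = [as_record(r) for r in rows if r[0] < 2 and r[1] > 1]
print(matches)
```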

New in version 0.2.

petlx.interval.intervalrecordlookupone(table, start='start', stop='stop', proximity=0, strict=True)

Like intervallookupone(), but returns records (dictionaries of values indexed by field name).

New in version 0.2.

petlx.interval.facetintervallookup(table, key, start='start', stop='stop', valuespec=None, proximity=0)

Construct a faceted interval lookup for the given table. E.g.:

>>> from petl import look
>>> from petlx.interval import facetintervallookup
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> lkp = facetintervallookup(table, key='type', start='start', stop='stop')
>>> lkp['apple'][1:2]
[('apple', 1, 4, 'foo')]
>>> lkp['apple'][2:4]
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['apple'][2:5]
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['orange'][2:5]
[('orange', 4, 9, 'baz')]
>>> lkp['orange'][9:14]
[]
>>> lkp['orange'][19:140]
[]
>>> lkp['apple'][1]
[]
>>> lkp['apple'][2]
[('apple', 1, 4, 'foo')]
>>> lkp['apple'][4]
[('apple', 3, 7, 'bar')]
>>> lkp['apple'][5]
[('apple', 3, 7, 'bar')]
>>> lkp['orange'][5]
[('orange', 4, 9, 'baz')]
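
Conceptually, a faceted lookup behaves like one interval lookup per distinct key value. The sketch below groups the rows by the 'type' field and then applies a strict overlap test within the chosen group. It is a pure-Python approximation, not the petlx implementation; the names facets and find are hypothetical.

```python
from collections import defaultdict

table = [['type', 'start', 'stop', 'value'],
         ['apple', 1, 4, 'foo'],
         ['apple', 3, 7, 'bar'],
         ['orange', 4, 9, 'baz']]

header, rows = table[0], table[1:]
key, start, stop = (header.index(f) for f in ('type', 'start', 'stop'))

# Group the rows by facet value.
facets = defaultdict(list)
for row in rows:
    facets[row[key]].append(tuple(row))


def find(facet, qstart, qstop):
    """Return rows of one facet that strictly overlap the query."""
    return [r for r in facets.get(facet, [])
            if r[start] < qstop and r[stop] > qstart]


print(find('apple', 2, 5))   # compare with lkp['apple'][2:5]
print(find('orange', 2, 5))  # compare with lkp['orange'][2:5]
```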

New in version 0.2.

petlx.interval.facetintervallookupone(table, key, start='start', stop='stop', valuespec=None, proximity=0, strict=True)

Construct a faceted interval lookup for the given table, returning at most one result for each query, e.g.:

>>> from petl import look
>>> from petlx.interval import facetintervallookupone
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> lkp = facetintervallookupone(table, key='type', start='start', stop='stop', valuespec='value')
>>> lkp['apple'][1:2]
'foo'
>>> lkp['apple'][2:4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'][2:5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'][4:5]
'bar'
>>> lkp['orange'][4:5]
'baz'
>>> lkp['apple'][5:7]
'bar'
>>> lkp['orange'][5:7]
'baz'
>>> lkp['apple'][8:9]
>>> lkp['orange'][8:9]
'baz'
>>> lkp['orange'][9:14]
>>> lkp['orange'][19:140]
>>> lkp['apple'][1]
>>> lkp['apple'][2]
'foo'
>>> lkp['apple'][4]
'bar'
>>> lkp['apple'][5]
'bar'
>>> lkp['orange'][5]
'baz'
>>> lkp['apple'][8]
>>> lkp['orange'][8]
'baz'

If strict=True (the default), a query that matches more than one interval raises a DuplicateKeyError, as in the tracebacks above. If strict=False and there is more than one result, the first result is returned.

See also facetintervallookup().

New in version 0.2.

petlx.interval.facetintervalrecordlookup(table, key, start='start', stop='stop', proximity=0)

Like facetintervallookup(), but returns records (dictionaries of values indexed by field name).

New in version 0.2.

petlx.interval.facetintervalrecordlookupone(table, key, start, stop, proximity=0, strict=True)

Like facetintervallookupone(), but returns records (dictionaries of values indexed by field name).

New in version 0.2.

petlx.interval.intervaljoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0)

Join two tables by overlapping intervals. E.g.:

>>> from petl import look
>>> from petlx.interval import intervaljoin
>>> look(left)
+---------+-------+--------+
| 'begin' | 'end' | 'quux' |
+=========+=======+========+
| 1       | 2     | 'a'    |
+---------+-------+--------+
| 2       | 4     | 'b'    |
+---------+-------+--------+
| 2       | 5     | 'c'    |
+---------+-------+--------+
| 9       | 14    | 'd'    |
+---------+-------+--------+
| 9       | 140   | 'e'    |
+---------+-------+--------+
| 1       | 1     | 'f'    |
+---------+-------+--------+
| 2       | 2     | 'g'    |
+---------+-------+--------+
| 4       | 4     | 'h'    |
+---------+-------+--------+
| 5       | 5     | 'i'    |
+---------+-------+--------+
| 1       | 8     | 'j'    |
+---------+-------+--------+

>>> look(right)
+---------+--------+---------+
| 'start' | 'stop' | 'value' |
+=========+========+=========+
| 1       | 4      | 'foo'   |
+---------+--------+---------+
| 3       | 7      | 'bar'   |
+---------+--------+---------+
| 4       | 9      | 'baz'   |
+---------+--------+---------+

>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result)
+---------+-------+--------+---------+--------+---------+
| 'begin' | 'end' | 'quux' | 'start' | 'stop' | 'value' |
+=========+=======+========+=========+========+=========+
| 1       | 2     | 'a'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 4     | 'b'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 4     | 'b'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 4       | 9      | 'baz'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 2     | 'g'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 4       | 4     | 'h'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 5       | 5     | 'i'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 5       | 5     | 'i'    | 4       | 9      | 'baz'   |
+---------+-------+--------+---------+--------+---------+
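
The join above can be approximated with a nested loop that pairs every left row with every right row whose interval strictly overlaps it. This is a pure-Python sketch under the assumption that the overlap rule matches the intervallookup() examples; naive_interval_join is a hypothetical name, and the real function relies on bx.intervals rather than a nested loop.

```python
left = [['begin', 'end', 'quux'],
        [1, 2, 'a'], [2, 4, 'b'], [2, 5, 'c'], [9, 14, 'd'], [9, 140, 'e'],
        [1, 1, 'f'], [2, 2, 'g'], [4, 4, 'h'], [5, 5, 'i'], [1, 8, 'j']]
right = [['start', 'stop', 'value'],
         [1, 4, 'foo'], [3, 7, 'bar'], [4, 9, 'baz']]


def naive_interval_join(left, right, lstart, lstop, rstart, rstop):
    """Yield a header, then one joined row per overlapping pair."""
    lhdr, rhdr = left[0], right[0]
    l_lo, l_hi = lhdr.index(lstart), lhdr.index(lstop)
    r_lo, r_hi = rhdr.index(rstart), rhdr.index(rstop)
    yield tuple(lhdr) + tuple(rhdr)
    for lrow in left[1:]:
        for rrow in right[1:]:
            # Strict overlap: right starts before left ends, and ends after
            # left starts.
            if rrow[r_lo] < lrow[l_hi] and rrow[r_hi] > lrow[l_lo]:
                yield tuple(lrow) + tuple(rrow)


result = list(naive_interval_join(left, right,
                                  'begin', 'end', 'start', 'stop'))
for row in result[:11]:  # header plus the first ten data rows
    print(row)
```

The first ten data rows of this sketch match the table displayed above. The sketch also produces rows for 'j', which the display does not show, possibly because look() prints only a limited number of rows.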

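An additional key comparison can be added with the lfacet and rfacet arguments, so that rows are joined only when their facet values are also equal, e.g.: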

>>> from petl import look
>>> from petlx.interval import intervaljoin
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  | 1       | 2     |
+----------+---------+-------+
| 'apple'  | 2       | 4     |
+----------+---------+-------+
| 'apple'  | 2       | 5     |
+----------+---------+-------+
| 'orange' | 2       | 5     |
+----------+---------+-------+
| 'orange' | 9       | 14    |
+----------+---------+-------+
| 'orange' | 19      | 140   |
+----------+---------+-------+
| 'apple'  | 1       | 1     |
+----------+---------+-------+
| 'apple'  | 2       | 2     |
+----------+---------+-------+
| 'apple'  | 4       | 4     |
+----------+---------+-------+
| 'apple'  | 5       | 5     |
+----------+---------+-------+

>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop', lfacet='fruit', rfacet='type')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  | 1       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 4       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 5       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 5       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
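
In terms of a plain nested loop, the facet comparison is one extra equality condition on top of the interval overlap. The sketch below is a pure-Python approximation, not the petlx implementation. Note one assumption: the last row of the result above implies a left row ('orange', 5, 5) that the left table display does not show, so it is added here and marked as assumed.

```python
left = [['fruit', 'begin', 'end'],
        ['apple', 1, 2], ['apple', 2, 4], ['apple', 2, 5],
        ['orange', 2, 5], ['orange', 9, 14], ['orange', 19, 140],
        ['apple', 1, 1], ['apple', 2, 2], ['apple', 4, 4], ['apple', 5, 5],
        ['orange', 5, 5]]  # assumed row: implied by the result, not displayed
right = [['type', 'start', 'stop', 'value'],
         ['apple', 1, 4, 'foo'], ['apple', 3, 7, 'bar'],
         ['orange', 4, 9, 'baz']]

# Left rows are (fruit, begin, end); right rows are (type, start, stop, value).
joined = [tuple(l) + tuple(r)
          for l in left[1:]
          for r in right[1:]
          if l[0] == r[0]                    # facet values must be equal
          and r[1] < l[2] and r[2] > l[1]]   # strict interval overlap

for row in joined:
    print(row)
```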

New in version 0.2.

petlx.interval.intervalleftjoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, missing=None)

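Like intervaljoin(), but rows from the left table without a match in the right table are also included, with the right-hand fields filled with the missing value (None by default). In the following example lfacet and rfacet are not given, so rows are matched regardless of fruit type: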

>>> from petl import look
>>> from petlx.interval import intervalleftjoin
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  | 1       | 2     |
+----------+---------+-------+
| 'apple'  | 2       | 4     |
+----------+---------+-------+
| 'apple'  | 2       | 5     |
+----------+---------+-------+
| 'orange' | 2       | 5     |
+----------+---------+-------+
| 'orange' | 9       | 14    |
+----------+---------+-------+
| 'orange' | 19      | 140   |
+----------+---------+-------+
| 'apple'  | 1       | 1     |
+----------+---------+-------+
| 'apple'  | 2       | 2     |
+----------+---------+-------+
| 'apple'  | 4       | 4     |
+----------+---------+-------+
| 'apple'  | 5       | 5     |
+----------+---------+-------+

>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> result = intervalleftjoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  | 1       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 9       | 14    | None     | None    | None   | None    |
+----------+---------+-------+----------+---------+--------+---------+
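
The left-join behaviour can be sketched as follows: each left row is emitted once per overlapping right row, or once with padding when nothing overlaps. This is a pure-Python approximation assuming the strict overlap rule from the intervallookup() examples; naive_interval_left_join is a hypothetical name, and no facet comparison is made, which mirrors the call above.

```python
left = [['fruit', 'begin', 'end'],
        ['apple', 1, 2], ['apple', 2, 4], ['apple', 2, 5],
        ['orange', 2, 5], ['orange', 9, 14], ['orange', 19, 140],
        ['apple', 1, 1], ['apple', 2, 2], ['apple', 4, 4], ['apple', 5, 5]]
right = [['type', 'start', 'stop', 'value'],
         ['apple', 1, 4, 'foo'], ['apple', 3, 7, 'bar'],
         ['orange', 4, 9, 'baz']]


def naive_interval_left_join(left, right, missing=None):
    """Yield joined rows, padding unmatched left rows with `missing`."""
    width = len(right[0])
    for l in left[1:]:
        # Left rows are (fruit, begin, end); right rows are
        # (type, start, stop, value). Strict overlap test only.
        matches = [r for r in right[1:] if r[1] < l[2] and r[2] > l[1]]
        if matches:
            for r in matches:
                yield tuple(l) + tuple(r)
        else:
            yield tuple(l) + (missing,) * width


result = list(naive_interval_left_join(left, right))
for row in result[:10]:
    print(row)
```

The first ten rows of this sketch correspond to the table displayed above, ending with the unmatched ('orange', 9, 14) row padded with None.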

New in version 0.2.
