Intervals (bx-python)

The functions in this module require the bx.intervals package, which is part of bx-python. Installation instructions can be found at https://bitbucket.org/james_taylor/bx-python/wiki/Home, or try pip install bx-python.

petlx.interval.intervallookup(table, start='start', stop='stop', valuespec=None, proximity=0)

Construct an interval lookup for the given table. E.g.:

>>> from petlx.interval import intervallookup    
>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop')
>>> lkp[1:2]
[(1, 4, 'foo')]
>>> lkp[2:4]
[(1, 4, 'foo'), (3, 7, 'bar')]
>>> lkp[2:5]
[(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]
>>> lkp[9:14]
[]
>>> lkp[19:140]
[]
>>> lkp[1]
[]
>>> lkp[2]
[(1, 4, 'foo')]
>>> lkp[4]
[(3, 7, 'bar')]
>>> lkp[5]
[(3, 7, 'bar'), (4, 9, 'baz')]

Note that there must be a non-zero overlap between the query and the interval for the interval to be retrieved. An interval that only touches the query at a boundary is not retrieved, hence lkp[1] returns an empty list even though an interval starts at 1. Use the proximity keyword argument to find intervals within a given distance of the query.

Some examples using the proximity and valuespec keyword arguments:

>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop', valuespec='value', proximity=1)
>>> lkp[1:2]
['foo']
>>> lkp[2:4]
['foo', 'bar', 'baz']
>>> lkp[2:5]
['foo', 'bar', 'baz']
>>> lkp[9:14]
['baz']
>>> lkp[19:140]
[]
>>> lkp[1]
['foo']
>>> lkp[2]
['foo']
>>> lkp[4]
['foo', 'bar', 'baz']
>>> lkp[5]
['bar', 'baz']
>>> lkp[9]
['baz']
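
The outputs in both examples above are consistent with a simple overlap test. The following is a minimal pure-Python sketch of that test, not the petlx implementation (which relies on bx.intervals). The function names overlaps and find are hypothetical, and the rule itself is inferred from the documented outputs rather than taken from the source code.

```python
def overlaps(istart, istop, qstart, qstop, proximity=0):
    """Return True if the interval (istart, istop) is hit by the query.

    Assumed rule, inferred from the documented outputs: widen the query
    by `proximity` on both sides, then compare with strict inequalities
    so that intervals merely touching the query are excluded.
    """
    return istart < qstop + proximity and istop > qstart - proximity


def find(rows, query, proximity=0):
    """Stand-in for lkp[query]; `query` is a slice or a single position."""
    if isinstance(query, slice):
        qstart, qstop = query.start, query.stop
    else:
        # A point query q is modelled as the zero-length interval (q, q).
        qstart = qstop = query
    return [row for row in rows
            if overlaps(row[0], row[1], qstart, qstop, proximity)]


rows = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]

print(find(rows, slice(2, 4)))   # [(1, 4, 'foo'), (3, 7, 'bar')]
print(find(rows, 1))             # []
print(find(rows, 4))             # [(3, 7, 'bar')]
# With proximity=1, keeping only the 'value' field as valuespec would:
print([r[2] for r in find(rows, 4, proximity=1)])   # ['foo', 'bar', 'baz']
```

Under this model a point query at 4 misses (1, 4) and (4, 9) because both only touch the query, which matches the lkp[4] output shown earlier.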

New in version 0.2.

petlx.interval.intervallookupone(table, start='start', stop='stop', valuespec=None, proximity=0, strict=True)

Construct an interval lookup for the given table, returning at most one result for each query. If strict=True is given, queries returning more than one result will raise a DuplicateKeyError. If strict=False is given, and there is more than one result, the first result is returned.

See also intervallookup().
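
The effect of the strict argument can be sketched in plain Python as a reduction from a list of results to a single result. This is an illustration only: pick_one is a hypothetical helper, and the DuplicateKeyError class defined here is a local stand-in for petl.util.DuplicateKeyError. Returning None for an empty result is assumed from the faceted examples elsewhere on this page.

```python
class DuplicateKeyError(Exception):
    """Local stand-in for petl.util.DuplicateKeyError."""


def pick_one(results, strict=True):
    """Reduce a list of lookup results to at most one result."""
    if not results:
        return None                  # no interval matched the query
    if len(results) > 1 and strict:
        raise DuplicateKeyError      # ambiguous query under strict=True
    return results[0]                # first result wins under strict=False


print(pick_one(['foo']))                        # foo
print(pick_one(['foo', 'bar'], strict=False))   # foo
print(pick_one([]))                             # None
```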

New in version 0.2.

petlx.interval.intervalrecordlookup(table, start='start', stop='stop', proximity=0)

As intervallookup() but return records (dictionaries of values indexed by field name).
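
As a rough illustration of what "record" means here, each matching row can be paired with the header to give a dictionary. The helper as_record is hypothetical, and the exact record type returned by petlx is not shown on this page.

```python
header = ('start', 'stop', 'value')
rows = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]


def as_record(header, row):
    """Pair field names with row values to build a dictionary."""
    return dict(zip(header, row))


rec = as_record(header, rows[0])
print(rec['value'])   # foo
print(rec['stop'])    # 4
```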

New in version 0.2.

petlx.interval.intervalrecordlookupone(table, start='start', stop='stop', proximity=0, strict=True)

As intervallookupone() but return records (dictionaries of values indexed by field name).

New in version 0.2.

petlx.interval.facetintervallookup(table, key, start='start', stop='stop', valuespec=None, proximity=0)

Construct a faceted interval lookup for the given table. E.g.:

>>> from petl import look
>>> from petlx.interval import facetintervallookup
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> lkp = facetintervallookup(table, key='type', start='start', stop='stop')
>>> lkp['apple'][1:2]
[('apple', 1, 4, 'foo')]
>>> lkp['apple'][2:4]
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['apple'][2:5]
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['orange'][2:5]
[('orange', 4, 9, 'baz')]
>>> lkp['orange'][9:14]
[]
>>> lkp['orange'][19:140]
[]
>>> lkp['apple'][1]
[]
>>> lkp['apple'][2]
[('apple', 1, 4, 'foo')]
>>> lkp['apple'][4]
[('apple', 3, 7, 'bar')]
>>> lkp['apple'][5]
[('apple', 3, 7, 'bar')]
>>> lkp['orange'][5]
[('orange', 4, 9, 'baz')]
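
Conceptually, a faceted lookup behaves like a dictionary of independent interval lookups, one per value of the key field. The sketch below models that in pure Python; build_facets and find are hypothetical names, and the overlap rule is the one inferred from the outputs above, not code taken from petlx.

```python
from collections import defaultdict

table = [('type', 'start', 'stop', 'value'),
         ('apple', 1, 4, 'foo'),
         ('apple', 3, 7, 'bar'),
         ('orange', 4, 9, 'baz')]


def build_facets(table, key):
    """Group the data rows by the value of the `key` field."""
    header, rows = table[0], table[1:]
    k = header.index(key)
    facets = defaultdict(list)
    for row in rows:
        facets[row[k]].append(row)
    return facets


def find(rows, start_idx, stop_idx, qstart, qstop):
    """Rows whose interval strictly overlaps the query (qstart, qstop)."""
    return [r for r in rows if r[start_idx] < qstop and r[stop_idx] > qstart]


facets = build_facets(table, 'type')
print(find(facets['apple'], 1, 2, 2, 5))
# [('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
print(find(facets['orange'], 1, 2, 2, 5))
# [('orange', 4, 9, 'baz')]
```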

New in version 0.2.

petlx.interval.facetintervallookupone(table, key, start='start', stop='stop', valuespec=None, proximity=0, strict=True)

Construct a faceted interval lookup for the given table, returning at most one result for each query, e.g.:

>>> from petl import look
>>> from petlx.interval import facetintervallookupone
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> lkp = facetintervallookupone(table, key='type', start='start', stop='stop', valuespec='value')
>>> lkp['apple'][1:2]
'foo'
>>> lkp['apple'][2:4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'][2:5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'][4:5]
'bar'
>>> lkp['orange'][4:5]
'baz'
>>> lkp['apple'][5:7]
'bar'
>>> lkp['orange'][5:7]
'baz'
>>> lkp['apple'][8:9]
>>> lkp['orange'][8:9]
'baz'
>>> lkp['orange'][9:14]
>>> lkp['orange'][19:140]
>>> lkp['apple'][1]
>>> lkp['apple'][2]
'foo'
>>> lkp['apple'][4]
'bar'
>>> lkp['apple'][5]
'bar'
>>> lkp['orange'][5]
'baz'
>>> lkp['apple'][8]
>>> lkp['orange'][8]
'baz'

If strict=True is given, queries returning more than one result will raise a DuplicateKeyError. If strict=False is given, and there is more than one result, the first result is returned.

See also facetintervallookup().

New in version 0.2.

petlx.interval.facetintervalrecordlookup(table, key, start='start', stop='stop', proximity=0)

As facetintervallookup() but return records (dictionaries of values indexed by field name).

New in version 0.2.

petlx.interval.facetintervalrecordlookupone(table, key, start, stop, proximity=0, strict=True)

As facetintervallookupone() but return records (dictionaries of values indexed by field name).

New in version 0.2.

petlx.interval.intervaljoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0)

Join two tables by overlapping intervals. E.g.:

>>> from petl import look
>>> from petlx.interval import intervaljoin
>>> look(left)
+---------+-------+--------+
| 'begin' | 'end' | 'quux' |
+=========+=======+========+
| 1       | 2     | 'a'    |
+---------+-------+--------+
| 2       | 4     | 'b'    |
+---------+-------+--------+
| 2       | 5     | 'c'    |
+---------+-------+--------+
| 9       | 14    | 'd'    |
+---------+-------+--------+
| 9       | 140   | 'e'    |
+---------+-------+--------+
| 1       | 1     | 'f'    |
+---------+-------+--------+
| 2       | 2     | 'g'    |
+---------+-------+--------+
| 4       | 4     | 'h'    |
+---------+-------+--------+
| 5       | 5     | 'i'    |
+---------+-------+--------+
| 1       | 8     | 'j'    |
+---------+-------+--------+

>>> look(right)
+---------+--------+---------+
| 'start' | 'stop' | 'value' |
+=========+========+=========+
| 1       | 4      | 'foo'   |
+---------+--------+---------+
| 3       | 7      | 'bar'   |
+---------+--------+---------+
| 4       | 9      | 'baz'   |
+---------+--------+---------+

>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result) 
+---------+-------+--------+---------+--------+---------+
| 'begin' | 'end' | 'quux' | 'start' | 'stop' | 'value' |
+=========+=======+========+=========+========+=========+
| 1       | 2     | 'a'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 4     | 'b'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 4     | 'b'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 4       | 9      | 'baz'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 2     | 'g'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 4       | 4     | 'h'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 5       | 5     | 'i'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 5       | 5     | 'i'    | 4       | 9      | 'baz'   |
+---------+-------+--------+---------+--------+---------+

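An additional key comparison can be added with the lfacet and rfacet keyword arguments, so that rows are joined only if their facet values are equal as well as their intervals overlapping. E.g.: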

>>> from petl import look
>>> from petlx.interval import intervaljoin    
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  | 1       | 2     |
+----------+---------+-------+
| 'apple'  | 2       | 4     |
+----------+---------+-------+
| 'apple'  | 2       | 5     |
+----------+---------+-------+
| 'orange' | 2       | 5     |
+----------+---------+-------+
| 'orange' | 9       | 14    |
+----------+---------+-------+
| 'orange' | 19      | 140   |
+----------+---------+-------+
| 'apple'  | 1       | 1     |
+----------+---------+-------+
| 'apple'  | 2       | 2     |
+----------+---------+-------+
| 'apple'  | 4       | 4     |
+----------+---------+-------+
| 'apple'  | 5       | 5     |
+----------+---------+-------+

>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop', lfacet='fruit', rfacet='type')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  | 1       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 4       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 5       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 5       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
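
The join semantics can be modelled as a nested loop over both tables with an overlap test and an optional facet comparison. The following is a sketch under that assumption, using a hypothetical interval_join function over header-first lists of tuples. It does not use petlx or bx.intervals and ignores the proximity argument.

```python
def interval_join(left, right, lstart, lstop, rstart, rstop,
                  lfacet=None, rfacet=None):
    """Nested-loop join of two header-first tables on overlapping intervals."""
    lhdr, rhdr = tuple(left[0]), tuple(right[0])
    ls, le = lhdr.index(lstart), lhdr.index(lstop)
    rs, re_ = rhdr.index(rstart), rhdr.index(rstop)
    faceted = lfacet is not None and rfacet is not None
    if faceted:
        lf, rf = lhdr.index(lfacet), rhdr.index(rfacet)
    out = [lhdr + rhdr]
    for lrow in left[1:]:
        for rrow in right[1:]:
            if faceted and lrow[lf] != rrow[rf]:
                continue  # facet values differ, so never join these rows
            # Strict inequalities: touching intervals do not overlap.
            if rrow[rs] < lrow[le] and rrow[re_] > lrow[ls]:
                out.append(tuple(lrow) + tuple(rrow))
    return out


left = [('fruit', 'begin', 'end'),
        ('apple', 2, 5),
        ('orange', 2, 5),
        ('orange', 9, 14)]
right = [('type', 'start', 'stop', 'value'),
         ('apple', 1, 4, 'foo'),
         ('apple', 3, 7, 'bar'),
         ('orange', 4, 9, 'baz')]

result = interval_join(left, right, 'begin', 'end', 'start', 'stop',
                       lfacet='fruit', rfacet='type')
for row in result:
    print(row)
```

The three data rows produced correspond to the ('apple', 2, 5) and ('orange', 2, 5) rows of the faceted result shown above; ('orange', 9, 14) has no match because (4, 9) only touches it.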

New in version 0.2.

petlx.interval.intervalleftjoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, missing=None)

Like intervaljoin() but rows from the left table without a match in the right table are also included, with the right-hand fields filled with the missing value (None by default). E.g.:

>>> from petl import look
>>> from petlx.interval import intervalleftjoin
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  | 1       | 2     |
+----------+---------+-------+
| 'apple'  | 2       | 4     |
+----------+---------+-------+
| 'apple'  | 2       | 5     |
+----------+---------+-------+
| 'orange' | 2       | 5     |
+----------+---------+-------+
| 'orange' | 9       | 14    |
+----------+---------+-------+
| 'orange' | 19      | 140   |
+----------+---------+-------+
| 'apple'  | 1       | 1     |
+----------+---------+-------+
| 'apple'  | 2       | 2     |
+----------+---------+-------+
| 'apple'  | 4       | 4     |
+----------+---------+-------+
| 'apple'  | 5       | 5     |
+----------+---------+-------+

>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> result = intervalleftjoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  | 1       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 9       | 14    | None     | None    | None   | None    |
+----------+---------+-------+----------+---------+--------+---------+
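
The padding of unmatched rows can be sketched in pure Python as follows. The function interval_left_join is hypothetical and only models the behaviour visible in the example above (no facets, no proximity).

```python
def interval_left_join(left, right, lstart, lstop, rstart, rstop,
                       missing=None):
    """Left outer join on overlapping intervals for header-first tables."""
    lhdr, rhdr = tuple(left[0]), tuple(right[0])
    ls, le = lhdr.index(lstart), lhdr.index(lstop)
    rs, re_ = rhdr.index(rstart), rhdr.index(rstop)
    out = [lhdr + rhdr]
    for lrow in left[1:]:
        matches = [rrow for rrow in right[1:]
                   if rrow[rs] < lrow[le] and rrow[re_] > lrow[ls]]
        if matches:
            for rrow in matches:
                out.append(tuple(lrow) + tuple(rrow))
        else:
            # No overlap: keep the left row, pad the right-hand fields.
            out.append(tuple(lrow) + (missing,) * len(rhdr))
    return out


left = [('fruit', 'begin', 'end'),
        ('orange', 2, 5),
        ('orange', 9, 14)]
right = [('type', 'start', 'stop', 'value'),
         ('apple', 1, 4, 'foo'),
         ('apple', 3, 7, 'bar'),
         ('orange', 4, 9, 'baz')]

result = interval_left_join(left, right, 'begin', 'end', 'start', 'stop')
print(result[-1])   # ('orange', 9, 14, None, None, None, None)
```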

New in version 0.2.

petlx.interval.intervaljoinvalues(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, valuespec=None, valuesfield='values')

Convenience function to join the left table with values from a specific field in the right-hand table.
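
No example is given for this function. One plausible reading, based only on the signature, is that each left row gains a field named by valuesfield holding the values selected by valuespec from all overlapping right rows. The sketch below models that reading with a hypothetical join_values function; the actual shape of the petlx output is an assumption here.

```python
def join_values(left, right, lstart, lstop, rstart, rstop,
                valuespec, valuesfield='values'):
    """Append to each left row a list of values from overlapping right rows."""
    lhdr, rhdr = tuple(left[0]), tuple(right[0])
    ls, le = lhdr.index(lstart), lhdr.index(lstop)
    rs, re_ = rhdr.index(rstart), rhdr.index(rstop)
    v = rhdr.index(valuespec)
    out = [lhdr + (valuesfield,)]
    for lrow in left[1:]:
        values = [rrow[v] for rrow in right[1:]
                  if rrow[rs] < lrow[le] and rrow[re_] > lrow[ls]]
        out.append(tuple(lrow) + (values,))
    return out


left = [('begin', 'end', 'quux'),
        (2, 4, 'b'),
        (9, 14, 'd')]
right = [('start', 'stop', 'value'),
         (1, 4, 'foo'),
         (3, 7, 'bar'),
         (4, 9, 'baz')]

result = join_values(left, right, 'begin', 'end', 'start', 'stop',
                     valuespec='value')
for row in result:
    print(row)
```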

New in version 0.5.3.

petlx.interval.intervalsubtract(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, missing=None)

Subtract intervals in the right-hand table from intervals in the left-hand table.
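
The underlying idea of interval subtraction can be sketched for a single left interval as follows. The subtract helper is hypothetical, and how petlx lays out the resulting rows, or applies proximity and missing, is not covered by this sketch.

```python
def subtract(start, stop, others):
    """Return the parts of (start, stop) not covered by any interval in others."""
    remaining = []
    cursor = start                       # left edge of the uncovered region
    for ostart, ostop in sorted(others):
        if ostop <= cursor:
            continue                     # entirely before the cursor
        if ostart >= stop:
            break                        # entirely after the left interval
        if ostart > cursor:
            remaining.append((cursor, ostart))
        cursor = max(cursor, ostop)
        if cursor >= stop:
            break                        # left interval fully consumed
    if cursor < stop:
        remaining.append((cursor, stop))
    return remaining


print(subtract(1, 10, [(3, 5), (4, 7)]))   # [(1, 3), (7, 10)]
print(subtract(1, 4, [(0, 10)]))           # []
print(subtract(1, 4, []))                  # [(1, 4)]
```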

New in version 0.5.4.

petlx.interval.collapsedintervals(tbl, start='start', stop='stop', facet=None)

Utility function to collapse intervals in a table, i.e., to merge overlapping intervals into single spans.

If no facet key is given, returns an iterator over (start, stop) tuples.

If facet key is given, returns an iterator over (key, start, stop) tuples.
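
Collapsing can be sketched as the usual sort-and-merge pass. The helpers collapse and collapse_by_facet are hypothetical, and treating intervals that merely touch as mergeable is an assumption of this sketch rather than documented petlx behaviour.

```python
def collapse(intervals):
    """Merge overlapping (start, stop) tuples into sorted, disjoint spans."""
    merged = []
    for start, stop in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the current span: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged


def collapse_by_facet(rows):
    """Collapse (key, start, stop) rows separately for each key."""
    groups = {}
    for key, start, stop in rows:
        groups.setdefault(key, []).append((start, stop))
    return [(key, start, stop)
            for key in sorted(groups)
            for start, stop in collapse(groups[key])]


print(collapse([(1, 4), (3, 7), (4, 9), (12, 15)]))
# [(1, 9), (12, 15)]
print(collapse_by_facet([('apple', 1, 4), ('apple', 3, 7), ('orange', 4, 9)]))
# [('apple', 1, 7), ('orange', 4, 9)]
```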

New in version 0.5.5.
