Intervals (bx-python)

The functions in this module require the bx.intervals package, which is part of bx-python. Installation instructions can be found at https://bitbucket.org/james_taylor/bx-python/wiki/Home; alternatively, try pip install bx-python.

petlx.interval.intervaljoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0)

Join two tables by overlapping intervals. E.g.:

>>> from petl import look
>>> from petlx.interval import intervaljoin
>>> look(left)
+---------+-------+--------+
| 'begin' | 'end' | 'quux' |
+=========+=======+========+
| 1       | 2     | 'a'    |
+---------+-------+--------+
| 2       | 4     | 'b'    |
+---------+-------+--------+
| 2       | 5     | 'c'    |
+---------+-------+--------+
| 9       | 14    | 'd'    |
+---------+-------+--------+
| 9       | 140   | 'e'    |
+---------+-------+--------+
| 1       | 1     | 'f'    |
+---------+-------+--------+
| 2       | 2     | 'g'    |
+---------+-------+--------+
| 4       | 4     | 'h'    |
+---------+-------+--------+
| 5       | 5     | 'i'    |
+---------+-------+--------+
| 1       | 8     | 'j'    |
+---------+-------+--------+

>>> look(right)
+---------+--------+---------+
| 'start' | 'stop' | 'value' |
+=========+========+=========+
| 1       | 4      | 'foo'   |
+---------+--------+---------+
| 3       | 7      | 'bar'   |
+---------+--------+---------+
| 4       | 9      | 'baz'   |
+---------+--------+---------+

>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result) 
+---------+-------+--------+---------+--------+---------+
| 'begin' | 'end' | 'quux' | 'start' | 'stop' | 'value' |
+=========+=======+========+=========+========+=========+
| 1       | 2     | 'a'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 4     | 'b'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 4     | 'b'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 5     | 'c'    | 4       | 9      | 'baz'   |
+---------+-------+--------+---------+--------+---------+
| 2       | 2     | 'g'    | 1       | 4      | 'foo'   |
+---------+-------+--------+---------+--------+---------+
| 4       | 4     | 'h'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 5       | 5     | 'i'    | 3       | 7      | 'bar'   |
+---------+-------+--------+---------+--------+---------+
| 5       | 5     | 'i'    | 4       | 9      | 'baz'   |
+---------+-------+--------+---------+--------+---------+
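The matching rule implied by the output above can be sketched in plain Python. This is only an illustration under stated assumptions: naive_interval_join and overlaps are hypothetical helpers, the nested loop is not how petlx performs the join (the library relies on bx.intervals), and the "non-zero overlap" test is inferred from the example rows.

```python
def overlaps(lstart, lstop, rstart, rstop):
    # Non-zero overlap: sharing only a boundary point does not count.
    return rstart < lstop and rstop > lstart


def naive_interval_join(left, right, lstart, lstop, rstart, rstop):
    """Yield a header, then one combined row per overlapping (left, right) pair."""
    lhdr, rhdr = list(left[0]), list(right[0])
    li, lj = lhdr.index(lstart), lhdr.index(lstop)
    ri, rj = rhdr.index(rstart), rhdr.index(rstop)
    yield tuple(lhdr + rhdr)
    for lrow in left[1:]:
        for rrow in right[1:]:
            if overlaps(lrow[li], lrow[lj], rrow[ri], rrow[rj]):
                yield tuple(lrow) + tuple(rrow)


left = [['begin', 'end', 'quux'],
        [1, 2, 'a'], [2, 4, 'b'], [2, 5, 'c'], [9, 14, 'd'], [9, 140, 'e'],
        [1, 1, 'f'], [2, 2, 'g'], [4, 4, 'h'], [5, 5, 'i'], [1, 8, 'j']]
right = [['start', 'stop', 'value'],
         [1, 4, 'foo'], [3, 7, 'bar'], [4, 9, 'baz']]

rows = list(naive_interval_join(left, right, 'begin', 'end', 'start', 'stop'))
for row in rows[:4]:
    print(row)
```

With these inputs the first ten data rows produced by the sketch are the same as the rows displayed above.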

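An additional key comparison can be added via the lfacet and rfacet arguments, so that rows are joined only when their intervals overlap and their facet values are equal, e.g.: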

>>> from petl import look
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  | 1       | 2     |
+----------+---------+-------+
| 'apple'  | 2       | 4     |
+----------+---------+-------+
| 'apple'  | 2       | 5     |
+----------+---------+-------+
| 'orange' | 2       | 5     |
+----------+---------+-------+
| 'orange' | 9       | 14    |
+----------+---------+-------+
| 'orange' | 19      | 140   |
+----------+---------+-------+
| 'apple'  | 1       | 1     |
+----------+---------+-------+
| 'apple'  | 2       | 2     |
+----------+---------+-------+
| 'apple'  | 4       | 4     |
+----------+---------+-------+
| 'apple'  | 5       | 5     |
+----------+---------+-------+

>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> result = intervaljoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop', lfacet='fruit', rfacet='type')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  | 1       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 4       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 5       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 5       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
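A rough sketch of the faceted variant follows. It assumes the facet comparison is a plain equality test applied before the overlap test; naive_facet_interval_join is a hypothetical helper, not part of petlx, and it scans rows linearly instead of using an interval tree.

```python
from collections import defaultdict


def naive_facet_interval_join(left, right, lstart, lstop, rstart, rstop,
                              lfacet, rfacet):
    """Join rows whose facet values are equal and whose intervals overlap."""
    lhdr, rhdr = list(left[0]), list(right[0])
    li, lj, lf = lhdr.index(lstart), lhdr.index(lstop), lhdr.index(lfacet)
    ri, rj, rf = rhdr.index(rstart), rhdr.index(rstop), rhdr.index(rfacet)
    # Group the right-hand rows by facet value once, up front.
    by_facet = defaultdict(list)
    for rrow in right[1:]:
        by_facet[rrow[rf]].append(tuple(rrow))
    yield tuple(lhdr + rhdr)
    for lrow in left[1:]:
        for rrow in by_facet.get(lrow[lf], ()):
            if rrow[ri] < lrow[lj] and rrow[rj] > lrow[li]:
                yield tuple(lrow) + rrow


left = [['fruit', 'begin', 'end'],
        ['apple', 1, 2], ['apple', 2, 4], ['apple', 2, 5],
        ['orange', 2, 5], ['orange', 9, 14], ['orange', 19, 140],
        ['apple', 1, 1], ['apple', 2, 2], ['apple', 4, 4], ['apple', 5, 5]]
right = [['type', 'start', 'stop', 'value'],
         ['apple', 1, 4, 'foo'], ['apple', 3, 7, 'bar'],
         ['orange', 4, 9, 'baz']]

rows = list(naive_facet_interval_join(left, right, 'begin', 'end',
                                      'start', 'stop', 'fruit', 'type'))
```

Run on the ten left rows displayed above, the sketch yields nine joined rows, which agree with the first nine rows of the result.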

New in version 0.2.

petlx.interval.intervalleftjoin(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0, missing=None)

Like intervaljoin(), but rows from the left table without a match in the right table are also included, with the right-hand fields filled with the missing value (None by default). E.g.:

>>> from petl import look
>>> from petlx.interval import intervalleftjoin
>>> look(left)
+----------+---------+-------+
| 'fruit'  | 'begin' | 'end' |
+==========+=========+=======+
| 'apple'  | 1       | 2     |
+----------+---------+-------+
| 'apple'  | 2       | 4     |
+----------+---------+-------+
| 'apple'  | 2       | 5     |
+----------+---------+-------+
| 'orange' | 2       | 5     |
+----------+---------+-------+
| 'orange' | 9       | 14    |
+----------+---------+-------+
| 'orange' | 19      | 140   |
+----------+---------+-------+
| 'apple'  | 1       | 1     |
+----------+---------+-------+
| 'apple'  | 2       | 2     |
+----------+---------+-------+
| 'apple'  | 4       | 4     |
+----------+---------+-------+
| 'apple'  | 5       | 5     |
+----------+---------+-------+

>>> look(right)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> result = intervalleftjoin(left, right, lstart='begin', lstop='end', rstart='start', rstop='stop')
>>> look(result)
+----------+---------+-------+----------+---------+--------+---------+
| 'fruit'  | 'begin' | 'end' | 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+=======+==========+=========+========+=========+
| 'apple'  | 1       | 2     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 4     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'apple'  | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 2       | 5     | 'orange' | 4       | 9      | 'baz'   |
+----------+---------+-------+----------+---------+--------+---------+
| 'orange' | 9       | 14    | None     | None    | None   | None    |
+----------+---------+-------+----------+---------+--------+---------+
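The left-join behaviour can be sketched as follows. This is a minimal illustration, assuming unmatched left rows are padded with the missing value once per right-hand field; naive_interval_left_join is a hypothetical helper and does not reflect the library's internals.

```python
def naive_interval_left_join(left, right, lstart, lstop, rstart, rstop,
                             missing=None):
    """Like an interval join, but keep left rows that have no match."""
    lhdr, rhdr = list(left[0]), list(right[0])
    li, lj = lhdr.index(lstart), lhdr.index(lstop)
    ri, rj = rhdr.index(rstart), rhdr.index(rstop)
    yield tuple(lhdr + rhdr)
    for lrow in left[1:]:
        matched = False
        for rrow in right[1:]:
            if rrow[ri] < lrow[lj] and rrow[rj] > lrow[li]:
                matched = True
                yield tuple(lrow) + tuple(rrow)
        if not matched:
            # No overlapping interval: pad the right-hand side.
            yield tuple(lrow) + (missing,) * len(rhdr)


left = [['fruit', 'begin', 'end'],
        ['apple', 1, 2], ['apple', 2, 4], ['apple', 2, 5],
        ['orange', 2, 5], ['orange', 9, 14], ['orange', 19, 140],
        ['apple', 1, 1], ['apple', 2, 2], ['apple', 4, 4], ['apple', 5, 5]]
right = [['type', 'start', 'stop', 'value'],
         ['apple', 1, 4, 'foo'], ['apple', 3, 7, 'bar'],
         ['orange', 4, 9, 'baz']]

rows = list(naive_interval_left_join(left, right, 'begin', 'end',
                                     'start', 'stop'))
```

Because no facet fields are passed, apple and orange rows are matched against each other, as in the result above.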

New in version 0.2.

petlx.interval.intervaljoinvalues(left, right, value, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0)

Convenience function to join the left table with values from a specific field in the right hand table.
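One plausible reading of this description is sketched below. The output shape is an assumption: the sketch appends a single new field, named after the value field, holding a list of the matching right-hand values. naive_interval_join_values is a hypothetical helper, not the library function.

```python
def naive_interval_join_values(left, right, value, lstart='start',
                               lstop='stop', rstart='start', rstop='stop'):
    """Append to each left row the list of overlapping right-hand values."""
    lhdr, rhdr = list(left[0]), list(right[0])
    li, lj = lhdr.index(lstart), lhdr.index(lstop)
    ri, rj, rv = rhdr.index(rstart), rhdr.index(rstop), rhdr.index(value)
    # Assumed output shape: left fields plus one field of collected values.
    yield tuple(lhdr) + (value,)
    for lrow in left[1:]:
        vals = [r[rv] for r in right[1:]
                if r[ri] < lrow[lj] and r[rj] > lrow[li]]
        yield tuple(lrow) + (vals,)


left = [['begin', 'end', 'quux'],
        [1, 2, 'a'], [2, 5, 'c'], [9, 14, 'd']]
right = [['start', 'stop', 'value'],
         [1, 4, 'foo'], [3, 7, 'bar'], [4, 9, 'baz']]

rows = list(naive_interval_join_values(left, right, 'value',
                                       lstart='begin', lstop='end'))
```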

New in version 0.5.3.

petlx.interval.intervalsubtract(left, right, lstart='start', lstop='stop', rstart='start', rstop='stop', lfacet=None, rfacet=None, proximity=0)

Subtract intervals in the right hand table from intervals in the left hand table.
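Interval subtraction can be illustrated with a small sketch. Two assumptions are made here that the description does not confirm: intervals are treated as half-open, and each surviving piece is emitted as a copy of the left row with its start and stop replaced. Both function names are hypothetical.

```python
def subtract_intervals(start, stop, others):
    """Yield the parts of (start, stop) not covered by any interval in others."""
    cursor = start
    for ostart, ostop in sorted(others):
        if ostop <= cursor:
            continue          # lies entirely before the remaining part
        if ostart >= stop:
            break             # lies entirely after the left interval
        if ostart > cursor:
            yield (cursor, ostart)
        cursor = ostop        # ostop > cursor is guaranteed at this point
    if cursor < stop:
        yield (cursor, stop)


def naive_interval_subtract(left, right, lstart='start', lstop='stop',
                            rstart='start', rstop='stop'):
    lhdr, rhdr = list(left[0]), list(right[0])
    li, lj = lhdr.index(lstart), lhdr.index(lstop)
    ri, rj = rhdr.index(rstart), rhdr.index(rstop)
    others = [(r[ri], r[rj]) for r in right[1:]]
    yield tuple(lhdr)
    for lrow in left[1:]:
        for start, stop in subtract_intervals(lrow[li], lrow[lj], others):
            out = list(lrow)
            out[li], out[lj] = start, stop
            yield tuple(out)


left = [['start', 'stop', 'name'], [1, 10, 'x'], [20, 30, 'y']]
right = [['start', 'stop'], [3, 5], [4, 7], [25, 40]]
rows = list(naive_interval_subtract(left, right))
```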

New in version 0.5.4.

petlx.interval.intervallookup(table, start='start', stop='stop', value=None, proximity=0)

Construct an interval lookup for the given table. E.g.:

>>> from petlx.interval import intervallookup    
>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop')
>>> lkp.find(1, 2)
[(1, 4, 'foo')]
>>> lkp.find(2, 4)
[(1, 4, 'foo'), (3, 7, 'bar')]
>>> lkp.find(2, 5)
[(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]
>>> lkp.find(9, 14)
[]
>>> lkp.find(19, 140)
[]
>>> lkp.find(1)
[]
>>> lkp.find(2)
[(1, 4, 'foo')]
>>> lkp.find(4)
[(3, 7, 'bar')]
>>> lkp.find(5)
[(3, 7, 'bar'), (4, 9, 'baz')]

Note that there must be a non-zero overlap between the query and the interval for the interval to be retrieved, hence lkp.find(1) returns an empty list even though the first interval starts at 1. Use the proximity keyword argument to find intervals within a given distance of the query.

Some examples using the proximity and value keyword arguments:

>>> table = [['start', 'stop', 'value'],
...          [1, 4, 'foo'],
...          [3, 7, 'bar'],
...          [4, 9, 'baz']]
>>> lkp = intervallookup(table, 'start', 'stop', value='value', proximity=1)
>>> lkp.find(1, 2)
['foo']
>>> lkp.find(2, 4)
['foo', 'bar', 'baz']
>>> lkp.find(2, 5)
['foo', 'bar', 'baz']
>>> lkp.find(9, 14)
['baz']
>>> lkp.find(19, 140)
[]
>>> lkp.find(1)
['foo']
>>> lkp.find(2)
['foo']
>>> lkp.find(4)
['foo', 'bar', 'baz']
>>> lkp.find(5)
['bar', 'baz']
>>> lkp.find(9)
['baz']
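The query semantics shown in the two examples above can be reproduced with a linear scan. This is a sketch only: NaiveIntervalLookup is a hypothetical class, it assumes proximity simply widens the query by that distance on each side, and the real lookup is backed by bx.intervals instead of a list.

```python
class NaiveIntervalLookup:
    """Linear-scan stand-in for an interval lookup (illustration only)."""

    def __init__(self, table, start='start', stop='stop', value=None,
                 proximity=0):
        hdr = list(table[0])
        self._si, self._ei = hdr.index(start), hdr.index(stop)
        self._vi = None if value is None else hdr.index(value)
        self._rows = [tuple(r) for r in table[1:]]
        self._proximity = proximity

    def find(self, start, stop=None):
        if stop is None:
            stop = start                  # point query
        start -= self._proximity          # assumed: widen the query range
        stop += self._proximity
        hits = [r for r in self._rows
                if r[self._si] < stop and r[self._ei] > start]
        if self._vi is None:
            return hits                   # whole rows
        return [r[self._vi] for r in hits]


table = [['start', 'stop', 'value'],
         [1, 4, 'foo'],
         [3, 7, 'bar'],
         [4, 9, 'baz']]

plain = NaiveIntervalLookup(table, 'start', 'stop')
near = NaiveIntervalLookup(table, 'start', 'stop', value='value', proximity=1)
print(plain.find(2, 4))
print(near.find(9, 14))
```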

New in version 0.2.

petlx.interval.intervallookupone(table, start='start', stop='stop', value=None, proximity=0, strict=True)

Construct an interval lookup for the given table, returning at most one result for each query. If strict=True is given, queries returning more than one result will raise a DuplicateKeyError. If strict=False is given, and there is more than one result, the first result is returned.
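The effect of the strict flag can be sketched as below. DuplicateKeyError is defined locally as a stand-in for the library's exception, find_one is a hypothetical helper, and returning None when nothing matches is an assumption.

```python
class DuplicateKeyError(Exception):
    """Local stand-in for the error raised on ambiguous queries."""


def find_one(intervals, start, stop=None, strict=True):
    """Return at most one (start, stop, value) tuple overlapping the query."""
    if stop is None:
        stop = start
    hits = [iv for iv in intervals if iv[0] < stop and iv[1] > start]
    if not hits:
        return None                       # assumed result for "no match"
    if len(hits) > 1 and strict:
        raise DuplicateKeyError((start, stop))
    return hits[0]                        # first result when not strict


intervals = [(1, 4, 'foo'), (3, 7, 'bar'), (4, 9, 'baz')]

print(find_one(intervals, 1, 2))
print(find_one(intervals, 2, 4, strict=False))
```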

See also intervallookup().

New in version 0.2.

petlx.interval.intervalrecordlookup(table, start='start', stop='stop', proximity=0)

As intervallookup() but return records.

New in version 0.2.

petlx.interval.intervalrecordlookupone(table, start='start', stop='stop', proximity=0, strict=True)

As intervallookupone() but return records.

New in version 0.2.

petlx.interval.facetintervallookup(table, facet, start='start', stop='stop', value=None, proximity=0)

Construct a faceted interval lookup for the given table. E.g.:

>>> from petl import look
>>> from petlx.interval import facetintervallookup
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> lkp = facetintervallookup(table, facet='type', start='start', stop='stop')
>>> lkp['apple'].find(1, 2)
[('apple', 1, 4, 'foo')]
>>> lkp['apple'].find(2, 4)
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['apple'].find(2, 5)
[('apple', 1, 4, 'foo'), ('apple', 3, 7, 'bar')]
>>> lkp['orange'].find(2, 5)
[('orange', 4, 9, 'baz')]
>>> lkp['orange'].find(9, 14)
[]
>>> lkp['orange'].find(19, 140)
[]
>>> lkp['apple'].find(1)
[]
>>> lkp['apple'].find(2)
[('apple', 1, 4, 'foo')]
>>> lkp['apple'].find(4)
[('apple', 3, 7, 'bar')]
>>> lkp['apple'].find(5)
[('apple', 3, 7, 'bar')]
>>> lkp['orange'].find(5)
[('orange', 4, 9, 'baz')]
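Conceptually, a faceted lookup behaves like a dictionary mapping each facet value to its own interval lookup. The sketch below illustrates that idea with a linear scan; NaiveLookup and naive_facet_interval_lookup are hypothetical names and the structure is an assumption, not the library's implementation.

```python
from collections import defaultdict


class NaiveLookup:
    """Linear-scan lookup over the rows belonging to one facet value."""

    def __init__(self, rows, si, ei):
        self._rows, self._si, self._ei = rows, si, ei

    def find(self, start, stop=None):
        if stop is None:
            stop = start
        return [r for r in self._rows
                if r[self._si] < stop and r[self._ei] > start]


def naive_facet_interval_lookup(table, facet, start='start', stop='stop'):
    hdr = list(table[0])
    fi, si, ei = hdr.index(facet), hdr.index(start), hdr.index(stop)
    groups = defaultdict(list)
    for row in table[1:]:
        groups[row[fi]].append(tuple(row))
    # One independent lookup per facet value.
    return {key: NaiveLookup(rows, si, ei) for key, rows in groups.items()}


table = [['type', 'start', 'stop', 'value'],
         ['apple', 1, 4, 'foo'],
         ['apple', 3, 7, 'bar'],
         ['orange', 4, 9, 'baz']]

lkp = naive_facet_interval_lookup(table, facet='type')
print(lkp['apple'].find(2, 5))
print(lkp['orange'].find(2, 5))
```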

New in version 0.2.

petlx.interval.facetintervallookupone(table, facet, start='start', stop='stop', value=None, proximity=0, strict=True)

Construct a faceted interval lookup for the given table, returning at most one result for each query, e.g.:

>>> from petl import look
>>> from petlx.interval import facetintervallookupone
>>> look(table)
+----------+---------+--------+---------+
| 'type'   | 'start' | 'stop' | 'value' |
+==========+=========+========+=========+
| 'apple'  | 1       | 4      | 'foo'   |
+----------+---------+--------+---------+
| 'apple'  | 3       | 7      | 'bar'   |
+----------+---------+--------+---------+
| 'orange' | 4       | 9      | 'baz'   |
+----------+---------+--------+---------+

>>> lkp = facetintervallookupone(table, facet='type', start='start', stop='stop', value='value')
>>> lkp['apple'].find(1, 2)
'foo'
>>> lkp['apple'].find(2, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'].find(2, 5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "petlx/interval.py", line 191, in __getitem__
    raise DuplicateKeyError
petl.util.DuplicateKeyError
>>> lkp['apple'].find(4, 5)
'bar'
>>> lkp['orange'].find(4, 5)
'baz'
>>> lkp['apple'].find(5, 7)
'bar'
>>> lkp['orange'].find(5, 7)
'baz'
>>> lkp['apple'].find(8, 9)
>>> lkp['orange'].find(8, 9)
'baz'
>>> lkp['orange'].find(9, 14)
>>> lkp['orange'].find(19, 140)
>>> lkp['apple'].find(1)
>>> lkp['apple'].find(2)
'foo'
>>> lkp['apple'].find(4)
'bar'
>>> lkp['apple'].find(5)
'bar'
>>> lkp['orange'].find(5)
'baz'
>>> lkp['apple'].find(8)
>>> lkp['orange'].find(8)
'baz'

If strict=True is given, queries returning more than one result will raise a DuplicateKeyError. If strict=False is given, and there is more than one result, the first result is returned.

See also facetintervallookup().

New in version 0.2.

petlx.interval.facetintervalrecordlookup(table, facet, start='start', stop='stop', proximity=0)

As facetintervallookup() but return records.

New in version 0.2.

petlx.interval.facetintervalrecordlookupone(table, facet, start, stop, proximity=0, strict=True)

As facetintervallookupone() but return records.

New in version 0.2.

petlx.interval.collapsedintervals(tbl, start='start', stop='stop', facet=None)

Utility function to collapse intervals in a table.

If no facet key is given, returns an iterator over (start, stop) tuples.

If facet key is given, returns an iterator over (key, start, stop) tuples.
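Collapsing can be pictured as merging runs of intervals into single spans. The sketch below assumes that intervals which overlap or merely touch are merged, which the description above does not state; collapse and collapse_by_facet are hypothetical helpers.

```python
def collapse(intervals):
    """Merge overlapping or touching (start, stop) pairs into spans."""
    span = None
    for start, stop in sorted(intervals):
        if span is None:
            span = [start, stop]
        elif start <= span[1]:            # assumed: touching counts as a merge
            span[1] = max(span[1], stop)
        else:
            yield tuple(span)
            span = [start, stop]
    if span is not None:
        yield tuple(span)


def collapse_by_facet(rows):
    """Take (key, start, stop) tuples; yield merged (key, start, stop)."""
    groups = {}
    for key, start, stop in rows:
        groups.setdefault(key, []).append((start, stop))
    for key in sorted(groups):
        for start, stop in collapse(groups[key]):
            yield (key, start, stop)


print(list(collapse([(1, 4), (3, 7), (4, 9), (12, 15)])))
print(list(collapse_by_facet([('apple', 1, 4), ('apple', 3, 7),
                              ('orange', 4, 9)])))
```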

New in version 0.5.5.

petlx.interval.tupletree(table, start='start', stop='stop', value=None)

Construct an interval tree for the given table, where each node in the tree is a row of the table.

petlx.interval.tupletrees(table, facet, start='start', stop='stop', value=None)

Construct faceted interval trees for the given table, where each node in the tree is a row of the table.

petlx.interval.recordtree(table, start='start', stop='stop')

Construct an interval tree for the given table, where each node in the tree is a row of the table represented as a hybrid tuple/dictionary-style record object.

petlx.interval.recordtrees(table, facet, start='start', stop='stop')

Construct faceted interval trees for the given table, where each node in the tree is a row of the table represented as a hybrid tuple/dictionary-style record object.
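To give a feel for what a hybrid tuple/dictionary-style record might look like, here is a minimal sketch. HybridRecord is a hypothetical class written for this illustration; it is not the record type the library uses, and its behaviour is an assumption.

```python
class HybridRecord(tuple):
    """A tuple that can also be indexed by field name (illustration only)."""

    def __new__(cls, values, fields):
        rec = super().__new__(cls, values)
        rec._fields = tuple(fields)
        return rec

    def __getitem__(self, key):
        if isinstance(key, str):
            # Dictionary-style access: translate the field name to a position.
            return tuple.__getitem__(self, self._fields.index(key))
        return tuple.__getitem__(self, key)


rec = HybridRecord((1, 4, 'foo'), ('start', 'stop', 'value'))
print(rec[0], rec['stop'], rec['value'])
```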
