I have a problem where I need to be able to efficiently[1] find all of the intervals that contain a particular (integer) value. There are data-structures to do this efficiently, for example the interval tree. However, I have quite a lot of intervals, and I don't want to have to code up a complicated disk-based data structure, so I am constrained to use a database.
The solution I came up with is to assign each interval a particular numerical code, such that a query of a small set of such codes will guarantee finding each interval that contains a given value.
I'm not sure if this is novel. It seems a fairly obvious thing to do, so probably not. If not, can anyone tell me what it is called?
The code is this: I produce a trinary number from the binary representation of the lower and upper bounds of the interval. Starting with the most significant digit I map 0->1 and 1->2 until the first digit that varies between the lower and upper bounds. From there on I just produce 0s.
An example may make this clear:
Lower 00010101010 Upper 00010101111 Output 00021212000
To perform a query, I similarly convert the binary query number to trinary. The query set is then this trinary number with each least significant digit converted to zero in turn.
An example:
Query 00010101100 Output 00021212211 00021212210 00021212200 00021212000 00021210000 00021200000 00021000000 00020000000 00000000000
Here's the source code:
The source code includes code to test this idea using SQLite.
Update: This is hideous. Sorry. I am working on a much nicer scheme, stay tuned.
[1] Clarification: By efficiently I mean faster than O(n).