Segment Tree feat. Fractional Cascading
Recently, I came across a fascinating technique known as Fractional Cascading.
Context
I was solving KQUERY (SPOJ problem)
KQUERY
Given a sequence of n numbers a1, a2, ..., an and a number of k- queries. A k-query is a triple (i, j, k) (1 ≤ i ≤ j ≤ n). For each k-query (i, j, k), you have to return the number of elements greater than k in the subsequence ai, ai+1, ..., aj.
And, my first approach was to create a segment tree where each node stores the sorted array for that range and during query, I can use binary search over all the segment and get the answer but that would have complexity of O((log n)²) per query.
Why (log n)²/query?
And because of the constraints, I was doing 200000*14.87*14.87 => 44223380 ~= 4*1e7 number of operations for all the query (excluding the time to build the tree), which certainly won’t be passing in 1 sec time limit, I thought.
Then, I went on the journey to find the solution of my problem (with my Pokémon, I wish)
And I came across the technique which is known as Fractional Cascading.
Fractional Cascading
Fractional Cascading is a technique using which we can bring down the complexity of per query from O((log n)²) to O(log n) by doing only one binary search instead of doing at every interval.
Though, Fractional Cascading is a very powerful technique which has many application, this time, I am going to limit myself with it’s application in segment tree w.r.t. this question. I will be covering Fractional Cascading in detail in some other post.
In fact, we don’t need to harness the complete power of Fractional Cascading.
As I told earlier, that at each node, I am storing sorted array of the corresponding range, and since I need to find the number of elements greater than k in the given range [i, j]. Apart from storing the element in sorted order, I also store the indices (say p and q) of left and right nodes respectively for the element (say val) such that left_tree[p]>= val and right_tree[q]>=val
for example, refer to the below image, for the element 3 in the root node, the minimum index in the left_tree such that left_tree[index]>=3 is 2 and similarly, for the right_tree, it is 0.
This point is really important and I hope this is clear.
Now, for our each query operation:
Find the index (say, index) in the root node such that the value at that index > k. This can be found via upper_bound in C++.
Now, for each left and right subtree call the query function as you would call in any segment tree query function and pass the respective left_tree[index] and right_tree[index] value (say fractionalIndex) to the recursive function.
When you reach end condition of the recursive query function, return the number of elements which are greater than k.
This you can simply find by subtracting fractionalIndex from size of the segment.
Attaching the query function which I have written for the solution
int query_internal(int root, int tl, int tr, int left, int right, int fractionalIndex){
int segmentSize = int(m_tree[root].size());
if(left > right) return 0;
if(mini >= size) return 0;
if(tl==left && tr ==right){
return (segmentSize-fractionIndex);
}
int mid = (tl+tr)/2;
int lefty = query_internal(root + 1, tl, mid, left, min(mid,right), m_tree[root][fractionalIndex][1]);
int righty = query_internal(root + (2*(mid-tl+1)), mid+1, tr, max(mid+1,left), right, m_tree[root][fractionalIndex][2]);
return lefty+righty;
}
But, all these efforts were not that fruitful as I am still getting TLE.
And now, I am wondering, if I went down the wrong rabbit hole.
But surely I learnt something new, and since I am getting TLE, I am not sure if I my solution is actually wrong or just slow, I yet have to discover a failing test case tho.
Attaching link for one of the solutions, which I try submitting here.
Alternative Solutions
Well, I am in luck because many intelligent people have already solved this problem and there is a very well defined technique for this type of problem via offline query it seems.
So, what I did,
I stored values and respective indices of vector a, in another vector b and sorted it in non-increasing order.
{a[i], i}Similarly, stored all the queries in a queries vector and sorted it in non-increasing order.
for any given query, i, j, k, at index q stored {k+1, i-1, j-1, q};Then, initialised a segment tree with a vector of n length having each element as 0.
I optimised the segment tree via euler path to minimize the memory usage, fenwick tree could be use as well.Then, via two pointer method, I started iterating b and queries array from left to right
If at any index i and j, b[i][0]>=queries[j][0], then I did point update at position b[i][1] of the segment tree
Else, I did the simple range sum query operation over the segment tree as per the stored interval in the queries array and stored the answer in another ans array to output at the end.
Reason behind the 4th step is that, I am updating all the segments which contains value greater than the query k, so that I can do range sum over the interval and get the answer in O(log n)
Attaching my accepted solution below.
void solve(){
int n;
cin>>n;
vector<int> a(n);
for(int i=0;i<n;i++) cin>>a[i];
vector<vector<int>> b(n, vector<int>(2));
for(int i=0;i<n;i++){
b[i][0]=a[i];
b[i][1]=i;
}
sort(b.begin(),b.end(), greater<vector<int>>());
int q;
cin>>q;
vector<vector<int>> queries(q, vector<int>(4));
for(int i=0;i<q;i++){
int x,y,z;
cin>>x>>y>>z;
queries[i][0]=z+1;
queries[i][1]=x-1;
queries[i][2]=y-1;
queries[i][3]=i;
}
sort(queries.begin(),queries.end(), greater<vector<int>>());
vector<int> d(n,0);
SegmentTree seg_tree(d); // you can replace it with BIT and related operations
vector<int> ans(q,0);
int i=0;
int j=0;
while(i<n && j<q){
if(b[i][0]>=queries[j][0]){
seg_tree.update(b[i][1], 1);
i++;
}else{
ans[queries[j][3]]=seg_tree.query(queries[j][1], queries[j][2]);
j++;
}
}
while(j<q){
ans[queries[j][3]]=seg_tree.query(queries[j][1], queries[j][2]);
j++;
}
for(int x:ans) cout<<x<<endl;
}