Whew. This problem took me the better part of a week to solve. For the first few days I misunderstood the problem and went off on a bit of a rabbit trail. Even though I have learned over and over not to rush headlong into solving a problem before properly understanding it, I keep repeating this mistake.
The problem is in two parts:
1. Find the number of distinct prime factors of a number n. (For example, 10=2*5, so that is 2 distinct prime factors. 20=5*2*2, so again, 2 distinct prime factors.)
2. For a range of numbers, for example, 1 to 100, how many numbers have exactly k distinct prime factors?
Consider the numbers 4 to 10, with k=2. The question is: out of 4, 5, 6, 7, 8, 9, 10, how many have exactly 2 distinct prime factors?
– 4=2*2, so it has 1 distinct prime factor
– 5=5*1, so it has 1 distinct prime factor
– 6=2*3, so it has 2 distinct prime factors
– 7=7*1, so it has 1 distinct prime factor
– 8=2*2*2, so it has 1 distinct prime factor
– 9=3*3, so it has 1 distinct prime factor
– 10=2*5, so it has 2 distinct prime factors
The answer must be 2 then. 6 and 10 have two distinct prime factors.
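To make the two parts concrete, here is a minimal brute-force sketch in C++ using trial division. The function names `countDistinctPrimeFactors` and `countInRange` are made up for illustration and are not from my submission. It is fine for checking small cases like the one above, but it redoes the factoring work for every number in every query.

```cpp
#include <cassert>

// Part 1: count the distinct prime factors of n by trial division.
// (Hypothetical helper, for illustration only.)
int countDistinctPrimeFactors(int n) {
    assert(n >= 1);
    int count = 0;
    for (int p = 2; static_cast<long long>(p) * p <= n; ++p) {
        if (n % p == 0) {
            ++count;                       // p is a new distinct prime factor
            while (n % p == 0) n /= p;     // strip every copy of p
        }
    }
    if (n > 1) ++count;  // whatever is left over is one more prime factor
    return count;
}

// Part 2: count the numbers in [lo, hi] with exactly k distinct prime factors.
int countInRange(int lo, int hi, int k) {
    int matches = 0;
    for (int n = lo; n <= hi; ++n) {
        if (countDistinctPrimeFactors(n) == k) ++matches;
    }
    return matches;
}
```

Calling `countInRange(4, 10, 2)` walks the same seven numbers as the list above and counts 6 and 10.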
The constraints were that the range could go up to 100,000 and k could range from 1 to 5.
Initially I thought of actually generating combinations of primes. For example, if k=3, I would generate [2,3,5], [3,5,7], [5,7,11], etc. The idea was that order doesn’t matter and numbers can’t be repeated. The problem was that multiplying each combination together only yields numbers in which every prime factor appears exactly once. A number with the factors [2,2,3,3,5,5,5] also has three distinct prime factors, so for this problem it counts the same as [2,3,5], and my approach missed it.
I abandoned this approach after trying it out for way too long.
My Good Friend OEIS
After Googling around a bit I ended up on this page: OEIS A001221.
For algorithm competitions, OEIS (the On-Line Encyclopedia of Integer Sequences) is invaluable.
Unfortunately there were no code samples for generating k-distinct primes on the OEIS page. I tried to read the math specifics on Wikipedia and various other math sites and found the function I was looking for was omega(n). Even though I knew the function, I couldn’t find any good implementations of it.
I did actually see something on OEIS that gave me an idea, though. There was a Maple implementation that inspired me to do this:
1. Loop over all the primes under 100,000.
2. For each prime, go through the multiples of it within a limit. (The limit ended up being ceil(100000 / prime(i)) )
3. For each multiple from step 2, add 1 to that multiple's slot in a vector. For example, with 7 as the prime, we would have:
a. nums[7*1] = nums[7*1] + 1
b. nums[7*2] = nums[7*2] + 1
c. nums[7*3] = nums[7*3] + 1
In this way I could create a vector that cached the distinct prime factor count for every number from 1 to 100,000. Then when a range was given to me, I could iterate over that part of the vector and count how many entries were equal to k.
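The steps above can be sketched roughly like this. It is not my exact submission: the names `buildDistinctPrimeCounts` and `countWithK` are hypothetical, and instead of building a separate list of primes first, this version spots primes during the same pass (a number whose count is still 0 when the loop reaches it has no smaller prime factor, so it must be prime). The inner loop is bounded by the multiple itself rather than by a precomputed multiplier limit.

```cpp
#include <cassert>
#include <vector>

// Build a table where counts[n] is the number of distinct prime
// factors of n, for every n in [0, limit].
std::vector<int> buildDistinctPrimeCounts(int limit) {
    assert(limit >= 0);
    std::vector<int> counts(limit + 1, 0);
    for (int p = 2; p <= limit; ++p) {
        // counts[p] is still 0 only if no smaller prime divides p,
        // which means p itself is prime.
        if (counts[p] != 0) continue;
        // Add 1 to every multiple of p: p*1, p*2, p*3, ...
        for (int multiple = p; multiple <= limit; multiple += p) {
            ++counts[multiple];
        }
    }
    return counts;
}

// Count the entries in [lo, hi] that are equal to k.
int countWithK(const std::vector<int>& counts, int lo, int hi, int k) {
    assert(lo >= 0 && hi < static_cast<int>(counts.size()));
    int matches = 0;
    for (int n = lo; n <= hi; ++n) {
        if (counts[n] == k) ++matches;
    }
    return matches;
}
```

The table is built once with `buildDistinctPrimeCounts(100000)`, and every range query afterwards is just a scan over part of it.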
Temptation to Cheat
At one point in my search for an implementation – or at least good written explanation – of omega(n), I came across this page: http://stackoverflow.com/questions/17545888/number-of-distinct-prime-factors-of-a-number.
A guy was doing the same problem and gave up and asked Stack Overflow to finish it for him. Not cool.
The temptation was there to just give in after a week of work. I could just read the answers on SO and submit the problem and be done with it. However I hadn’t spent all that time just to quit and cheat.
My first submission exceeded the time limit. A few weeks ago I made a subtle transition over to C++ for algorithm competitions because I had to use it for a problem on TalentBuddy. The language is still fairly new to me, so when my first submission failed I wasn’t surprised. I must have missed some major performance issues with my code.
The good news was that I didn’t get a “Wrong Answer” reply.
Micro Optimizations Work!
Every single time an algorithm has failed on CodeChef, micro optimizations have not worked. I’ll try minimizing variables, making loops shorter, etc. and it never works. There is always a major issue with the algorithm that only a major refactor can fix.
For the first time, I was able to do micro optimizations and it worked.
I did some performance profiling and was shocked that my prime generation function only took 2ms to run. Moreover, the caching of k-distinct primes only took 23ms! 25ms out of 1000ms allowed was spent caching the data I needed.
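For anyone wanting to take similar measurements, here is one way to time a chunk of code with `std::chrono`. This is only a sketch and not necessarily how I got the numbers above; `averageMilliseconds` is a made-up helper name, and real timings will vary with the machine, compiler, and optimization flags.

```cpp
#include <cassert>
#include <chrono>

// Run `work` a number of times and return the average wall-clock
// time per run in milliseconds. (Hypothetical helper.)
template <typename Func>
double averageMilliseconds(Func&& work, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> total = stop - start;
    return total.count() / repeats;
}
```

Wrapping each stage (prime generation, building the cache, answering queries) in a lambda and passing it to a helper like this shows where the time actually goes.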
That meant looping through the vector of k-distinct primes was too slow! This really surprised me because I figured that code was as good as it could get.
The big change was indexing into the vector with [] instead of calling .at(), which performs a bounds check on every access.
… became …
Just like that my vector caching of k-distinct primes went from 23ms to 10ms. My submission finally worked!
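My original before-and-after snippets aren't reproduced here, so the following is a hypothetical illustration of the kind of change involved, not my actual code. `countWithAt` and `countWithIndex` are made-up names. `.at()` checks the index on every call and throws `std::out_of_range` if it is bad, while `operator[]` does no checking at all, so the second version validates the range once up front instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bounds-checked access: every counts.at(n) call verifies n first.
int countWithAt(const std::vector<int>& counts,
                std::size_t lo, std::size_t hi, int k) {
    int matches = 0;
    for (std::size_t n = lo; n <= hi; ++n) {
        if (counts.at(n) == k) ++matches;
    }
    return matches;
}

// Unchecked access: validate the range once, then index directly.
int countWithIndex(const std::vector<int>& counts,
                   std::size_t lo, std::size_t hi, int k) {
    assert(hi < counts.size());
    int matches = 0;
    for (std::size_t n = lo; n <= hi; ++n) {
        if (counts[n] == k) ++matches;
    }
    return matches;
}
```

Both functions return the same answer for a valid range. How much faster the second one is depends on the compiler and optimization settings, so it is worth measuring rather than assuming.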
C++ is very powerful and micro optimizations add up. Over time I figure I’ll learn more and more of them. I’m excited to see how other people solved this problem.