*2017*

Originating from Project Euler, Dirichlet convolution found its use in speeding up the computation of partial sums of certain multiplicative functions. This article aims to introduce the exact technique applied.

This tutorial can be viewed as an extension of the previous Mobius inversion essay, so I recommend taking a look at that one first.

Dirichlet convolution is a way to generate a new function from two functions. Specifically, the Dirichlet convolution of two functions \( f(x) \) and \( g(x) \) is:

\[ \displaystyle f*g(x)=\sum_{d|x}f(d)g(\frac{x}{d}) \]

We already know that one property of this convolution is that if \( f(x) \) and \( g(x) \) are both multiplicative, then \( f*g(x) \) is multiplicative as well. From the property of the Mobius function \( \sum_{d|n}\mu(d)=\epsilon(n)=[n=1] \), we can also conclude that \( \epsilon(x)=\mu*I(x) \), where \( I(x)=1 \) is the constant-one function.

Let's say that I wish to find the \( n \)-th partial sum of some multiplicative function \( f(x) \), i.e. I want to know the value of \( s_f(n)=\sum_{i=1}^nf(i) \) for a given \( n \). Now let's pretend that I (miraculously) find a 'magic function' \( g(x) \) such that both the partial sum of \( g(x) \) (denoted as \( s_g(x) \)) and the partial sum of \( f*g(x) \) (denoted as \( s_{f*g}(x) \)) are (miraculously) very easy to obtain in \( O(1) \) time. I can then perform the following trick to compute \( s_f(n) \).

Let's begin with the definition of the Dirichlet convolution:

\[ \displaystyle s_{f*g}(n)=\sum_{i=1}^n\sum_{d|i}g(d)f(\frac{i}{d}) \]

Substituting \( i=pd \), we can sum over \( p \) and \( d \) instead.

\[ \displaystyle s_{f*g}(n)=\sum_{d=1}^n\sum_{p=1}^{\lfloor \frac{n}{d} \rfloor} g(d)f(p) \]

Now we can split off the part where \( d=1 \).

\[ \displaystyle s_{f*g}(n)=g(1)\sum_{p=1}^nf(p)+\sum_{d=2}^n\sum_{p=1}^{\lfloor \frac{n}{d} \rfloor} g(d)f(p) =g(1)s_f(n)+\sum_{d=2}^ng(d)s_f(\lfloor \frac{n}{d} \rfloor) \]

We can see that \( s_f(n) \) has surfaced. Let's move the rest of the equation to the other side.

\[ \displaystyle s_f(n)=\frac{s_{f*g}(n)-\sum_{d=2}^ns_f(\lfloor \frac{n}{d} \rfloor)g(d)}{g(1)} \]

According to the harmonic lemma introduced in the last article, at most \( 2\sqrt{n} \) distinct values exist for \( \lfloor \frac{n}{d} \rfloor \). We can thus recursively calculate those values of \( s_f \), splitting the sum \( \sum_{d=2}^ns_f(\lfloor \frac{n}{d} \rfloor)g(d) \) into at most \( 2\sqrt{n} \) segments on which \( \lfloor \frac{n}{d} \rfloor \) is constant (the factor \( g(d) \) is summed over each segment using \( s_g \)), and evaluate it in \( O(\sqrt{n}) \) time.

What is the overall complexity of the algorithm? If we use a hash table to store every \( s_f(x) \) we compute, then with the integer division lemma (see the last tutorial), it is not difficult to observe that only the values \( s_f(\lfloor \frac{n}{1} \rfloor), s_f(\lfloor \frac{n}{2} \rfloor), \ldots , s_f(\lfloor \frac{n}{n} \rfloor) \) are ever computed. Since computing \( s_f(x) \) costs \( O(\sqrt{x}) \), the overall cost is \( O(\sum_{i=1}^{\sqrt{n}}\sqrt{i})+O(\sum_{i=1}^{\sqrt{n}}\sqrt{\frac{n}{i}})=O(n^{\frac{3}{4}}) \).
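The two sums can be bounded with the standard estimate \( \sum_{i=1}^m i^c = O(m^{c+1}) \) for \( c > -1 \):

\[ \displaystyle \sum_{i=1}^{\sqrt{n}}\sqrt{i} = O\big((\sqrt{n})^{\frac{3}{2}}\big) = O(n^{\frac{3}{4}}), \qquad \sum_{i=1}^{\sqrt{n}}\sqrt{\frac{n}{i}} = \sqrt{n}\sum_{i=1}^{\sqrt{n}}\frac{1}{\sqrt{i}} = \sqrt{n}\cdot O\big((\sqrt{n})^{\frac{1}{2}}\big) = O(n^{\frac{3}{4}}). \]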

So far we have not used the interesting fact that \( f(x) \) is multiplicative. Come to think of it, we can actually use the linear sieve to pre-compute the first few elements of \( s_f(x) \), eliminating the need to process them recursively. Let's say that we pre-compute the first \( k \geq \sqrt{n} \) elements of \( s_f(x) \). The complexity becomes

\[ \displaystyle O(k)+O(\sum_{i=1}^{\frac{n}{k}}\sqrt{\frac{n}{i}})=O(k)+O(\frac{n}{\sqrt{k}}) \]

Choosing \( k=n^{\frac{2}{3}} \) balances the two terms:

\[ \displaystyle O(n^{\frac{2}{3}})+O(\sum_{i=1}^{n^{\frac{1}{3}}}\sqrt{\frac{n}{i}})=O(n^{\frac{2}{3}}) \]

Hence, the algorithm is optimized to \( O(n^{\frac{2}{3}}) \).

The following code gives a brief idea of how the technique can be implemented.

#include <unordered_map>

/** Prefix sum of a multiplicative function :
 * p_f : the prefix sum of f (x) (1 <= x <= th).
 * p_g : the prefix sum of g (x) (0 <= x <= N).
 * p_c : the prefix sum of f * g (x) (0 <= x <= N).
 * th : the threshold, generally should be n ^ (2 / 3).
 * Everything is computed in plain integers here; to work modulo a prime,
 * normalize the subtractions and replace the division by g (1) with a
 * multiplication by its modular inverse.
 */
struct prefix_mul {
    typedef long long (*func)(long long);
    func p_f, p_g, p_c;
    long long n, th, inv;
    std::unordered_map<long long, long long> mem;
    prefix_mul(func p_f, func p_g, func p_c) : p_f(p_f), p_g(p_g), p_c(p_c) {}
    long long calc(long long x) {
        if (x <= th) return p_f(x);
        auto d = mem.find(x);
        if (d != mem.end()) return d->second;
        long long ans = 0;
        for (long long i = 2, la; i <= x; i = la + 1) {
            la = x / (x / i);
            // sum of g over the segment [i, la], times s_f (x / i)
            ans = ans + (p_g(la) - p_g(i - 1)) * calc(x / i);
        }
        ans = (p_c(x) - ans) / inv;  // exact division; g (1) = 1 in the common cases
        return mem[x] = ans;
    }
    long long solve(long long n, long long th) {
        if (n <= 0) return 0;
        prefix_mul::n = n;
        prefix_mul::th = th;
        inv = p_g(1);
        return calc(n);
    }
};

Though we have finished discussing the algorithm, I have not yet shown how to find the 'magic function'. Unfortunately, there is no guarantee that such a function exists in the first place. However, we can think about a question: how many functions are there whose partial sums are *very easy* to compute?

Few.

Certainly we know that \( \epsilon(x) \), \( I(x) \) and \( Id_k(x) \) are all trivial functions with closed-form partial sums, but hardly any others come in handy in the form of a Dirichlet convolution. This suggests a trial-and-error approach: test these trivial functions until one of them satisfies the condition that \( s_{f*g}(x) \) is also trivial.

Still, if we know some property of the function \( f(x) \), we might have some other ways to work around it.

For instance, consider the Mobius function \( \mu(x) \): we have already noticed that \( \epsilon(x)=\mu*I(x) \), which tells us to take \( I(x) \) as the 'magic function'. Another example is Euler's totient function \( \phi(x) \): we can prove that \( n=\sum_{d|n}\phi(d) \) (with the '\( p^k \)' method, i.e. show that \( \sum_{d|n}\phi(d) \) is multiplicative first, and then compute \( \sum_{d|p^k}\phi(d)=p^k \)), so \( Id(x)=\phi*I(x) \).

Now since we can calculate the partial sums of \( \mu(d) \) faster, we are able to apply the harmonic lemma to the formula and optimize the loop to \( O(\sqrt{n}) \). Note that although we have to compute multiple partial sums of \( \mu(d) \), the overall complexity is still \( O(n^{\frac{2}{3}}) \) as long as we do not clear the hash table, because every value we plug in belongs to the sequence \( (\lfloor \frac{n}{1} \rfloor, \lfloor \frac{n}{2} \rfloor, \ldots , \lfloor \frac{n}{n} \rfloor) \), whose entries are already computed while processing \( s_\mu(n) \) anyway.


Now we just have to find a 'magic function' for \( g(l) \). With a little luck, we can notice that \( Id(l)=g*Id_2(l) \), and then apply exactly the same optimization technique as above.

(Should you be curious why \( Id(l)=g*Id_2(l) \) holds true, the answer is still just the '\( p^k \)' method: it is vital to show that \( g*Id_2(l) \) is multiplicative, and simply calculating \( g*Id_2(p^k) \) will yield the result.)

Under some circumstances a hash table implementation might not be available. Fortunately, that does not mean you have to hand-code one for this optimization. With a little observation we can notice that all keys appearing in the hash table are of the form \( \lfloor \frac{n}{i} \rfloor \), and only keys larger than the threshold are ever stored. Therefore, if we use \( h(x)=\lfloor \frac{n}{x} \rfloor \) as the hashing function, we can prove that \( h(x) \leq n^{\frac{1}{3}} \) (since every stored key satisfies \( x > n^{\frac{2}{3}} \)) and that \( h(x) \) never collides, so a plain array indexed by \( h(x) \) suffices.