Blazing-Fast Algebra and Random Numbers in Python with mtalg

https://ift.tt/3qv4SRo

A Python tool for multithreaded algebra and pseudo-random number generation

Image from mtalg

Data scientists and researchers often need to perform fast and efficient numerical computations. When working with large data structures, it is desirable to exploit all available computational resources through either multiprocessing or multithreading (see this great article for a refresher). This matters enough that numerical libraries such as numpy automatically support multithreading for linear algebra operations.

However, somewhat surprisingly, numpy does not provide out-of-the-box multithreading for elementwise operations and (pseudo-)random number generation. These are often the main bottlenecks, for example in large-scale Monte Carlo simulations or in Bayesian parameter estimation through MCMC.

The Python library mtalg [1] provides both multithreaded elementwise functions and multithreaded random number generation. In the benchmarks reported below, it beats most if not all of the alternatives tested, including numexpr and just-in-time compilation with numba.

Multithreaded algebra

After installing the library with pip install mtalg, we can import mtalg and use its built-in functions, such as mtalg.add and mtalg.sub.

Notice that operations are performed in place, by default on the first argument. Therefore, mtalg.add(a, b) is equivalent to a = a + b (just way faster!). This behaviour can be overridden via the optional parameter direction: for example, mtalg.sub(a, b, direction='right') is equivalent to b = a - b (notice this is not the same as mtalg.sub(b, a), which is instead equivalent to b = b - a).
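To make the in-place semantics concrete, here is a small sketch in plain numpy that mirrors the equivalences stated above. It does not call mtalg at all; the mapping in the comments simply restates the behaviour described in this article, and the numpy out= argument stands in for the direction parameter.

```python
import numpy as np

a = np.array([10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

# Counterpart of mtalg.add(a, b): the result overwrites the first argument.
np.add(a, b, out=a)        # a is now [11, 22, 33]; b is unchanged

# Counterpart of mtalg.sub(a, b, direction='right'): a - b is written
# into the second argument instead.
np.subtract(a, b, out=b)   # b is now [10, 20, 30]

# Swapping the arguments is a different operation: b - a stored in b.
np.subtract(b, a, out=b)   # b is now [-1, -2, -3]
```

Because nothing is allocated for the result, in-place operations also avoid a temporary array of the same size as the inputs, which matters for very large arrays.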

By default mtalg sets the number of threads to the number of available CPU cores, but this can be overridden with mtalg.set_num_threads(6). Similarly, the current value can be checked with mtalg.get_num_threads().
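For intuition on what a thread count means for an elementwise operation, the following is a minimal sketch of a chunked, multithreaded in-place addition written with numpy and the standard library only. The helper threaded_add and its num_threads argument are hypothetical names introduced for illustration; this is not mtalg's implementation, only one plausible way to split the work.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_add(a, b, num_threads=None):
    """Compute a += b in place, one contiguous chunk per worker thread.

    Hypothetical helper for illustration; not part of mtalg.
    """
    if num_threads is None:
        num_threads = os.cpu_count() or 1  # default: all available cores
    # Chunk boundaries along the first axis.
    bounds = np.linspace(0, a.shape[0], num_threads + 1, dtype=int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Slices are views, so writing through out= modifies `a` itself.
        # numpy releases the GIL inside the ufunc loop, so chunks can overlap.
        np.add(a[lo:hi], b[lo:hi], out=a[lo:hi])

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(work, range(num_threads)))  # list() surfaces worker errors
    return a


a = np.arange(1_000_000, dtype=np.float64)
b = np.ones_like(a)
threaded_add(a, b, num_threads=4)
```

Each thread touches a disjoint slice of the output, so no locking is needed.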

Random number generation

Random number generation is available through mtalg.random, which provides a range of different distributions to sample from, all maintaining numpy-like syntax.

mtalg currently supports sampling from the following distributions: beta, binomial, chisquare, exponential, f, gamma, geometric, gumbel, hypergeometric, integers, laplace, logistic, lognormal, logseries, negative_binomial, noncentral_chisquare, noncentral_f, normal, pareto, poisson, power, random, rayleigh, standard_cauchy, standard_exponential, standard_gamma, standard_normal, standard_t, triangular, uniform, vonmises, wald, weibull, zipf.
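As a rough illustration of how sampling can be spread across threads, here is a sketch in plain numpy that follows the pattern from numpy's own documentation on multithreaded generation: one independent generator per thread, each filling its own slice of a shared output array. The function name threaded_standard_normal and its parameters are assumptions made for this example; it is not mtalg's code and may differ from how the library works internally.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def threaded_standard_normal(size, seed=None, num_threads=4):
    """Return a 1-D array of `size` standard normal draws filled by several threads.

    Hypothetical helper for illustration; not mtalg's implementation.
    """
    # One statistically independent, reproducible stream per thread.
    children = np.random.SeedSequence(seed).spawn(num_threads)
    rngs = [np.random.default_rng(child) for child in children]
    out = np.empty(size)
    bounds = np.linspace(0, size, num_threads + 1, dtype=int)

    def fill(i):
        # Each generator writes directly into its own slice of `out`.
        rngs[i].standard_normal(out=out[bounds[i]:bounds[i + 1]])

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(fill, range(num_threads)))
    return out


x = threaded_standard_normal(1_000_000, seed=1)
y = threaded_standard_normal(1_000_000, seed=1)  # same seed and thread count
```

With this scheme the output is reproducible only for a fixed seed and a fixed number of threads, since changing the thread count changes which stream fills which slice.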

Benchmarks

Benchmarks against numpy and some of the fastest libraries available demonstrate mtalg's speed.

Fig.1: Elementwise algebra: benchmarks [2] for the add function on one billion operations. Other elementwise functions perform similarly. [Source: Author’s image]
Fig.2: Elementwise algebra: benchmarks [2] for the add function. Other elementwise functions perform similarly. [Source: Author’s image]

As usual, multithreading involves an overhead, so its benefits become evident only when dealing with a large number of operations (on the order of 10⁷ to 10⁸ or more; see Fig.2).
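The overhead can be observed with a small timing harness. The sketch below compares a single in-place numpy addition with a chunked, thread-pool version at a few array sizes. All function names here are hypothetical and the threaded variant is a plain numpy stand-in, not mtalg; the timings depend entirely on the machine, so no expected numbers are given.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def add_single(a, b):
    """Single-threaded in-place addition."""
    np.add(a, b, out=a)


def add_threaded(a, b, num_threads=4):
    """In-place addition split into one chunk per thread (illustrative only)."""
    bounds = np.linspace(0, a.size, num_threads + 1, dtype=int)

    def work(i):
        chunk = slice(bounds[i], bounds[i + 1])
        np.add(a[chunk], b[chunk], out=a[chunk])

    # Creating the pool and dispatching tasks is the fixed cost being measured.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(work, range(num_threads)))


def best_time(fn, n, repeats=5):
    """Best wall-clock time of fn over a few repeats on arrays of length n."""
    a, b = np.zeros(n), np.ones(n)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(a, b)
        times.append(time.perf_counter() - start)
    return min(times)


results = {}
for n in (10**3, 10**5, 10**7):
    results[n] = (best_time(add_single, n), best_time(add_threaded, n))
    print(f"n={n:>9,}  single={results[n][0]:.6f}s  threaded={results[n][1]:.6f}s")
```

On small arrays the fixed cost of starting and coordinating threads typically dominates, which is the effect described above.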

Massive speed improvements can also be observed for generation of random numbers, a crucial task when performing large scale Monte Carlo simulations, or in Bayesian parameter estimation through MCMC.

Fig.3: Random number generation: benchmarks [2] for uniform and standard normal distributed random variates (one billion). Sampling from other distributions performs similarly. [Source: Author’s image]
Fig.4: Random number generation: benchmarks [2] for uniform and standard normal distributed random variates. Sampling from other distributions performs similarly. [Source: Author’s image]

Conclusions

In this article we have presented mtalg, an intuitive Python library for fast elementwise operations and random number generation.

Feel free to leave comments, suggestions for edits, or ask questions in the dedicated section below!

Also, feel free to get in touch if you would like to contribute to the library!

References

[1] mtalg on GitHub
[2] Benchmarks are carried out using an Intel(R) Xeon(R) Gold 6142M CPU @ 2.60GHz and 24 threads
[3] Disclaimer: the author is maintainer and co-author of the library presented here.


Blazing-Fast Algebra and Random Numbers in Python with mtalg was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.



from Towards Data Science - Medium https://ift.tt/3qqNnBE
via RiYo Analytics
