
Shailen Sobhee – Technical Consulting Engineer

Based on the work of the following Intel colleagues:
Anton Malakhov – Software Engineer
David Liu – Python Technical Consulting Engineer
Anton Gorshkov – Pathfinding
Terry Wilmarth – OpenMP architect
Why is this important?

A sequential program alone uses only marginal CPU power on servers.

[Chart: typical core counts per machine: 72, 28, 8]
Optimization Notice
Copyright © 2018, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Without parallelism

[Timeline diagram: Start, then sequential Python regions alternating with
parallel compute regions in (NumPy|SciPy|Numexpr) and (Numba|Sklearn|PyDAAL),
then Finish]

Without parallelism: {infinite} resources

Amdahl's Law: “speedup is limited by the serial portion of the work”

[Timeline diagram: Start, sequential Python regions and compute regions,
Finish. Speedup?]

Gustafson-Barsis’ Law

States that if the problem size grows along with the number of parallel
processors, while the serial portion grows slowly or remains fixed, speedup
increases as processors are added.

However, the data processing is hidden behind libraries like NumPy, SciPy, etc.:
-> Larger data sets require more operational memory (a limited resource)
-> The best strategy is therefore still to avoid serial regions
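The contrast between the two laws can be sketched numerically. A minimal illustration (the function names are ours, not from any library):

```python
# Speedup predicted by Amdahl's Law (fixed problem size) and by
# Gustafson-Barsis' Law (problem size grows with the processor count).
# s = serial fraction of the work, p = number of processors.

def amdahl_speedup(s, p):
    # Total time drops from 1 to s + (1 - s)/p; speedup is the ratio.
    return 1.0 / (s + (1.0 - s) / p)

def gustafson_speedup(s, p):
    # Scaled speedup: the parallel part grows with p, the serial part does not.
    return s + p * (1.0 - s)

# With a 10% serial fraction, Amdahl caps the speedup below 10x no matter
# how many processors are added, while Gustafson keeps scaling.
for p in (8, 64, 512):
    print(p, round(amdahl_speedup(0.1, p), 1), round(gustafson_speedup(0.1, p), 1))
```

Either way, the serial fraction dominates the picture, which is why avoiding serial regions remains the best strategy.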

Without parallelism: what can we do?

[Timeline diagram: Start, sequential Python regions alternating with parallel
compute regions in (NumPy|SciPy|Numexpr) and (Numba|Sklearn|PyDAAL), Finish]

Make application parallel

[Timeline diagram: Start, sequential Python regions with parallel compute
regions in (NumPy|SciPy|Numexpr) and (Numba|Sklearn|PyDAAL) now running
concurrently, Finish. Speedup?]

Later, we will see some performance issues
that we can still have with “blind” parallelism.

But first, let’s have a look at some techniques
and libraries available for implementing parallelism
in Python.

A quick look at some parallelism techniques
• Python Multithreading
• Multiprocessing
• Joblib
• Dask

Sources

git clone git@github.com:shailensobhee/python-parallelism-tutorial.git
https://github.com/shailensobhee/python-parallelism-tutorial.git

Intel® VTune™ Amplifier 2018 Update 3:
http://registrationcenter-download.intel.com/akdlm/irc_nas/tec/13079/vtune_amplifier_2018_update3.tar.gz
Or shortcut link: https://goo.gl/RAJhWk

Python Multithreading

Running several threads is similar to running several different
programs concurrently; think of it like multitasking.

Multithreading example
For a given list of numbers, print square and cube for each number.

Input: [2,3,8,9]

Output:
Square: [4,9,64,81]
Cube: [8, 27, 512, 729]
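One way to sketch this with Python’s threading module (the function and variable names are ours):

```python
import threading

def squares(nums, out):
    out.extend(n * n for n in nums)

def cubes(nums, out):
    out.extend(n ** 3 for n in nums)

nums = [2, 3, 8, 9]
sq, cu = [], []

# Run both computations in separate threads and wait for them to finish.
t1 = threading.Thread(target=squares, args=(nums, sq))
t2 = threading.Thread(target=cubes, args=(nums, cu))
t1.start(); t2.start()
t1.join(); t2.join()

print("Square:", sq)  # Square: [4, 9, 64, 81]
print("Cube:", cu)    # Cube: [8, 27, 512, 729]
```

Note that for CPU-bound work like this the GIL serializes the threads; the example only illustrates the threading mechanics.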

Python Multithreading
Running several threads is similar to running several different programs
concurrently, but with the following benefits:
• Multiple threads within a process share the same data space with the main
thread and can therefore share information or communicate with each other
more easily than if they were separate processes.
• Threads are sometimes called light-weight processes; they do not require
much memory overhead and are cheaper than processes.
A thread has a beginning, an execution sequence, and a conclusion. It has an
instruction pointer that keeps track of where within its context it is currently
running.
It can be pre-empted (interrupted).
It can temporarily be put on hold (also known as sleeping) while other threads
are running - this is called yielding.
Multiprocessing

[Diagram: the input list [1, 2, 3, 4, 5] is distributed across Core 1 to
Core 4, each running the same function:

def f(n):
    return n*n

Map: the inputs are dispatched to the cores.
Reduce: the results [1, 4, 9, 16, 25] are collected back into a list.]
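The map/reduce picture above can be sketched with the standard multiprocessing module (the pool size of 4 and the helper `square_all` are our choices for illustration):

```python
import multiprocessing

def f(n):
    return n * n

def square_all(nums, workers=4):
    # Map: distribute the inputs across worker processes, one f(n) call
    # per item; Pool.map collects the results back in input order (Reduce).
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(f, nums)

if __name__ == "__main__":
    print(square_all([1, 2, 3, 4, 5]))  # [1, 4, 9, 16, 25]
```

Unlike threads, each worker is a separate process with its own interpreter, so the GIL does not serialize the computation.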
Multiprocessing – address spaces

Go to Jupyter Notebook example

What’s the difference between multiprocessing
and multithreading?

Both are ways to achieve multitasking.

Multiprocessing vs multithreading

[Diagram: one process address space (0x0f12 3453 through 0xFFFF FFFF) shared
by Thread 1, Thread 2, and Thread 3]

[Diagram: Process 1 and Process 2, each with its own address space
(0x0f12 3453 through 0xFFFF FFFF), communicating via a file, shared memory,
messages, or a pipe]

Threads vs Processes

[Humorous side-by-side picture comparison. Picture sources:
https://www.askideas.com/50-most-funny-skinny-pictures-that-will-make-you-laugh-every-time/
https://www.reddit.com/r/whowouldwin/comments/36n70j/10_skinny_guys_vs_10_fat_guys/]

Joblib and Dask

Examples in Jupyter
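As a minimal taste of the Joblib interface (assuming joblib is installed; the function `f` is our illustration):

```python
from joblib import Parallel, delayed

def f(n):
    return n * n

# Dispatch f over the inputs with 2 workers; joblib manages the worker
# pool, batching, and ordered result collection.
results = Parallel(n_jobs=2)(delayed(f)(n) for n in range(1, 6))
print(results)  # [1, 4, 9, 16, 25]
```

Dask offers a similar high-level interface with `dask.delayed` and chunked arrays; see the Jupyter examples.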

The parallelism spaces

[Quadrant diagram:
- Application-level parallelism: Dask, Multiprocessing, Joblib
- Data parallelism focus: NumPy/SciPy, Numba, Cython, Numexpr
- Single-threaded (single node) concurrency: Celery, Concurrent Futures,
  Buildbot, MPI4PY, Twisted, Openstack, Tornado, Async/await, Threading, Trio
- *Unicorn? MPI4PY…?]

The risks

[Diagram: Python multiprocessing/multithreading (Multiprocessing, Joblib,
Dask) on one axis, and data-parallelism-focused libraries (NumPy/SciPy,
Numba, Cython, Numexpr, built on OpenMP, TBB, Pthreads) on the other.
Their overlap is the nested parallelism area, with risk of oversubscription.]

Nested parallelism
import numpy as np
import multiprocessing.pool

data = np.random.random((256, 256))
pool = multiprocessing.pool.ThreadPool()  # creates P threads

pool.map(np.linalg.eig, [data for i in range(1024)])

P Python threads * P NumPy/MKL/OpenMP threads = P² threads total

Oversubscription

[Diagram: P software threads map cleanly onto P CPUs, but nested parallelism
produces P*P threads competing for the same P CPUs]

Oversubscription overheads

• Types of impact
  • Direct OS overhead for switching out a thread
  • The CPU cache becomes cold: an invisible impact
  • Other threads wait until the preempted one returns
• TensorFlow, Scikit-Learn, and PyTorch have a recurring battle with these
• How do they solve it?
  • Most use OMP_NUM_THREADS=1… KMP_BLOCKTIME=1…
  • SMP, ironically, addresses this (more on this later)
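That env-var mitigation can be sketched as follows (OMP_NUM_THREADS and MKL_NUM_THREADS are standard OpenMP/MKL controls; the surrounding script is our illustration, not from the deck):

```python
import os

# Cap the threads that OpenMP/MKL-backed libraries may spawn.
# These must be set before numpy (and its BLAS backend) is imported.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import numpy as np
from multiprocessing.pool import ThreadPool

data = np.random.random((256, 256))
# Each worker now runs single-threaded BLAS, so P Python threads use
# roughly P cores instead of P*P.
with ThreadPool() as pool:
    results = pool.map(np.linalg.eig, [data for _ in range(8)])
print(len(results))  # 8
```

The cost of this blunt setting is that the library-level parallelism is lost entirely, which motivates the composability modules below.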

Introducing composability modules

• tbb4py: Intel TBB for Python
  • A Python C-extension package that manages nested parallelism using the
    dynamic task scheduler of the Intel® Threading Building Blocks library
  • Instantiates via monkey-patching of Python’s pools and by enabling the
    TBB threading layer for Intel® MKL (no code changes required)
  • Dynamically maps tasks onto coordinated pool(s) to avoid excessive threads

Threading Composability

[Diagram: an Application contains Component 1 … Component N, each with
Subcomponent 1 … Subcomponent K/M]

Libraries/modules/components are not aware of the big picture.
They can define their own threading.

A composable component should be able to function efficiently among
other such components without affecting their efficiency.

[Parallel turtle]

Introducing composability modules
• smp: Static Multi-Processing
  • A pure Python package that manages nested parallelism through
    coarse-grain static settings
  • Instantiates via monkey-patching of Python’s pools (no code changes
    required)
  • Utilizes affinity masks + OpenMP settings to statically allocate
    resources and avoid excessive threads

Nested parallelism (again)
import numpy as np
import multiprocessing.pool

data = np.random.random((256, 256))
pool = multiprocessing.pool.ThreadPool()  # creates P threads

pool.map(np.linalg.eig, [data for i in range(1024)])

P Python threads * P NumPy/MKL/OpenMP threads = P² threads total

TBB’s thread coordination system

[Diagram, left: an application running Python and OpenMP threading directly;
separate, uncoordinated software threads from too many OpenMP parallel
regions compete for logical processors.
Right: the same application running MKL under the TBB scheduler via the
tbb4py module; coordinated TBB threads are mapped onto logical processors
by the TBB pool scheduler.]

SMP’s total threading affinity system

[Diagram, left: an application with separate, uncoordinated OpenMP threading;
too many software threads from OpenMP parallel regions compete for logical
processors.
Right: the same application running under the SMP module; the ThreadPool
propagates static affinity masks/settings to the augmented MKL or BLAS
threading.]

Without nested parallelism

[Timeline diagram: Start, sequential Python regions alternating with parallel
compute regions in (NumPy|SciPy|Numexpr) and (Numba|Sklearn|PyDAAL), Finish]

Without nested parallelism: {infinite} resources

Amdahl's Law: “speedup is limited by the serial portion of the work”

[Timeline diagram: Start, sequential Python regions and compute regions,
Finish. Speedup?]

Without nested parallelism: what can we do?

[Timeline diagram: Start, sequential Python regions alternating with parallel
compute regions in (NumPy|SciPy|Numexpr) and (Numba|Sklearn|PyDAAL), Finish]

Nested parallelism: Make application parallel

[Timeline diagram: Start, sequential Python regions with parallel compute
regions in (NumPy|SciPy|Numexpr) and (Numba|Sklearn|PyDAAL) running
concurrently, Finish. Speedup?]

Nested parallelism: Oversubscription

[Timeline diagram: Start, sequential Python regions, then parallel
(Numba|Sklearn|PyDAAL) compute regions with nested parallel
(NumPy|SciPy|Numexpr) regions inside; the total #threads exceeds a
reasonable limit. Finish. Speedup?]

Our GOAL

Remove the burden from the user’s shoulders:
automatic settings for the same or better performance,
and safe execution.

[Parallel turtle]

Intel® TBB: parallelism orchestration in the Python ecosystem

(Intel® Distribution for) $ python -m tbb application.py

[Stack diagram: the thread pools of NumPy, SciPy, Scikit-learn, PyDAAL, Dask,
Joblib, Numba, and OpenCV sit on top of Intel® MKL, Intel® DAAL, and the
Intel® TBB module for Python, all coordinated by the Intel® TBB runtime
(Apache 2.0).
--ipc: Inter Process Coordination layer]
‘TBB’ Module for Python: what’s new

New open-source license: Apache 2.0
New command line interface:
  python -m tbb --help
  python -m tbb application.py
  python -m tbb --ipc application.py
  python -m tbb --max-num-threads=16 application.py
Available within Intel® Distribution for Python, on the Anaconda channel:
  conda create --name intel3 -c intel tbb numpy
Python & IPC module are open source in a special branch on GitHub:
https://github.intel.com/avmoskal/tbb/tree/tbb_2018
‘SMP’ Module for Python:
Static/Symmetric/Synchronized Multi-Processing
• Modifies multiprocessing.Pool and other pool interfaces
• Automatically adjusts nested parallelism
• Manages thread affinity masks
• Restricts oversubscription
• Install & run:
  conda install -c intel smp
  python -m smp --help
  python -m smp application.py
  python -m smp -f 1 application.py
  python -m smp -p 16 application.py
• Available open-source at: https://github.com/IntelPython/smp

Benchmark: https://github.com/IntelPython/composability_bench
Configuration: 2-socket system with Intel(R) Xeon(R) CPU E5-2699 v4
(2.20GHz, 22 cores, HT) and 128 GB RAM. Python 3.5.3, mkl (2017.0.3_intel_6),
numpy (1.12.1_py35_intel_8), dask (0.15.0_py35_0), tbb (2017.0.7_py35_intel_2)
and smp (0.1.3_py_2).
Hint for optimization

$ python -m tbb

Summary

• Nested parallelism is the key
• Oversubscription can hurt the performance of your app

• We introduced helper modules:
  • TBB
  • SMP

Legal Disclaimer & Optimization Notice
The benchmark results reported above may need to be revised as additional testing is conducted. The results depend on the specific platform configurations and
workloads utilized in the testing, and may not be applicable to any particular user’s components, computer system or workloads. The results are not necessarily
representative of other benchmarks and other benchmark results may show greater or lesser impact from mitigations.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark
and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause
the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the
performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY
RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY,
RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR
INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Copyright © 2018, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation
in the U.S. and other countries.

Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors.
These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for
use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the
applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Example: QR with numpy

import time, numpy as np
x = np.random.random((100000, 2000))
t0 = time.time()
q, r = np.linalg.qr(x)
test = np.allclose(x, q.dot(r))
assert(test)
print(time.time() - t0)
Example: QR with dask

import time, dask, dask.array as da
x = da.random.random((100000, 2000),
                     chunks=(10000, 2000))  # 10 tasks*
t0 = time.time()
q, r = da.linalg.qr(x)
test = da.all(da.isclose(x, q.dot(r)))
assert(test.compute())  # threaded
print(time.time() - t0)
