
Remarks / todo after 1st session

Decide what is important and has to be presented, and what is less important and should be left out

This is a very general remark, but I think it was a problem in this session.

An example: from my point of view, it is not useful to present pipenv in a training on Python HPC.

Change the README to tell participants that they have to install the virtual box.

  • Add a file explaining how to set up the virtual box (for example the potential problems with the BIOS configuration) and even how to create the virtual box from scratch

dtw_cort_dist

  • Add Julia

  • Add a function using high level Numpy functions

  • time the 3 functions

  • change the structure to avoid using runpy, and use a directory per case only when needed...

  • plots to show the speed up

  • use pytest for checking

  • write a nice blog post to summarize the results
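
The timing and checking items above could start from something like the following. This is only a minimal sketch: `dist_loops` and `dist_builtin` are hypothetical stand-ins (a plain sum of absolute differences), not the real dtw_cort_dist implementations, and `time_function` is a helper name invented here. The check is written so that pytest would collect it, but it also runs as a plain function.

```python
import timeit


def dist_loops(x, y):
    """Hypothetical reference implementation: explicit Python loop."""
    total = 0.0
    for a, b in zip(x, y):
        total += abs(a - b)
    return total


def dist_builtin(x, y):
    """Hypothetical alternative: same result using built-in functions."""
    return float(sum(abs(a - b) for a, b in zip(x, y)))


def time_function(func, *args, repeat=3, number=10):
    """Return the best time per call (in seconds) over `repeat` runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def test_implementations_agree():
    """pytest-style check: the alternative must match the reference."""
    x = [0.5 * i for i in range(100)]
    y = [0.25 * i for i in range(100)]
    assert abs(dist_loops(x, y) - dist_builtin(x, y)) < 1e-9


if __name__ == "__main__":
    x = [0.5 * i for i in range(1000)]
    y = [0.25 * i for i in range(1000)]
    for func in (dist_loops, dist_builtin):
        print(f"{func.__name__}: {time_function(func, x, y):.2e} s per call")
```

Taking the minimum over several repeats is a common choice because it is less sensitive to noise from other processes than the mean.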

Simple decorators

Should be introduced at the end of the presentation on functions

  • user perspective

  • how to create decorators

  • functools.wraps
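
The three points above could be illustrated with one small example. This is a sketch only: `count_calls` and `add` are hypothetical names chosen for illustration, not part of the existing training material.

```python
import functools


def count_calls(func):
    """Hypothetical decorator counting how many times `func` is called."""

    @functools.wraps(func)  # copy __name__, __doc__, ... from func to wrapper
    def wrapper(*args, **kwargs):
        wrapper.n_calls += 1
        return func(*args, **kwargs)

    wrapper.n_calls = 0
    return wrapper


@count_calls  # user perspective: same as writing add = count_calls(add)
def add(a, b):
    """Return the sum of a and b."""
    return a + b


result = add(1, 2)
print(result, add.n_calls, add.__name__)
```

Without `functools.wraps`, `add.__name__` would be `"wrapper"` and the docstring would be lost, which is a good way to motivate it in the presentation.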

Simple generators

When should they be presented? After loops, functions and list comprehensions.
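
A possible first example, building on loops, functions and list comprehensions. It is a minimal sketch; `countdown` is a hypothetical name used for illustration.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, producing one value at a time."""
    while n > 0:
        yield n
        n -= 1


# A list comprehension builds the whole list in memory...
squares_list = [i**2 for i in range(5)]

# ...while a generator expression produces values lazily, on demand.
squares_gen = (i**2 for i in range(5))
total = sum(squares_gen)

values = list(countdown(3))
print(values, squares_list, total)
```

The contrast between the square brackets and the parentheses makes a natural bridge from list comprehensions to generators.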

Standardize presentations (all using ipynb and all included in the main web page)

  • Merge the 2 presentations on packaging (with a little bit more on pytest)

Put the presentations on the web!

A good presentation of pip, virtualenv, pyenv (show how to use PyPy), conda

Change order of the presentations: (wrapping, accelerators, parallel) to (parallel, wrapping, accelerators)

  • We need to present the GIL before wrapping and accelerators.

  • We don't need extensions to present parallelism

  • We can come back to parallelism once accelerators are presented (OpenMP with Pythran, parallel loops with Numba / Cython, Dask + Numba(?))
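
The first two points above could be supported by a demonstration that needs only the standard library. This is a sketch, assuming a standard CPython build with the GIL (free-threaded builds may behave differently). The function names are hypothetical, and the measured times will vary from machine to machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n):
    """Pure-Python CPU-bound work: sum of the squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(tasks):
    """Run the tasks one after the other in the main thread."""
    return [cpu_bound(n) for n in tasks]


def run_threaded(tasks, max_workers=4):
    """Run the same tasks in a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(cpu_bound, tasks))


def timed(func, *args):
    """Return (result, elapsed seconds) for one call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    tasks = [200_000] * 4
    seq_result, seq_time = timed(run_sequential, tasks)
    thr_result, thr_time = timed(run_threaded, tasks)
    assert seq_result == thr_result
    print(f"sequential: {seq_time:.3f} s, threads: {thr_time:.3f} s")
```

With the GIL, the threaded version is typically not faster for this kind of pure-Python loop, which motivates processes first and then the accelerators discussed later.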

Improve the presentation on parallelism

See also https://github.com/fchuffar/practicle_sessions

  • a little bit more on computer architecture

  • concepts of distributed and shared memory

  • concepts of processes, threads, processors, cores, CPUs

A little bit more on GPUs?

A little bit more on the statistics ecosystem (pandas, statsmodels, seaborn, ...), especially if there are R users