Multiple Processing in Python

Goals: I have recently been working on a computer vision task that requires downloading and processing a large volume of data. Doing this in a single thread takes too much time, so running the work in parallel on an HPC system is a better choice. Understand the multiprocessing, subprocess, and threading packages in Python. The workflow for an MPI job. Transfer to HPC. Multiprocessing package: process-based parallelism. Pool object: parallelizing execution and distributing data (data parallelism). Basic example:...
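The excerpt above is cut off before the post's own basic example, so the following is only a minimal sketch of the data parallelism that a `Pool` object provides, not the author's code. The names `square` and `parallel_squares` and the default worker count are illustrative assumptions.

```python
from multiprocessing import Pool


def square(x):
    """Toy stand-in for a per-item processing step (hypothetical)."""
    return x * x


def parallel_squares(values, workers=2):
    """Distribute `values` across worker processes and collect results in order."""
    # Pool.map splits the input into chunks, sends them to the workers,
    # and returns the results in the same order as the input.
    with Pool(processes=workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # when the start method re-imports the main module.
    print(parallel_squares(range(5)))  # prints [0, 1, 4, 9, 16]
```

The worker function is defined at module level so it can be pickled and sent to the worker processes; a lambda or nested function would fail here.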

January 13, 2023 · 2 min

linux-operations

Mount a disk. Use gparted to format the disk to the ext4 file system: sudo apt-get install gparted -> sudo gparted -> format the disk. Mount the disk to a mount point, e.g.: sudo mount /dev/sda2 ~/HDD. Permanent mounting: cat /etc/fstab to get UUID -> here. Change the ownership of the folder ~/HDD: sudo chown xxy ~/ vim cheatsheet. Docker: 1. build with `Dockerfile`: sudo docker build -t xxy ....

November 9, 2022 · 6 min

Jupyterhub

Here I introduce how to use JupyterHub on an HPC system. Introduction: JupyterHub is the best way to serve Jupyter notebooks for multiple users. Because JupyterHub manages a separate Jupyter environment for each user, it can be used for a class of students, a corporate data science group, or a scientific research group. It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server. It offers distributions for different use cases....

November 7, 2022 · 2 min

Pytorch Lightning

What is PyTorch Lightning? PyTorch Lightning is the deep learning framework with “batteries included” for professional AI researchers and machine learning engineers who need maximal flexibility while super-charging performance at scale. Quick start. Summary steps: Lightning module; forward function; configure optimizers; def training_step; def validation_step; remove .cuda(); backward and step as hooks; init the Lightning module; init the trainer; add other functions as callbacks. Explanation of dataloader and sampler: LightningDataModule was designed as a way of decoupling data-related hooks from the LightningModule, so you can develop dataset-agnostic models....

October 31, 2022 · 4 min

How to deploy singularity for data processing

Installation: Install on the local machine from singularity-installation. Create an “install.def” file. An example file (docker image downloaded from here): Bootstrap: docker From: pytorch/pytorch:1.9.0-cuda11.1-cudnn8-devel %post apt-get update apt-get install -y gcc apt-get install -y g++ apt-get install -y libglib2....

May 24, 2022 · 3 min

RemoteX11 configuration on vscode

Understanding remote X11: Suppose we have a local machine (Windows/Linux) and want to do some deep learning training or data analysis on a remote Linux server. To show images from calls like plt.plot() and plt.show() on the local machine, we need X11 forwarding, which renders images directly on the local machine. As a first step, we connect to the remote Linux server from our local machine. Assuming an SSH connection in MobaXterm, we need a private key on the local machine and a public key on the remote server....
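When X11 forwarding is active, the SSH session sets the `DISPLAY` environment variable on the remote server. A small sketch of how a script might check for it before calling `plt.show()` follows; the helper names `has_x11_display` and `pick_matplotlib_backend` are hypothetical, and choosing between the "TkAgg" and "Agg" backends is an assumption about the setup, not something taken from the post.

```python
import os


def has_x11_display(environ=None):
    """Return True when a non-empty DISPLAY variable is present."""
    env = os.environ if environ is None else environ
    return bool(env.get("DISPLAY"))


def pick_matplotlib_backend(environ=None):
    """Suggest an interactive backend only when a display is reachable."""
    # "TkAgg" opens windows (forwarded over X11); "Agg" only renders to files.
    return "TkAgg" if has_x11_display(environ) else "Agg"


if __name__ == "__main__":
    print("X11 display available:", has_x11_display())
    print("Suggested backend:", pick_matplotlib_backend())
```

If the check reports no display, saving figures with `plt.savefig()` is a common fallback on a remote server.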

March 20, 2022 · 6 min

Jekyll

Using Jekyll to create a GitHub Pages site on Windows. Understanding Jekyll, Gem, Bundle, Ruby. What is Ruby? Ruby is a dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write. Ruby is most used for building web applications. However, it is a general-purpose language similar to Python, so it has many other applications like data analysis, prototyping, and proofs of concept....

March 20, 2022 · 3 min