C++ Programming Tutorials

Topic : Parallel Programming
Author : LUPG
Page : << Previous 2 Next >>

handle the given event type.
4. Go back to '1'.
(mantra - a technical support technician is sitting by the phone, doing nothing in particular. Every once in a while a phone call arrives - an event - and the technician handles the call, and afterwards goes back to waiting for another call).

Process Scheduling

Process Preemption

An event of the system suspending the execution of a process that is in the middle of putting the CPU to good use, in order to let other processes run. This makes sure no single process will take over the full CPU, even if it gets into an infinite loop and does not make any blocking system calls.

Context Switch

The method by which the operating system makes one CPU suspend the execution of on process, and continue executing a second process. This is called a context switch since the CPU needs to switch to using the full context of the second process - its execution stack, its memory area, the values the registers contained when last executing this process, etc. A context switch may occure when the running process blocks waiting for some resource to become free, when it yields the CPU voluntarily, or when it is being preempted by the operating system's scheduler. (mantra - you talk to your mom on the phone, while trying to watch a very interesting T.V. program - you switch your attention between you T.V. and your mom constantly, always taking a short while to remember where you were before the last switch).

Round-Robin Scheduling

A scheduling algorithm in which several processes are being executed by an executed one after the other in turn, each getting the same amount of time to run. After this time slice passes, a context-switch is made, where the first process on the list gets to run next, while the suspended process is being put back at the end of the list. (mantra - when the ideal teacher in class makes sure every pupil gets a chance to speak in turn - well, as if that ideal teacher exists...).

Prioritized Scheduling

A scheduling algorithm in which a priority is assigned to each process, and when several processes are waiting for the CPU to get executed, the one with highest priority is being run, usually until it gets blocked waiting for some resource, or until a process with a higher priority wakes up and demands CPU time. (mantra - on a T.V. debate, some people are given higher priority then others, and interrupt everybody else when they have something to say. On Israeli T.V. - see under 'Popolitica').

Load Balancing

A method by which several tasks are scheduled for separate CPUs or computers, based on the current load on each CPU. Sometimes this includes the ability to migrate processes from one CPU (or computer) to another during its execution.

Commands Pipelining

A process in which commands are broken into several smaller steps, that may be executed in sequence using different components of the CPU, meanwhile allowing others components of the CPU to start executing the next command in parallel. For example, suppose that we have a CPU component that can read data from memory, and another that can do mathematical calculations. If we have two consecutive addition commands that don't affect each other (i.e. working on un-related parameters), the CPU could first load the data for the first command using its I/O component, and then while the mathematical component calculates the sum of the data, the I/O component can start fetching the data for the next command in parallel, making overall execution faster. (mantra - when you wake up very late, and find you are very late for school, you start wearing your shirt, and while slipping it on your body with one hand, the other hand rushes to pick up a new pair of socks from the drawer, and so on).

Implementations Of Parallel Systems

Parallel systems implementation may be done in software, in hardware, or as a combination of both. It can also be made using symmetric elements, all of which are capable of performing the same tasks in the same level of efficiency, or using units that are specializing in in different jobs. We will show here several of the commonly used approaches for building a parallel system. In real life cases, we will usually see several methods combined, hopefully due to some good reasoning by the designers, and not by just the driving of marketeers wanting their product to carry the title of a "Parallelic system designed with the Cache-Coherency paradigm, using state-of-the-art virtual multi-threaded scheduling policy". You get the idea.

Implementations In Software

Software implementations of parallel systems usually have to handle the task of letting many processes run on a limited amount of hardware, using it in the most efficient way possible. Most efficient might mean "making sure the CPU is never idle", or better yet "making sure the whole system finishes its tasks as quickly as possible, in human time".

Multi Processes

This time we use the term process to denote an operating system process. All Unix systems, as well as many other operating systems, are multi-process systems. In most of these systems, each process is given its own address space, and they all share the same CPU in turn. The scheduling algorithm might be a simple round-robin algorithm, or one that takes priorities into account (priorities are mostly needed for real-time programs, that must be granted some limits on how long it'll take since an event they need arrives, until they are allowed to execute it).

Multi Threading Kernels

A system similar to the multi-process system, except that here a process may be divided into several threads. All threads share the same data area, and are being scheduled in a similar way to how processes are scheduled. Due to the sharing of data area (and few other resources), a context-switch between two threads of the same process is faster then a context switch between two processes. Also, passing data between threads is easier then between two processes, due to the shared address space. On the other hand, protection of threads from one another is weaker then that given to processes - if one thread corrupts its memory area, all other threads in its process may be directly affected. Threads supported by the kernel are sometimes called 'Light-Weight Processes' (LWP).

Multi Threading Libraries

Similar to the kernel-based threads, except that the operating system's kernel is not aware of these threads - they are created and handled by a library. The advantage is that "context-switching" between them is faster - it does not involve any system call (with the overhead of switching into kernel mode). On the other hand, if one thread gets blocked by the operating system when invoking some system call (e.g. sleep(), or waiting for input from a network device), the whole process gets suspended, including all its threads. Thus a multi-threaded library implementation is suitable for calculation-intensive applications, not for I/O intensive applications.

Implementations In Hardware

Implementations of parallel systems often involve using extra hardware - several CPUs that can execute different processes in parallel being the most notable hardware addition.

SMP - Symmetric Multi-Processing

This method is used to contain several CPUs in a single computer, each of which has access to the same set of memory chips, and each working as a general-purpose CPU, that can execute any process in the system. The operating system is responsible for assigning newly created processes to specific processes, using some load-balancing algorithm. Sometimes these systems also handle moving a process from one CPU to another, that just became free.

In the past it was a simple task (after the current CPU finished executing one command, simple copy its registers contents into another CPU, and set it to continue execution from the same position). However, these days a CPU executes several commands of a process simultaneously, using pipelining techniques, and also contains an internal cache containing some data and commands, making such a process migration harder to implement, and more wasteful (all the partially-completed commands in the pipeline have to be flashed, the cache of the second CPU has to be reloaded, etc).

SMP systems exist now on many platforms, including PCs running with Intel and Intel-clone CPUs, Sun SPARC stations, using Ultra-SPARC CPUs (or older SPARC-10 and SPARC-20 CPUs), IBM RS/6000 systems using several PowerPC CPUs, and so on. Support for SMP systems is built into many operating systems running on these types of hardware, usually different by the amount of CPUs supported, as well as various internal implementation parameters.

CC-NUMA - Cache-Coherent Non-Uniform Memory Access

This is a different concept of having a parallel system with several CPUs. sometimes this system uses specialized processes to do different tasks, sometimes the access to just access to different memory parts is done in an asymmetric way (i.e. every CPU (or group of CPUs) have its own memory part, and thus memory is not shared

Page : << Previous 2 Next >>

C++ Home Menu