Linux进程调度

现在的位置: 首页 > 综合 > 正文

RSS

2018年10月02日 ⁄ 综合 ⁄ 共 5604字 ⁄ 字号小中大 ⁄ 评论关闭

一、策略

1.1 IO消耗性进程和CPU消耗性进程

IO消耗性进程：

大部分时间用来提交IO请求或等待IO请求。

缩短运行时间，提高运行频率(优先调度)。

CPU消耗性进程：

大部分时间用在执行代码上。

延长运行时间，降低运行频率。

1.2 进程优先级

优先级高的先运行，低的后运行，相同优先级的轮流运行。

动态优先级。

nice值：

范围-20到19 ，默认为0 。

值越小优先级越高。

实时优先级：

范围0到99。

优先级高于普通进程。

1.3 时间片

太长太短都不好。

进程不一定一次用完它所有的时间片。

所有进程时间片用完后，重新计算时间片。

1.4 进程抢占

进程切换到TASK_RUNNING状态，会检查其优先级，决定是否抢占。

时间片为0的进程，会被抢占。

二、Linux调度算法

2.1 可执行队列

运行队列struct runqueue，每个CPU一个。

Linux-2.6.32-220.el6.x86_64代码中的可执行队列结构体：

/*
 * This is the main, per-CPU runqueue data structure.
 *
 * Locking rule: those places that want to lock multiple runqueues
 * (such as the load balancing or the thread migration code), lock
 * acquire operations must be ordered by ascending &runqueue.
 */
struct rq {
	/* runqueue lock: */
	spinlock_t lock;

	/*
	 * nr_running and cpu_load should be in the same cacheline because
	 * remote CPUs use both these fields when doing load calculation.
	 */
	unsigned long nr_running;
	#define CPU_LOAD_IDX_MAX 5
	unsigned long cpu_load[CPU_LOAD_IDX_MAX];
#ifndef __GENKSYMS__
	unsigned long last_load_update_tick;
#endif
#ifdef CONFIG_NO_HZ
#ifndef __GENKSYMS__
	unsigned char nohz_balance_kick;
#else
	unsigned long last_tick_seen;
	unsigned char in_nohz_recently;
#endif
#endif

#ifndef __GENKSYMS__
	unsigned int skip_clock_update;
#endif
	/* capture load from *all* tasks on this cpu: */
	struct load_weight load;
	unsigned long nr_load_updates;
	u64 nr_switches;
#ifdef __GENKSYMS__
	u64 nr_migrations_in;
#endif
	struct cfs_rq cfs;
	struct rt_rq rt;

#ifdef CONFIG_FAIR_GROUP_SCHED
	/* list of leaf cfs_rq on this cpu: */
	struct list_head leaf_cfs_rq_list;
#endif
#ifdef CONFIG_RT_GROUP_SCHED
	struct list_head leaf_rt_rq_list;
#endif

	/*
	 * This is part of a global counter where only the total sum
	 * over all CPUs matters. A task can increase this counter on
	 * one CPU and if it got migrated afterwards it may decrease
	 * it on another CPU. Always updated under the runqueue lock:
	 */
	unsigned long nr_uninterruptible;

	struct task_struct *curr, *idle;
	unsigned long next_balance;
	struct mm_struct *prev_mm;

	u64 clock;

	atomic_t nr_iowait;

#ifdef CONFIG_SMP
	struct root_domain *rd;
	struct sched_domain *sd;

	unsigned char idle_at_tick;
	/* For active balancing */
	int post_schedule;
	int active_balance;
	int push_cpu;
	/* cpu of this runqueue: */
	int cpu;
	int online;

	unsigned long avg_load_per_task;

	struct task_struct *migration_thread;
	struct list_head migration_queue;

	u64 rt_avg;
	u64 age_stamp;
	u64 idle_stamp;
	u64 avg_idle;
#endif

	/* calc_load related fields */
	unsigned long calc_load_update;
	long calc_load_active;

#ifdef CONFIG_SCHED_HRTICK
#ifdef CONFIG_SMP
	int hrtick_csd_pending;
	struct call_single_data hrtick_csd;
#endif
	struct hrtimer hrtick_timer;
#endif

#ifdef CONFIG_SCHEDSTATS
	/* latency stats */
	struct sched_info rq_sched_info;
	unsigned long long rq_cpu_time;
	/* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */

	/* sys_sched_yield() stats */
	unsigned int yld_count;

	/* schedule() stats */
	unsigned int sched_switch;
	unsigned int sched_count;
	unsigned int sched_goidle;

	/* try_to_wake_up() stats */
	unsigned int ttwu_count;
	unsigned int ttwu_local;

	/* BKL stats */
	unsigned int bkl_count;
#endif
};

运行队列相关函数：

//在<linux/sched.c>中实现

//返回给定处理器上的可执行队列。
#define cpu_rq(cpu)		(&per_cpu(runqueues, (cpu)))   
//返回当前处理器上的可执行队列。
#define this_rq()		(&__get_cpu_var(runqueues))   
//返回给定进程所在的可执行队列。
#define task_rq(p)		cpu_rq(task_cpu(p))                 

//操作可执行队列时加锁。
static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
static inline void task_rq_unlock(struct rq *rq, unsigned long *flags)
static struct rq *this_rq_lock(void)

//需要加2次锁时，保证加锁顺序与解锁顺序相反。
static void double_rq_lock(struct rq *rq1, struct rq *rq2)
static void double_rq_unlock(struct rq *rq1, struct rq *rq2)

2.2 优先级数组

struct prio_array优先级数组：

使可运行处理器的每一种优先级都包含一个相应的队列，而这些队列包含对应优先级上的可执行进程链表。

位图：

开始时位图所有位置0，当某个优先级的进程开始准备执行时，位图中相应的位置1。

MAX_PRIO：

定义了系统拥有的优先级个数。默认值为140。

每个处理器维护2个优先级数组，一个活跃数组，一个过期数组。

2.3 重新计算时间片

所有进程时间片都用完，再重新计算时间片，相当耗时。

每个处理器维护2个优先级数组，一个活跃数组，一个过期数组。

2.4 schedule

1. 找到活动优先级数组中第一个被设置的位(该位对应着优先级最高的活动进程)；

2. 调度程序选择这个级别链表里的头一个进程；

3. 如果该进程不是当前进程，context_switch被调用。

2.5 计算优先级和时间片

nice值：static_prio

-20到19，默认0 。

用户设置不可改变。

动态优先级：prio

effective_prio()返回在nice上+5或-5。

动态优先级计算：

判断一个进程是IO消耗型，还是处理器消耗型；

主要通过休眠时间推断；

sleep_avg 0到MAX_SLEEP_AVG 默认10毫秒。

时间片计算：

以静态优先级为基础计算；

创建一个新进程时，父子进程均分时间片。

2.6 唤醒与睡眠

睡眠：

1. 进程把自己设置成休眠状态；

2. 从可执行队列移出，放入等待队列；

3. 调用schedule选择和执行一个其他进程。

唤醒：

1. 进程被设置成休眠状态；

2. 从等待队列移出，放入可执行队列。

休眠通过等待队列wait_queue_header_t处理。

2.7 负载平衡程序

负载平衡程序：

有多个处理器时，负载平衡程序使得各个处理器上调度的进程数均衡。

负载平衡程序有kernel/sched.c中 load_balance()函数实现。

调用时机：

schedule()调用时，可执行队列为空。

定时器定时调用。

三、抢占和上下文切换

3.1 上下文切换

上下文切换：

从一个可执行进程切换到另一个可执行进程。

context_switch()负责切换，schedule()会调用该函数。

知道什么时候调用schedule()：

need_resched来标志是否需要重新执行一次调度。

void set_tsk_need_resched(struct task_struct *tsk) //设置指定进程中的need_resched标志
void clear_tsk_need_resched(struct task_struct *tsk) //清除指定进程中的need_resched标志
int need_resched(void) //检查need_resched标志的值，如果被设置就返回真，否则返回假。

3.2 用户抢占

用户抢占发生在：

从系统调用返回用户空间。

从中断处理程序返回用户空间。

3.3 内核抢占

调度安全：

只要调度是安全的，内核可以在任何时间抢占。

没有锁，调度就是安全的。

thread_info引入了preempt_count锁计数器，为0时，表示没有锁。

内核抢占发生在：

从中断处理程序，返回内核空间之前。

内核代码再一次具有可抢占性的时候。

内核显示的调用schedule()。

内核中的任务阻塞（会导致调用schedule()）。

四、实时调度

4.1 SCHED_FIFO

比普通进程高。

没有时间片，一直运行，直到运行完或让出处理器。

4.2 SCHED_RR

比普通进程高。

有时间片。

4.3 SCHED_NORMAL

实时优先级范围：

0到MAX_RT_PRIO-1(100)。

nice值范围：

-20到19。

共享了MAX_RT_PRIO到MAX_RT_PRIO+40。

五、与调度相关的系统调用

#include <linux/sched.h>

int sched_setscheduler(struct task_struct *, int, struct sched_param *); //设置进程调度策略
long sched_setaffinity(pid_t pid, const struct cpumask *new_mask); //设置进程的处理器的亲和力
long sched_getaffinity(pid_t pid, struct cpumask *mask); //获取进程的处理器的亲和力

【上篇】《shell十三问》下
【下篇】Linux进程管理

作者: lech

该日志由 lech 于6年前发表在综合分类下，最后更新于 2018年10月02日.
转载请注明: Linux进程调度 | 学步园 +复制链接

抱歉!评论已关闭.

学步园

Linux进程调度

作者: lech

书签

最新文章New

本站推荐

返回首页