
NOHZ Dynamic Ticks

December 22, 2017

Most of the content below is from the book, but the book is based on the 2.6 kernel.
I analyze a newer version, 3.1.1, so there are some changes.
====================================

Dynamic Ticks:

contents:

0. Introductions

1.  Dynamic Ticks for  Low-Resolution Systems

2. The Dynamic Tick Handler

3. Updating jiffies

4.  Dynamic Ticks for High-Resolution Systems       

5. Stopping and Starting Periodic Ticks

==================

0. Introductions

Periodic ticks have provided a notion of time to the Linux kernel for many years.
The approach is simple and effective, but shows one particular deficiency on systems
where power consumption matters: the periodic tick requires that the system
be in an active state at a certain frequency. Longer periods of rest are impossible
because of this.

Dynamic ticks mend this problem. The periodic tick is only activated when some tasks
actually need to be performed.

How can the kernel decide if the system has nothing to do? If no active tasks
are on the run queue, the kernel picks a special task, the idle task, to run. At this point,
the dynamic tick mechanism enters the game. Whenever the idle task is selected
to run, the periodic tick is disabled until the next timer expires. The tick is

re-enabled after this time span, or when an interrupt occurs. In the meantime,

the CPU can enjoy a well-deserved sleep.

!!! NOTE that only classical timers need to be considered for this purpose.
High-resolution timers are not bound by the tick frequency, and are also not
implemented on top of periodic ticks.

The following are the related data structures.
enum tick_nohz_mode {
    NOHZ_MODE_INACTIVE,// periodic ticks are active
    NOHZ_MODE_LOWRES, //dynamic ticks are used based on low-resolution mode
    NOHZ_MODE_HIGHRES,// dynamic ticks are used based on high-resolution mode
};

/**
 * struct tick_sched - sched tick emulation and no idle tick control/stats
 * @sched_timer:    hrtimer to schedule the periodic tick in high
 *          resolution mode
 * @idle_tick:      Store the last idle tick expiry time when the tick
 *          timer is modified for idle sleeps. This is necessary
 *          to resume the tick timer operation in the timeline
 *          when the CPU returns from idle
                // A sufficient number of tick intervals are added to obtain the expiration time
                // for the next tick.
 * @tick_stopped:   Indicator that the idle tick has been stopped
 * @idle_jiffies:   jiffies at the entry to idle for idle time accounting   +
 * @idle_calls:     Total number of idle calls
 * @idle_sleeps:    Number of idle calls, where the sched tick was stopped
 * @idle_entrytime: Time when the idle call was entered
 * @idle_waketime:  Time when the idle was interrupted
 * @idle_exittime:  Time when the idle state was left
 * @idle_sleeptime: Sum of the time slept in idle with sched tick stopped
 * @iowait_sleeptime:   Sum of the time slept in idle with sched tick stopped, with IO outstanding
 * @sleep_length:   Duration of the current idle sleep
 * @do_timer_last:  CPU was the last one doing do_timer before going idle
 */
struct tick_sched {
    struct hrtimer          sched_timer;
    unsigned long           check_clocks;
      /*
      * The current mode of operation.
      */
    enum tick_nohz_mode     nohz_mode;
    ktime_t             idle_tick;
    int             inidle;
    int             tick_stopped;
    unsigned long           idle_jiffies;
    unsigned long           idle_calls;
    unsigned long           idle_sleeps;
    int             idle_active;
    ktime_t             idle_entrytime;
    ktime_t             idle_waketime;
    ktime_t             idle_exittime;
    ktime_t             idle_sleeptime;
    ktime_t             iowait_sleeptime;
    ktime_t             sleep_length;
    unsigned long           last_jiffies;
    unsigned long           next_jiffies;
    ktime_t             idle_expires;
    int             do_timer_last;
};

@tick_cpu_sched is a global per-CPU variable that provides an instance of
struct tick_sched. This is required because disabling ticks naturally works
per CPU, not globally for the whole system.
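As a rough userspace sketch (all names here are hypothetical, not kernel APIs), the per-CPU pattern behind tick_cpu_sched can be pictured as an array with one tick_sched instance per CPU, so stopping the tick on one CPU never touches another:

```c
#include <assert.h>

/* Hypothetical userspace sketch of the per-CPU idea behind tick_cpu_sched:
 * every CPU owns its own tick_sched instance, so one CPU can stop its
 * tick without affecting the others. */
#define NR_CPUS 4

enum tick_nohz_mode { NOHZ_MODE_INACTIVE, NOHZ_MODE_LOWRES, NOHZ_MODE_HIGHRES };

struct tick_sched_sketch {
    enum tick_nohz_mode nohz_mode;
    int tick_stopped;
};

/* stands in for the real DEFINE_PER_CPU(struct tick_sched, tick_cpu_sched) */
static struct tick_sched_sketch tick_cpu_sched_sketch[NR_CPUS];

static struct tick_sched_sketch *cpu_ts(int cpu)
{
    return &tick_cpu_sched_sketch[cpu];
}
```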

1.  Dynamic Ticks for  Low-Resolution Systems

 Consider the situation in which the kernel does not use high-resolution timers
 and provides only low resolution. We will show how dynamic ticks are implemented in this
 scenario.
  -- Switching to Dynamic Ticks
     Recall that, in the Generic Time Subsystem, tick_setup_device() is used to
     set up a tick device. If the clock event device supports periodic events,
     tick_setup_periodic() installs tick_handle_periodic() as handler function of
     the tick device.  tick_handle_periodic() is called on the next event of the tick
     device, and it calls tick_periodic(). tick_periodic() is responsible for
     handling the periodic tick on the CPU given as an argument. The following
     is the call tree.
     

        Call Tree:
         tick_handle_periodic    
                tick_periodic
                   
        Call Tree:
         tick_periodic  |  tick_nohz_handler |  tick_sched_timer
                update_process_times
                         run_local_timers
                                 hrtimer_run_queues
                                 raise_softirq(TIMER_SOFTIRQ) // This will trigger the execution of
                                                              // run_timer_softirq()
        Call Tree:
        run_timer_softirq
            hrtimer_run_pending
                tick_check_oneshot_change //****
                hrtimer_switch_to_hres
                
  From the call tree above, we know that every time run_local_timers() is called,
  run_timer_softirq() will be triggered. In run_timer_softirq(), hrtimer_run_pending() is called.
  hrtimer_run_pending() calls tick_check_oneshot_change() to decide if high-resolution
  timers can be activated. Additionally, the function checks if dynamic ticks can be
  enabled on low-resolution systems. This is possible under two conditions:
    1> A clock event device that supports one-shot mode is available.
    2> High-resolution mode is not enabled.
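As a minimal sketch of this decision (a hypothetical helper, not the kernel function itself), the two conditions reduce to a simple boolean check:

```c
#include <stdbool.h>
#include <assert.h>

/* Hypothetical sketch of the two conditions checked on the way to
 * low-res nohz mode: a one-shot capable clock event device must be
 * available, and high-resolution mode must not be enabled. */
static bool can_switch_to_lowres_nohz(bool oneshot_available, bool hres_enabled)
{
    if (!oneshot_available)
        return false;     /* condition 1: need CLOCK_EVT_FEAT_ONESHOT */
    return !hres_enabled; /* condition 2: high-res timers are not active */
}
```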

   
1405 /*  
1406  * Called from timer softirq every jiffy, expire hrtimers:
1407  *     
1408  * For HRT its the fall back code to run the softirq in the timer
1409  * softirq context in case the hrtimer initialization failed or has
1410  * not been done yet.
1411  */
1412 void hrtimer_run_pending(void)
1413 {
1414     if (hrtimer_hres_active())
1415         return;
1416     
1417     /*
1418      * This _is_ ugly: We have to check in the softirq context,
1419      * whether we can switch to highres and / or nohz mode. The
1420      * clocksource switch happens in the timer interrupt with
1421      * xtime_lock held. Notification from there only sets the
1422      * check bit in the tick_oneshot code, otherwise we might
1423      * deadlock vs. xtime_lock.
1424      */
1425     if (tick_check_oneshot_change(!hrtimer_is_hres_enabled()))
1426         hrtimer_switch_to_hres();
1427 }  

839 /**
840  * Check, if a change happened, which makes oneshot possible.
841  *
842  * Called cyclic from the hrtimer softirq (driven by the timer
843  * softirq) allow_nohz signals, that we can switch into low-res nohz
844  * mode, because high resolution timers are disabled (either compile
845  * or runtime).
846  */  
847 int tick_check_oneshot_change(int allow_nohz)
848 {
849     struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
850
851     if (!test_and_clear_bit(0, &ts->check_clocks))
852         return 0;
853
854     if (ts->nohz_mode != NOHZ_MODE_INACTIVE)
855         return 0;
856
857     if (!timekeeping_valid_for_hres() || !tick_is_oneshot_available())
858         return 0;
859
860     if (!allow_nohz)
861         return 1;
862
863     tick_nohz_switch_to_nohz();
864     return 0;
865 }

 46 /**
 47  * tick_is_oneshot_available - check for a oneshot capable event device
 48  */
 49 int tick_is_oneshot_available(void)
 50 {
 51     struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
 52
 53     if (!dev || !(dev->features & CLOCK_EVT_FEAT_ONESHOT))
 54         return 0;
 55     if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
 56         return 1;
 57     return tick_broadcast_oneshot_available();
 58 }


Call Tree:
hrtimer_run_pending
    tick_nohz_switch_to_nohz
        tick_switch_to_oneshot(tick_nohz_handler)
                tick_broadcast_switch_to_oneshot // omit this now
        hrtimer_init
        tick_init_jiffy_update
        hrtimer_set_expires
        tick_program_event

    
609 /**
610  * tick_nohz_switch_to_nohz - switch to nohz mode
611  */
612 static void tick_nohz_switch_to_nohz(void)
613 {
614     struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
615     ktime_t next;
616
            /* If CONFIG_NO_HZ is set, tick nohz is enabled by default. */
617     if (!tick_nohz_enabled)
618         return;
619
620     local_irq_disable();
       /*
       * set @tick_cpu_device's event device's mode to TICKDEV_MODE_ONESHOT,
       * change the event device's event_handler to tick_nohz_handler().
       */
621     if (tick_switch_to_oneshot(tick_nohz_handler)) {
622         local_irq_enable();
623         return;
624     }   
625         
626     ts->nohz_mode = NOHZ_MODE_LOWRES;
627         
628     /*
629      * Recycle the hrtimer in ts, so we can share the
630      * hrtimer_forward with the highres code.
631      */
632     hrtimer_init(&ts->sched_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
633     /* Get the next period */
634     next = tick_init_jiffy_update();
635
           /*
           * To get things going, the kernel finally needs to activate the first periodic
           * tick by setting the timer to expire at the point in time when the next
           * periodic tick would have been due.
           */
636     for (;;) {
637         hrtimer_set_expires(&ts->sched_timer, next);
638         if (!tick_program_event(next, 0))
639             break;
640         next = ktime_add(next, tick_period);
641     }
642     local_irq_enable();
643
644     printk(KERN_INFO "Switched to NOHz mode on CPU #%d\n", smp_processor_id());
645 }
646
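The for (;;) loop at the end of tick_nohz_switch_to_nohz() can be pictured in a small userspace sketch (fake_program_event, arm_first_nohz_event, and the nanosecond-based ktime type are assumptions for illustration, not kernel APIs): if programming fails because the expiry already lies in the past, the kernel simply tries one tick period later.

```c
#include <stdint.h>
#include <assert.h>

typedef int64_t ktime_ns;   /* stand-in for ktime_t, in nanoseconds */

static ktime_ns fake_now;   /* pretend current time */

/* Mimics tick_program_event(): fails (nonzero) if the expiry is not in the future. */
static int fake_program_event(ktime_ns expires)
{
    return expires <= fake_now ? -1 : 0;
}

/* Keep pushing the expiry forward by one tick period until programming succeeds. */
static ktime_ns arm_first_nohz_event(ktime_ns next, ktime_ns tick_period)
{
    for (;;) {
        if (!fake_program_event(next))
            return next;     /* armed: this is when the first one-shot event fires */
        next += tick_period; /* expiry was already in the past: try the next period */
    }
}
```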

101 /*
102  * NOHZ - aka dynamic tick functionality
103  */
104 #ifdef CONFIG_NO_HZ
105 /*
106  * NO HZ enabled ?
107  */
108 static int tick_nohz_enabled __read_mostly  = 1;
109
110 /*
111  * Enable / Disable tickless mode
112  */
113 static int __init setup_tick_nohz(char *str)
114 {
115     if (!strcmp(str, "off"))
116         tick_nohz_enabled = 0;
117     else if (!strcmp(str, "on"))
118         tick_nohz_enabled = 1;
119     else
120         return 0;
121     return 1;
122 }
123
124 __setup("nohz=", setup_tick_nohz);

126 /**
127  * tick_switch_to_oneshot - switch to oneshot mode
128  */
       /*
       * set @tick_cpu_device's event device's mode to TICKDEV_MODE_ONESHOT,
       * change the event device's event_handler to @handler.
       */
129 int tick_switch_to_oneshot(void (*handler)(struct clock_event_device *))
130 {
131     struct tick_device *td = &__get_cpu_var(tick_cpu_device);
132     struct clock_event_device *dev = td->evtdev;
133     
134     if (!dev || !(dev->features & CLOCK_EVT_FEAT_ONESHOT) ||
135             !tick_device_is_functional(dev)) {
136     
137         printk(KERN_INFO "Clockevents: "
138                "could not switch to one-shot mode:");
139         if (!dev) {
140             printk(" no tick device\n");
141         } else {
142             if (!tick_device_is_functional(dev))
143                 printk(" %s is not functional.\n", dev->name);
144             else
145                 printk(" %s does not support one-shot mode.\n",
146                        dev->name);
147         }
148         return -EINVAL;
149     }
150
151     td->mode = TICKDEV_MODE_ONESHOT;
152     dev->event_handler = handler;
153     clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
154     tick_broadcast_switch_to_oneshot();
155     return 0;
156 }

2. The Dynamic Tick Handler


The new tick handler tick_nohz_handler() needs to assume two responsibilities:
    1> Perform all actions required for the tick mechanism.
    2> Reprogram the tick device such that the next tick expires at the right time.

    -------------------------------------------------------
        Call Tree:
         tick_periodic  |  tick_nohz_handler |  tick_sched_timer
                update_process_times
                         run_local_timers
                                 hrtimer_run_queues
                                 raise_softirq(TIMER_SOFTIRQ) // This will trigger the execution of
                                                              // run_timer_softirq()
        Call Tree:
        run_timer_softirq
            hrtimer_run_pending
                tick_check_oneshot_change //****
                hrtimer_switch_to_hres
                
 
         Call Tree: // ****
         tick_nohz_handler
                tick_do_update_jiffies64       
                update_process_times
                profile_tick
                tick_nohz_reprogram   
                

561 /*
562  * The nohz low res interrupt handler
563  */
564 static void tick_nohz_handler(struct clock_event_device *dev)
565 {
566     struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
567     struct pt_regs *regs = get_irq_regs();
568     int cpu = smp_processor_id();
569     ktime_t now = ktime_get();
570
        /*
        * This ensures that the event device will not expire anytime soon,
        * or, for practical purposes, never.
        */
571     dev->next_event.tv64 = KTIME_MAX;
572
573     /*
574      * Check if the do_timer duty was dropped. We don't care about
575      * concurrency: This happens only when the cpu in charge went
576      * into a long sleep. If two cpus happen to assign themself to
577      * this duty, then the jiffies update is still serialized by
578      * xtime_lock.
579      */
            /*
            * The role of the global tick device is assumed by one particular CPU,
            * and the handler needs to check if the current CPU is the responsible one.
            * If a CPU goes into a long sleep, it cannot be responsible for the
            * global tick anymore and drops the duty. In this case, the next
            * CPU whose handler is invoked must assume the duty.
            */
580     if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
581         tick_do_timer_cpu = cpu;
582
583     /* Check, if the jiffies need an update */
          /*
          * If the CPU is responsible for providing the global tick, it is sufficient
          * to call tick_do_update_jiffies64(), which will be discussed later.
          */
584     if (tick_do_timer_cpu == cpu)
585         tick_do_update_jiffies64(now);
586
587     /*
588      * When we are idle and the tick is stopped, we have to touch
589      * the watchdog as we might not schedule for a really long
590      * time. This happens on complete idle SMP systems while
591      * waiting on the login prompt. We also increment the "start
592      * of idle" jiffy stamp so the idle accounting adjustment we
593      * do when we go busy again does not account too much ticks.
594      */
595     if (ts->tick_stopped) {
596         touch_softlockup_watchdog();
597         ts->idle_jiffies++;
598     }
599
            /*
            * Take over the duty of the local tick.
            */
600     update_process_times(user_mode(regs));
601     profile_tick(CPU_PROFILING);
602
           /*
           * Set the tick timer to expire at the next jiffy.
           * The while loop ensures that reprogramming is repeated until it
           * succeeds if the processing took too long and the next tick
           * already lies in the past.
           */
603     while (tick_nohz_reprogram(ts, now)) {
604         now = ktime_get();
605         tick_do_update_jiffies64(now);
606     }
607 }
608
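The do_timer hand-over at the top of the handler can be sketched as follows (a hypothetical userspace helper mirroring the check, not the kernel code):

```c
#include <assert.h>

#define TICK_DO_TIMER_NONE (-1)

static int tick_do_timer_cpu_sketch = TICK_DO_TIMER_NONE;

/* Sketch of the duty check in tick_nohz_handler(): if nobody currently
 * owns the do_timer duty (the previous owner went into a long sleep and
 * dropped it), the CPU running the handler takes it over. Returns nonzero
 * if this CPU should update jiffies. */
static int take_do_timer_duty(int cpu)
{
    if (tick_do_timer_cpu_sketch == TICK_DO_TIMER_NONE)
        tick_do_timer_cpu_sketch = cpu;
    return tick_do_timer_cpu_sketch == cpu;
}
```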
 

3. Updating jiffies

The global tick device calls tick_do_update_jiffies64() to update the global
jiffies_64 variable, the basis of low-resolution timer handling. When periodic
ticks are in use, this is comparatively simple because the function is called
whenever a jiffy has passed. When dynamic ticks are enabled, the situation
can arise in which all CPUs of the system are idle and none provides global
ticks. This needs to be taken into account by tick_do_update_jiffies64().
        

 43 /*
 44  * Must be called with interrupts disabled !
 45  */
 46 static void tick_do_update_jiffies64(ktime_t now)
 47 {
 48     unsigned long ticks = 0;
 49     ktime_t delta;
 50
 51     /*
 52      * Do a quick check without holding xtime_lock:
 53      */
 54     delta = ktime_sub(now, last_jiffies_update);
 55     if (delta.tv64 < tick_period.tv64)
 56         return;
 57
 58     /* Reevalute with xtime_lock held */
 59     write_seqlock(&xtime_lock);
 60
 61     delta = ktime_sub(now, last_jiffies_update);
        /*
        * Updating the jiffies value is naturally only required if the last update
        * is more than one tick period ago.
        */
 62     if (delta.tv64 >= tick_period.tv64) {
 63
 64         delta = ktime_sub(delta, tick_period);
 65         last_jiffies_update = ktime_add(last_jiffies_update,
 66                         tick_period);
 67
 68         /* Slow path for long timeouts */
 69         if (unlikely(delta.tv64 >= tick_period.tv64)) {
 70             s64 incr = ktime_to_ns(tick_period);
 71
 72             ticks = ktime_divns(delta, incr);
 73
 74             last_jiffies_update = ktime_add_ns(last_jiffies_update,
 75                                incr * ticks);
 76         }
            /*
            * Update the global jiffies value.
            */
 77         do_timer(++ticks);
 78
 79         /* Keep the tick_next_period variable up to date */
 80         tick_next_period = ktime_add(last_jiffies_update, tick_period);
 81     }
 82     write_sequnlock(&xtime_lock);
 83 }
 84
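The catch-up arithmetic above can be reproduced in a small userspace sketch (plain 64-bit nanoseconds instead of ktime_t; catch_up_jiffies is a hypothetical name). It returns the value that would be passed to do_timer():

```c
#include <stdint.h>
#include <assert.h>

/* Sketch of the jiffies catch-up done by tick_do_update_jiffies64():
 * advance *last_update over all full tick periods between it and now,
 * and return how many jiffies elapsed in total. */
static uint64_t catch_up_jiffies(int64_t *last_update, int64_t now, int64_t period)
{
    int64_t delta = now - *last_update;
    uint64_t ticks = 0;

    if (delta < period)
        return 0;                   /* less than one tick elapsed: nothing to do */

    delta -= period;
    *last_update += period;

    if (delta >= period) {          /* slow path for long idle sleeps */
        ticks = delta / period;
        *last_update += (int64_t)ticks * period;
    }
    return ticks + 1;               /* the value handed to do_timer() */
}
```

For example, with a 10 ns "period", sleeping from time 0 to time 35 crosses the tick boundaries at 10, 20, and 30, so three jiffies are accounted at once and the last-update stamp lands on 30.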
 

4.  Dynamic Ticks for High-Resolution Systems

   Since clock event devices run in one-shot mode anyway if the kernel uses
   high timer resolution, support for dynamic ticks is much easier to implement
   than in the low-resolution case. The periodic tick is emulated by tick_sched_timer()
   as discussed above. The function is also used to implement dynamic ticks.

710 /*
711  * High resolution timer specific code
712  */
713 #ifdef CONFIG_HIGH_RES_TIMERS
714 /*
715  * We rearm the timer until we get disabled by the idle code.
716  * Called with interrupts disabled and timer->base->cpu_base->lock held.
717  */
718 static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
...
726 #ifdef CONFIG_NO_HZ
727     /*
728      * Check if the do_timer duty was dropped. We don't care about
729      * concurrency: This happens only when the cpu in charge went
730      * into a long sleep. If two cpus happen to assign themself to
731      * this duty, then the jiffies update is still serialized by
732      * xtime_lock.
733      */
734     if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
735         tick_do_timer_cpu = cpu;
736 #endif
...
}

    Recall that tick_setup_sched_timer() is used to initialize the tick emulation
    layer for high-resolution systems. If dynamic ticks are enabled at compile
    time, a short piece of code is added to the function:
    

        768 /**
        769  * tick_setup_sched_timer - setup the tick emulation timer
        770  */
        771 void tick_setup_sched_timer(void)
               ...
        795 #ifdef CONFIG_NO_HZ
        796     if (tick_nohz_enabled) {
        797         ts->nohz_mode = NOHZ_MODE_HIGHRES;
        798         printk(KERN_INFO "Switched to NOHz mode on CPU #%d\n", smp_processor_id());
        799     }
        800 #endif

5. Stopping and Starting Periodic Ticks


Dynamic ticks provide the framework to defer periodic ticks for a while.
What the kernel still needs to decide is when ticks are supposed to be
stopped and restarted.

A natural possibility to stop ticks is when the idle task is scheduled:
this proves that the processor really does not have anything better to do.

 The idle task is implemented in an architecture-specific way, and not
 all architectures have been updated to support disabling the periodic tick
 yet. Architectures differ in some details, but the general principle is the
 same.

void cpu_idle(void)
{
   ...
    /* endless idle loop with no priority at all */
    while (1) {
      tick_nohz_stop_sched_tick(1);
        while (!need_resched()) {
            ...
                if (cpuidle_idle_call())
                    pm_idle();
             ...
        }
      ...
        tick_nohz_restart_sched_tick();
       ...
    }
}

After calling tick_nohz_stop_sched_tick() to turn off ticks, the system goes into
an endless loop that ends when a process is available to be scheduled on the
processor. Ticks are then necessary again, and are reactivated by
tick_nohz_restart_sched_tick().

Two conditions can thus require restarting ticks:
1> An external interrupt makes a process runnable, which requires the tick
  mechanism to work. In this case, ticks need to be resumed earlier than
  initially planned.
2> The next tick event is due, and the clock interrupt signals that the time
   for this has come. In this case, the tick mechanism is resumed as planned
   before.

Essentially, tick_nohz_stop_sched_tick() needs to perform three tasks:
 1> Check if the next timer wheel event is more than one tick away.
 2> If this is the case, reprogram the tick device so that the next tick occurs
    only when it is necessary again. This automatically omits all ticks that are not required.
 3> Update the statistical information in struct tick_sched.
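Task 1 boils down to a jiffies comparison; a hedged sketch of that first decision (worth_stopping_tick is a hypothetical name, not the kernel function):

```c
#include <stdbool.h>
#include <assert.h>

/* Sketch of the first decision in tick_nohz_stop_sched_tick(): the tick is
 * only worth stopping when the next timer wheel event is more than one
 * jiffy away (unless the tick has already been stopped). */
static bool worth_stopping_tick(unsigned long last_jiffies,
                                unsigned long next_jiffies,
                                bool tick_stopped)
{
    unsigned long delta_jiffies = next_jiffies - last_jiffies;

    if (!tick_stopped && delta_jiffies == 1)
        return false;   /* only one jiffy off: keep the periodic tick running */
    return delta_jiffies >= 1;
}
```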

 
/**
 * tick_nohz_stop_sched_tick - stop the idle tick from the idle task
 *
 * When the next event is more than a tick into the future, stop the idle tick
 * Called either from the idle loop or from irq_exit() when an idle period was
 * just interrupted by an interrupt which did not cause a reschedule.
 */
void tick_nohz_stop_sched_tick(int inidle)
{
    unsigned long seq, last_jiffies, next_jiffies, delta_jiffies, flags;
    struct tick_sched *ts;
    ktime_t last_update, expires, now;
    struct clock_event_device *dev = __get_cpu_var(tick_cpu_device).evtdev;
    u64 time_delta;
    int cpu;

    local_irq_save(flags);

    cpu = smp_processor_id();
    ts = &per_cpu(tick_cpu_sched, cpu);

    /*
     * Call to tick_nohz_start_idle stops the last_update_time from being
     * updated. Thus, it must not be called in the event we are called from
     * irq_exit() with the prior state different than idle.
     */
    if (!inidle && !ts->inidle)
        goto end;
    /*
     * Set ts->inidle unconditionally. Even if the system did not
     * switch to NOHZ mode the cpu frequency governers rely on the
     * update of the idle time accounting in tick_nohz_start_idle().
     */
    ts->inidle = 1;

    now = tick_nohz_start_idle(cpu, ts);

    /*
     * If this cpu is offline and it is the one which updates
     * jiffies, then give up the assignment and let it be taken by
     * the cpu which runs the tick timer next. If we don't drop
     * this here the jiffies might be stale and do_timer() never
     * invoked.
     */
    if (unlikely(!cpu_online(cpu))) {
        if (cpu == tick_do_timer_cpu)
            tick_do_timer_cpu = TICK_DO_TIMER_NONE;
    }

    if (unlikely(ts->nohz_mode == NOHZ_MODE_INACTIVE))
        goto end;

    if (need_resched())
        goto end;
    if (unlikely(local_softirq_pending() && cpu_online(cpu))) {
        static int ratelimit;

        if (ratelimit < 10) {
            printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
                   (unsigned int) local_softirq_pending());
            ratelimit++;
        }
        goto end;
    }

    ts->idle_calls++;
    /* Read jiffies and the time when jiffies were updated last */
    do {
        seq = read_seqbegin(&xtime_lock);
        last_update = last_jiffies_update;
        last_jiffies = jiffies;
        time_delta = timekeeping_max_deferment();
    } while (read_seqretry(&xtime_lock, seq));

    if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
        arch_needs_cpu(cpu)) {
        next_jiffies = last_jiffies + 1;
        delta_jiffies = 1;
   } else {
        /* Get the next timer wheel timer */
        next_jiffies = get_next_timer_interrupt(last_jiffies);
        delta_jiffies = next_jiffies - last_jiffies;
    }
    /*
     * Do not stop the tick, if we are only one off
     * or if the cpu is required for rcu
     */
    if (!ts->tick_stopped && delta_jiffies == 1)
        goto out;

    /* Schedule the tick, if we are at least one jiffie off */
    if ((long)delta_jiffies >= 1) {

        /*
         * If this cpu is the one which updates jiffies, then
         * give up the assignment and let it be taken by the
         * cpu which runs the tick timer next, which might be
         * this cpu as well. If we don't drop this here the
         * jiffies might be stale and do_timer() never
         * invoked. Keep track of the fact that it was the one
         * which had the do_timer() duty last. If this cpu is
         * the one which had the do_timer() duty last, we
         * limit the sleep time to the timekeeping
         * max_deferement value which we retrieved
         * above. Otherwise we can sleep as long as we want.
         */
        if (cpu == tick_do_timer_cpu) {
            tick_do_timer_cpu = TICK_DO_TIMER_NONE;
            ts->do_timer_last = 1;
        } else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
            time_delta = KTIME_MAX;
            ts->do_timer_last = 0;
        } else if (!ts->do_timer_last) {
            time_delta = KTIME_MAX;
        }

        /*
         * calculate the expiry time for the next timer wheel
         * timer. delta_jiffies >= NEXT_TIMER_MAX_DELTA signals
         * that there is no timer pending or at least extremely
         * far into the future (12 days for HZ=1000). In this
         * case we set the expiry to the end of time.
         */
        if (likely(delta_jiffies < NEXT_TIMER_MAX_DELTA)) {
            /*
             * Calculate the time delta for the next timer event.
             * If the time delta exceeds the maximum time delta
             * permitted by the current clocksource then adjust
             * the time delta accordingly to ensure the
             * clocksource does not wrap.
             */
            time_delta = min_t(u64, time_delta,
                       tick_period.tv64 * delta_jiffies);
        }

        if (time_delta < KTIME_MAX)
            expires = ktime_add_ns(last_update, time_delta);
        else
            expires.tv64 = KTIME_MAX;

        /* Skip reprogram of event if its not changed */
        if (ts->tick_stopped && ktime_equal(expires, dev->next_event))
            goto out;

        /*
         * nohz_stop_sched_tick can be called several times before
         * the nohz_restart_sched_tick is called. This happens when
         * interrupts arrive which do not cause a reschedule. In the
         * first call we save the current tick time, so we can restart
         * the scheduler tick in nohz_restart_sched_tick.
         */
        if (!ts->tick_stopped) {
            select_nohz_load_balancer(1);

            ts->idle_tick = hrtimer_get_expires(&ts->sched_timer);
            ts->tick_stopped = 1;
            ts->idle_jiffies = last_jiffies;
            rcu_enter_nohz();
        }
       ts->idle_sleeps++;

        /* Mark expires */
        ts->idle_expires = expires;

        /*
         * If the expiration time == KTIME_MAX, then
         * in this case we simply stop the tick timer.
         */
         if (unlikely(expires.tv64 == KTIME_MAX)) {
            if (ts->nohz_mode == NOHZ_MODE_HIGHRES)
                hrtimer_cancel(&ts->sched_timer);
            goto out;
        }

        if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
            hrtimer_start(&ts->sched_timer, expires,
                      HRTIMER_MODE_ABS_PINNED);
            /* Check, if the timer was already in the past */
            if (hrtimer_active(&ts->sched_timer))
                goto out;
        } else if (!tick_program_event(expires, 0))
                goto out;
        /*
         * We are past the event already. So we crossed a
         * jiffie boundary. Update jiffies and raise the
         * softirq.
         */
        tick_do_update_jiffies64(ktime_get());
    }
    raise_softirq_irqoff(TIMER_SOFTIRQ);
out:
    ts->next_jiffies = next_jiffies;
    ts->last_jiffies = last_jiffies;
    ts->sleep_length = ktime_sub(dev->next_event, now);
end:
    local_irq_restore(flags);
}

----------------
tick_nohz_restart_sched_tick
        tick_do_update_jiffies64
         /* Account idle time */
         /* Set tick_sched->tick_stopped = 0 */
         /* program the next tick event */

         
/**
 * tick_nohz_restart_sched_tick - restart the idle tick from the idle task
 *
 * Restart the idle tick when the CPU is woken up from idle
 */
void tick_nohz_restart_sched_tick(void)
{
    int cpu = smp_processor_id();
    struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
#ifndef CONFIG_VIRT_CPU_ACCOUNTING
    unsigned long ticks;
#endif
    ktime_t now;

    local_irq_disable();
    if (ts->idle_active || (ts->inidle && ts->tick_stopped))
        now = ktime_get();

    if (ts->idle_active)
        tick_nohz_stop_idle(cpu, now);

    if (!ts->inidle || !ts->tick_stopped) {
        ts->inidle = 0;
        local_irq_enable();
        return;
    }

    ts->inidle = 0;

    rcu_exit_nohz();

    /* Update jiffies first */
    select_nohz_load_balancer(0);
    tick_do_update_jiffies64(now);

#ifndef CONFIG_VIRT_CPU_ACCOUNTING
    /*
     * We stopped the tick in idle. Update process times would miss the
     * time we slept as update_process_times does only a 1 tick
     * accounting. Enforce that this is accounted to idle !
     */
    ticks = jiffies - ts->idle_jiffies;
    /*
     * We might be one off. Do not randomly account a huge number of ticks!
     */
    if (ticks && ticks < LONG_MAX)
        account_idle_ticks(ticks);
#endif

    touch_softlockup_watchdog();
    /*
     * Cancel the scheduled timer and restore the tick
     */
    ts->tick_stopped  = 0;
    ts->idle_exittime = now;

    tick_nohz_restart(ts, now);

    local_irq_enable();
}
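The idle-time correction inside the #ifndef CONFIG_VIRT_CPU_ACCOUNTING block can be sketched as follows (idle_ticks_to_account is a hypothetical helper for illustration):

```c
#include <assert.h>
#include <limits.h>

/* Sketch of the idle accounting fix-up in tick_nohz_restart_sched_tick():
 * all jiffies that passed while the tick was stopped are accounted as idle
 * time, with a guard against an off-by-one wrap producing a huge bogus
 * value. */
static unsigned long idle_ticks_to_account(unsigned long jiffies_now,
                                           unsigned long idle_jiffies)
{
    unsigned long ticks = jiffies_now - idle_jiffies;

    if (ticks && ticks < LONG_MAX)
        return ticks;   /* would be passed to account_idle_ticks() */
    return 0;           /* nothing to account, or clearly bogus */
}
```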
