• 1. Understanding Linux Kernel - Booting, Syscalls, Interrupts & Context Switching. By: Jayant Upadhyay (2003CS50214), Pankaj K. Sharma (2003CS50219), Sohit Bansal (2003CS50224), Akshay Gaur (2003CS50209)
  • 2. Overview of Booting
    The process can be divided into the following six logical stages:
    • BIOS selects the boot device
    • BIOS loads the boot sector from the boot device
    • Boot sector loads setup, decompression routines and the compressed kernel image
    • Kernel is uncompressed in protected mode
    • Low-level initialization is performed by the asm code
    • High-level C initialization
  • 3. BIOS POST
    • POST: Power On Self Test
    • Power supply starts the clock generator and asserts the #POWERGOOD signal on the bus
    • CPU #RESET line is asserted
    • POST checks are performed with interrupts disabled
    • IVT is initialized at address zero
    • BIOS bootstrap function is invoked via INT 0x19; this loads track 0, sector 1 at physical address 0x7C00 (0x07C0:0000)
  • 4. Boot-sector & Setup
    The boot sector used to boot the Linux kernel could be either:
    • Linux boot sector (arch/i386/boot/bootsect.S)
    • LILO (or another bootloader's) boot sector
  • 5. Linux Boot-sector (bootsect.S)
    • First moves the boot sector's code from 0x7C00 to 0x90000
    • Then jumps to the newly made copy of the boot sector, i.e. in segment 0x9000 (physical address 0x90000)
    • Prepares the stack at $INITSEG:0x4000-0xC; this is where the limit on setup size comes from
    • Setup sectors are loaded immediately after the boot sector, i.e. at physical address 0x90200, using BIOS service INT 0x13
  • 6. Linux Boot-sector (continued)
    • If loading fails for some reason, the error code is dumped and the load is retried in an endless loop
    • If loading setup_sects sectors of setup code succeeds, we jump to label ok_load_setup
    • The kernel image is then loaded at 0x10000; this is done to preserve the firmware data in low memory (0-64K)
    • After the kernel is loaded we jump to $SETUPSEG:0 (arch/i386/boot/setup.S)
  • 7. setup.S
    • Once the firmware data is no longer needed (e.g. no more calls to BIOS), it is overwritten by moving the entire (compressed) kernel image from 0x10000 to 0x1000
    • Sets things up for protected mode and jumps to 0x1000, which is the head of the compressed kernel, i.e. arch/i386/boot/compressed/{head.S,misc.c}
    • This sets up the stack and calls decompress_kernel(), which uncompresses the kernel to address 0x100000 and jumps to it
  • 8. How to load a big kernel?
    The setup sectors are loaded as usual at 0x90200, but the kernel is loaded one 64K chunk at a time using a special helper routine that calls BIOS to move data from low to high memory. This helper routine is referred to by bootsect_kludge in bootsect.S and is defined as bootsect_helper in setup.S. The bootsect_kludge label in setup.S contains the value of the setup segment and the offset of the bootsect_helper code in it, so that the boot sector can use the lcall instruction to jump to it (inter-segment jump). This routine uses BIOS service int 0x15 (ax=0x8700) to move data to high memory and resets %es to always point to 0x10000. This ensures that the code in bootsect.S doesn't run out of low memory when copying data from disk.
  • 9. Using LILO as bootloader
    There are several advantages in using a specialised bootloader (LILO) over a bare-bones Linux boot sector:
    • Ability to choose between multiple Linux kernels or even multiple OSes
    • Ability to pass kernel command line parameters
    • Ability to load much larger bzImage kernels: up to 2.5M vs 1M
    Old versions of LILO (v17 and earlier) could not load bzImage kernels. The newer versions (as of a couple of years ago or earlier) use the same technique as bootsect+setup of moving data from low into high memory by means of BIOS services.
  • 10. High Level Initialization
    By "high-level initialisation" we mean anything which is not directly related to bootstrap, even though parts of the code that performs it are written in asm, namely arch/i386/kernel/head.S, which is the head of the uncompressed kernel. The following steps are performed:
    • Initialise segment values (%ds = %es = %fs = %gs = __KERNEL_DS = 0x18)
    • Initialise page tables
    • Enable paging by setting the PG bit in %cr0
    • Zero-clean BSS (on SMP, only the first CPU does this)
    • Copy the first 2k of bootup parameters (kernel command line)
    • Check CPU type using EFLAGS and, if possible, cpuid; able to detect 386 and higher
    • The first CPU calls start_kernel(); all others call arch/i386/kernel/smpboot.c:initialize_secondary() if ready=1, which just reloads esp/eip and doesn't return
  • 11. init/main.c:start_kernel() is written in C and does the following:
    • Perform arch-specific setup (memory layout analysis, copying boot command line again, etc.)
    • Print the Linux kernel "banner" containing the version, compiler used to build it etc. to the kernel ring buffer for messages. This is taken from the variable linux_banner defined in init/version.c and is the same string as displayed by cat /proc/version
    • Initialise traps, irqs, data required for the scheduler
    • Parse boot command line options and initialise the console
    • If module support was compiled into the kernel, initialise the dynamic module loading facility
  • 12. start_kernel() (continued)
    • If the "profile=" command line option was supplied, initialise profiling buffers
    • kmem_cache_init(): initialise most of the slab allocator
    • Enable interrupts
    • Calculate the BogoMips value for this CPU
    • Call mem_init(), which calculates max_mapnr, totalram_pages and high_memory and prints out the "Memory: ..." line
    • kmem_cache_sizes_init(): finish slab allocator initialisation
    • Initialise data structures used by procfs
    • fork_init(): create uid_cache, initialise max_threads based on the amount of memory available and configure RLIMIT_NPROC for init_task to be max_threads/2
  • 13. start_kernel() (continued)
    • Create various slab caches needed for VFS, VM, buffer cache, etc.
    • If System V IPC support is compiled in, initialise the IPC subsystem. Note that for System V shm, this includes mounting an internal (in-kernel) instance of the shmfs filesystem
    • If quota support is compiled into the kernel, create and initialise a special slab cache for it
    • Perform arch-specific "check for bugs" and, whenever possible, activate workarounds for processor/bus/etc. bugs. Comparing various architectures reveals that "ia64 has no bugs" and "ia32 has quite a few bugs"; a good example is the "f00f bug", which is only checked for if the kernel is compiled for less than 686 and is worked around accordingly
    • Set a flag to indicate that a schedule should be invoked at the "next opportunity" and create a kernel thread init(), which execs execute_command if supplied via the "init=" boot parameter, or tries to exec /sbin/init, /etc/init, /bin/init, /bin/sh in this order; if all these fail, panic with a "suggestion" to use the "init=" parameter
    • Go into the idle loop; this is an idle thread with pid=0
  • 14. Interrupts and Exceptions
    • Hardware support for getting the CPU's attention
      • Often transfers from user to kernel mode
      • Nested interrupts are possible; an interrupt can occur while an interrupt handler is already executing (in kernel mode)
    • Asynchronous: device or timer generated; unrelated to the currently executing process
    • Synchronous: immediate result of the last instruction; often represents a hardware error condition
    • Intel terminology and hardware: IRQs, vectors, IDT, gates, PIC, APIC
    • Interrupt handling: data structures, flow of control
    • Handlers: softirqs, tasklets, bottom halves
  • 15. Basic Ideas
    • Similar to a context switch (but lighter weight)
      • Hardware saves a small amount of context on the stack
      • Includes the interrupted instruction if a restart is needed
      • Execution resumes with the special "iret" instruction
    • Structure: top and bottom halves
      • Top half: do minimum work and return
      • Bottom half: deferred processing
    • Handler code is executed in response
      • Possible to temporarily mask interrupts
      • Handlers need not be reentrant
      • But other interrupts can occur, causing nesting
  • 16. Interrupts vs Exceptions
    Terminology varies, but for Intel:
    • Interrupts (asynchronous, device generated)
      • Maskable: device-generated, associated with IRQs (interrupt request lines); may be temporarily disabled (still pending)
      • Nonmaskable: some critical hardware failures
    • Exceptions (synchronous)
      • Processor-detected
        • Faults: correctable (restartable); e.g. page fault
        • Traps: no reexecution needed; e.g. breakpoint
        • Aborts: severe error; process usually terminated (by signal)
      • Programmed exceptions (software interrupts): int (system call), int3 (breakpoint), into (overflow), bound (address check)
  • 17. Vectors, IDT
    • Vector: index (0-255) into the descriptor table (IDT)
      • Special register: idtr points to the table (use lidt to load)
    • IDT: table of "gate descriptors"
      • Segment selector + offset for handler
      • Descriptor Privilege Level (DPL)
    • Gates (slightly different ways of entering the kernel)
      • Task gate: includes TSS to transfer to (not used by Linux)
      • Interrupt gate: disables further interrupts
      • Trap gate: further interrupts still allowed
    • Vector assignments
      • Exceptions and NMI are fixed
      • Maskable interrupts can be assigned as needed
  • 18. PIC
    • Programmable Interrupt Controller (PIC) chip between devices and CPU
    • Fixed number of wires in from devices: IRQs (Interrupt ReQuest lines)
    • Single wire to CPU + some registers
    • PIC translates IRQ to vector
      • Raises interrupt to CPU
      • Vector available in register
      • Waits for ack from CPU
    • Other interrupts may be pending
    • Possible to "mask" interrupts at PIC or CPU
    • Early systems cascaded two 8-input chips (8259A)
  • 19. Interrupt Handling Components [Figure; labels: PIC, CPU, Memory, Bus, INTR, vector, IRQs 0-15, IDT entries 0-255, handler, idtr, mask points]
  • 20. IO-APIC, LAPIC
    • Advanced PIC for SMP systems; used in all modern systems
    • Interrupts "routed" to CPU over system bus
    • IPI: inter-processor interrupt
    • Local APIC versus "frontend" IO-APIC
      • Devices connect to the front-end IO-APIC
      • IO-APIC communicates (over bus) with the Local APIC
    • Interrupt routing
      • Allows broadcast or selective routing of interrupts
      • Need to distribute interrupt handling load
      • Routes to the CPU running the lowest-priority process
      • Special register: Task Priority Register (TPR)
      • Arbitrates (round-robin) if equal priority
  • 21. Intel Exceptions
    • Architecture (processor) dependent; Intel has about 20 (out of 32 possible)
    • Most exceptions send a signal to the current process; the default action often just kills the process
    • Page fault is the one that differs: it has a very complex handler
    • Some examples (vector, signal, meaning):
      • 0, SIGFPE: divide by zero error
      • 3, SIGTRAP: breakpoint
      • 6, SIGILL: invalid op-code
      • 11, SIGBUS: segment not present
      • 12, SIGBUS: stack overflow
      • 13, SIGSEGV: general protection fault (DPL violation)
      • 14, SIGSEGV: page fault
  • 22. Hardware Handling
    • On entry:
      • Which vector?
      • Get corresponding descriptor in IDT
      • Find specified descriptor in GDT (for handler)
      • Check privilege levels (CPL, DPL)
      • If entering kernel mode, set kernel stack
      • Save eflags, cs, (original) eip on stack
      • Jump to appropriate handler
      • Assembly code prepares C stack, calls handler
    • On return (i.e. iret):
      • Restore registers from stack
      • If returning to user mode, restore user stack
      • Clear segment registers (if privileged selectors)
  • 23. Nested Execution
    • Interrupts can be interrupted
      • By different interrupts; handlers need not be reentrant
      • No notion of priority in Linux
      • Small portions execute with interrupts disabled
      • Interrupts remain pending until acked by CPU
    • Exceptions can be interrupted
      • By interrupts (devices needing service)
    • Exceptions can nest two levels deep
      • Exceptions indicate coding error
      • Exception code (kernel code) shouldn't have bugs
      • Page fault is possible (trying to touch user data)
  • 24. IDT Initialization
    • Initialized once by BIOS in real mode
    • Linux re-initializes it during kernel init
      • Must not expose kernel to user mode access
      • Starts by zeroing all descriptors
    • Linux lingo:
      • Interrupt gate (same as Intel; no user access): not accessible from user mode
      • System gate (Intel trap gate; user access): used for int, int3, into, bound
      • Trap gate (same as Intel; no user access): used for exceptions
  • 25. Exception Handling
    • Some exceptions push an error code on the stack
    • IDT points to small individual handlers (assembly):
        handler_name:
            pushl $0                 // placeholder if no error code
            pushl $do_handler_name
            jmp error_code
    • Common code sets up for the C call: pops the handler address from the stack and calls it
    • All handlers check whether the exception happened in kernel mode
      • Exceptions caused by touching bad syscall params: return to userland with error code
      • Other exceptions: die() // kernel Oops
    • Most handlers just generate a signal for the current process:
        current->tss.error_code = error_code;
        current->tss.trap_no = vector;
        force_sig(sig_number, current);
  • 26. Interrupt Handling
    • More complex than exceptions: requires registry, deferred processing, etc.
    • Some issues:
      • IRQs are often shared; all handlers (ISRs) are executed, so they must query the device
      • IRQs are dynamically allocated to reduce contention; example: floppy allocates when accessed
    • Three types of actions:
      • Critical: top half (interrupts disabled, briefly!); example: acknowledge interrupt
      • Non-critical: top half (interrupts enabled); example: read key scan code, add to buffer
      • Non-critical deferrable: do it "later" (interrupts enabled); example: copy keyboard buffer to terminal handler process. Softirqs, tasklets, bottom halves (deprecated)
  • 27. IRQ, Vector Assignment
    • PCI bus usually assigns IRQs at boot
    • Vectors are usually IRQ# + 32
      • Below 32: reserved for non-maskable interrupts and exceptions
      • Vector 128 is used for syscalls
      • Vectors 251-255 are used for IPIs
    • Some IRQs are fixed by the architecture
      • IRQ0: interval timer
      • IRQ2: cascade pin for 8259A
    • See /proc/interrupts for assignments
  • 28. IRQ Data Structures
    • irq_desc: array of IRQ descriptors
      • status (flags), lock, depth (for nested disables)
      • handler: PIC device driver!
      • action: linked list of irqaction structs (containing ISRs)
    • irqaction: ISR info
      • handler: actual ISR!
      • flags: SA_INTERRUPT (interrupts disabled if set), SA_SHIRQ (sharing allowed), SA_SAMPLE_RANDOM (input for /dev/random entropy pool)
      • name: for /proc/interrupts
      • dev_id, next
    • irq_stat: per-cpu counters (for /proc/interrupts)
  • 29. Interrupt Processing
    • BUILD_IRQ macro generates:
        IRQn_interrupt:
            pushl $n-256             // negative to distinguish from syscalls
            jmp common_interrupt
    • Common code:
        common_interrupt:
            SAVE_ALL                 // save a few more registers than hardware
            call do_IRQ
            jmp ret_from_intr
    • do_IRQ() is C code that handles all interrupts
  • 30. Low-level IRQ Processing
    • do_IRQ():
      • get vector, index into irq_desc for appropriate struct
      • grab per-vector spinlock, ack (to PIC) and mask line
      • set flags (IRQ_PENDING)
      • really process IRQ? (may be disabled, etc.)
      • call handle_IRQ_event()
      • some logic for handling lost IRQs on SMP systems
    • handle_IRQ_event():
      • enable interrupts if needed (SA_INTERRUPT clear)
      • execute all ISRs for this vector: action->handler(irq, action->dev_id, regs);
  • 31. Deferrable Functions
    • Bottom halves (deprecated):
      • Old static array of function pointers that are marked for execution (can be masked temporarily)
      • Executed on kernel to user transition
      • Executed serially (globally) on SMP systems
    • Mostly for networking code:
      • Tasklets: different tasklets can execute concurrently
      • Softirqs: the same softirq can execute concurrently
    • Layered implementation:
      • Bottom halves implemented using tasklets
      • Tasklets implemented using softirqs
    • When executed? (pretty frequently)
      • When the last (nested) interrupt handler terminates
      • When a network packet is received
      • When idle: per-cpu ksoftirqd kernel thread
    • Lots of detail in the book; a bit complex ...
  • 32. Return Code Path
    • Interleaved assembly entry points: ret_from_exception(), ret_from_intr(), ret_from_sys_call(), ret_from_fork()
    • See flowchart in text (Fig 4-5, page 158)
    • Things that happen:
      • Run scheduler if necessary
      • Return to user mode if no nested handlers: restore context, user stack, switch mode; re-enable interrupts if necessary
      • Deliver pending signals
      • (Some DOS emulation stuff: VM86 mode)
  • 33. System Calls
  • 34. System Calls
    • Interface between user-level processes and hardware devices (CPU, memory, disks, etc.)
    • Make programming easier: let the kernel take care of hardware-specific issues
    • Increase system security: let the kernel check the requested service via the syscall
    • Provide portability: maintain the interface but change the functional implementation
  • 35. Mode, Space, Context
    • Mode: hardware-restricted execution state
      • restricted access, privileged instructions
      • user mode vs. kernel mode
      • "dual-mode architecture", "protected mode"
      • Intel supports 4 protection "rings": 0 kernel, 1 unused, 2 unused, 3 user
    • Space: kernel (system) vs. user (process) address space
      • requires MMU support (virtual memory)
      • "userland": any process address space; there are many user address spaces
      • reality: the kernel is often mapped into user process space
    • Context: kernel activity on "behalf" of whom?
      • process: on behalf of the current process
      • system: unrelated to the current process (maybe no process!)
      • example: "interrupt context"; blocking not allowed!
  • 36. POSIX APIs
    • API = Application Programmer Interface: a function definition specifying how to obtain a service
    • By contrast, a system call is an explicit request to the kernel made via a software interrupt
    • The standard C library (libc) contains wrapper routines that make system calls; e.g., malloc and free are libc routines that use the brk system call
    • POSIX-compliant = having a standard set of APIs
    • Non-UNIX systems can be POSIX-compliant if they offer the required set of APIs
  • 37. Interrupts and Exceptions
    • Interrupts: async device to CPU communication
      • example: service request, completion notification
      • aside: IPI, interprocessor interrupt (another CPU!)
      • system may be interrupted in either kernel or user mode
      • interrupts are logically unrelated to current processing
    • Exceptions: sync hardware error notification
      • example: divide-by-zero (AU), illegal address (MMU)
      • exceptions are caused by current processing
    • Software interrupts (traps)
      • synchronous "simulated" interrupt
      • allows controlled "entry" into the kernel from userland
  • 38. Linux System Calls
    • Invoked by executing int $0x80: programmed exception, vector number 128
    • CPU switches to kernel mode and executes a kernel function
    • Calling process passes the syscall number identifying the system call in the eax register (on Intel processors)
    • Syscall handler is responsible for:
      • Saving registers on the kernel mode stack
      • Invoking the syscall service routine
      • Exiting by calling ret_from_sys_call()
  • 39. Linux System Calls
    • System call dispatch table: associates each syscall number with the corresponding service routine
    • Stored in the sys_call_table array, which has up to NR_syscalls entries (usually 256 maximum)
    • The nth entry contains the service routine address of syscall n
  • 40. Kernel Entry and Exit [Figure; labels: kernel, devices, library code, system call interface, trap / interrupt table, system call table, scheduler, boot, IPI (inter-processor interrupt), 80h, exceptions (error traps), interrupt, device dialog, trap, page faults]
  • 41. Initializing System Calls
    • trap_init(), called during kernel initialization, sets up the IDT (interrupt descriptor table) entry corresponding to vector 128: set_system_gate(0x80, &system_call);
    • A system gate descriptor is placed in the IDT, identifying the address of the system_call routine
      • Does not disable maskable interrupts
      • Sets the descriptor privilege level (DPL) to 3: allows User Mode processes to invoke exception handlers (i.e. syscall routines)
  • 42. The system_call() Function
    • Saves the syscall number and the CPU registers used by the exception handler on the stack, except those automatically saved by the control unit
    • Checks for a valid system call
    • Invokes the specific service routine associated with the syscall number (contained in eax): call *sys_call_table(, %eax, 4)
    • The return code of the system call is stored in eax
  • 43. Parameter Passing
    • As with the syscall number, user space must relay the parameters to the kernel during the exception trap
    • The parameters are stored in registers: on x86, the registers ebx, ecx, edx, esi, and edi contain, in order, the first five arguments
    • In the unlikely case of six or more arguments, a single register is used to hold a pointer to user space where all the parameters reside
    • The return value is sent to user space via a register, eax on x86
  • 44. Writing a system call for Linux
    • Define its purpose, i.e., exactly one purpose
    • Decide arguments, return value, and error codes
    • Design the interface with forward compatibility in mind
    • Return appropriate error codes
    Verifying the Parameters
    • The pointer points to a region of memory in user space
    • The pointer points to a region of memory in the process's address space
    • If reading, the memory is marked readable; if writing, the memory is marked writable
  • 45.
        asmlinkage long sys_scopy(unsigned long *src, unsigned long *dst, unsigned long len)
        {
            unsigned long buf;

            /* fail if the kernel wordsize and user wordsize do not match */
            if (len != sizeof(buf))
                return -EINVAL;

            /* copy one word from user space into the kernel buffer */
            if (copy_from_user(&buf, src, len))
                return -EFAULT;

            /* copy the word from the kernel buffer back out to user space */
            if (copy_to_user(dst, &buf, len))
                return -EFAULT;

            return len; /* return amount of data copied */
        }
    Argument order of the helpers:
        copy_to_user(usr_dst, krnl_src, len);
        copy_from_user(krnl_dst, usr_src, len);
  • 46. System Call Context
    • In process context, the kernel is capable of sleeping (e.g., blocked on a call or calling schedule()): system calls can make use of the majority of the kernel's functionality, which simplifies kernel programming
    • In process context, the kernel is preemptible: system calls must be reentrant (the current task may be preempted by another task that may then execute the same system call)
  • 47. Blocking System Calls
    • System calls may block "in the kernel"
    • "Slow" system calls may block indefinitely
      • reads, writes of pipes, terminals, net devices
      • some ipc calls, pause, some opens and ioctls
      • disk I/O is NOT slow (it will eventually complete)
    • Blocking slow calls may be "interrupted" by a signal: the call returns EINTR
    • Problem: slow calls must be wrapped in a loop
    • BSD introduced "automatic restart" of slow interrupted calls
    • POSIX didn't specify semantics
    • Linux: no automatic restart by default; specify restart when setting the signal handler (SA_RESTART)
  • 48. Linux Files Relating to Syscalls
    Main files:
    • arch/i386/kernel/entry.S: system call and low-level fault handling routines
    • include/asm-i386/unistd.h: system call numbers and macros
    • kernel/sys.c: system call service routines
  • 49. arch/i386/kernel/entry.S
    Add system calls by appending an entry to sys_call_table:
        .long SYMBOL_NAME(sys_my_system_call)
  • 50. include/asm-i386/unistd.h
    Each system call needs a number in the system call table, e.g.:
        #define __NR_write 4
        #define __NR_my_system_call nnn
    where nnn is the next free entry in the system call table.
  • 51. kernel/sys.c
    Service routine bodies are defined here, e.g.:
        asmlinkage retval sys_my_system_call(parameters)
        {
            body of service routine;
            return retval;
        }
  • 52. Example System Calls
    • sys_foo, do_foo idiom
      • all system calls proper begin with sys_
      • often delegate to a do_ function for the real work
    • asmlinkage
      • gcc magic to keep parameters on the stack
      • avoids register optimizations
    • sys_ni_syscall
      • just returns -ENOSYS!
      • guards position 0 in the table (catches uninitialized bugs)
      • fills "holes" for obsolete syscalls or library-implemented calls
  • 53. Example System Calls: sys_time
    • kernel/time.c: sys_time
    • just returns the number of seconds since Jan 1, 1970, available as volatile CURRENT_TIME (xtime.tv_sec)
      • snapshot the current time
      • check the user-supplied pointer for validity
      • copy the time to user space (asm/uaccess.h:put_user)
      • return the time snapshot or an error
  • 54. Example System Calls: sys_reboot
    • kernel/sys.c: sys_reboot
    • require the CAP_SYS_BOOT capability
    • check "magic numbers" (0xfee1dead, Torvalds family birthdays)
    • acquire the "big kernel lock"
    • switch on options
      • shut down in various ways: restart, halt, poweroff
      • "user-specified" shutdown command for some architectures
      • toggle control-alt-delete processing
    • go through reboot_notifier callbacks as appropriate
    • unlock and return an error on failure
  • 55. Example System Calls: sys_sysinfo
    • kernel/info.c: sys_sysinfo
    • allocate a local struct to return info to user space
    • disable (clear) interrupts to keep the info consistent
      • calculate uptime
      • calculate 1, 5, 15 minute "load averages": average length of the run queue over the interval; uses confusing integer math to avoid floating-point inefficiency
    • enable (set) interrupts
    • return the number of processes and some memory stats
    • copy the local struct values to user space (copy_to_user)
  • 56. Context switch in Linux
  • 57. Memory layout: general picture [Figure; labels: user memory and stack of processes X, Y and Z; kernel memory; kernel stack and task_struct of processes X, Y and Z; TSS of CPU i, tss->esp0]
  • 58. #1: kernel stack after any system call, before context switch [Figure. Saved on the kernel stack during a transition to kernel mode, by the jump to the interrupt handler and by the SAVE_ALL macro: ss, esp, eflags, cs, eip, orig_eax, es, ds, eax, ebp, edi, esi, edx, ecx, ebx, followed by the schedule() function frame. Other labels: TSS tss->esp0, task_struct thread.esp0, prev, esp, user stack, user code]
  • 59. #2: stack of prev before the switch_to macro in schedule() [Figure; prev's kernel stack: EAX, ECX, EDX saved by schedule(); old (schedule()'s) EBP; arguments to context_switch(); return address to schedule(). Other labels: esp, TSS tss->esp0, task_struct with thread.eip, thread.esp, thread.esp0]
  • 60. #3: switch_to saves esi, edi, ebp on the stack of prev [Figure; same stack as #2, followed by EDI, ESI, EBP]
  • 61. #4: switch_to saves esp in prev->thread.esp [Figure; same stack as #3]
  • 62. #5: switch_to loads next->thread.esp into esp [Figure; prev and next each have a kernel stack laid out as in #3 and a task_struct with thread.eip, thread.esp, thread.esp0; $1f is shown as the saved eip of next]
  • 63. #6: switch_to saves the return address ($1f) in prev->thread.eip [Figure; same layout as #5]
  • 64. #7: switch_to saves the return address ($1f) on the stack of next [Figure; same layout as #5]
  • 65. #8: __switch_to function saves the base of next's stack in the TSS (tss->esp0) [Figure; same layout as #5]
  • 66. #9: back in switch_to, eip points to the instruction at label 1: ($1f) [Figure; same layout as #5]
  • 67. #10: switch_to restores esi, edi, ebp from the stack of next [Figure; same layout as #5, with EDI, ESI, EBP removed from next's stack]
  • 68. Thank you