Understanding Vulnerabilities 1: C, ASM, and Overflows: Computer Security Lectures 2014/15 S2

This video is part of the computer/information/cyber security and ethical hacking lecture series by Z. Cliffe Schreuders at Leeds Beckett University. Laboratory worksheets, slides, and other open educational resources are available at http://z.cliffe.schreuders.org.

The slides themselves are Creative Commons licensed (CC-BY-SA), and images used are licensed as individually attributed.

Topics covered in this lecture include:

We will delve into the technical details required to understand common kinds of system software vulnerabilities
Programming languages
Why look at C/C++
C – explained
Spot the bug(s)!
The First “Computer Bug”
Moth found trapped in the Mark II Aiken Relay Calculator at Harvard University, 9 September 1947
Making mistakes that result in software vulnerabilities is very easy!
Writing code without mistakes is hard!
Common errors
Memory errors and bounds checking: esp. buffer overflows
Un-sanitised input: incl. command injection
Race conditions
Pointers, strings, and other complications
Misconfigured/used access controls and other security mechanisms
We will look at each of these…
We will use C as our example programming language
C was created for development of UNIX
The C language is closely tied to Unix system calls, but is used for all sorts of things
The Linux kernel is written in C
C is (mostly) a subset of C++
By today’s standards C is a fairly low level language (exposes complexity)
Does not have all the security primitives that are built into some other languages
C was designed to be light-weight – it leaves more to the programmer
No enforced bounds checking
C is not a type-safe language (or is only weakly type safe)
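As a rough illustration of the two points above, the sketch below shows a silent signed-to-unsigned conversion and a bounds check that the programmer has to write by hand. The function names `as_unsigned` and `checked_get` are hypothetical, chosen only for this example.

```c
#include <limits.h>
#include <stddef.h>

/* Weak typing: the signed argument is converted to unsigned implicitly.
 * No error is raised, so -1 silently becomes UINT_MAX. */
unsigned int as_unsigned(int value) {
    return value;
}

/* C never checks an array index for us, so the check is written by hand.
 * Returns 0 and stores the element in *out on success,
 * or -1 if the index is out of range. */
int checked_get(const int *array, size_t length, size_t index, int *out) {
    if (index >= length) {
        return -1;
    }
    *out = array[index];
    return 0;
}
```

Without the explicit `index >= length` test, reading `array[index]` past the end would compile cleanly and have undefined behaviour at run time.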
Here is a crash course in C programming…
Variables and C
Reading input and C
Conditions and C
Functions and C
Loops and C
Fixed iteration
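The crash-course topics above (variables, reading input, conditions, functions, and fixed-iteration loops) might look something like the following minimal sketch. It is not taken from the slides; the names `is_even`, `sum_to`, and `parse_int` are made up for illustration, and input is parsed from a string so the example does not depend on the keyboard.

```c
#include <stdio.h>

/* A function with a condition: returns 1 if n is even, 0 otherwise. */
int is_even(int n) {
    if (n % 2 == 0) {
        return 1;
    } else {
        return 0;
    }
}

/* A loop with fixed iteration: adds up the numbers 1..n. */
int sum_to(int n) {
    int total = 0;              /* a local variable */
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Reading input: parses one integer from a line of text.
 * scanf("%d", &value) would do the same from standard input.
 * Returns 1 if a number was read, 0 otherwise. */
int parse_int(const char *line, int *value) {
    return sscanf(line, "%d", value) == 1;
}
```

A caller could combine them as `if (parse_int(line, &n) && is_even(n)) { printf("%d\n", sum_to(n)); }`.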
More C programming…
How to compile a C program
C program source code is stored in .c and .h files
If you have a single .c file you can compile
If it is a more complicated program
Machine code
Machine code is instructions that can be executed directly by a CPU
Access to files and network resources needs to be via an OS system call
Machine code is specific to a CPU architecture/instruction set (e.g. 32-bit x86 or 64-bit x86-64)
This binary data contains a sequence of opcodes and operands
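To make "opcodes and operands" concrete, here is a small sketch that stores six bytes of x86 machine code in a C array and picks apart the first instruction. The bytes encode `mov eax, 42` followed by `ret`; the array and function names are hypothetical, and the code only inspects the bytes, it does not execute them.

```c
#include <stddef.h>
#include <stdint.h>

/* x86 machine code for:  mov eax, 42  ;  ret
 * 0xB8 is the opcode for "mov eax, imm32"; the next four bytes are its
 * operand (42, stored little-endian). 0xC3 is the opcode for "ret". */
const uint8_t example_code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

/* Decodes a "mov eax, imm32" instruction at the start of bytes.
 * Returns 0 and stores the operand in *operand on success,
 * or -1 if the bytes do not start with that instruction. */
int decode_mov_eax(const uint8_t *bytes, size_t length, uint32_t *operand) {
    if (length < 5 || bytes[0] != 0xB8) {
        return -1;
    }
    *operand = (uint32_t)bytes[1]
             | (uint32_t)bytes[2] << 8
             | (uint32_t)bytes[3] << 16
             | (uint32_t)bytes[4] << 24;
    return 0;
}
```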
CPU registers
Assembly language
Mnemonic rather than binary opcodes
AT&T vs Intel
Important opcode mnemonics
Stack instructions
Integer instructions
Code jumping
Debugging software
GDB: The GNU Project Debugger
Temp storage and memory
The stack: static and limited; this is where most variables and function information are stored
The heap: dynamic – for when a program wants more memory to use for variables
The call stack (or simply “the stack”) is an area of memory used to keep track of program execution
When a function is called, a new stack frame is added to the stack for that function.
Stores which code to return to after the function ends
Stores local variables and parameters for the function
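One way to see that every call gets its own stack frame is to record the address of a local variable at each level of a recursive call. The sketch below does that; `descend`, `addresses_distinct`, and `MAX_DEPTH` are names invented for this example, and the actual addresses (and whether they go up or down) depend on the platform.

```c
#include <stdint.h>

#define MAX_DEPTH 8

/* One recorded address per active call of descend(). */
static uintptr_t local_address[MAX_DEPTH];

/* Calls itself until max_depth is reached. Each call has its own frame,
 * and therefore its own copy of `local`. Returns the depth reached. */
int descend(int depth, int max_depth) {
    volatile int local = depth;
    if (depth >= max_depth || depth >= MAX_DEPTH) {
        return depth;
    }
    local_address[depth] = (uintptr_t)(void *)&local;
    int reached = descend(depth + 1, max_depth);
    /* Reading `local` after the call means this frame must still exist
     * while the deeper calls run. */
    return reached + (local - depth);
}

/* Returns 1 if the first `count` recorded addresses are all different. */
int addresses_distinct(int count) {
    for (int i = 0; i < count; i++) {
        for (int j = 0; j < i; j++) {
            if (local_address[i] == local_address[j]) {
                return 0;
            }
        }
    }
    return 1;
}
```

After `descend(0, 4)`, the four recorded addresses differ because four frames were live at the same time, one per call.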
The stack: important x86 CPU registers
EBP: frame/base pointer
ESP: stack pointer
EIP: instruction pointer (the next instruction); when a function returns, it is retrieved from the return address stored on the stack
C strings
C does not have a “string” data type; instead, a null-terminated array of characters is used
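A short sketch of what that means in practice: the array below holds five letters plus a terminating `'\0'` byte, and the length of the string is found by walking the array until that byte is reached. The names `greeting` and `count_chars` are hypothetical; `count_chars` does by hand what the standard `strlen()` does.

```c
#include <stddef.h>

/* Six bytes are stored: 'h' 'e' 'l' 'l' 'o' '\0'. */
const char greeting[] = "hello";

/* Counts characters up to, but not including, the terminating '\0'. */
size_t count_chars(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Nothing in the array records its own size, so a function that is handed only a `char *` has to trust that the terminator is there.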
Strings and bounds
Many standard C string functions do not provide bounds checking
gets() is never safe
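`gets()` cannot be told how big the destination buffer is, which is why it was removed from the language in C11. A minimal sketch of bounded alternatives, assuming the standard `fgets()` and `snprintf()` functions, is shown below; the wrapper names `read_line` and `copy_bounded` are hypothetical.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Reads at most size - 1 characters of one line from `in` and strips the
 * trailing newline. Returns buf, or NULL on end of file or error. */
char *read_line(char *buf, size_t size, FILE *in) {
    if (size == 0 || size > INT_MAX) {
        return NULL;
    }
    if (fgets(buf, (int)size, in) == NULL) {
        return NULL;
    }
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}

/* Copies src into dst without writing past dst_size bytes.
 * Returns 0 if the whole string fitted, -1 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size) {
        return -1;
    }
    return 0;
}
```

A typical call would be `read_line(name, sizeof name, stdin)`, which passes the real buffer size along with the buffer.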
Stack smashing buffer overflows
If we carefully craft input we can make the program jump to somewhere else in its own logic (known as arc injection)
Or execute our own “shellcode” (known as code injection)
The stack grows with each function called, and contains return addresses: pointers to where to jump to when each function ends
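The vulnerable pattern behind a stack-smashing overflow can be sketched as follows. This is an illustrative example rather than code from the lecture: the function names and the 16-byte buffer size are assumptions. The first function copies input into a fixed-size local buffer with no length check; the second adds the missing check.

```c
#include <string.h>

/* VULNERABLE: buffer lives in this function's stack frame, near the saved
 * frame pointer and return address. Input of 16 or more characters makes
 * strcpy() write past the end of buffer and into that neighbouring data. */
int check_password_unsafe(const char *input) {
    char buffer[16];
    strcpy(buffer, input);              /* no bounds check */
    return strcmp(buffer, "secret") == 0;
}

/* Same logic, but input that would not fit is rejected before copying. */
int check_password_checked(const char *input) {
    char buffer[16];
    if (strlen(input) >= sizeof buffer) {
        return 0;
    }
    strcpy(buffer, input);              /* safe: length was checked above */
    return strcmp(buffer, "secret") == 0;
}
```

With short input both functions behave identically; the difference only appears when the input is longer than the buffer, which is exactly the case an attacker controls.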
Avoid these functions
