Lab 03: Memory Safety [CS Open CourseWare]

Lab 03: Memory Safety

During this lab you will learn about the memory safety mechanisms that are available in D.

@safe functions

A class of programming errors involves corrupting data at unrelated locations in memory by unintentionally writing to those locations. Such errors are mostly caused by mistakes in using pointers and applying type casts. @safe functions guarantee that they do not contain any operation that may corrupt memory. An extensive list of the operations that are forbidden in @safe functions can be found in the D language specification.

Let's take this code for example:

import std.stdio : writeln;
 
void main()
{
    int a = 7;
    int b = 9;
 
    /* some code later */
 
    *(&a + 1) = 2;
    writeln(b);
    writeln(b);
}

Running this code yields the result:

9
2

Wait, what? No, this is not a typo: the value of b changes between two instructions for no apparent reason. This happens because the compiler offers no guarantees about pointer arithmetic on local variables, so this is by definition an unsafe operation. Had the faulting line been buried among another thousand lines of code, tracking down the root of the problem would have taken a long time. Now let us annotate the main function definition with @safe:

void main() @safe

The compiler now rejects the program and points at the offending line:

test.d(10): Error: cannot take address of local a in @safe function main

For templated functions, the @safe attribute is inferred after the function has been generated with the provided instantiation types. This means that a templated function can generate a @safe function for one type and an un-@safe function for a different type.
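As a minimal sketch of this inference, consider the hypothetical template elementAt below (the name and the checks are illustrative and not part of the lab sources). Indexing a slice is bounds-checked, so the int[] instance is inferred @safe; indexing a raw pointer is unchecked, so the int* instance is inferred @system.

```d
// Hypothetical template used only for illustration.
// No safety attribute is written: the compiler infers one per instantiation.
auto elementAt(T)(T data, size_t i)
{
    return data[i];
}

// Slice indexing is bounds-checked: this instance is callable from @safe code.
static assert( __traits(compiles, () @safe { int[] a; return elementAt(a, 0); }));

// Pointer indexing is unchecked: this instance is not callable from @safe code.
static assert(!__traits(compiles, () @safe { int* p; return elementAt(p, 0); }));

void main() @safe
{
    int[] values = [10, 20, 30];
    assert(elementAt(values, 1) == 20);
}
```

The same source line, data[i], is accepted or rejected in @safe code depending only on the type the template was instantiated with.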

@trusted

The safety rules work well to prevent memory corruption, but they also reject a lot of valid code that is actually safe. For example, consider a function that wants to use the system call read, which is prototyped like this:

ssize_t read(int fd, void* ptr, size_t nBytes);

For those unfamiliar with this function, it reads data from the given file descriptor, and puts it into the buffer pointed at by ptr and expected to be nBytes bytes long. It returns the number of bytes actually read, or a negative value if an error occurs.

Using this function to read data into a stack-allocated buffer might look like this:

ubyte[128] buf;
auto nread = read(fd, buf.ptr, buf.length);

How can this be done inside a @safe function? The main obstacle is that @safe code can only treat a pointer as referring to a single value, in this case a single ubyte, whereas read expects to store up to nBytes bytes through it. In D, we would normally pass the buffer as a dynamic array, which carries its own length. However, read is not D code: it uses the common C idiom of passing the buffer pointer and the length separately, so it cannot be marked @safe. Consider the following call from @safe code:

auto nread = read(fd, buf.ptr, 10_000);

This call is definitely not safe: it allows read to write up to 10,000 bytes into a 128-byte buffer. In the earlier read example, only that one particular call is safe, because our understanding of the read function and of the calling context assures us that memory outside the buffer will not be written.

To solve this situation, D provides the @trusted attribute, which tells the compiler that the code inside the function is assumed to be @safe, but will not be mechanically checked. It’s on you, the developer, to make sure the code is actually @safe.

A function that solves the problem might look like this in D:

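import core.sys.posix.unistd : read;

// The pointer and the length come from the same slice, so read cannot write past buf.
auto safeRead(int fd, ubyte[] buf) @trusted
{
    return read(fd, buf.ptr, buf.length);
}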

Whenever marking an entire function @trusted, consider if code could call this function from any context that would compromise memory safety. If so, this function should not be marked @trusted under any circumstances. Even if the intention is to only call it in safe ways, the compiler will not prevent unsafe usage by others. safeRead should be fine to call from any @safe context, so it’s a great candidate to mark @trusted.
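To see how such a wrapper is consumed, here is a sketch of a complete program, assuming a POSIX system. The helper readSome and the pipe-based set-up are hypothetical and exist only to give the @safe code something to read from.

```d
import core.sys.posix.unistd : close, pipe, read, write;

// @trusted wrapper: pointer and length always come from the same slice.
auto safeRead(int fd, ubyte[] buf) @trusted
{
    return read(fd, buf.ptr, buf.length);
}

// Hypothetical @safe caller: it never touches a raw pointer itself.
ubyte[] readSome(int fd, size_t maxBytes) @safe
{
    auto buf = new ubyte[](maxBytes);
    const n = safeRead(fd, buf);
    return n > 0 ? buf[0 .. n] : null;
}

void main()
{
    // The set-up is ordinary @system code: create a pipe and write 3 bytes to it.
    int[2] fds;
    if (pipe(fds) != 0)
        return;
    immutable ubyte[3] msg = [1, 2, 3];
    write(fds[1], msg.ptr, msg.length);

    // The @safe function gets the data back through the @trusted wrapper.
    assert(readSome(fds[0], 16) == msg[]);

    close(fds[0]);
    close(fds[1]);
}
```

Only safeRead needs manual review; readSome is checked mechanically by the compiler.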

@system

@system is the default safety attribute for functions and it implies that no automatic safety checks are performed.
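A small sketch of what this default means in practice (the function poke is hypothetical): an unannotated function may use pointer arithmetic freely, and in exchange it cannot be called from @safe code.

```d
// No attribute is written, so this function is @system and is not checked.
void poke(int* p, int value)
{
    *(p + 1) = value;   // pointer arithmetic is accepted here
}

// @safe code may only call @safe and @trusted functions.
static assert(!__traits(compiles, () @safe { int* p; poke(p, 0); }));

// @system code may call anything.
static assert( __traits(compiles, () @system { int* p; poke(p, 0); }));

void main()
{
    int[2] pair = [1, 2];
    poke(&pair[0], 42);   // writes pair[1]; nothing verifies that this stays in bounds
    assert(pair == [1, 42]);
}
```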

Type Qualifiers

Type qualifiers modify a type by applying a type constructor. Type constructors are: const, immutable, shared, and inout. Each applies transitively to all subtypes.

const/immutable

When examining a data structure or interface, it is very helpful to be able to easily tell which data can be expected to not change, which data might change, and who may change that data. This is done with the aid of the language typing system. Data can be marked as const or immutable, with the default being changeable (or mutable).

1. immutable applies to data that cannot change. Immutable data values, once constructed, remain the same for the duration of the program's execution. Immutable data can be placed in ROM (Read Only Memory) or in memory pages marked by the hardware as read only. Since immutable data does not change, it enables many opportunities for program optimization, and has applications in functional style programming.

immutable char[] s = "foo";
s[0] = 'a';  // error, s refers to immutable data
s = "bar";   // error, s is immutable
 
immutable int x = 3;  // x is set to 3
x = 4;        // error, x is immutable
char[x] z;    // z is an array of 3 chars
 
immutable(char)[] s = "hello";
s[0] = 'b';  // error, s[] is immutable
s = null;    // ok, s itself is not immutable

2. const applies to data that cannot be changed through the const reference to that data. It may, however, be changed through another reference to the same data. const is typically used for passing data through interfaces that promise not to modify it.

import std.stdio : writeln;
 
int a;
 
void gun()
{
    a = 10;
}
 
void fun(const(int)* b)
{
    writeln(*b);     // prints 42
    gun();
    writeln(*b);     // prints 10
    *b = 3;          // error, cannot modify const expression *b
}
 
void main()
{
    a = 42;
    fun(&a);
}

Both immutable and const are transitive, which means that any data reachable through an immutable reference is also immutable, and likewise for const.
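Transitivity can be observed with a short sketch; the Node struct below is a hypothetical example type. Viewing a Node through a const pointer makes its fields const, and also everything reachable through its link field.

```d
// Hypothetical type with one mutable indirection.
struct Node
{
    int value;
    int* link;
}

void main()
{
    int target = 1;
    auto n = Node(5, &target);
    const(Node)* view = &n;   // const view of mutable data

    // const spreads to the fields and through the pointer field.
    static assert(is(typeof(view.value) == const(int)));
    static assert(is(typeof(view.link) == const(int*)));
    static assert(!__traits(compiles, *view.link = 2));

    // The mutable reference can still change the data the const view sees.
    *n.link = 2;
    assert(*view.link == 2);
}
```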

Implicit conversion rules

1. Mutable data is implicitly convertible to const, but not to immutable.

2. const data is implicitly convertible to neither mutable nor immutable.

3. immutable data is implicitly convertible to const, but not to mutable.
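The three rules can be checked at compile time with the is(From : To) expression, which tests for implicit convertibility. The sketch below uses slices of int as the example data; note that the rules concern data reached through a reference, since a plain value such as an int is simply copied.

```d
void main()
{
    int[] m = [1, 2, 3];
    immutable(int)[] i = [4, 5, 6];
    const(int)[] c;

    c = m;   // rule 1: mutable converts to const
    c = i;   // rule 3: immutable converts to const
    assert(c == [4, 5, 6]);

    static assert( is(int[] : const(int)[]));                // rule 1
    static assert(!is(int[] : immutable(int)[]));            // rule 1
    static assert(!is(const(int)[] : int[]));                // rule 2
    static assert(!is(const(int)[] : immutable(int)[]));     // rule 2
    static assert( is(immutable(int)[] : const(int)[]));     // rule 3
    static assert(!is(immutable(int)[] : int[]));            // rule 3

    // A value without indirections is copied, so the qualifier can be dropped.
    static assert(is(const(int) : int));
}
```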

shared

Unlike most other programming languages, data is not automatically shared in D; data is thread-local by default. Although module-level variables may give the impression of being accessible by all threads, each thread actually gets its own copy:

import std.stdio;
import std.concurrency;
import core.thread;
 
int variable;
void printInfo(string message)
{
    writefln("%s: %s (@%s)", message, variable, &variable);
}
 
void worker()
{
    variable = 42;
    printInfo("Before the worker is terminated");
}
 
void main()
{
    spawn(&worker);
    thread_joinAll();
    printInfo("After the worker is terminated");
}

Running the program shows that the variable modified inside worker() is not the same variable that is seen by main(): the two threads print different values and different addresses. Mutable variables that need to be shared must be defined with the shared keyword:

shared int variable;

On the other hand, since immutable variables cannot be modified, there is no problem with sharing them directly. For that reason, immutable implies shared.
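As a sketch of both points, the program below is an adaptation of the earlier example: variable is declared shared, and the worker reads a hypothetical immutable module-level constant named answer. The accesses to the shared variable go through core.atomic, which is one common way to read and write shared data.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : thread_joinAll;
import std.concurrency : spawn;

shared int variable;          // a single instance, visible to all threads
immutable int answer = 42;    // immutable data needs no shared qualifier

void worker()
{
    int value = answer;             // reading immutable data from another thread is fine
    atomicStore(variable, value);   // write to the shared variable
}

void main()
{
    spawn(&worker);
    thread_joinAll();
    // Unlike the thread-local version, main now sees the worker's write.
    assert(atomicLoad(variable) == 42);
}
```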

Exercises

The exercises for lab-03 are located in this repo.

1. Qualifiers

Navigate to the 1-qualifiers directory. Read and understand the source file qualifiers.d. Explain the working and the failing cases.

2. Qualified functions

Navigate to the 2-qualfuncs directory. Inspect the source file qualfuncs.d. Should this code compile or not? Why?

Add immutable at the end of the foo function signature.

    void foo() immutable

What happens? How can this code be fixed? Ask the lab rats for guidance.

3. Const

Navigate to the 3-const directory. Inspect the source file const.d. Compile and run the code. Explain the results.

4. Array sort

Implement a generic sorting algorithm for dynamic arrays.

The arrays can be immutable/const/mutable.
The array element type can be any type that is ordering comparable.
The qualifier (mutable/const/immutable) applies to both the array and the elements.
You must use template constraints and/or static ifs.
Recommendation: use 2 template overloads (one for mutable and one for const/immutable); the mutable version does in-place sorting, the const/immutable version creates a new array and returns it.

5. @safe

Navigate to the 5-safe directory. Inspect the source file. Compile and run the code.

What does the code do? Why is it useful to take the address of a parameter?
Add the @safe attribute to the main function. What happens?
Add @safe to the func and gun functions. Analyze the error messages.
How can we get rid of the first error message?
What about the second error message?

6. @trusted

Navigate to the 6-trusted directory. Inspect the source file. Compile and run the code.

Is this code safe? Why?
Apply @safe to the main function. What happens? Why?
Move the read line into a new function, safeRead, that will be marked as @trusted.

@trusted disables all the compiler safety checks inside the function it is applied to, while still allowing that function to be called from @safe code. Be very careful when using it. With great power comes great responsibility.

7. Template attribute inference

Navigate to the 7-template-inference directory. Inspect the source file. Compile and run the code.

Add the @safe attribute to the main function. How do you explain the result? Why isn't the compiler complaining about the second invocation of func?


dss/laboratoare/03.txt · Last modified: 2019/06/21 13:10 by razvan.nitu1305
