Classes¶

Emil Sekerinski, McMaster University, Fall 2019¶

On Objects and Classes¶

Suppose the task is to store and manipulate geometric shapes, like those used in drawing editors. A point may be represented by a pair of integers, the x and y-coordinate, and a rectangle by the x and y-coordinates together with the width and height, which is expressed by the types Point and Rectangle:

Point = integer × integer
Rectangle = integer × integer × integer × integer

The operation of moving a point is then expressed by two functions move, distinguished by their parameter types:

move: Point × integer × integer → Point
move((x, y), dx, dy) = (x + dx, y + dy)

move: Rectangle × integer × integer → Rectangle
move((x, y, w, h), dx, dy) = (x + dx, y + dy, w, h)

Classes group data together with operations on that data. A class declaration consists of number of fields declaring the data, x and y and below, and a number of methods defining the operations, set and move below:

algorithm
class Point
    var x, y: integer
    method set(x0, y0: integer)
        x, y := x0, y0
    method move(dx, dy: integer)
        x, y := x + dx, y + dy

Above, x and y refer to the corresponding fields of the "current" object.

As objects are created from a class, they are provided with an identity (name, reference). Declaring variable p of class Point allows it to hold the identity of a Point object; classes are used in variable declarations as if they are types:

var p: Point

Above declaration implies that, after initialization, p will always refer to a Point object. On object is created by "calling" the class:

p ← Point()

The fields of an object are accessed and the methods are called using the dot notation. For example, following sets the x and y-coordinate:

algorithm
p.set(3, 5)
{p.x = 3 ∧ p.y = 5}
p.move(-1, 1)
{p.x = 2 ∧ p.y = 6}

By convention self refers to the current object and self.a refers to field a of the current object. The fact that x and y in set and move refer to fields of the current object can be made explicit by prefixing them with self:

algorithm
class Point
    var x, y: integer
    method set(x0, y0: integer)
        self.x, self.y := x0, y0
    method move(dx, dy: integer)
        self.x, self.y := self.x + dx, self.y + dy

Mostly, we leave out the prefix self and think of x, y as being "variables of the current object".

Class Invariants¶

Consider a class for rectangles with coordinates, width, and height such that width and height are "always" positive. This is expressed by a class invariant, written as an annotation in the class:

algorithm
class Rectangle
    var x, y, w, h: integer
    {w > 0 ∧ h > 0}
    initialization(x0, y0, w0, h0: integer)
        {w0 > 0 ∧ h0 > 0}
        x, y, w, h := x0, y0, w0, h0
    method move(dx, dy: integer)
        x, y := x + dx, y + dy
    method mirror()
        w, h := h, w
    method area() → (a: integer)
        a := w × h
        {a > 0}

Above, the class invariant involves only the fields of the current object.

Question: Assuming a Rectangle object is created and its methods are called. How can you argue that area will always return a positive area?

The class invariant is like a contract among the methods:

each method can assume that the class invariant holds at the entry; in turn, it has to guarantee that the invariant holds at the exit: both move and mirror guarantee w > 0 ∧ h > 0 at exit as they do not change w, h; method mirror guarantees w > 0 ∧ h > 0 finally provided w > 0 ∧ h > 0 holds initially as it exchanges w, h.
the initialization has to guarantee the class invariant to hold at exit: w > 0 ∧ h > 0 holds at exit as w and h are assigned values for which this is assumed.

No description has been provided for this image

This requires that the fields cannot be changed from the outside, i.e. the fields are read-only: they can still be read, as in:

algorithm
procedure equal(r, s: Rectangle) → (eq: boolean)
    eq := (r.x, r.y, r.w, r.h) = (s.x, s.y, s.w, s.h)

Rule for correctness of classes: Suppose class C is given with class invariant I and methods m1, ..., mn; the initialization S0 and the method bodies S1,..., Sn are annotated statements:

algorithm
class C
    var ...
    {I}
    initialization(...)
        S0
    method mi(...) → (...)
        Si

Class C is correct if

the initialization establishes I,

algorithm
{true}  S0  {I}

every method mi preserves I:

algorithm
{I}  Si  {I}

If S0, S1, ..., Sn above are annotated statements, the correctness rule adds another annotation. In general:

algorithm
{P0}  {P1}  S  {Q0}  {Q1}

is the same as

algorithm
{P0 ∧ P1}  S  {Q0 ∧ Q1}

Let RI: w > 0 ∧ h > 0 be the object invariant of Rectangle. Class Rectangle is correct if

the initialization establishes the object invariant,

algorithm
{w0 > 0 ∧ h0 > 0}  x, y, w, h := x0, y0, w0, h0  {RI}

methods move, mirror, area preserve the class invariant:

algorithm
{RI}  x, y := x + dx, y + dy  {RI}

algorithm
{RI}  w, h := h, w  {RI}

algorithm
{RI}  a := w × h  {RI ∧ a > 0}

Exercise: Prove these four conditions!

Java has an assert statement that allows run-time checks to be included in programs:

assert B;

These can be turned on (with the -ea option when compiling) or off. Consecutive assertions can be merged:

assert B1; assert B2;

is the same as

assert B1 && B2;

Annotations and assertions are different:

Annotations express properties of programs and allow these to be statically verified.
Assertions are statements in programs; including them changes the behaviour of programs.

It is good programming practice to have all or some annotations checked through assertions.

In [ ]:

class Rectangle {
    private int x, y, w, h;
    void checkInvariant() {
        assert w > 0 && h > 0;
    }
    Rectangle(int x0, int y0, int w0, int h0) {
        assert w0 > 0 && h0 > 0;
        x = x0; y = y0; w = w0; h = h0;
        checkInvariant();
    }
    void move(int dx, int dy) {
        x += dx; y += dy;
        checkInvariant();
    }
    void mirror() {
        int t = w; w = h; h = t;
        checkInvariant();
    }
    int area() {
        int a = w * h;
        checkInvariant();
        assert a > 0;
        return a;
    }
}

In [ ]:

Rectangle r = new Rectangle(3, 4, 2, 1); r.move(1, 1); r.mirror(); r.area()

In [ ]:

r = new Rectangle(3, 4, -2, 1)

Question: Is it necessary to check the invariant at the beginning of each method?

In Rectangle, all fields are private: if the invariant holds after the initialization and after each method, it is guaranteed to hold before each method as well.

Question: What is the class invariant of IntStack? What are the required preconditions of the initialization and the methods? Add checks for all relevant annotations!

In [ ]:

class IntStack {
    int n = 0;
    int[] a;
    void checkInvariant() {
        assert a != null && 0 <= n && n <= a.length
    }
    IntStack(int capacity) {
        a = new int[capacity];
        checkInvariant();
    }
    void put(int x) {
        a[n] = x; n += 1;
        checkInvariant()
    }
    int get() {
        assert n > 0;
        n -= 1; checkInvarian t(); return a[n]; 
    } 
    int size() {
        return n;
    }
}

In [ ]:

IntStack s = new IntStack(1); s.put(3);

Question: To ensure that the invariant is not invalidated, is it necessary to check the parameter of the initialization? Is it necessary to add assertions that the stack does not overflow or underflow? Is it recommendable to insert if statements when the stack would overflow or underflow?

In a robust programming style, class invariants should always be maintained, even in the presence of exceptions.

In the initialization, if capacity is negative, new will raise an exception and a will not be assigned an array, i.e., it will remain null, to which it is implicitly initialized. Thus an object with an invalid invariant is created. A solution would be to initialize a to an array of length 0 before raising an assertion exception.
If the stack is full and put is called, the assignment to a[n] will raise an exception, but the invariant will still be maintained.
If the stack is empty and get is called, the assignment n -= 1 will invalidate the invariant before a[n] raises an exception. A solution would be to rewrite get as

int get() {
        int r = a[n]; n -= 1; return r;
    }

Adding an if statement, as in

void put(int x) {
        if (n < a.length) {a[n] = x; n += 1;}
    }

to avoid an exception is not recommended, as the caller will not be notified that the call was not successful.

The 'nil' Value¶

If a variable may not refer to an object or some other value, it can be declared as optional and take the additional value nil (sometimes called null):

var p: opt Point
p := nil

Optional field types allow recursive class declarations to be expressed, where one field of the class itself (there is no recursion among methods involved):

class Polygon
    var x, y: integer
    var next: opt Polygon
    initialization(x, y: integer)
        self.x, self.y, self.next := x, y, nil
    method add(x, y: integer)
        next ← Polygon(x, y)

For example, C# uses the shorthand int? x for a nullable integer type. Optional types can be found in Scala, F#. In Java, C/C++, and Python all object references always include a null value, allowing recursive declarations as above without optional types. However, in that case, the type-system cannot guarantee the absence of dereferencing null:

integer distance(Point p, Point q) {
    return (q.x - p.x) + (q.y - p.y)
}

Above method can be called with p or q being null, which leads to an exception. Java also has optional types, but they don't address this issue.

Definition of Classes¶

The declaration of class C can abstractly be explained by variables and procedures. The statement v :∈ s sets v nondeterministically to an element of s; the statement v :∉ s set v to an element not in s. This is used to explain creating of an object as selecting a reference that is not already in use:

algorithm
class C
    var a: A
    initialization(v0: V0)
        S0
    method mi(vi: Vi) → (wi: Wi)
        Si

algorithm
var C: set of Object = {}
var a: Object → A
procedure C(v0: V0) → (self: C)
    self :∉ C ; C := C ∪ {self} ; S0
procedure C.mi(self: C, vi: Vi) → (wi: Wi)
    Si

Let x: C be an object of class C. Accessing a field is just function application:

x.a  =  a(x)

A method call is a procedure call, where the type of the object determines the procedure and the object becomes the self parameter:

v ← x.m(E)  =  v ← C.m(x, E)

Above, C has multiple roles:

in x: C as a type of variables
in var C: set of Object as the set of created objects of class C
in C.m for prefixing its methods, in case these names are used in other classes as well.

This formal definition can be used to justify the rule for correctness of classes.

Ownership¶

Question: In the following design, a line is meant to be pair of "different" points. What is the problem?

In [ ]:

class Point {
    int x, y;
    Point(int x0, int y0) {
        x = x0; y = y0;
    }
    void move(int dx, int dy) {
        x += dx; y += dy;
    }
}

class Line {
    Point start, end;
    void checkInvariant() {
        assert start.x != end.x || start.y != end.y;
    }
    Line(Point s, Point e) {
        assert s.x != e.x || s.y != e.y;
        start = s; end = e;
        checkInvariant();
    }
    void move(int dx, int dy) {
        start.move(dx, dy); end.move(dx, dy);
        checkInvariant();
    }
}

Point p = new Point(1, 3); Point q = new Point(2, 4);
Line l = new Line(p, q); p.move(1, 1);

In [ ]:

l.checkInvariant()

Although the fields of line l are not modified from outside, the class invariant is broken as the fields refer to objects that can be modified from outside l.

Question: How can breaking the class invariant be avoided?

The invariant is weakened, e.g.:

void checkInvariant() {
        assert start != null && end != null;
    }

New points are created on initialization of a line:

Line(Point s, Point e) {
        assert s.x != e.x || s.y != e.y;
        start = Point(s.x, s.y); end = Point(e.x, e.y);
        checkInvariant();
    }

In the latter case, the newly created points are being owned by the line. In general, every object is owned by another object, resulting in a hierarchical ownership structure, where the main program is an object that is not owned. We think of every object having a hidden owner field. Ownership can change at run-time.

Rust and Swift support forms of ownership, although for different reasons. Ownership can be traced statically through types, such that no overhead occurs at run-time.

Multi-Object Class Invariant¶

The rule for the correctness of classes applies also if a class invariant mentions only the fields of the current object and the fields of owned objects, also recursively: the class invariant will hold at entry to each method call if it did hold at the exit of each call and the initialization.

While ownership imposes a hierarchical structure on all objects, this does not preclude arbitrary structures among objects, as in the next example.

Observer Pattern¶

The Observer Pattern is one of the 23 design patterns of Design Patterns: Elements of Reusable Object-Oriented Software by Gamma, Helm, Johnson, Vlissides, 1994. Code for the pattern can be found in the standard Java and .Net libraries; templates for Python exist as well. Examples are:

a user interface of a drawing or text editor, in which one model has several views: as soon as the model changes, all views have to be notified,
an auction system in which one item is being observed by several bidders: as soon as there is a new bid on the item, all bidders have to be notified.

In the Java example below, a subject has a state, here just an integer. One observer is interested in the whole state and another observed only if that integer is even.

In [ ]:

import java.util.*;
interface Observer {
    void update();
}
class Subject {
    int state = 0;
    Set<Observer> obs = new HashSet<Observer>();
    void attach(Observer o) {
        obs.add(o);
    }
    void notifyObservers() {
        for (Observer o: obs) o.update();
    }
    void modify() {
        state += 1;
        notifyObservers();
    }
}
class ObserveAll implements Observer{
    int state;
    Subject sub;
    ObserveAll(Subject s) {
        sub = s; state = s.state;
    }
    public void update() {
        state = sub.state;
        System.out.println("state is " + state);
    }
}
class ObserveEven implements Observer {
    boolean even;
    Subject sub;
    ObserveEven(Subject s) {
        sub = s; even = s.state % 2 == 0;
    }
    public void update() {
        even = sub.state % 2 == 0;
        if (even) System.out.println("state is even");
    }
}

In [ ]:

Subject sub = new Subject();
Observer oa = new ObserveAll(sub);
Observer oe = new ObserveEven(sub);
sub.attach(oa); sub.attach(oe);
sub.modify(); sub.modify(); sub.modify(); sub.modify();

As a set of observers is used, there is no guarantee in which order the observers are being notified.

Intuitively, establishing

oa.state == sub.state is the goal of oa,
oe.even == (sub.state % 2 == 0) is the goal of oe.

However,

even == (sub.state % 2 == 0)is not an invariant of oa, as at entry to update, it does not hold
something equivalent cannot be the invariant of sub, as sub does "not know" the fields of all of its observers.

To express the goal as an invariant, a third class needs to be introduced:

In [ ]:

class Controller {
    Subject sub = new Subject();
    ObserveAll oa = new ObserveAll(sub);
    ObserveEven oe = new ObserveEven(sub);
    void checkInvariant() {
        assert this.oa.state == this.sub.state;
        assert this.oe.even == (this.sub.state % 2 == 0);
    }
    Controller() {
        sub.attach(oa); sub.attach(oe); checkInvariant();
    }
    void modify() {sub.modify(); checkInvariant();}
}
Controller c = new Controller();
c.modify(); c.modify(); c.modify(); c.modify();

Facade Pattern¶

Above program is also an instance of the Facade Pattern: users ("clients") don't call classes directly, but through a Facade class. The class invariant of Facade may involve the "hidden" class, implying that the objects of those classes are owned by the Facade. The role of the Facade is to maintain that invariant.