The Crack Programming Language Guide

This is the Manual for version 0.2 of the Crack Programming Language. Version 0.2 is a fairly complete programming language - lots of bugs fixed, lots of holes plugged since 0.1. We believe the language to be usable for real work at this time, but there are still a number of important features that remain to be added (including generics, exceptions, annotations and partial closures). While the compiler is generating debug information at this time, only a limited subset of it is currently usable (you can get a stack trace in gdb).

But that said, if you want to get in on the ground floor of a new scripting language that is C-like, fast, and interfaces well with C and C++ code, Crack is the language and this version is still pretty close to the ground floor.

So without further caveats - let's do some Crack!

Overview

If you're a seasoned programmer, here is a quick profile to help orient you to Crack:

Major Influences

C, C++, Java, Python

Syntax

C-style, curly-brace

Typing

Static, strong (with some implicit conversion)

Compiler

JIT native compiled (at runtime)

Paradigms

Object oriented, procedural

Garbage Collection

Reference counted objects, programmer controlled

OO Features

Virtual functions, overloaded functions, multiple inheritance

Crack has been developed on Linux x86 and x86-64. Portability will play a bigger role in future versions of the language.

Installation

See the INSTALL file for the latest installation instructions.

Hello World

Here's the Crack "hello world" program:

    #!/usr/local/bin/crack
    import crack.io cout;
    cout `hello world!\n`;

If you write this as a script and "chmod u+x" it, when you run it you should see "hello world!" written to the terminal.

The first line is the standard unix "#!" line. It tells the kernel to execute the script by passing its full name as an argument to the "/usr/local/bin/crack" program.

The second line imports the "cout" variable from the crack.io module. Like C++, Crack uses "cin", "cout" and "cerr" for its standard input, output and error streams. "cout" and "cerr" are both "formatters," which means that they support the use of the back-tick operator for formatting.

The third line actually uses the back-tick operator to print some text. We won't go into too much detail on this operator right now - suffice it to say that this line is roughly equivalent to "cout.format('hello world!\n');" The "\n" at the end of the string translates into a newline (the ASCII "LF" character, character code 10).

Expressions consisting of a value followed by back-tick quoted text and code are called "interpolation expressions."

Comments

Crack permits the use of C, C++ and shell style comments:

    /* C Style comment */
    // C++ style comment
    # shell style comment

For code that you hope to get a lot of re-use out of, we recommend the convention of Doxygen-style doc-comments for classes, functions and global variables:

    /** C-style doc-comment */
    /// C++ style doc-comment
    ## shell-ish doc-comment

These currently get treated the same as any other comments. However, future versions of Crack will parse them and store them with the meta-data for the code, permitting the easy extraction of reference documentation from the source.

Variables and Types

Like most languages, Crack allows you to define variables. But unlike most other scripting languages, Crack is statically typed so you must specify (or imply) a type for all variables.

    # define the variable i and initialize it to 100.
    int i = 100;

You can also define variables using the more terse ":=" operator, which derives the type from that of the value:

    i := 100;           # equivalent to "int i = 100;"
    j := uint32(100);   # equivalent to "uint32 j = 100;"

If you don't specify an initializer for a variable, the default initializer will be used. For the numeric types, this is zero. For bool, it's false. For complex types (which we'll discuss later), the default constructor is used to create a new instance.
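
As a small sketch of the defaults described above (the variable names are only illustrative), the following declares two variables without initializers and then reads them:

    import crack.io cout;

    int count;     # no initializer, so count starts at zero
    bool done;     # no initializer, so done starts as false

    if (done)
        cout `done\n`;
    else
        cout `not done, count = $count\n`;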

Built-in Types

The Crack language defines the following set of built-in types - these can be expected to exist in every namespace without requiring an explicit import:

void

The "void" type - this only exists so you can have a function that doesn't return anything. Bad things will happen if you try to define void variables.

byte

An 8-bit unsigned integer (like C's unsigned char)

bool

A boolean. Values are true and false, which are built-in variables.

int32

A 32-bit signed integer.

uint32

A 32-bit unsigned integer.

int64

A 64-bit signed integer.

uint64

A 64-bit unsigned integer.

float32

A 32-bit floating point.

float64

A 64-bit floating point.

int

An integer of the C compiler's default int-size for the platform (this is an alias to either int32 or int64).

uint

An unsigned integer of the C compiler's default unsigned int-size for the platform (this is an alias to either uint32 or uint64).

float

A floating point of the C compiler's float size for the platform (this is an alias to either float32 or float64).

byteptr

A pointer to an array of bytes (roughly like C's char*)

voidptr

A pointer to anything (like C's void*). All high level classes can implicitly convert to voidptr.

array[class]

The low-level array type. You should generally avoid using this in favor of high-level data structures (see crack.container). Arrays are not memory-managed, and they don't do memory management of their elements.

This is Crack's only existing generic datatype - to use it, you specialize it with another class type, for example: array[int]

VTableBase

The base class used for all classes that can have virtual functions (more on this later).

Object

The implicit base class of all classes that don't define base classes (extends VTableBase)

String

An immutable, memory managed string of bytes.

StaticString

This is a String whose buffer can point to read-only memory.

Class

The type of class objects themselves. Crack classes exist at runtime as well as compile time. See Classes are Variables.

Of these, the byte, bool, int, uint and float types (including all variations of int, uint and float) are primitives. These types are notable in that they are copy-by-value and consume no memory external to the scope in which they are defined.

The byteptr, voidptr and array types are classified as primitive pointer types.

Primitive types, primitive pointer types, and the void type are all classified as low-level types. They are distinguished from the higher level aggregate types by naming convention: low-level types will always be all lower case (and digits), high-level types (at least the ones in the standard libraries) will always begin with an upper-case character. You may not currently subclass low-level types. This restriction will be lifted in a future version of Crack.

High level or aggregate types are first class objects: variables of these types are pointers to allocated regions of memory large enough to accommodate the state data defined for the type. They can be extended to create other high-level types through sub-classing (more on this later).

Type names in Crack are very simple. They are either a single word or "array[ other-type-name ]". The latter form, though currently used only for arrays, will eventually be expanded as the instantiation mechanism for generic types, similar to generics in Java.

Integer Constants

Integer constants can be defined as an integer value in the code:

    int i = 100;
    int j = -1;

Integer constants are "adaptive", which means that they will convert to whatever type is required by their usage as long as the value is within the range of values for that type (for example, "uint u = -1" would be illegal).
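
To illustrate adaptiveness, here is a minimal sketch in which the same constant is used with several integer types. The variable names are made up for the example, and the commented-out line is the illegal case mentioned above:

    byte b = 200;       # 200 fits in a byte, so the constant adapts to byte
    uint32 u = 200;     # the same constant adapts to uint32 here
    int64 big = 200;    # and to int64 here
    # uint bad = -1;    # illegal: -1 is outside the range of uint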

Integer constants can be defined using hexadecimal, octal and binary notation as well as the default decimal notation. Examples follow:

    int x = 0x1AF;  # hex constant
    int o1 = 0117, o2 = 0o117; # both the same octal constant, c-style and
                               # normalized notation.
    int b = 0b10110; # binary constant

Implicit Conversion

In certain cases, types will automatically convert to other types. Most types will implicitly convert to boolean, allowing pretty much anything to be used as the condition in an if or while statement.

Aggregate types will implicitly convert to voidptr.

Numeric types will implicitly convert between one another as long as there is no risk of precision loss. In cases where there is a risk of precision loss, you can use explicit construction to force a conversion - truncating the value if necessary.

    # implicit conversions
    byte b;
    int32 i32 = b;
    uint32 u32 = b;
    int64 i64 = i32;
    i64 = u32;
    uint64 u64 = u32;
    float32 f32 = b;
    float64 f64 = i32;
    f64 = u32;
    
    # explicit conversions
    i32 = int32(i64);
    b = byte(f32);
    i64 = int64(u64);

Strings

Most programming languages support strings of characters, which are usually implemented as some kind of array. Crack strings are strings of bytes: you can embed any kind of byte values you want in them, and there are no assumptions about encoding.

String constants are sequences of bytes enclosed in single or double quotes (which are equivalent forms):

    String s = "first string";
    t := 'second string';

String constants are actually instances of the "StaticString" class - they're just like strings except that since their buffers are constants, they don't try to deallocate them on destruction.

As in the other C-like languages, string constants (both single and double quoted) can have escape sequences in them. We've dealt with one of these already ("\n"). The full list is:

\t

ASCII Tab character (9).

\n

ASCII newline character (10).

\a

ASCII alarm character (7).

\r

ASCII carriage return (13).

\b

ASCII backspace (8).

\x XX

Two digit hex character value (examples: "\x1f", "\x07")

\ OOO

1 to 3 character octal character value. (examples: "\0", "\141")

\ literal-newline

If you put a backslash in front of the end of the line in a string, the newline is ignored. This allows you to wrap large strings across multiple lines.
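
The following sketch combines several of the escape sequences listed above, including the backslash-newline form. The variable names are illustrative only:

    import crack.io cout;

    s := "A\tB\x21\n";    # A, tab, B, the byte 0x21 ("!"), newline
    t := 'one long \
    string\n';            # the backslash hides the line break from the string
    cout `$s`;
    cout `$t`;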

Control Structures

Crack 0.2 only supports two control structures: the "if/else" statement and the "while" statement. "if" runs code blocks depending on whether a condition is true or false:

    import crack.io cout;
    if (true)
        cout `true is true\n`; 
    else
        cout `something is wrong\n`;

The code above will always print out "true is true".

If we wanted to do something a little more useful, we could have used it to check the command line argument:

    import crack.sys argv;
    import crack.io cout;
    
    if (argv.count() > 1 && argv[1] == 'true')
        cout `arg is true\n`;
    else
        cout `arg is false\n`;

There's a lot of new stuff going on here: first of all, we're importing the "argv" variable from crack.sys. This variable contains the program's command line arguments.

count() is a method (a function attached to a value called "the receiver") that returns the number of items in argv. argv[1] accesses item 1 of the argument list (indexes are zero-based, so item 1 is the second element of the sequence).

The "&&" is a short-circuit logical and: it returns true if both of the expressions are true, but it won't evaluate the second expression unless the first is true. This is important in this case, because if we were to check argv[1] in a case where argv had fewer than two elements, a fatal error would result.

There is also a "||" operator which is a short-circuit logical or. It returns true if either expression is true but does not evaluate the second expression if the first is true.
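
As a sketch of the same guard written with "||" (the messages are made up for the example), argv[1] is only examined when the first test is false, that is, when argv has at least two elements:

    import crack.sys argv;
    import crack.io cout;

    if (argv.count() < 2 || argv[1] == 'help')
        cout `usage: pass an argument other than help\n`;
    else
        cout `got argument $(argv[1])\n`;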

The if statement need not be accompanied by an else:

    if (argv.count() > 1 && argv[1] == 'true')
        cout `arg is true\n`;
    cout `this gets written no matter what the args are\n`;

The code in an if or an else can either be a single statement, or a sequence of statements enclosed in curly braces:

    if (argv.count() > 1 && argv[1] == 'true') {
        cout `arg is true\n`;
        cout `and so are you!\n`;
    }

You can also chain if/else blocks:

    argCount := argv.count();
    if (argCount > 2)
        cout `more than one arg\n`;
    else if (argCount > 1)
        cout `just one arg\n`;
    else
        cout `no args.\n`;

Note that blocks of code in curly braces can include the definitions of new variables that are only visible from within that block. Each block is a namespace that inherits definitions from the outer namespace. The top-level code in the file is the module namespace.
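
A minimal sketch of block scoping, using illustrative variable names:

    import crack.io cout;

    int x = 1;
    if (x == 1) {
        int y = x + 1;    # x is inherited from the enclosing namespace
        cout `y is $y\n`;
    }
    # y is not visible here, only x is.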

The while statement

The while statement repeatedly executes the same code block while the condition is true. For example, we could iterate over the list of arguments with the following code:

    import crack.sys argv;
    import crack.io cout;
    
    uint i;
    while (i < argv.count()) {
        cout `argv $i: $(argv[i])\n`;
        i = i + 1;
    }

Note that the code in the while is enclosed in curly braces. In general, the code managed by a control structure can either be a single statement, or a group of statements enclosed in curly braces. The if statement works the same way.

This example also introduces the primary feature of the back-tick operator: variable interpolation. A dollar sign followed by a variable name formats the variable. A dollar sign followed by a parenthesized expression formats the value of the expression.
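
Both interpolation forms can be mixed in one expression. A short sketch, with variable names chosen for the example:

    import crack.io cout;

    int a = 3, b = 4;
    cout `$a times $b is $(a * b)\n`;   # writes "3 times 4 is 12"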

Functions

Functions let you encapsulate common functionality. They are defined with a type name, an argument list, and a block of code, just like in C:

    int factorial(int val) {
        if (val <= 1)
            return 1;
        else
            return val * factorial(val - 1);
    }

Also note that Crack supports recursion: you can call a function from within the definition of that function.

You can define a function that doesn't return a value by using the special "void" type:

    void printInt(int i) {
        cout `$i\n`;
    }

Primitive types are always passed "by value": the system makes a copy of them for the function. Arguments of high-level types are passed as references, so the function can modify the objects that they reference.
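
The difference can be sketched as follows. The Counter class and both functions are hypothetical names introduced for this example (classes are covered in detail later), and the sketch assumes that a function may assign to its own copy of an argument:

    import crack.io cout;

    class Counter {
        int value;
    }

    # n is a copy, so the caller's variable is not affected
    void bumpInt(int n) {
        n = n + 1;
    }

    # c refers to the caller's object, so the change is visible to the caller
    void bumpCounter(Counter c) {
        c.value = c.value + 1;
    }

    int n = 1;
    bumpInt(n);
    c := Counter();
    bumpCounter(c);
    cout `n is $n, counter is $(c.value)\n`;

Under these assumptions, n should still be 1 after the calls, while the counter's value should have changed from 0 to 1.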

Multiple functions can share the same name as long as their arguments differ: this feature is called overloading. For example, rather than "printInt" above, we could have defined a print function for multiple types:

    void print(int64 i) {
        cout `int $i\n`;
    }
    
    void print(uint64 u) {
        cout `uint $u\n`;
    }
    
    void print(String s) {
        cout `String $s\n`;
    }

The compiler chooses a function using a two-pass process: the first pass attempts to find a match based on the argument types without any conversions. The second pass attempts to find a match applying conversions whenever possible.

The general order of resolution in both passes is:

So for example, if we called print() with a uint64 parameter, the resolver would check the first print, then check the second print, find a match and use print(uint64 u). If we called it with an int32 parameter, the resolver would try all three functions, and not find a match. It would then repeat the search with conversion enabled and immediately match the first function, because int32 can implicitly convert to int64.
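
The calls described above can be sketched like this, repeating the three print functions so that the example is complete. The variable names are illustrative:

    import crack.io cout;

    void print(int64 i) { cout `int $i\n`; }
    void print(uint64 u) { cout `uint $u\n`; }
    void print(String s) { cout `String $s\n`; }

    uint64 u = 5;
    int32 i = 5;
    print(u);        # first pass: exact match with print(uint64 u)
    print(i);        # second pass: int32 converts to int64
    print('five');   # a string constant matches print(String s)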

We mentioned searching across namespaces: functions can be defined in most block contexts, including within other functions:

    void outer() {
        void inner(int i) {
            cout `in inner\n`;
        }
        
        inner(100);
    }
    
    # we can't call "inner()" from here...

If there were another function, "inner(uint u)" defined in the same scope as outer(), the resolver would consider inner(int i) prior to inner(uint u). It would be an error to define an inner(int i) outside outer, because this would hide the definition in the parent scope.

Note that it is currently an error to use variables defined in the outer function from within the inner function:

    void outer() { int a; int inner() { return a; } } # DOESN'T WORK

Attempting to do this will result in a compile-time error.

[it should be noted that a future version of Crack will support this partially, without assignment, and will also support a limited form of closure]

Operators

Crack supports the complete set of C operators. As in C++, they can be used with non-numeric types through Operator Overloading.

Comparison Operators

Comparison operators compare two values and return a boolean.

==

True if the two values are equal. 1 == 1 is true.

!=

True if the two values are not equal. 1 != 2

>

True if the left value is greater than the right value. 2 > 1

<

True if the left value is less than the right value. 1 < 2

>=

True if the left value is greater than or equal to the right value. 2 >= 2

<=

True if the left value is less than or equal to the right value. 2 <= 2

is

True if the object on the left is identical (not merely equal) to the object on the right. This isn't defined for numbers, only for aggregates and primitive pointer types. It essentially checks for the equivalence of the pointers.

Basic Arithmetic

All integer and floating point types support the basic arithmetic operators.

+

Add two values. 2 + 2 == 4

-

Subtract two values. 4 - 2 == 2

/

Divide one value by another. 6 / 3 == 2

*

Multiply one value by another. 2 * 3 == 6

Unary plus and minus are also supported, so we can say:

    x := 2;
    y := 10 + -x;  # y is 8
    z := 10 - +x;  # z is also 8

Unary plus and minus are special when applied to constants. In this case, they create a new constant, preserving their adaptiveness.

Bitwise Operators

All of the integer types support the following bitwise operations:

&

Bitwise and. 5 & 4 == 4

|

Bitwise or. 5 | 4 == 5

<<

Shift all bits left by the amount specified on the right. 1 << 2 == 4

>>

Shift all bits right by the amount specified on the right. 4 >> 2 == 1. If this is done on a signed value, this is an arithmetic shift. Arithmetic shifts preserve the sign of the value shifted.

^

Exclusive or. 5 ^ 6 == 3

Augmented Assignment

All of the binary operators but the comparison operators can be used with assignment to do C-like "augmented assignment." In general, the augmented expression x op= y is equivalent to x = x op y.

So specifically, the augmented assignment operators are: +=, -=, /=, *=, &=, |=, >>=, <<=, ^=.
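
A short sketch of a few of these operators in sequence, with each step's equivalent long form in the comment:

    int x = 5;
    x += 3;     # same as x = x + 3;   x is now 8
    x <<= 1;    # same as x = x << 1;  x is now 16
    x ^= 6;     # same as x = x ^ 6;   x is now 22
    x /= 2;     # same as x = x / 2;   x is now 11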

The Ternary Operator

Like the C family of languages, Crack supports the "ternary operator." The ternary operator evaluates to one expression or another based on the results of a boolean expression:

    a = b ? c : d;

If b is true, a will be set to c and d will not be evaluated. If false, a will be set to d and c will not be evaluated.

c and d may be of different types. If they are, the result will be of the same type as c if d is of a type derived from c's type (see Inheritance below) and the same type as d if c is of a type derived from d.

If the types of c and d do not share a common ancestor, the compiler will first attempt to convert d to c's type. That failing, it will attempt to convert c to d's type. If none of this works, it will finally give an error.
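
In the simplest case both branches have the same type. A minimal sketch, with variable names chosen for the example:

    import crack.io cout;

    int a = 3, b = 7;
    int max = a > b ? a : b;    # a > b is false, so max is set to b
    cout `max is $max\n`;       # writes "max is 7"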

Precedence

Operator precedence is the same as in C. So the rules are (highest precedence first):

Classes

Classes are a feature of object oriented programming languages that combine a set of data variables with a set of special functions called "methods." As a simple example of a class, consider the representation of an x, y graphics coordinate:

    import crack.lang Writer;
    import crack.io cout, Formatter;

    class Coord {
        int x, y;
        
        oper init(int x0, int y0) : x = x0, y = y0 {}
        oper init() {}

        void writeTo(Writer out) {
            Formatter(out) `Coord($x, $y)`;
        }
    }

This class has two "instance variables:" x and y. These get bundled together in a package whenever we create an instance of the class.

The "oper init" syntax creates a constructor, which is a special function that gets called when an instance of the class is created. The constructor performs basic initialization of all of the instance variables. The second "oper init", the one without arguments, is called the "default constructor." As in C++, default constructors get generated automatically if the class has no other defined constructors. If the class does define constructors, and you want a default constructor, you have to specify one explicitly as we've done above.

We can create an instance of Coord like so:

    c := Coord(3, 4);

Alternately, we can use a more C-like syntax:

    Coord c = {3, 4};

Both of these are just different syntactic flavors of the same thing: in both cases we're defining a variable "c" that is a reference to a Coord object. The system initializes this variable by:

Note that all variables of class types are references - they behave very much like pointers in C. So if we were to initialize one variable from another, both variables would refer to the same object:

    c := Coord(3, 4);
    d := c;
    c.y = 5; # d.y is now also 5

This is different from the way that the primitive types behave. Primitive types are always passed "by value." So:

    c := 100;
    d := c;
    c = c + 1; # c is now 101, d is still 100

We can tell if two variables are references to the same object using the special is operator:

    c := Coord(1, 2);
    d := c;
    e := Coord(1, 2);
    if (c is d)
        cout `this will always be printed\n`;
    if (c is e)
        cout `this will never be printed\n`;

Note that identity (the property tested by the is operator) in Crack is a different concept from equality (as tested by the == operator). Two objects have the same identity if their underlying references are equal. However, references to two different objects may still be equal if they have the same state (as determined by the cmp() method). In the example above, it might be reasonable to expect that c and e are equal, since they both have values (x = 1, y = 2), although in fact they would not be unless Coord implemented a cmp() method which provided this logic. The cmp() method provided by Object is simply an identity check.

There is a special constant, "null" which allows you to clear these kinds of variables so that they don't reference any object.

    # initialize c to null, then set it conditionally
    Coord c = null;
    if (positive)
        c = Coord(1, 1);
    else
        c = Coord(-1, -1);

You can use the is operator on null values:

    void drawImage(Coord pos, Image img, Coord size) {
        if (size is null)
            copyImage(pos, img);
        else
            stretchImage(pos, img, size);
    }

For classes derived from Object, null values are always treated as false:

    Coord c = null;
    if (!c)
        cout `this will always be printed\n`;

Our Coord class also has a writeTo() method, which controls how an object is written using the back-tick operator. For example:

    cout `$(Coord(10, 20))\n`; # prints "Coord(10, 20)" to standard output.

writeTo() uses the instance variables x and y. One characteristic of methods is that instance variables and other methods can be used without qualification (you don't need a "self" or "this" variable, although this is possible, see below). As another example, we could define a method to give us the square of the distance from the origin as follows:

    int distOrgSquared() {
        return x * x + y * y;
    }

We could then add this information to our writeTo() method:

    void writeTo(Writer out) {
        Formatter(out) `Coord($x, $y) [dist squared = $(distOrgSquared())]`;
    }

Methods also have a special variable called "this". Just as in C++, this refers to the object that the method has been called on. In traditional Object-Oriented parlance, this object is called "the receiver."

We could have rewritten distOrgSquared() as follows:

    int distOrgSquared() {
        return this.x * this.x + this.y * this.y;
    }

The this variable is mainly useful for passing the receiver to other functions.

Classes are Variables

In addition to being compile-time entities, Classes are also variables that can be accessed at runtime. They are of type Class. So, for example, we can do this:

    class Foo {}
    Class foo2 = Foo;
    if (foo2.isSubclass(Object))
        cout `Foo is an Object\n`;

Constructors

We mentioned the "oper init" functions earlier. These are called constructors. In Java and C++, constructors are defined using a function that looks like the class name. In the interests of providing uniform syntax for all special methods, Crack uses the "oper" keyword to introduce overloaded operators and special methods, including the constructors and destructors.

Constructor definitions have some special syntax. The return type can be omitted, and you can provide an initializer list for member variables and base classes.

In the example above, we defined two constructors:

    oper init(int x0, int y0) : x = x0, y = y0 {}
    oper init() {}

In the first case, the initializer list initializes the x and y member variables from the arguments x0 and y0. Note that the initializers are specified using assignment syntax: "x = x0" instead of the construction syntax that C++ would have used: "x(x0)".

The construction syntax can be used, too, but it has a different meaning. Construction syntax means "construct the variable with the given arguments." Assignment syntax means "initialize the variable from the given value."

So, for example, "x(x0)" would be equivalent to "x = int(x0)", which is perfectly legal. The uses for these two types of syntax become more obvious when we deal with members that are themselves class instances.

For example, let's say that we want to define a line segment:

    class LineSegment {
    
        # two coordinates
        Coord c0, c1;
        
        ## Construct from two coordinates.
        oper init(Coord initC0, Coord initC1) : 
            c0 = initC0,  
            c1 = initC1 {
        }
        
        ## Construct from raw x and y values
        oper init(int x0, int y0, int x1, int y1) :
            c0(x0, y0),
            c1(x1, y1) {
        }
    }

In the first constructor, we're using the assignment syntax because we want to bind the objects passed in (initC0 and initC1) to the c0 and c1 variables. If we had instead used construction syntax:

    oper init(Coord initC0, Coord initC1) : 
        c0(initC0),  
        c1(initC1) {
    }

the compiler would have tried to find a Coord constructor that accepts another Coord object as an argument. Since there is no such constructor, we would have gotten an error. We could have instead done this:

    oper init(Coord initC0, Coord initC1) : 
        c0(initC0.x, initC0.y),  
        c1(initC1.x, initC1.y) {
    }

This would have called the two-argument constructor twice and created two new Coord objects for c0 and c1. There's an important difference between this and the assignment syntax we started with: with the assignment syntax, c0 and c1 become references to the objects that were passed into them. If we did this:

    Coord c0 = {10, 10}, c1 = {20, 20};
    ls := LineSegment(c0, c1);
    c0.x = 20;  # ls.c0.x is now also 20.

changing c0.x in this case also changes the value within ls because ls's c0 is the same object as the caller's c0. If we had instead used the construction syntax, ls would have had its own copies of the Coord objects, and changing c0's x value wouldn't have had any effect on ls.

If you don't specify an initializer for one of your instance variables, the constructor will initialize the variable based on whatever initializers you gave it in the instance variable definition. So, for example, if we wanted coordinates to default to "-1, -1" for some reason, we could have done this:

    class Coord { int x = -1, y = -1; }

As with ordinary variables, the default constructor is used if no initializers are specified.

Initializers are not necessarily run in the order that you specify them: they are run in the order of member definition. So in our examples above, if we had specified an initializer list of ": y = y0, x = x0", x still would have been initialized first.

You can define as many constructors as you want as long as their arguments have different types. This is another example of overloading: the compiler can tell the difference between them from their argument types.

The default constructor is the constructor without any arguments. If you don't define any constructors in your class, the compiler will attempt to generate a default constructor for you - it will generate a constructor that initializes the members with their variable initializers, using their default constructors if there were no initializers.
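
As a sketch of the generated default constructor, the hypothetical Size class below defines no constructors at all. Going by the rules above, width and height should take their variable initializers and depth should default to zero:

    import crack.io cout;

    class Size {
        int width = 640, height = 480;
        int depth;     # no initializer
    }

    s := Size();       # uses the generated default constructor
    cout `$(s.width) x $(s.height) x $(s.depth)\n`;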

In future versions of Crack, if a class defines no constructors, it will attempt to inherit all of the constructors of the base classes (see Inheritance).

Inheritance

One important property of object-oriented programming languages is inheritance: the ability to create a new class by extending an existing class. Crack supports inheritance with a syntax similar to that of C++. Let's say that we wanted a coordinate like in our last example, only we also wanted it to have a name. We could create a new class for this:

    class NamedCoord {
        int x, y;
        String name;
    }

but then we'd have to write everything that we wanted to reuse over again in the new class. And every time we fixed a bug in Coord, we'd have to fix the same bug in NamedCoord. Inheritance provides a better way to reuse code:

    class NamedCoord : Coord {
        
        String name;
        
        oper init(int x, int y, String name0) : Coord(x, y), name = name0 {}
        
        void writeTo(Writer out) {
            Formatter(out) `NamedCoord($x, $y)`;
        }
    }

In the example above, we're creating a new class called NamedCoord that is derived from Coord. It will inherit all of Coord's instance variables and methods. We call Coord NamedCoord's base class. NamedCoord is a subclass or derived class of Coord.

In addition to allowing reuse of code, inheritance also has the advantage that instances of the derived class can be used in situations that call for an instance of the base class. So if we had a function that accepted a Coord, we could pass it a NamedCoord:

    void drawLine(Coord c0, Coord c1) { ... }
    
    NamedCoord c1 = {1, 2, 'c1'}, c2 = {3, 4, 'c2'};
    drawLine(c1, c2);

Note that this is not conversion: instances of NamedCoord are already instances of Coord. As such, function calls passing classes derived from argument types will match in the first resolution pass.

One of the first things we have to deal with in creating NamedCoord is Coord's constructor. Note that in the new initializer list, we have an entry for the base class as well as for the name variable. If we hadn't specified a base class initializer, the compiler would have used the base class's default constructor if there was one.

Like member initializers, base classes are initialized in the order in which they are defined. All base class initializers are run before any of the instance variable initializers for the class. Consider the following example:


    import crack.io cout;
    
    class A {
        oper init(String name) { cout `initializing $name\n`; }
        oper init() {}
    }
    
    class B : A {
        A a1, a2;
        
        # the order of initializers is ignored.
        oper init() : a2('a2'), a1('a1'), A('base class') {}
    }
    
    # create a temporary instance of B, which prints the initialization messages
    B();

This will print the following:

    initializing base class
    initializing a1
    initializing a2

Going back to our NamedCoord example, we also defined another writeTo() method:

    void writeTo(Writer out) {
        Formatter(out) `NamedCoord($x, $y)`;
    }

We did this because Coord's writeTo() method writes out "Coord($x, $y)". We want to write "NamedCoord($x, $y)".

Sometimes you want to call the base class version of a function that is overridden in the derived class. Most often this is used to extend the base class functionality. Crack lets you do this by qualifying the method with the class name. For example, we could have instead overridden writeTo() like this:

    void writeTo(Writer out) {
        out.write('Named');
        Coord.writeTo(out);
    }

Multiple Inheritance

Crack supports multiple inheritance: you can have any number of base classes. With the exception of the special VTableBase class, it is an error to inherit from the same base class multiple times, even indirectly. Eventually, Crack will support this using virtual base classes like in C++. We'll talk more about multiple inheritance in the next section.

Destructors

In addition to "oper init" constructors, Crack classes can have destructors. These are called by Object.oper release() when an object's reference count drops to zero. They can also be called explicitly by objects implementing their own memory management strategies.

You can implement the destructor for a class by defining an "oper del" method:

    class Noisy {
        oper del() { cout `Noisy object deleted\n`; }
    }
    
    Noisy x;  # Prints a message when x goes out of scope.

After calling the user defined code, oper del automatically calls oper release on all of the instance variables that have an oper release method (see Reference Counting). It then automatically calls the oper del method of each of its base classes. In both cases, these calls are in reverse order of initialization: first the instance variables in the reverse order that they are defined, then the base classes in the reverse order that they are listed.
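
As a sketch of this ordering (Part, Base and Whole are made-up names for illustration), the following program defines a derived class with two instance variables. Going by the rules above, the user code in Whole's oper del should run first, then the instance variables should be released in reverse order (p2, then p1), and finally Base's oper del should run:

    import crack.io cout;

    # hypothetical classes, used only to show the order of destruction.
    class Part {
        int id;
        oper init(int id0) { id = id0; }
        oper del() { cout `deleting part $id\n`; }
    }

    class Base {
        oper del() { cout `deleting Base\n`; }
    }

    class Whole : Base {
        Part p1, p2;
        oper init() : p1(1), p2(2) {}
        oper del() { cout `deleting Whole\n`; }
    }

    # create a temporary instance of Whole, which is destroyed immediately.
    Whole();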

Because of all of this automatic destruction, most oper del methods don't need to have any user code at all - everything takes care of its own cleanup. If you don't define an oper del method, the compiler will generate one by default.

The only cases where you really need to define an oper del method are those where the object holds some external resource that must be cleaned up: for example, a File object might want to make sure that its file descriptor is closed upon destruction.

It should be noted that an object must do nothing to change its own reference count during processing of oper del, such as assigning itself to an external variable or inserting itself into an external collection. If you do this, the object will still be deleted and the external reference will be invalid. Future versions of Crack will have some degree of protection against this, but for now - don't do it.

The Special Base Classes

There are three "special" base classes in Crack: Object, VTableBase and FreeBase.

The first two are available from any Crack code; FreeBase must be explicitly imported from crack.lang.

Object is the default base class for all other classes. If you don't specify any base classes, your class will implicitly be derived from Object. (That's not entirely true: there is a bootstrapping mode in which classes have no default base class, but that's another story.)

Object supports a general set of functionality that is applicable to most types, including:

VTableBase is the base class for all classes with a vtable, which is the implementation mechanism of virtual functions. It is a special class that is defined by the compiler, and it has no special contents other than a hidden vtable pointer instance variable.

Object is derived from VTableBase, so by default most methods in Object and all of its derived classes are virtual.

FreeBase is a base class that can be used in cases where you don't want to be derived from Object (like when defining a class that mirrors a C structure). FreeBase does not support virtual functions, memory management, or anything you don't put into your derived class. If you're going to use it, you should at minimum figure out how to deal with memory management.

There are situations where you get a reference to a base class but you suspect or know that the object is actually an instance of a derived class. Like C++, Crack lets you typecast a base class to a derived class using cast() and unsafeCast().

Typecasting is generally discouraged in object-oriented paradigms. However, there are certain situations where it is necessary, and others where it is just the easiest way to get something done. Consider the case of containers:

    import crack.container Array;
    
    # create an array of coordinates
    coords := Array();
    coords.append(Coord(1, 2));
    coords.append(Coord(3, 4));

We've stored a couple of Coord objects in the array, but we can't use these directly because Array stores an array of objects:

    # gives an error because there is no drawLine(Object, Object) function.
    drawLine(coords[0], coords[1]);

This is the same problem that early versions of Java had - it will be fixed in a later version of Crack through the introduction of generics. But for now we can work around this with a type cast:

    drawLine(Coord.cast(coords[0]), Coord.cast(coords[1]));

The cast() function is defined for all classes that derive from VTableBase (including all classes derived from Object). If you attempt to cast an object to a type that it is not an instance of, the program will abort with a (fairly useless) class cast error.

For classes not derived from VTableBase, you can use unsafeCast():

    import crack.lang FreeBase;
    class Rogue : FreeBase {}
    
    FreeBase f = Rogue();
    Rogue r = Rogue.unsafeCast(f);

Unlike cast(), unsafeCast() does no checking whatsoever - the programmer is responsible for ensuring that the object is of the type that it is being cast to. If it's not, unsafeCast() will happily deliver a reference to an invalid object.

For classes derived from VTableBase, you can verify prior to doing an unsafeCast() in the same way that cast() does, by looking at the associated class object:

    Foo obj;
    Coord c = null;
    if (obj.class.isSubclass(Coord))
        c = Coord.unsafeCast(obj);

Every object derived from VTableBase has a special class attribute - it's like an instance variable, only you can't assign it. It is implemented using a virtual function. The class attribute returns the object's class (recall that classes are also values that exist at runtime). So we could also do something like this:

    Coord c;
    if (c.class is Coord)
        cout `this will always get printed\n`;

Note that you usually don't want to use the is operator to check the class because it's usually acceptable for the class to be either the same as the class you are checking for or derived from the class you are checking for. Use isSubclass() instead.

Special Methods

Certain methods have special meaning within the language or the standard libraries.

final is used to designate methods that are inherently non-virtual - even if the class derives from VTableBase, the method will not be turned into a virtual method. As such, the method cannot be overridden. It may also be invoked on a null value.

oper init

A constructor.

oper del

The destructor.

bool toBool()

(final) If this method is defined, instances of the class can be implicitly converted to bool (see Implicit Conversion). Object implements this.

This will be replaced with the more general "oper to type" form in a future version of the language.

bool isTrue()

Returns true if the object is "true" when converted to a boolean. This is a virtual function defined in Object that is called for non-null values by toBool(). It allows derived classes to easily override conversion to bool.

int cmp(Object other)

Compare the object with another object. Returns a value greater than zero if the receiver is greater than other, a value less than zero if it is less than other, and zero if the two objects are equal.

If you implement this, all of the normal comparison operators ("==", "!=", "<", ">", "<=" and ">=") will work for you.

void writeTo(Writer writer)

Write the receiver to writer. This is used to allow the object to write itself in its most natural representation - whatever that means for the object type.

void format( type object)

This method is used by the back-tick operator to format objects of specific types in specific ways. See The Formatter Interface.
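
To illustrate a couple of these, here is a sketch of a class that overrides isTrue() to control conversion to bool and implements cmp() to get the comparison operators. Version is a made-up example, not part of the standard library. Note the assumption that other is always a Version: cast() will abort the program if it is not.

    import crack.io cout;

    ## A hypothetical version number, for illustration only.
    class Version {
        int major, minor;

        oper init(int major0, int minor0) {
            major = major0;
            minor = minor0;
        }

        ## version 0.0 is treated as "false".
        bool isTrue() { return major != 0 || minor != 0; }

        ## order by major, then by minor.  Assumes other is a Version.
        int cmp(Object other) {
            o := Version.cast(other);
            if (major != o.major)
                return major - o.major;
            return minor - o.minor;
        }
    }

    v1 := Version(0, 1);
    v2 := Version(0, 2);

    # uses toBool(), which calls our isTrue() for non-null values.
    if (v1)
        cout `v1 is non-zero\n`;

    # uses Object's "oper <", which calls our cmp().
    if (v1 < v2)
        cout `v1 is older than v2\n`;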

Operator Overloading

The oper keyword originated as a short form of the "operator" keyword in C++ which is designed to allow you to define your own implementation of the operators (e.g. "+", "-", ">" ...).

The following operators can be overloaded:

oper +( type other)

Binary plus.

oper -()

Unary negate.

oper -( type other)

Binary minus.

oper *( type other)

Binary multiply.

oper /( type other)

Binary divide.

oper %( type other)

Binary remainder.

oper []( type index)

Array element access.

oper []=( type index, type value)

Array element assignment.

oper --()

Unary pre-decrement (post-decrement, pre-increment and post-increment don't exist yet, not sure why this one does).

oper !()

Unary boolean negate.

oper ~()

Unary bitwise negate.

oper ==( type other)

(final) Binary "equals." Object implements this as "cmp(other) == 0".

oper !=( type other)

(final) Binary "not equals." Object implements this as "cmp(other) != 0".

oper <( type other)

(final) Binary "less than." Object implements this as "cmp(other) < 0".

oper <=( type other)

(final) Binary "less than or equal to." Object implements this as "cmp(other) <= 0".

oper >( type other)

(final) Binary "greater than." Object implements this as "cmp(other) > 0".

oper >=( type other)

(final) Binary "greater than or equal to." Object implements this as "cmp(other) >= 0".

oper |( type other)

Bitwise or.

oper &( type other)

Bitwise and.

oper <<( type other)

Left shift.

oper >>( type other)

Right shift.

oper ^( type other)

Exclusive or.

oper op =( type other)

(all Augmented Assignment operators). If these exist, they will be called when the augmented assignment operators are used. If they are not defined, they can still be used if the type defines the simple operator that it is based on. So for example, if a class defines "oper +", you can use the += operator on an instance without defining it.

As in the last section, final means the method is non-virtual and cannot be overridden.

The primitive types mostly have intrinsic implementations of the operators.
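
As a rough sketch of how this fits together, the hypothetical Money class below defines only "oper +" and then relies on the fallback described above for "+=". We are assuming here that an operator method declares its return type before the oper keyword, like any other method:

    ## A hypothetical amount of money in cents, for illustration only.
    class Money {
        int cents;
        oper init(int cents0) { cents = cents0; }

        ## binary plus: returns a new Money holding the sum.
        Money oper +(Money other) { return Money(cents + other.cents); }
    }

    total := Money(150);
    total = total + Money(275);   # calls oper +
    total += Money(25);           # no "oper +=" is defined, so oper + is used

    # total.cents should now be 450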

Certain operators can not be overloaded:

Method Resolution in Classes

When resolving an overloaded method, Crack uses the same rules as for normal function resolution: check each method in each namespace in the order defined, then do the same in the parent namespaces. If no result is found, repeat with conversions.

For classes, "parent namespaces" are the base classes. So if we have:

    class Base {
        void func(B b) {}
        void func(A a) {}
    }
    
    class Derived : Base {
        void func(A a) {}
    }

when we try to resolve func(val) on an instance of Derived, the compiler will check Derived.func(A), then Base.func(B), then Base.func(A), in that order.

This is somewhat problematic because, in the case above, if B is derived from A we probably don't want to override the more specific func(B) when we override the more general func(A), but that's what will happen because Derived.func(A) will match calls to func with B as an argument.

This results in even more weirdness when we deal with Base as an abstract interface:

    Derived().func(B());   # calls Derived.func(A)
    Base base = Derived();
    base.func(B());        # calls Base.func(B)!

For these reasons, method resolution will change in a future version of Crack so that overrides will not be checked as part of the method set in the override's context - they will only be checked in the base class where they were first defined.
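
Until then, one way to sidestep the surprise is to redeclare the more specific overload in the derived class, ahead of the general one, and have it delegate to the base class. This is only a sketch that follows from the resolution rules described above, not an officially recommended idiom. A and B are filled in here as empty classes so that the example is complete:

    import crack.io cout;

    # hypothetical argument types: B is derived from A.
    class A {}
    class B : A {}

    class Base {
        void func(B b) { cout `Base.func(B)\n`; }
        void func(A a) { cout `Base.func(A)\n`; }
    }

    class Derived : Base {
        # redeclared first, so it is found before func(A) when resolving
        # in Derived.  It just calls the base class version.
        void func(B b) { Base.func(b); }

        void func(A a) { cout `Derived.func(A)\n`; }
    }

    Derived().func(B());   # should now end up in Base.func(B)
    Base base = Derived();
    base.func(B());        # likewise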

Modules

We've been making casual use of the import statement throughout this document. The import statement is used to import symbols from modules. For example, we've used it to import the global variable cout from the crack.io module:

    import crack.io cout;

The general format of the import statement is:

    import  module-name  name-list;

module-name is a dot-delimited module name. name-list is a comma separated list of functions, variables and classes defined in the module that you wish to import into the current namespace.

Module names correspond directly to directory and file names in the Crack "library path." When resolving a module name, the system:

So for example, to load the crack.lang module for the first time we:

The Crack library path is specified with "-l" options on the command line. By default, the executor inserts the $PREFIX/lib/crack$VERSION path and the current directory at the beginning of the search path.

Variables defined in the module top-level are not released until program termination. Cleanups are called in the reverse order of definition.

The Formatter Interface

As we've shown, the back-tick operator allows us to do formatted output of static data, variables and expression values:

    int a;
    cout `a = $a, a + 1 = $(a + 1)\n`;

Expressions of this form are called "interpolation expressions," because they interpolate values into format strings. The interpolation expression above is equivalent to the following code:

    if (cout) {
        cout.format('a = ');
        cout.format(a);
        cout.format(', a + 1 = ');
        cout.format(a + 1);
        cout.format('\n');
    }

The cout variable is defined in crack.io as an instance of Formatter. Interpolation expressions are not limited to use with Formatter: they can be used on any object that supports conversion to boolean and has format() methods for all of the values in the expression. For example, we could create our own formatter that could be used in the expression above:

    class SumOfInts {
        int total;
        
        ## ignore static strings.
        void format(StaticString s) {}
        
        ## make integer formatting add the value to the sum.
        void format(int val) { total = total + val; }
    }
    
    SumOfInts sum;
    sum `a = $a, a + 1 = $(a + 1)\n`;
    
    # sum.total is 2a + 1

More often when doing this, you'll want to derive from Formatter and extend its functionality:

    import crack.io Formatter;
    
    ## Formatter that encloses strings in quotes.
    class StrQuoter : Formatter {

        oper init(Writer w) : Formatter(w) {}

        ## implemented so we don't quote StaticString
        void format(StaticString s) { rep.write(s); }
        
        ## Write strings wrapped in quotes.
        void format(String s) {
            rep.write('"');
            rep.write(s);
            rep.write('"');
        }
    }
    
    String s = 'string value';
    
    # wrap standard output's underlying writer with our formatter and use it 
    # to format the value.
    StrQuoter(cout.rep) `value is $s\n`;

Note that we had to reimplement format(StaticString) in the example above. The static content in an interpolation expression is of type StaticString, like all string constants in Crack. If we had not defined format(StaticString), our format(String) method would have been used for the "value is " string, and it would have been wrapped in quotes as well (this is because of the current resolution order of methods: it will be changed in a future version of the language so that we don't have this problem).

You can create your own Formatter objects given a Writer object. There are already a few specializations of this class in the crack.io module.

StringFormatter allows you to construct a string using a formatter:

    import crack.io StringFormatter;
    
    f := StringFormatter();
    f `some text`;
    s := f.createString();  # s == "some text"
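
Since StringFormatter is a specialization of Formatter, interpolation should work with it as well. The following is a small sketch (the variable names are arbitrary) that builds a string from a value and then writes it out:

    import crack.io cout, StringFormatter;

    int count = 3;

    # format into a string instead of writing directly to a stream.
    f := StringFormatter();
    f `found $count items`;
    s := f.createString();

    # write the finished string to standard output.
    cout `$s\n`;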

Reference Counting

Reference counting is a simple form of memory management. Every object is assigned a reference count, which is essentially the number of other objects or variables referencing the object. When a new reference is added, the reference count is increased. When a reference is removed, the reference count is decreased. When the reference count drops to zero, the destructor is called and the object's memory is released.

Crack's reference counting mechanism is actually implemented in the language as part of the implementation of Object in the crack.lang module. The compiler uses two special hooks - the "oper bind" and "oper release" methods - to notify an object when a reference is being added (by calling "oper bind") and released (by calling "oper release"). These methods are implicitly non-virtual: they cannot be overridden by a derived class, do not make use of the vtable and therefore they can be safely applied to null objects.

It is possible to implement the bind and release methods in classes derived from FreeBase or VTableBase to implement your own memory management. For example, the Wrapper class in crack.exp.bindings uses its oper release to always free the Wrapper instance when it is released, allowing it to essentially exist in the scope in which it is defined. Note that if you were to pass such an object out of that scope, the results would be undefined.

For efficiency, Crack does not bind and release every time you might expect: for one thing, objects passed as function arguments are not bound and released for the function call - we know that the external caller has a reference to these objects. The called function can simply borrow them.

Crack also has the notion of "productive" and "non-productive" expressions. A productive expression is one that produces a reference. A non-productive expression simply borrows an existing reference. Variable references are always non-productive. Functions returning values are (almost) always productive.

The compiler will call oper bind when assigning a non-productive value to a reference, or when returning a non-productive value. It will call "oper release" when a variable goes out of scope or when a productive temporary value is cleaned up. In general, temporaries get cleaned up at the end of the outermost expression. For the "&&" and "||" operators, temporaries get cleaned up for the secondary expression prior to cleanup of outer expressions.

There's one thing you need to be aware of about reference counting: the mechanism is susceptible to the problem of reference cycles - this is when an object directly or indirectly references itself. When this happens, the entire cycle of objects can become unfreeable, resulting in a memory leak. This is because each object in the cycle is still referenced by the object before it, so even when all external references are removed, none of the objects will drop to a reference count of zero.

There's currently no good way around this: you just have to be aware that if you create a reference that can introduce a cycle, you'll need to take certain remedial measures to avoid leaking the objects. This is typically accomplished by breaking the cycles at some point, normally during the destruction of some external object that references the cycle without participating in it.
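
As a sketch of that last approach, the hypothetical Ring class below owns two Node objects that reference each other. Ring is not part of the cycle itself, so its destructor is a convenient place to break it. All of the names here are made up for illustration:

    ## A node that can reference another node (hypothetical example).
    class Node {
        Node next = null;
    }

    ## Owns a two-node cycle without being part of it.
    class Ring {
        Node a = null;
        Node b = null;

        oper init() {
            a = Node();
            b = Node();
            a.next = b;
            b.next = a;     # a -> b -> a is now a reference cycle
        }

        ## break the cycle so that both nodes can be freed when the
        ## instance variables are released.
        oper del() {
            a.next = null;
        }
    }

    Ring r;   # when r goes out of scope, the nodes should be freed too

Without the oper del method, each node would still hold a reference to the other after Ring released them, and neither would ever be deleted.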

Primitive Bindings

Crack allows you to directly import and call functions from shared libraries. A special variation of the import statement allows you to import symbols from a shared library:

    # import malloc() and free()
    import "libc.so.6" free, malloc;

After doing this, it is necessary to provide declarations of the functions you've imported:

    byteptr malloc(uint size);
    void free(byteptr val);

You can then use them like any other function:

    mem := malloc(100);
    free(mem);

Many C functions require special arguments like pointers to integers or structures that are not natively supported in Crack. However, we can often get the effect of these kinds of things by making use of the fact that all Crack objects are essentially pointers to the corresponding C structures:


    # import "free()"
    import "libc.so.6" free;
    void free(voidptr mem);

    # define a wrapper around int
    class IntWrapper {
        int val;
        
        # free the structure's memory when we go out of scope.
        oper release() { free(this); }
    };

    # import C function "void doSomething(int *inOutVal)"
    import "somelib.so" doSomething;
    void doSomething(IntWrapper inOutVal);
    
    # call it
    v := IntWrapper();
    v.val = 100;
    doSomething(v);

A set of wrapper types for this sort of thing is already defined in the crack.exp.bindings module. Instead of defining our own IntWrapper, we could have just done:

    import crack.exp.bindings IntWrapper;
    v := IntWrapper(100);

All of the Wrapper classes in this module derive from the Wrapper base class. Wrapper has the oper release() method definition, which frees the object when it goes out of scope. Note that wrappers must not be passed out of scope:

    class Broken {
        IntWrapper wrapper = null;
        
        IntWrapper bad() {
            i := IntWrapper(100);
            wrapper = i;           # BAD.  Instance variable will reference 
                                   # a deleted object.
            return i;              # BAD.  Caller will get a deleted object.
        }
    }

The crack.exp.bindings module also defines an Opaque class. This can be used for structures returned from C functions that contain no user-serviceable parts. For example:

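    import crack.exp.bindings Opaque;
    import "libFoo.so" Foo_Create, Foo_Destroy;
    class Foo : Opaque {}
    
    # declare the imported functions (signatures assumed for this example)
    Foo Foo_Create();
    void Foo_Destroy(Foo foo);
    
    # create a Foo instance, then destroy it.
    foo := Foo_Create();
    Foo_Destroy(foo);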

Opaque doesn't attempt to free the object like Wrapper derivatives, so it is important that you manage the object correctly yourself.

Sometimes C functions want to accept a function pointer to use as a callback. You can get this effect by defining a function and using a parameter type of voidptr for the callback parameter:

    import "libFoo.so" Foo_SetCallback;
    void Foo_SetCallback(Foo obj, voidptr callback);
    
    void myCallback(Foo obj) { cout `callback called\n`; }
    Foo_SetCallback(Foo_Create(), myCallback);

This won't work for overloaded functions: the compiler won't be able to tell which overload to use.

Crack's current approach to bindings is not without its problems:

Threading

As it stands, Crack 0.2 is written with little regard for threads. You can attempt to use the normal threading libraries, if you like, but you're likely to run into some problems. In particular, you should be aware that the reference counting mechanism is not thread-safe, so memory management will most likely fail in really hard to debug ways if you share lots of objects between threads.

This will be remedied in a future version through the introduction of atomic operations, which will allow the reference counting mechanism to be implemented safely - with some cost to performance of threaded applications.

Debugging

Crack has only minimal support for debugging in 0.2. If your program seg-faults or aborts, you can at least get a sparse stack-trace by running it under a fairly recent version of GDB (7.0 or later).

Appendices - Libraries

Crack comes with its own (sparse but growing) set of standard libraries. These are loosely organized as the modules under crack and crack.exp. crack.exp includes modules that are designated as "experimental." These have interfaces that are likely to change as Crack matures. The other libraries are also changeable, but should be frozen by version 1.0.

Special Language Types

Strings and Buffer Types

Strings are derived from Buffer. A buffer is just a byteptr (stored in the instance variable buffer) and a size (stored in the instance variable size).

The memory that a buffer references is presumed to be read-only, but nothing in the language enforces this. For cases where a writable buffer is desired, use a WriteBuffer or something derived from it. WriteBuffer inherits from Buffer, but the read-only requirement is relaxed: it is legal to modify the buffer of a WriteBuffer.

Most output functions accept a Buffer; most input functions accept a WriteBuffer. For cases where you want a WriteBuffer that cleans up after itself, use a ManagedBuffer.

Constant strings are actually of type StaticString. This is a class for strings with buffers that are not to be deleted (because static strings can reference data in read-only memory segments).

Like any object, a buffer can be converted to a bool. A buffer converts to true if it has a non-zero size.

Any two buffers can be compared - a byte-for-byte comparison is performed. If one buffer is identical to the beginning of the other, but is shorter, the shorter buffer is "less than" the longer buffer.
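
A brief sketch of both behaviors, using strings (the variable names are arbitrary):

    import crack.io cout;

    String a = 'abc';
    String b = 'abcd';

    # 'abc' is identical to the beginning of 'abcd' but is shorter, so it
    # compares as less.
    if (a < b)
        cout `abc sorts before abcd\n`;

    # an empty string has a size of zero, so it converts to false.
    String empty = '';
    if (empty)
        cout `empty string is true\n`;
    else
        cout `empty string is false\n`;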

The String class is one of the few types that don't need to be imported. Buffer and its other descendants can be imported from crack.lang.

ManagedBuffer

A managed buffer is a WriteBuffer that manages its underlying buffer memory. These are designed for use with IO operations. The typical use case is to read into it and then convert it to a string, either by copying the underlying buffer or by passing ownership of it to the new string:

    # create a 1K managed buffer.
    ManagedBuffer buf = {1024};
    
    # read it, store the size of what we read
    buf.size = cin.read(buf);
    
    # convert it to a string and take ownership of the buffer (the second 
    # argument of the constructor is "takeOwnership")
    String s = {buf, true};

Note that Reader includes a utility method that does this more intelligently (it only transfers ownership if the amount read is at least 75% of the buffer size):

    # 's' is a string.
    s := cin.read(1024);

SubString

The SubString class provides a lightweight string that references the buffer of an existing string. You can create one by providing the constructor with an existing string, a start position and a size:

    import crack.lang SubString;
    
    s := SubString('this is a test', 10, 4);

SubString is derived from String, so you can use it in the same way that you would a string without the cost of managing and copying a portion of the buffer.

MixIn

The MixIn class allows you to do multiple inheritance safely - without risk of breaking reference counting. The MixIn approach is a stand-in for virtual inheritance, which Crack 0.2 does not yet support.

MixIn defines bind and release operators and a virtual function called getObject() that knows how to get the Object instance that the class is derived from. This allows bind and release to operate on a reference to the MixIn (which is not derived from Object) by converting it to an Object with a getObject() method implemented by the concrete class.

For example, we could use MixIn to define a pure interface like this:

    import crack.lang MixIn;

    class MyIface : MixIn {
        void doSomething() { die('MyIface.doSomething() not implemented.'); }
    }
    
    class MyConcrete : Object, MyIface {
        
        # we have to define getObject() to satisfy a MixIn.
        Object getObject() { return this; }
        
        # implement the interface.
        void doSomething() { cout `something happened\n`; }
    }

Unfortunately, because a class can only exist once in another class's inheritance tree (except for VTableBase), you can only do this trick once in a single class's ancestor list. For example, if Iface1 derives from MixIn, and Iface2 also derives from MixIn, it is illegal to define a MyConcrete derived from both Iface1 and Iface2 because they share the MixIn ancestor.

Because of this single-inheritance restriction, the MixIn strategy is really more useful as a pattern than as a base class. To make a class a MixIn, do something like this:

    class MyIface : VTableBase {
        
        Object getMyIfaceObject() {
            _die('MyIface.getMyIfaceObject() not implemented.');
            return null;
        }
    
        oper bind() { if (!(this is null)) getMyIfaceObject().oper bind(); }
        oper release() {
            if (!(this is null)) getMyIfaceObject().oper release();
        }
        
        void doSomething() { die('MyIface.doSomething() not implemented.'); }
    }
    
    class MyConcrete : Object, MyIface {
        
        Object getMyIfaceObject() { return this; }
        
        void doSomething() { cout `something was done\n`; }
    }

This trick will allow you to define interfaces and other types of mix-ins that can be safely inherited from in arbitrary ways.

Note that all of this trickery is temporary: later versions of Crack will support virtual base classes in a form that will make this functionality a part of the language.

Containers

Containers provide support for the basic sequential and mapping data structures expected from a modern programming language. Crack's container library is crack.container. All container classes are derived from Container.

Crack containers all store objects of type Object. You can store derived types, but in order to do anything useful with them after retrieval you pretty much need to cast them to the type that you expect them to be.

For example, if we have an array of objects of type Foo with method bar(), we need to do something like this to call bar on the zeroth element:

    Foo.cast(arr[0]).bar();

Java-like generics will be supported in a later version of the language, and you'll be able to have an Array[Foo].

All containers have a count() method that returns the number of elements in the container. They also convert to boolean based on whether or not they are empty:

    c := createContainer();
    if (c)
        cout `container has $(c.count()) elements\n`;
    else
        cout `container is empty\n`;

Iterators

Iterators are like iterators in Java. All container types have an iter() method which returns an Iterator object for the container.

Iterators allow you to iterate over the set of objects in a container one at a time. As an example, consider an array:

    Array arr = createArray();
    
    i := arr.iter();
    while (i.nx())
        cout `got $(i.elem())\n`;

arr.iter() creates an iterator over the array initialized at the first item. The nx() method forwards the iterator to the next element except when it is called for the first time. It returns true if the iterator is valid after being forwarded. elem() returns the element associated with the iterator.

The nx() method is a stop-gap solution - it exists because Crack doesn't have a for statement yet. nx() lets us do iteration in a while loop. Don't expect it to stick around for very long.

You can also use the more permanent next() method - this forwards the iterator regardless of whether it is pointing to the first element or not.

Arrays

Arrays are collections that preserve element order with fast random-access assignment and retrieval. Unlike their low-level counterparts, they are safe for general use. They manage both the size of the underlying array and the reference counts of the elements.

Construct an array with the expected number of elements:

    import crack.container Array;
    Array arr = {10};  # 10 element array

The number of elements specified is the array capacity - this is the number of elements that can be added before it is necessary for the array to reallocate its underlying, low-level array.

Like other containers, arrays have a count() method that returns the actual number of elements in the array. At the time of construction, the count is zero.

To add an element to an array, use the append() method:

    arr.append('first element');
    arr.append('second element');
    cout `count = $(arr.count())\n`;  # writes "count = 2"

Note that append() will occasionally be an O(n) operation: if we don't have capacity for a new element, append() will reallocate the array to one twice as big.

To get an element in an array, use the bracket operator:

    cout `elems: $(arr[0]), $(arr[1])\n`;

You can also replace an existing element in an array using element assignment:

    arr[1] = 'new second element';

Note that this only works to replace an element: arr[2] = 'something' would result in a runtime error.

Finally, like all containers, you can iterate over the elements of an array:

    i := arr.iter();
    while (i.nx())
        cout `elem: $(i.elem())\n`;

Linked Lists

The List class implements a linked list. Linked lists aren't very good for random access, but they do support constant time insert and append, making them preferable to arrays in certain cases.

The List interface looks very much like the Array interface:

    import crack.container List;
    
    List l;
    
    # add some elements
    l.append('first');
    l.append('second');
    
    # iterate over them
    i := l.iter();
    while (i.nx())
        cout `got element $(i.elem())\n`;

Random access lookup, delete and insert are all available, although they are all O(n) operations:

    cout `$(l[1])\n`;       # prints "second"
    l.delete(0);            # deletes 'first'
    l.insert(0, 'first');   # reinserts 'first'

You can also push a new element onto the front of the list (equivalent to insert(0, ...) but terser and slightly faster):

    l.pushHead('zero');

pushTail() is also available as a synonym for append().

Likewise, you can pop items off the head of the list:

    elem := l.popHead();  # after the last operation, elem == 'zero'

popTail() can not be implemented efficiently: if you need this, use a DList.
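
Taken together, pushHead() and popHead() are enough to use a List as a stack. A minimal sketch, assuming that List provides the count() method that the Arrays section attributes to all containers:

```crack
import crack.io cout;
import crack.container List;

List stack;
stack.pushHead('one');
stack.pushHead('two');
stack.pushHead('three');

# elements come back in last-in, first-out order: three, two, one
while (stack.count() > 0)
    cout `$(stack.popHead())\n`;
```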

Doubly-Linked Lists

The DList class implements a doubly-linked list. DList provides some additional functionality over List and can facilitate faster operations, but at the cost of higher memory usage (each DList element requires an additional pointer).

DList supports inserting and deleting the element referenced by an iterator:

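    DList dl;
    fillList(dl);  # fillList() stands in for your own code that populates the list

    i := dl.iter();
    while (i) {

        # delete the 'delete me' string from the list; the iterator then
        # already points to the following element, so we don't advance it
        if (i.elem() == 'delete me') {
            dl.delete(i);

        # insert 'inserted' before 'before me'; the iterator then points to
        # the new element, so step over it and over 'before me'
        } else if (i.elem() == 'before me') {
            dl.insert(i, 'inserted');
            i.next();
            i.next();

        } else {
            i.next();
        }
    }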

After insert, the iterator points to the new element. After delete, it points to the element after the deleted element.

DList also supports a bi-directional iterator:

    # get a bi-directional iterator - an argument of "true" causes 
    # the iterator to be initialized to the end of the list.  "false" would 
    # start it at the beginning.
    i := dl.bidiIter(true);
    while (i) {
        cout `$(i.elem())\n`;
        i.last();
    }

You can use a bi-directional iterator everywhere you can use a normal iterator.
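
For instance, a bi-directional iterator that starts at the beginning of the list can drive the usual forward loop. This is a sketch that takes the statement above at face value and assumes the bi-directional iterator supports nx() like any other iterator:

```crack
import crack.io cout;
import crack.container DList;

DList dl;
dl.append('first');
dl.append('second');

# "false" starts the iterator at the beginning of the list, so the
# ordinary forward idiom works unchanged
i := dl.bidiIter(false);
while (i.nx())
    cout `$(i.elem())\n`;
```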

In addition to the popHead() function, DList supports a popTail() function:

    DList dl;
    dl.append('first');
    dl.append('second');
    elem := dl.popTail();  # elem == 'second'

RBTree Maps

A "map" or an "associative array" is a container that stores a set of values each indexed by a key. Both keys and values can be arbitrary objects. The only kind of map that Crack 0.2 supports is red-black trees. Red-black trees provide a natural sort order, and provide O(log n) insertion and lookup with constant time rebalancing.

To create an RBTree map:

    import crack.container RBTree;
    
    map := RBTree();

To add elements:

    map['first'] = '1';
    map['second'] = '2';
    map['third'] = '3';

To access them:

    cout `$(map['first'])\n`;  # writes "1"

Iteration is the same as for arrays, except that the elements are KeyVal objects. As their name suggests, KeyVal objects have key and val attributes:

    i := map.iter();
    while (i.nx()) {
        kv := KeyVal.cast(i.elem());
        cout `map[$(kv.key)] = $(kv.val)\n`;
    }

Error Handling

Future versions of Crack will have real C++/Java-like exception handling. For version 0.2, however, we at least have the error module. The error module lets you record an error in a manner that can be controlled by the callers. The default action is to abort.

Here's an example of how we might use it to verify a parameter:

    import crack.exp.error err;
    
    Object checkForNull(String context, Object val) {
        if (val is null)
            err.do() `$context got a null argument`;
        return val;
    }

err.do() returns a formatter that writes to a writer stored in the err object. When this formatter is destroyed, it will call err.finish() which calls abort(), terminating the program, if err.fatal is true. If err.fatal is false, err.gotError is set to true and the program continues.

To prevent a function from aborting the program when an error occurs, we can use the ErrorCatcher class:

    import crack.io cout;
    import crack.exp.error err, ErrorCatcher;
    
    void myFunc(Object arg) {
        catcher := ErrorCatcher();
        checkForNull("myFunc", arg);
        if (catcher.gotError())
            cout `We got a null and don't want to die: $(catcher.getText())\n`;
    }

When you create an ErrorCatcher, it sets err.fatal to false and sets err.writer to an internal StringWriter, causing the error handler to have no visible effect. The caller can then check the catcher to see if there was an error, and obtain the error text from the catcher's getText() method. When the catcher is destroyed, it restores the error handler to its previous state.

It is not advisable to use this mechanism for errors that can conceivably indicate something other than a programming error. For example, it would not be advisable for a file interface to call a fatal error for a file not found. Better to use a lightweight return code and let the caller decide whether this represents a programming error.
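
As a sketch of that advice, the hypothetical functions below (neither is part of the standard library) separate the lightweight check from the decision about whether a failure is fatal. firstOrNull() reports an empty array by returning null, and requireFirst() is one caller that chooses to treat that as a programming error:

```crack
import crack.container Array;
import crack.exp.error err;

# hypothetical lookup that reports "nothing there" with a null return
# value instead of raising an error
Object firstOrNull(Array arr) {
    if (arr.count() == 0)
        return null;
    return arr[0];
}

# a caller for whom an empty array is a programming error can escalate
Object requireFirst(Array arr) {
    first := firstOrNull(arr);
    if (first is null)
        err.do() `requireFirst called with an empty array`;
    return first;
}
```

Other callers remain free to handle the null themselves without involving the error module at all.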

As a convenience, the crack.exp.error module also provides a strerror() function. strerror() returns the error text of the last error that occurred in the C library.

Program Control (sys)

crack.sys contains symbols for interacting with the executable context. It defines the exit() function and argv variable.

The exit() function can be used to terminate the program at any point with the given exit code:

    import crack.sys exit;

    # terminate with the normal code.
    exit(0);

The argv variable is a StringArray instance that allows you to access the command line arguments:

    import crack.io cout;
    import crack.sys argv;

    # write out all of the args except the program name (argv[0])
    uint i = 1;
    while (i < argv.count()) {
        cout `arg $i = $(argv[i++])\n`;
    }

StringArray is a special purpose class just for storing arg lists without having to do annoying casting. You can use it as a general purpose string sequence if you like, but it will be supplanted by Array[String] once Crack supports generics (post 0.2).
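
Putting exit() and argv together, a script can validate its command line before doing any work. This is only a sketch: the usage text and the exit code of 1 are illustrative choices, and cerr is the standard error formatter from crack.io:

```crack
import crack.io cerr;
import crack.sys argv, exit;

# argv[0] is the program name, so one real argument means a count of 2
if (argv.count() < 2) {
    cerr `usage: $(argv[0]) <file>\n`;
    exit(1);
}
```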

Input/Output

The crack.io module contains crack's basic input/output interfaces. The Reader and Writer classes are the core interfaces for reading and writing data. These make use of the Buffer hierarchy and are implemented using the mix-in pattern so that they can be combined with each other and other classes.

The crack.io module defines three global variables: cin, cout, and cerr. These are wrappers around the system standard input, output and error streams. cin is a kind of Reader (an FDReader, to be precise) and cout and cerr are both Formatters. Formatter, in turn, implements Writer, so we can use these streams to illustrate the Reader and Writer interfaces.

Writer has one salient method called write(). This method writes a buffer to the underlying output object:

    cout.write('some data');

Since String is derived from Buffer, we can use the write() method to write a constant string. Classes implementing Writer must implement the write() method.

Reader provides two methods: an efficient method that reads into a WriteBuffer and a heavier form that reads a buffer of specified size and returns a string.

    import crack.io cin, ManagedBuffer;

    ManagedBuffer buf = {1024};  # ManagedBuffer is derived from WriteBuffer
    bytesRead := cin.read(buf);  # read up to 1024 bytes into 'buf'

read(WriteBuffer) doesn't modify the size of its input buffer. It does return the number of bytes read. It is the responsibility of the caller to change the size of the buffer if desired.
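
For example, a caller that wants to treat the bytes it read as a string could shrink the buffer first. This sketch assumes that the value returned by read() can be assigned directly to the buffer's size attribute, as the cat example in this section does, and that a String can be constructed from a buffer, as the networking client example does:

```crack
import crack.io cin, cout, ManagedBuffer;

ManagedBuffer buf = {1024};
bytesRead := cin.read(buf);

# shrink the buffer to the bytes actually read before using its contents
buf.size = bytesRead;
cout `read $bytesRead bytes: $(String(buf))\n`;
```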

We could have done something similar with the heavy-weight form:

    String data = cin.read(1024);

This is more expensive than read(WriteBuffer) because it actually allocates a buffer and a String object.

We could implement the most basic functionality of the cat command like this:

    ManagedBuffer data = {4096};
    while (data.size = cin.read(data)) {
        cout.write(data);
        data.size = 4096;
    }

cin, cout and cerr are implemented using FDReader and FDWriter. These classes implement Reader and Writer to read from and write to a file descriptor. You can obtain a file descriptor from any of the low-level IO functions on your system ("open()", for example).

The crack.io module also provides the Formatter class. See The Formatter Interface for details on how to use this.

Line Readers

The crack.readers module provides a LineReader class that lets you read from a reader a line at a time:

    import crack.io cin, cout;
    import crack.readers LineReader;
    
    reader := LineReader(cin);
    String line = null;
    while (line = reader.next()) {
        cout `got line: $line\n`;
    }

The next() method returns null when there is no more data to read.

Files and Directories

crack.exp.file contains two classes that provide a basic interface for manipulating files as objects, including obtaining file information and simple reading and writing.

FileInfo

A FileInfo object can be constructed from a String, which should contain a full path name (the file does not have to exist). The class will provide methods for obtaining information on a file (if it exists) and manipulating file and path information. If the file exists, information will include size, owner and group information, file permissions, and the like. Regardless of file existence, operations such as glob matching, basename, and dirname will be available.

Currently, however, the only implemented methods are matches, for matching a file name against a pattern such as '*.txt', basename to return the file name portion of a full path (optionally stripping extension), and dirname for returning just the directory portion of a full path.

Example:

    import crack.io cout;
    import crack.exp.file FileInfo;

    fi := FileInfo('/etc/resolv.conf');
    if (fi.matches('*.conf'))
        cout `it's a conf file!\n`;
    basename := fi.basename(true);   // strip extension
    // basename contains the string 'resolv'
    basename = fi.basename(false);   // don't strip extension
    // basename contains the string 'resolv.conf'
    dirname := fi.dirname();
    // dirname contains '/etc'

File

The File object can be used for reading and writing to files on disk. It extends FileInfo and so all informational functionality available with that class is also available to File.

A File should be constructed with the full path to the file and the mode to open it in, which is either 'r' or 'w' for read or write, respectively. If the file is opened in read mode, a LineReader is automatically created, which allows the file to be read line by line via the nextLine method. Note that the trailing newline is included in the return value.

Read Example:

    f := File('/etc/resolv.conf', 'r');
    String lineBuf = null;
    lineNum := 1;
    while (lineBuf = f.nextLine()) {
        # lineBuf already ends with a newline
        cout `$lineNum: $lineBuf`;
        lineNum = lineNum + 1;
    }

When opened in write mode, a file may be written to by passing a String to the write method:

    f := File('/tmp/myfile.txt', 'w');
    f.write("lorem ipsum rock");

Note that there is currently no append mode; if the file already exists, it will be overwritten upon opening in write mode.

File objects may be closed with the close method. This is done automatically when the File object destructor runs (if it hasn't been closed manually).
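
A sketch that combines both modes to copy a file line by line and then closes both files explicitly. The destination path is a placeholder, and the loop assumes that nextLine() returns null at end of file, as the read example above relies on:

```crack
import crack.exp.file File;

# both paths are placeholders for illustration
src := File('/etc/resolv.conf', 'r');
dst := File('/tmp/resolv.conf.copy', 'w');

# nextLine() keeps the trailing newline, so each line can be written
# back out exactly as it was read
String line = null;
while (line = src.nextLine())
    dst.write(line);

src.close();
dst.close();
```

Closing explicitly is optional here, since the destructors would do it, but it makes the point at which the copy is complete visible in the code.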

Directory

crack.exp.dir contains the Directory class, used to iterate through a directory hierarchy. A Directory object can be constructed by passing the full path of a directory to the constructor. If the path is valid, you can iterate through the list of directories and files contained in the directory.

You do this by calling the dirIter and fileInfoIter methods. They will both return an Iterator object. A dirIter Iterator will iterate through Directory objects, while a fileInfoIter will iterate through FileInfo objects. Like all uses of Iterator currently, you must cast these objects yourself in the calling code.

For example, to recursively scan a directory for a file pattern:

    void scanForScripts(Directory dir) {

        Iterator i;
        pattern := "*.crk";

        FileInfo curFile = null;
        i = dir.fileInfoIter();
        while (i.nx()) {
            curFile = FileInfo.cast(i.elem());
            if (curFile.matches(pattern)) {
                cout `found crack script file: $curFile\n`;
            }
        }

        // recurse to directories
        Directory nextDir = null;
        i = dir.dirIter();
        while (i.nx()) {
            nextDir = Directory.cast(i.elem());
            scanForScripts(nextDir);
        }
    }

When a Directory object is created, it canonicalizes the name by stripping a trailing path separator (i.e. "/") if it exists. If you need a version with the trailing separator (for example, to build a full path name with string concatenation), you can call nameWithTrailing.

Regular Expressions

Crack provides a regular expression library in the form of crack.exp.regex. You use it by creating a Regex object:

    import crack.io cin, cout;
    import crack.exp.readers LineReader;
    import crack.exp.regex Regex, Match;

    rx := Regex('(\\w+)=(.*)');
    String line = null;
    LineReader src = {cin};
    while (line = src.next()) {
        Match m = rx.search(line);
        if (m)
            cout `got $(m.group(1)) = $(m.group(2))\n`;
    }

The Regex.search() method returns a Match object if there was a matching substring and null if there wasn't. The Match object allows you to get the entire text of the matched substring, the start and end positions of it, and also allows you to get this information for parenthesized subgroups (employed in the example above).

We use values of 1 and 2 to access the subgroups: the zeroth subgroup is the text of the entire expression.

To get the start and end position of the matched substring:

    uint startPos = m.begin();
    uint endPos = m.end();

Likewise, we can do this with subgroups:

    uint start1 = m.begin(1);
    uint end1 = m.end(1);

Note again that the zeroth subgroup (m.begin(0)) is equivalent to the entire group, so we start with group 1.

We can also use named subgroups. For example, we could have written our expression like this:

    rx := Regex('(?P<var>\\w+)=(?P<val>.*)');

And then the output statement would look like this:

    cout `got $(m.group('var')) = $(m.group('val'))\n`;

Likewise, the begin() and end() functions can be used with group names.

Named subgroups can greatly enhance the readability of regular expression code.
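
As a sketch, the positions of named groups can be reported the same way. The input string here is made up for illustration, and the code assumes that begin() and end() accept a group name exactly as group() does:

```crack
import crack.io cout;
import crack.exp.regex Regex, Match;

rx := Regex('(?P<var>\\w+)=(?P<val>.*)');
Match m = rx.search('path=/usr/local');
if (m) {
    # 'var' matches "path" and 'val' matches "/usr/local"
    cout `var spans $(m.begin('var')) to $(m.end('var'))\n`;
    cout `val spans $(m.begin('val')) to $(m.end('val'))\n`;
}
```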

crack.exp.regex is a wrapper around the PCRE library (Perl Compatible Regular Expressions). As the name suggests, the regular expression syntax supported by this library is very close to the syntax supported by Perl 5.

Bindings

The crack.exp.bindings library provides some facilities to ease integration with C libraries. It is often possible to model C types and functions using Crack classes and functions, provided that you can remove some of the Crack facilities (such as garbage collection and vtables).

Opaque

The Opaque class is used as the base class for types where the Crack code does not need to access instance variables. In these cases, the C code must define functions to obtain and manipulate the object. More information on Opaque can be found in Primitive Bindings.

Wrapper

Wrapper is the base class for types where C code requires the address of a primitive value (typically for an output parameter).

The crack.exp.bindings module currently defines two wrappers:

IntWrapper

Used when C code calls for a pointer to a single integer.

ByteptrWrapper

Used when C code calls for the address of a char * (or byteptr in Crack).

You can also derive from Wrapper directly to implement your own pointer functionality.

It is important to note that Wrapper and Opaque are not derived from Object, and as such they have no garbage collection. Opaque objects should be associated with real objects that know how to clean them up using their low-level functions. Wrapper objects must not be assigned to other variables, otherwise they will be garbage collected twice. Values referenced by wrappers conform to whatever lifecycle arrangements are provided by the low-level APIs that they are obtained from.

Networking

The crack.net module includes support for TCP/IP socket level programming.

A socket is a communication endpoint for TCP/IP communications. Sockets can either be connection channels, allowing you to send data to a matching socket elsewhere on the network, or listeners which wait for new connections.

We create a socket as an instance of the Socket class:

    import crack.net sockconsts, Socket;
    
    s := Socket(sockconsts.AF_INET, sockconsts.SOCK_STREAM, 0);

This creates a "stream" (TCP) socket for a reliable connection over the INET protocol.

With a little more code, we can create a very basic server program:

    import crack.net sockconsts, Socket, SockAddrIn;
    import crack.exp.error err;
    
    Socket srv = {sockconsts.AF_INET, sockconsts.SOCK_STREAM, 0};
    
    # allow the socket's port to be immediately reused after the program 
    # terminates (useful for debugging and server restarts)
    srv.setReuseAddr(true);

    # bind to port 1900 on all interfaces
    if (!srv.bind(SockAddrIn(sockconsts.INADDR_ANY, 1900)))
        err.do() `error binding socket`;
    
    # queue up to 5 connections
    srv.listen(5);
    
    # accept loop
    while (true) {
        
        # accept a new client connection
        accepted := srv.accept();
        if (!accepted)
            err.do() `error accepting new connection`;
        
        # read up to 1K from the new connection
        data := accepted.sock.read(1024);
        
        # write it back to the new connection (we could have also used the 
        # send() and recv() functions)
        accepted.sock.write(data);
    }

A client script might look like this:

    import crack.net sockconsts, SockAddrIn, Socket;
    import crack.io cout, ManagedBuffer;
    import crack.exp.error err, strerror;
    
    Socket s = {sockconsts.AF_INET, sockconsts.SOCK_STREAM, 0};
    
    # connect to localhost (127.0.0.1)
    if (!s.connect(SockAddrIn(127, 0, 0, 1, 1900)))
        err.do() `error connecting: $(strerror())`;
    
    # this time, we'll use send and recv
    if (s.send('some data', 0) < 0)
        err.do() `error sending: $(strerror())`;
    
    # receive from the socket
    ManagedBuffer buf = {1024};
    int rc;
    if ((rc = s.recv(buf, 0)) < 0)
        err.do() `error receiving: $(strerror())`;
    buf.size = uint(rc);
        
    cout `got $(String(buf))\n`;

When writing a server, you usually want to manage multiple connections. You can use the Poller class for this. Poller wraps the POSIX poll interface, which allows you to wait for an event on a set of Pollable objects (Socket is derived from Pollable, so you can use this to manage sockets).

Poller allows you to add any number of Pollables and the events you care about on them. You can then wait for an event (or a timeout) using the wait() method. Here is an example of a simple server program (it just echoes back whatever it reads from a client) written using Poller:

    import crack.net sockconsts, Poller, PollEvent, Socket, SockAddrIn;
    import crack.container DList;
    import crack.exp.error err, strerror;
    import crack.io cout;
    
    Socket srv = {sockconsts.AF_INET, sockconsts.SOCK_STREAM, 0};
    
    # allow the socket's port to be immediately reused after the program 
    # terminates (useful for debugging and server restarts)
    srv.setReuseAddr(true);

    # bind to port 1900 on all interfaces
    if (!srv.bind(SockAddrIn(sockconsts.INADDR_ANY, 1900)))
        err.do() `error binding socket`;
    
    # queue up to 5 connections
    srv.listen(5);

    # our list of clients    
    DList clients;

    # remove a client from the list
    void removeClient(Socket client) {
        i := clients.iter();
        # stop at the matching client, or at the end of the list
        while (i && !(client is i.elem()))
            i.next();
        
        if (!i)
            err.do() `unable to remove client: $(client.fd)\n`;
        else
            clients.delete(i);
    }
    
    # accept loop
    while (true) {
        Poller p;

        # add the server socket and all of the client sockets        
        events := sockconsts.POLLIN | sockconsts.POLLERR;
        p.add(srv, events);
        i := clients.iter();
        while (i.nx())
            p.add(Socket.cast(i.elem()), events);
        
        # wait indefinitely for the next event
        rc := p.wait(null);
        if (rc < 0)
            err.do() `error during poll: $(strerror())`;
        
        PollEvent evt = null;
        while (evt = p.nx()) {
            cout `got event\n`;
            # if this is the server, do an accept.  Otherwise do a read.
            if (evt.pollable is srv) {
                accepted := srv.accept();
                if (!accepted)
                    err.do() `error accepting new connection`;
                
                clients.append(accepted.sock);
            } else {
                # read some data, write it back (what we should really do is 
                # add sockconsts.POLLOUT to the events that this client is 
                # listening for, and then write when the socket becomes ready 
                # to write, but in the interests of simplifying the example 
                # we'll just write back to it)
                client := Socket.cast(evt.pollable);
                data := client.read(1024);
                if (!data) {
                    cout `removing client $(client.fd)\n`;
                    removeClient(client);
                } else {
                    client.write(data);
                }
            }
        }
    }

GTK

crack.exp.gtk is an experimental (and currently very minimal) GTK module. It currently provides classes for the following GTK objects:

Toplevel

A top-level (window manager level) window.

Tooltips

Used to tie fly-over help text to widgets.

Button

A pushbutton widget.

Entry

A single-line text entry field.

HBox

A container that arranges child widgets horizontally in a row.

VBox

A container that arranges child widgets vertically in a column.

App

Models an application with an init function and a main loop.

Here's a sample program:


    import crack.exp.gtk App, Button, Entry, Handlers, HBox, Label, Toplevel,
        VBox;
    import crack.sys argv;
    import crack.io cout;
    
    App app;
    
    # this is going to be our window.
    class MyWindow : Toplevel {

        # create a bunch of child widgets.        
        Label lbl = {'Your Name:'};
        Entry dataEnt;
        Button doneBtn = {'Close'};
        VBox v1 = {false, 10};
        HBox h1 = {false, 10};

        # print out all of the data collected by the window.
        void getData() {
            cout `Entry text: $(dataEnt.getText())\n`;
        }
        
        # handler for the button class - print out the data and quit.
        class DoneBtnHandler : Handlers {
            # note that this creates a reference cycle - we need to do 
            # something to break the cycle if we care about memory leaks.
            MyWindow win = null;
            oper init(MyWindow win0) : win = win0 {}
            bool onClicked() {
                win.getData();
                app.quit();
                return false;
            }
        }
        
        oper init() {
            # arrange all of the widgets
            add(v1);
            v1.add(h1);
            h1.add(lbl);
            h1.add(dataEnt);
            v1.add(doneBtn);

            # set the done handler.            
            doneBtn.setHandlers(DoneBtnHandler(this));
            doneBtn.handleClicked();

            # show all of the widgets            
            lbl.show();
            dataEnt.show();
            doneBtn.show();
            v1.show();
            h1.show();
            show();
        }
    }
    
    app.init(argv);
    MyWindow win = {};
    app.main();

Crack's GTK libraries are really a toy implementation or proof of concept. You can probably implement minimal programs with them, and you can implement larger programs if you care to augment the library.