Thursday, March 30, 2006

Guarded Commands

See here.

Subprograms

Access the pdf here.

Tuesday, March 28, 2006

Iterative Statements

FORTRAN I: do was posttest.
DO label var = init, terminal, stepsize
stepsize defaults to 1
unsigned int constants or simple positive int variables

FORTRAN 90: DO statement is pretest. fortran 77: allows loop var to be int, real, double.
loops can be expressions, with neg and pos values. eval at beginning of DO to arrive at an iteration count. It is this value, not the loop parameters, that control the loop. so one could change them.

no multiple entry, yes multiple exits.

ALGOL 60: a case in flexibility
[for_stmt] -) for var := [list_element] {, [list_element]} do [statement]
[list_element] -) [expression]
| [expression] step [expression] until [expression]
| [expression] while [Boolean_expr]

See this website: http://logos.cs.uic.edu/476/notes/overheads/ch7controlstructures.pdf

Pascal For Statement
for
var = init (to | downto) final do statement

it grows or shrinks in steps of 1. loop var cannot be changed in the body.
after loop, loop var undefined. loop var is an ordinal, with scope of the loop.

init and final checked at the start, and not after, and so may be changed in the loop.
why do this? efficiency, reliability?

Ada For statement
for variable in [reverse] discrete_range loop
...
end loop

and note that loop var has loop scope, will mask variables with wider scope.

C, C++:
We know this.
for (int i = 9; i < style="font-weight: bold;">
any of the elements in the for may be missing.
scope until end of function.

Logically Controlled Loops (rather than counting loops)
posttest, pretest?
should it be special form of a counting loop?

examples of pretest, posttest in C++, Java.
do while vs. while.

how would we implement them (=operational semantics)

FORTRAN 90: has neither.
Ada: pretest, no posttest. why?
Perl: while and until.
how so? see here: http://vergil.chemistry.gatech.edu/resources/programming/perl-tutorial/control.html

  •   while ($food < $stomach_ache) {    $plop++;    $fizz--;    $oh_what_a_relief_it_is *= 5;  } 
  • Like the unless statement above, it is easier to say "until something is true" (Schwartz 61). Thus, Perl provides the until statement:
      until ($x > 6) {
    print "$x is the number!\n";
    $x++;
    }
  • Like C, Perl lets you do do, while loops, but Perl also adds do, until loops:
      # do while example
    do {
    $junk++;
    } while ($junk < aok =" @array_hooray[$x];">

and of course the for loop.
might as well talk about Perl conditionals here as well:
  • if and else statements are used as conditionals as in any other language. For example:
      if ($cow < $sheep) {    print "baaah....\n";  } else {    print "moo....\n";  } 
  • unless statement is specific to Perl, in which you want to always just evaluate the false part instead of doing something when not true. For example:
      unless ($age <>
Pascal has a repeat...until statement, which is the reverse of C's do...while, same as Perl's do...until.

Think about it - when we have break statements, they are the reverse of a standard while conditional. repeat...until maintains that.

Post-tests not freq used. dangerous because loop always executed at least once, and programmer might forget this. put after loop body helps make this a bit clearer.



to be continued...

Thursday, March 23, 2006

Control Structures

Control Statement = control statement + statement(s) it controls
control execution path
without, all have are assignments and evaluating expressions.
two types: selection (among multiple execution paths) and repeated execution.
basically, if and while. but can add more = greater writability, but programmers only learn subset of lang.

how many statements do you allow in the control statement?
one or many. blocks. otherwise, restricted to single statement, or use of a goto. control structures can replace goto.

another q: can we enter into the middle of control structures (multiple entries) - langs differ. in terms of whether can exit multiply, not such a distinction since generally safe to do that - but there is a design philosophy.

Selection
generally 2 way vs. n way selectors (1 way "degenerate" form of the two way. fortran's 3 way selector).

design issue: what type of expression controls the selector? can we select more than a single statement? do we handle nested selection?

Two way: Logical If:
FORTRAN IV
IF (BOOL expression) statement
simple, inflexible. how do you get multiple statments? with goto circumventing.

IF (FLAG .NE. 1) GO TO 20
I = 1
J = 1
20 CONTINUE

but points off for lack of readability.

How about multiple entry? Yes. How?

ALGOL 60 and on (=Java, FORTRAN 90): two-way selector:
if (bool expr) then
statement
else
statement

only one or the other is executed.

and either of these can be compound, in which case it is the then clause and else clause, rather than statement.

how do we generate the degenerate 1-way if? how do we generate the two-way?

Nesting Selectors
Ambiguity.
if (a == 1)
if (b == 2)
result = 5;
else
result = 1;

Python: semantic interpretation based on indentation. other languages, no.

In Java, most other imperative langs - match most recent, so must indent the else.
by a rule, rather than in the syntax. but will the programmer know?

ALGOL 60 - by Syntax rather than by fiat (rule).
no if nested in another if. must nest in a begin/end clause.

How would we then write the code to get each interp? (Use {} as a shorthand.) Can get either effect in Java, but to match first if rather than second, must use {}.

Perl: then and else clauses must always be compound.

in C++, Java, the introduction of the else clause, or else by compounds, {}, marked the end of the statement or compound, and where this was a single statement with no immediate else, was ambiguity. And difference between marker of end of clause, if it is a compound of single statement.

In Ada, VB6, have end if;
how does this help?

Because "end if" closes the nested if, shows which one it matches. how do we write each in Ada?

Multiple Selection
generalization of 2-way, and can implement 2-way as multiple.

design issues
do we allow compound statements, type of expression (2 way selector needed only 2 vals), can we select more than one statement/compound? what if does not eval to one of specified vals?

FORTRAN I - the three-way selector: IF(X) 10, 20, 30

and here can go really anywhere - need not even immediately follow if, or each other. user responsible for gotos, and can omit with compiler not detecting.
Also in FORTRAN I
computed:
GO TO (label1, label2, labeln) expression

assigned:
GO TO var_or_const

C.A.R. Hoare, in ALGOL-W (1966)

case int_expression of
begin
statement1;
...
statementn;
end

and could be compounds.

Pascal similar, except
statement_i becomes:
constant_list_i: statement_i;

yet we cannot say goto caselabel in Pascal.

later Pascals: an else clause, = to default
no fallthru.

in C#:

      switch(n) 
{
case 1:
cost += 25;
break;
case 2:
cost += 25;
goto case 1;
case 3:
cost += 50;
goto case 1;
default:
Console.WriteLine("Invalid selection. Please select 1, 2, or 3.");
break;
}
and they will not allow you to omit a break of a goto.

does not disallow multiple entry. and has fallthru. unless use break, which basically is a restricted goto.
but what if forget the break and get unwanted fallthru?

reliability vs. flexibility.

In Ada, allow ranges: 10..15, as well as 5 | 6 | 19, and have an others clause

this is good for basing on single value at the top. but sometimes, based on various boolean conditions. must use nested ifs. poor readability. some langs (FORTRAN 90, Ada, Perl) give way around this.

elsif vs a bunch of nestings and endifs

Iterative statements
execute statment 0, 1, more times
all prog langs.
sometimes by iterative constructs, sometimes by recursion.

at first, directly related to arrays. why? that was the problem domain.

design questions: how iteration controlled, where in the loop is the control mechanism?

body, pretest, posttest.

counter-controlled loops: inital, terminal values of loop var, + step size.
logically controlled loops more general, but not necessarily more commonly used. could be more complex to implement that counter-controlled.
often supported by machine architecture. sometimes lang features outlive machine architecture (Fortran's 3 way if)

more design qs:
type, scope of loop var. what value does it have at termination, can we change loop vars in the loop (what are effects of this?), are loop params eval once, or for every iteration (let us say the terminal condition changes)

FORTRAN I: do was posttest.
DO label var = init, terminal, stepsize
stepsize defaults to 1
unsigned int constants or simple positive int variables

FORTRAN 90: DO statement is pretest. fortran 77: allows loop var to be int, real, double


Thursday, March 16, 2006

Expressions, Assignment Statements

Overloaded Operators:
operator used for more than one purpose. writability vs. reliability. is this so? depends on how use it.

op overloading in C++. give me an example (<<, >>, +).

by language def, by user.

APL and SNOBOL have binary, unary versions of almost all operators, leads to confusion.

C++: & operator, as address of and bitwise and.
unrelated functions. similarly, * operator.

general problem of reusing symbols. comments, references, etc.
a /*p

unary, binary -.

"distinct symbols increase readability." is this so? + for string appending.
when used correctly, aids in readability.
add(matmul(A, B), matmul(C, D)
vs.
A * B + C * D
on the other hand, need know types of both operands to know meaning of operator.

example from Pascal, vs. C++: find sum of bunch of ints.
avg := sum / count;
avg = sum / count;

in Pascal, / specifies float result of division between two ints. VB as well, / vs
Type conversion
Narrowing vs. Widening conversion. widening generally safer.

mixed mode expressions = op has operands of different types. coercion to appropriate type where operator not defined on that type. also, there are explicit conversions via typecasting.

Diff langs disagree on whether to do coercion.

int a, b, c; float d;
a = b * d; // as an error for b * c

case in flexibility vs. efficiency:
PL/I - string can be combined with an int. string is scanned for numeric value, and if a decimal point, assumed to be a float, and coerce other operand to float. Coercion, and choice of coercion, at run time.

reduce flexibility in favor of reliability
Ada:
cannot mix int and float in arithmetic expression except for **, exponent.

most others, no such restrictions.

Java (byte, short), C++ (short, char). operators such as + first convert to int, and then convert back on assignment, even though both operands of the same type.

Explicit type conversion
Ada:
AVG := FLOAT(SUM) / FLOAT(COUNT)

C:
(int) angle
also C++'s function-like casting

errors can result from coercions. divide by zero when divide by a float.

Relational, Boolean Expressions
for the most part, similar. sometimes slight differences. FORTRAN used .EQ. and .NE. because symbols not on punch cards.
Ada uses = for comparison. /= for not equal.

relational given lower precedence than arithmetic, so that can say
a + b > c + d

Boolean expressions:
In Ada:
**, abs, not
*, /, mod, rem
unary +, -
etc.

lowest:
and, or, xor, and then, or else

these ops non-associative. A > B and A < k =" 0"> b > c.
Python handles this interestingly.

x <> z

This is interpreted by Python as::

if x <> z:
return y > z # 1 for plain comparisons
else:
return y > z # 0 for plain comparisons
else:
return x <>

Note that this requires testing the truth value of the result of
comparisons, with potential "shortcutting" of the right-side
comparison testings. In other words, the truth-value of the
result of a comparison determines the result of a chained
operation.

Which brings us to Short-circuiting.
When know answer already, can ignore rest.
(13 * a) * (b / 13 - 1)
what if a is 0? But not easily detected. By boolean, already considering truth value.
(a >= 0) and (b < style="font-weight: bold;">speed.
can use this shortcircuiting to your advantage.
if (index in range and a[index] > 9)
so as to prevent an error.

on the other hand, sometimes will not expect that it is shortcircuiting, so issue of reliability. calling a function, incrementing a variable.

how decide: FORTRAN 77: up to lang implementor, if part is a func which assigns to a nonlocal variable, that nonlocal becomes "undefined"
difficulty in detecting these side-effects of functions.

In Ada: and then, or else does short-circuiting.

In C++: &&, || do short-circuit. If wish to avoid short-circuit, force 1 and 0 results and use bitwise operators.

Assignment

A = B

what about A = B = C?
depends. in C, yes. In PL/I, Basic, Visual Basic, = also does comparison, so would assign truth value of B = C to A.

ALGOL 60: := for assignment.

Mult targets:
PL/I
: Sum, Total = 0.
other langs, e.g., C, do not.

Conditional targets
?: to give an lval rather than an rval, in C++.

Compound Assignment
a = a + b
a += b

C, C++, Java.
interesting feature of C# that defines these for you.

Unary Assignment
in C, C++, Java, etc. ++a.
when apply two unary ops, one on left side, one on right, works right to left:
- (count ++) even without parens

Assignment has an expression value
in C.
while ((ch = getchar() != EOF)

Mixed mode assignment
Do the source, target need be same type? FORTRAN, C, C++: coercion.
Pascal: depends - int to real, but not real to int.
Ada, Modula2 - nope.

Java: differs from C++ - only if widening results. increases reliability.

Tuesday, March 07, 2006

Data Types - III

Associative Arrays

Perl:
%ages = ("Josh" => 10, "John" = 32);
$ages{"Josh"} = 29;
// and assignment will create an element
delete $ages{"John"};
// or the entire hash:
@salaries = ();

Here is a Perl tutorial on hashes. Read through it. keys and values operator/collections.

    for my $key ( keys %hash ) {
my $value = $hash{$key};
print "$key => $value\n";
}
exists keyword, or else will create the element.
if (exists $ages{"Peter"} ) ...

Possible Implementation: Read this website about implementation of Perl hashes. Linked Lists - but then O(N). Could turn each string into unique index, and use array, but then very sparse. Can use a hash function. Handle collisions how? Better hash function, or else chaining. (Alternatively, red-black trees.) Read article for more info.

C++: Map, multimap. C#: Dictionary, etc.

Record:
Heterogeneous data structure.
First, COBOL. (Not in early Fortrans.) C's struct. C++'s class. Java has class, not struct. C# has both, but with different meanings (struct is treated as a primitive, unboxed).

COBOL:

01 EMPLOYEE-RECORD.
02 EMPLOYEE-NAME.
05 FIRST PICTURE IS X(20).
05 MIDDLE PICTURE IS X(10).
05 LAST PICTURE IS X(20).
02 HOURLY-RATE PICTURE IS 99V99.

Pascal, Ada, use record nesting rather than relative numbers.


EMPLOYEE-RECORD :
record
EMPLOYEE-NAME :
record
FIRST : STRING(1..20);
MIDDLE : STRING(1..10);
LAST : STRING(1..20);.
end record;
HOURLY-RATE : FLOAT;
end record;
Referring to subelements:
COBOL: MIDDLE OF EMPLOYEE-NAME OF EMPLOYEE-RECORD
Others langs: dot, or -> for pointers to structs. Fortran: %

elliptical names - as long as unambiguous, can omit middle stuff.

with in Pascal, VB6. Can do this in C++ using references.

Record Ops
Assignment. types must be same. Ada: comparison for =, <>, initialization with literals.

COBOL: MOVE CORRESPONDING
if subfields have same name, copy it.

Implementation. adjacent memory, and field names associated with offset. descriptor has, for each field: Name, type, offset.

Unions
type that stores diff type values at diff times.
Q: Should we provide type-checking for current type, dynamically? Should unions be embedded in records?

free unions give programmer complete control, no type checking.
Pascal's unions, Ada's unions: have a tag or discriminant that tells current type.
more about this, but can read up on it. Ada adds constrained union such that in declaration, can restrict to one specific type, and thus do static type-checking.

Eval: flexibility vs. reliability.

Implementation:
Use same address for all members of union. enough storage for the largest.

Sets
C++ has but not exactly the same.
Until then, Pascal alone among imperative langs. max size implementation dependent, often < style="font-weight: bold;">type colors = {red, green, blue};
colorset = set of colors.
var set1, set2 : colorset;

set1 := [red, blue];

if ch in ['a', 'e', 'i', 'o', 'u'] ...

Check out ISETLW.

Eval
Ada, based on Pascal, but did not provide sets. rather, added membership op on enums.

other langs, use arrays and implement ops. unweildy, less efficient.

Impl
stored as bit strings. 0 if present, 1 if not present.

(C++ has a bitset.)

Pointers
We know about pointers, dereferencing, assignment to local, off heap, etc.

some problems with pointers:
dangling pointers - points to deallocated heap variable. and what if that memory claimed by s/t else?
lost heap-dynamic variable - lost in terms of no longer accessible. memory leak, vs. garbage collection.

Pascal: create new vars with new and elim with dispose, which sets all pointers pointing at this object to nil. implementation dependent.

Ada: access types. auto dealloc when ptr goes out of scope. elims problems caused by reliance on explicit dealloc. though gives deliberately unweildy alternate method.

C++, Java: ref types.

Handling dangling ptrs:
tombstones. ptrs only point to tombstones. tombstone points to actual memory. when deallocate, tombstone remains and is set to nil. costly in time and space. storage never reclaimed. no prog lang uses.

garbage collection vs. reference counters: lazy, eager approach.

reference counters - keep track how many things pointing to this, when ref count reaches zero, can be reclaimed.

garbage collection - when run out of space, or can explicitly invoke. how do we know when something not pointed to? keep track of each ptr. named ptrs, and following named ptrs as far as they go.


Next: Expressions, Assignment Statements
unary, binary, ternary ops.
usually Infix.

a + b * c

right to left, left to right, or operator precedence.

associativity to left, to right: a - b + c - d

use of parentheses.

side effects -what else happens as a result.
a + fun(a)
functional side-effects
FORTRAN only allows (=legal) if function does not change value. but s/t hard to determine, based on pointers, globals.

operator side-effects
b = a++ + a++;
reference point, e.g. assigning twice between two reference points.


to be continued...

More Prolog

Here are some Prolog programs (written in Strawberry Prolog) and their output:

program

father(avraham, yitzchak).
father(yitzchak, yaakov).

ancestor(X, Y) :- father(X,Y).
ancestor(X, Y) :- father(X,Z), ancestor(Z, Y).

?- father(yitzchak, X), write(X), write("\n").

Output:
Compiling the file:
c:\father
0 errors, 0 warnings.

yaakov
Yes.

program
add1(X, Y) :- Y is X + 1.

?- add1(4, 5).

output
Yes.

program
add1(X, Y) :- Y is X + 1.

?- add1(4, 6).

output
No.

program
add1(X, Y) :- Y is X + 1.

?- add1(4, X), write(X), write("\n").

output
5
Yes.

program
add(X, Y, Z) :- Z is X + Y.

?- add(4, 5, X), write(X), write("\n").

output
9
Yes.

program
add(X, Y, Z) :- Z is X + Y.

?- add(4, "hello", X), write(X), write("\n").

output
4hello
Yes.

program
add(X, Y, Z) :- Z is X + Y.

?- add(4, Y, X), write(X), write("\n").

output

Compiling the file:
c:\father
0 errors, 0 warnings.

Runtime Error at step 3:
The argument 2 of "+"
is "var" but has to be "str" or "int" or "float".

No.

program

add(X, Y, Z) :- Z is X + Y.

?- add(4, 5, 9), write(X), write("\n").

output
_9
Yes.

program
father(avraham, yitzchak).
father(yitzchak, yaakov).

ancestor(X, Y) :- father(X,Y).
ancestor(X, Y) :- father(X,Z), ancestor(Z, Y).

?- ancestor(avraham, X), write(X), write("\n").

output (selecting Run, then Next Answer)
Compiling the file:
C:\father
0 errors, 0 warnings.

yitzchak
Yes.
yaakov
Yes.

program
abs(X, Y) :- X = Y, X >= 0.

?- abs(4, T), write(T), nl.

output

4
Yes.

program
abs(X, Y) :- X = Y, X >= 0.

?- abs(-4, T), write(T), nl.

output
No.

program
abs(X, Y) :- Y = X, X >= 0.
abs(X, Y) :- Y = 0 - X.

?- abs(-4, T), write(T), nl.

output
4
Yes.

program:
append([], L, L).
append([H | T] , L2, [H | L3] ) :- append(T, L2, L3).

1) empty list + L = L
2) recursion step
appending a list [H | T] to any list L2 produces the list [H | L3], on condition that L3 is formed by appending T to L2.

try out an example.

Wednesday, March 01, 2006

Prolog Compilers, Interpreters

There is one that actually runs in your browser window if you have a Java-enabled browser. Click here, scroll down to "the distribution," and click on the button labelled "Start W-Prolog" to start the prolog interpreter Applet.

Copy the text of the facts and rules from the previous post, and run a query to see if terach is the father of yishmael.

There are some free compilers and interpreters here.

Prolog - I

Facts, rules, and goals.

father(terach, avraham).
father(avraham, yitzchak).
father(avraham, yishmael).

ancestor(X, Y) :- father(X, Y).
ancestor(X, Y) :- father(X, Z), ancestor(Z, Y).

try this out. how would you write a brother rule?