Thursday, February 23, 2006

Class Notes - Data Types - 1

Lifetime
Not same as scope. Example – subprogram in C where exists but hidden. Static in C.

Symbolic constants
Why? Readability. Reliability. Ease of changing the program. We saw in the program where can be useful. Two types – lvalues and rvalues. Const vs. #define.
Pascal - simple, Modula-2: complex expressions. manifest constants –simple due to static binding of values.
C++, Java, Ada: constants based on variables – bound at runtime

Initializing Variables
We know C++. int x = 6; FORTRAN has DATA statement
REAL PI
INTEGER SUM
DATA SUM /0/, PI /3.14/
Some langs do not have init statements. Just do assignment. Pascal.
When is this initialization done? Static vs. auto in C.

ALGOL 60
First, ALGOL 58, designed by international committee. 3 goals: close to standard mathematical notation; use to describe computing processes in publications; able to translate to machine code. JOVIAL, MAD, NELIAC. In general, taken to be guidelines for developing langs. IBM interested in promoting FORTRAN.
ALGOL 60, Peter Naur described using BNF, which while elegant, was not well understood back then.
Contributions: many langs based on this. Example: pascal, ALGOL 68, Ada, C++, PL/I, Java. Was only accepted way of describing algorithms. Influenced machine architecture to implement the concepts. Advances: block structure; pass by value and pass by name. recursive procs. Stack dynamic arrays. (now they have them in C, I think). But lacked standard IO = difficulty porting. People already using FORTRAN.

I posted before an initial ALGOL reading – 95 – 121. A summary of material covered there – Algol 58, use of BNF (with further discussion in next chapter), Algol 60. Blocks. Though ALGOL has static scoping, a discussion of dynamic scoping. Pg 118. generalized arrays by allowing lower bounds, dynamic arrays. Pg 119 – strong typing. Look thru the code.

Now 121-140. A lot of what we discussed. Pass by name, value. Recursive procedures. Introduction of for loop with step – very general. The switch.

months
For days := 31, 28, 31, 30, 31, 31, 31, 30, 31, 30 do
For days := 31, if (mod(year, 4) = 0 then 29, else 28, 31, 30, 31, 31, 31, 30, 31, 30

Homework: Print Oscillating sum
Homework: Set up Prolog

Data types:
Early on, model linked list, binary tree, with arrays. Think. How would do it?
COBOL – allowed specification of precision, structured data type for records. PL/I – extended precision to ints and floats. ALGOL 68: primitives and structure-defining ops. (examples of ops – in C: {}, (), *, []) User defined data types – able to associate name w/ data type. Abstract data types – use of a type differs from internal representation and ops available on type
Homogeneous (array) vs. heterogenous (record).

Primitives – not based on other types. Ints reflect storage in hardware. Others – in software.

Numerics – integers: signed (one’s complement, two’s complement), unsigned, short, int, long. Floats: 1 sign bit, 8 bit exponent, 23 bit fraction = 32. Double: 1/11/52 = 64.
Decimal: use byte or nibble per digit. Decimal point does not float. Boolean types. As bit, byte, int. Char type: ASCII, Unicode (2 byte).

String Types: might be built in or built upon array of char. Do we allow array ops on it. Should they be static or dynamic length?
Not primitive: C++, Pascal, Ada.
In Ada: array of char, can reference a single char. Various ops built in.
substr: name1(2:4)
& for concatenation. As in VB6.
C++: ignoring the stl. After all, literal strings are arrays of char, null terminated. Built in library functions – strcpy, strcat, etc.
FORTRAN 70+: they are primitives. Have assignment, relational ops, concatenation, substring.
Java: primitive type. String class: holds constant strings. StringBuffer. Why this is really annoying. String comparisons in Java.
SNOBOL4, Perl, Javascript: pattern matching string op. many other langs: as functions.
In SNOBOL4, patterns can be assigned to variables.
LETTER = ‘abcdefghijklmnopqrstuvwxyz’
WORDPATTERN = BREAK(LETTER) SPAN(LETTER) . WORD
Means skip until find a letter, then span letters, then assign (the .) to the variable WORD
TEXT WORDPATTERN
Perl: Based on regular expressions. /[A-Za-z][ [A-Za-z]+/ /\d+\. ?\d* | \.\d+/

static vs. dynamic string lengths
FORTRAN 90:
CHARACTER (LEN = 15) A, B
Always full, empty chars are blank.
C, C++: limited dynamic. Except for STL
SNOBOL4, JavaScript, Perl: dynamic
Talk about benefits, drawbacks of each scheme

Implementation
Static: at compile time: name of type, the length, the address
Limited dynamic: store capacity, current length. (though C++ needs this not)
No need for dynamic allocation.
Dynamic: as linked list, but have many pointers. Dense list. But what if space afterwards is claimed – must realloc and copy. We can use a mixture of the two approaches.

User-defined Ordinals.
enum. C, Ada
Design: In Pascal, a given enum val (e.g. January) not allowed to be used in more than a single enum, cannot be input or output, but can be assigned, compared, used in for loop.
Some languages allow overloading of these literals, but then need to specify type:
e.g. type letters is (‘A’, ‘B’, ‘C’, ‘D’ ….
type vowels is (‘A’, ‘E’, ‘I’, ‘O’, ‘U’)
for letter in ‘A’ ,,. ‘U’ loop
for letter in vowels(‘A’) ,,. Vowels(‘U’) loop

Ada’s BOOLEAN and CHARACTER are predefined enums

Ops: pred, successor, position. Pascal: pred(blue) is red. Ada, as attributes: Letter’Pred(‘B’) is ‘A’

Improves readability, reliability

Subrange types – continuous subsequence of an ordinal
Pascal:
type
uppercase = ‘A’ .. ‘Z’;
index = 1 .. 100;

In Ada, called subtypes. Assuming already defined an enum:
Subtype WEEKDAYS is DAYS range Mon .. Fri
Subtype INDEX is INTEGER range 1 .. 100

Ada allows assignment from parent to subtype unless it is not defined, so might assign Tuesday but not Sunday.
Recall that diff from Ada’s type, where cannot assign from parent to derived type.

Implementation: As parent types, but with range checking for each assignment.

Array Types

No comments: