Tuesday, January 31, 2006

Starting FORTRAN

A free Fortran compiler: G95.

Download and install the binary under "Self-extracting Windows x86." Install to the C:\g95 directory.

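Now, make your first program, to print 'Hello World' ten times. Spacing matters: in fixed-form source (a .f file), statement labels go in columns 1-5 and statements start in column 7:


      PROGRAM HELLO
      DO 10, I=1,10
      PRINT *,'Hello World'
   10 CONTINUE
      STOP
      END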


Save this file as hello.f in the C:\g95\bin directory.
From the C:\g95\bin directory, type the following:
g95 -o hello.exe hello.f

The -o option names the output; without it, the compiler calls the executable a.exe. The compiler will create hello.exe in the same directory. Run it.

Now edit the file and create a syntax error. Change * to *a. Try to recompile and see what happens.

Monday, January 30, 2006

Class I - Homework

  1. Buy the book
  2. Send me an email. JoshWaxman {at} gmail [dot] com.
  3. Implement a Pascal string in C or C++
  4. Write a program that uses logical XOR
  5. Read the first chapter (pg. 9-36)
  6. The Regularity principle, mentioned on pg. 11 in the book, is similar to what concept mentioned in class?
  7. Download and install the G95 Fortran compiler, and write and run the hello world program.

Class I - Preliminaries

  1. Why?
    1. Increased capacity to express ideas.
      What you can say depends on your vocabulary, and a programming language limits what you can express. By knowing other languages, you know more language constructs. These constructs can be simulated in a language that lacks them, e.g. Pascal strings. First homework assignment.
    2. Ability to choose the appropriate language for the task. Otherwise you fall back on what you know.
    3. Ability to learn new languages. If you know data abstraction, constructing abstract data types in Java is easier. Similar to natural language: the better you know the grammar of your own language, the easier it is to learn another. Also, learning a second language tells you more about your own language, e.g. logical XOR. Second homework assignment.
    4. better understand significance of implementation.
      1. helps understand why designed as is --> better use of the language. know which constructs to use.
      2. figure out bugs
      3. understanding efficiency. how recursion works -> why so slow
    5. Ability to design new languages. input format language.
    6. Enhancement of computing via natural selection. e.g. Algol 60 had better control statements and other features than Fortran, but was found difficult to read and understand, so programmers did not appreciate its features.
  2. Programming domains -- produce languages that specifically suit it.
    1. Scientific applications: deal with floating point arithmetic, simple data structures. Most common: arrays, matrices; most common control structures: loops. Competition: assembly, so needed to be fast. Fortran, Algol 60.
    2. Business Applications: produce reports, precise ways of describing and storing decimal numbers and chars, decimal arithmetic ops. COBOL. Also, spreadsheet and database systems.
    3. AI: Symbolic rather than numeric computation. Symbols, consisting of names instead of nums, are manipulated. Linked lists more convenient than arrays for this purpose. More flexibility required -- e.g. to create and run code segments during execution. LISP - functional language. Prolog - Logic programming.
    4. Systems Programming: The OS and programming support tools = systems software. Fast execution (since always running) + low level features. Made special languages for the particular systems: IBM: PL/S, dialect of PL/I. Digital made BLISS, language just above assembly. Burroughs: extended ALGOL. UNIX written in C so easy to port - low level, efficient, easy to shoot self in foot.
    5. Scripting Languages: Batch programming. DOS batch files? sh (shell) = calls to system subprograms. awk = for report generation, then became more general purpose. Perl began as sh + awk.
      tcl - scripting language; tk - toolkit for building X Window applications.
      JavaScript.
    6. Other special purpose languages. RPG - produce business reports, GPSS - systems simulation. Inform - Zork type games.
  3. Lang Evaluation Criteria - What makes a language "good"
    1. Readability - ability to be understood - in terms of problem domain. use ill-suited lang, will be unreadable.
      1. Simplicity
        1. small vs. large num of lang components - learn subset.
        2. feature multiplicity: more than one way to do the same thing, e.g. count = count + 1 vs. count++, which have slightly diff meanings
        3. operator overloading, if misused. + on arrays to return scalar sum rather than vector sum.
        4. but simplicity can go too far. consider assembly: no control statements, no complex data structures.
      2. Orthogonality
        1. small # primitives combined in small # of ways, consistently
        2. e.g. pointers should be able to point to any type.
        3. IBM:
          A Reg1, memory_cell
          AR Reg1, Reg2

          VAX:
          ADDL operand_1, operand_2

          IBM needs separate instructions for memory and register operands; the VAX instruction takes either, so it is more orthogonal.
        4. Analogy: "i before e, except after c." Exceptions to rules are confusing.
        5. lack of orthogonality in C: can return structs but not arrays. structs cannot hold void or struct of same type. (not true in C#, Java) Array el cannot be void or a function. (but can be func pointer) a + b where a is a pointer does different things depending on pointer type.
        6. Too much orthogonality. ALGOL 68. conditionals can appear on left side of assignment, also declarations, etc., so long as result is location.
        7. Functional vs. Imperative lang. Functional, such as LISP - single construct. But efficiency problems. (I would add also readability/writability)
      3. Control statements: gotos vs. while, for loops. spaghetti code. how to restrict gotos: targets nearby, limited number, gotos always precede their targets except to form loops.
      4. Data types and structures:
        1. bool type in C++. timeout = 1 vs. timeout = true.
        2. Lack of structures in FORTRAN 77, req parallel arrays and use same subscript.
      5. Syntax:
        1. identifier forms. Early BASIC: a single letter, optionally followed by one digit, e.g. E or E3.
        2. special words (keywords). some langs, like FORTRAN 90, allow these to be used as var names as well.
        3. end vs end if, end loop, {}
        4. code self documenting such that semantics follows syntax. static has 2 or 3 meanings in C, C++. names such as "grep" require special knowledge to understand.
    2. Writability - how easy to code solution for given problem domain
      1. Simplicity, Orthogonality = less programmer need know, since limited constructs and few limits on how to combine them
      2. Abstraction - allow definition and use of complicated structures and ops allowing details to be ignored. increases naturalness of expression. e.g. function, rather than repeating commands each time with diff variables, creating clutter. data abstraction: example above of parallel arrays. or how to implement a binary tree in fortran vs. in C.
    3. Expressivity.
      1. to make less cumbersome. count++. and then to do short circuit (automatic in C), for loop. syntactic sugar.
    4. Reliability - performs to specs under all conditions
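      1. more important in specific domains - controlling airplanes, medical equipment, etc. - but matters in general.
      2. type checking - original (K&R) C doesn't check the types of function arguments against parameters. C++ does. unchecked mismatches can cause errors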
      3. exception handling - Ada, C++, Java.
      4. Aliasing - two items refer to same entity. can cause problems. but also powerful language feature. references, pointers, point to same location. unions.
      5. Readability/Writability. unnaturalness leads to errors. also need maintain/modify.
    5. Cost - diff definitions
      1. training programmers, paying programmers to write it, time to compile, time to execute (and optimize diff things), price of the compiler/interpreter - Java, cost of poor reliability in human terms and in further business, cost of maintaining.
    6. Other ways of evaluating langs exist. portability, generality (to wide range of applications), etc. and depends on perspective.
Influences on Lang Design:
Computer Architecture: von Neumann, variables model memory cells, assignment models piping data to CPU, and iteration is most efficient, because instructions stored in memory in adjacent locations. Recursion, functional langs inefficient.
Programming Methodologies: cost shift from hardware -> software.
type checking problems, lack of control statements.
then, move from process oriented to data oriented - data structures, then object oriented programming.

Language Categories:
Imperative, functional, Logic, Object oriented.
Logic: rule based lang. instead of specifying exactly how things are done algorithmically, rules specified in any order and lang implementation system chooses execution order. Prolog.

Implementation Methods:
concentric circle diagram - bare machine, Macroinstruction interpreter, OS, C++ compiler, FORTRAN, C, Ada compiler, Assembler, Lisp interpreter, etc., all providing virtual language machine. OS command interpreter.
A) Compilation. diagram
Source program
lexical analyzer ---lexical units--> Syntax analyzer
Syntax analyzer --- parse trees ---> intermediate code generator/semantic analyzer

both lex and syn output to symbol table, which feeds the intermediate code generator/semantic analyzer as well as the code generator

intermediate code generator/semantic analyzer --intermediate code-->code generator
code generator --machine language-->computer
computer takes this and input and generates results.

Linking step to link to system calls (addresses of functions)
Based on von Neumann architecture, a fetch-execute cycle

initialize program counter
repeat forever
  fetch instruction pointed to by program counter
  program counter++
  decode instruction
  execute instruction
end repeat

von Neumann bottleneck
a result of having to transfer instructions and data between memory and the CPU over the bus, which is slower than the CPU can execute them.

benefits: speed
examples: C, FORTRAN.

B) Pure interpretation.
No compiling. benefits - symbol table present. errors easily found and can give line number. can fix. Examples: Lisp, APL, batch files. C, C++, debug mode. Source program and input data go directly to the interpreter.
drawbacks: slow.

C) Hybrid. All steps up to intermediate code, preserve symbol table, but run the intermediate code. Java runs in both hybrid and full compile mode.

Wednesday, January 25, 2006

The Book For This Course

The book for this course is Principles of Programming Languages: Design, Evaluation, and Implementation, by Bruce J. MacLennan. The ISBN for the book is 0195113063.

The book is available for purchase and partial viewing at a number of locations. It will probably cost about $25, plus shipping and handling.

books.google.com has the book scanned, and has a bunch of links to sellers.
Amazon.com also has the book scanned, and offers the book for purchase.
Alibris.com offers the book used, and seems to have the best price, but for older editions of the book.