UMaine COS 301: Programming Languages

Semester Project

(Get PDF Version)

Introduction

This course has a semester-long individual project in which you will learn a new language and analyze it based on what you are learning in the course. The language will be one with which you are not familiar. Part of the learning process will be completing some simple programming assignments using the language.

Although there will be some programming, this is primarily a written project, where you will hone your writing skills. To that end, Parts 2–5 of the project will require you to write substantial papers. The usual process for each part will be:

  1. Write a draft of the assignment.
  2. Submit the draft for peer review/editing.
  3. Revise draft.
  4. Submit second draft.

Note that (2) implies that you will also be engaging in editing others’ papers throughout the semester. This is an important skill in itself, and it also helps with your own writing by making you think about others’.

Also note that the parts overlap! That is, you will be editing others’ papers, then revising your own, for one part while writing the next part.

Part 5 of the project will in addition to having you write new section also have you tie together all of what you have written into one coherent paper which will be your final project document.

You will be graded on each part, as well as on the project as a whole. See the Paper Grading Rubric on the course website for information about how the papers will be assessed.

Part 1: Choosing a programming language

Assigned: 8/28
Due: 9/6

In this assignment, you will select your project language.

After reading Chapter 1 of the text, do some research on the web and select a language. The web sites listed in the syllabus are good starting points. Also select an alternate language in case your first selection is not approved. You must select languages languages with which you have no (or very little) experience. The languages should not be too specialized (e.g., PostScript, SQL) and they must have a real-world purpose (i.e., no joke/toy languages such as WhiteSpace, etc.)

You are strongly encouraged to select a language that is so far removed from your current experience that it will be a true learning experience (or “the weirder the better”). Python, C, C++ and Java are not acceptable languages for this project (even if you are taking operating systems this semester).

Free compilers and/or interpreters are available for almost all languages; make sure that you are able to obtain what you need before committing to a language!

Turn in: For this assignment, submit a PDF or plain text file (no Word, etc., files) via Blackboard that answers these questions:

  1. Which language is your first choice?
  2. Which language is your alternate choice?
  3. Why did you choose these languages? This should be a well-written paragraph or two explaining the rationale for your selection. (This and all written assignments in this course will be graded for style, grammar, and spelling.)

Part 2: Overview and history

Assigned: 9/6
First draft due: 9/20
Edited version due: 9/27
Revised draft due: 10/4

Write a paper that addresses the following topics for your project language:

  • a brief overview of the language; and
  • a brief history of the language.

Some points that might be covered in the above topics are when and where the language originated, major influences on the language, the purpose of the language, main features, unusual features, strengths and weaknesses, acceptance by the software community, future prospects (if any) or reasons for the lack of future prospects. Remember to support your opinions and statements with logical arguments and citations of other authors.

The length of this part of the paper is somewhat up to you. I would suggest you shoot for 5 or more pages, since you will want to cover the language in sufficient detail. If it is too short, and hence, too sketchy or superficial, then you may lose points. Oddly, the difficult part of writing this paper is to keep it short. The goal is to summarize (and highlight the interesting points), not to provide a reference manual for the language. Your audience for this project is your peers in the class, and you should write with them in mind. For example you should assume that your audience knows what a logical operator is and they do not need a description of standard logical operators such as AND, OR, NOT. If however your language supports (e.g.) XNOR or distinguishes strict equality (e.g., ===) from equality (e.g., ==), it would be worthy of mention. Some languages may have dozens of operators. You may want to include lists of operators, data types, etc., in an appendix.

Paper format. You must use the LATEΧ document preparation system to format your paper so that you gain familiarity with this valuable tool. This means that during this part of the project, you will have to learn LATEΧ if you don’t already know it. If you have never used LATEΧ before, you will need to install it on your computer; versions are readily available for all three of the major operating systems (Windows, MacOS, and Linux). Although LATEΧ is a markup language for documents, there are several free WYSIWYG LATEΧ editors available as well, which can be useful for novices.

For information about LATEΧ, including resources for installing and learning it, see the “LATEΧ information” page under the “Course Documents” menu item on the course website.

Bibliography. In this part of the project, you will begin developing an annotated bibliography. This will each work you reference in the body of your paper as well as notes about that reference, in particular, why the reference is useful. You will add to this bibliography throughout the other parts of the project, and with each part, you will turn in the entire bibliography up to that time. You should only include references to works you actually cite in the paper (either the current part or a prior part). Every bibliography entry must have an annotation describing it.

You will use the BibTEΧ program, which works in conjunction with LATEΧ, to process your references, which means you will use a .bib file in which to store bibliographic information about the references, and you will use the LATEΧ \cite macro (or some variant of it) to insert references in your paper. Documentation for BibTEΧ is available online and on the course website (on the “LATEΧ information” page).

For the citations and references, you may use the IEEE citation style, the APA citation style, or the natural sciences style. BibTEΧ has style files for most common citation/reference styles; for instance, the natural sciences style is implemented by the natbib BibTEΧ style file.

Programming. Write a program in your language that interactively obtains two floating point numbers x and y, and computes and displays integer and fractional parts of each number, and the sum, difference, product and quotient (i.e, x+y, x-y, x*y, and x/y). Also test division by zero to find out how your language handles it.

You do not need to be concerned with determining whether input is correct; you can assume that it is. Some languages may not be well suited to interactive input and the assignment may be modified for non-interactive input if necessary.

If you are working with a web scripting language you can use HTML for input. An example form is given below.

Example HTML Page for Web Scripting Languages:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"   
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>COS 301 Fall 2016 Project 1</title> </head>
<body>
<h1>COS 301 Fall 2016 Project 1</h1>
<form id="form1" name="form1" method="get" action="">
<p>
Enter x:&nbsp;
<input type="text" id="textx" name="textx" style="text-align: right;" />
<br/>
Enter y:&nbsp;
<input type="text" id="texty" name="texty" style="text-align: right;" />
<br/>
<input type="submit" value="Submit" name="submit" id="submit" />
</p>
</form>
</body>
</html>

Other issues:

  • Remember that the page count doesn’t count the cover page or the bibliography.
  • Remember to review the paper format checklist on the website before submitting!
  • An amusing and interesting website for programming languages is the “99 Bottles of Beer” site, www.99-bottles-of-beer.net, which has the song implemented in 1500 different programming languages. For popular languages you can usually find at least half-a-dozen different programs varying widely in style. In particular, see the “bottled” version of the Perl program.

Turn in: Turn in your assignment using Blackboard, following instructions found there.

Part 3: Syntax, scope, data types, and operators

Assigned: 9/20
First draft due: 10/11
Edited version due: 10/18
Revised draft due: 10/23

In this part of the project, you will be learning and describing your language. You will write a paper that discusses the following for your language:

  1. Basic syntactic structure, including statement terminators or separators, block structure, syntactic peculiarities, etc.
  2. The units or levels of scope and the nature and type (runtime or compile-time) of name bindings within the different levels of scope.
  3. Primitive data types available, including range limitations or lack thereof.
  4. Operators for primitive data types and their precedence and associativity.

“Primitive” data types are data types that are not defined in terms of other data types. What is primitive in one language may not be primitive in another. For example, many languages will include strings as a primitive data type; however, some languages, such as C, do not (since they are defined as arrays of characters).

Advice on the length of this part of the paper is the same as for Part 2.

Bibliography. As part of this part of the project, you will begin developing an annotated bibliography. This will include works you reference in the body of your paper as well as notes about those references. This bibliography will be added to throughout the other parts of the project, and with each part, you will turn in the entire bibliography up to that time. You should only include references to works you actually cite in the paper (either the current part or a prior part). Every bibliography entry must have an annotation describing it.

Programming. Using your language, write a lexical analyzer (“lexer”, scanner) and a recursive descent parser for a small grammar. The recursive descent parser will simply be a recognizer that determines whether or not a string is in the language; it will not construct an actual parse tree. Note that the calling sequence corresponds exactly to the parse tree. Output should be similar to the output from the example parser in the text (see below).

The grammar is a grammar of Boolean expressions (remember that the meta-character | is not the language’s OR operator):

<bool_expr> ::= <and_term> { | <and_term> }
<and_term> ::= <bool_factor> { & <bool_factor> }
<bool_factor> ::= <bool_literal> | !<bool_factor> |
                 ( <bool_expr> ) | <relation_expr>
<relation_expr> ::= <id> { <relop> <id> }
<id> ::= letter { letter | digit }
<bool_literal> ::= true | false

Constructing the lexical analyzer is actually the more difficult part of this assignment. You can always construct the parser first by creating an input stream of tokens (e.g., 1 = Left Paren, 2 = Right Paren, etc); then create a lexer that simply returns the next integer token. Then worry about actually parsing characters in a “program” later. Include output for the following expressions in an appendix:

foo & !( a2 > bar & w < foo | x < y)
A1 & B1 | A2 & B1 | (! C | A <> B )

Example output from the Sebesta parser:

Next token is: 25 lexeme is (
Enter <expr> 
Enter <term> 
Enter <factor> 
Next token is: 11 lexeme is sum
Enter <expr> 
Enter <term> 
Enter <factor> 
Next token is: 21 lexeme is +
Exit <factor> 
Exit <term> 
Next token is: 10 lexeme is 47
Enter <term> 
Enter <factor>

Turn in: Turn in your assignment using Blackboard, following instructions found there.

Part 4: Data types, expressions, assignment

Assigned: 10/11
First draft due: 11/1
Edited version due: 11/8
Revised draft due: 11/13

Write a paper that addresses the following topics for your selected language:

  1. Data types available in the language beyond primitive data types (such as ints, floats, etc.). This could include arrays, vectors, structures, strings, pointers, objects, and classes. You should include higher-level structures, such as hashs, stacks, etc., if they are part of the language itself and not just part of a standard or other library.
  2. Summary of standard libraries or classes that extend the language’s type system.
  3. Semantics of expression evaluation.
  4. Coercions/implicit type conversions that are performed in expression evaluation.
  5. Semantics of assignment statements.

The same comments about paper length as for Part 2 apply here as well.

Programming. Write a function that accepts an HTML document (a string) and a keyword (also a string). The function will find all occurrences of the keyword in the HTML string after the <body> element unless the keyword appears within an HTML tag, then surround the string found with tags to ``highlight’’ the keyword. For example,

<span style="background-color: blue; color: white">keyword</span>

You will have to be careful not to highlight strings occurring within an HTML tag. For example, if the keyword is ``table’’, you wouldn’t want to mark up this:

<table width="100%" border="0">

The result would hardly be valid HTML!

There are at least two approaches to this problem:

  1. String processing. Treat the HTML as one long string and program a lexer that returns the next chunk, which would either be a tag, text, or a comment. (An HTML comment looks like this: <!--hi there-->.) If it is a tag or a comment, then just append it to the output. If it is text, then search the string for the keyword and apply the new markup if found.

    This may not be the best way to approach the problem. HTML and XHTML can be quite complex to parse. In addition, if we were really doing this right, the text between certain tag pairs would need to be skipped, for example, <script></script>, etc.; we won’t worry about that for this assignment, however. The approach also treats one abstraction (HTML) as another (a string), which may not be the best approach.

    If you do use the string approach, don’t start within the <head> tag, but start searching after the <body> tag.

  2. DOM document processing. An HTML document is handled by browsers and other document processors by transforming the text into a tree structure defined by the W3C Document Object Model (DOM). For our purposes, each node in a DOM document is either an HTML element, a comment, a script, or text. (We’ll ignore other things, like CDATA nodes.) To handle string markup, traverse the DOM tree and check the text nodes. You can also use an XPath expression to do actual string search and return a set of nodes. Either way, if you find a string to mark up, remove the text node, then create and append a new subtree (left text, markup with child text, and right text).

    Although this sounds difficult, it’s actually pretty easy to do with the DOM parser libraries that can be obtained for virtually any language. Just search the Web for “[language name] DOM parser”.

You can use either approach. If you are not familiar with HTML and the DOM, then the string processing method is likely easier in the short run. However, the DOM approach is much easier in general, and if you make the effort to understand the DOM, you will acquire a foundation of skills that will be useful for a long time to come.

Write a program that demonstrates and tests your function. If your language is a web scripting language, you can simply use a form that posts to your program, which will then output the HTML back to the browser. Or you can write a program that reads data from an HTML file and outputs the revised HTML to a different file.

Turn in: Turn in your assignment using Blackboard, following instructions found there.

Part 5: Control constructs and subprograms

Assigned: 11/1
First draft due: 11/20
Edited version due: 11/27
Revised draft due: 12/1

In this part of the project, you will write about: (1) control flow constructs in your language (conditionals, loops, gotos, etc.); and (2) how subprograms are defined and called, including scoping rules, parameter passing, recursion, and whether subprograms themselves can be passed as parameters.

Programming. Write two versions of a non-trivial program in your language, one using iteration and the other using recursion.

Turn in: Turn in your assignment using Blackboard, following instructions found there.

Part 6: Conclusion/evaluation

Assigned: 11/27
PROJECT FINAL DRAFT DUE: 12/8

In this part of the project, you will also combine all prior parts into a coherent paper about your language. This is a chance to redeem yourself on prior work if you didn’t get a grade you liked. You will turn in your entire project as a single paper. There will be a single abstract, and each piece will now be a section, what were sections will be subsections, etc. (Note: in LATEΧ, this is pretty easy – if \subsection is your current deepest level, find and replace all of those with \subsubsection. Then find \section and replace with \subsection, etc.) The conclusion will be a new section. Your complete annotated bibliography will be at the end. Any appendices (other than code) will come after the bibliography.

You may either just reuse what you turned in before with no changes to the writing, or you may edit what you turned in and improve it. You will let me know on the title page which you’ve done, either a note saying ``as is’’ or ``edited’’ for parts 1–4.

Programming. None. You do not need to turn in prior programs again, unless you need to because you discuss them in your paper.

Turn in: Turn in your assignment using Blackboard, following instructions found there.