Laboratory Manual for Compiler Design Robb T.
Keep all of your work for this course in this folder. Copy the folder Lab 01 from the Compiler Design CD to your folder.
The purpose of this is twofold. You will be more aware of the setup that we will be using, and you will be able to set up the same software on your own computer. You may use the DOS command window if you want, but I think it would be better if you gained experience with a UNIX-type system. An exception to this will be editing source text. The UNIX editors are truly awful. Even though there is some benefit in learning how to use them, I recommend that you use CodeWarrior to edit your Java source files in this course.
Cygwin creates a UNIX-type environment for Windows. A large number of the standard UNIX commands are available. Cygwin has already been installed on your computer. You will see the Cygwin icon on the desktop, so we will not download it now. When you run the setup program, one of the familiar installer programs will start up, asking you several questions. Generally, you should go with the defaults. You will get a minimal install plus the development tools. The program will go through three stages: downloading, installing, and executing. These stages should take roughly 25 minutes, 10 minutes, and 1 minute, respectively. If the installation fails, then try again.
After Cygwin is installed, double-click on the Cygwin icon on the desktop to start the Cygwin window. Then right-click on the title bar and select Properties. You may change the font, the size of the window, and the colors. My preference is to choose a small font (12 pt) and then make the window as wide and as tall as possible.
Type the command pwd (print working directory) to see the pathname of the current directory. Choose the name of one of the subdirectories in the current directory and type the command cd directory-name, where directory-name is the name of the subdirectory that you chose. Now repeat the commands pwd and dir.
Type the command cd.. For example, to move up two levels and then down to a directory called programs, you could type cd.. Use pwd to see what it is.
You will find it very convenient to set the HOME environment variable to your Coms 480 folder. Then Cygwin will always start there when you open a Cygwin window, and you can always return there by typing cd. To make this the home directory, bring up the System control panel. Click on the Advanced tab and then click on Environment Variables. In the top section, named User Variables, click New. Enter HOME as the Variable name and type the exact pathname of your Coms 480 directory as the Variable value. If you open a window to that directory, then you should be able to copy and paste the pathname. Be sure to use the backslash as a separator between directories. Then click OK on all three windows to save the settings.
Now close the Cygwin window and open a new one. This is necessary in order to reinitialize the environment variables. This window should have opened to your folder. You can confirm that by typing pwd. From now on, Cygwin will begin in this folder.
Follow the instructions through the next two web pages. Do not download the Sun Download Manager. When you are finished, there should be a Java folder in the Program Files folder of the C drive. Inside one of the subfolders is the javac. It is in the Lab 01 folder that you downloaded. Type the command javac Hello.
Open the System program on the Control Panel and go to the Environment Variables window again. This time we must add a PATH variable. The PATH variable tells the computer where to find executable files, including the Java compiler. You may have to search for the Java compiler. Indeed, there may be more than one on the computer. We will use the one in the folder named C:\Program Files\Java\jdk1. Once you know where it is, create the PATH variable with this pathname as its value, as you did the HOME variable earlier. Close the Cygwin window and open a new one.
Now try again to compile the program. This time the program should compile. Type dir to see that the file Hello. This is the compiled Java program.
If you know C++, then you should have no trouble picking up Java, since it is quite similar. The two languages use mostly the same keywords, the same constructs, and the same syntax. One major difference, however, is that Java is heavily object-oriented. Every function must be a member function of some class. Another difference is that Java comes with an extensive library of classes.
You will want to return here many times later in the course. This web site contains the documentation for all Java classes. For example, in the upper left frame, scroll down and click on java. In the window below, the names of all the classes in the java. To the right you see the information about the Integer class.
Another difference between Java and C++ is that Java is weak on operators. Operators are defined only for the primitive types: int, float, double, char, etc. Remember, there are no operators + or * for the Integer class.
1. GETTING STARTED
This is the web site for JLex, the Java lexical analyzer generator. Click on Installation Instructions. Read these instructions carefully. Create a subfolder named JLex. You can either do this in Windows or you can type the command mkdir JLex. When you are ready, click on Source Code and save it in your JLex folder. The Java code for JLex Main. Move to the JLex directory and compile JLex by typing javac Main.
On the web page, click on Sample Input. Copy and paste the contents of the page into a new file in CodeWarrior and save it in Lab 01 as sample. This file must be edited slightly in order to work on our PCs. In UNIX files, each line ends with only a newline character. Go to line 116 in sample. Now type java JLex. Main means the Main. The program javac could not find the file Main. To solve this problem, we must define the CLASSPATH variable.
The CLASSPATH variable tells the Java compiler where to look for Java source code files. Set this variable as before, setting the value of CLASSPATH to. Be sure to replace your-name with the name of your workspace in the Students directory. Note the initial dot. This refers to the current directory. Thus, Cygwin will search the current directory first. The semicolon is a separator.
This creates the file sample. Then type javac sample. You will get an error message (7 of them) about the assert keyword. This program creates an assert function, but the latest versions of Java use assert as a keyword. Go back to the file and change assert to myAssert. Then recreate the file sample.
Now we can test our program. Type java Sample. Enter various lines of C code and see what the output is. When you are satisfied, type CTRL-Z as many times as necessary to indicate end of file. The program should terminate.
CUP stands for Constructor of Useful Parsers. It is also a play on the Java theme. We will save the CUP files in a CUP directory. Create a CUP directory now as a subdirectory of your Coms 480 directory. This is the most recent stable version. Then click on Save. Save it in your CUP folder. This will download a zip file to be unzipped. All the files needed for CUP will be extracted. Double-click on the file java_cup_v10k. Follow the Zip instructions. You should save the extracted files in your CUP folder. The Java classes in CUP are already compiled.
We will now test CUP using a slightly modified version of the sample program that appears in the CUP User's Manual. In the Lab 01 folder, there are the files scanner. The filename java_cup.Main refers to the file Main. Java does not know to look in the CUP directory for Java class files. Therefore, we must add the pathname of the CUP directory to the paths to be searched in the CLASSPATH environment variable. Make this change, close the Cygwin window, and open a new one. This will create the Java source files parser.
Compile these two files and then compile the files scanner. Run the evaluator program by typing java Evaluator. The program accepts keyboard input. Type in an integer expression such as 2 + 3 * 4; Be sure to end the expression with a semicolon. The program will print the value of the expression. When you are finished, type CTRL-Z.
Rather than drag each one individually to the dropbox, you will place your work in a folder, zip the folder into a zip file, and drop the zip file in the dropbox. I will unzip it and test it. We will zip the files used in the last two examples, namely, the files Sample. To do this, start up WinZip and follow the instructions. Use the Wizard version of WinZip. Select Create a new Zip file. Give it the name Lab 01. Add the specified files by repeatedly clicking Add files. Have the output directed to your Coms 480 folder. Then click Zip Now and exit WinZip.
Next create a folder named Test in which to put the extracted files. Double-click on the zip file and follow the instructions to extract the files. Direct the output to folder Test. Open Test to verify that the original files are there. Now test the results by recompiling the files (be sure to change the directory to Test in Cygwin) and running sample and Evaluator again.
Preliminary
Copy the folder Lab 02 from the Compiler Design CD to your Coms 480 folder. Once we understand how this is done, we will be able to create a lexer for a larger set of C tokens.
2. WRITING A LEXICAL ANALYZER
A makefile consists mainly of a list of dependencies and actions. A dependency is written in the form
target: sources
	action
where target is a file name and sources is a list of file names. The tab before the action part is mandatory. This means that the target file depends on the source files. Whenever any of the source files is updated, the target file will be updated by performing the action. This is all governed by the timestamps on the files. The makefile for Lab 2 contains the dependencies among the files used by this program.
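The timestamp rule can be made concrete with a small Java sketch. It only illustrates the comparison that make performs and is not how make itself is implemented; the class name MakeRule and the method needsRebuild are hypothetical names chosen for this example.

```java
import java.io.File;

// Hypothetical illustration of the make timestamp rule; not part of the lab files.
class MakeRule {
    // Returns true when the target is missing or is older than at least one
    // source. This is the condition under which make would run the action.
    static boolean needsRebuild(File target, File... sources) {
        if (!target.exists()) {
            return true;
        }
        for (File source : sources) {
            if (source.lastModified() > target.lastModified()) {
                return true;
            }
        }
        return false;
    }

    // Usage: java MakeRule target source [source ...]
    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java MakeRule target source [source ...]");
            return;
        }
        File target = new File(args[0]);
        File[] sources = new File[args.length - 1];
        for (int i = 1; i < args.length; i++) {
            sources[i - 1] = new File(args[i]);
        }
        System.out.println(needsRebuild(target, sources)
                ? args[0] + " is out of date"
                : args[0] + " is up to date");
    }
}
```

Deleting the target or touching one of the sources makes needsRebuild return true, which corresponds to make rerunning the action.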
For Lab 2, it is very simple: there are only four files and two dependencies. That means that whenever MyLexer. In the makefile, the line MyLexer. The line below that, javac MyLexer. Note that this line begins with a mandatory tab. A similar pair of lines appears for Token. As our programs become more and more complicated, you will come to appreciate makefiles more and more. In Lab 3 the makefile will be more sophisticated.
To invoke the makefile, type the command make. Try this now. You should see that the Java compiler is invoked and MyLexer and Token are compiled. Type the command again and you will see that it says that MyLexer is up to date, so it does not recompile it. Now we will delete the file MyLexer. Type the command rm MyLexer. Now execute the make command. Let's do it one more time. This time we will not remove MyLexer. The touch command will change the timestamp of a file to the current time.
This is a Java program that finds certain tokens in the input stream. Currently it finds only positive integers, plus signs (+), and times signs (*). Look in the Lab 02 folder and see that there is now a file named MyLexer. This is the compiled bytecode version of MyLexer. Now we will run MyLexer. Type java MyLexer. The program expects input from the keyboard, so type 123 + 456 * 789. The program should find the five tokens in this expression. Then type CTRL-Z and press return again. CTRL-Z is interpreted as end-of-file. This program does not recognize any grammar rules; those will come later. Therefore, any string of legal tokens will be processed correctly.
First we will look at the tokens. Open the file Token. The purpose of the Token class is to provide a list of symbolic constants to be used by the lexer. The Token class also provides a set of strings so that we can print the name of the token in a readable form.
Now look in the file MyLexer. The program creates a BufferedReader object named source. Look at the function getNextChar.
It gets an integer iVal from source and then converts it to a character cVal. Look at the function advance. Once a character has been processed, we place it in a character buffer and read another character. To clear the buffer, we simply set charCnt to 0. The main function initializes the lexer and then processes tokens by repeatedly calling next_token until it returns the EOF token. Note the use of the expression new String(buffer, 0, charCnt) to convert the contents of the buffer to a String. Go to the Java web page, access the String page, and look at the String constructors. You should develop the habit of referring to the Java API pages as often as necessary, i. You will find the answers to many of your Java questions there.
The heart of the MyLexer class is the next_token function. By looking at the current character cVal, next_token is able to decide which type of token is being read. It processes the token and returns the token type.
After you have finished the lexer, test it on the file testfile. The lexer uses standard input (keyboard) and standard output (monitor), but you may redirect them to files. To read input from the file testfile, type java MyLexer < testfile. To redirect output to a file named, say, outfile. When the output is complicated, this method allows you to inspect it at your leisure. Or you can print it and inspect it later. Zip the files MyLexer. Also read the JLex User's Manual.
Preliminary
Copy the folder Lab 03 from the Compiler Design CD to your folder. JLex will use these rules to build a Java program that will be a lexical analyzer. The rules in the file tokens.
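Returning to the hand-written lexer of Lab 2, the pieces described there (getNextChar, advance, the character buffer, and the next token function, written here as next_token) can be sketched roughly as follows. This is not the MyLexer source from the Lab 02 folder. The class names TokenSketch and LexerSketch, the token constants, the ERROR token, the lexeme helper, the buffer size, and the whitespace skipping are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical stand-in for the Token class: symbolic constants plus printable names.
class TokenSketch {
    static final int EOF = 0, NUM = 1, PLUS = 2, TIMES = 3, ERROR = 4;
    static final String[] NAME = {"EOF", "NUM", "PLUS", "TIMES", "ERROR"};
}

// Hypothetical lexer following the structure described in the text.
class LexerSketch {
    private final BufferedReader source;
    private final char[] buffer = new char[256]; // assumed size
    private int charCnt = 0;  // number of characters currently in the buffer
    private int iVal;         // last value read; -1 signals end of file
    private char cVal;        // iVal converted to a character

    LexerSketch(Reader in) throws IOException {
        source = new BufferedReader(in);
        getNextChar(); // prime the current character
    }

    // Read one integer from the source and convert it to a character.
    private void getNextChar() throws IOException {
        iVal = source.read();
        cVal = (char) iVal;
    }

    // Store the processed character in the buffer, then read another one.
    private void advance() throws IOException {
        if (charCnt < buffer.length) {
            buffer[charCnt++] = cVal;
        }
        getNextChar();
    }

    // Text of the most recent token.
    String lexeme() {
        return new String(buffer, 0, charCnt);
    }

    // Decide the token type from the current character and consume the token.
    int next_token() throws IOException {
        charCnt = 0; // clear the buffer
        while (iVal != -1 && Character.isWhitespace(cVal)) {
            getNextChar();
        }
        if (iVal == -1) {
            return TokenSketch.EOF;
        }
        if (Character.isDigit(cVal)) {
            while (iVal != -1 && Character.isDigit(cVal)) {
                advance();
            }
            return TokenSketch.NUM;
        }
        char c = cVal;
        advance();
        if (c == '+') return TokenSketch.PLUS;
        if (c == '*') return TokenSketch.TIMES;
        return TokenSketch.ERROR;
    }

    public static void main(String[] args) throws IOException {
        LexerSketch lexer = new LexerSketch(new InputStreamReader(System.in));
        int tok;
        while ((tok = lexer.next_token()) != TokenSketch.EOF) {
            System.out.println(TokenSketch.NAME[tok] + " " + lexer.lexeme());
        }
    }
}
```

Run it with java LexerSketch and type an expression, or redirect a file to standard input. Because the sketch knows no grammar rules, any sequence of integers, plus signs, and times signs is accepted.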