Introduction to Fundamental Programming Concepts
Before we examine the process of planning a speech synthesis or text-to-speech project, we will first examine some basic programming concepts.
There are five steps in program development:-
- Defining the problem
- Planning the solution
- Coding the program
- Testing the program
- Documenting the program
The first step in this process is not a trivial step, particularly when you are writing a program for someone else, or when you are getting someone to write a program for you. The types of input and output, and the intervening procedures which will be carried out on the data, must be well defined. If more than one party is involved then they must all be sure that they understand the problem in the same way. The problem must be defined in detail.
The planning stage is largely the problem of the programmer, although from time to time it will be necessary to check back with the person who commissioned the program to ensure that the program (as it evolves into greater detail) still matches the user's requirements. There are a number of ways of developing a program. One method is a graphical technique called flowcharting, which will be examined briefly here. Another method is the definition of the problem in pseudocode. Pseudocode is a formal language which resembles natural English but which is sufficiently constrained and formalised to allow a precise statement of the program's structure.
The coding stage is the translation of the program from flowcharts or pseudocode to the programming language of choice.
The testing stage is always necessary. What you are looking for are errors (bugs) in your program, and you will nearly always find some. The most common first step in testing for errors (often called debugging, or looking for and removing bugs) is to pass the program through a translator program. The translator program expects the program to follow very rigid syntactic rules. If it does not, the translator will send you an error message (ie. it will tell you, by displaying a message on the screen, that you have made a syntax error). For many languages, the translator examines the entire program at one time and translates the program into a code that the machine can understand (machine code). Such a translator program is called a compiler, and programmers can often be heard talking about "compiling" their programs. The process of compiling a program results in a file called an executable. When you run the program, you use this executable file rather than directly using the source code. In scripting languages such as Perl and Tcl/Tk, and in languages such as BASIC, source code is translated line by line when you execute the code, and any syntax errors are reported at that time. Such a run-time translator program is called an interpreter.
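As a rough illustration of syntax checking, the sketch below uses Python's built-in compile() function, which translates source text without running it and raises a SyntaxError if the text breaks the language's rules. The helper name check_syntax and the two source strings are invented for this example. Note that Python checks the syntax of a whole source file before it starts running it, so it does not behave exactly like the line-by-line interpreters described above.

```python
def check_syntax(source):
    """Return None if the source is syntactically valid, else the error message."""
    try:
        # compile() translates the source text but does not execute it
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.msg
    return None


good_source = "total = 1 + 2\nprint(total)\n"
bad_source = "total = (1 + 2\nprint(total)\n"  # closing bracket is missing

print(check_syntax(good_source))  # no error message for valid source
print(check_syntax(bad_source))   # wording of the message depends on the Python version
```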
Even if you get the syntax right, however, there is no guarantee that your program will do what you want it to do. The logic (or semantics) of the program might be wrong just as a syntactically correct English sentence might be meaningless or may mean something that was not intended.
For example:-
The sentence, "Colourless green ideas sleep furiously", is syntactically well-formed but is quite meaningless.
The sentence, "The boy hit the girl", is both syntactically well-formed and is also meaningful, but may still be in error if the intended message was "The girl hit the boy."
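The same distinction can be sketched in code. Both functions below are syntactically valid Python, so a translator would accept them without complaint, but only one does what was intended. The function names are hypothetical and chosen only for this illustration.

```python
def average_wrong(a, b):
    # Syntactically valid, but division binds more tightly than addition,
    # so this computes a + (b / 2) rather than the mean of a and b.
    return a + b / 2


def average_right(a, b):
    # The brackets force the addition to happen first.
    return (a + b) / 2


print(average_wrong(4, 8))  # 8.0, not the intended result
print(average_right(4, 8))  # 6.0
```

Only testing with inputs whose correct answer is already known reveals an error of this kind.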
The final stage in program development is the documentation stage. This stage is essential because you may forget the details of your program after a few months and may be required to modify it for some reason (eg. a well hidden bug may have been found). How can you alter the program if you don't remember how it works? Even more importantly, you may not be the person called upon to alter the program. How can someone else be expected to understand your program if you have not carefully documented it?
Conditions
A condition is any logical or mathematical statement (or expression). Such a statement can be either true or false.
The following are examples of conditions:-
No. | STATEMENT/EXPRESSION | TRUE/FALSE |
1. | The sky is green with red spots. | FALSE |
2. | Aristotle was a man. | TRUE |
3. | 1 + 2 = 3 | TRUE |
4. | 1 + 2 = 4 | FALSE |
5. | 4 > 3 | TRUE |
6. | 4 < 3 | FALSE |
7. | A = 3 | unknown |
8. | A = B | unknown |
9. | A > B | unknown |
We can only tell whether one of the above statements is true if we know the value of each item in it. We know, for example, that the sky is blue (on a cloudless day), therefore the first statement is false. We know that Aristotle was a man, therefore the second statement is true. The numbers 1, 2, 3, and 4 have fixed values and therefore we can readily evaluate expressions 3 to 6 as true or false. If an item in an expression has a fixed value (eg. a number or sky=blue, Aristotle=man) then such an item is referred to as a constant.
If an item does not have a fixed value (ie. if its value can be changed) such an item is called a variable. When algebraic variables (such as "A" and "B" in expressions 7 to 9) occur in a statement they cause that statement's truth to be unknown unless the values of the variables have been previously defined. Statements 7 to 9 need to be qualified in the following ways before the conditions can be evaluated. In other words, the variables need to be (temporarily) allocated a value.
7. | LET A = 3, A = 3 ? | TRUE |
   | LET A = 5, A = 3 ? | FALSE |
8. | LET A = 10, LET B = -3.6, A = B ? | FALSE |
   | LET A = 8, LET B = 8, A = B ? | TRUE |
9. | LET A = 99.99, LET B = -5000, A > B ? | TRUE |
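The numerical conditions above can be evaluated directly in a language such as Python. This is only a sketch: Python writes assignment (the LET of the table) as = and the equality test as ==, a notational difference that is not part of the original examples.

```python
# Conditions built only from constants can be evaluated immediately.
print(1 + 2 == 3)  # True
print(1 + 2 == 4)  # False
print(4 > 3)       # True
print(4 < 3)       # False

# Conditions containing variables can be evaluated once values are assigned.
A = 3
print(A == 3)      # True
A = 5
print(A == 3)      # False

A, B = 10, -3.6
print(A == B)      # False
A, B = 8, 8
print(A == B)      # True
A, B = 99.99, -5000
print(A > B)       # True
```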
Variables
In most programming languages there are at least two types of variables: numerical variables and alphanumeric (or string) variables. String variables have text values rather than number values (eg. B$ = 'FRED', where the "$" suffix is the BASIC convention for marking a string variable).
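A minimal sketch of the two kinds of variable in Python follows. The variable names and values are invented for illustration, and Python needs no suffix such as "$" because it infers the type from the value assigned.

```python
count = 42        # numerical variable holding an integer
pitch_hz = 220.5  # numerical variable holding a floating point number
name = "FRED"     # string variable holding text

print(count + 1)        # 43
print(name + " SMITH")  # FRED SMITH
print(type(count).__name__, type(pitch_hz).__name__, type(name).__name__)  # int float str
```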
Flowcharts
Flowcharts are graphical methods of outlining program structure and generally utilise a set of standard symbols known as the ANSI (American National Standards Institute) symbols. The ones that we will mostly use are shown in figure 1.
Figure 1: Some standard flowcharting symbols.
Structured Programming
One of the most important features of structured programming is that all program structures have only one entry point and one exit point.
There are three main types of control structures:-
- Sequence (see figure 2)
- Selection (IF-THEN and IF-THEN-ELSE, see figure 3; IF-THEN-ELSEIF-THEN-ELSE, see figure 4)
- Iteration (DO-WHILE, see figure 5; DO-UNTIL, see figure 6)
A sequence of instructions in a sequence structure must all be carried out.
In a selection structure certain instructions will only be carried out IF a certain condition is found to be true.
In an iteration structure certain instructions will be repeatedly carried out until a condition is no longer true (DO-WHILE) or alternatively until a condition is no longer false (DO-UNTIL). Such structures consist of a loop which takes the program back to the top of the instructions to be repeated.
Figure 2: An example of a sequence control structure.
Figure 3: An example of a selection (IF-THEN-ELSE) control structure. This flowchart means:- IF "Condition" is true THEN perform "Procedure 2", ELSE perform "Procedure 3".
Figure 4: An example of a selection (IF-THEN-ELSEIF-THEN-ELSE) control structure. This flowchart means:- IF "Condition 1" is true THEN perform "Procedure 2", ELSE IF "Condition 2" is true THEN perform "Procedure 3", ELSE perform "Procedure 4".
Figure 5: An example of an iteration (DO-WHILE) control structure. Note that the loop condition (in this case the question "is i < 10?") is tested before proceeding with the rest of the procedures in the loop and that the loop control variable ("i") is incremented at the end of the loop. If the loop condition is true then the loop is executed. The loop only terminates when the loop condition is false. If the loop condition is false upon entry to the loop then the loop isn't executed at all.
Figure 6: An example of an iteration (DO-UNTIL) control structure. Note that the loop condition (in this case the question "is i > 10?") is tested at the end of the loop. The loop control variable (in this case "i") can be updated anywhere in the loop and here is incremented just before the loop condition is tested. If the loop condition is true then the loop is terminated. The loop only repeats when the loop condition is false. Even if the loop condition is true upon entry to the loop, the loop will be executed at least once.
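The two iteration structures described in the captions for figures 5 and 6 can be sketched in Python as follows. Python has no built-in DO-UNTIL statement, so the second function emulates one with "while True" and "break"; that emulation, the function names, and the pass counter standing in for the loop's procedures are all assumptions made for this illustration.

```python
def do_while_demo(start):
    """DO-WHILE: the condition is tested before each pass through the loop."""
    i = start
    passes = 0
    while i < 10:
        passes += 1  # stands in for the procedures inside the loop
        i += 1       # loop control variable incremented at the end
    return passes


def do_until_demo(start):
    """DO-UNTIL: the condition is tested after each pass through the loop."""
    i = start
    passes = 0
    while True:
        passes += 1  # the loop body always runs at least once
        i += 1       # incremented just before the condition is tested
        if i > 10:   # terminate when the condition becomes true
            break
    return passes


print(do_while_demo(0))   # 10
print(do_while_demo(10))  # 0, condition false on entry so the body never runs
print(do_until_demo(0))   # 11
print(do_until_demo(20))  # 1, body runs once even though i > 10 on entry
```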
Pseudocode
An example of pseudocode is given here as a brief illustration of its structure. This example is a pseudocode version of the IF-THEN-ELSE flowchart in figure 3. Some developers prefer to use pseudocode rather than flowcharts.
IF condition is true THEN
    true statement 1
    true statement 2
    ...
    true statement n
ELSE
    false statement 1
    false statement 2
    ...
    false statement n
ENDIF
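To show how the coding stage might translate this pseudocode, here is one possible Python rendering. The condition (whether a number is even) and the statements in each branch are hypothetical and chosen only so that the example runs.

```python
def describe(number):
    # IF condition is true THEN
    if number % 2 == 0:
        # true statements
        parity = "even"
        half = number // 2
    # ELSE
    else:
        # false statements
        parity = "odd"
        half = (number - 1) // 2
    # ENDIF (Python marks the end of the structure by indentation alone)
    return parity, half


print(describe(10))  # ('even', 5)
print(describe(7))   # ('odd', 3)
```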
Unified Modeling Language (UML)
"The UML is a modelling language for specifying, visualizing, constructing, and documenting the artifacts of a system-intensive process."
Sinan Si Alhir, UML in a Nutshell: A Desktop Quick Reference, O'Reilly, 1998.
UML has become very popular in recent years as a standard way of planning, constructing and documenting software projects. It has the advantage of facilitating these processes using graphical visualisation tools. A number of popular UML-based tools (eg. Rational Rose, and Microsoft's Visio) are also able to generate, from a visualised project structure, the basic classes that the programmer will need to elaborate in order to complete the project.
UML uses diagrams, some of which resemble traditional flowcharts, to visualise the way objects and people will interact when the project is completed. UML is an object-oriented tool that is particularly suited to the design of projects that use object-oriented languages such as C++ or Java.
Traditional flowcharts and pseudocode, on the other hand, are suited to procedural approaches to programming. However, once the classes for an object-oriented project have been designed, class member functions can still be profitably designed using flowcharts.
Some speech technology companies utilise UML during the development and documentation of their products. Poseidon for UML is being used in our research centre for certain speech and hearing software projects.
The Object-oriented Approach to Programming
Traditional programming techniques concentrated on the structural and dynamic characteristics of a problem during project development. Such approaches can focus either on the behaviour (process or function driven) or on the data (data-driven) aspects of a problem.
Object-oriented programming techniques treat both the data-like and process-like elements of a problem as parts of a complete unit. Central to the object-oriented paradigm is the concept of the "class". A class defines both the structure of the data and the processes that act upon that data for a logical component of a project. A class contains both member data variables and member functions. The functions are specialised to operate only on variables that are members of their own class.
A class is a set of functions and related data structures found in programming code, whilst an object is an instance of that class that is being used to interact with actual data.
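The difference between a class and an object can be sketched in Python. The class name Utterance, its member data and its member functions are hypothetical, invented for this illustration rather than taken from any real speech toolkit.

```python
class Utterance:
    """A class bundles member data with the functions that act on that data."""

    def __init__(self, text, words_per_minute):
        # member data variables
        self.text = text
        self.words_per_minute = words_per_minute

    def word_count(self):
        # member function operating on this object's own data
        return len(self.text.split())

    def duration_seconds(self):
        # time needed to speak the text at the given rate
        return self.word_count() * 60 / self.words_per_minute


# Two objects: separate instances of the same class, each holding its own data.
greeting = Utterance("hello world", 120)
question = Utterance("how are you today", 120)

print(greeting.word_count())        # 2
print(question.duration_seconds())  # 2.0
```

The class is written once in the code, while any number of objects can be created from it when the program runs.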
Typical non-object-oriented procedural languages include Fortran, Cobol, and C (but note that object-oriented features were added to Fortran in the Fortran 2003 standard, which was known as Fortran 2000 during its development).
Typical object-oriented languages are C++ (which evolved from C) and Java.
C++ is probably the most commonly used language in the development of commercial speech technology.
Python is also an object-oriented language. Python can be embedded into a project (which is often written in another language, such as C++) so that end users can write Python scripts that interact with the built-in classes of that host project. Python is a popular language for various language processing tasks and is the scripting language of choice for the Natural Language Toolkit (NLTK).