031.1 Lesson 1

Certificate: Web Development Essentials
Version: 1.0
Topic: 031 Software Development and Web Technologies
Objective: 031.1 Software Development Basics
Lesson: 1 of 1

Introduction

The very first computers were programmed through the grueling process of plugging cables into sockets. Computer scientists soon started a never-ending search for easy ways to tell the computer what to do. This chapter introduces the tools of programming. It discusses the key ways that text instructions—programming languages—represent the tasks a programmer wants to accomplish, and the tools that change the program into a form called machine language that a computer can run.

Note

In this text, the terms program and application are used interchangeably.

Source Code

A programmer normally develops an application by writing a textual description, called source code, of the desired task. The source code is in a carefully defined programming language that represents what the computer can do in a high-level abstraction humans can understand. Tools have also been developed to let programmers as well as non-programmers express their thoughts visually, but writing source code is still the predominant way to program.

In the same way that a natural language has nouns, verbs, and constructions to express ideas in a structured way, the words and punctuation in a programming language are symbolic representations of operations that will be performed on the machine.

In this sense, source code is not very different from any other text in which the author employs the well-established rules of a natural language to communicate with the reader. In the case of source code, the “reader” is the machine, so the text cannot contain ambiguities or inconsistencies—even subtle ones.

And like any text that discusses some topic in depth, the source code also needs to be well structured and logically organized when developing complex applications. Very simple programs and didactic examples can be stored in the few lines of a single text file, which contains all the program’s source code. More complex programs can be subdivided into thousands of files, each with thousands of lines.

The source code of professional applications should be organized into different folders, usually associated with a particular purpose. A chat program, for example, can be organized into two folders: one that contains the code files that handle the transmission and reception of messages over the network, and another folder that contains the files that build the interface and react to user actions. Indeed, it is common to have many folders and subfolders with source code files dedicated to very specific tasks within the application.

Moreover, the source code is not always isolated in its own files, with everything written in a single language. In web applications, for example, an HTML document can embed JavaScript code to supplement the document with extra functionality.

Code Editors and IDEs

The variety of ways in which source code can be written can be intimidating. Therefore, many developers take advantage of tools that help with writing and testing the program.

The source code file is just a plain text file. As such, it can be edited by any text editor, no matter how simple. To make it easier to tell which language a file contains, each language conventionally uses its own filename extension: .c for the C language, .py for Python, .js for JavaScript, etc. General-purpose editors often understand the source code of popular languages well enough to add italics, colors, and indentation that make the code easier to read.

Not every developer chooses to edit source code in a general-purpose editor. An integrated development environment (IDE) provides a text editor along with tools to help the programmer avoid syntactic errors and obvious inconsistencies. These editors are particularly recommended for less experienced programmers, but experienced programmers use them as well.

Popular IDEs such as Visual Studio, Eclipse, and Xcode intelligently watch what the programmer types, frequently suggesting words to use (autocompletion) and verifying code in real time. The IDEs can even offer automated debugging and testing to identify issues whenever the source code changes.

Some more experienced programmers opt for less intuitive editors such as Vim, which offer greater flexibility and do not require installation of additional packages. These programmers use external, standalone tools to add the features that are built-in when you use an IDE.

Code Maintenance

Whether in an IDE or using standalone tools, it’s important to employ some kind of version control system (VCS). Source code is constantly evolving because unforeseen flaws need to be fixed and enhancements need to be incorporated. An inevitable consequence of this evolution is that fixes and enhancements can interfere with other parts of applications in a large code base. Version control tools such as Git, Subversion, and Mercurial record all changes made to the code and who made the change, allowing you to trace and eventually recover from an unsuccessful modification.

Furthermore, version control tools allow each developer on the team to work on a copy of the source code files without interfering with the work of other programmers. Once the new versions of source code are ready and tested, corrections or improvements made to one copy can be incorporated by other team members.

Git, the most popular version control system nowadays, allows many independent copies of a repository to be maintained by different people, who share their changes as they desire. However, whether using a decentralized or centralized version control system, most teams maintain one trusted repository whose source code and resources can be relied on. Several online services offer storage for repositories of source code. The most popular of these services are GitHub and GitLab, but the GNU project’s Savannah is also worth mentioning.

Programming Languages

A wide variety of programming languages exist; each decade sees the invention of new ones. Each programming language has its own rules and is recommended for particular purposes. Although the languages show superficial differences in syntax and keywords, what really distinguishes the languages are the deep conceptual approaches they represent, known as paradigms.

Paradigms

Paradigms define the premises on which a programming language is based, especially concerning how the source code should be structured.

The developer starts from the language paradigm to formulate the tasks to be performed by the machine. These tasks, in turn, are symbolically expressed with the words and syntactic constructions offered by the language.

The programming language is procedural when the instructions presented in the source code are executed in sequential order, like a movie script. If the source code is segmented into functions or subroutines, a main routine takes care of calling the functions in sequence.

The following code is an example of a procedural language. Written in C, it defines variables to represent the side, area, and volume of geometric shapes. The value of the side variable is assigned in main(), which is the function invoked when the program is executed. The area and volume variables are calculated in the square() and cube() subroutines that precede the main function:

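#include <stdio.h>

/* Global variables hold the program state shared by all functions. */
float side;
float area;
float volume;

/* Computes the area of one face from side. */
void square(void){ area = side * side; }

/* Computes the volume; relies on area already being set by square(). */
void cube(void){ volume = area * side; }

int main(void){
  side = 2;
  square();  /* must run before cube() */
  cube();
  printf("Volume: %f\n", volume);
  return 0;
}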

The order of actions defined in main() determines the sequence of program states, characterized by the value of the side, area, and volume variables. The example ends after displaying the value of volume with a call to the printf() function.

On the other hand, the paradigm of object-oriented programming (OOP) has as its main characteristic the separation of the program state into independent sub-states. These sub-states and associated operations are the objects, so called because they have a more or less independent existence within the program and because they have specific purposes.

Distinct paradigms do not necessarily restrict the type of task that can be performed by a program. The code from the previous example can be rewritten according to the OOP paradigm using the C++ language:

#include <iostream>

class Cube {
  float side;
  public:
  Cube(float s){ side = s; }
  float volume() { return side * side * side; }
};

int main(){
  float side = 2;
  Cube cube(side);
  std::cout << "Volume: " << cube.volume() << std::endl;
  return 0;
}

The main() function is still present. But now there is a new word, class, that introduces the definition of an object. The defined class, named Cube, contains its own variables and subroutines. In OOP, a variable is also called an attribute and a subroutine is called a method.

It’s beyond the scope of this chapter to explain all the C++ code in the example. What’s important to us here is that Cube contains the side attribute and two methods: the constructor Cube(), which sets side when an object is created, and volume(), which calculates the cube’s volume.

It is possible to create several independent objects from the same class, and classes can be composed of other classes.
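As a minimal sketch of both ideas, the following JavaScript code creates two independent objects from one class and builds a second class that contains the first. JavaScript is used here only for brevity. The LabeledCube class and its describe() method are hypothetical names invented for this illustration.

```javascript
// Hypothetical classes for illustration; LabeledCube is not part of the lesson.
class Cube {
  constructor(side) {
    this.side = side; // attribute
  }
  volume() { // method
    return this.side * this.side * this.side;
  }
}

// A class composed of another class: it holds a Cube plus a label.
class LabeledCube {
  constructor(label, side) {
    this.label = label;
    this.cube = new Cube(side);
  }
  describe() {
    return this.label + ": " + this.cube.volume();
  }
}

// Two independent objects created from the same class.
const small = new Cube(2);
const large = new Cube(3);
const boxed = new LabeledCube("Crate", 2);

console.log(small.volume());   // 8
console.log(large.volume());   // 27
console.log(boxed.describe()); // Crate: 8
```

Changing the side of one object has no effect on the other, because each object keeps its own copy of the attribute.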

Keep in mind that these same features can be written differently and that the examples in this chapter are oversimplified. C and C++ have much more sophisticated features that allow much more complex and practical constructions.

Most programming languages do not rigorously impose one paradigm, but allow programmers to choose various aspects of one paradigm or another. JavaScript, for example, incorporates aspects of different paradigms. The programmer can decompose the entire program into functions that do not share a common state with each other:

function cube(side){
  return side*side*side;
}

console.log("Volume: " + cube(2));

Although this example is similar to procedural programming, note that the function receives a copy of all the information necessary for its execution and always produces the same result for the same parameter, regardless of changes that happen outside the function’s scope. This paradigm, called functional, is strongly influenced by mathematical formalism, where every operation is self-sufficient.
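To make the contrast concrete, the sketch below places the cube() function next to a function that reads a variable defined outside its own scope. The scale variable and the scaledCube() function are assumptions added for illustration and are not part of the original example.

```javascript
// Pure: the result depends only on the parameter.
function cube(side) {
  return side * side * side;
}

// Impure, for contrast: the result also depends on an outside variable.
let scale = 1; // hypothetical shared state
function scaledCube(side) {
  return scale * cube(side);
}

const before = scaledCube(2);
scale = 10; // a change that happens outside the function's scope
const after = scaledCube(2);

console.log(cube(2), cube(2)); // 8 8
console.log(before, after);    // 8 80
```

The same call to scaledCube(2) gives two different answers, while cube(2) cannot be affected by anything outside its parameter list.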

Another paradigm covers declarative languages, which describe the states you want the system to be in. A declarative language can figure out how to achieve the specified states. SQL, the universal language for querying databases, is sometimes called a declarative language, although it really occupies a unique niche in the programming pantheon.
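JavaScript is not a declarative language, but some of its built-in array methods lean in that direction, so they can give a rough feel for the difference. In the sketch below, the sides array and the variable names are hypothetical. The first version spells out each step, while the second only states which values are wanted.

```javascript
const sides = [1, 2, 3, 4]; // hypothetical input data

// Imperative style: describe how to build the result, step by step.
const volumesLoop = [];
for (let i = 0; i < sides.length; i++) {
  if (sides[i] > 1) {
    volumesLoop.push(sides[i] ** 3);
  }
}

// More declarative style: describe what the result should contain.
const volumesDeclared = sides
  .filter((side) => side > 1)
  .map((side) => side ** 3);

console.log(volumesLoop.join(", "));     // 8, 27, 64
console.log(volumesDeclared.join(", ")); // 8, 27, 64
```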

There is no universal paradigm that can be adopted in every context. The choice of language may also be restricted by which languages are supported on the platform or execution environment where the program will be used.

The part of a web application that runs in the browser, for example, needs to be written in JavaScript, which is a language universally supported by browsers. (A few other languages can be used because they provide converters to create JavaScript.) So for the web browser, sometimes called the client side or front end of the web application, the developer will have to use the paradigms allowed in JavaScript. The server side or back end of the application, which handles requests from the browser, is normally programmed in a different language; PHP is most popular for this purpose.

Regardless of paradigm, every language has pre-built libraries of functions that can be incorporated into code. Mathematical functions—like the ones illustrated in the example code—don’t need to be implemented from scratch, as the language already has the function ready to use. JavaScript, for example, provides the Math object with the most common math operations.
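As a brief illustration, the cube calculation can be rewritten with functions from the Math object. The variable names below are chosen only for this sketch.

```javascript
const side = 2;

const volume = Math.pow(side, 3);                 // side raised to the third power
const faceArea = Math.pow(side, 2);               // area of one face
const sideFromArea = Math.sqrt(faceArea);         // square root recovers the side
const largest = Math.max(side, faceArea, volume); // largest of the three values

console.log("Volume: " + volume);           // Volume: 8
console.log("Side from area: " + sideFromArea); // Side from area: 2
console.log("Largest value: " + largest);   // Largest value: 8
```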

Even more specialized functions are usually available from the language vendor or third-party developers. These extra resource libraries can be in source code form; i.e., in extra files that are incorporated into the file where they will be used. In JavaScript, embedding is done with import from:

import { OrbitControls } from './modules/OrbitControls.js';

This type of import, where the embedded resource is also a source code file, is most often used in so-called interpreted languages. Compiled languages allow, among other things, the incorporation of pre-compiled features in machine language, that is, compiled libraries. The next section explains the differences between these types of languages.

Compilers and Interpreters

As we already know, source code is a symbolic representation of a program that needs to be translated into machine language in order to run.

Roughly speaking, there are two possible ways to do the translation: converting the source code beforehand for future execution, or converting the code at the moment of its execution. Languages of the first modality are called compiled languages and languages of the second modality are called interpreted languages. Some interpreted languages provide compilation as an option, so that the program can start faster.

In compiled languages, there is a clear distinction between the source code of the program and the program itself, which will be executed by the computer. Once compiled, the program will usually work only on the operating system and platform for which it was compiled.

In an interpreted language, the source code itself is treated as the program, and the process of converting to machine language is transparent to the programmer. For an interpreted language, it is common to call the source code a script. The interpreter translates the script into the machine language for the system it’s running on.

Compilation and Compilers

The C programming language is one of the best-known examples of a compiled language. The C language’s greatest strengths are its flexibility and performance. Both high-performance supercomputers and microcontrollers in home appliances can be programmed in the C language. Other examples of popular compiled languages are C++ and C# (C sharp). As their names suggest, these languages are inspired by C, but include features that support the object-oriented paradigm.

The same program written in C or C++ can be compiled for different platforms, requiring little or no change to the source code. It is the compiler that defines the target platform of the program. There are platform-specific compilers as well as cross-platform compilers such as GCC (which stands for GNU Compiler Collection) that can produce binary programs for many distinct architectures.

Note

There are also tools that automate the compilation process. Instead of invoking the compiler directly, the programmer creates a file indicating the different compilation steps to be performed automatically. The traditional tool used for this purpose is make, but a number of newer tools such as Maven and Gradle are also in widespread use. The entire build process is automated when you use an IDE.

The compilation process does not always generate a binary program in machine language. There are compiled languages that produce a program in a format generically called bytecode. Like a script, bytecode is not in a platform-specific language, so it requires an interpreter program that translates it into machine language. In this case, the interpreter program is simply called a runtime.

The Java language takes this approach, so compiled programs written in Java can be used on different operating systems. Despite its name, Java is unrelated to JavaScript.

Bytecode is closer to machine language than source code is, so it tends to run faster than a script that is interpreted directly from source. However, because there is still a conversion process during the execution of the bytecode, it is difficult to obtain the same performance as an equivalent program compiled into machine language.
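The relationship between bytecode and its runtime can be sketched with a toy example. The two-instruction set below (PUSH and MUL) is invented for this illustration and has nothing to do with real Java bytecode. It only shows the general idea of a compact instruction list that a separate program reads and carries out.

```javascript
// Hypothetical instruction set invented for this sketch.
const PUSH = 0; // push the value that follows onto the stack
const MUL = 1;  // pop two values and push their product

// "Compiled" form of side * side * side for side = 2.
const bytecode = [PUSH, 2, PUSH, 2, MUL, PUSH, 2, MUL];

// The "runtime": reads one instruction at a time and carries it out.
function run(program) {
  const stack = [];
  let pc = 0; // position of the next instruction
  while (pc < program.length) {
    const op = program[pc++];
    if (op === PUSH) {
      stack.push(program[pc++]);
    } else if (op === MUL) {
      const b = stack.pop();
      const a = stack.pop();
      stack.push(a * b);
    } else {
      throw new Error("Unknown instruction: " + op);
    }
  }
  return stack.pop();
}

console.log("Volume: " + run(bytecode)); // Volume: 8
```

The bytecode array itself contains nothing specific to one operating system. Only the run() function would need to be rewritten for each platform, which is the role a runtime plays.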

Interpretation and Interpreters

In interpreted languages such as JavaScript, Python, and PHP, the program does not need to be precompiled, which makes it easier to develop and modify. Instead of being compiled, the script is executed by another program called an interpreter. Usually, the interpreter of a language is named after the language itself. The interpreter of a Python script, for example, is a program called python. The JavaScript interpreter is most often the web browser, but scripts can also be executed by the node program outside a browser. Because it is converted to binary instructions every time it is executed, an interpreted language program tends to be slower than a compiled language equivalent.
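Because the same script can be handed to different interpreters, a script sometimes needs to find out where it is running. The following is a minimal sketch of one common way to check. The describeRuntime() function is a hypothetical helper written for this example, and the checks it performs are a simple heuristic rather than a complete test.

```javascript
// Hypothetical helper: reports which kind of interpreter is running the script.
function describeRuntime() {
  // Browsers provide the window and document objects.
  if (typeof window !== "undefined" && typeof document !== "undefined") {
    return "browser";
  }
  // The node program provides a process object with version details.
  if (typeof process !== "undefined" && process.versions && process.versions.node) {
    return "node " + process.versions.node;
  }
  return "unknown";
}

console.log("Running in: " + describeRuntime());
```

If this code were saved in a file (say, runtime.js, a name assumed here) it could be executed with the command node runtime.js, or loaded by a web page, with no compilation step in either case.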

Nothing prevents the same application from having components written in different languages. If necessary, these components can communicate through a mutually understandable application programming interface (API).

The Python language, for example, has very sophisticated data mining and data tabulation capabilities. The developer can choose Python to write the parts of the program that deal with these aspects and another language, such as C++, to perform the heavier numeric processing. It is possible to adopt this strategy even when there is no API that allows direct communication between the two components. Code written in Python can generate a file in the proper format to be used by a program written in C++, for example.
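The file-based approach works with any pair of languages, as long as both sides agree on the format. The sketch below uses JavaScript as the producing side and CSV as the agreed format. The results data, the column names, and the toCsv() function are all assumptions made for this illustration. Writing the text to disk is left out to keep the example short.

```javascript
// Hypothetical measurements produced by the first component.
const results = [
  { shape: "cube", side: 2, volume: 8 },
  { shape: "cube", side: 3, volume: 27 },
];

// Serialize the data as CSV text, a format nearly any language can read.
function toCsv(rows) {
  const header = "shape,side,volume";
  const lines = rows.map((row) => [row.shape, row.side, row.volume].join(","));
  return [header, ...lines].join("\n") + "\n";
}

const csvText = toCsv(results);
console.log(csvText);
// shape,side,volume
// cube,2,8
// cube,3,27
```

A program written in another language only needs to read the file line by line and split each line at the commas; it never has to know which language produced the file.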

Although it is possible to write almost any program in any language, the developer should adopt the one that is most in line with the application’s purpose. Doing so makes it possible to reuse components that are already tested and well documented.

Guided Exercises

What kind of program can be used to edit source code?
What kind of tool helps to integrate the work of different developers into the same code base?

Explorational Exercises

Suppose you want to write a 3D game to be played in the browser. Web apps and games are programmed in JavaScript. Although it is possible to write all the graphic functions from scratch, it is more productive to use a ready-made library for this purpose. Which third-party libraries provide capabilities for 3D animation in JavaScript?
Besides PHP, what other languages can be used on the server side of a web application?

Summary

This lesson covers the most essential concepts of software development. The developer must be aware of important programming languages and the proper usage scenario for each. This lesson goes through the following concepts and procedures:

What source code is.
Source code editors and related tools.
Procedural, object-oriented, functional, and declarative programming paradigms.
Characteristics of compiled and interpreted languages.

Answers to Guided Exercises

What kind of program can be used to edit source code?

In principle, any program capable of editing plain text.

What kind of tool helps to integrate the work of different developers into the same code base?

A source or version control system, such as Git.

Answers to Explorational Exercises

Suppose you want to write a 3D game to be played in the browser. Web apps and games are programmed in JavaScript. Although it is possible to write all the graphic functions from scratch, it is more productive to use a ready-made library for this purpose. Which third-party libraries provide capabilities for 3D animation in JavaScript?

There are many options for 3D graphics libraries for JavaScript, such as three.js and Babylon.js.

Besides PHP, what other languages can be used on the server side of a web application?

Any language supported by the HTTP server application used on the server host. Some examples are Python, Ruby, Perl, and JavaScript itself.