***************
**Session 01**
***************

Introduction
============

Installation of PyCharm IDE for Python
--------------------------------------

Using http://jetbrains.com/, we can download a free and open-source **Education** version of PyCharm:

.. image:: images/1.JPG
   :width: 600

-------------------------------------------------------------------------------

Under the **Tools** menu, we choose *Education --> PyCharm Edu*:

.. image:: images/33.jpg
   :width: 600

-------------------------------------------------------------------------------

Click on download:

.. image:: images/445.jpg
   :width: 600

-------------------------------------------------------------------------------

Choose the appropriate PyCharm download for your operating system:

.. image:: images/446.jpg
   :width: 600

-------------------------------------------------------------------------------

Once you have downloaded your version, install it on your computer. Then choose Create New Project --> Location: change *untitled* to any title (e.g. ``IT1``) --> Create a virtual environment:

.. image:: images/22.jpg
   :width: 600

-------------------------------------------------------------------------------

On the upper left you see the files inside your project. Create a subfolder and name it as you like (e.g. ``Praktika``). You will keep your source code in separate folders under this subfolder.

.. image:: images/pycharm_ui.png
   :width: 600

-------------------------------------------------------------------------------

**Congratulations, the first task is done and PyCharm is successfully installed**

-------------------------------------------------------------------------------

Word Counter in Python
======================

Read and print file
-------------------

We are going to write a program that counts the lines and words in a file. It is more interesting than just printing *hello world* to the screen.

So, how can we do it? Go to GIT FH Aachen.
Search for **it1-unterlagen** and download the file *birds.txt* from the folder *Praktikum 1*. Create a new folder in ``Praktika`` and call it ``word_counter``. Move *birds.txt* there.

We are working with a Python file *read_file.py*, which you can also find in the folder **Praktikum 1** on GitLab:

.. code-block:: python

    #!/usr/bin/python

    f = open("birds.txt", "r")
    data = f.read()
    f.close()

    print(data)

Let's look at the code. The first line in this file is called the **"shebang"** line. It starts with ``#!`` (``!`` is called the "bang" and ``#`` is called the "hash"). When you execute a file from the shell, the shell tries to run the file using the command specified on the shebang line. In other words, the line tells the shell which interpreter to use (Python in our case). Python itself ignores the shebang line, because ``#`` starts a comment in Python.

Working with text files is easy in Python. The first step is to create a file object corresponding to a file on disk (*birds.txt* in our case). This is done with the ``open()`` function. Usually the file object is immediately assigned to a variable like this:

.. code-block:: python

    f = open("birds.txt", "r")  # <variable> = open(<filename>, <mode>)

We simply open a file called *birds.txt*. It must exist in the current directory (i.e., the directory you are running the code from). The ``"r"`` means the file is opened in read-only mode.

**Hint:** Other mode parameters are ``w`` for writing to the file and ``a`` for appending to the file.

After opening the file, we read its contents into a variable called **data** and close the file:

.. code-block:: python

    data = f.read()
    f.close()

Afterwards, we print the contents of the file:

.. code-block:: python

    print(data)

After running the file, we get:

.. code-block:: text

    STRAY BIRDS
    BY
    RABINDRANATH TAGORE

    STRAY birds of summer come to my
    window to sing and fly away.

    And yellow leaves of autumn, which
    have no songs, flutter and fall there
    with a sigh.

Now we can read a file and print it on the screen. Our first Python program is done.
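
The open, read and close steps described above can also be written with Python's ``with`` statement, which closes the file automatically. The following is only a minimal sketch: the helper name ``read_text`` and the temporary demo file are assumptions made for illustration and are not part of the course material.

.. code-block:: python

    import os
    import tempfile


    def read_text(path):
        """Return the whole content of a text file as one string."""
        # The with statement closes the file when the block ends,
        # even if read() raises an error.
        with open(path, "r") as f:
            return f.read()


    # Demo with a small temporary file (hypothetical content, not birds.txt).
    with tempfile.TemporaryDirectory() as folder:
        demo_path = os.path.join(folder, "demo.txt")
        with open(demo_path, "w") as f:
            f.write("first line\nsecond line\n")
        data = read_text(demo_path)

    print(data)

With *birds.txt* in the current directory, ``read_text("birds.txt")`` should return the same string as the ``open`` / ``read`` / ``close`` sequence shown earlier.
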

-------------------------------------------------------------------------------

Count words and lines
---------------------

In this part, our task is to count the number of words. There is a *count_words.py* file in the **WordCount** folder. So far, we can open the file and read it. These lines should be clear to everybody:

.. code-block:: python

    #!/usr/bin/python

    f = open("birds.txt", "r")
    data = f.read()
    f.close()

There are several built-in methods for strings (textual data) in Python. One of them is ``split()``, which turns the original string into a list of substrings. The parameter is the character (or string) to split on:

.. code-block:: python

    words = data.split(" ")

Here, we are splitting **data** on a space. The method returns a list of the substrings found between the spaces.

First, we take the sentence "I am a boy" and split it on the gap between the words, in other words on a space. Python returns a list with four elements:

.. code-block:: text

    In:  "I am a boy".split(" ")
    Out: ['I', 'am', 'a', 'boy']

We take another sentence, "The birds, they are flying away, he said". This time, we split it on a **comma**. Python returns a list of three substrings. Note that the space after each comma stays in the substring:

.. code-block:: text

    In:  "The birds, they are flying away, he said".split(",")
    Out: ['The birds', ' they are flying away', ' he said']

Let us make sure we understand what we are doing. We can split the text on spaces, commas or any other character. Splitting on a space gives us the words, because in English words are separated by spaces. We print the *words* that we found:

.. code-block:: python

    print("The words in the text are:")
    print(words)

Next, we call another function, ``len()``, which returns the length of a list. It tells us how many elements the list has, and hence the number of words, **num_words**:

.. code-block:: python

    num_words = len(words)
    print("The number of words is", num_words)

Next, by using the same method, we find out the number of lines. We do the same thing, except that here we split on the newline character ``\n``. The newline character is the code that tells the editor to start a new line. Splitting on it gives one list element per line of the file, so ``len(lines)`` is the number of lines:

.. code-block:: python

    lines = data.split("\n")
    print("The lines in the text are:")
    print(lines)
    num_lines = len(lines)
    print("The number of lines is", num_lines)

Run the *count_words.py* file and see the results:

.. code-block:: text

    The words in the text are:
    ['STRAY', 'BIRDS', '\nBY', '\nRABINDRANATH', 'TAGORE', '\n\nSTRAY', 'birds', 'of', 'summer', 'come', 'to', 'my', '\nwindow', 'to', 'sing', 'and', 'fly', 'away.', '\n\nAnd', 'yellow', 'leaves', 'of', 'autumn,', 'which', '\nhave', 'no', 'songs,', 'flutter', 'and', 'fall', 'there', '\nwith', 'a', 'sigh.']
    The number of words is 34
    The lines in the text are:
    ['STRAY BIRDS ', 'BY ', 'RABINDRANATH TAGORE ', '', 'STRAY birds of summer come to my', 'window to sing and fly away. ', '', 'And yellow leaves of autumn, which ', 'have no songs, flutter and fall there ', 'with a sigh.']
    The number of lines is 10

Now open the file *birds.txt* and count the lines that contain text by hand. You will find that the answers are different. That is because there is a bug in our code: it counts the empty lines as well. We need to fix that now.

-------------------------------------------------------------------------------

Count lines fixed
-----------------

This is the old code that we need to correct:

.. code-block:: python

    #!/usr/bin/python

    f = open("birds.txt", "r")
    data = f.read()
    f.close()

    lines = data.split("\n")
    print("Wrong: The number of lines is", len(lines))

To fix it, we use a loop. Loops execute a sequence of statements multiple times in succession. We will deal with one particular kind, the counted loop.
It is built using a Python ``for`` statement. A Python ``for`` loop has this general syntax:

.. code-block:: text

    for <var> in <sequence>:
        <body>

- ``<body>`` is the body of the loop. It can be any sequence of Python statements.
- ``<var>`` is called the loop index. It takes on each successive value in the sequence, and the statements in the body are executed once for each value.
- ``<sequence>`` consists of a list of values.

There is a colon (``:``) at the end of the ``for`` line. In Python, there are no curly braces ``{}`` around the body. If you come from the C/Java world, you are used to curly braces instead of the colon:

.. code-block:: c

    for (i = 0; i < 10; i++) {
        /* body */
    }

The curly braces tell the compiler which code belongs to the for loop. In Python, we use indentation instead. An indentation of four spaces is recommended. If we do not indent the body, we get an error:

.. code-block:: python

    for i in range(5):
    print(i)

.. code-block:: text

        print(i)
        ^
    IndentationError: expected an indented block

The correct way is:

.. code-block:: python

    for i in range(5):
        print(i)

We would like to loop over our lines. The loop variable ``l`` contains each line in turn as Python loops over them:

.. code-block:: python

    for l in lines:

Now we have each line, and we need to check whether it is empty. An empty string counts as false in Python, so ``not l`` is true exactly when the line is empty:

.. code-block:: python

    if not l

This is equivalent to:

.. code-block:: python

    if len(l) == 0

If the line is empty, we remove it from the list with the ``remove()`` method, which deletes the first element equal to its argument:

.. code-block:: python

    if not l:
        lines.remove(l)

Because we remove elements inside the loop, we loop over a copy of the list (``lines[:]``). Removing from the very list that is being looped over can skip elements, for example when two empty lines follow each other. At the end, it will look like this:

.. code-block:: python

    for l in lines[:]:
        if not l:
            lines.remove(l)

When we run *count_lines_fixed.py*, we see the corrected result as well:

.. code-block:: text

    Wrong: The number of lines is 10
    Right: The number of lines is 8

-------------------------------------------------------------------------------

Bringing it all together
------------------------

Now we need to tie it all together. We call our final file *word_count.py*.

Python lets us put a sequence of statements together to create a **function**. First, we define (``def``) a new function and name it *foo*. The following lines are indented to show that they are part of the foo function:

.. code-block:: text

    def foo():
        return

Our first function counts the number of words:

.. code-block:: python

    def count_words(data):
        words = data.split(" ")
        num_words = len(words)
        return num_words

The second function counts the lines:

.. code-block:: python

    def count_lines(data):
        lines = data.split("\n")
        # Loop over a copy, because we remove elements from lines itself.
        for l in lines[:]:
            if not l:
                lines.remove(l)
        num_lines = len(lines)
        return num_lines

We read the data from the file:

.. code-block:: python

    f = open("birds.txt", "r")
    data = f.read()
    f.close()

We call our functions to count the words and lines, and print the results:

.. code-block:: python

    num_words = count_words(data)
    num_lines = count_lines(data)
    print("The number of words:", num_words)
    print("The number of lines:", num_lines)

We should get:

.. code-block:: text

    The number of words: 34
    The number of lines: 8

-------------------------------------------------------------------------------

**Congratulations, you wrote your first Python script!**
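
As a possible next step, the two counting functions can be written more compactly. This is only a sketch under assumptions, not the course's reference solution: the names ``count_words_any_whitespace`` and ``count_nonempty_lines`` and the ``sample`` text are made up for illustration. Unlike ``data.split(" ")``, calling ``split()`` without an argument splits on any run of whitespace, including newlines. The line counter builds a new list instead of removing elements, and it also treats lines that contain only spaces as empty, which differs slightly from the ``not l`` check used in this session.

.. code-block:: python

    def count_words_any_whitespace(data):
        """Count words separated by spaces, tabs or newlines."""
        # split() with no argument splits on any run of whitespace
        # and never produces empty strings.
        return len(data.split())


    def count_nonempty_lines(data):
        """Count the lines that contain at least one visible character."""
        # Build a new list rather than removing items from the list
        # that is being looped over.
        non_empty = [line for line in data.splitlines() if line.strip()]
        return len(non_empty)


    # Hypothetical sample text with two consecutive empty lines.
    sample = "STRAY BIRDS\nBY\n\n\nsing and fly away.\n"

    print("The number of words:", count_words_any_whitespace(sample))
    print("The number of lines:", count_nonempty_lines(sample))

For the sample text above, this counts seven words and three non-empty lines.
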