Lesson-18: Python Character Sequences
Python Character Sequences
Python character sequences (strings), as the name suggests, are sequences formed by combining characters. A string is a special data type written in single, double, or triple quotes, and these quotes distinguish it from other data types. Technically speaking, when we query an object using the type() function, if the output is <class 'str'>, the object is a string.
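As a quick check of the point above, the following minimal sketch writes the same text with each quoting style and asks type() about it. The variable names tek, cift and uc are only illustrative and do not come from the lesson.

```python
# The same text written with single, double and triple quotes.
# (tek, cift and uc are illustrative names.)
tek = 'Python'
cift = "Python"
uc = """Python"""

# All three spellings produce equal values of the same type.
print(type(tek))            # <class 'str'>
print(tek == cift == uc)    # True
print(isinstance(uc, str))  # True
```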
In addition to what we have learned about character sequences so far, we will also talk about the methods of character sequences.
Accessing Elements Of Character Sequences
Under this heading, we will access the elements of a string individually or as a group. For example:
kelime="Python Öğreniyorum"
for i in kelime:
    print(i)
You may remember the code structure above from our previous lessons. A for loop can iterate over a string. Here we looped over the value of the kelime variable letter by letter and printed each letter on the screen.
But what if I wanted to print only the first letter, or any other single letter?
kelime="Python Öğreniyorum"
print(kelime[0])
print(kelime[1])
print(kelime[8])
print(kelime[9])
print(kelime[10])
As seen above, kelime[0] gives me element zero of the string, that is, its first letter. Element zero of the kelime variable is the letter “P”. Note that string indexes start from zero, and that the space counts as a character, so it occupies index 6. The screen output of the above code will be as follows.
P
y
ğ
r
e
In general, we can write the pattern like this, where karakter_dizisi stands for the string and öğe_sırası for the index of the element:
karakter_dizisi[öğe_sırası]
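The pattern above can be tried out with a short sketch like the one below. It also shows what happens when the index is past the end of the string, which the lesson does not cover. The names sira and hata are illustrative.

```python
kelime = "Python Öğreniyorum"

# Valid indexes run from 0 up to len(kelime) - 1.
for sira in (0, 1, 7):
    print(sira, kelime[sira])

# Asking for an index past the end raises IndexError.
try:
    kelime[100]
except IndexError as hata:
    print("IndexError:", hata)
```

The loop prints “P”, “y” and “Ö” next to their indexes, and the last line reports that the string index is out of range.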
We can also ask Python how many characters a string contains:
kelime="Python Öğreniyorum"
print(len(kelime))
The len() function returns the length of a string. Since a space is also a character, it is included in the count. Output of the above code:
>> 18
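To see that the space really is counted, here is a small sketch that compares a few lengths. The short test strings "ab" and "a b" are made up for illustration.

```python
kelime = "Python Öğreniyorum"

# The space between the two words is a character like any other.
print(len(kelime))   # 18

# Two letters versus the same two letters with a space between them.
print(len("ab"))     # 2
print(len("a b"))    # 3

# The empty string has length zero.
print(len(""))       # 0
```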
One less than the length of a string is the index of its last element.
kelime="Python Öğreniyorum"
print(kelime[len(kelime)-1])
The screen output of this code will be:
>> m
If we give a string a negative number as the index, Python counts that string's elements from the end instead of from the beginning. If we change the above code as follows, we get the same result.
kelime="Python Öğreniyorum"
print(kelime[-1])
Screen output:
>> m
kelime="Python Öğreniyorum"
print(kelime[-2])
Screen output:
>> u
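The relationship between negative indexes and len() described above can be checked with a sketch like this one. The loop variable n is illustrative.

```python
kelime = "Python Öğreniyorum"

# kelime[-n] refers to the same character as kelime[len(kelime) - n].
for n in (1, 2, 3):
    print(-n, kelime[-n], kelime[len(kelime) - n])
```

Each line prints the same letter twice: “m”, then “u”, then “r”.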
Slicing Character Sequences
In the previous section, we saw how to get the element we want from a string by specifying the index of that element. We’ll do something similar in this section, but what we do here will be a little more thorough.
Here we will talk about ‘slicing’ strings. So, what do we mean by ‘slice’? We mean taking a certain part of a string. Take a look at the example below.
>>> site = "www.btogrenme.com"
>>> site[4:13]
'btogrenme'
>>> site[14:17]
'com'
>>> site[0:3]
'www'
As you can see, we cut this string slice by slice by giving it some values in square brackets. And how did we do that? In the above examples, a structure like this catches our eye, that is, string[index of the first element to take : one more than the index of the last element to take]:
karakter_dizisi[alınacak_ilk_öğenin_sırası : alınacak_son_öğenin_sırasının_bir_fazlası]
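A minimal sketch of this structure follows. It shows that the start index is included while the stop index is not, and it also leaves out one of the two bounds, a form the lesson uses later on.

```python
site = "www.btogrenme.com"

# The start index is included, the stop index is not.
print(site[4:13])        # btogrenme
print(len(site[4:13]))   # 9, which is 13 - 4

# Leaving out a bound means "from the beginning" or "to the end".
print(site[:3])          # www
print(site[14:])         # com
```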
Let’s write an application using this information as follows:
site1 = "www.google.com"
site2 = "www.btogrenme.com"
site3 = "www.milliyet.com"
site4 = "www.yahoo.com"
for isim in site1, site2, site3, site4:
    print("site: ", isim[4:-4])
This example tells us a lot about how slicing works in Python. As you can see, we can use both positive and negative numbers. As you will remember from the previous section, if the given number is negative, Python counts from the end of the string rather than from the beginning. In the example above, using the isim[4:-4] structure, we sliced the strings site1, site2, site3 and site4 so as to exclude the first four and the last four characters. That leaves all the characters in between: “google”, “btogrenme”, “milliyet” and “yahoo”.
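One way to read a negative stop index is as shorthand for a position measured from len(). The sketch below compares the two spellings on one of the addresses above.

```python
isim = "www.milliyet.com"

# A negative stop index counts from the end, so these two slices match.
print(isim[4:-4])              # milliyet
print(isim[4:len(isim) - 4])   # milliyet
```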
Another example
Let’s give another example to better understand all this:
ata1 = "Akıllı bizi arayıp sormaz deli bacadan akar!"
ata2 = "Ağa güçlü olunca kul suçlu olur!"
ata3 = "Avcı ne kadar hile bilirse ayı da o kadar yol bilir!"
ata4 = "Lafla pilav pişse deniz kadar yağ benden!"
ata5 = "Zenginin gönlü oluncaya kadar fukaranın canı çıkar!"
Here we have five proverbs. Our task is to remove the exclamation marks at the end of these proverbs:
for ata in ata1, ata2, ata3, ata4, ata5:
    print(ata[0:-1])
Here’s what we do: the loop gives each of the variables ata1, ata2, ata3, ata4 and ata5 the name ata in turn, and we then slice ata from the beginning up to, but not including, its last character. In other words, we take every character from ata[0] up to the one just before ata[-1]. So, after removing these exclamation marks, what do we do if we want to put a period in their place?
That is also very simple:
for ata in ata1, ata2, ata3, ata4, ata5:
    print(ata[0:-1] + ".")
As you can see, after discarding the exclamation mark, which is the last character, the only thing we do to put a period in its place is to append a period to the sliced string with the plus sign (+).
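The same slice-then-append idea can be wrapped in a small function. This is only a sketch: son_karakteri_degistir is a hypothetical helper name, not something defined in the lesson.

```python
def son_karakteri_degistir(metin, yeni):
    """Return metin with its last character replaced by yeni (hypothetical helper)."""
    # Slice off the last character, then append the replacement.
    return metin[:-1] + yeni

ata = "Ağa güçlü olunca kul suçlu olur!"
print(son_karakteri_degistir(ata, "."))  # Ağa güçlü olunca kul suçlu olur.
print(ata)                               # the original string is unchanged
```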
One more example…
Let’s have four website addresses like this:
site1 = "www.google.com"
site2 = "www.istihza.com"
site3 = "www.yahoo.com"
site4 = "www.btogrenme.org"
Our goal is to add http:// to the beginning of each of these addresses. For this purpose, we can again join strings together. Examine carefully:
site1 = "www.google.com"
site2 = "www.istihza.com"
site3 = "www.yahoo.com"
site4 = "www.btogrenme.org"
for i in site1, site2, site3, site4:
    print("http://", i, sep="")
If you also want to discard the www. part, you must use slicing along with the joining operation:
for i in site1, site2, site3, site4:
    print("http://", i[4:], sep="")
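The i[4:] slice above assumes that every address starts with “www.”. As a hedged variation, the sketch below checks for that prefix first with the built-in str.startswith() method. The function name http_ekle and the address "btogrenme.org" are made up for illustration.

```python
def http_ekle(adres):
    """Hypothetical helper: drop a leading "www." if present, then prefix http://."""
    # Only slice off the first four characters when they really are "www."
    if adres.startswith("www."):
        adres = adres[4:]
    return "http://" + adres

for adres in "www.google.com", "www.istihza.com", "btogrenme.org":
    print(http_ekle(adres))
```

The three lines printed are http://google.com, http://istihza.com and http://btogrenme.org. The last address is left whole because it has no “www.” to remove.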
One more example…
Let’s say you are writing a program that has to separate the vowels and consonants in a word. For example, your goal is to collect the letters ‘i’, ‘a’ and ‘u’ in the word ‘istanbul’ in one place, and the letters ‘s’, ‘t’, ‘n’, ‘b’ and ‘l’ in another. For this purpose, you can write a program like this:
sesli_harfler = "aeıioöuü"
sessiz_harfler = "bcçdfgğhjklmnprsştvyz"
sesliler = ""
sessizler = ""
kelime = "istanbul"
for i in kelime:
    if i in sesli_harfler:
        sesliler += i
    else:
        sessizler += i
print("sesli harfler: ", sesliler)
print("sessiz harfler: ", sessizler)
Here, first of all, we define the Turkish vowels and consonants with the following lines:
sesli_harfler = "aeıioöuü"
sessiz_harfler = "bcçdfgğhjklmnprsştvyz"
Then we define two empty strings, into which we will collect the vowels and consonants found in the word:
sesliler = ""
sessizler = ""
As the program runs, each letter will be added to the variable it belongs to.
Our word is “istanbul”:
kelime = "istanbul"
Now we build a for loop over this word and look at each of its letters one by one. Any letter that also appears in the string held by the sesli_harfler variable is added to the variable called sesliler. Every other letter is sent to the variable called sessizler by the else branch. Note that the code never actually consults sessiz_harfler, but for a word like “istanbul” the letters that end up in sessizler all appear in it:
for i in kelime:
    if i in sesli_harfler:
        sesliler += i
    else:
        sessizler += i
For this, you see that we define a simple ‘if-else’ block inside the for loop. Also note that at each turn of the loop we add a new letter to the variable sesliler or sessizler, and in doing so we define that variable anew. This is because, as we have said, strings are a data type that cannot be changed in place. If we want to make a change to a string, we need to define that string again from scratch.
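The point about strings being unchangeable can be observed directly. The sketch below is an illustration rather than part of the program above: it keeps a second name (onceki, an illustrative variable) for the original empty string, and then tries to assign to a single position of a string.

```python
sesliler = ""
onceki = sesliler    # a second name for the original empty string

sesliler += "i"      # builds a new string and rebinds the name sesliler
print(repr(onceki), repr(sesliler))   # '' 'i'

# Direct item assignment is not allowed on strings.
kelime = "istanbul"
try:
    kelime[0] = "İ"
except TypeError as hata:
    print("TypeError:", hata)
```

The first print shows that the original empty string was left untouched, and the attempted assignment ends in a TypeError instead of changing kelime.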