As you know, in Python a string data type can hold a character, word, phrase, or longer stretches of text. You have learned how to gather string input from the user and print formatted string messages to the screen. In this chapter, we'll take a closer look at how strings are formed and explore some of the Python functions that allow you to search, extract and transform string content. Let's begin by closely examining the characters that make up a string value.

Strings as Lists of Characters
A Python string is made of a series of individual characters, stored back-to-back in a certain order. You can visualize a string just like a Python list, where each element in the list is simply one character like "G". Just like list elements, string characters can be identified by an integer index that starts at 0 and goes up by 1 for each additional character in the string. For example, the string "Giraffe" has 7 characters, so the matching index values are 0 through 6.

[Figure: the string "Giraffe" with matching index values 0 through 6]

Remember, you can read individual list elements using the list variable name, square brackets, and the numeric index of the element. The same type of access can be used to get individual letters of a string! In the example below, we initialize a string to hold the value "Giraffe". We then use the variable name, square brackets, and index values to read some of the characters from the string and print them to the screen. Can you predict which three characters will be displayed?
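A minimal sketch of such an example follows. The variable names letter0, letter3, and letter6 and the index values 0, 3, and 6 are illustrative assumptions, not necessarily the ones used in the original interactive exercise.

```python
# Illustrative sketch: the names and index values here are assumptions.
myString = "Giraffe"
letter0 = myString[0]   # character at index 0 (the first character)
letter3 = myString[3]   # character at index 3 (the fourth character)
letter6 = myString[6]   # character at index 6 (the last character)

print(letter0)
print(letter3)
print(letter6)
```

Try to work out the three letters by counting from 0 before you run the code.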

Of course, you need to be careful not to use an index value that is greater than or equal to the string length. Try editing the code above and using the numeric index 7 inside the square brackets. When you run that code, you will see an IndexError exception at run-time.

Traceback (most recent call last):
  File "code1.py", line 3, in <module>
    letter0 = myString[7]
IndexError: string index out of range

Strings are Immutable
In Python, a string value is immutable, which means that the object can't be changed once it has been created. So, while you can update individual list elements by index, don't try to do the same thing to Python strings! The code below will raise a TypeError exception at run-time when it tries to set the character at a specific index to some other character.

myString = "Giraffe"
myString[0] = "X" # try to create "Xiraffe" - will not work!
This does NOT mean that you can never change a string variable, of course. You can always assign a completely different string to a variable. In that case, you are replacing one string value with another, which is perfectly fine.

myString = "Giraffe"
myString = "Xiraffe" # replace one string value with another - will always work
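
To see both behaviors side by side, the sketch below catches the exception from the failed assignment and then assigns a new string to the variable. The try/except wrapper is an addition made here only so that the script keeps running after the error.

```python
myString = "Giraffe"

try:
    myString[0] = "X"      # strings do not support item assignment
except TypeError as error:
    print("Could not modify the string:", error)

print(myString)            # the original value is untouched

myString = "Xiraffe"       # assigning a whole new string to the variable is fine
print(myString)
```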
String Length with len()
You have already used the len() function to tell you how many elements are in a list or tuple. This function works the same way on strings and will tell you how many characters are in the string. Try experimenting with some different values for myString in the example below; do you get the expected length printed each time?
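A minimal sketch to experiment with is shown here. The starting value "Giraffe" and the variable names length and lastLetter are assumptions chosen for illustration.

```python
# Hypothetical starting value; try changing it to other strings.
myString = "Giraffe"
length = len(myString)
print("The string", myString, "has", length, "characters")

# The last valid index is always one less than the length.
lastLetter = myString[length - 1]
print("Last character:", lastLetter)
```

Because index values start at 0, using length itself as the index would raise an IndexError.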

Reminder - String Concatenation
As you know from an earlier chapter, strings can easily be combined with concatenation. This simply means you use the plus sign (+) to join together two smaller strings to make one larger string. The two smaller strings are glued end-to-end without any automatic spacing. Can you change the code below so the first and last names are printed with a space between them? You could include a space at the end of "Daffy" or at the beginning of "Duck" or add a space " " between the firstname and lastname in the print() statement.
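The sketch below shows a possible starting point and one of the fixes described above. The variable name fullname is an assumption added for illustration.

```python
firstname = "Daffy"
lastname = "Duck"

# Concatenation adds no spacing of its own, so the names run together.
print(firstname + lastname)

# One possible fix: join a single-space string between the two names.
fullname = firstname + " " + lastname
print(fullname)
```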

Escape Sequences
It's easy to create strings with simple alphanumeric characters like "a"-"z" or "0"-"9". Simply type the characters you want between double quotes. However, what would happen if the string you want to build actually contains a double quote? Consider the phrase below from Edgar Allan Poe's poem, "The Raven":

Quoth the Raven, "Nevermore."
Let's try to print this to the screen as a Python string. We might first try to write code as follows:

print("Quoth the Raven, "Nevermore."") # SyntaxError - will not work

However, this won't work! Python uses double quotes to mark the beginning and end of strings. So, Python would recognize the characters between the first two double quotes as a string, followed by some unrecognized and invalid characters, and finally find another (empty) string at the end.

"Quoth the Raven, "Nevermore.""

To successfully create a string that includes special characters like double quotes, you need to use an escape sequence inside the double-quoted string. An escape sequence is a combination of characters that represents a single character that might otherwise not be easy to type into a Python string.

Most escape sequences start with a backslash (\). To place a double-quote inside a string, you would write a backslash followed by the double quote (\"). Python will see this sequence as a single double-quote character. Let's fix our example so the quote will be printed correctly on the screen. Try it and verify the results.
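A minimal sketch of the corrected statement is shown here. The variable name quote is an assumption added so the string can be reused.

```python
# Each \" inside the string produces one literal double-quote character.
quote = "Quoth the Raven, \"Nevermore.\""
print(quote)
```

The backslashes do not appear in the output; they only tell Python that the double quote that follows is part of the text.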

Python defines escape sequences for many characters that can't be easily typed into a string within your code. Tabs, the "Enter" key, and similar special characters all have escape sequences. For example, the sequence "\n" is a string of length 1 that holds the "new line" character. This causes your console output to skip down to the next line. The print() function will automatically add a "\n" to the end of your output message (unless you tell it otherwise). That's why each of your print() messages normally appears on a different line!
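The short sketch below illustrates these points. The variable name newline is an assumption, and the end parameter of print() is used here as one way to "tell it otherwise".

```python
newline = "\n"
print(len(newline))            # 1: the backslash and the n form a single character

print("Line one\nLine two")    # the embedded \n splits the output across two lines

# end="" replaces the "\n" that print() would otherwise add.
print("Same ", end="")
print("line")
```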

The table below describes some common Python escape sequences.

| Escape Sequence | Description |
|---|---|
| `\n` | New Line |
| `\r` | Carriage Return |
| `\t` | Horizontal Tab |
| `\'` | Single Quote (') |
| `\"` | Double Quote (") |
| `\\` | Backslash (\) |

Notice that the backslash character itself (\) has an escape sequence! Why is this needed? Python will normally assume that a backslash inside a string is the beginning of an escape sequence. If it is not, then you need to use an escape sequence just to add a normal backslash. The example below shows the right and wrong ways to create the string "\I like backslashes\".

print("\I like backslashes\") # Will not work - the final \" is read as an escaped quote, so the string never ends
print("\\I like backslashes\\") # Correct approach with escaped backslashes

Take some time to experiment with escape sequences in the example code below. You can run the code first to see sample output from each sequence, then change the strings on your own.
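A minimal sketch to experiment with is shown here. All of the sample strings and variable names are assumptions chosen for illustration.

```python
twoLines = "First line\nSecond line"        # \n starts a new line
tabbed = "Name:\tDaffy Duck"                # \t inserts a horizontal tab
single = 'It\'s a single-quoted string'     # \' inside a single-quoted string
double = "She said, \"Hello!\""             # \" inside a double-quoted string
path = "C:\\Users\\daffy"                   # each \\ produces one backslash

print(twoLines)
print(tabbed)
print(single)
print(double)
print(path)
print(len("\\"))                            # an escaped backslash is one character
```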

Depending on your console, the new line character (\n) and the carriage return (\r) may seem to have the same effect. Some Python consoles advance to the next line for either character, while many terminals treat \r as an instruction to move back to the start of the current line without advancing. Historically, different operating systems have used a new line (\n) alone, a carriage return (\r) alone, or a combination of carriage return and new line (\r\n) to mark the end of lines in a text file. If you are writing data to a file that will be used on a specific operating system, be sure to match the End-of-Line (EOL) marker required by that operating system.

String Methods

Python provides a number of built-in string methods, which are functions you call on a string value using dot notation. They can be used to search for a specific substring, change the case of characters, remove leading or trailing whitespace, split a string into parts, and more. Because strings are immutable, these methods never change the original string; they return a new value instead.

For example, the lower(), upper(), and capitalize() functions allow you to change the case of the characters in a string. The example below shows how these functions can be used:

```python
myString = "Hello, World!"
print(myString.lower()) # prints "hello, world!"
print(myString.upper()) # prints "HELLO, WORLD!"
print(myString.capitalize()) # prints "Hello, world!"
```
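
One common use of the case methods is comparing text without regard to capitalization. The sketch below assumes a hypothetical answer value standing in for user input.

```python
answer = "YeS"                  # hypothetical user input
normalized = answer.lower()     # returns a new string; answer is unchanged

if normalized == "yes":
    print("You agreed!")

print(answer)                   # the original string keeps its mixed case
```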

The find() function can be used to locate the position of a specific substring within a string, while the replace() function allows you to replace all occurrences of a substring with a different substring. The example below demonstrates how to use these functions:

```python
myString = "I like Python programming"
print(myString.find("Python")) # prints 7, the index where "Python" starts
newString = myString.replace("Python", "Java") # replaces "Python" with "Java"
print(newString) # prints "I like Java programming"
```
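
Two details are worth seeing in action: find() returns -1 when the substring is not present, and replace() returns a new string with every occurrence changed. The sample strings below are assumptions chosen for illustration.

```python
sentence = "I like Python programming"

position = sentence.find("Java")        # "Java" does not occur in the sentence
if position == -1:
    print("Substring not found")

repeated = "spam, spam, and eggs"
changed = repeated.replace("spam", "ham")
print(changed)                          # every occurrence is replaced
print(repeated)                         # the original string is unchanged
```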

The strip() function can be used to remove leading and trailing whitespace from a string, while the split() function allows you to divide a string into multiple parts based on a specified separator. Here's how these functions can be used:

```python
myString = " Hello, World! "
print(myString.strip()) # prints "Hello, World!"
print(myString.split(",")) # prints [' Hello', ' World! '] - myString itself was not changed by strip()
```
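
The two methods are often combined: split a line into parts, then strip each part. The sketch below assumes a hypothetical comma-separated line with stray spaces.

```python
line = "  apples, bananas ,cherries  "   # hypothetical input with stray spaces

parts = line.split(",")                  # split() returns a list of strings
cleaned = []
for part in parts:
    cleaned.append(part.strip())         # remove whitespace around each part

print(cleaned)
```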

These are just a few examples of the many string functions available in Python. By experimenting with these functions and exploring the Python documentation, you can discover a wide range of tools for working with and manipulating string values in your programs.

Read the text and answer the following questions

1. What is the first and last numeric index for a string with a given length?
2. How do you use square brackets to read a specific character from a string?
3. Can you change the contents of a specific string object?
4. Can you replace the contents of a string variable with another string?
5. How do you use len() to get the number of characters in a string?
6. Why are escape sequences necessary when building strings?
7. What are some common characters that need escaping, and what are their escape sequences?
8. Why do you need to escape backslashes when creating strings that should have these plain characters?

1. The first numeric index for a string with a given length is 0, and the last numeric index is length-1.
2. Square brackets are used to read a specific character from a string by using the string variable name followed by square brackets containing the index of the character.
3. No, the contents of a specific string object cannot be changed as strings are immutable in Python.
4. Yes, you can replace the contents of a string variable with another string by assigning a new string value to the variable.
5. You can use the len() function to get the number of characters in a string by passing the string as an argument to len().
6. Escape sequences are necessary when building strings to represent special characters like double quotes that might otherwise be interpreted as string delimiters or cause syntax errors.
7. Some common characters that need escaping and their escape sequences are:
- New Line: \n
- Carriage Return: \r
- Horizontal Tab: \t
- Single Quote ('): \'
- Double Quote ("): \"
- Backslash (\): \\
8. Backslashes need to be escaped when creating strings that should have plain backslashes because Python interprets a backslash followed by certain characters as an escape sequence. To include a plain backslash in a string, you need to escape the backslash itself.