How To Index and Slice Strings in Python
The Python string data type is a sequence made up of one or more individual characters that could consist of letters, numbers, whitespace characters, or symbols. Because a string is a sequence, it can be accessed in the same ways that other sequence-based data types are, through indexing and slicing.
This tutorial will guide you through accessing strings through indexing, slicing them through their character sequences, and go over some counting and character location methods.
How Strings are Indexed
Like the list data type that has items that correspond to an index number, each of a string’s characters also corresponds to an index number, starting with the index number 0.
For the string Sammy Shark! the index breakdown looks like this:
Character:    S    a    m    m    y  ' '    S    h    a    r    k    !
Index:        0    1    2    3    4    5    6    7    8    9   10   11
As you can see, the first S starts at index 0, and the string ends at index 11 with the ! symbol.
We also notice that the whitespace character between Sammy and Shark corresponds with its own index number. In this case, the index number associated with the whitespace is 5.
The exclamation point (!) also has an index number associated with it. Any other symbol or punctuation mark, such as *#$&.;?, is also a character and would be associated with its own index number.
The fact that each character in a Python string has a corresponding index number allows us to access and manipulate strings in the same ways we can with other sequential data types.
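One way to see this mapping for ourselves is to loop over the string with Python's built-in enumerate() function. The following is only a minimal sketch, and the variable name index_pairs is an illustrative choice rather than part of the tutorial.

```python
ss = "Sammy Shark!"

# Pair every character with its index number, starting at 0.
index_pairs = list(enumerate(ss))

for index, character in index_pairs:
    # repr() makes the whitespace character visible as ' '.
    print(index, repr(character))
```

The whitespace character appears in the listing at index 5, and the exclamation point at index 11.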
Accessing Characters by Positive Index Number
By referencing index numbers, we can isolate one of the characters in a string. We do this by putting the index numbers in square brackets. Let’s declare a string, print it, and call the index number in square brackets:
ss = "Sammy Shark!"
print(ss)
print(ss[4])
When we refer to a particular index number of a string, Python returns the character that is in that position. Since the letter y is at index number 4 of the string ss = "Sammy Shark!", when we print ss[4] we receive y as the output.
Index numbers allow us to access specific characters within a string.
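As a short sketch of positive indexing, assuming the same ss string, we can pull out a few characters by position. The variable names here are illustrative. The last part shows that asking for an index past the end of the string raises an IndexError instead of returning a character.

```python
ss = "Sammy Shark!"

first = ss[0]    # character at the start of the string
fifth = ss[4]    # the letter y
last = ss[11]    # the final character, at index len(ss) - 1
print(first, fifth, last)

# Index 12 does not exist in a 12-character string.
try:
    ss[12]
except IndexError as error:
    out_of_range = str(error)
    print("IndexError:", out_of_range)
```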
Accessing Characters by Negative Index Number
If we have a long string and we want to pinpoint an item towards the end, we can also count backwards from the end of the string, starting at the index number -1.
For the same string Sammy Shark! the negative index breakdown looks like this:
Character:    S    a    m    m    y  ' '    S    h    a    r    k    !
Index:      -12  -11  -10   -9   -8   -7   -6   -5   -4   -3   -2   -1
By using negative index numbers, we can print out the character r by referring to its position at the -3 index, with ss[-3].
Using negative index numbers can be advantageous for isolating a single character towards the end of a long string.
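To make the relationship between negative and positive index numbers concrete, here is a small sketch, again assuming the ss string from above. It relies on the fact that a negative index -n points at the same position as len(ss) - n.

```python
ss = "Sammy Shark!"

# Count three characters back from the end of the string.
from_end = ss[-3]

# The same position expressed as a positive index (12 - 3 = 9).
same_position = ss[len(ss) - 3]

print(from_end, same_position)  # r r
```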
Slicing Strings
We can also call out a range of characters from the string. Say we would like to print just the word Shark. We can do so by creating a slice, which is a sequence of characters within an original string. With slices, we can call multiple character values by creating a range of index numbers separated by a colon, as in ss[6:11], which returns Shark.
When constructing a slice, as in [6:11], the first index number is where the slice starts (inclusive), and the second index number is where the slice ends (exclusive), which is why in our example above the end of the range has to be the index number that occurs just after the substring ends.
When slicing strings, we are creating a substring, which is essentially a string that exists within another string. When we call ss[6:11], we are calling the substring Shark that exists within the string Sammy Shark!.
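The inclusive start and exclusive end can be checked with a short sketch, assuming the same ss string. The variable name word is illustrative.

```python
ss = "Sammy Shark!"

# Start at index 6 (inclusive) and stop at index 11 (exclusive).
word = ss[6:11]
print(word)  # Shark

# Because the end is exclusive, the slice length is end minus start.
print(len(word) == 11 - 6)  # True

# Slicing returns a new string; the original is unchanged.
print(ss)
```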
If we want to include either end of a string, we can omit one of the numbers in the string[n:n] syntax. For example, if we want to print the first word of string ss, “Sammy”, we can do so by typing ss[:5].
We did this by omitting the index number before the colon in the slice syntax, and only including the index number after the colon, which refers to the end of the substring.
To print a substring that starts in the middle of a string and continues to the end, we include only the index number before the colon. For example, ss[7:] prints hark!.
By including only the index number before the colon and leaving the second index number out of the syntax, the substring will go from the character at the index number called to the end of the string.
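Both forms of omission can be tried side by side. This is a minimal sketch assuming the ss string used throughout, with illustrative variable names.

```python
ss = "Sammy Shark!"

# No start index: the slice begins at the start of the string.
start_of_string = ss[:5]
print(start_of_string)  # Sammy

# No end index: the slice runs to the end of the string.
end_of_string = ss[7:]
print(end_of_string)  # hark!

# Two slices that meet at the same index rebuild the original string.
print(ss[:5] + ss[5:] == ss)  # True
```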
You can also use negative index numbers to slice a string. As we went through before, negative index numbers of a string start at -1, and count down from there until we reach the beginning of the string. When using negative index numbers, we’ll start with the lower number first as it occurs earlier in the string.
Let’s use two negative index numbers to slice the string ss, with ss[-4:-1].
The substring “ark” is printed from the string “Sammy Shark!” because the character “a” occurs at the -4 index number position, and the character “k” occurs just before the -1 index number position.
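As a quick check, the negative slice can be compared with its positive equivalent. This sketch assumes the same ss string. Since the string has 12 characters, -4 corresponds to index 8 and -1 to index 11.

```python
ss = "Sammy Shark!"

# The lower (earlier) negative index comes first.
negative_slice = ss[-4:-1]

# The same range written with positive index numbers.
positive_slice = ss[8:11]

print(negative_slice, positive_slice)  # ark ark
```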
Specifying Stride while Slicing Strings
String slicing can accept a third parameter in addition to two index numbers. The third parameter specifies the stride, which refers to how many characters to move forward after the first character is retrieved from the string. So far, we have omitted the stride parameter, and Python defaults to the stride of 1, so that every character between two index numbers is retrieved.
Let’s look again at the example above, ss[6:11], which prints out the substring “Shark”.
We can obtain the same result by including a third parameter with a stride of 1, as in ss[6:11:1].
So, a stride of 1 will take in every character between two index numbers of a slice. If we omit the stride parameter then Python will default to 1.
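The default can be confirmed with a two-line comparison, sketched here under the assumption that ss holds the same string as before.

```python
ss = "Sammy Shark!"

# Stride omitted: Python uses 1.
default_stride = ss[6:11]

# Stride written out explicitly.
explicit_stride = ss[6:11:1]

print(default_stride == explicit_stride)  # True
```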
If, instead, we increase the stride, we will see that characters are skipped.
Specifying the stride of 2 as the last parameter in the Python syntax ss[0:12:2] skips every other character. Only the characters at index numbers 0, 2, 4, 6, 8, and 10 are printed, giving SmySak.
Note that the whitespace character at index number 5 is also skipped with a stride of 2 specified.
If we use a larger number for our stride parameter, we will have a significantly smaller substring.
Specifying the stride of 4 as the last parameter in the Python syntax ss[0:12:4] prints only every fourth character. Only the characters at index numbers 0, 4, and 8 are printed, giving Sya.
In this example the whitespace character is skipped as well.
Since we are printing the whole string we can omit the two index numbers and keep the two colons within the syntax to achieve the same result, as in ss[::4].
Omitting the two index numbers and retaining colons will keep the whole string within range, while adding a final parameter for stride will specify the number of characters to skip.
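The larger strides described above can be sketched as follows, assuming the same ss string. The variable names are illustrative.

```python
ss = "Sammy Shark!"

# Stride of 2: index numbers 0, 2, 4, 6, 8, 10.
every_second = ss[0:12:2]
print(every_second)  # SmySak

# Stride of 4: index numbers 0, 4, 8.
every_fourth = ss[0:12:4]
print(every_fourth)  # Sya

# Omitting both index numbers keeps the whole string in range.
whole_string_fourth = ss[::4]
print(whole_string_fourth)  # Sya
```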
Additionally, you can indicate a negative numeric value for the stride, which we can use to print the original string in reverse order if we set the stride to -1, as in ss[::-1], which prints !krahS ymmaS.
The two colons without specified parameters will include all the characters from the original string, a stride of 1 will include every character without skipping, and negating that stride will reverse the order of the characters.
Let’s do this again but with a stride of -2, using ss[::-2].
In this example, we are dealing with the entirety of the original string as no index numbers are included in the parameters, and reversing the string through the negative stride. Additionally, by having a stride of -2 we are skipping every other character of the reversed string, so only the characters at index numbers 11, 9, 7, 5, 3, and 1 are printed, giving !rh ma.
The whitespace character is printed in this example.
By specifying the third parameter of the Python slice syntax, you are indicating the stride of the substring that you are pulling from the original string.
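A short sketch of both negative strides, assuming the same ss string, shows the reversal and the skipping together.

```python
ss = "Sammy Shark!"

# Stride of -1: every character, in reverse order.
reversed_string = ss[::-1]
print(reversed_string)  # !krahS ymmaS

# Stride of -2: every other character of the reversed string.
reversed_every_second = ss[::-2]
print(reversed_every_second)  # !rh ma
```

The whitespace character at index 5 survives in the second result because the stride of -2 lands on the odd index numbers when it starts from index 11.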
Counting Methods
While we are thinking about the relevant index numbers that correspond to characters within strings, it is worth going through some of the methods that count strings or return index numbers. This can be useful for limiting the number of characters we would like to accept within a user-input form, or for comparing strings. Like other sequential data types, strings can be counted through several methods.
We’ll first look at the built-in len() function, which returns the number of items in a sequence or collection, including strings, lists, tuples, and dictionaries.
Let’s print the length of the string ss with print(len(ss)).
The string “Sammy Shark!” is 12 characters long, including the whitespace character and the exclamation point symbol.
Instead of using a variable, we can also pass a string right into the len() function:
print(len("Let's print the length of this string."))
The len() function counts the total number of characters within a string.
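The length check mentioned earlier for limiting user input can be sketched with len(). In the example below, max_length is a hypothetical limit chosen only for illustration, and ss stands in for text a user might have entered.

```python
ss = "Sammy Shark!"

length = len(ss)
print(length)  # 12

literal_length = len("Let's print the length of this string.")
print(literal_length)  # 38

# Hypothetical limit for an input field, not part of the tutorial.
max_length = 10
too_long = len(ss) > max_length
print(too_long)  # True
```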
If we want to count the number of times either one particular character or a sequence of characters shows up in a string, we can do so with the str.count() method. Let’s work with our string ss = "Sammy Shark!" and count the number of times the character “a” appears, using ss.count("a"), which returns 2.
We can search for another character, such as a lower-case “s” with ss.count("s"), which returns 0.
Though the letter “S” is in the string, it is important to keep in mind that each character is case-sensitive. If we want to search for all the letters in a string regardless of case, we can use the str.lower() method to convert the string to all lower-case first. You can read more about this method in “An Introduction to String Methods in Python 3.”
Let’s try str.count() with a sequence of characters:
likes = "Sammy likes to swim in the ocean, likes to spin up servers, and likes to smile."
print(likes.count("likes"))
In the string likes, the character sequence “likes” occurs 3 times.
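The counting behavior described above, including the case-sensitivity caveat, can be sketched as follows. The variable names are illustrative, and the strings are the ones used in this tutorial.

```python
ss = "Sammy Shark!"
likes = "Sammy likes to swim in the ocean, likes to spin up servers, and likes to smile."

count_a = ss.count("a")
print(count_a)  # 2

# Lower-case "s" does not match the upper-case "S" characters.
count_lower_s = ss.count("s")
print(count_lower_s)  # 0

# Converting to lower-case first counts both cases together.
count_any_s = ss.lower().count("s")
print(count_any_s)  # 2

count_likes = likes.count("likes")
print(count_likes)  # 3
```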
We can also find at what position a character or character sequence occurs in a string. We can do this with the str.find() method, and it will return the position of the character based on index number.
We can check to see where the first “m” occurs in the string ss with ss.find("m").
The first character “m” occurs at the index position of 2 in the string “Sammy Shark!”. We can review the index number positions of the string ss above.
Let’s check to see where the first “likes” character sequence occurs in the string likes, with likes.find("likes").
The first instance of the character sequence “likes” begins at index number position 6, which is where the character l of the sequence likes is positioned.
What if we want to see where the second sequence of “likes” begins? We can do that by passing a second parameter to the str.find() method, which makes the search start at a particular index number. So, instead of starting at the beginning of the string, let’s start after the index number 9, with likes.find("likes", 9).
In this second example that begins at the index number of 9, the first occurrence of the character sequence “likes” begins at index number 34.
Additionally, we can specify an end to the range as a third parameter. Like slicing, we can do so by counting backwards using a negative index number:
print(likes.find("likes", 40, -6))
This last example searches for the position of the sequence “likes” between the index numbers of 40 and -6. Since the final parameter entered is a negative number it will be counting from the end of the original string.
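The searches above can be collected into one sketch, using the ss and likes strings from this tutorial. The last call is an addition for illustration: it shows that str.find() returns -1 when the sequence does not occur in the string.

```python
ss = "Sammy Shark!"
likes = "Sammy likes to swim in the ocean, likes to spin up servers, and likes to smile."

first_m = ss.find("m")
print(first_m)  # 2

first_likes = likes.find("likes")
print(first_likes)  # 6

# Start searching at index 9, past the first occurrence.
second_likes = likes.find("likes", 9)
print(second_likes)  # 34

# Search only between index 40 and six characters before the end.
third_likes = likes.find("likes", 40, -6)
print(third_likes)  # 64

# A sequence that is not present gives -1 rather than an error.
missing = likes.find("shark")
print(missing)  # -1
```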
The len() function and the string methods str.count() and str.find() can be used to determine length, counts of characters or character sequences, and index positions of characters or character sequences within strings.