Python string manipulation (2.7.x)

In some programming languages you have to write your own functions, or pull in a library, to deal with basic string processing tasks. In Python 2.7.x, several of these are already built in as methods on every string. They’ve already been optimised for performance, so they are usually a much better option than writing your own implementations.

 

Built-In String Functions

 

split

 

split() breaks a string up into a list of strings, using a separator string you provide as the point at which it splits the original string. The separator itself is not included in the results.

 

example_string = "cat-dog-parrot-mouse"
split_list = example_string.split("-")
print split_list
['cat', 'dog', 'parrot', 'mouse']
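
As a further sketch of how the separator behaves: it can be longer than one character, and if it is left out entirely, split() breaks on runs of whitespace instead. The variable names and sample values below are made up for illustration, and print is written with parentheses so the snippet also runs unchanged under Python 3.

```python
# Illustrative values only; they are not taken from the example above.
csv_line = "cat, dog, parrot"

# The separator may be more than one character long.
animals = csv_line.split(", ")
print(animals)

spaced = "  cat   dog\tparrot "

# With no separator, split() breaks on runs of whitespace and
# ignores leading and trailing whitespace.
words = spaced.split()
print(words)
```

Both calls produce the same three-element list, even though the second input is padded unevenly.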

 

 

Faster split

 

By default, split() splits on every instance of the separator in the string, but you might only want to split on the first few. In that case you can avoid the overhead of splitting on every single instance by passing a second argument, n, which tells it to split at most n times and then stop.

 

example_string = "cat-dog-parrot-mouse"
split_list = example_string.split("-", 1)
print split_list
['cat', 'dog-parrot-mouse']

 

In the above example, it splits on the first instance of "-" then stops, putting the rest of the unprocessed string in split_list[-1].

 

example_string = "cat-dog-parrot-mouse"
split_list = example_string.split("-", 2)
print split_list
['cat', 'dog', 'parrot-mouse']
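
One place a limited split tends to be handy is separating a key from a value when the value may itself contain the separator. The following is a minimal sketch under that assumption; config_line and its contents are hypothetical.

```python
# Hypothetical "key=value" line where the value also contains "=".
config_line = "path=/usr/bin=extra"

# Split once only, so everything after the first "=" stays together.
key, value = config_line.split("=", 1)
print(key)    # path
print(value)  # /usr/bin=extra
```

Without the limit, the split would return three pieces and the two-name unpacking would fail.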

 

 

join

 

join() is called on a separator string and takes a list of strings (or any other iterable of strings). It creates a new string from the elements, inserting the separator in-between each element in the new string.

 

list_of_strings = ["cat", "dog", "parrot", "mouse"]
new_string = "-".join(list_of_strings)
print new_string
cat-dog-parrot-mouse
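
join() only accepts strings, so a list of numbers has to be converted first. The sketch below assumes a small list of integers (the names are illustrative) and also shows that split() with the same separator reverses the join.

```python
# Illustrative list of non-string items.
numbers = [1, 2, 3]

# join() raises a TypeError on non-string items, so convert each one.
joined = ", ".join(str(n) for n in numbers)
print(joined)  # 1, 2, 3

# Splitting on the same separator recovers the pieces (as strings).
round_trip = joined.split(", ")
print(round_trip)
```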

 

 

replace

 

replace() returns a new string in which instances of a given substring have been replaced with another substring. The original string is left unchanged.

 

example_string = "cat-dog-parrot-mouse"
new_string = example_string.replace("-", "+")
print new_string
cat+dog+parrot+mouse
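
replace() also accepts an optional third argument that limits how many instances are replaced. A short sketch, reusing the example string from above; first_only is just an illustrative name.

```python
example_string = "cat-dog-parrot-mouse"

# The optional third argument caps the number of replacements.
first_only = example_string.replace("-", "+", 1)
print(first_only)      # cat+dog-parrot-mouse

# Strings are immutable, so the original is untouched.
print(example_string)  # cat-dog-parrot-mouse
```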

 

 

strip, lstrip and rstrip

 

These three functions are most useful for removing leading and trailing whitespace characters from your data, but you can also use them to remove other characters. If you pass a string as the parameter, it is treated as a set of characters to remove, not as a prefix or suffix to match.

Unlike replace(), however, these functions only operate on the ends of the string. lstrip() removes the characters from the left-hand side of the string, rstrip() removes them from the right-hand side, and strip() removes them from both ends.

To clarify the expected behaviour of each function, in the examples below they have been called with a non-whitespace string as the parameter ("-"). Whitespace characters can be removed by calling with no parameter.

 

 

strip

 

example_string = "-cat-dog-parrot-mouse-"
new_string = example_string.strip("-")
print new_string
cat-dog-parrot-mouse

 

 

lstrip

 

example_string = "-cat-dog-parrot-mouse-"
new_string = example_string.lstrip("-")
print new_string
cat-dog-parrot-mouse-

 

 

rstrip

 

example_string = "-cat-dog-parrot-mouse-"
new_string = example_string.rstrip("-")
print new_string
-cat-dog-parrot-mouse
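
For completeness, here is a small sketch of the two cases not shown above: calling strip() with no parameter to remove whitespace, and passing more than one character. The sample strings are made up for illustration.

```python
# Illustrative input with leading spaces, a tab and a trailing newline.
raw_line = "  \tcat-dog\n"

# No parameter: whitespace is removed from both ends.
clean = raw_line.strip()
print(clean)    # cat-dog

padded = "--==cat-dog==--"

# The parameter is a set of characters: any "-" or "=" at either
# end is removed, in any order, until another character is reached.
trimmed = padded.strip("-=")
print(trimmed)  # cat-dog
```

Note that the "-" in the middle of the string survives in both cases, since only the ends are examined.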

 

 

upper

 

Converts all lower-case letters in a string to their upper-case equivalents.

 

example_string = "heLLO AnD wELcomE"
new_string = example_string.upper()
print new_string
HELLO AND WELCOME

 

 

lower

 

Converts all upper-case letters in a string to their lower-case equivalents.

 

example_string = "heLLO AnD wELcomE"
new_string = example_string.lower()
print new_string
hello and welcome
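
A common use for lower() (or upper()) is comparing strings without regard to case. The sketch below assumes a hypothetical user_input value and a list of known names stored in lower case.

```python
# Hypothetical input with inconsistent capitalisation.
user_input = "PaRRot"
known_animals = ["cat", "dog", "parrot", "mouse"]

# Normalise the case before comparing.
is_known = user_input.lower() in known_animals
print(is_known)  # True
```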

 

 

More Useful String Functions

 

Length of a string

 

Strings are sequences of characters, much like lists (although they cannot be modified in place), so you can call len() to get the length of a string.
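
As a rough sketch of what len() counts: every character is included, whitespace and newlines too, so stripping a string first can change the result. The padded value below is illustrative.

```python
# Two spaces, six letters, one space and a newline.
padded = "  parrot \n"

raw_length = len(padded)
print(raw_length)       # 10

# Whitespace at the ends no longer counts once it is stripped.
stripped_length = len(padded.strip())
print(stripped_length)  # 6

# An empty string has length zero.
empty_length = len("")
print(empty_length)     # 0
```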

 

sample_string = "Count the characters in this string!"
length = len(sample_string)
print length
36