Indexing

Python for Linguists

Axel Bohmann

Language data as strings and lists

mystring = "I am a text, but Python does not know that. To Python, I am just a sequence of characters."
myWordList = ["I", "am", "a", "list", "of", "words."]
  • Python does not understand language; that task is yours
  • Both strings and lists are ordered sequences:
    • Position matters: Python understands that "a text" comes after "I am " in mystring and "list" is the fourth item in myWordList.
    • We can access elements in an ordered sequence via their index position.
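As a minimal sketch of what "ordered sequence" means in practice, the snippet below measures a string and a list and looks up the position of one word. The variable names and the shortened example string are illustrative assumptions, not part of the slides. Counting starts at 0, so the fourth item sits at index position 3.

```python
# Hypothetical example data: a shortened string and the word list from the slide.
my_string = "I am a text."
my_word_list = ["I", "am", "a", "list", "of", "words."]

# len() counts characters in a string (spaces and punctuation included)
# and items in a list.
string_length = len(my_string)
list_length = len(my_word_list)
print(string_length, list_length)

# list.index() returns the index position of the first matching item.
position = my_word_list.index("list")
print(position)
print(my_word_list[position])

# Order matters: the same items in a different order are a different list.
print(["am", "I"] == ["I", "am"])
```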

Indexing

  • Get item at index position 1 in the string "abcd":
"abcd"[1]
  • The first item in an ordered sequence is at index position 0.
  • What happens when you try this?
"abcd"[4]
  • What about this?
"abcd"[-1]

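Once you have tried the examples above yourself, the following sketch may help to compare notes. It uses a different, hypothetical string ("word") and catches the error instead of letting the program stop, so that all three cases can be run in one go.

```python
word = "word"  # hypothetical example string with 4 characters

first = word[0]    # index 0 is the first character
last = word[-1]    # negative indices count from the right end
print(first, last)

# Valid indices for a 4-character string run from 0 to 3 (or from -4 to -1).
# Asking for index 4 raises an IndexError, which we catch here.
try:
    word[4]
except IndexError as error:
    message = str(error)
    print("IndexError:", message)
```

Catching the error with try/except is only a convenience for this demonstration; in an interactive session you would simply see the traceback.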
Index positions

  • Indexing from the left:
element  A  B  C  D  E  F
index    0  1  2  3  4  5
(from 0 to len(s)-1)
  • Indexing from the right:
element  A  B  C  D  E  F
index   -6 -5 -4 -3 -2 -1
(from -len(s) to -1)
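The two numbering schemes are related in a simple way: subtracting len(s) from a left-hand index gives the right-hand index of the same element. A short sketch, assuming the six-letter string from the table:

```python
s = "ABCDEF"

for i in range(len(s)):
    negative = i - len(s)  # e.g. 0 - 6 = -6, 5 - 6 = -1
    # Both indices point at the same element.
    print(s[i], i, negative)
    assert s[i] == s[negative]
```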

Slicing

  • Not just one element, but a sub-sequence:
"Mississippi"[3:8]
  • The left index is inclusive: It is the position of the first element in the slice.

  • The right index is exclusive: It is the index position of the first element outside of the slice.

  • Try these:
"Mississippi"[3:1000]
"Mississippi"[2:-2]
"Mississippi"[-3:-1]

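The inclusive/exclusive rule can be checked step by step. The sketch below works through the first slice from this slide, "Mississippi"[3:8], and then applies the same rule to a list; the variable names are illustrative assumptions.

```python
word = "Mississippi"
start, stop = 3, 8

piece = word[start:stop]
print(piece)

# The left index is inclusive: word[start] is the first character of the slice.
print(word[start] == piece[0])

# The right index is exclusive: word[stop] is the first character NOT in the slice.
print(word[stop])

# A handy consequence: the slice contains stop - start elements.
print(len(piece) == stop - start)

# Slicing a list works the same way and returns a new list.
words = ["I", "am", "a", "list", "of", "words."]
print(words[1:3])
```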
Slicing rules

  • Trying to access a single index outside of the actual range of indices raises an IndexError: string/list index out of range.

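  • Going outside the range when slicing is more forgiving: indices beyond either end are silently cut back to the actual start or end of the sequence, and no error is raised.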

  • Also, when your slice coincides with the beginning/end of a sequence, you can omit the corresponding index:

"Mississippi"[3:]
"Mississippi"[:3]
"Mississippi"[:-2]

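A small sketch of these rules, using a hypothetical example word ("linguistics") rather than the one on the slides. It also shows a useful property of omitted indices: cutting a sequence at any position n and gluing the two halves back together returns the original.

```python
word = "linguistics"  # hypothetical example string with 11 characters

# Out-of-range slice indices are cut back to the ends of the string.
tail = word[2:1000]
print(tail)

# A slice that starts beyond the end is simply empty.
empty = word[50:]
print(repr(empty))

# Omitting an index means "from the very beginning" or "to the very end".
n = 4
left = word[:n]
right = word[n:]
print(left, right)

# The two halves always add up to the original sequence.
print(left + right == word)
```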
Let’s practice

Exercise

  • Define a string called test containing the word "hullaballoo".

  • Use slicing to extract the sequence "ball" from test.

  • Define a list called chomsky containing the words in the sentence "colorless green ideas sleep furiously".

  • Using slicing methods and other list operations, can you produce the list ["coloress", "green", "sleep", "furiously"]?

  • Challenge: What about ["furious", "sleep"]?

Better than sliced bread!

Thank you


For the presentation slides, visit

https://pylx-4-indexing.netlify.app/