Python for Loop: A Beginner’s Tutorial

https://ift.tt/NUDryo5 When you're working with data in Python, for loops are a powerful tool that can speed things up. But they can ...

https://ift.tt/NUDryo5

When you're working with data in Python, for loops are a powerful tool that can speed things up. But they can also be a little bit confusing when you're just starting out. In this tutorial, we're going to dive headfirst into for loops and learn how they can be used to do all sorts of interesting things when you're doing data cleaning or data analysis in Python.

Please check out our Comprehnsive Python Guide for a structured learning experience.

What Are `for` Loops?

In most data science tasks, Python for loops let you "loop through" an iterable object (such as a list, tuple, or set), processing one item at a time. During each iteration (i.e., each pass of the loop), Python updates an iteration variable to represent the current item, which you can then use within the loop's code block. For example, you could loop through a list of numbers, accessing each number individually to perform calculations or other actions.

An iterable object is any Python object that you can iterate over, accessing its items one by one. For example, lists and tuples are iterables that let you access each item in the order they appear. Strings are also iterable, allowing you to loop through each character individually, from start to finish. Dictionaries are iterable too, but looping through them gives you their keys, which you can then use to access corresponding values. Many objects in Python are considered iterable, but in this beginner's tutorial, we'll focus on the most common types.

Creating a `for` Loop

To create a Python for loop, you start by defining an iteration variable and the iterable object you want to loop through. The iteration variable temporarily holds each item from the iterable during each loop. Then, inside the loop, you specify the actions you want to perform using this variable. For example, when looping through a list, you first choose a variable name to represent each item, specify the list itself, and then decide what you'd like to do with each list item.

Let's look at a quick example. Imagine we have a Python list of names, and we want to print each name individually. Below, we'll create our list and use a for loop to iterate through it, printing each item in order.

our_list = ['Lily', 'Brad', 'Fatima', 'Zining']

for name in our_list:
    print(name)

Lily
Brad
Fatima
Zining

You might notice we didn't explicitly define the name variable before our loop—so where does it come from? This variable, called the iteration variable, is automatically created by the for loop. During each iteration, Python updates it to represent the current item from the iterable. You can actually name it anything you like—Python doesn't mind—but it's best to choose something clear and descriptive.

So, in the example above:

name refers to 'Lily' during the first iteration of the loop...
...then 'Brad' on the second iteration...
...then 'Fatima', and so on until the loop ends.

This works exactly the same no matter what we name our variable. If we rewrite our code using x instead of name, we'll get identical results:

for x in our_list:
    print(x)

Lily
Brad
Fatima
Zining

This looping technique applies to any iterable object. Strings, for example, are also iterable, so you can loop through them to access each character individually:

for letter in 'Hello':
    print(letter)

H
e
l
l
o

Using `for` Loops with Lists of Lists

In actual data analysis work, it's unlikely that we're going to be working with short, simple lists like the one above, though. Generally, we'll have to work with data sets in a table format, with multiple rows and columns. This kind of data can be stored in Python as a list of lists, where each row of a table is stored as a list within the list of lists, and we can use for loops to iterate through these as well.

To learn how to do this, let's take a look at a more realistic scenario and explore this small data table that contains some US prices and US EPA range estimates for several electric cars.

vehicle	range	price
Tesla Model 3 LR	310	49900
Hyundai Ioniq EV	124	30315
Chevy Bolt	238	36620

We can express this same data set as a list of lists, like so:

ev_data = [['vehicle', 'range', 'price'], 
           ['Tesla Model 3 LR', '310', '49900'], 
           ['Hyundai Ioniq EV', '124', '30315'],
           ['Chevy Bolt', '238', '36620']]

You may have noticed that in the list above, our range and price values are actually stored as strings rather than integers. It's not uncommon that you'll get data stored in this way, but for analysis, we'd want to convert those strings into integers so we can do some calculations with them. Let's use a for loop to iterate through our list of lists, selecting the price entry in each list and changing it from a string to an integer.

To do that, we need to do a few things. First, we need to skip the first row in our table, since those are the column names and we will get an error if we attempt to convert a non-numerical string like 'range' into an integer. We skip the first row using list slicing to select each row after the first row using ev_data[1:]. (If you need to brush up on this, or any other aspects of lists, check out our interactive course on Python programming fundamentals).

Then, we'll loop through the list of lists, and for each iteration we'll select the element in the range column, which is the second column in our table. We'll assign the value found in this column to a variable called ev_range. To do this, we'll use the index number 1 (in Python, the first entry in an iterable is at index 0, the second entry is at index 1, etc.).

Finally, we'll convert the range numbers to integers using Python's built-in int() function, and replace the original strings with these integers in our data set.

for row in ev_data[1:]:          # loop through each row in ev_data starting with row 2 (index 1)
    ev_range = row[1]           # each car's range is found in column 2 (index 1)
    ev_range = int(ev_range)    # convert each range value from a string to an integer
    row[1] = ev_range           # assign ev_range, which is now an integer, back to index 1 in each row

print(ev_data)

[['vehicle', 'range', 'price'], ['Tesla Model 3 LR', 310, '49900'], ['Hyundai Ioniq EV', 124, '30315'], ['Chevy Bolt', 238, '36620']]

Now that we've got those values stored as integers, we can also use a for loop to do some calculations. Let's say, for example, that we want to figure out the average range of an EV on this list. We'd need to add the range numbers together, and then divide them by the total number of cars in our list.

Again, we can use a for loop to select the specific column we need within our data set. We'll start by creating a variable called total_range where we can store the sum of the ranges. Then we'll write another for loop, again skipping the header row, and again identifying the second column (index 1) as the range value.

After that, all we need to do is add this value to total_range within our for loop, and then calculate the value using total_range divided by the number of cars after the loop has completed.

Note: We'll calculate the number of cars by counting the length of our list, minus the header row, in the code below. With a list as short as ours, we could also simply divide by 3, since the number of cars is very easy to count, but that would break our calculation if additional car data was added to the data set. For that reason, it's better to use len() to calculate the length of our car list in code so that if additional entries are added to our data set in the future, we can re-run this code and it will still produce the correct answer.

total_range = 0                     # create a variable to store the sum of range values

for row in ev_data[1:]:             # loop through each row in ev_data starting with row 2 (index 1)
    ev_range = row[1]               # each car's range is found in column 2 (index 1)
    total_range += ev_range         # add this number to the number stored in total_range

number_of_cars = len(ev_data[1:])   # calculate the length of our list, minus the header row

print(total_range / number_of_cars) # print the average range

224.0

Python for loops are powerful, and you can nest more complex instructions inside of them. To demonstrate this, let's repeat the above two steps for our 'price' column, this time within a single for Loop.

total_price = 0                     # create a variable to store the total range number

for row in ev_data[1:]:             # loop through each row in ev_data starting with row 2 (index 1)
    price = row[2]                  # each car's price is found in column 3 (index 2)
    price = int(price)              # convert each price number from a string to an integer
    row[2] = price                  # assign price, which is now an integer, back to index 2 in each row
    total_price += price            # add each car's price to total_price

number_of_cars = len(ev_data[1:])   # calculate the length of our list, minus the header row

print(total_price / number_of_cars) # print the average price

38945.0

We can also nest other elements, like if or else statements and even other for loops, within our for loops.

As an example, imagine we wanted to find every car with a range of greater than 200 miles in our list. We can start by creating a new empty list to hold our long-range car data. Then, we'll use a for loop to iterate through ev_data, the list of lists containing car data we created earlier, appending a car's row to our long-range list only if the its range value is above 200:

long_range_car_list = []       # creating a new list to store our long range car data

for row in ev_data[1:]:        # iterate through ev_data, skipping the header row
    ev_range = row[1]          # assign the range number, which is at index 1 in the row, to the range variable
    if ev_range > 200:         # append the whole row to long-range list if range is higher than 200
        long_range_car_list.append(row)

print(long_range_car_list)

[['Tesla Model 3 LR', 310, 49900], ['Chevy Bolt', 238, 36620]]

These operations would also be simple to perform by hand with such a tiny data set, of course. But these same techniques will work on data sets with thousands and thousands of rows, which can make cleaning, sorting, and analyzing huge datasets into very quick work.

Other Useful Techniques: Range, Break, and Continue

You can get a surprising amount of mileage out of for loops just by mastering the techniques described above, but let's dive even deeper and learn a few other things that may be helpful, even if you use them a bit less frequently in the context of data science work.

Range

for loops can be used in tandem with Python's range() function to iterate through each number in a specified range. For example:

for x in range(5, 9):
    print(x)

Note that Python doesn't include the maximum value of a range in the range count, which is why the number 9 doesn't appear above. If we wanted this code to count from 5 to 9 including 9, we'd need to change range(5, 9) to range(5, 10):

for x in range(5, 10):
    print(x)

If you only specify a single number in your range() function, Python will treat that as the maximum value, and assign a default minimum value of zero:

for x in range(3):
    print(x)

0
1
2

You can even add a third argument to the range() function to specify that you'd like to count in increments of a specific number. As you can see above, the default value is 1, but if you add a third argument of 3, for example, you can use range() with a for loop to count up in threes:

for x in range(0, 9, 3):
    print(x)

0
3
6

Break

By default, a Python for loop will loop through each possible iteration of the interable object you've assigned it. Normally when we're using a for loop, that's fine, because we want to perform the same action on each item in our list (for example).

Sometimes, though, we may want to stop your loop if a certain condition is met. In that circumstance, the break statement is useful. When used with an if statement inside of a for loop, break allows us to break out of that loop before its conclusion.

Let's take a look at a quick example first, using the list of names we created earlier called our_list):

for name in our_list:
    break
    print(name)

When we run this code, nothing is printed. That's because the break statement comes before print(name) in our for loop. When Python sees break, it stops executing the for loop and code that appears after break in the loop doesn't get run.

Let's add an if statement to this loop, so that we break out of the loop when Python gets to the name Zining:

for name in our_list:
    if name == 'Zining':
        break
    print(name)

Lily
Brad
Fatima

Here, we can see that the name Zining wasn't printed. Here's what's happening with each loop iteration:

Python checks to see if the first name is 'Zining'. It isn't, so it continues executing the code below our if statement, and prints the first name.
Python checks to see if the second name is 'Zining'. It isn't, so it continues executing the code below our if statement, and prints the second name.
Python checks to see if the third name is 'Zining'. It isn't, so it continues executing the code below our if statement, and prints the third name.
Python checks to see if the fourth name is 'Zining'. It is, so break is executed and the for loop ends.

Let's return to the code we wrote for collecting long-range EV car data and work through one more example. We'll insert a break statement that stops the look as soon as it encounters the string 'Tesla':

long_range_car_list = []    # creating our empty long-range car list again

for row in ev_data[1:]:     # iterate through ev_data as before looking for cars with a range > 200
    ev_range = row[1]          
    if ev_range > 200:         
        long_range_car_list.append(row)
    if 'Tesla' in row[0]:   # but if 'Tesla' appears in the vehicle column, end the loop
            break

print(long_range_car_list)

[['Tesla Model 3 LR', 310, 49900]]

In the code above, we can see that the Tesla was still added to long_range_car_list, because we appended it to that list before the if statement where we used break. The Chevy Bolt was not added to our list, because although it does have a range of more than 200 miles, break ended the loop before Python reached the Chevy Bolt row.

(Remember, for loops execute in sequential order. If the Bolt was listed before the Tesla in our original data set, it would have been included in long_range_car_list).

Continue

When we're looping through an iterable object like a list, we might also encounter situations where we'd like to skip a particular row or rows. For simple situations like skipping a header row, we can use list slicing, but if we want to skip rows based on more complex conditions, this quickly becomes impractical. Instead, we can use the continue statement to skip a single iteration ("loop") of a for loop and move to the next.

When Python sees continue while executing a for loop on a list, for example, it will stop at that point and move on to the next item on the list. Any code that comes below the continue will not be executed.

Let's go back our list of names (our_names) and use continue with an if statement to end a loop iteration before printing if the name is 'Brad':

for name in our_list:
    if name == 'Brad':
        continue
    print(name)

Lily
Fatima
Zining

Above, we can see that Brad's name was skipped, and the rest of the names in our list were printed in sequence. That illustrates the difference between break and continue in a nutshell:

break ends the loop entirely. When Python executes break, the for loop is over.
continue ends a specific iteration of the loop and moves to the next item in the list. When Python executes continue it moves immediately to the next loop iteration, but it does not end the loop entirely.

To get some more practice with continue, let's make a list of short-range EVs, using continue to take a slightly different approach. Instead of identifying the EVs with less than 200 miles of range, we'll write a for loop that adds every EV to our short-range list, but with a continue statement before we append to the new list that runs if the range is greater than 200:

short_range_car_list = []               # creating our empty short-range car list 

for row in ev_data[1:]:                 # iterate through ev_data as before 
    ev_range = row[1]          
    if ev_range > 200:                  #  if the car has a range of > 200
        continue                        # end the loop here; do not execute the code below, continue to the next row
    short_range_car_list.append(row)    # append the row to our short-range car list

print(short_range_car_list)

[['Hyundai Ioniq EV', 124, 30315]]

That's probably not the most efficient and readable way to create our short-range car list, but it does demonstrate how continue works, so let's walk through precisely what's happening here.

On its first loop, Python is looking at the Tesla row. That car does have an EV range of more than 200 miles, so Python sees the if statement is True, and executes the continue nested inside that if statement, which makes it immediately jump to the next row of ev_data to begin its next loop.

On the second loop, Python is looking at the next row, which is the Hyundai row. That car has a range of under 200 miles, so Python sees that the conditional if statement is not met, and executes the rest of the code in the for loop, appending the Hyundai row to short_range_car_list.

On the third and final loop, Python is looking at the Chevy row. That car has a range of more than 200 miles, which means the conditional if statement is True. Thus, Python once again executes the nested continue, which concludes the loop and, since there are no more rows of data in our data set, ends the for loop entirely.

Additional Resources

Hopefully at this point, you're feeling comfortable with for loops in Python, and you have an idea of how they can be useful for common data science tasks like data cleaning, data preparation, and data analysis.

Ready to take the next step? Here are some additional resources to check out:

Python tutorials — Our ever-expanding list of Python tutorials for data science.
Data Science Courses — Take your studies to the next level with fully interactive programming, data science, and stats courses, right in your browser.
Python's official documentation on For Loops - The official documentation doesn't go into as much depth as this tutorial, but it does review the basics of for loops explain some related concepts like While Loops.
Dataquest's Python Fundamentals for Data Science course - Our Python fundamentals course offers a from-scratch introduction to coding in Python for data science. It covers lists, loops, and a whole lot more, and you can code iteractively right from within your browser.
Dataquest's Intermediate Python for Data Science course - When you feel like you've mastered for loops and other core Python concepts, this is another interactive course that'll help you take your Python skills to the next level.
Free Datasets for Projects - Practice for loops on your own by grabbing a free data set from one of these sources and applying your new skills to large, real-world data sets. The data sets in the first section (for data visualization) should work particularly well for practice projects since they should already be relatively clean.

Best of luck, and happy looping!

from Dataquest https://ift.tt/QoekqBp
via RiYo Analytics

Page Nav

Ads Place

Python for Loop: A Beginner’s Tutorial

https://ift.tt/NUDryo5 When you're working with data in Python, for loops are a powerful tool that can speed things up. But they can ...

What Are for Loops?

Creating a for Loop

Using for Loops with Lists of Lists

Other Useful Techniques: Range, Break, and Continue

Range

Break

Continue

Additional Resources

Related Posts

ليست هناك تعليقات

Connect WIth Us

Top of the month

Step-by-Step Guide: Building and Presenting Your Data Portfolio (Data Coach Insights)

Building a Medical Prescription Scanner Using PaliGemma 2 Mix

Building AI-Powered Apps with DeepSeek-V3 and Gradio Using Prompt Engineering

Favorites, Scandals and History-Makers: Your 6-Point Guide to the Oscars

Latest Posts

Cloud Labels

بحث هذه المدونة الإلكترونية

الإبلاغ عن إساءة الاستخدام

المساهمون

Happy To Help You

Popular Tag

Latest Articles

Building a Medical Prescription Scanner Using PaliGemma 2 Mix

Building AI-Powered Apps with DeepSeek-V3 and Gradio Using Prompt Engineering

How to Measure the ROI of Your Data Team?

Qwen Chat: The AI Chatbot that’s Better than ChatGPT and Grok

Popular Posts

Spider-Man: No Way Home Torrents May Contain Crypto Malware, Cybersecurity Firm Warns

Onecoin Victims Petition Bulgaria for Seizure of Assets and Compensation

3air Leverages Blockchain Technology to Deliver Extensive Broadband Connectivity in Africa

AI Applications for Border Transportation

What Are `for` Loops?

Creating a `for` Loop

Using `for` Loops with Lists of Lists