# Iterating through tables in text file

Hello everyone.

This is the first task where I have no clear idea where to start:

Create a text file (using an editor, not necessarily Python) containing two tab-separated columns, with each column containing a number. Then use Python to read through the file you’ve created. For each line, multiply each first number by the second, and then sum the results from all the lines. Ignore any line that doesn’t contain two numeric columns.

So far I have written a couple of lines, but I am not sure where to go next:

``````filename = 'path'

def sum_columns(filename):
    sum = 0
    multiply = 0
    with open(filename) as f:
        pass  # not sure what to do from here
``````

Should I split my file with 2 columns and create a list of them, or should I do something else?

Given the exercise text, there are many ways to do this. In my opinion, the best way would be something like this:

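``````filename = 'path'

def sum_columns(filename):
    sum = 0  # running total (note: this shadows the built-in sum() inside the function)
    with open(filename) as f:
        # Read every line into a list; the with block closes the file for us.
        all_lines = f.readlines()
    for line in all_lines:
        # Split the line on the tab and multiply the two columns.
        splitted = line.split("\t")
        sum += int(splitted[0]) * int(splitted[1])
    return sum
``````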

This reads all lines of the file into `all_lines`. You then iterate over the lines, split each one on the tab, multiply the two numbers, and add the product to the `sum` variable you initialized to 0, which you return at the end. As hinted by someone else, you could also read the file line by line without storing every line in a list, but if the file is relatively small, you can go with my option.
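The line-by-line variant mentioned above could look roughly like this. It is only a sketch: the name `sum_columns_streaming` is made up for illustration, and like the version above it assumes every line holds exactly two tab-separated integers.

```python
def sum_columns_streaming(filename):
    # Hypothetical name; iterates over the file object instead of calling readlines().
    total = 0
    with open(filename) as f:
        for line in f:  # the file object yields one line at a time
            first, second = line.split("\t")
            # int() ignores the trailing newline on the second field
            total += int(first) * int(second)
    return total
```

Only one line is held in memory at a time, which matters mainly for very large files.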

If you have a file like this:

``````1   2
2   4
4   8
``````

You can do the following:

``````from functools import reduce

def is_int(s):
    # Return True if the string s can be converted to an int.
    try:
        int(s)
        return True
    except ValueError:
        return False

filename = 'path'

def sum_columns(filename):
    with open(filename) as f:
        lines = f.readlines()  # read the whole file into a list of lines
    return sum([
        reduce(lambda x, y: x * y, map(int, line.split("\t")))
        for line in lines
        if len(list(filter(is_int, line.split("\t")))) == 2
    ])
``````

## Explanation:

At the top I define a helper function that determines whether a string can be converted into an int. This will be used later to ignore lines that don't have 2 numbers. It's based on this answer.

``````def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False
``````

Then we open the file and read all lines into a variable. This is not the most efficient approach, since the file could be processed line by line without storing the whole file, but for smaller files the difference is negligible.

``````with open(filename) as f:
    lines = f.readlines()
``````

Next comes a single expression that performs the whole computation, but let's break it down:

First, we iterate through all the lines:

``````for line in lines
``````

Next, we only keep the lines that have exactly two numbers separated by tabs:

``````if len(list(filter(is_int, line.split("\t")))) == 2
``````

Finally, we turn each number in the line into `int`s, and multiply them all together:

``````reduce(lambda x, y: x * y, map(int,line.split("\t")))
``````

We then sum all of these products and return the result.
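To make the breakdown concrete, here is a small sketch that applies the filter and the product to a few sample lines. The sample strings and the helper name `line_product` are invented for illustration. Note that `int()` tolerates the trailing newline that `split("\t")` leaves on the last field.

```python
from functools import reduce

def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

def line_product(line):
    # Hypothetical helper: product for a line that passes the check, None if it is skipped.
    fields = line.split("\t")
    if len(list(filter(is_int, fields))) != 2:
        return None
    return reduce(lambda x, y: x * y, map(int, fields))

samples = ["2\t4\n", "8\n", "a\tb\n"]  # made-up sample lines
for line in samples:
    print(repr(line), "->", line_product(line))
```

One caveat of counting only the numeric fields: a line such as `1\t2\tx` also passes the check, because two of its fields are numeric, and then fails inside `int`. Checking that the line has exactly two fields as well may be safer.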

## Performance consideration

If performance is a concern, you can achieve the same thing by reading the contents line by line instead of pulling the whole file into a variable. It is less elegant, but more efficient:

``````def sum_columns(filename):
    total = 0
    with open(filename) as f:
        for line in f:
            # Skip lines that don't contain exactly two numbers.
            if len(list(filter(is_int, line.split("\t")))) != 2:
                continue
            total += reduce((lambda x, y: x * y), map(int, line.split("\t")))
    return total
``````

(Note that you still need the import and the `is_int` helper from the example above.)

input.txt

``````1 3
2 6
3 7
7 12
8
``````

script.py

``````with open('input.txt') as f:
    total = 0
    for line in f:
        numbers = line.split()  # split the line on whitespace
        try:
            line_value = int(numbers[0]) * int(numbers[1])
        except IndexError:
            # the line doesn't contain two numbers
            continue
        except ValueError:
            # a value couldn't be converted to a number
            continue
        total += line_value
``````

Here is a short solution:

``````def sum_columns(filename):
    counter = 0
    with open(filename) as file:
        for line in file:
            try:
                a, b = [int(x) for x in line.split('\t')]
                counter += a * b
            except ValueError:
                continue
    return counter

file_name = 'myfile.txt'
print(sum_columns(file_name))
``````

This is what a lot of people in the comments suggested (@martineau being the first). It is also something I learned just now, so I decided to put it in an answer.

In short, the loop iterates over each line and builds a list of two integers from it. The list comprehension is needed because `split` returns strings, and `int` raises a `ValueError` for anything that is not a number. The list is then unpacked into `a` and `b`, which is convenient because a failed unpacking (too few or too many values) also raises a `ValueError`, so a single `except` covers both cases. Finally, the two values are multiplied and added to the counter, and the counter is returned after the loop.
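The point that a single `except ValueError` covers both failure modes can be checked with a short sketch. The helper name `try_parse` and the sample lines are made up for illustration.

```python
def try_parse(line):
    # Hypothetical helper: returns (a, b) on success, or the error message on failure.
    try:
        a, b = [int(x) for x in line.split('\t')]
        return a, b
    except ValueError as error:
        return str(error)

print(try_parse("3\t7\n"))   # two integers: parses fine
print(try_parse("8\n"))      # only one value: unpacking fails
print(try_parse("1\t2\t3"))  # three values: unpacking fails
print(try_parse("x\t2\n"))   # not a number: int() fails
```

All three failing cases end up in the same `except` branch, which is why the short solution needs no separate `IndexError` handler.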