5 Replies - 1291 Views - Last Post: 12 March 2013 - 11:26 AM

#1 cannesMajor

An equivalent list comprehension?

Posted 11 March 2013 - 01:51 PM

I have successfully created a Python script that compares 2 csv files and combines each list based on a match of the first column of each row.

Here's the script:
import csv

with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
    reader1 = list(csv.reader(f1))
    reader2 = list(csv.reader(f2))
    for row1 in reader1:
        for row2 in reader2:
            if row2[0] == row1[0]:
                row2.pop(0)
                for item in row2:
                    row1.append(item)

with open('merged.csv', 'w', newline='') as fout:
    writer1 = csv.writer(fout)
    for row in reader1:
        print(row)
        writer1.writerow(row)



file1.csv:
ssid,loc,date,time
00:13,loc1,"Jan 1, 2013",1330
01:14,loc2,"Feb 2, 2013",1440
01:15,loc3,"Mar 3, 2013",1550
01:16,loc4,"Apr 4, 2013",1660
02:AA,loc5,"May 5, 2013",1700
02:BB,loc6,"Jun 6, 2013",1800
01:CC,loc7,"Jul 7, 2013",1900

file2.csv:
ssid,user,IP,mac
00:13,user1,192.168.1.2,AA:B1
01:14,user2,192.168.1.3,BB:C2
01:15,user3,192.168.1.4,CC:D3
01:16,user4,192.168.1.5,DD:E4
01:CC,user5,192.168.1.6,EE:F5
01:dd,user6,192.168.1.7,FF:G6

The script works, but it has occurred to me that the for loop that compares row2[0] with row1[0] and appends the items in row2 to row1 might be more efficient as a list comprehension. However, being a Python neophyte, I'm not quite up to it yet. Any suggestions (in the form of code, or algorithms/pseudocode as a teaching aid) would be appreciated.

P.S. This isn't homework; I'm teaching myself Python and using this script at work.


Replies To: An equivalent list comprehension?

#2 andrewsw

Posted 11 March 2013 - 03:26 PM

Rather than using a list comprehension, which I can't see the need for (at the stage in your code that you indicate), you can append the list items without looping through them:

if row2[0] == row1[0]:
    row2.pop(0)
    row1[len(row1):] = row2
    # for item in row2:
    #     row1.append(item)

It would be possible, I believe, to use a list comprehension to loop through both files simultaneously, without writing out the nested for-loops. However, I suspect it would still be performing the same nested looping behind the scenes, and the code would be harder to read (and to modify later).

#3 andrewsw

Posted 11 March 2013 - 03:33 PM

You can, alternatively, use extend:

# rather than (from my previous code)
# row1[len(row1):] = row2
row1.extend(row2)
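
For what it's worth, here is a quick sketch showing that the three forms discussed so far (the explicit loop, slice assignment, and extend) leave the list in the same state. The sample rows are borrowed from the files in the first post; the names `tail`, `looped`, `sliced` and `extended` are only for illustration.

```python
# Sample rows taken from file1.csv and file2.csv in the first post.
row1 = ['00:13', 'loc1', 'Jan 1, 2013', '1330']
row2 = ['00:13', 'user1', '192.168.1.2', 'AA:B1']
tail = row2[1:]  # everything except the shared key column

# 1. Explicit loop, as in the original script.
looped = list(row1)
for item in tail:
    looped.append(item)

# 2. Slice assignment past the end of the list.
sliced = list(row1)
sliced[len(sliced):] = tail

# 3. extend, which is probably the most readable of the three.
extended = list(row1)
extended.extend(tail)

print(looped == sliced == extended)
```

Using `row2[1:]` instead of `row2.pop(0)` also leaves row2 untouched, which matters if the same key could be compared more than once.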


#4 baavgai

Posted 11 March 2013 - 04:45 PM

I'm not sure a comprehension is applicable. I'd probably do something like:
import csv

with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
    # index the rows of file1 by their first column
    d1 = dict((row[0], row) for row in csv.reader(f1))
    l1 = list(csv.reader(f2))
# now it's clear the files are closed
for row in l1:
    if row[0] in d1:
        d1[row[0]].extend(row[1:])

for row in d1.values():
    print(row)



However, if your heart is set on it:
import csv

def extendAndReturn(i, x):
    i.extend(x)
    return i

with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
    l1 = list(csv.reader(f1))
    l2 = list(csv.reader(f2))
# compare keys with == and flatten the matching columns so they are appended item by item
merged = [extendAndReturn(i, [item for j in l2 if j[0] == i[0] for item in j[1:]]) for i in l1]
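
If the goal is a comprehension without a side-effecting helper, one possible sketch is to build a lookup dict from the second file first and then concatenate inside the comprehension. This assumes each key appears at most once in file2.csv (a later duplicate would overwrite an earlier one). The inline sample data and the names `rows1`, `extras` and `merged` are illustrative assumptions, not something from the thread.

```python
import csv
import io

# Inline stand-ins for file1.csv and file2.csv (a subset of the rows in the first post).
file1 = io.StringIO(
    'ssid,loc,date,time\n'
    '00:13,loc1,"Jan 1, 2013",1330\n'
    '02:AA,loc5,"May 5, 2013",1700\n'
)
file2 = io.StringIO(
    'ssid,user,IP,mac\n'
    '00:13,user1,192.168.1.2,AA:B1\n'
    '01:dd,user6,192.168.1.7,FF:G6\n'
)

rows1 = list(csv.reader(file1))
# Map each key in file2 to its remaining columns.
extras = {row[0]: row[1:] for row in csv.reader(file2)}

# Rows with no match in file2 are kept unchanged.
merged = [row + extras.get(row[0], []) for row in rows1]

for row in merged:
    print(row)
```

The dict turns the inner loop into a single lookup per row, so the comprehension stays readable and nothing is mutated in place.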



#5 Mekire

Posted 11 March 2013 - 05:19 PM

Maybe I'm being overly pedantic, but it seems bad form to load the entirety of both your files into memory all at once by converting them to lists. You should be able to keep them as generators. If those files are massive it could make a significant difference.

import csv

with open('file1.csv',newline='') as f1, open('file2.csv', newline='') as f2:
    reader1,reader2 = csv.reader(f1),csv.reader(f2)
    with open("merged.csv",'w',newline='') as fout:
        writer = csv.writer(fout)
        for row in reader1:
            for other in reader2:
                if row[0] == other[0]:
                    row += other[1:]
                    break
            f2.seek(0)
            print(row)
            writer.writerow(row)

Bit messy, but at least we aren't loading the entire files into memory or keeping a list of all the adjusted rows.

Of course I may just be making things unnecessarily complicated,
-Mek

Edit: Changed my code as I realized it wasn't doing what I wanted. Back to using a nested for loop, which rescans the second file for every row of the first. This should be an improvement memory-wise, but a loss performance-wise compared to the methods proposed by others that avoid a second loop.
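
A possible middle ground between the two approaches, sketched under the assumption that the second file is the smaller one and fits in memory: index only file2 in a dict and stream file1 one row at a time. The function name `merge_streams` and the inline sample data are made up for illustration.

```python
import csv
import io

def merge_streams(f1, f2, fout):
    """Stream f1 row by row while holding only f2's rows in memory."""
    # Index f2 by its first column; blank lines are skipped.
    lookup = {row[0]: row[1:] for row in csv.reader(f2) if row}
    writer = csv.writer(fout)
    for row in csv.reader(f1):
        if not row:
            continue
        # One dict lookup per row replaces the rescan of f2.
        writer.writerow(row + lookup.get(row[0], []))

# Inline stand-ins for the real files (a subset of the rows in the first post).
f1 = io.StringIO('ssid,loc,date,time\n00:13,loc1,"Jan 1, 2013",1330\n02:BB,loc6,"Jun 6, 2013",1800\n')
f2 = io.StringIO('ssid,user,IP,mac\n00:13,user1,192.168.1.2,AA:B1\n')
out = io.StringIO()

merge_streams(f1, f2, out)
print(out.getvalue())
```

With real files, the same function could be called with handles opened via `open(..., newline='')`, as in the code above.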

This post has been edited by Mekire: 11 March 2013 - 08:18 PM


#6 cannesMajor

Posted 12 March 2013 - 11:26 AM

Thanks for all the responses; they were all helpful. I did start off keeping the readers as generators but ran into problems (I must have done something wrong; as I said, I'm a neophyte). Here's my "final" version:

import csv

try:
    fin1 = input("What is the first file? ")
    fin2 = input("What is the second file? ")
    with open(fin1, newline='') as f1, open(fin2, newline='') as f2:
        reader1, reader2 = csv.reader(f1), csv.reader(f2)
        with open('merged.csv', 'w', newline='') as fout:
            writer1 = csv.writer(fout)
            for row in reader1:
                for other in reader2:
                    if row[0] == other[0]:
                        row += other[1:]
                        break
                f2.seek(0)
                writer1.writerow(row)
except IOError as err:
    print('File Error: ' + str(err))






