4 Replies - 623 Views - Last Post: 13 April 2013 - 04:34 AM

#1 cannesMajor

Iteration problem

Posted 12 April 2013 - 01:23 PM

I am writing a program that does some basic analysis on a CSV file. In the first step, the program iterates through the list of customer IDs and populates a set called custid; using a set ensures there are no duplicates. The program then takes each custid and iterates through the CSV file to count the number of times that custid appears in the report, storing the total in "hitcount". The problem is that the program only iterates through the CSV file for the first custid in the set.
import csv

def hit_count(file):
    try:
        with open(file, newline='') as fin:
            reader = csv.reader(fin)
            custid = set()
            count = 0
            hitcount = 0

#the following loop pulls out unique customer id's and puts them into a set called "custid"
            for row in reader:
                if row[0] != "0":
                    continue
                #print(row[13])
                custid.add(row[13])
                count +=1
            print("There were", count, "events in the collection.\n")
            print("There were", len(custid), "unique customer's collected\n")
            stats = list(custid)
            print(stats)

#the following loop counts the number of times each unique custid was collected and appends the count to "stats"
        with open(file, newline='') as fin:
            reader = csv.reader(fin)
            for item in range(len(stats)):
                item = stats.pop(0)
                #print("Checking", item)
                for row in reader:
                    #print("Checking", item, "against", row[13])
                    if item == row[13]:
                        hitcount += 1
                    stats.append((item, hitcount))
            for item in stats:
                print(item)
    except IOError as err:
        print("File Error", str(err))

hit_count("Austin_12032013.csv")



I created "stats" because I'm planning further functionality that will add to the list of analysis items. Eventually I will have a list of lists that includes the customer ID, the number of times each customer came to the store, which store they visited, etc.

At first the second loop wouldn't work at all. When I added the redundant "with open..." and csv reader lines, the loop worked for the first custid in the set but stopped after that. I can't figure out why the program won't iterate through the entire CSV file for each of the customer IDs in the custid set.

Replies To: Iteration problem

#2 cannesMajor

Re: Iteration problem

Posted 12 April 2013 - 01:59 PM

I changed the second part of the script to the following, with no success:

        with open(file, newline='') as fin:
            reader = csv.reader(fin)
            for item in custid:
                #print("second for loop running against:", item)
                for row in reader:
                    if row[13] == item:
                        hitcount +=1
                        print("comparing", row[13], "with", item)
                stats.append([item, hitcount])
                hitcount = 0
            print(stats)

    except IOError as err:
        print("File Error: " + str(err))

#3 woooee

Re: Iteration problem

Posted 12 April 2013 - 03:45 PM

You can use a dictionary and do both at once (we don't have the file, so this code is untested):
            custid = {}
            for row in reader:
                if row[0] == "0":
                    if row[13] not in custid:
                        custid[row[13]]=0
                    custid[row[13]] += 1 
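
To make the one-pass idea easy to try without the real report, here is a minimal sketch that runs the same dictionary logic over made-up in-memory data. The helper names (make_row, count_hits), the sample IDs, and the assumption that only columns 0 and 13 matter are illustrative, not taken from the original file:

```python
import csv
import io


def make_row(record_type, customer_id):
    """Build a hypothetical 14-column row: type in column 0, customer ID in column 13."""
    row = [""] * 14
    row[0] = record_type
    row[13] = customer_id
    return row


def count_hits(rows):
    """Count each customer ID among the rows whose first field is "0"."""
    counts = {}
    for row in rows:
        if row[0] == "0":
            if row[13] not in counts:
                counts[row[13]] = 0
            counts[row[13]] += 1
    return counts


# Write the sample rows to an in-memory "file", then read them back as CSV.
sample = io.StringIO()
csv.writer(sample).writerows([
    make_row("0", "A17"),
    make_row("0", "B42"),
    make_row("1", "A17"),  # skipped: first field is not "0"
    make_row("0", "A17"),
])
sample.seek(0)

hit_counts = count_hits(csv.reader(sample))
print(hit_counts)  # {'A17': 2, 'B42': 1}
```

Because the file is read only once, the nested loop over every customer ID is no longer needed.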

This post has been edited by woooee: 12 April 2013 - 03:47 PM

#4 Mekire

Re: Iteration problem

Posted 13 April 2013 - 04:04 AM

Your problem appears to be a misunderstanding of how iterating through a file works. Once you iterate through a file, the iterator is exhausted; it won't automatically start again at the beginning of the file for the next custid. You should use seek to reset the position in the file you wish to read from.

Try something like this maybe:
with open(file, newline='') as fin:
    reader = csv.reader(fin)
    for item in custid:
        for row in reader:
            pass  # Do your stuff
        fin.seek(0)  # rewind so the next custid sees every row again
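
The exhaustion and the effect of seek can be seen in isolation with a small sketch. It uses io.StringIO and made-up two-column data as a stand-in for the real file, which is an assumption made only so the example runs anywhere:

```python
import csv
import io

# Hypothetical data standing in for the real report.
fin = io.StringIO("0,A17\n0,B42\n0,A17\n")
reader = csv.reader(fin)

first_pass = list(reader)   # consumes every row
second_pass = list(reader)  # nothing left: the file position is at the end

fin.seek(0)                 # rewind the underlying file object
third_pass = list(reader)   # the rows are available again

print(len(first_pass), len(second_pass), len(third_pass))  # prints: 3 0 3
```

The empty second pass is what the original script ran into for every custid after the first one.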


@woooee: When doing stuff like that, dict.get is your best friend.

This:
if row[13] not in custid:
    custid[row[13]]=0
custid[row[13]] += 1
should be equivalent to this:
custid[row[13]] = custid.get(row[13],0) + 1
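
A quick way to check that equivalence is to run both forms over the same made-up list of IDs (the values below are hypothetical). The sketch also compares against collections.Counter from the standard library, which is an addition here and not something suggested earlier in the thread:

```python
from collections import Counter

ids = ["A17", "B42", "A17", "C03", "A17"]  # hypothetical customer IDs

# Explicit membership test before incrementing.
explicit = {}
for cid in ids:
    if cid not in explicit:
        explicit[cid] = 0
    explicit[cid] += 1

# Same counting with dict.get supplying the default of 0.
with_get = {}
for cid in ids:
    with_get[cid] = with_get.get(cid, 0) + 1

# Counter does the tallying in one call.
with_counter = Counter(ids)

print(explicit == with_get == dict(with_counter))  # True
```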

-Mek

#5 cannesMajor

Re: Iteration problem

Posted 13 April 2013 - 04:34 AM

Thanks to everyone for the tips and answers. The fin.seek(0) solved the problem. I have been considering using a dictionary and will work it out, but since I'm a newbie I thought building a list would be easy. Once I accomplish that then I will work on the dictionary building in version 2.
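
For what it's worth, the list and dictionary approaches are not far apart. A minimal sketch, assuming a dictionary of counts already exists (the values below are made up), shows how it could be turned into the list-of-lists "stats" layout:

```python
hit_counts = {"B42": 1, "A17": 3, "C03": 1}  # hypothetical counts per customer ID

# One [customer_id, hitcount] entry per customer, sorted by ID for stable output.
stats = [[cid, count] for cid, count in sorted(hit_counts.items())]

for entry in stats:
    print(entry)
```

Each inner list can later be extended with further analysis items, such as the store visited.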

Since I'm doing this for a friend I wasn't comfortable putting his data file on the forum. Thanks for understanding. Even without it y'all came through with the answers!