Getting and Merging data into tsv file help


19 Replies - 2834 Views - Last Post: 16 March 2013 - 05:44 PM

#16 baavgai

Re: Getting and Merging data into tsv file help

Posted 16 March 2013 - 03:55 AM

Just "doesn't work" is not a diagnostic. Which part doesn't work?

Examine your data. Does the first call work? Do you get regions? Does the second work, do you get countries? Is data full or empty?

If data is loaded, then what does lookup look like? If that looks fine, it's time to examine the lines being read from the file, the components from the split, etc.
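
A minimal sketch of that checklist, written for Python 3. The two sample rows and the sample line are made up for illustration, and `describe` is a hypothetical helper, not something from the original program:

```python
# Hypothetical stand-ins for the real query results; only the shapes matter.
data = [('Africa', 'Algeria'), ('Africa', 'Angola')]

def describe(name, value):
    """Return a one-line summary: name, type, size and repr of a value."""
    return "{0}: type={1} len={2} value={3!r}".format(
        name, type(value).__name__, len(value), value)

# 1. Is data full or empty, and what shape is it?
print(describe("data", data))

# 2. What does lookup look like?
lookup = dict((c.upper(), r) for r, c in data)
print(describe("lookup", lookup))

# 3. What does one line split into? repr() exposes hidden characters
#    such as \t, \r and \n that a plain print would hide.
line = "ALGERIA\t2381741\t38087812\n"
fields = line.split('\t')
print(describe("fields", fields))
```

Printing the repr rather than the raw value is the useful part: stray tabs or carriage returns show up as `\t` and `\r` instead of silently moving the cursor around.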

#17 jellyworms

Posted 16 March 2013 - 11:59 AM

Oh sorry, I meant the lookup line is giving me the error, so the program stops when it reaches that line. The region, country data still prints like so: [(u'Africa', u'Algeria'), (u'Africa', u'Angola'), ...].

But when it gets to making the dict, at the line lookup = dict((c.upper(), r) for r, c in data.data), I get AttributeError: 'list' object has no attribute 'data'.

However, if I change the line to lookup = dict((c.upper(), r) for r, c in data), then the program runs and hits country, area, population = line.split('\t') and gives me the ValueError: too many values to unpack

Sorry again, I'm still new so I'm trying to understand what's going on as well.

#18 baavgai

Posted 16 March 2013 - 03:59 PM

jellyworms, on 16 March 2013 - 01:59 PM, said:

However, if I change the line to lookup = dict((c.upper(), r) for r, c in data), then the program runs


So, stop right there. Using the original code offered generates a valid dictionary: check.
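
For anyone following along, here is a small Python 3 sketch of why iterating over data works while data.data does not. The two rows are sample values in the shape printed earlier in the thread; the real data may differ:

```python
# Two sample (region, country) rows in the shape printed earlier in the thread.
data = [('Africa', 'Algeria'), ('Africa', 'Angola')]

# A list of tuples can be iterated directly; each tuple unpacks into r, c.
lookup = dict((c.upper(), r) for r, c in data)
print(lookup)

# A plain list has no .data attribute, so asking for one raises AttributeError.
try:
    data.data
except AttributeError as err:
    error_message = str(err)
    print("AttributeError:", error_message)
```

The dictionary is keyed on the upper-cased country name, which is presumably why the lookup is done against the upper-case names in data.tsv.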

jellyworms, on 16 March 2013 - 01:59 PM, said:

hits country, area, population = line.split('\t') and gives me the ValueError: too many values to unpack


Translation: your data file, which was supposed to have three tab-separated fields on every line, doesn't always.

Guess you should check that.

Perhaps.
fields = line.split('\t')
if len(fields)==3:
	country, area, population = fields
	# ...
else:
	# report any line that does not split into exactly three fields
	print("Skipped: expected 3 fields, found {0} in: '{1}'\n".format(len(fields), line))
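
A slightly fuller sketch of the same check, assuming Python 3. `parse_rows` is a hypothetical helper and the sample text is made up; it collects the bad lines with their line numbers instead of printing them as it goes:

```python
import io

def parse_rows(handle, expected=3):
    """Split tab-separated lines; return (good_rows, problems)."""
    good, problems = [], []
    for number, line in enumerate(handle, start=1):
        # Strip the line ending first so it does not stick to the last field.
        fields = line.rstrip('\r\n').split('\t')
        if len(fields) == expected:
            good.append(tuple(fields))
        else:
            problems.append((number, len(fields), line))
    return good, problems

# Made-up sample: the second line has an extra tab-separated field.
sample = io.StringIO("MACAU\t28.2\t578025\nMONACO\t2\t30510\textra\n")
good, problems = parse_rows(sample)
print(good)
for number, count, line in problems:
    print("line {0}: expected 3 fields, found {1}: {2!r}".format(number, count, line))
```

Reporting the line number and the repr of the offending line makes it much easier to find the bad record in the file itself.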



#19 jellyworms

Posted 16 March 2013 - 05:00 PM

Hi again, sorry for being such a slow noob TT.TT

So I tried what you suggested
lookup = dict((c.upper(), r) for r, c in data)
#print lookup
for line in open("data.tsv", "r"):
	fields = line.split('\t')
	if len(fields)==3:
		country, area, population = fields
		print "\t".join([country, lookup[country], area, population])
	else:
		print("Skipped: expected 3 fields, found {0} in: '{1}'\n".format(len(fields), line))

And got this output:
GREENLAND      2166086 57695' AS)        12173    31409190population

But I honestly have no idea what that means though...

Even without this line:
print "\t".join([country, lookup[country], area, population])
I still get the same output as above...

here's a link to my data.tsv file if you would like to see the output for yourself.
http://wikisend.com/...430700/data.tsv

#20 baavgai

Posted 16 March 2013 - 05:44 PM

Another quick test:
>>> lines = [ line for line in open("data.tsv", "r") ]
>>> len(lines)
1
>>> lines
["country\tarea\tpopulation\rMACAU\t28.2\t578025\rMONACO\t2\t30510...



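You know, your methodology of doing for line in open is rather unique. More to the point, your file uses bare \r line endings, which plain "r" mode doesn't treat as line breaks here. Perhaps if you properly opened the file, in universal newline mode, and then read it:
>>> with open("data.tsv", "rU") as fh:
...     lines = fh.readlines()
... 
>>> len(lines) > 1
True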



Now you're onto something.
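
For reference, a hedged sketch of the same line-ending effect in Python 3, where the `newline` argument of `open` controls how lines are split. The file contents below are a made-up sample with bare \r endings, written to a temporary file rather than the real data.tsv:

```python
import os
import tempfile

# Made-up sample using bare \r line endings, like the lines shown above.
raw = b"country\tarea\tpopulation\rMACAU\t28.2\t578025\rMONACO\t2\t30510\r"

with tempfile.NamedTemporaryFile(delete=False, suffix=".tsv") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # newline="\n" splits only on \n, so the whole file comes back as one line.
    with open(path, "r", newline="\n") as fh:
        strict_lines = fh.readlines()

    # The default (newline=None) enables universal newlines: \r, \n and \r\n
    # all end a line and are translated to \n.
    with open(path, "r") as fh:
        universal_lines = fh.readlines()
finally:
    os.remove(path)

print(len(strict_lines), len(universal_lines))
```

With the default settings Python 3 already splits on \r, so the one-giant-line symptom is specific to reading without universal newlines.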