Parsing a .csv file format

I have a .csv file, need to transfer it to create a Jtable, and then u


20 Replies - 12029 Views - Last Post: 28 November 2008 - 07:29 PM

#1 Ironwarrior  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 27
  • Joined: 26-November 08

Parsing a .csv file format

Post icon  Posted 28 November 2008 - 09:41 AM

I went for the multi-dimensional array approach as follows:

public static void parseCSVFile(String userInputFile)throws Exception
	{
			String tempString;
			Scanner fileScanner;
			fileScanner = new Scanner(new File(userInputFile));
			int fileScannerLength = 0;
			while(fileScanner.hasNext())
			{
				fileScannerLength = fileScannerLength + fileScannerLength;
			}
			while(fileScanner.hasNext())
			{
				tempString = fileScanner.nextLine();
				StringTokenizer tokenizer = new StringTokenizer(tempString, ",");
				int numberOfColumns = 4;
				Object[][] parsedInfomation = new Object[fileScannerLength][numberOfColumns];
				for(int index = 0; index < fileScannerLength; index++)
				{
					while (tokenizer.hasMoreTokens())
					{
						parsedInfomation[index][0] = tokenizer.nextToken();
						parsedInfomation[index][1] = tokenizer.nextToken();
						parsedInfomation[index][2] = tokenizer.nextToken();
						parsedInfomation[index][3] = tokenizer.nextToken();
						parsedInfomation[index][4] = tokenizer.nextToken();
						parsedInfomation[index][5] = tokenizer.nextToken();
						parsedInfomation[index][6] = tokenizer.nextToken();
						parsedInfomation[index][7] = tokenizer.nextToken();
						parsedInfomation[index][8] = tokenizer.nextToken();
						parsedInfomation[index][9] = tokenizer.nextToken();
						parsedInfomation[index][0] = tokenizer.nextToken();
					}
				} 
			}
		}



But I'm not sure if it's a good way to do it. This is the format that I am trying to process:

team one name, score for team one, score for team two, team two name.


I realise that I could indeed make a class with these properties... and then make an instance of each.

Just need some guidelines on the correct way of doing things.


Replies To: Parsing a .csv file format

#2 g00se  Icon User is offline

  • D.I.C Lover
  • member icon

Reputation: 2833
  • View blog
  • Posts: 12,000
  • Joined: 20-September 08

Re: Parsing a .csv file format

Posted 28 November 2008 - 10:43 AM

I would use the following

http://ostermiller.org/utils/CSV.html

#3 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 2071
  • View blog
  • Posts: 4,307
  • Joined: 11-December 07

Re: Parsing a .csv file format

Posted 28 November 2008 - 10:58 AM

This bit doesn't look right to me. Don't you need to move the pointer back to the start of the file when you've read through it?

            while(fileScanner.hasNext())
            {
                fileScannerLength = fileScannerLength + fileScannerLength;
            }
            while(fileScanner.hasNext())
            {
                // All the rest of your code
            }


Anyway, to answer your question, it really depends what you need for your program. Is this just an exercise to read a CSV file? Is the CSV parsing just something you need to do to get the data into your program so you can do something with it? I'm going to assume you want to do both.

You've already written a CSV parser. Let's look at that Results class. You already know you want 4 methods to get the results out. You need a constructor. How about: public Results(String nameA, int scoreA, String nameB, int scoreB) Write the class and fill in the blanks. Then you'll have a useful, meaningful class for the rest of your program.
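
A minimal sketch of what that Results class might look like. The getter names and the `fromCsvLine` helper are assumptions made up for illustration; only the constructor signature comes from the suggestion above. It assumes every line has exactly four comma-separated fields in the order team one name, team one score, team two score, team two name.

```java
// Hypothetical sketch of the Results class; getter names are illustrative.
class Results {
    private final String nameA;
    private final int scoreA;
    private final String nameB;
    private final int scoreB;

    public Results(String nameA, int scoreA, String nameB, int scoreB) {
        this.nameA = nameA;
        this.scoreA = scoreA;
        this.nameB = nameB;
        this.scoreB = scoreB;
    }

    public String getNameA() { return nameA; }
    public int getScoreA() { return scoreA; }
    public String getNameB() { return nameB; }
    public int getScoreB() { return scoreB; }

    // Illustrative helper (not from the thread): builds a Results from one line.
    // File order is nameA, scoreA, scoreB, nameB, so fields 2 and 3 are swapped
    // when passed to the constructor.
    public static Results fromCsvLine(String line) {
        String[] parts = line.split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        return new Results(parts[0].trim(),
                           Integer.parseInt(parts[1].trim()),
                           parts[3].trim(),
                           Integer.parseInt(parts[2].trim()));
    }
}
```

With something like this, the rest of the program can work with `Results` objects instead of indexing into raw arrays.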

Let's look at your CSV parser. There's one thing missing. How do you get the data out of that class? You need a return type on the method:

public static String[][] parseCSVFile(String userInputFile)throws Exception


When reading CSV files you're always going to get Strings, so it's probably best to abandon the Objects you're using and go with Strings instead. You're good to go now, but with only a little modification your CSV class will be useful for any program you write that needs to read CSV files. The biggest problem is the limit on the number of columns. You could do a similar thing as before and count the items on each line before declaring that line's array... or you could simplify the whole code and use Vectors. (Vectors can be thought of as arrays that auto-resize, so you can just keep adding items and watch them grow.)

A bit like your 2D array, you could have a Vector of Vectors. Each sub-vector holds a line.

Hope that helps! Let me know what you decide to do.

#4 Ironwarrior  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 27
  • Joined: 26-November 08

Re: Parsing a .csv file format

Posted 28 November 2008 - 03:27 PM

Sorry for not quoting the post above, but I don't see the need for the duplication.

So, I have code that I cannot fully comprehend. I have taken it down to the bare basics:
public static void parseCSVFile(String userInputFile)throws Exception
	{
			String tempString;
			Scanner fileScanner;
			fileScanner = new Scanner(new File(userInputFile));
			int fileScannerLength = 0;
			while(fileScanner.hasNext())
			{
				fileScannerLength = fileScannerLength + fileScannerLength;
			}
			StringTokenizer tokenizer = new StringTokenizer(userInputFile, ",");
			ArrayList<String> teamOneName = new ArrayList<String>();
			ArrayList<String> teamOneScore = new ArrayList<String>();
			ArrayList<String> teamTwoScore = new ArrayList<String>();
			ArrayList<String> teamTwoName = new ArrayList<String>();

			while(fileScanner.hasNext())
			{
			   int index = 0;
			   while(tokenizer.hasMoreTokens())
			   {
				   if (index == 0)
				   {
					   teamOneName.add(tokenizer.nextToken());
				   }
				   if (index == 1)
				   {
					   teamOneScore.add(tokenizer.nextToken());
				   }
				   if (index == 2)
				   {
					   teamTwoScore.add(tokenizer.nextToken());
				   }
				   if (index == 3)
				   {
					   teamTwoName.add(tokenizer.nextToken());
				   }
				   index++;
			   }
			}
	}


I'm just not sure why it's not working: it compiles but doesn't parse the file, it just hangs.

A nice and easy method... no?

This post has been edited by Ironwarrior: 28 November 2008 - 04:02 PM


#5 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 2071
  • View blog
  • Posts: 4,307
  • Joined: 11-December 07

Re: Parsing a .csv file format

Posted 28 November 2008 - 04:07 PM

I didn't pick up on it last time but this line doesn't do anything. You're always adding zero to zero:

fileScannerLength = fileScannerLength + fileScannerLength;


Anyway, if you're using ArrayLists it doesn't matter. You don't need to know the length of the file.
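
For what it's worth, a loop like that also never finishes on a non-empty file, because hasNext() only checks for input and does not consume it, which would fit the hang you describe. If you ever did need the count, a minimal sketch would consume each line as it counts. The class and method names here are made up for illustration, and it reads from a String rather than a File to keep the example self-contained.

```java
import java.util.Scanner;

// Illustrative only: LineCounter and countLines are hypothetical names.
class LineCounter {
    // Counts lines by actually reading them; checking hasNextLine() alone
    // never advances the scanner, so the loop would spin forever.
    static int countLines(String text) {
        Scanner scanner = new Scanner(text);
        int count = 0;
        while (scanner.hasNextLine()) {
            scanner.nextLine(); // consume the line so the loop can terminate
            count++;
        }
        scanner.close();
        return count;
    }

    public static void main(String[] args) {
        String sample = "Barnsley,0,0,Bristol City\nBlackpool,2,2,Crystal Palace\n";
        System.out.println(countLines(sample)); // prints 2
    }
}
```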

I don't really see how you could be lost. I didn't run your code before but it was definitely along the right lines. If I were you, I'd stick with whatever way you were doing it before. Get it working and finish the program. If you've got time later then come back and improve your CSV code.

#6 Ironwarrior  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 27
  • Joined: 26-November 08

Re: Parsing a .csv file format

Posted 28 November 2008 - 04:12 PM

Thanks for all your help (gave you a helpful thing).

I just thought this would be the most straightforward way of doing it, I'm just not sure what's wrong with it... ??

Realised I can remove the file length code. We have to explain all our code, so it's no good having overly complex code.

#7 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 2071
  • View blog
  • Posts: 4,307
  • Joined: 11-December 07

Re: Parsing a .csv file format

Posted 28 November 2008 - 04:22 PM

OK, you edited that in while I was typing. You don't need all those ifs. Just:

teamOneName.add(tokenizer.nextToken());
teamOneScore.add(tokenizer.nextToken());
teamTwoScore.add(tokenizer.nextToken());
teamTwoName.add(tokenizer.nextToken());


The first item in a line is always team 1's name.
The next is always the score... etc.

This way is going to work, and if speed is what you're after then that's fine.

A more general way to read the file would be into this structure:

ArrayList<ArrayList<String>>

This is just like a 2D array. This is an ArrayList of ArrayLists just like a 2D array is an array of arrays.

Something like this would do it:

		ArrayList<ArrayList<String>> data = new ArrayList<ArrayList<String>>();

		while(fileScanner.hasNext())
		{
			ArrayList<String> thisLine = new ArrayList<String>();
			StringTokenizer tokenizer = new StringTokenizer(fileScanner.next(), ",");
			while(tokenizer.hasMoreTokens()) {
				thisLine.add(tokenizer.nextToken());
			}
			data.add(thisLine);
		}



And thanks for the help thing. I appreciate it. :)

#8 Ironwarrior  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 27
  • Joined: 26-November 08

Re: Parsing a .csv file format

Posted 28 November 2008 - 04:32 PM

Another helpful post, I kind of got it working. But it throws an error here:

				  try 
					{
						CSVParser.parseCSVFile(userSelectedFile.toString());
					} catch (Exception ex) 
					{
						System.out.println("An error has occured");
					}



Final method code:
  public static void parseCSVFile(String userInputFile)throws Exception
	{
			Scanner fileScanner;
			fileScanner = new Scanner(new File(userInputFile));
			StringTokenizer tokenizer = new StringTokenizer(userInputFile, ",");
			ArrayList<String> teamOneName = new ArrayList<String>();
			ArrayList<String> teamOneScore = new ArrayList<String>();
			ArrayList<String> teamTwoScore = new ArrayList<String>();
			ArrayList<String> teamTwoName = new ArrayList<String>();

			while(fileScanner.hasNext())
			{
			   while(tokenizer.hasMoreTokens())
			   {
				  teamOneName.add(tokenizer.nextToken());
				  teamOneScore.add(tokenizer.nextToken());
				  teamTwoScore.add(tokenizer.nextToken());
				  teamTwoName.add(tokenizer.nextToken());
			   }
			}
			System.out.println(teamOneName.get(1));
	}


#9 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 2071
  • View blog
  • Posts: 4,307
  • Joined: 11-December 07

Re: Parsing a .csv file format

Posted 28 November 2008 - 04:54 PM

I've not seen the exception or run your code or seen your file. It could be something else but this needs to be fixed anyway:

Every line you read in has 4 items. There's no need for the while loop or even to check if the tokenizer has any tokens in it. Also, every time you read a line in, you need to read another one. You can change your loops to this:

		while(fileScanner.hasNext())
		{
			StringTokenizer tokenizer = new StringTokenizer(fileScanner.next(), ",");
			teamOneName.add(tokenizer.nextToken());
			teamOneScore.add(tokenizer.nextToken());
			teamTwoScore.add(tokenizer.nextToken());
			teamTwoName.add(tokenizer.nextToken());
		}



You also don't need the line where you declare tokenizer up top. In fact, get rid of it. If you leave it there, you will always skip the first line.

Because your current program only reads in one line (because it never updates tokenizer) your println() throws the exception. You've only got one value in the ArrayList. You ask for item 1 but since ArrayLists start counting from zero, it looks for the second item. There isn't one so it throws an exception
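
To make the zero-based indexing concrete, here is a small sketch. The `getOrDefault` helper is a made-up name for illustration, not something from your code; it just shows one way to avoid the exception while debugging.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: IndexDemo and getOrDefault are hypothetical names.
class IndexDemo {
    // Returns the element at index, or the fallback when the list is too short.
    static String getOrDefault(List<String> list, int index, String fallback) {
        return (index >= 0 && index < list.size()) ? list.get(index) : fallback;
    }

    public static void main(String[] args) {
        List<String> teamOneName = new ArrayList<String>();
        teamOneName.add("Barnsley");

        System.out.println(teamOneName.get(0));                       // first item, index 0
        System.out.println(getOrDefault(teamOneName, 1, "<missing>")); // no second item yet
        // teamOneName.get(1) here would throw IndexOutOfBoundsException.
    }
}
```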

#10 Ironwarrior  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 27
  • Joined: 26-November 08

Re: Parsing a .csv file format

Posted 28 November 2008 - 05:13 PM

Right, I feel that we are getting there. I have updated the code as follows:

	public static void parseCSVFile(String userInputFile)throws Exception
	{
			Scanner fileScanner;
			fileScanner = new Scanner(new File(userInputFile));
			ArrayList<String> teamOneName = new ArrayList<String>();
			ArrayList<String> teamOneScore = new ArrayList<String>();
			ArrayList<String> teamTwoScore = new ArrayList<String>();
			ArrayList<String> teamTwoName = new ArrayList<String>();

			while(fileScanner.hasNext())
			{
				StringTokenizer tokenizer = new StringTokenizer(fileScanner.next(), ",");
				teamOneName.add(tokenizer.nextToken());
				teamOneScore.add(tokenizer.nextToken());
				teamTwoScore.add(tokenizer.nextToken());
				teamTwoName.add(tokenizer.nextToken());
			}
			System.out.println(teamOneName.get(1));
	}



And I got it to print the thrown exception, and it's a "no such element" exception.

Does it make a difference that teamOneScore.add will be an int?

Pretty sure that you can have an int but declare it as a String, as long as you don't want to use arithmetic operators.

Thanks,

Iron Warrior

#11 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 2071
  • View blog
  • Posts: 4,307
  • Joined: 11-December 07

Re: Parsing a .csv file format

Posted 28 November 2008 - 05:38 PM

Actually, it will be a String, something like "7", and to do any operation with it, you'd have to convert it to an int with something like Integer.parseInt();

It's perfectly fine storing it as a String (like you are now) if you don't want to use it for calculations.
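
A quick sketch of the difference, in case you do need arithmetic later. The class and method names are made up for illustration; `Integer.parseInt` is the standard library call.

```java
// Illustrative only: ScoreParsing and totalGoals are hypothetical names.
class ScoreParsing {
    // Converts both score Strings to ints before adding them.
    static int totalGoals(String scoreA, String scoreB) {
        return Integer.parseInt(scoreA.trim()) + Integer.parseInt(scoreB.trim());
    }

    public static void main(String[] args) {
        String scoreA = "3";
        String scoreB = "1";
        System.out.println(scoreA + scoreB);            // String concatenation: 31
        System.out.println(totalGoals(scoreA, scoreB)); // numeric addition: 4
    }
}
```

Note that `Integer.parseInt` throws a NumberFormatException if the text is not a valid integer, so a malformed score in the file would need handling.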

OK, time to do some debugging. I've got a sneaking suspicion the loop isn't being entered at all.

Delete your current tokenizer and replace it with these:

		String currentLine = fileScanner.next();
		System.out.println("***" + currentLine + "***"); // This line should be removed when the program is working
		StringTokenizer tokenizer = new StringTokenizer(currentLine, ",");



This will tell you what the computer is reading in for each line. I suspect it won't display anything which means the loop isn't being entered. The likely cause of that is the file not being opened properly (wrong path or something).

#12 Ironwarrior  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 27
  • Joined: 26-November 08

Re: Parsing a .csv file format

Posted 28 November 2008 - 05:46 PM

 
run:
***Barnsley,0,0,Bristol***
***City***
***Birmingham***
***City,3,1,Sheffield***
***Wednesday***
***Blackpool,2,2,Crystal***
***Palace***
***Charlton***
***Athletic,1,1,Burnley***
***Coventry***
***City,1,1,Derby***
***County***
***Norwich***
***City,2,1,Doncaster***
***Rovers***
***Nottingham***
***Forest,0,1,Cardiff***
***City***
***Plymouth***
***Argyle,1,3,Ipswich***
***Town***
***Reading,0,0,Queens***
***Park***
***Rangers***
***Sheffield***
***United,1,0,Preston***
***North***
***End***
***Swansea***
***City,3,0,Southampton***
***Watford,2,3,Wolverhampton***
***Wanderers***
java.lang.IndexOutOfBoundsException: Index: 1, Size: 0
BUILD SUCCESSFUL (total time: 30 seconds)



#13 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 2071
  • View blog
  • Posts: 4,307
  • Joined: 11-December 07

Re: Parsing a .csv file format

Posted 28 November 2008 - 06:04 PM

Something is interpreting spaces as line breaks.
Has your file become corrupted? Open it in Notepad and make sure 4 items are on each line. It would help if you could post the text file anyway. If that's not the problem, it looks like the scanner is doing something funny.

I also don't understand why the StringTokenizer isn't throwing an exception when there is only one item in the line. If you post the file I'll have a good look and try to get it working.

#14 Ironwarrior  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 27
  • Joined: 26-November 08

Re: Parsing a .csv file format

Posted 28 November 2008 - 06:07 PM

 Barnsley,0,0,Bristol City
Birmingham City,3,1,Sheffield Wednesday
Blackpool,2,2,Crystal Palace
Charlton Athletic,1,1,Burnley
Coventry City,1,1,Derby County
Norwich City,2,1,Doncaster Rovers
Nottingham Forest,0,1,Cardiff City
Plymouth Argyle,1,3,Ipswich Town
Reading,0,0,Queens Park Rangers
Sheffield United,1,0,Preston North End
Swansea City,3,0,Southampton
Watford,2,3,Wolverhampton Wanderers


Tada... not sure what's happened, it's a .csv file...

But it opens and displays as 4 items on one line normally..

#15 cfoley  Icon User is offline

  • Cabbage
  • member icon

Reputation: 2071
  • View blog
  • Posts: 4,307
  • Joined: 11-December 07

Re: Parsing a .csv file format

Posted 28 November 2008 - 06:37 PM

Found it. From the Scanner specification in the Java API: "A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace." So, every time it meets a line break, tab or SPACE it starts a new token.
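
A small sketch that reproduces the effect on one line of your file. The class and method names are made up for illustration, and it scans a String rather than the file to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Illustrative only: DefaultDelimiterDemo and tokens are hypothetical names.
class DefaultDelimiterDemo {
    // Collects every token next() returns with the default whitespace delimiter.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<String>();
        Scanner scanner = new Scanner(input);
        while (scanner.hasNext()) {
            result.add(scanner.next());
        }
        scanner.close();
        return result;
    }

    public static void main(String[] args) {
        // The spaces inside the team names split the line into three tokens.
        System.out.println(tokens("Birmingham City,3,1,Sheffield Wednesday"));
        // prints [Birmingham, City,3,1,Sheffield, Wednesday]
    }
}
```

That matches the `***Birmingham***`, `***City,3,1,Sheffield***`, `***Wednesday***` lines in your debug output.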

When you instantiate the scanner, specify the delimiter like this:

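		// Split on the platform's line separator so each token is a whole line.
		// This assumes the file uses the same line endings as the machine running the code.
		String delimiter = System.getProperty("line.separator");
		fileScanner = new Scanner(new File(userInputFile));
		fileScanner.useDelimiter(delimiter);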



That should sort it out.
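
An alternative that does not depend on the platform's line endings is to read with hasNextLine() and nextLine(), which recognise both Windows and Unix line breaks. This is only a sketch: the class and method names are made up, it reads from a String to stay self-contained, and the plain split on commas assumes no field contains a quoted comma.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Illustrative only: LineByLineParser and parse are hypothetical names.
class LineByLineParser {
    // Reads whole lines and splits each one on commas; blank lines are skipped.
    static List<String[]> parse(String csvText) {
        List<String[]> rows = new ArrayList<String[]>();
        Scanner scanner = new Scanner(csvText);
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (line.isEmpty()) {
                continue; // ignore empty lines, e.g. a trailing newline
            }
            rows.add(line.split(","));
        }
        scanner.close();
        return rows;
    }

    public static void main(String[] args) {
        String sample = "Barnsley,0,0,Bristol City\r\nBlackpool,2,2,Crystal Palace\n";
        for (String[] row : parse(sample)) {
            System.out.println(row[0] + " " + row[1] + " - " + row[2] + " " + row[3]);
        }
    }
}
```

To use it on the real file, the same loop works with `new Scanner(new File(userInputFile))` in place of the String-based scanner.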

Programming in Java for 8 years, never used a Scanner before :S
