2 Replies - 13534 Views - Last Post: 16 May 2008 - 09:46 PM

#1 nivekious  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 5
  • Joined: 16-May 08

Breaking string into words only, and then removing all but large words

Posted 16 May 2008 - 07:36 PM

Hi again. Sorry to continue bothering you all, but I have one last homework issue. Here's the question:

Write a static method, getBigWords, that gets a String parameter and returns an array whose elements are the words in the parameter that contain more than 5 letters. (A word is defined as a contiguous sequence of letters.) So, given a String like "There are 87,000,000 people in Canada", getBigWords would return an array of two elements, "people" and "Canada".

Now, the only thing I could think of doing was this:
public static String[] getBigWords ( String t)
{
    String[] end=t.split(" ");
    int co= 0;
    for (int o=0; o<end.length; o=o+1)
    {
        if(end[o].length()>5)
        {
            co=co+1;
        }
    }
    String[] y=new String[co];
    int g=0;
    for (int o=0; o<end.length; o=o+1)
    {
        if(end[o].length()>5)
        {
            y[g]=end[o];
            g=g+1;
        }
    }
    return y;
}



The problem is that this includes strings with digits in them, hyphenated words, and words with trailing punctuation. (For example, 100000000 is included but shouldn't be. The same goes for 09JBAS2371N and son-in-law. Also, the last word of the sentence is included with its period, when it should be included without the punctuation, and only if it is more than five letters long not counting the punctuation.)

So what I'm trying to figure out is how to split the words at all punctuation types in addition to whitespace, and how to exclude strings that contain digits.

Anyone have any ideas?


Replies To: Breaking string into words only, and then removing all but large words

#2 nivekious

Re: Breaking string into words only, and then removing all but large words

Posted 16 May 2008 - 09:15 PM

I've now run into a similar issue on another problem which makes me think there is some method or something to easily get around this. Here is that question:
A String variable, fullName, contains a name in one of two formats:
last name, first name (comma followed by a blank), or
first name last name (single blank)
Extract the first name into the String variable firstName and the last name into the String variable lastName. Assume the variables have been declared and fullName already initialized. You may also declare any other necessary variables.

and my code for it:


String r="";
int q=0;
int h=fullName.indexOf(",");

if (h>=0)
{
    for(int u=0;u<h-1;u=u+1)
    {
        r=r+fullName.charAt(u);
        q=u;
    }
    firstName=r;
    r="";
    q=q+2;
    for(int t=q;t<fullName.length();t=t+1)
    {
        r=r+fullName.charAt(t);
    }
    lastName=r;
}
else
{
    String[] f=fullName.split(" ");
    firstName=f[0];
    lastName=f[1];
}



I get a message saying that it fails when the first name is entered with a comma after it before the space and the last name. Again, any help would be appreciated.
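For reference, here is a minimal sketch of one way the two formats might be handled with indexOf and substring. It assumes the input always matches one of the two formats exactly. Note that in the comma format the text before the comma is the last name, whereas the loop above stops at h-1 (dropping a character) and stores the result in firstName. The class name NameSplit and the method splitName are hypothetical names used only for illustration; the assignment itself just wants firstName and lastName set.

```java
class NameSplit {
    // Hypothetical helper: returns {firstName, lastName} for a name in
    // "Last, First" or "First Last" form. Assumes the input matches one of them.
    static String[] splitName(String fullName) {
        String firstName;
        String lastName;
        int comma = fullName.indexOf(",");
        if (comma >= 0) {
            // "Last, First": everything before the comma is the last name
            lastName = fullName.substring(0, comma);
            // skip the comma, then trim the blank that follows it
            firstName = fullName.substring(comma + 1).trim();
        } else {
            // "First Last": split at the single blank
            int space = fullName.indexOf(" ");
            firstName = fullName.substring(0, space);
            lastName = fullName.substring(space + 1);
        }
        return new String[] { firstName, lastName };
    }

    public static void main(String[] args) {
        String[] a = splitName("Smith, John");
        System.out.println(a[0] + " / " + a[1]); // prints "John / Smith"
        String[] b = splitName("John Smith");
        System.out.println(b[0] + " / " + b[1]); // prints "John / Smith"
    }
}
```

Using substring avoids building the strings one character at a time, which is where the off-by-one errors in the loop bounds tend to creep in.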

#3 cutegrrl  Icon User is offline

  • D.I.C Head

Reputation: 10
  • View blog
  • Posts: 77
  • Joined: 12-May 08

Re: Breaking string into words only, and then removing all but large words

Posted 16 May 2008 - 09:46 PM

As a warning, I haven't tested this thoroughly. Let me know if you encounter any problems.

import java.io.*;
import java.util.*;

public class BigWords {

	public static void main(String[] args) throws IOException{

		// Create a single shared BufferedReader for keyboard input
		BufferedReader kb = new BufferedReader(new InputStreamReader(System.in));

		// Prompt user for input
		System.out.print("Enter a sentence: ");
		String sentence = kb.readLine().trim();

		String[] words = getBigWords(sentence);
		
		// Output big words
		for(int i = 0; i < words.length; i++){
			System.out.println(words[i]);
		}
	}



	public static String[] getBigWords(String sentence){

		String[] words = sentence.split(" "); //  break string into substrings 
		ArrayList<String> bigWords = new ArrayList<String>(); // create a dynamic array of String objects
		boolean isValidWord = true;

		// Traverse word array
		for(int i = 0; i < words.length; i++){
			
			// Remove trailing punctuation
			int lastIndex = words[i].length()-1;
			while(lastIndex > 0 && !Character.isLetter(words[i].charAt(lastIndex))){
				words[i] = words[i].substring(0, lastIndex);
				lastIndex = words[i].length()-1;
			}
			
			// Determine if word is valid (A valid word is defined as a contiguous sequence of letters + length > 5)
			// Checking the length first also rejects empty tokens, which split(" ") produces for consecutive spaces
			isValidWord = words[i].length() > 5;

			// Traverse elements character-by-character
			for(int j = 0; isValidWord && j < words[i].length(); j++){
				if(!Character.isLetter(words[i].charAt(j))){
					isValidWord = false; // invalid word - the loop condition stops the scan
				}
			}

			if(isValidWord){
				bigWords.add(words[i]); // add word to array
			}else{
				isValidWord = true; // reset
			}
		}

		// copy ArrayList contents to conventional array
		String[] validWords = new String[bigWords.size()];
		for(int i = 0; i < bigWords.size(); i++){
			validWords[i] = bigWords.get(i);
		}

		return validWords;
	}

}
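As an alternative, the assignment's definition ("a word is a contiguous sequence of letters") can be read literally by splitting on every run of non-letter characters. The sketch below takes that reading, so it is not the same behavior as the code above: "son-in-law" becomes "son", "in", "law" (none long enough) and "09JBAS2371N" becomes "JBAS" and "N", rather than being rejected as whole tokens. The class name BigWordsBySplit is hypothetical, and the pattern [^a-zA-Z]+ assumes only ASCII letters count as letters.

```java
import java.util.ArrayList;
import java.util.List;

class BigWordsBySplit {
    // Returns the words (maximal runs of ASCII letters) longer than 5 letters.
    static String[] getBigWords(String sentence) {
        // Any run of non-letter characters (spaces, digits, punctuation) is a separator
        String[] tokens = sentence.split("[^a-zA-Z]+");
        List<String> big = new ArrayList<String>();
        for (String token : tokens) {
            // The length test also discards the empty token that split() yields
            // when the sentence starts with a non-letter
            if (token.length() > 5) {
                big.add(token);
            }
        }
        return big.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] words = getBigWords("There are 87,000,000 people in Canada");
        for (String w : words) {
            System.out.println(w); // prints "people" then "Canada"
        }
    }
}
```

If whole tokens containing digits or hyphens should be thrown away instead, as the original question assumes, the split-on-spaces approach with a per-character Character.isLetter check is the closer fit.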


This post has been edited by cutegrrl: 16 May 2008 - 10:28 PM

