7 Replies - 4642 Views - Last Post: 11 October 2011 - 03:07 PM

#1 mahi.aw   User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 114
  • Joined: 21-May 11

Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 11:22 AM

Hello all, need some help:

I have huge data around 30 millions records. This data organized into around 300 thousands files..
The idea is to open and read the data from particular list of files (among 300 thousands files)
based upon some user defined criteria/input..

The way I organized data into mySQL is:
---------------------------------------

1: name of all the files and some other parameters for each of this files stored in one mySQL table: called it as
INDEXTABLE.

2: contents of each file stored in separate table (name of the table is named of the file). so i have now around
300 thousands tables (in addition to INDEXTABLE.).


Process user input and retrieve result
-------------------------------------
1: user give its input from java application.

2: I then make a connection to mySQL server via JDBC (only once) locally.

3. 1st Look up inside the INDEXTABLE, which are the tables matching our criteria(called it as LIST_OF_MATCHED_TABLES)

4: Now for each of the table in "LIST_OF_MATCHED_TABLES" i create the mysql query (select data from LIST_OF_MATCHED_TABLES_X)and
retrieve all the data from that table. This is done for all the tables in "LIST_OF_MATCHED_TABLES". Each time the data
transferred back into the java for further processing. I have multi threading for this part of program.

I hoped for the great boost in the performance of my application but it is not performing
as i expected.

I have same application in which i worked only with text file and java(no mysql) and it is quite faster
than this version of application with mySQL.

Can anyone suggests me how to boost up the performance or
is it better to work with flat files instead of using mySQL in this case???

Is This A Good Question/Topic? 0
  • +

Replies To: Reading large data from mySQL tables Vs flat file reading

#2 codeprada   User is offline

  • Changed Man With Different Priorities
  • member icon

Reputation: 963
  • View blog
  • Posts: 2,382
  • Joined: 15-February 11

Re: Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 12:48 PM

Any database that needs 300,000 tables needs to be revised.
A file holds a data that is related to each other so therefore you may not need more than 3 tables to represent all your files. A table represents objects that have the same attributes so in this case it's a FILE. Some attributes of a file could be it's name, content, path, owner...etc. It depends on what your application's requirement. These attributes will be your columns.
Was This Post Helpful? 2
  • +
  • -

#3 mahi.aw   User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 114
  • Joined: 21-May 11

Re: Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 01:14 PM

View Postcodeprada, on 11 October 2011 - 12:48 PM, said:

Any database that needs 300,000 tables needs to be revised.
A file holds a data that is related to each other so therefore you may not need more than 3 tables to represent all your files. A table represents objects that have the same attributes so in this case it's a FILE. Some attributes of a file could be it's name, content, path, owner...etc. It depends on what your application's requirement. These attributes will be your columns.


codeprada Thanks for your reply!
If i am understanding you correctly, then you suggested to create three tables and tables contain the informaation
like file name and other attributes. But if i create the table with content of all the files in one table then
it will be huge in this case(i want to extend the application latter on to billions of entry). And way i am
doing sql query really retriving lots of data from many files(tables) and i expected the huge data retrival every time.

SO i wonder if its good idea to keep all the information in one table and do the sql query and then transferred the results
(huge data) back to java for further processing!!

The idea of creating the many tables are B/C i don't wanted to put all the information in one table and its lot
much easier to do the initial search with the INDEXTABLE and then retrives all the record for each matched tables in
indexTable.

In short what i want to do is:
open the intreseting files(based upon user criteria) or tables and retrives all records from all the intreseting files
and hand it over to java for further processing.

any suggetions?
Was This Post Helpful? 0
  • +
  • -

#4 Atli   User is offline

  • Enhance Your Calm
  • member icon

Reputation: 4241
  • View blog
  • Posts: 7,216
  • Joined: 08-June 10

Re: Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 01:18 PM

View Postmahi.aw, on 11 October 2011 - 06:22 PM, said:

2: contents of each file stored in separate table (name of the table is named of the file). so i have now around
300 thousands tables (in addition to INDEXTABLE.).

This would be a mistake. The structure of a relational database should never have to change as a part of it's normal operation. You design the tables before hand, create them, and then use statements like INSERT and UPDATE to manipulate the data from that point on. (There are fringe cases where this may not apply, but unless you are running a near Google sized operation, you are not likely to fit within that frame.)

For one, there is no way for you to create actual relationships linking the entries in your INDEXTABLE to each of your tables, which sort of negates the point of using a relational database in the first place. Relational databases are all about data integrity, after all.


As far as I understand your data, all you would really need is two tables: one to list your files, which is what your INDEXTABLE apparently does; and one to list the records within the files, which would be the 30 million records you are talking about, or the combination of those 300 thousand tables. Each record in the second table would contain a foreign key field, creating a relationship to it's file in the INDEXTABLE. This is how you would be able to query records from specific files.

This also simplifies your queries as you wouldn't have to query a list of files and then execute a query for each of those files. You could just execute one query and get the whole set in one go.


Depending on the amount of data (the row count isn't as important as the actual size) you may have to do some optimization to get this to perform properly. At some point you may even be going beyond the scale of what MySQL can handle. It's generally considered a "small-to-medium" size database system. You'd have to move up to Oracle or MSSQL to reach the "large" end of the RDBMS selection. - Although at 30 million rows I doubt you've gone beyond MySQL's capabilities.

Edit: Wow, I take a 15 minute break and I'm two posts behind :P

This post has been edited by Atli: 11 October 2011 - 01:19 PM

Was This Post Helpful? 2
  • +
  • -

#5 mahi.aw   User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 114
  • Joined: 21-May 11

Re: Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 01:32 PM

Thanks atil for nice explanation,

one point i did not understand:
"The structure of a relational database should never have to change as a part of it's normal operation."
as i am not changing any data afterward.

I wonder if one can store the huge data in one table. At this point its ok if i stored 30 millions
But what if i want to store the billions of records?

At the moment let consider: suppose i have 30 millions of data
in my table and 1 millions records matched my criteria, then how one can retrieve all
the 1 millions records and hand the result to java? don't you feel that if you
retrieved huge data at one go, then it may hamper the performance(as i need to send the data back to java)
Was This Post Helpful? 0
  • +
  • -

#6 mahi.aw   User is offline

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 114
  • Joined: 21-May 11

Re: Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 02:48 PM

View Postmahi.aw, on 11 October 2011 - 01:32 PM, said:

Thanks atil for nice explanation,

one point i did not understand:
"The structure of a relational database should never have to change as a part of it's normal operation."
as i am not changing any data afterward.

I wonder if one can store the huge data in one table. At this point its ok if i stored 30 millions
But what if i want to store the billions of records?

At the moment let consider: suppose i have 30 millions of data
in my table and 1 millions records matched my criteria, then how one can retrieve all
the 1 millions records and hand the result to java? don't you feel that if you
retrieved huge data at one go, then it may hamper the performance(as i need to send the data back to java)


hie atil & codprada,

can any one of you tell me how to used PK and FK key:

let say my two tables are like this

Table 1
-------
ID
X1
X2
X3

Table 2
-------
ID Somedata
X1 "ABC"
X1 "XYZ"
X2 "LMN"
X3 "PQ"
X3 "Z"


how can i build the sql query around the table 1 "ID" so that i can get somedata from table 2:
let say i will query for "X1" in Table 1 and then i expect to get the ACB and XYZ from Table 2.
Was This Post Helpful? 0
  • +
  • -

#7 Atli   User is offline

  • Enhance Your Calm
  • member icon

Reputation: 4241
  • View blog
  • Posts: 7,216
  • Joined: 08-June 10

Re: Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 02:55 PM

View Postmahi.aw, on 11 October 2011 - 08:32 PM, said:

one point i did not understand:
"The structure of a relational database should never have to change as a part of it's normal operation."
as i am not changing any data afterward.

Presumably you want the ability to add or remove data from your database after it's original creation? Those kind of actions would be considered a part of the "normal operation" of a database. Adding and removing tables should not be a part of any such operation, only a part of design changes.

View Postmahi.aw, on 11 October 2011 - 08:32 PM, said:

I wonder if one can store the huge data in one table. At this point its ok if i stored 30 millions
But what if i want to store the billions of records?

There are ways in which you can optimize the way your data is stored and accessed, even though it's all a part of one table. For one, defining the appropriate keys can greatly improve query speeds. That's an area you should study thoroughly before designing a large database. (Or even a small database, for that matter.)

And just because all the records are represented as a single table does not mean they have to be stored as such. MySQL allows for Partitioning of tables, where you can split the table into different physical locations. - At least on Unix servers, you could spread a single table over several disks. Hell, you might even be able to mount the index locations into the RAM, speeding queries up significantly. (Not that I've attempted this... yet.)

Then there is also Clusters: a network of machines, each set to store a part of a database in memory, working together to maximize performance and redundancy. Perhaps overkill, but worth considering.

View Postmahi.aw, on 11 October 2011 - 08:32 PM, said:

At the moment let consider: suppose i have 30 millions of data
in my table and 1 millions records matched my criteria, then how one can retrieve all
the 1 millions records and hand the result to java? don't you feel that if you
retrieved huge data at one go, then it may hamper the performance(as i need to send the data back to java)

That it may. If the data set is a million rows it may better to fetch them in chunks using the LIMIT clause. You may also benefit from dumping those million rows into a temporary table first and then selecting chunks from that.
Was This Post Helpful? 1
  • +
  • -

#8 Atli   User is offline

  • Enhance Your Calm
  • member icon

Reputation: 4241
  • View blog
  • Posts: 7,216
  • Joined: 08-June 10

Re: Reading large data from mySQL tables Vs flat file reading

Posted 11 October 2011 - 03:07 PM

View Postmahi.aw, on 11 October 2011 - 09:48 PM, said:

can any one of you tell me how to used PK and FK key:

Sure. You place an "id" column in one table, with a unique value per row (a Primary Key), and then place a column in another table where you place the value from the "id" column in the first table, linking to one specific row.

This is how it should look:
Table: files
------------
id  name      
1   'File1'
2   'File2'
3   'File3'

Table: records
----------------------
id  file_id  some_data
1   1        'ABC'
2   2        'DEF'
3   3        'GHI'
4   1        'JKL'
5   2        'MNP'
6   3        'QRS'


There records 1 and 4 belong to file 1. To query the values for that, you could do this:
SELECT r.somedata
FROM records AS r
JOIN files AS f
    ON  r.file_id = f.id
WHERE f.name = 'File1';


Was This Post Helpful? 0
  • +
  • -

Page 1 of 1