Mass file comparison and deletion


34 Replies - 598 Views - Last Post: 24 December 2011 - 10:26 AM

#1 webwired

  • D.I.C Regular

Reputation: 33
  • Posts: 339
  • Joined: 26-August 07

Posted 22 December 2011 - 01:51 PM

So, as you might have seen from my previous post, I'm trying to create a Duplicate File Remover, not only because I desperately need one, but also because I desperately needed an application to build, to start rebuilding and learning new C# skills... and yes, I know there are tons of them out there for free, but what would be the fun in that...

But anyway, the question is this, what would be the least resource intensive method of creating a project that scans multiple drives for specified types of files, and ultimately deletes all but say, the "Last Modified" version of that file...

Option A: Scan and make one huge honking list of every file there is and then go through that list and find all of the duplicates?
Option B: Scan, and every time I come to the next file in the scan, check against the list to see if there is a duplicate?
Option C: ??? That's all I got right now...
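
For comparison, Option A (collect everything first, then look for duplicates) can be sketched roughly as below. This is only a minimal sketch: the `ScannedFile` class and the `DuplicateFinder.FindDeletable` helper are hypothetical names, and it assumes two files count as duplicates when their name (ignoring case) and size match, which does not prove the contents are identical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of one scanned file; the fields are assumptions for this sketch.
public class ScannedFile
{
    public string FullPath { get; set; }
    public string Name { get; set; }
    public long Size { get; set; }
    public DateTime LastModified { get; set; }
}

public static class DuplicateFinder
{
    // Option A: take the complete list, group it by (name, size), and return
    // every file except the most recently modified one in each group.
    public static List<ScannedFile> FindDeletable(IEnumerable<ScannedFile> files)
    {
        return files
            .GroupBy(f => new { Name = f.Name.ToLowerInvariant(), f.Size })
            .Where(g => g.Count() > 1)
            .SelectMany(g => g.OrderByDescending(f => f.LastModified).Skip(1))
            .ToList();
    }
}
```

The whole list has to sit in memory at once, but for a few thousand small records that is unlikely to matter; the grouping is a single pass over the list.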


Replies To: Mass file comparison and deletion

#2 tlhIn`toq

  • Closing in on 5,000

Reputation: 4928
  • Posts: 10,465
  • Joined: 02-June 10

Re: Mass file comparison and deletion

Posted 22 December 2011 - 02:19 PM

If you are doing this as a learning exercise, then I would recommend making two methods: One for each option, and code them both up.

The only way to gain experience is to actually do things and learn from them. Comparing the real world results of each of these options is the type of thing you might have to do on the job. We all get faster and more proficient by doing, not being told which way to do it.

It sounds like you already have an idea of how to write them. So write them, and monitor their resource consumption, time to completion, recovery capability if interrupted by a power outage in mid work-flow, ability to accurately present to the user the progress status and time remaining, and so on.

#3 webwired

Posted 22 December 2011 - 02:24 PM

I like that idea, thanks, never thought of it that way, but you are most definitely correct...

#4 webwired

Posted 22 December 2011 - 07:57 PM

So, I have a question... I was going to try to create this program with SQL Server Compact, to put the file information in temporarily... but after hours and hours of trial and error and Googling, MSDN, forums, etc... I have learned that SQL Server Compact does not allow the use of Stored Procedures, but that isn't the worst part... Apparently it isn't very C# friendly either, I found out that I'm not going crazy, that in fact there are tons of people unable to connect to an SQL Server Compact Database File via C#... At first I thought I had really found something when I started "using System.Data.SqlServerCe;" instead of "using System.Data.SqlClient;", but I was wrong...

Anyway, enough ranting about SQL Server Compact... Would anyone have an as good or better idea of how to store a LOT of rows of data, temporarily, to be parsed and ultimately deleted? I was hoping to stay away from using a non-local database... The dictionary sounded good until I figured out that it could only hold 2 columns...
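
On the "2 columns" worry: a `Dictionary<TKey, TValue>` value can be any object, so one entry can carry as many fields as needed. Below is a minimal sketch of Option B built that way, checking each file against the dictionary as it is found. The names `FileEntry` and `NewestFileTracker` are hypothetical, and the name-plus-size key is an assumption about what counts as a duplicate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type: one dictionary entry can hold any number of fields.
public class FileEntry
{
    public string FullPath { get; set; }
    public string Name { get; set; }
    public long Size { get; set; }
    public DateTime LastModified { get; set; }
}

public class NewestFileTracker
{
    // Key is "name|size" compared case-insensitively; '|' cannot appear in a
    // Windows file name, so the two parts cannot run together ambiguously.
    private readonly Dictionary<string, FileEntry> newestByKey =
        new Dictionary<string, FileEntry>(StringComparer.OrdinalIgnoreCase);

    public List<FileEntry> Deletable { get; private set; }

    public NewestFileTracker()
    {
        Deletable = new List<FileEntry>();
    }

    // Option B: as each file is found, compare it with the newest copy seen
    // so far and mark whichever of the two is older as deletable.
    public void Add(FileEntry file)
    {
        string key = file.Name + "|" + file.Size;
        FileEntry current;
        if (!newestByKey.TryGetValue(key, out current))
        {
            newestByKey[key] = file;
        }
        else if (file.LastModified > current.LastModified)
        {
            Deletable.Add(current);
            newestByKey[key] = file;
        }
        else
        {
            Deletable.Add(file);
        }
    }
}
```

Each lookup is a hash lookup rather than a scan of the list, so the check stays cheap as the number of files grows.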

#5 wiero

  • D.I.C Head

Reputation: 45
  • Posts: 78
  • Joined: 29-June 11

Posted 23 December 2011 - 01:39 AM

You can try SQLite. As for SQL Server Compact, it sounds strange to me that it doesn't work well with C#. What problems did you have?

#6 RexGrammer

  • Coding Dynamo

Reputation: 178
  • Posts: 750
  • Joined: 27-October 11

Posted 23 December 2011 - 06:14 AM

A quick search of the net found this discussion: storing temporary and large amount of data

Also I disagree with your assumption that SqlServer doesn't play well with C#. Maybe look at this tutorial:
Beginners guide to accessing SQL Server through C# (CodeProject Tutorial)

This post has been edited by RexGrammer: 23 December 2011 - 06:15 AM

#7 modi123_1

  • Suitor #2

Reputation: 6464
  • Posts: 23,509
  • Joined: 12-June 08

Posted 23 December 2011 - 10:17 AM

You are talking about adding a local database, right? The "add to project"/.sdf file, right? What's the problem? I've used it a ton of times, for personal and work, and never had a problem. Sure it doesn't hold a stored procedure, but what do you expect out of essentially a specifically formatted file?

How many rows are you talking about? A dataset would be my first thought.

#8 webwired

Posted 23 December 2011 - 10:25 AM

RexGrammer, on 23 December 2011 - 07:14 AM, said:

A quick search of the net found this discussion: storing temporary and large amount of data


Yes, I also found the same results from my quick search... but as you can see, that is using a database...

RexGrammer, on 23 December 2011 - 07:14 AM, said:

Also I disagree with your assumption that SqlServer doesn't play well with C#. Maybe look at this tutorial:
Beginners guide to accessing SQL Server through C# (CodeProject Tutorial)


I didn't say that SqlServer doesn't play well with C#, I said that SqlServer Compact Edition doesn't play well with C#... The tutorial, on both counts, is irrelevant...

modi123_1, on 23 December 2011 - 11:17 AM, said:

You are talking about adding a local database, right? The "add to project"/.sdf file, right? What's the problem? I've used it a ton of times, for personal and work, and never had a problem. Sure it doesn't hold a stored procedure, but what do you expect out of essentially a specifically formatted file?

How many rows are you talking about? A dataset would be my first thought.


Yes, a local database. I did just what you described and no matter what, I could not connect to it... Let me put it back to the way it was and generate the error I was receiving and I'll post back in a few minutes...

Probably talking about a few thousand records...

#9 modi123_1

Posted 23 December 2011 - 10:29 AM

A few thousand? A dataset in memory will be fine.. that's a pretty small number of string records.. though I am not sure if you'll need more columns or something outside of one.

#10 webwired

Posted 23 December 2011 - 10:37 AM

modi123_1, on 23 December 2011 - 11:29 AM, said:

A few thousand? A dataset in memory will be fine.. that's a pretty small number of string records.. though I am not sure if you'll need more columns or something outside of one.


Here's the error I get: "Format of the initialization string does not conform to specification starting at index 0."

I researched and researched that error and kept applying different fixes, but to no avail...

The table Files has 6 columns, FileID, FileDirectoryPath, FileName, FileCreationDate, FileLastModifiedDate, and FileSize

These fields will be used to determine if a file being compared really is a duplicate of the same file...
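
The in-memory route suggested above could hold those same six columns in a `DataTable` with no database file at all. A minimal sketch, assuming the column types shown (the thread names the columns but not their types) and a hypothetical `FilesTableFactory` helper:

```csharp
using System;
using System.Data;

public static class FilesTableFactory
{
    // Builds an empty in-memory table with the six columns listed in the post.
    // The column types are assumptions; FileID is auto-numbered by the table.
    public static DataTable Create()
    {
        var table = new DataTable("Files");

        DataColumn id = table.Columns.Add("FileID", typeof(int));
        id.AutoIncrement = true;
        id.AutoIncrementSeed = 1;

        table.Columns.Add("FileDirectoryPath", typeof(string));
        table.Columns.Add("FileName", typeof(string));
        table.Columns.Add("FileCreationDate", typeof(DateTime));
        table.Columns.Add("FileLastModifiedDate", typeof(DateTime));
        table.Columns.Add("FileSize", typeof(long));

        table.PrimaryKey = new[] { id };
        return table;
    }
}
```

A row can then be added with `table.Rows.Add(null, directoryPath, name, created, modified, size)`, passing `null` for `FileID` so the table assigns the next number itself.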

I also created my first class ever, and although its basic functionality works, it wouldn't do other things because of IEnumerator, which is something I will have to learn about first I suppose... It's a class for the files, looks like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace FileDupeDeleter
{
    class FoundFile
    {
        public string DirectoryPath { get; set; }
        public string Name { get; set; }
        public DateTime CreationDate { get; set; }
        public DateTime LastModifiedDate { get; set; }
        public long Size { get; set; }
    }
}
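
If the IEnumerator trouble came from trying to `foreach` over a `FoundFile` itself, one way around it is to keep the objects in a `List<FoundFile>`, which is already enumerable. The sketch below fills such a list from a directory scan. `FileScanner.Scan` is a hypothetical helper, the `FoundFile` class is repeated only so the sketch compiles on its own, and it assumes .NET 4 or later for `EnumerateFiles`. It does not handle folders the process is not allowed to read, which a whole-drive scan would need to deal with.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

namespace FileDupeDeleter
{
    // Same shape as the FoundFile class in the post above.
    public class FoundFile
    {
        public string DirectoryPath { get; set; }
        public string Name { get; set; }
        public DateTime CreationDate { get; set; }
        public DateTime LastModifiedDate { get; set; }
        public long Size { get; set; }
    }

    public static class FileScanner
    {
        // Walks rootDirectory and its subfolders and records every file that
        // matches searchPattern (for example "*.jpg").
        public static List<FoundFile> Scan(string rootDirectory, string searchPattern)
        {
            var found = new List<FoundFile>();
            var root = new DirectoryInfo(rootDirectory);

            foreach (FileInfo info in root.EnumerateFiles(searchPattern, SearchOption.AllDirectories))
            {
                found.Add(new FoundFile
                {
                    DirectoryPath = info.DirectoryName,
                    Name = info.Name,
                    CreationDate = info.CreationTime,
                    LastModifiedDate = info.LastWriteTime,
                    Size = info.Length
                });
            }

            return found;
        }
    }
}
```

The returned list can be walked with an ordinary `foreach (FoundFile file in found)` without `FoundFile` implementing any interface.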


#11 webwired

Posted 23 December 2011 - 10:45 AM

Oh, guess I should post my code for trying to connect to the SQL Compact DB...

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
//using System.Data.SqlClient;
using System.Configuration;
using System.Data;
using System.Data.SqlServerCe;

namespace FileDupeDeleter
{
    class DatabaseConnection
    {

        public void insertFoundFileDataIntoDatabase(string fileDirectoryPath, 
            string fileName, DateTime fileCreationDate, DateTime fileLastModifiedDate, long fileSize)
        {
            SqlCeConnection myConnection = new SqlCeConnection("C:\temporaryDatabase.sdf");
            string sqlAction = "INSERT INTO Files (AttachmentName, AttachmentLocation, AttachmentDescription ) VALUES ( @AttachmentName, @AttachmentLocation, @AttachmentDescription )";
            SqlCeCommand mySQLCommand = new SqlCeCommand();
            mySQLCommand.Parameters.Clear();
            mySQLCommand.CommandText = sqlAction;
            mySQLCommand.CommandType = CommandType.Text;
            mySQLCommand.Parameters.AddWithValue("@FileDirectoryPath", fileDirectoryPath);
            mySQLCommand.Parameters.AddWithValue("@FileName", fileName);
            mySQLCommand.Parameters.AddWithValue("@FileCreationDate", fileCreationDate);
            mySQLCommand.Parameters.AddWithValue("@FileLastModifiedDate", fileLastModifiedDate);
            mySQLCommand.Parameters.AddWithValue("@FileSize", fileSize);
            try
            {
                myConnection.Open();
                mySQLCommand.ExecuteNonQuery();
            }
            catch (Exception)
            {
                throw;
            }
            finally
            {
                myConnection.Close();
            }
        }


    }
}
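
Three things in the code above look like they would each cause a failure, though I can't confirm they are the only problems. First, the connection string is a bare path with no `Data Source=` key, and `"C:\temporaryDatabase.sdf"` in a regular C# string turns `\t` into a tab; a malformed connection string is consistent with the "Format of the initialization string" error. Second, the INSERT names `Attachment...` columns and parameters, while the parameters actually added are the `File...` ones. Third, the command is never given the connection. A hedged sketch of a version addressing those points, assuming `FileID` is an identity column the database fills in itself (`BuildConnectionString` and `InsertFoundFile` are hypothetical names):

```csharp
using System;
using System.Data.SqlServerCe;

namespace FileDupeDeleter
{
    public class DatabaseConnection
    {
        // Hypothetical helper: the connection string needs a "Data Source=" key,
        // not just the path to the .sdf file.
        public static string BuildConnectionString(string databasePath)
        {
            return "Data Source=" + databasePath;
        }

        public void InsertFoundFile(string connectionString, string fileDirectoryPath,
            string fileName, DateTime fileCreationDate, DateTime fileLastModifiedDate, long fileSize)
        {
            // Column and parameter names now match the Files table described earlier.
            // Assumption: FileID is generated by the database, so it is not inserted here.
            const string sql =
                "INSERT INTO Files (FileDirectoryPath, FileName, FileCreationDate, FileLastModifiedDate, FileSize) " +
                "VALUES (@FileDirectoryPath, @FileName, @FileCreationDate, @FileLastModifiedDate, @FileSize)";

            // 'using' disposes the connection and command even if an exception is thrown.
            using (var connection = new SqlCeConnection(connectionString))
            using (var command = new SqlCeCommand(sql, connection)) // command is tied to the connection
            {
                command.Parameters.AddWithValue("@FileDirectoryPath", fileDirectoryPath);
                command.Parameters.AddWithValue("@FileName", fileName);
                command.Parameters.AddWithValue("@FileCreationDate", fileCreationDate);
                command.Parameters.AddWithValue("@FileLastModifiedDate", fileLastModifiedDate);
                command.Parameters.AddWithValue("@FileSize", fileSize);

                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }
}
```

Calling it with `DatabaseConnection.BuildConnectionString(@"C:\temporaryDatabase.sdf")` uses a verbatim string, so the backslash is kept as a backslash.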


#12 modi123_1

Posted 23 December 2011 - 10:48 AM

Edit: look past the fact that this is VB.NET; it was the quickest one I had available in my sandbox.


Here is my go-to default snippet I use for connecting. It's, oddly, the same format for regular mssql and mysql.


        Dim oConn As SqlServerCe.SqlCeConnection = Nothing
        Dim dsData As DataSet = Nothing
        Dim sSQL As String = String.Empty

        Dim myAdapter As SqlServerCe.SqlCeDataAdapter = Nothing

        Try
            sSQL = "SELECT     lValue, sValue FROM MyTable" '-- 1.0  sql statement
            oConn = New SqlServerCe.SqlCeConnection("Data Source=test_localdb.sdf;Persist Security Info=False;") '-- 2.0 connection string. file's location is in the solution directory.
            oConn.Open()

            myAdapter = New SqlServerCe.SqlCeDataAdapter(sSQL, oConn)
            dsData = New DataSet
            myAdapter.Fill(dsData)

            Console.WriteLine(String.Format("How many rows: {0}", dsData.Tables(0).Rows.Count))

        Catch ex As Exception
            '-- 3.0
            MsgBox(ex.Message)
        Finally
            '-- Guard against oConn still being Nothing if the constructor threw.
            If oConn IsNot Nothing Then oConn.Dispose()

        End Try

#13 wiero

Posted 23 December 2011 - 10:51 AM

try to use this connection string:

"Data Source=C:\\temporaryDatabase.sdf"

This post has been edited by wiero: 23 December 2011 - 10:52 AM

#14 webwired

Posted 23 December 2011 - 10:57 AM

Thanks for the snippet, I was just reading through it, trying to understand everything it was doing and it looks like you are filling a dataset and then dumping the dataset into the database, is that correct?

No problem on the VB.NET, it's my favorite language of all time, and I can convert it to C# easily enough...

wiero, on 23 December 2011 - 11:51 AM, said:

try to use this connection string:

"Data Source=C:\\temporaryDatabase.sdf"


Nice catch on that, but no, unfortunately that doesn't do it either... just so you know, I did try it again as you requested...

#15 modi123_1

Posted 23 December 2011 - 11:00 AM

No.. I am taking a sql string, an active connection to the local db, and with the help of an adapter filling a generic dataset... then as proof of being filled (there's two rows in the DB by the way) I have it print out how many rows were returned... which should be all.. or two in my case.

Again.. filling the dataset FROM the database... no dumping back into the database.
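
For anyone following along in C#, the same read path (SQL string, connection, adapter, fill a DataSet from the database) might look roughly like this. It is a sketch rather than a line-for-line conversion of the VB.NET snippet: `LocalDbReader`, `Load` and `CountRows` are hypothetical names, and `using` blocks stand in for the Try/Finally cleanup.

```csharp
using System;
using System.Data;
using System.Data.SqlServerCe;

public static class LocalDbReader
{
    // Runs the query against the local .sdf database and returns the rows in a
    // DataSet. Data flows FROM the database INTO the DataSet, never the other way.
    public static DataSet Load(string connectionString, string sql)
    {
        var data = new DataSet();

        using (var connection = new SqlCeConnection(connectionString))
        using (var adapter = new SqlCeDataAdapter(sql, connection))
        {
            connection.Open();
            adapter.Fill(data);
        }

        return data;
    }

    // Number of rows in the first table, or 0 if nothing was returned.
    public static int CountRows(DataSet data)
    {
        return data.Tables.Count == 0 ? 0 : data.Tables[0].Rows.Count;
    }
}
```

With the connection string from the VB.NET snippet, `LocalDbReader.CountRows(LocalDbReader.Load("Data Source=test_localdb.sdf;Persist Security Info=False;", "SELECT lValue, sValue FROM MyTable"))` would report how many rows came back.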