Reputation: 991 Master
- Active Posts:
- 971 (0.66 per day)
- 30-September 10
- Profile Views:
- Last Active:
- Sep 15 2014 05:31 AM
- OS Preference:
- Favorite Browser:
- Favorite Processor:
- Favorite Gaming Platform:
- Your Car:
- Who Cares
- Dream Kudos:
- Expert In:
Posts I've Made
Posted 19 Jul 2014Since you stated that your goal is to build good habits, I'm going to use some of the points I make about your specific code to bring to light some more general software development good practices that I feel are often neglected in many initial teachings, thereby allowing bad habits to develop.
Note that you probably won't have covered a lot of the language features I mention here, so you can't possibly beat yourself up for not using them in your code. Hopefully though, as you get to some of the topics I mention, you will remember some of the points I make. At the very least, they should give you a few things to think about and look into.
Also note that many of the things I mention are only guidelines, not hard rules to be blindly applied to every situation.
Finally, note that I am assuming that your method is meant to allow the reading of a subset of the lines in the file, not necessarily all the lines in the file. If you intend to always read all the lines in the file, you don't need the linesOfTextInDoc parameter or the line counting code.
Taking into account that you are new, your code looks very good. Many of the things I am going to mention are merely enhancements and things to be aware of, rather than things you are doing wrong.
However, there are two tangible problems that you should definitely fix...
1) If a negative number is passed to the method, the method will throw an OverflowException on line 6 because an array cannot be initialized with a negative size. If that exception is not handled by your code, it will crash the program.
Now, the critical point here is not so much that the exception will potentially crash the program. If the code that calls this method gives it a negative number, crashing is a reasonable thing to do because, clearly, the code that calls the method is broken and needs fixing because asking to read a negative number of lines isn't reasonable.
The problem with your method is really the type of exception that is being thrown. An OverflowException doesn't clearly describe what the problem was, and so it makes debugging that bit more difficult.
A better thing to do would be to check for a negative number at the start of the method, and throw an ArgumentOutOfRangeException, as so:
Now if a negative number is passed in, you immediately get an error telling you exactly what you did wrong, and where your did it wrong.
Points to take away
[*] Validate incoming arguments to methods, especially in public methods. Throw an exception if the method cannot correctly do what it is supposed to do with the passed in argument(s) and/or the passed in arguments almost certainly indicate a bug in the calling code.
[*] Don't treat exceptions as scary, nasty things that crash your program. Instead, think of them as helpful tools that help you quickly identify unexpected problems and bugs in your program.
[*] Don't be irrationally afraid of letting the program crash. If there is a problem in your code that you cannot deal with, let it crash immediately so that you can be promptly made aware of the problem. This is often much more preferable than ignoring the problem and trying to continue, and thinking your program works brilliantly just because it isn't flat out crashing!
Imagine that you then give a demo to your customer having ignored a problem without really fixing it, and your program is suddenly producing incorrect results ultimately because of that problem you sat on weeks ago, but you have no clue why it is happening by this point, and you have no explanation for the customer. No thanks. I'll take the horrible red flashing lights crash during development instead .
2) As andrewsw mentioned, you should look into using the using statement around your StreamReader. This will ensure that the reader is closed even if ReadLine() (for example) produces an error (specifically, if it throws an exception). In your code, if any of the code in between the definition of myReader and the line myReader.Close() throws an exception, myReader will not be closed, and the file will remain marked as 'In Use', even though it no longer is.
Point to take away
Always clean up any resources that you use as promptly as possible, and try to make any cleanup as resilient as possible in the face of exceptions. Liberal use of the using statement will help you achieve this in C#.
Other points to consider
I think you've done quite well with the naming of your variables and method in general. Much better than many newcomers.
A few things to consider though...
Standard .NET naming conventions say that method names should start with an upper case letter. This may seem like pedantry, but solid adherence to conventions (whether it be Microsoft's, or just internal conventions used within a company or a project) is important because it helps give your code a critically important quality - consistency. As a developer, you should be able to adapt to your surroundings, and write code in a manner that is consistent with the code base you are working on at the time. Most C# code you see will adhere to this particular convention, and so you should to in your own code.
Also, some of your names focus a little to much on the mechanism you are using to achieve your goal. I would prefer to see names that describe in more real world terms what the variables/methods are representing/doing, rather than how they are representing/doing it. Doing so removes clutter, and makes the process of reading the code flow that little bit better. E.g. lineArray should become lines, and setTxtDataToArray should become SetTxtData.
Also, rather than linesOfTextInDoc, I'd probably name that parameter linesToRead. I think that more clearly demonstrates what that parameter represents.
In addition, be careful in your use of abbreviations (like txt) in method names (especially public ones). Such abbreviations are more accepted in variable names, but they should always be critically evaluated before use to ensure readability isn't compromised.
Finally, drop any extraneous words that add nothing to the clarity of the name. For example, myReader could be just reader.
Point to take away - The readability of your code is of paramount importance, and you should start actively thinking about the readability of your code now, in order to build good habits. Well named methods and variables are an important step towards making your code readable.
Passing in key dependencies
Again, andrewsw touched on this. A dependency is some service or piece of data that your method (or class) requires in order to do its job. In your method, the key dependencies are the number of lines to read, and where to read them from.
Therefore, you should pass the file path as a argument to the method, rather than hard coding it within the method. This allows you to use the method with any text file you want without having to constantly change your method. Suddenly, your method that was only useful for reading a text file at a particular path with a particular name, is now useful for reading any text file at any path! That is a pretty big win, and has increased the flexibility of the method considerably.
(You could take this even further if you wanted/needed to by passing in a TextReader instead, but that's one for another day )
Point to take away - Design your methods and classes so that they ask for the key things that they need as parameters so that you can configure their behavior without changing their code.
Return the array that you build
Your method currently just builds the array, but doesn't return it, so code that uses the method can't actually use the array you build. Therefore, you should return the array from the method to make it available for use.
Use the highest level tool that will do the job
This basically amounts to being as lazy (and smart ) as possible. You want to use the tool that minimizes the amount of work you have to do, and maximizes the clarity, simplicity and expressiveness of your code.
In this case, that could amount to using a List<T> instead of an array (a list is really just an array with a few extra features to make life easier).
Using a list would mean every item in the final collection of lines would be non-null (because when you loop over a list, it only returns items you have actually added to it, and you only ever add non-null lines), meaning you could remove the null check in your foreach debug code.
By using an array, if there are less lines actually in the file than there are lines you want to read, you get null items at the end of the array. A collection of items where only some of the items are valid isn't very nice to work with.
Similarly, you could use a for loop instead of a while loop, which may arguably communicate your logic and intent in a clearer manner in this case, since a for loop is specifically designed to iterate a set number of times. This is debatable though.
Point to take away - Try to use the tool that demonstrates your intent most clearly (this is often the tool that most closely resembles the real world concept you are modeling), makes your life (and everyone else's) as easy as it can be, and results in the simplest code to read and use.
Example Applying These Suggestions
This code still suffers from one problem though. It allocates memory for a list of linesToRead lines when there may in reality be far fewer lines in the file, hence it potentially wastes memory. One solution to this is to make use of the yield keyword.
However, the best solution is to make use of the highest level tools we have available that will do this job, and simply do:
Posted 4 May 2014Just to reiterate what _HAWK_ said, IDisposable and garbage collection are really two very different resource management mechanisms that have little in common, but together provide a complete resource management solution.
Garbage collection deals purely with raw memory. It is concerned only with deallocating memory that is no longer used, so that it can be reused by your program.
IDisposable is concerned with freeing resources that the garbage collector cannot free. A prime example is a file handle allocated in the process's handle table. The garbage collector has absolutely no knowledge of open handles or how to close them, so the Dispose() method is needed to allow such resources to be cleaned up.
The only real link between IDisposable and garbage collection is the finalizer. Basically, the garbage collector calls this before reclaiming the memory used by an object, and the finalizer can then potentially clean up the resources that the garbage collector cannot.
However, there is no guarantee a type will implement a finalizer. More importantly, there is no guarantee when, or even if, a finalizer will be called. Relying on a finalizer to do your cleanup is not only unreliable and indeterminate (thereby potentially keeping resources locked for long periods of time, for example), but can hurt performance and increase memory pressure dramatically. Hence, as a rule, you should aim to call the Dispose() method of an object (if it has one) promptly.
ianjohn - Are you sure it is the call to DataSet.Dispose() that is slowing things down? If I remember rightly, DataSet.Dispose() actually does very little work.
A DataSet object is one of a few objects that you could get away with not calling Dispose() on because it suppresses finalization in its constructor, and the Dispose() method does nothing important (and probably won't in the future). It just happens to inherit its Dispose() method from the MarshalByValueComponent class.
Picking and choosing which objects to call Dispose() on is definitely not a habit you want to be getting into, but in certain cases (and if you know what you are doing, and have a good reason), not calling Dispose() can be acceptable.
Posted 16 Feb 2014If you mean you want to find the natural logarithm (often written as ln x) of a number, you can use Math.Log().
If you mean you want to implement such a function yourself, yes, you can do that too.
Otherwise, you will have to clarify what you mean.
Posted 30 Jan 2014The cleanest way to do this (in my opinion) is probably to ditch the application framework VB provides, and role your own Main() method. That way, you have complete control over which form to use as the startup form.
The general steps to do this are:
1) In the 'Application' section of the project settings, untick 'Enable Application Framework' and change the 'Startup Object' to 'Sub Main()'. This will probably create an error saying something like 'Sub Main() was not found in <project name>', but don't worry.
2) Create a new class called something like 'Startup' (the name isn't relevant), and in that class put a shared method called Main(), and decorate it with the STAThreadAttribute. EDIT: You should also add a call to Application.EnableVisualStyles() as the first line in Main().
That Main() method is the entry point of your application, and will be the first method called in your application. It is always there, but is usually quite well hidden by default in VB.NET. You now have access to it, and therefore have complete control over your application startup.
To do what you want, you could write your Startup class like this, for example:
Public Class Startup <STAThread()> Shared Sub Main() Application.EnableVisualStyles() Application.Run(CreateStartupForm()) End Sub Private Shared Function CreateStartupForm() As Form If My.Settings.ShowForm1 Then Return New Form1() Else Return New Form2() End If End Function End Class
It is worth noting that by disabling the 'Application Framework', you do lose some things that VB would normally automatically do for you, like saving settings on application exit. Although some of these features can be very useful at times, they are also all very easy to implement yourself, and I personally like having absolute control over what my application does, and when it does it.
Posted 31 Dec 2013
QuoteSo when you use a background thread it's still using just ONE thread correct me if I'm wrong to carry out a calculation or operation.
If you start one extra thread, then only one extra thread will be used, yes. If you meant to say background worker, rather than background thread, then your statement is still correct. The BackgroundWorker class grabs one thread from the thread pool, and executes your work on that thread.
QuoteOkay when you use these options on machines like mine which has an i7 processor this doesn't necessarily mean the calculations will go faster does it?
True. Parallelizing a section of code won't necessarily mean that it will complete faster. Whether or not you see any speed increases, and the extent of any speed increase, will depend on a wide range of factors. Some key factors often include:
- The ratio of the number of I/O bound operations to the number of CPU bound operations in the code you are parallelizing.
- The amount of work the threads have to perform vs the cost of partitioning the work across the threads, and managing the threads at runtime.
- The amount of contention for shared data/resources that is present between threads.
- The number of processor cores you have vs the number of threads you are using.
Basically, when using multiple threads to improve the performance of your code, the only sane advice is to experiment, and let performance measurements guide you towards the best configuration.
QuoteSee I've seen a lot of confusion on forums etc. that basically state when you use multithreading like that you're gaining more Gigahertz processing calculation power etc. Correct me if I'm wrong but this isn't true?
Essentially in one sense using 1 thread for a long calculation generally calculates things just as fast from a sheer processing in Gigahertz clock speed etc. as using parallel threads. Can you explain this whole phenomenon?
Generally, when you use multiple threads as a means of improving the speed of your code, you are trying to use the available processing resources more efficiently, and ensure your code can continue to do this in the event that extra processing resources become available.
Now, the clock speed of a CPU will not be increased by using multiple threads. Using multiple threads won't magically make your hardware faster. However, using multiple threads can help you make better use of the hardware you have. Primarily, it can increase CPU utilization, thereby increasing the number of instructions executed per unit time (a measure of throughput).
For example, if you have N cores and your code is purely CPU bound, you could theoretically improve the speed of your code by a factor of N. This is because each core can execute instructions simultaneously, so by using all cores, you can theoretically increase the throughput by a factor of N.
Furthermore, a similar idea can apply to single core machines. For example, if you have a single core machine, and two threads, if one thread gets blocked after an I/O request, the other thread can still potentially continue working; making use of an otherwise idle CPU. This could once again lead to increases in throughput over using only one thread.
In both these examples, the hardware hasn't got faster; we are just making better, more efficient use of the hardware.
- The ratio of the number of I/O bound operations to the number of CPU bound operations in the code you are parallelizing.
- Member Title:
- D.I.C Addict
- 22 years old
- June 11, 1992
- United Kingdom
Music - Led Zeppelin, Guns N' Roses, AC/DC
Software Development and Design
Watching Supernatural, The Big Bang Theory, The Walking Dead, Red Dwarf, QI, The Simpsons, Breaking Bad
Online gaming (Xbox One).
- Full Name:
- Years Programming:
- Programming Languages:
- C#, a bit of Actionscript 3.0 ,SQL,Progress4GL, VB.NET. Learning Java, and Assembly/C/C++ when I can.
- Click here to e-mail me