5 Replies - 10525 Views - Last Post: 02 September 2010 - 12:11 PM

#1 macosxnerd101

Week #30- The R Project for Statistical Computing

Posted 18 August 2010 - 09:14 AM

Challenge submitted by Paul-.

CHALLENGE:
Learn the basics of the R language and environment, and use them to perform some simple data manipulation and analysis tasks.

INTRODUCING THE LANGUAGE/TECHNOLOGY:
R is both a language and a programming environment, used primarily for data analysis and statistical applications. As a language, it is the GNU implementation of S, a high-level language with strong functional programming features. One of its more interesting and important features is the extensive use of vectors and matrices as data types. This paradigm takes some getting used to if you are encountering it for the first time. Although the typical looping structures of procedural programming (for, while) are available, it is usually more efficient to avoid them and operate on whole vectors or matrices instead. For example, the squares of a list of numbers can be computed in a single step. The code
n <- 1:20   # the integers 1 to 20
n * n       # element-wise product, i.e. the squares

will output the squares of the numbers from 1 to 20.
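
To make the loop-versus-vector contrast concrete, here is a minimal sketch that computes the same squares both ways. The variable names (squares_loop, squares_vec) are only illustrative and do not come from the challenge text.

```r
n <- 1:20

# Loop version: preallocate the result, then fill it one element at a time.
squares_loop <- numeric(length(n))
for (i in seq_along(n)) {
  squares_loop[i] <- n[i] * n[i]
}

# Vectorized version: the multiplication is applied element-wise in one step.
squares_vec <- n * n

# Both approaches give the same values.
stopifnot(all(squares_loop == squares_vec))
print(squares_vec)
```

The vectorized form is shorter and lets R do the iteration internally, which is why it is generally the preferred style.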

Here are a few more descriptive words from the R project web site:
"R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes
  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities."


IDEAS:
  • Do one or more exercises from "Using R for Data Analysis and Graphics" by John Maindonald (http://cran.r-projec...trib/usingR.pdf).
  • Write a function which takes an array of numbers and a single value x, and returns the subset of the array with values greater than x. Compare the running times of an implementation that uses a for loop and one that does not.
  • Find magic squares of order n. A magic square of order n has n*n cells in which the numbers from 1 to n^2 are placed. The sums of the numbers across each row, each column, and the 2 diagonals have to be the same. Try to minimize the use of explicit loops.
  • Explore the dataset "florida". It records the number of votes each candidate received by county in the 2000 United States presidential election in the state of Florida. Make a plot that shows the relationship between the number of votes for Bush and the number of votes for Buchanan. Look for a trend, and find any outliers. Consider the data of Palm Beach County, home of the infamous butterfly ballot. It has been suggested that many votes intended for Gore mistakenly went to Buchanan in this county. Based on the general trend of the data in other counties, estimate the number of votes that Gore supposedly lost in Palm Beach County.
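
As a possible starting point for the subset idea above, here is a rough sketch of both variants. The function names greater_than_loop and greater_than_vec are made up for this example, the input is simulated with runif(), and the code assumes the input contains no missing values (NA).

```r
# Loop version: grow the result one element at a time (deliberately naive).
greater_than_loop <- function(values, x) {
  result <- numeric(0)
  for (v in values) {
    if (v > x) {
      result <- c(result, v)
    }
  }
  result
}

# Vectorized version: logical indexing keeps the elements where the test is TRUE.
greater_than_vec <- function(values, x) {
  values[values > x]
}

set.seed(42)             # arbitrary seed, only for reproducibility
values <- runif(20000)   # simulated input, not real data

# The two implementations should agree.
stopifnot(identical(greater_than_loop(values, 0.5),
                    greater_than_vec(values, 0.5)))

# system.time() reports the elapsed time; exact numbers depend on the machine.
print(system.time(greater_than_loop(values, 0.5)))
print(system.time(greater_than_vec(values, 0.5)))
```

The loop version here is intentionally the slow, obvious one; preallocating the result would narrow the gap, which is itself worth measuring.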


RESOURCES:
There are many tutorials available for R, at different levels of depth. The following two are taken directly from the R project web site. I think they are clear and to the point.

"An Introduction to R" (http://cran.r-projec...g/manuals.html/) by Venables and Smith gives an overview of the main features of both the language and the environment. Appendix A, "A sample session", is a must for any beginner. Look at specific chapters for the topics that interest you most.

The first 2 chapters of "Using R for Data Analysis and Graphics - Introduction, Examples and Commentary" by John Maindonald (http://cran.r-projec...trib/usingR.pdf) are perhaps an even better introduction to R. The rest of the chapters are probably beyond the scope of a one-week challenge.

You may also benefit from the extensive online help in the R software, and an active forum at http://n4.nabble.com/R-f789695.html.

HOW TO GET STARTED:
R is straightforward to install. From the project's main web site (http://www.r-project.org/) you can access CRAN, the R software repository. Binary distributions are available for Windows, Mac, and Linux.

CRAN also holds a long list of contributed packages. For the one-week challenge you are unlikely to need any, so stick to the base distribution.

After installation, fire up R and you will get an interactive window with a ">" prompt. Type in your commands and see what happens. You can start with:
> "Hello R!"
> demo("graphics")
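
If you want something to type right after those two commands, the following is a small sketch of a first session using only base R. The vector x holds made-up values chosen purely for illustration.

```r
# Create a small numeric vector (made-up values, for illustration only).
x <- c(2, 4, 4, 5, 7, 9)

# Basic summaries; at the interactive prompt the results print automatically.
print(length(x))
print(mean(x))
print(summary(x))

# Arithmetic is element-wise, so this centres every value at once.
centred <- x - mean(x)
print(centred)

# Built-in help: uncomment to open the documentation for a function.
# help(mean)      # or equivalently: ?mean
# help.start()    # opens the HTML help index in a browser
```

The help calls are left commented out so the snippet also runs non-interactively; at the prompt you can simply type ?mean.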


Enjoy!



Replies To: Week #30- The R Project for Statistical Computing

#2 mufasa

Re: Week #30- The R Project for Statistical Computing

Posted 20 August 2010 - 11:40 PM

I have a couple of friends who use R, but I've never really taken the time to play with it before. I liked the demo("graphics"), especially the "Interest in R" plot. I'm definitely gonna try to do something for this challenge, it seems pretty fun.

#3 Guest_Guest*

Re: Week #30- The R Project for Statistical Computing

Posted 22 August 2010 - 07:43 AM

I've actually done quite a bit of work with IDL and have been wanting to get my hands dirty with R. I've got the Introduction to Scientific Programming and Simulation Using R book and may try to cook something up.

#4 captainhampton

Re: Week #30- The R Project for Statistical Computing

Posted 22 August 2010 - 07:44 AM

Above post was me, forgot to sign in :)

#5 soldner

Re: Week #30- The R Project for Statistical Computing

Posted 26 August 2010 - 05:27 AM

Not really interested in R, but wanted to voice my appreciation for the 52 week challenge. While I will not code something in R, I will take a look at it.

Thanks!

#6 Stovek

Re: Week #30- The R Project for Statistical Computing

Posted 02 September 2010 - 12:11 PM

The week is getting old, but I've been meaning to do one of these challenges. This has piqued my interest, and I might have to look into it a little tonight to see what I can do.
