3 Replies - 8375 Views - Last Post: 15 July 2012 - 03:06 AM

#1 state1_1

Perl script question for data extracting in text file

Posted 14 July 2012 - 07:12 PM

I have a story-problem-style assignment that I'm trying to write a script for, and I'm new to Perl.
The problem is as follows:
Please write a Perl program that takes a text file with 1000 medical publication citations as input and creates a report of the information.
The report should have:
The top 30 authors with the most citations.
The total number of citations by the top 30 authors.
The journals that the top three authors publish in.
The top three journals with the most citations overall.

An example of a citation is :
1: Mardis ER. Applying sequencing to cancer treatment for pancreas. Nat Rev Gastro Hepatol. 2012 Jul 3. doi:10.1038/101.12/nrgastro.2012.12 [Epub ahead of print] PubMed PMID: 22751458.

Could someone please help me with how to write a script for this? Thank you.


Replies To: Perl script question for data extracting in text file

#2 GunnerInc

Re: Perl script question for data extracting in text file

Posted 14 July 2012 - 08:10 PM

Let's move this on over to the Perl forum.

What have you tried so far?

#3 state1_1

Re: Perl script question for data extracting in text file

Posted 14 July 2012 - 08:25 PM

I am very new to Perl, but I think I need to start with a script that anchors a match at the start of each line so that it picks out only the authors and builds a list? Then I need to count the authors and extract the top 30? Thanks.

#4 dsherohman

Re: Perl script question for data extracting in text file

Posted 15 July 2012 - 03:06 AM

state1_1, on 15 July 2012 - 03:12 AM, said:

I have a story-problem-style assignment that I'm trying to write a script for, and I'm new to Perl.
The problem is as follows:
Please write a Perl program that takes a text file with 1000 medical publication citations as input and creates a report of the information.
The report should have:
The top 30 authors with the most citations.
The total number of citations by the top 30 authors.
The journals that the top three authors publish in.
The top three journals with the most citations overall.


Assuming you're supposed to do this purely in Perl without using a database, the basic approach I would take is:

1) Use a regex (regular expression) to extract the author and journal from each citation entry.

2) Use three hashes to store the results: one to count total citations for each author, one to count total citations for each journal, and a hash of arrays (HoA) with author names as keys and (references to) arrays of the journal names each author was published in.
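
Steps 1 and 2 could be sketched roughly as follows. This is only an illustration, not a finished solution: it assumes every entry follows the "N: Authors. Title. Journal. YYYY ..." layout of the sample citation, that authors are separated by commas, and that every listed author (not just the first) gets credit. The sub names parse_citation and tally_citations are made up for this example, and the second sample entry is invented.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the author list and journal out of one citation.
# Assumes the layout "N: Authors. Title. Journal. YYYY ..." from the sample.
sub parse_citation {
    my ($line) = @_;
    return unless $line =~ m{
        ^\s*\d+:\s+      # entry number and colon
        ([^.]+)\.\s+     # author list, up to the first period
        .+?[.?!]\s+      # title, skipped
        ([^.]+)\.\s+     # journal abbreviation
        \d{4}\b          # publication year
    }x;
    my ($author_list, $journal) = ($1, $2);
    my @authors = split /\s*,\s*/, $author_list;
    return (\@authors, $journal);
}

# Build the three hashes: count per author, count per journal,
# and a hash of arrays (HoA) mapping each author to their journals.
sub tally_citations {
    my @lines = @_;
    my (%author_count, %journal_count, %journals_by_author);
    for my $line (@lines) {
        # Skip any line that does not match the assumed layout.
        my ($authors, $journal) = parse_citation($line) or next;
        $journal_count{$journal}++;
        for my $author (@$authors) {
            $author_count{$author}++;
            push @{ $journals_by_author{$author} }, $journal;
        }
    }
    return (\%author_count, \%journal_count, \%journals_by_author);
}

my @sample = (
    # The example citation from the question.
    '1: Mardis ER. Applying sequencing to cancer treatment for pancreas. Nat Rev Gastro Hepatol. 2012 Jul 3. doi:10.1038/101.12/nrgastro.2012.12 [Epub ahead of print] PubMed PMID: 22751458.',
    # Invented entry, only here to show a multi-author line.
    '2: Mardis ER, Doe J. A made-up title. J Example Med. 2011 May 1.',
);

my ($by_author, $by_journal, $hoa) = tally_citations(@sample);
print "$_: $by_author->{$_}\n" for sort keys %$by_author;
```

Real citation data is rarely this tidy (initials with periods, titles containing a year, and so on), so expect to adjust the regex once you see which lines it fails to match.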

To produce the output:

Quote

The top 30 authors with the most citations.
The total number of citations by the top 30 authors

Sort the "total citations by author" hash by descending value and print the name and count for the first 30 results.
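
A minimal sketch of that sort, assuming the counts are already in a hash. The top_n sub and the sample counts are invented for illustration, and breaking ties alphabetically is my own choice since the assignment doesn't say how to handle them.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $n keys of a count hash, highest count first.
# Ties are broken alphabetically so the output is stable between runs.
sub top_n {
    my ($counts, $n) = @_;
    my @sorted = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b }
                 keys %$counts;
    $n = @sorted if $n > @sorted;    # fewer than $n entries available
    return @sorted[0 .. $n - 1];
}

# Invented counts standing in for the "total citations by author" hash.
my %author_count = (
    'Mardis ER' => 5,
    'Doe J'     => 2,
    'Roe A'     => 7,
    'Poe B'     => 2,
);

my @top   = top_n(\%author_count, 30);
my $total = 0;
for my $author (@top) {
    printf "%-20s %d\n", $author, $author_count{$author};
    $total += $author_count{$author};
}
print "Total citations by these authors: $total\n";
```

Note that summing the per-author counts this way counts a co-authored paper once per top author; if the total is meant to count distinct citations, you would need to track that separately.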

Quote

The journals that the top three authors publish in

Take the first three names from the previous list and print all associated journal names from the HoA.
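
Something along these lines would do it, assuming the HoA pushes a journal name once per citation, so duplicates need to be filtered out before printing. The data, the @top_three list, and the unique_journals sub are all invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: list each distinct journal once, in first-seen order.
# Accepts an array reference (or undef for an author with no entries).
sub unique_journals {
    my ($journals) = @_;
    my %seen;
    return grep { !$seen{$_}++ } @{ $journals || [] };
}

# Invented HoA: author name => reference to an array of journal names.
my %journals_by_author = (
    'Roe A'     => ['J Example Med', 'Nat Rev Gastro Hepatol', 'J Example Med'],
    'Mardis ER' => ['Nat Rev Gastro Hepatol'],
    'Doe J'     => ['J Example Med'],
);

# Assumed to be the first three names from the sorted author list.
my @top_three = ('Roe A', 'Mardis ER', 'Doe J');

for my $author (@top_three) {
    my @journals = unique_journals($journals_by_author{$author});
    print "$author: ", join('; ', @journals), "\n";
}
```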

Quote

The top three journals with the most citations overall.

Sort the "total citations by journal" hash by descending value and print the first three entries.

(Or at least this is how I would approach it as a homework assignment... If the Bibliometrics guys at work asked for something like this, I'd do something more flexible and probably a bit over-engineered...)

Once you've got some code written, feel free to ask for help with any specific parts that may be giving you trouble.

Oh, and general advice, in case your instructor hasn't already made a point of it: Always start your code with
use strict;
use warnings;
Enabling these two pragmas will catch a lot of errors for you. Even for errors they don't catch, they tend to be very helpful in identifying why your code is behaving strangely.
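
As a small illustration of the kind of mistake strict catches, here is a sketch with a deliberately misspelled variable name. The snippet is wrapped in a string eval only so the compile error can be captured and printed; in a normal script, Perl would simply refuse to run and report the error.

```perl
use strict;
use warnings;

# The code in the string declares $total but then assigns to the typo $totl.
# Under strict, that undeclared variable is a compile-time error.
my $ok = eval q{
    use strict;
    my $total = 1;
    $totl = 2;
    1;
};
my $error = $ok ? '' : $@;
print "strict caught: $error" if $error;
```

Without strict, the misspelled name would quietly create a new global variable and $total would keep its old value, which is exactly the sort of bug that is hard to spot by eye.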