5 Replies - 696 Views - Last Post: 20 February 2014 - 01:47 PM

#1 codeKitty3

How to read the current page contents

Posted 20 February 2014 - 11:54 AM

Hi,
I am new to JavaScript. How do I read the current URL and then read the contents of the current page? Eventually, I want to count the frequency of words in the current web page.

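var currentPageUrl = "";
if (typeof this.href === "undefined") {
    // `this` has no href here, so fall back to the document's own location
    currentPageUrl = document.location.toString().toLowerCase();
} else {
    // `this` carries its own href (for example, a link element)
    currentPageUrl = this.href.toString().toLowerCase();
}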

I used the above code to get the current page URL. So what is the next step? By the way, when I ran this script from Eclipse, it showed nothing on the page, just the local browser URL. How can I read a page (e.g. www.wiki.com) and count the frequency of words on that page? Any ideas? Help, please!

Thanks,
Kitty

Replies To: How to read the current page contents

#2 ArtificialSoldier

Posted 20 February 2014 - 12:07 PM

You're only going to be able to use JavaScript to read pages that you write. Even if you put that code on a page that has a frame which embeds another site, you're not going to be able to use JavaScript to get at that site's contents. That's a security restriction (the same-origin policy), since the two pages are on different domains.

Are you trying to do this for your own pages, or for any arbitrary page? It seems like a server-side language would be a much better choice for analyzing arbitrary pages like that.
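
To make the "different domains" point concrete, here is a rough sketch of the kind of comparison involved. The helper name isSameOrigin and the example URLs are made up for illustration, and the browser's real check has more cases than this; the sketch only compares scheme, host, and port through the standard URL object.

```javascript
// Hypothetical helper (not from the thread): a simplified version of the
// comparison behind the restriction. Two URLs count as the same origin
// only when scheme, host, and port all match.
function isSameOrigin(urlA, urlB) {
    return new URL(urlA).origin === new URL(urlB).origin;
}

// Same site, different paths: a script on one page may read the other.
console.log(isSameOrigin("http://example.com/a.html", "http://example.com/b.html"));

// Different hosts: a script on the first page cannot read the second.
console.log(isSameOrigin("http://example.com/", "http://www.wiki.com/"));
```

Under these assumptions the first call logs true and the second logs false, which is the situation with a frame that embeds another site.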

#3 codeKitty3

Posted 20 February 2014 - 12:18 PM

Say I try to do it in my own page: I put "This is a testing!" in my page and then try to read the content. I used document.write() to see if I could read it and print it out, but I got nothing!

Thanks,
Kitty

This post has been edited by andrewsw: 20 February 2014 - 12:38 PM
Reason for edit:: Removed previous quote


#4 ArtificialSoldier

Posted 20 February 2014 - 12:49 PM

document.write doesn't read anything; it only writes. If you call document.write on a page that has already finished loading, it will remove what's there and replace it. In general, it should be avoided.

You can use this to get the text for a page:

var textContent = document.body.textContent || document.body.innerText;
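
Once the text is in a string, counting word frequencies (the original goal) could look something like the sketch below. The function name countWordFrequencies and the word pattern are assumptions for illustration: the pattern only recognises ASCII letters, digits, and apostrophes, and text from any script or style tags inside the body would be counted too.

```javascript
// Hypothetical helper (not from the thread): counts how often each word
// occurs in a string and returns an object mapping word -> count.
function countWordFrequencies(text) {
    // Object.create(null) avoids clashes with inherited keys such as "constructor".
    var counts = Object.create(null);
    // Lowercase so "This" and "this" are counted together, then collect
    // runs of ASCII letters, digits, and apostrophes as words.
    var words = text.toLowerCase().match(/[a-z0-9']+/g) || [];
    for (var i = 0; i < words.length; i++) {
        var word = words[i];
        counts[word] = (counts[word] || 0) + 1;
    }
    return counts;
}

// Only touch the page when a DOM exists, i.e. when running in a browser.
if (typeof document !== "undefined" && document.body) {
    var pageText = document.body.textContent || document.body.innerText || "";
    console.log(countWordFrequencies(pageText));
}
```

For a page whose body contains only "This is a testing!", the result would have one entry each for this, is, a, and testing.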


#5 codeKitty3

Posted 20 February 2014 - 01:29 PM

Why does only var textContent = document.body.textContent; work?

The following don't work:
var textContent = document.body.innerText;

or
var textContent = document.getElementsByTagName('body')[0].innerHTML;

Thanks,
Kitty

#6 ArtificialSoldier

Posted 20 February 2014 - 01:47 PM

Some browsers support textContent, and some support innerText. The || in what I showed tries textContent first and falls back to innerText when the first one isn't available.
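
The same check can be spelled out as explicit feature detection. This is only a sketch: getElementText is a made-up name, and it assumes the element exposes at most the two properties discussed here. Unlike the || version, it does not fall through to innerText when textContent exists but is an empty string.

```javascript
// Hypothetical helper (not from the thread): returns an element's text
// using whichever of the two properties the browser provides.
function getElementText(element) {
    if (typeof element.textContent === "string") {
        return element.textContent; // the standard DOM property
    }
    if (typeof element.innerText === "string") {
        return element.innerText;   // fallback for browsers without textContent
    }
    return ""; // neither property is available
}

// Usage in a browser page:
if (typeof document !== "undefined" && document.body) {
    console.log(getElementText(document.body));
}
```

Note that innerHTML is a different thing again: it returns the page's markup, tags included, which is why it is not a drop-in replacement for either property.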
