
#1 livium

  • D.I.C Addict

Reputation: 0
  • Posts: 519
  • Joined: 21-December 08

Problem opening a Unicode file in a rich edit control

Posted 31 May 2012 - 10:58 PM

Hello!
I have a very strange problem. I write some text in a Rich Edit 2.0 control and save it to a .rtf file. Then I open the file, but the text in the rich edit control appears with some strange extra characters. If I open the file in Microsoft Word, the text is OK.

For example, I save the following text:

this is just a test



Then, when I open the file in the rich edit control, the text appears as follows:


this is just a test췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍


It also puts a small square before "this", which I am unable to paste here.


This is the code for saving and opening the file:

void CProofingToolsAppDlg::OnFisierSalveaza()
{
	CFileDialog fdlg(false, L"rtf", NULL, OFN_FILEMUSTEXIST | OFN_PATHMUSTEXIST,
		L"Rich text files (*.rtf)|*.rtf|All Files (*.*)|*.*||", NULL, NULL);

	if(fdlg.DoModal() == IDOK)
	{
		CString FileName = fdlg.GetPathName();

		FILE* fisier;
		fisier=_wfopen(FileName,L"w, ccs=UTF-16LE");

		CString cs;

		//here I get the text from the rich edit
		m_rich.GetWindowTextW(cs);

		//and write it to file
		fwrite(cs.GetBuffer(),sizeof(wchar_t), wcslen(cs.GetBuffer()),fisier);

		fclose(fisier);
	}
}


void CProofingToolsAppDlg::OnFisierDeschide()
{
	CFileDialog fdlg(true, L"rtf", NULL, OFN_FILEMUSTEXIST | OFN_PATHMUSTEXIST,
		L"Rich text files (*.rtf)|*.rtf|All Files (*.*)|*.*||", NULL, NULL);

	if(fdlg.DoModal() == IDOK)
	{
		CString FileName = fdlg.GetPathName();

		int length;
		wchar_t* buffer;

		FILE* fisier;
		fisier=_wfopen(FileName,L"r, ccs=UTF-16LE");

		// get length of file:
		fseek(fisier,0,SEEK_END);
		length=ftell(fisier);
		fseek (fisier, 0, SEEK_SET);

		// allocate memory:
		buffer = new wchar_t [length+1];

		fread(buffer, sizeof(wchar_t),length,fisier);
		buffer[length] = L'\0';
		fclose(fisier);

		CString c1(buffer);

		//here I write the file to the rich edit
		m_rich.SetWindowTextW(c1.GetBuffer());

		delete[] buffer;
	}

	UpdateData(false);
}



Instead of UTF-16LE, I tried all the other encodings listed at http://msdn.microsof...b(v=vs.80).aspx, but to no avail.

To put it simply: saving works, but opening does not.


Replies To: Problem opening a Unicode file in a rich edit control

#2 Skydiver

  • Code herder

Reputation: 3467
  • Posts: 10,687
  • Joined: 05-May 12

Posted 01 June 2012 - 12:00 AM

The box appears because you seek back to before the BOM, so you read the BOM into your buffer and later pass it to the RichEdit control. You can fix that by changing:
fseek(fisier,0,SEEK_END);
length=ftell(fisier);
fseek (fisier, 0, SEEK_SET);


to
int currPos=ftell(fisier);
fseek(fisier,0,SEEK_END);
length=ftell(fisier) - currPos;
fseek (fisier, currPos, SEEK_SET);
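
A minimal sketch of the same idea, assuming the file's raw bytes are already in memory rather than read through the CRT's ccs= text mode: a UTF-16LE byte order mark is the two bytes FF FE, and it can be detected and skipped explicitly. The helper name utf16le_bom_size is hypothetical and not part of any library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns how many leading bytes belong to a
// UTF-16LE byte order mark (FF FE), or 0 when no BOM is present.
std::size_t utf16le_bom_size(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0); // a null pointer is only valid for empty input
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return 2;
    return 0;
}
```

Adding the returned value to the start of the data gives the first byte of real text, which plays the same role as currPos in the snippet above.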



#3 livium

Posted 01 June 2012 - 12:14 AM

I have changed those lines and I have the same problem. The only difference is that there's no square at the beginning of the sentence. :(


#4 Skydiver

Posted 01 June 2012 - 12:21 AM

View Postlivium, on 01 June 2012 - 12:14 AM, said:

I have changed those lines and I have the same problem. The only difference is that there's no square at the beginning of the sentence. :(


What library are you using that implements your CString? The documentation I'm seeing for CString::GetBuffer() takes a parameter while your version takes no parameters. Are you sure that the string returned by CString is null terminated?

For grins can you replace the lines:
CString c1(buffer);
//here I write the file to the rich edit
m_rich.SetWindowTextW(c1.GetBuffer());


with
CString c1(buffer);
//here I write the file to the rich edit
m_rich.SetWindowTextW(buffer);



Also attach a debugger and report back what the value of length is. Your text file should only be 40 bytes. If it is longer than 40 bytes, that is the source of your extra characters: your GetBuffer() is not null terminating properly when you write out your file.


#5 livium

Posted 01 June 2012 - 12:28 AM

The CString is null terminated. If I pass buffer instead of the CString, I get the same thing.

In debugging I see that length is set to 40 instead of 19, which is the length of the sentence.
I don't understand why the length is 40.

There's something wrong with ftell, which returns 40 instead of 19 on this line
length=ftell(fisier) - currPos;


#6 Skydiver

Posted 01 June 2012 - 12:31 AM

If you are using the MFC CString, GetBuffer() is not the correct way to get a C-style string. Simply let the compiler do the conversion to a const wchar_t * (or force the cast yourself), and it will invoke the correct operator. See http://msdn.microsof...(v=vs.100).aspx

#7 Skydiver

Posted 01 June 2012 - 12:37 AM

If your length variable is 40, that means that you didn't keep the change
int currPos=ftell(fisier);
fseek(fisier,0,SEEK_END);
length=ftell(fisier) - currPos;
fseek (fisier, currPos, SEEK_SET);



It is 40 bytes because of the BOM (2 bytes) + your string (19 characters * 2 bytes per character).

*sigh* I should have seen the issue sooner. Palm to the face.

Change that to:
int currPos=ftell(fisier);
fseek(fisier,0,SEEK_END);
length = (ftell(fisier) - currPos) / sizeof(wchar_t);
fseek (fisier, currPos, SEEK_SET);
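
Putting the two fixes together, the read could be sketched in portable C++ as follows. This is an illustration under assumptions, not the thread's MFC code: the stream is assumed to be opened in binary mode (so the BOM is visible as ordinary data), the host is assumed to be little-endian, and read_utf16le is a hypothetical name. It also trims the result to the count that fread reports, so a wrong length estimate cannot leave unread elements in the string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: reads a whole UTF-16LE file from a binary-mode
// stream and returns the text without the BOM. Assumes a little-endian host.
std::u16string read_utf16le(std::FILE* f)
{
    assert(f != nullptr);

    // ftell reports the size in bytes, not in characters.
    if (std::fseek(f, 0, SEEK_END) != 0)
        return std::u16string();
    const long bytes = std::ftell(f);
    if (bytes <= 0 || std::fseek(f, 0, SEEK_SET) != 0)
        return std::u16string();

    // Convert the byte count into a count of 16-bit code units.
    const std::size_t units = static_cast<std::size_t>(bytes) / sizeof(char16_t);
    std::u16string text(units, u'\0');

    // fread returns how many elements it actually stored; keep only those.
    const std::size_t got = std::fread(&text[0], sizeof(char16_t), units, f);
    text.resize(got);

    // Drop the byte order mark (U+FEFF) if the file starts with one.
    const char16_t bom = 0xFEFF;
    if (!text.empty() && text[0] == bom)
        text.erase(0, 1);
    return text;
}
```

For a 40-byte file holding the BOM plus "this is just a test", this reads 20 code units and returns the 19 characters of text.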



#8 livium

Posted 01 June 2012 - 12:43 AM

Since ftell returns 40, the buffer is allocated for 40 characters. fread puts
"this is just a test"
into the buffer, but the remaining characters, from 20 to 40, will be
췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍
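
The repeated 췍 is consistent with uninitialized memory: in Visual C++ debug builds, freshly allocated heap memory is filled with the byte 0xCD, and two such bytes read as one 16-bit character give U+CDCD, which displays as 췍. The sketch below reproduces the effect in portable C++; the explicit memset stands in for the debug fill (an assumption for illustration), and simulate_overallocated_read is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: models a buffer allocated for `allocated` code units
// when only text.size() of them are actually filled by the read.
std::u16string simulate_overallocated_read(const std::u16string& text,
                                           std::size_t allocated)
{
    assert(allocated >= text.size());
    std::u16string buffer(allocated, u'\0');
    if (allocated == 0)
        return buffer;

    // Stand-in for the debug heap fill: every byte becomes 0xCD,
    // so every 16-bit unit becomes 0xCDCD.
    std::memset(&buffer[0], 0xCD, allocated * sizeof(char16_t));

    // Only the units that were really read overwrite the fill pattern.
    std::copy(text.begin(), text.end(), buffer.begin());
    return buffer;
}
```

Called with the 19-character test sentence and an allocation of 40, the result keeps the sentence in the first 19 positions and leaves 0xCDCD in every position after it.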


#9 Skydiver

Posted 01 June 2012 - 12:43 AM

View Postlivium, on 01 June 2012 - 12:38 AM, said:

Since ftell returns 40, the buffer is allocated for 40 characters. fread puts
"this is just a test"
into the buffer, but the remaining characters, from 20 to 40, will be
췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍


This is the problem, ftell is not working properly.


ftell() is working correctly. It is supposed to return the current position in bytes. We were just interpreting the return value incorrectly as characters. Since you saved the file as UTF-16, each character takes 2 bytes.

#10 livium

Posted 01 June 2012 - 12:44 AM

This is the problem, ftell is not working properly.

Yes, it is working!!!
Man, thank you again.
Of course, ftell counts in bytes, not characters. Now I understand. What a mistake.

Thank you very much! You saved me once again!


#11 Skydiver

Posted 01 June 2012 - 12:47 AM

You are welcome.

Additionally, to save on CPU cycles and memory, you can change:
CString c1(buffer);

//here I write the file to the rich edit
m_rich.SetWindowTextW(c1.GetBuffer());


to
m_rich.SetWindowTextW(buffer);



#12 livium

Posted 01 June 2012 - 12:50 AM


Of course. :)