Please help with using a TTS engine in C++

15 Replies - 1899 Views - Last Post: 21 September 2014 - 09:37 AM

#1 zaza.balakhashvili

Posted 20 September 2014 - 03:46 AM

Hello devs.

Sorry if I am posting this question in the wrong forum; I am new here.

We wrote a simple console app in C++ which should write synthesized speech into a WAV file.

However, we get a little noise instead of the recorded speech.
I am wondering whether we are writing to the file correctly.

The link is https://dl.dropboxus.../RecordText.cpp
Thank you.

// RecordText.cpp: defines the entry point for the console application.
//

#include <Windows.h>
#include "stdafx.h"
#include "RHVoice.h"
#include <iostream>
#include <fcntl.h>
#include <fstream>
#include <io.h>
#include <string>

using namespace std;

int my_play_speech(const short* samples,unsigned int count,void* user_data);
const char* my_converte_string(const char *str);

int _tmain(int argc, _TCHAR* argv[])
{

RHVoice_callbacks cb;
cb.play_speech = my_play_speech;
cb.process_mark=0;
cb.play_audio=0;
cb.sentence_ends=0;
cb.sentence_starts=0;
cb.word_ends=0;
cb.word_starts=0;

RHVoice_init_params params;
params.data_path = 0;
params.config_path = 0;
params.resource_paths = 0;
params.callbacks = cb;
params.options=0;

const char* ver = RHVoice_get_version();
cout << "Version: " << ver << endl;

RHVoice_tts_engine engine = RHVoice_new_tts_engine(&params);
cout << "engine: " << engine << endl;

const char *tv = "Slt+Natia";

RHVoice_synth_params synth_params;
synth_params.voice_profile = my_converte_string(tv);
synth_params.absolute_rate=0;
synth_params.relative_rate=1;
synth_params.absolute_pitch=0;
synth_params.relative_pitch=1;
synth_params.absolute_volume=0;
synth_params.relative_volume=1;
synth_params.punctuation_mode=RHVoice_punctuation_mode::RHVoice_punctuation_default;
synth_params.punctuation_list=0;
synth_params.capitals_mode=RHVoice_capitals_mode::RHVoice_capitals_default;

unsigned int num = RHVoice_get_number_of_voices(engine);
const RHVoice_voice_info* voice = RHVoice_get_voices(engine);
for (unsigned int i = 0; i < num; i++) {
cout << "Info synth: " << voice[i].name << ", " << voice[i].language << endl;
}

num = RHVoice_get_number_of_voice_profiles(engine);
  const char* const* profile = RHVoice_get_voice_profiles(engine);
for (unsigned int i = 0; i < num; i++) {
cout << "Info profile: " << profile[i] << endl;
}

const char *t = "Hello, World! I am people!";
const char *text = my_converte_string(t);

const RHVoice_message msg = RHVoice_new_message(engine,
text,
strlen(text),
RHVoice_message_type::RHVoice_message_text,
&synth_params,
NULL);
cout << "message: " << " " << msg << endl;

int res = RHVoice_speak(msg);
cout << "speak: " << res << endl;

RHVoice_delete_message(msg);

RHVoice_delete_tts_engine(engine);
cout << "end:" << endl;

	return 0;
}


int my_play_speech(const short* samples,unsigned int count,void* user_data) {

WAVEFORMATEX fmt;
fmt.wFormatTag = WAVE_FORMAT_PCM;
fmt.nChannels = 1;
fmt.nSamplesPerSec = 16000;
fmt.nAvgBytesPerSec = 32000; //nSamplesPerSec * nBlockAlign
fmt.nBlockAlign = 2; // nChannels * wBitsPerSample / 8
fmt.wBitsPerSample = 16;
fmt.cbSize = 0;

ofstream file("C:\\record.wav", ios::beg | ios::out | ios::trunc | ios::binary);
if (!file) return 0;
DWORD dwRIFF = 0;
dwRIFF = MAKEFOURCC('R','I','F','F');
file.write((char*) &dwRIFF, 4);
DWORD dwSpace = ((sizeof(short) * count) + 40);
cout << dwSpace << endl;
file.write((char*) &dwSpace, 4);
DWORD dwWave = 0;
DWORD dwFormat = 0;
long  lSizeFmt = 0; // Size of the fmt section
dwWave = MAKEFOURCC('W','A','V','E');
file.write((char*)&dwWave, 4);
dwFormat = MAKEFOURCC('f','m','t',' ');
file.write((char*)&dwFormat, 4);
lSizeFmt = 16;
file.write((char*) &lSizeFmt, 4); // Next 4 bytes: the size of the format section, which holds the format itself and possibly extra data; the format occupies 16 bytes
file.write((char*) &fmt, 16);
DWORD dwSection = 0; // The next section is either fact (optional) or data (our actual audio)
dwSection = MAKEFOURCC('d','a','t','a');
file.write((char*) &dwSection, 4);
long lSizeData = 0;
lSizeData = (sizeof(short) * count);
cout << lSizeData << endl;
file.write((char*) &lSizeData, 4);
file.write((char*) samples, lSizeData);
file.close();

return 1;
}

const char* my_converte_string(const char *inStr) {
// Important: pass only null-terminated strings to this function.
DWORD dwNum = MultiByteToWideChar(CP_ACP, 0, inStr, -1, NULL, 0);
wchar_t *tempStr = new wchar_t[dwNum];
MultiByteToWideChar(CP_ACP, 0, inStr, -1, tempStr, dwNum);

dwNum = WideCharToMultiByte(CP_UTF8, 0, tempStr, -1, NULL, 0, NULL, NULL);
char *outStr = new char[dwNum];
WideCharToMultiByte(CP_UTF8, 0, tempStr, -1, outStr, dwNum, NULL, NULL);

delete[] tempStr;

return (const char*) outStr;
}
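
A possible explanation for the noise, offered as an assumption rather than a confirmed diagnosis: if the engine calls play_speech once per chunk of audio, then reopening the file with ios::trunc inside the callback leaves only the last chunk on disk. The sketch below collects every chunk in memory and writes the file once after synthesis. SpeechBuffer, collect_speech and write_wav are hypothetical names, the 16000 Hz rate is taken from the code above, and passing the buffer through user_data assumes the engine hands that pointer back to the callback unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical accumulator for all chunks delivered by the engine.
struct SpeechBuffer {
    std::vector<short> samples;
};

// Same signature as the play_speech callback above: append only, no file I/O.
int collect_speech(const short* samples, unsigned int count, void* user_data) {
    assert(user_data != nullptr);
    SpeechBuffer* buf = static_cast<SpeechBuffer*>(user_data);
    buf->samples.insert(buf->samples.end(), samples, samples + count);
    return 1; // same success value the original callback returns
}

// Little-endian writers, so the result does not depend on struct layout.
static void put_u16(std::ofstream& out, std::uint16_t v) {
    const char b[2] = { static_cast<char>(v & 0xFF), static_cast<char>(v >> 8) };
    out.write(b, 2);
}

static void put_u32(std::ofstream& out, std::uint32_t v) {
    put_u16(out, static_cast<std::uint16_t>(v & 0xFFFF));
    put_u16(out, static_cast<std::uint16_t>(v >> 16));
}

// Writes a mono 16-bit PCM file in one go (assumes short is 16 bits wide).
bool write_wav(const std::string& path, const std::vector<short>& samples,
               std::uint32_t sample_rate) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    const std::uint32_t data_bytes =
        static_cast<std::uint32_t>(samples.size() * 2);
    out.write("RIFF", 4);
    put_u32(out, 36 + data_bytes);   // everything after this field
    out.write("WAVE", 4);
    out.write("fmt ", 4);
    put_u32(out, 16);                // PCM format chunk size
    put_u16(out, 1);                 // audio format: PCM
    put_u16(out, 1);                 // channels: mono
    put_u32(out, sample_rate);
    put_u32(out, sample_rate * 2);   // bytes per second
    put_u16(out, 2);                 // block align
    put_u16(out, 16);                // bits per sample
    out.write("data", 4);
    put_u32(out, data_bytes);
    for (short s : samples) put_u16(out, static_cast<std::uint16_t>(s));
    return out.good();
}
```

In the program above, this would mean registering collect_speech as the callback, passing a SpeechBuffer pointer where NULL is currently given to RHVoice_new_message, and calling write_wav after RHVoice_speak returns.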


This post has been edited by JackOfAllTrades: 20 September 2014 - 03:47 AM
Reason for edit:: Added code to post


Replies To: please help with using tts engine in C++

#2 #define

Posted 20 September 2014 - 05:24 AM

Hi, welcome to DIC.

Do you get anything if you use Windows Sound Recorder?

Ahh, my mistake, it converts text to sound.

Are you trying to speak text to a wav file?

This post has been edited by #define: 20 September 2014 - 05:34 AM

#3 zaza.balakhashvili

Posted 20 September 2014 - 05:39 AM

yes, i am trying to write synthesized speech into a wav file.

#4 #define

Posted 20 September 2014 - 05:45 AM

Is it Microsoft SAPI you are using? What is RHVoice?

#5 zaza.balakhashvili

Posted 20 September 2014 - 05:49 AM

No, it is not SAPI; it is an independent engine. The problem is in the stream, I think, but I don't know how to fix it.

#6 #define

Posted 20 September 2014 - 06:51 AM

If you think it is the file stream, you can check the size of the file and look at it with a hex viewer (which you could write yourself if you want).

Is the sample rate correct at 16000 samples per second?

Your program would be better with indentation.
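
For the hex viewer idea, a minimal sketch of a dump routine might look like the following. The names read_prefix and hex_dump are made up for illustration; nothing here comes from the original program.

```cpp
#include <cstddef>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Reads up to max_bytes from the start of a file (empty result if it cannot be opened).
std::vector<unsigned char> read_prefix(const std::string& path, std::size_t max_bytes) {
    std::ifstream in(path.c_str(), std::ios::binary);
    std::vector<unsigned char> bytes;
    char c;
    while (bytes.size() < max_bytes && in.get(c)) {
        bytes.push_back(static_cast<unsigned char>(c));
    }
    return bytes;
}

// Formats bytes as rows of 16: offset, hex values, then printable ASCII.
std::string hex_dump(const std::vector<unsigned char>& bytes) {
    std::ostringstream os;
    os << std::hex << std::setfill('0');
    for (std::size_t row = 0; row < bytes.size(); row += 16) {
        os << std::setw(8) << row << "  ";
        for (std::size_t i = 0; i < 16; ++i) {
            if (row + i < bytes.size()) {
                os << std::setw(2) << static_cast<unsigned>(bytes[row + i]) << ' ';
            } else {
                os << "   "; // keep the ASCII column aligned on a short last row
            }
        }
        os << ' ';
        for (std::size_t i = row; i < row + 16 && i < bytes.size(); ++i) {
            const unsigned char b = bytes[i];
            os << static_cast<char>((b >= 32 && b < 127) ? b : '.');
        }
        os << '\n';
    }
    return os.str();
}
```

Printing hex_dump(read_prefix("C:\\record.wav", 64)) shows the whole header. In a valid file the first row starts with 52 49 46 46 ("RIFF") and bytes 8 to 11 read "WAVE".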

#7 #define

Posted 20 September 2014 - 08:47 AM

Hi, taking inspiration from the MSDN tutorial here:

Tutorial: Decoding Audio

The header function there is quite nice, so I adapted it along with a write-to-file function. I haven't tested this, so there is likely a problem or two.

This just writes the header.

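// Writes cb bytes starting at ptr. Taking const void* lets the DWORD arrays
// and the WAVEFORMATEX struct below be passed without a cast.
// Returns false if the stream is in a failed state after the write.
bool write_to_file(fstream &out, const void* ptr, streamsize cb)
{
  out.write(static_cast<const char*>(ptr), cb);

  return out.good();
}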


bool write_wave_header(fstream &out, DWORD &numbyteswritten)
{
  bool result = true;
  UINT32 cbFormat = 0;

  WAVEFORMATEX fmt; // = {0}

  // values set
  fmt.wFormatTag = WAVE_FORMAT_PCM;
  fmt.cbSize = 0;

  fmt.nChannels = 1;
  fmt.nSamplesPerSec = 16000;
  fmt.wBitsPerSample = 16;

  // values calculated
  fmt.nBlockAlign = fmt.nChannels * fmt.wBitsPerSample / 8; // 2
  fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign; // 32000

  cbFormat = sizeof(fmt); // format size ? chunk size ?

  cout << "WAVEFORMATEX size is " << cbFormat << endl;

  DWORD header[] = {
    // RIFF header
    FCC("RIFF"),
    0,
    FCC("WAVE"),
    // Start of 'fmt ' chunk
    FCC("fmt "),
    cbFormat
  };

  DWORD dataheader[] = {FCC("data"), 0};

  if(result)
  {
    result = write_to_file(out, header, sizeof(header));
  }

  // write WAVEFORMATEX structure
  if(result)
  {
    result = write_to_file(out, &fmt, sizeof(fmt));
  }

  // write start of data chunk
  if(result)
  {
    result = write_to_file(out, dataheader, sizeof(dataheader));

  }

  if(result)
  {
    numbyteswritten = sizeof(header) + cbFormat + sizeof(dataheader);
  }

  return result; 
}




I'm wondering about the fmt size.
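
The header function above writes 0 for both the RIFF chunk size and the data chunk size, so those two fields still have to be filled in once the amount of audio is known. One way to do that, sketched here with a hypothetical patch_wave_sizes helper, is to seek back into the file afterwards. The header_bytes argument is meant to be the numbyteswritten value produced above.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>

// Overwrites 4 bytes at pos with v in little-endian order.
static void put_u32_at(std::fstream& f, std::streamoff pos, std::uint32_t v) {
    const char b[4] = {
        static_cast<char>(v & 0xFF),
        static_cast<char>((v >> 8) & 0xFF),
        static_cast<char>((v >> 16) & 0xFF),
        static_cast<char>((v >> 24) & 0xFF)
    };
    f.seekp(pos, std::ios::beg);
    f.write(b, 4);
}

// header_bytes: total size of everything written before the samples.
// data_bytes:   number of sample bytes written after the header.
bool patch_wave_sizes(std::fstream& f, std::uint32_t header_bytes,
                      std::uint32_t data_bytes) {
    assert(header_bytes >= 12); // at least "RIFF", size, "WAVE"
    // The RIFF size counts everything after the first 8 bytes of the file.
    put_u32_at(f, 4, header_bytes - 8 + data_bytes);
    // The data size is the last 4 bytes of the header.
    put_u32_at(f, header_bytes - 4, data_bytes);
    f.seekp(0, std::ios::end); // leave the stream positioned at the end
    return f.good();
}
```

With a 16-byte format chunk the header is 44 bytes and the data size sits at offset 40; with the 18-byte structure written above it would be 46 bytes and offset 42, which is why the offset is derived from header_bytes instead of being hard-coded.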

This post has been edited by #define: 20 September 2014 - 08:50 AM

#8 #define

Posted 20 September 2014 - 09:07 AM

There can be problems with writing structures to file. It looks like the size of WAVEFORMATEX should be 18 bytes, but in memory that can be different due to the compiler aligning data members. So that may need adjusting or writing differently.

Anyway you seem to be using 16 for the format size which would be wrong, according to some.

dwFormat = MAKEFOURCC('f','m','t',' ');
file.write((char*)&dwFormat, 4);

lSizeFmt = 16;
file.write((char*) &lSizeFmt, 4);
 
file.write((char*) &fmt, 16);




Another way to check is to read a working WAV file and compare its header with yours.

The Wave File Format is shown here; it uses 16 for the format size, writes the structure members individually, and does not write cbSize:

Recording and Playing sound ...

You could also write 2 bytes fewer of the structure (that is, not write cbSize):

RIFF update ...
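
To compare against a working file, a small header parser is enough. The sketch below is only an illustration: WavInfo and parse_wav_header are hypothetical names, and it reads little-endian fields from a byte buffer so the result does not depend on how the compiler lays out a struct.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct WavInfo {
    bool ok = false;              // true once both "fmt " and "data" were found
    std::uint32_t fmt_size = 0;   // 16 for plain PCM, 18 if cbSize was written
    std::uint16_t format = 0;     // 1 means PCM
    std::uint16_t channels = 0;
    std::uint32_t sample_rate = 0;
    std::uint16_t bits = 0;
    std::uint32_t data_size = 0;  // size of the sample data in bytes
};

static std::uint16_t get_u16(const std::vector<unsigned char>& b, std::size_t i) {
    return static_cast<std::uint16_t>(b[i] | (b[i + 1] << 8));
}

static std::uint32_t get_u32(const std::vector<unsigned char>& b, std::size_t i) {
    return static_cast<std::uint32_t>(get_u16(b, i)) |
           (static_cast<std::uint32_t>(get_u16(b, i + 2)) << 16);
}

// Walks the RIFF chunks until the data chunk is reached.
WavInfo parse_wav_header(const std::vector<unsigned char>& b) {
    WavInfo info;
    if (b.size() < 12 || std::memcmp(&b[0], "RIFF", 4) != 0 ||
        std::memcmp(&b[8], "WAVE", 4) != 0) {
        return info;
    }
    bool have_fmt = false;
    std::size_t pos = 12;
    while (pos + 8 <= b.size()) {
        const std::uint32_t size = get_u32(b, pos + 4);
        if (std::memcmp(&b[pos], "fmt ", 4) == 0 && size >= 16 &&
            pos + 8 + 16 <= b.size()) {
            info.fmt_size = size;
            info.format = get_u16(b, pos + 8);
            info.channels = get_u16(b, pos + 10);
            info.sample_rate = get_u32(b, pos + 12);
            info.bits = get_u16(b, pos + 22);
            have_fmt = true;
        } else if (std::memcmp(&b[pos], "data", 4) == 0) {
            info.data_size = size;
            info.ok = have_fmt;
            break;
        }
        pos += 8 + size + (size & 1); // chunks are padded to an even length
    }
    return info;
}
```

Running it on the first 64 bytes or so of a known-good file and of the generated one, then comparing the two WavInfo results field by field, should show quickly whether the format size or one of the length fields is off.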

This post has been edited by #define: 20 September 2014 - 09:25 AM

#9 snoopy11

Posted 20 September 2014 - 09:38 AM

Please see my Tutorial which explains the wave format and how to properly save a wave file here

Tutorial

When saving a wave file, the layout should be:

"RIFF", 4 bytes
chunksize, 4 bytes
"WAVE", 4 bytes
"fmt ", 4 bytes
subchunk1size, 4 bytes
audioformat, 2 bytes
numchannels, 2 bytes
samplerate, 4 bytes
byterate, 4 bytes
blockalign, 2 bytes
bitspersample, 2 bytes
"data", 4 bytes
subchunk2size, 4 bytes
sample data, subchunk2size bytes

audioformat should equal 1 for PCM.
subchunk1size should equal 16 for PCM.
numChannels I would set to 1 for simplicity's sake.

Try making those changes and see where it gets you.

Best Wishes

Snoopy.
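
As a worked example of the layout above, the derived fields can be computed from the three basic settings plus the number of samples. The helper below is a sketch with made-up names (WavSizes, compute_wav_sizes). It assumes the 16-byte PCM format chunk from the list, in which case chunksize comes to 36 plus the data size: 4 bytes for "WAVE", 24 for the format chunk and 8 for the data chunk header. The code in the first post adds 40 instead.

```cpp
#include <cstdint>

struct WavSizes {
    std::uint16_t block_align;     // bytes per sample frame
    std::uint32_t byte_rate;       // bytes per second
    std::uint32_t subchunk2_size;  // bytes of sample data
    std::uint32_t chunk_size;      // value for the RIFF chunksize field
};

// sample_frames is the number of samples per channel.
WavSizes compute_wav_sizes(std::uint16_t channels, std::uint32_t sample_rate,
                           std::uint16_t bits_per_sample,
                           std::uint32_t sample_frames) {
    WavSizes s;
    s.block_align = static_cast<std::uint16_t>(channels * bits_per_sample / 8);
    s.byte_rate = sample_rate * s.block_align;
    s.subchunk2_size = sample_frames * s.block_align;
    // "WAVE" (4) + fmt chunk (8 + 16) + data chunk header (8) + data
    s.chunk_size = 4 + (8 + 16) + 8 + s.subchunk2_size;
    return s;
}
```

For the mono, 16000 Hz, 16-bit settings used in this thread, block_align is 2 and byte_rate is 32000, which matches the constants in the posted code.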

#10 zaza.balakhashvili

Posted 20 September 2014 - 11:16 PM

Hi all.

We rewrote this app, but when I enter text through the command line, only the first character is spoken.

I think we should terminate the string with NULL, but I don't know how to do that in C++.

The link is https://dl.dropboxus...RecordText2.cpp
Please help if you can.
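
One possible cause, stated as an assumption because the project settings are not shown: if the project is built with the Unicode character set, _tmain receives UTF-16 wide strings, and reading such a string through a char pointer stops at the zero byte that follows the first ASCII character. The portable sketch below uses char16_t as a stand-in for the Windows wide character type; length_if_misread_as_char and utf16_to_utf8 are hypothetical helpers written for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// What strlen sees when UTF-16 text is treated as a char string.
// On a little-endian machine an ASCII letter is stored as {letter, 0}.
std::size_t length_if_misread_as_char(const char16_t* wide) {
    return std::strlen(reinterpret_cast<const char*>(wide));
}

// Converts UTF-16 to UTF-8. Unpaired surrogates are not validated.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        // Combine a surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

If that is what is happening, the string is already null-terminated and the fix is to treat the argument as wide text and convert it to UTF-8 directly. On Windows, WideCharToMultiByte with CP_UTF8, which the posted conversion function already calls in its second step, does the same job as utf16_to_utf8 here.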

#11 snoopy11

Posted 21 September 2014 - 01:05 AM

The code for this

// RecordText.cpp: defines the entry point for the console application.
//

#include <Windows.h>
#include <mmsystem.h>
#include "stdafx.h"
#include "RHVoice.h"
#include <iostream>
#include <fcntl.h>
#include <fstream>
#include <io.h>
#include <string>
#include <stdio.h>

using namespace std;

int my_play_speech(const short* samples,unsigned int count,void* user_data);
const char* my_converte_string(const char *str);

int numbuf;
HWAVEOUT hWaveOut;
HANDLE hEvent;
char buf[5][5120];
WAVEHDR wHdr[5];
int nb;

int _tmain(int argc, char* argv[])
{
int size;
nb = 0;
numbuf = 0;
const char* s1 = argv[1];

	WAVEFORMATEX fmt;
fmt.wFormatTag = WAVE_FORMAT_PCM;
fmt.nChannels = 1;
fmt.nSamplesPerSec = 16000;
fmt.nAvgBytesPerSec = 32000; //nSamplesPerSec * nBlockAlign
fmt.nBlockAlign = 2; // nChannels * wBitsPerSample / 8
fmt.wBitsPerSample = 16;
fmt.cbSize = 0;

hEvent = CreateEvent(NULL, false, false, NULL);
waveOutOpen(&hWaveOut, WAVE_MAPPER, &fmt, (DWORD_PTR)hEvent, 0, CALLBACK_EVENT); // DWORD_PTR keeps the full handle value on 64-bit builds

RHVoice_callbacks cb;
cb.play_speech = my_play_speech;
cb.process_mark=0;
cb.play_audio=0;
cb.sentence_ends=0;
cb.sentence_starts=0;
cb.word_ends=0;
cb.word_starts=0;

RHVoice_init_params params;
params.data_path = 0;
params.config_path = 0;
params.resource_paths = 0;
params.callbacks = cb;
params.options=0;

const char* ver = RHVoice_get_version();

RHVoice_tts_engine engine = RHVoice_new_tts_engine(&params);

const char *tv = "Alan";

RHVoice_synth_params synth_params;
synth_params.voice_profile = my_converte_string(tv);
synth_params.absolute_rate=0;
synth_params.relative_rate=1;
synth_params.absolute_pitch=0;
synth_params.relative_pitch=1;
synth_params.absolute_volume=1;
synth_params.relative_volume=1;
synth_params.punctuation_mode=RHVoice_punctuation_mode::RHVoice_punctuation_default;
synth_params.punctuation_list=0;
synth_params.capitals_mode=RHVoice_capitals_mode::RHVoice_capitals_default;

unsigned int num = RHVoice_get_number_of_voices(engine);
const RHVoice_voice_info* voice = RHVoice_get_voices(engine);
for (unsigned int i = 0; i < num; i++) {
}

num = RHVoice_get_number_of_voice_profiles(engine);
  const char* const* profile = RHVoice_get_voice_profiles(engine);
for (unsigned int i = 0; i < num; i++) {
}

const char *t = s1;
const char *text = my_converte_string(t);

const RHVoice_message msg = RHVoice_new_message(engine,
text,
strlen(text),
RHVoice_message_type::RHVoice_message_text,
&synth_params,
NULL);

int res = RHVoice_speak(msg);

RHVoice_delete_message(msg);

RHVoice_delete_tts_engine(engine);

waveOutClose(hWaveOut);
CloseHandle(hEvent);


	return 0;
}


int my_play_speech(const short* samples,unsigned int count,void* user_data) {

memcpy(buf[nb], samples, (count * sizeof(short)));
wHdr[nb].lpData = buf[nb];
wHdr[nb].dwBufferLength = (count * sizeof(short));
waveOutPrepareHeader(hWaveOut, &wHdr[nb], sizeof(WAVEHDR));
waveOutWrite(hWaveOut, &wHdr[nb], sizeof(WAVEHDR));
WaitForSingleObject(hEvent, INFINITE);
waveOutUnprepareHeader(hWaveOut, &wHdr[nb], sizeof(WAVEHDR));
numbuf++;
nb++;
if (nb > 4) nb = 0;

return 1;
}

const char* my_converte_string(const char *inStr) {
// Important: pass only null-terminated strings to this function.
DWORD dwNum = MultiByteToWideChar(CP_ACP, 0, inStr, -1, NULL, 0);
wchar_t *tempStr = new wchar_t[dwNum];
MultiByteToWideChar(CP_ACP, 0, inStr, -1, tempStr, dwNum);

dwNum = WideCharToMultiByte(CP_UTF8, 0, tempStr, -1, NULL, 0, NULL, NULL);
char *outStr = new char[dwNum];
WideCharToMultiByte(CP_UTF8, 0, tempStr, -1, outStr, dwNum, NULL, NULL);

delete[] tempStr;

return (const char*) outStr;
}




Please use the code tags provided in your editor.

Thank You.

#12 zaza.balakhashvili

Posted 21 September 2014 - 03:30 AM

hi, please can you post only the changed code?

i can't copy it in the normal way with screenreader.

#13 #define

Posted 21 September 2014 - 05:32 AM

Hi, if you go to the top right of the code box there is a pop-up 'view source' button, which will probably work for you.

#14 zaza.balakhashvili

Posted 21 September 2014 - 07:24 AM

unfortunately, it is not accessible for screenreaders.

#15 #define

Posted 21 September 2014 - 07:59 AM

Can a screenreader download a linked/attached file?