Determine if someone is speaking

Thu, 2005-07-21 10:12
Joined: 2005-07-14
Forum posts: 3
Hi,

Does anybody know if it's possible to recognize whether the user is actually speaking into the microphone?

Maybe recording with CMdaAudioInputStream and analyzing the data would do it, but in that case some sort of detection algorithm would be needed.

Thanks a lot,
Pekka


Thu, 2005-08-04 12:29
Joined: 2005-07-25
Forum posts: 31
Re: Determine if someone is speaking
Hi pkosonen

As far as I understand, you want to know whether the signal coming into the microphone
is speech or not (that is, voice activity versus silence or background noise). If that is exactly what you want, there
is a way to decide whether the incoming audio signal contains speech.

Please refer to G.729 Annex B (VAD / DTX and CNG). Specifically, VAD (Voice Activity Detection)
classifies each incoming frame as active voice or inactive (silence or background noise), so it tells you
whether the incoming signal contains speech data or silence data.
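
To make the idea of a per-frame decision concrete, here is a minimal sketch in plain standard C++. It is not the G.729 Annex B algorithm, which is considerably more elaborate; it only combines two crude features, mean energy and zero-crossing rate. The names `FrameVadConfig`, `MeanEnergy`, `ZeroCrossRate` and `IsSpeechFrame` and both threshold values are assumptions made up for illustration, and the thresholds would need tuning against real recordings.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical thresholds for 16-bit PCM; tune them on real recordings.
struct FrameVadConfig {
    double minEnergy = 1.0e6;        // mean squared amplitude (RMS of about 1000)
    double maxZeroCrossRate = 0.35;  // fraction of adjacent sample pairs that change sign
};

// Mean squared amplitude of one frame.
double MeanEnergy(const std::vector<std::int16_t>& frame) {
    if (frame.empty()) return 0.0;
    double sum = 0.0;
    for (std::int16_t s : frame) sum += static_cast<double>(s) * s;
    return sum / static_cast<double>(frame.size());
}

// Fraction of adjacent sample pairs whose sign differs.
double ZeroCrossRate(const std::vector<std::int16_t>& frame) {
    if (frame.size() < 2) return 0.0;
    std::size_t crossings = 0;
    for (std::size_t i = 1; i < frame.size(); ++i) {
        if ((frame[i - 1] < 0) != (frame[i] < 0)) ++crossings;
    }
    return static_cast<double>(crossings) / static_cast<double>(frame.size() - 1);
}

// Crude decision: loud enough, and not dominated by rapid sign changes.
bool IsSpeechFrame(const std::vector<std::int16_t>& frame,
                   const FrameVadConfig& cfg = FrameVadConfig()) {
    return MeanEnergy(frame) >= cfg.minEnergy &&
           ZeroCrossRate(frame) <= cfg.maxZeroCrossRate;
}
```

A frame here would be something like 10 to 20 ms of samples. A rule this simple will misjudge quiet consonants and loud low-frequency noise, which is one reason to prefer a standardized detector.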

If you go this way, let me know if you run into any problems with the G.729 Annex B algorithm.

Cheers
Ranjeet
Fri, 2005-08-05 05:04
Joined: 2005-08-01
Forum posts: 44
Re: Determine if someone is speaking
Hi all,

I want to develop a similar application: record only when someone is speaking or there is fairly loud noise. What do you mean by G.729 Annex B (VAD / DTX and CNG)? How can I get it? Is it a plug-in or something?

Has anyone succeeded this way? Please guide me. Thanks.

Regards,

Irma
Wed, 2005-08-10 07:33
Joined: 2005-07-25
Forum posts: 31
Re: Determine if someone is speaking
Yes, this is the ITU standard that detects speech, which is why it is called Voice
Activity Detection. By plugging it in, you encode only the speech and not the silence. For encoding the silence
you use DTX, and for decoding the silence you use CNG.

Refer to the ITU site.

Ranjeet
Sat, 2005-09-03 10:34
Joined: 2005-01-24
Forum posts: 76
Re: Determine if someone is speaking
Hi

If you're recording voice from the microphone, you'd probably be interested in using AMR to compress the speech.

If you do use AMR, it already has the speech detection algorithms you require built in. If you turn on DTX (discontinuous transmission) when setting up the AMR encoder, the AMR codec will perform the kind of detection you describe.

It's all based around observed background silence and changes in energy levels. I'm no expert, but the upshot is that it will encode voice audio when someone is speaking, and produce little or no encoded data while it is detecting background silence. On the decoding side, the AMR codec will decode and play voice where voice data has been recorded, and in periods where silence was recorded and no data is available it will generate "comfort noise" to produce an uninterrupted audio stream that doesn't contain horrible artifacts where transmission starts and ends.
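
As a rough illustration of the "background silence plus energy changes" idea, here is a minimal sketch in plain standard C++. It is not the AMR detector and uses no Symbian or codec API; the class name `EnergyGate`, the ratio, the smoothing factor and the hangover length are all assumptions chosen for illustration. It treats the first frame as background, tracks a slowly moving noise floor, and reports speech when a frame is several times louder than that floor, staying open for a few frames afterwards so word endings are not clipped.

```cpp
#include <cstddef>
#include <cstdint>

class EnergyGate {
public:
    // ratio: how many times louder than the noise floor a frame must be.
    // hangoverFrames: quiet frames still reported as speech after a loud one.
    explicit EnergyGate(double ratio = 4.0, int hangoverFrames = 10)
        : ratio_(ratio), hangoverFrames_(hangoverFrames) {}

    // Feed one frame of 16-bit PCM; returns true while speech is assumed.
    bool Push(const std::int16_t* samples, std::size_t count) {
        double energy = 0.0;
        for (std::size_t i = 0; i < count; ++i) {
            energy += static_cast<double>(samples[i]) * samples[i];
        }
        if (count > 0) energy /= static_cast<double>(count);

        // Assumption: the very first frame is background noise.
        if (!haveFloor_) {
            noiseFloor_ = energy;
            haveFloor_ = true;
            return false;
        }

        // kMinEnergy keeps a zero noise floor from triggering on tiny signals.
        if (energy > ratio_ * noiseFloor_ + kMinEnergy) {
            hangover_ = hangoverFrames_;
            return true;
        }

        // Quiet frame: let the noise floor drift toward the current level.
        noiseFloor_ = 0.95 * noiseFloor_ + 0.05 * energy;
        if (hangover_ > 0) {
            --hangover_;
            return true;
        }
        return false;
    }

private:
    static constexpr double kMinEnergy = 100.0;  // hypothetical absolute minimum
    double ratio_;
    int hangoverFrames_;
    int hangover_ = 0;
    double noiseFloor_ = 0.0;
    bool haveFloor_ = false;
};
```

In a recorder you would call `Push` once per captured frame and write audio only while it returns true. How the PCM frames arrive from the recording API is left out here.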

Hope this helps.

Regards.

Andy.