Speech recognition program - need conceptual approach

Solo

New member
Joined
Apr 15, 2019
Messages
2
Location
Ashe county, NC
Programming Experience
10+
I currently have in test operation a program written in VB that uses SAPI for both speech output and speech recognition. The PC is connected to a hardware device, and the user occasionally sends it commands by speaking into a mic. The program reports the results of each operation by speaking to the user. All of that works just fine. A limited grammar of about 25 words is defined for the speech recognition engine. The issue is that the user in this situation is also talking continually to another person, and whenever he happens to say one of the grammar words in that conversation, the program reacts. What's needed is a keyword that wouldn't turn up in the user's normal speech, which he would say before giving a spoken command. It would function much like an Alexa device, where the user first says "Alexa" before saying a command. So the question is how to implement that function. (I don't need the code, just the approach.)

One approach might be to alter the grammar available to the speech recognition engine. On startup, the engine grammar would contain only the Alexa key word. When that word was recognized, the program would add to the grammar all the other words that could be recognized. But, it's not clear to me how then I could return the grammar to just the single key word after the detailed processing was finished. There are ways to add to the grammar, but I couldn't find a way to subtract from the grammar. I suppose that I could just stop the engine and then restart it with just the single word in the grammar.

Another approach might be to start two separate speech recognition engines, one with a one word grammar and one with the full grammar. The full grammar engine could be stopped and started by the code associated with the one word grammar. But, it's not clear whether you can have two speech recognition engines operating at one time, nor if the same mic input could be used for both.

Any suggestions on the best solution for this?
 

JohnH

VB.NET Forum Moderator
Staff member
Joined
Dec 17, 2005
Messages
15,266
Location
Norway
Programming Experience
10+
I have not worked with speech recognition myself, but I have read some articles about it from time to time, and a quick search just now turned up this one: Voice Recognition - Speech Recognition with .NET Desktop Applications
The example there is similar to what you ask for. It has start-stop commands in one grammar and the other functionality in another; multiple grammars can be loaded and listened to at the same time. In SpeechRecognized it checks for start-stop first and toggles a Boolean field (speechOn), then uses that field to decide whether to process any other recognized speech. Similarly, you could have a single start keyword that enables command recognition. After a command is recognized you would set the Boolean back to False, so that only the start word is processed again.
In short: one SpeechRecognitionEngine with multiple active grammars, where one of them is just the start command.
What should happen if the start word is detected but no command follows is up to you, but I would probably add a Date field as well and set speechOn to False if too much time passes between the start word and the command.
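The Boolean-plus-Date idea above can be sketched without any speech API at all, which makes the timing logic easy to check on its own. This is only a minimal sketch: the class name WakeWordGate, the wake word "computer", the sample command and the 5-second window are all made up for illustration, and the recognized text is assumed to arrive as a plain String (for example from e.Result.Text).

```vb
Imports System

' Hypothetical helper: remembers when the wake word was last heard and
' decides whether the next recognized phrase counts as a command.
Public Class WakeWordGate
    Private ReadOnly _wakeWord As String
    Private ReadOnly _timeout As TimeSpan
    Private _armedAt As Date? = Nothing

    Public Sub New(wakeWord As String, timeout As TimeSpan)
        _wakeWord = wakeWord
        _timeout = timeout
    End Sub

    ' Returns True only when 'text' should be executed as a command.
    ' The current time is passed in so the logic can be tested without waiting.
    Public Function ShouldExecute(text As String, currentTime As Date) As Boolean
        If String.Equals(text, _wakeWord, StringComparison.OrdinalIgnoreCase) Then
            _armedAt = currentTime    ' wake word heard: open the command window
            Return False
        End If

        Dim armed As Boolean = _armedAt.HasValue AndAlso (currentTime - _armedAt.Value) <= _timeout
        _armedAt = Nothing            ' one command per wake word; an expired window closes too
        Return armed
    End Function
End Class

Module WakeWordGateDemo
    Sub Main()
        Dim gate As New WakeWordGate("computer", TimeSpan.FromSeconds(5))
        Dim t0 As Date = Date.Now

        Console.WriteLine(gate.ShouldExecute("lights on", t0))                 ' gate not armed yet
        Console.WriteLine(gate.ShouldExecute("computer", t0.AddSeconds(1)))    ' arms the gate, not a command itself
        Console.WriteLine(gate.ShouldExecute("lights on", t0.AddSeconds(2)))   ' inside the window
        Console.WriteLine(gate.ShouldExecute("lights on", t0.AddSeconds(3)))   ' gate already closed again
    End Sub
End Module
```

In a SpeechRecognized handler, the call would presumably look like gate.ShouldExecute(e.Result.Text, Date.Now), with the hardware command sent only when it returns True. Closing the gate after every non-wake phrase is a design choice; keeping it open for several commands would only need the reset line moved.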
 

Solo

Thanks for the info. Conceptually, that's what I'm trying to accomplish but the mechanics of doing the grammar changes are apparently quite different between C and Visual Basic. I just am not able to decipher the C code and effectively translate it into Visual Basic. While some of the terms are the same, the structures appear to be quite different. So far, I've found that removing one of the grammars and then stopping/starting the speech engine doesn't prevent the engine from recognizing words that were in only the deleted grammar. I'll keep experimenting and see what I can figure out.
 

JohnH

The example is in C#. There are some semicolons and curly brackets, but it's not very different from VB. Look up a C# to VB converter online and you get VB sample code. For instance, using the first one I found (Code Converter C# to VB and VB to C# – Telerik), I get this:
VB.NET:
Imports System
Imports Microsoft.Speech.Recognition
Imports Microsoft.Speech.Synthesis
Imports System.Globalization

Namespace ConsoleSpeech
    Class ConsoleSpeechProgram
        Shared ss As SpeechSynthesizer = New SpeechSynthesizer()
        Shared sre As SpeechRecognitionEngine
        Shared done As Boolean = False
        Shared speechOn As Boolean = True

        Private Shared Sub Main(ByVal args As String())
            Try
                ss.SetOutputToDefaultAudioDevice()
                Console.WriteLine(vbLf & "(Speaking: I am awake)")
                ss.Speak("I am awake")
                Dim ci As CultureInfo = New CultureInfo("en-us")
                sre = New SpeechRecognitionEngine(ci)
                sre.SetInputToDefaultAudioDevice()
                ' VB attaches event handlers with AddHandler; the converter's "+=" form does not compile.
                AddHandler sre.SpeechRecognized, AddressOf sre_SpeechRecognized
                Dim ch_StartStopCommands As Choices = New Choices()
                ch_StartStopCommands.Add("speech on")
                ch_StartStopCommands.Add("speech off")
                ch_StartStopCommands.Add("klatu barada nikto")
                Dim gb_StartStop As GrammarBuilder = New GrammarBuilder()
                gb_StartStop.Append(ch_StartStopCommands)
                Dim g_StartStop As Grammar = New Grammar(gb_StartStop)
                Dim ch_Numbers As Choices = New Choices()
                ch_Numbers.Add("1")
                ch_Numbers.Add("2")
                ch_Numbers.Add("3")
                ch_Numbers.Add("4")
                Dim gb_WhatIsXplusY As GrammarBuilder = New GrammarBuilder()
                gb_WhatIsXplusY.Append("What is")
                gb_WhatIsXplusY.Append(ch_Numbers)
                gb_WhatIsXplusY.Append("plus")
                gb_WhatIsXplusY.Append(ch_Numbers)
                Dim g_WhatIsXplusY As Grammar = New Grammar(gb_WhatIsXplusY)
                sre.LoadGrammarAsync(g_StartStop)
                sre.LoadGrammarAsync(g_WhatIsXplusY)
                sre.RecognizeAsync(RecognizeMode.Multiple)

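                ' Keep the main thread alive until the exit phrase sets done; sleep so the loop does not spin the CPU.
                While done = False
                    System.Threading.Thread.Sleep(100)
                End While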

                Console.WriteLine(vbLf & "Hit <enter> to close shell" & vbLf)
                Console.ReadLine()
            Catch ex As Exception
                Console.WriteLine(ex.Message)
                Console.ReadLine()
            End Try
        End Sub

        Private Shared Sub sre_SpeechRecognized(ByVal sender As Object, ByVal e As SpeechRecognizedEventArgs)
            Dim txt As String = e.Result.Text
            Dim confidence As Single = e.Result.Confidence
            Console.WriteLine(vbLf & "Recognized: " & txt)
            If confidence < 0.60 Then Return

            If txt.IndexOf("speech on") >= 0 Then
                Console.WriteLine("Speech is now ON")
                speechOn = True
            End If

            If txt.IndexOf("speech off") >= 0 Then
                Console.WriteLine("Speech is now OFF")
                speechOn = False
            End If

            If speechOn = False Then Return

            If txt.IndexOf("klatu") >= 0 AndAlso txt.IndexOf("barada") >= 0 Then
                CType(sender, SpeechRecognitionEngine).RecognizeAsyncCancel()
                done = True
                Console.WriteLine("(Speaking: Farewell)")
                ss.Speak("Farewell")
            End If

            If txt.IndexOf("What") >= 0 AndAlso txt.IndexOf("plus") >= 0 Then
                Dim words As String() = txt.Split(" "c)
                Dim num1 As Integer = Integer.Parse(words(2))
                Dim num2 As Integer = Integer.Parse(words(4))
                Dim sum As Integer = num1 + num2
                Console.WriteLine("(Speaking: " & words(2) & " plus " & words(4) & " equals " & sum & ")")
                ss.SpeakAsync(words(2) & " plus " & words(4) & " equals " & sum)
            End If
        End Sub
    End Class
End Namespace
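
For the single-keyword case discussed earlier in the thread, an alternative to filtering in the handler is to leave both grammars loaded and toggle the Enabled property of the command grammar. The following is only a sketch under several assumptions: it uses System.Speech.Recognition (needs a reference to System.Speech, Windows only) rather than the Microsoft.Speech namespace in the sample above, the wake word "computer" and the two pump commands are made up, and it assumes that changing Enabled from inside the SpeechRecognized handler is acceptable for an in-process engine.

```vb
Imports System
Imports System.Globalization
Imports System.Speech.Recognition

Module WakeWordSketch
    ' Hypothetical grammars: one wake word, two sample commands.
    Private wakeGrammar As Grammar
    Private commandGrammar As Grammar

    Sub Main()
        Using sre As New SpeechRecognitionEngine(New CultureInfo("en-US"))
            sre.SetInputToDefaultAudioDevice()

            wakeGrammar = New Grammar(New GrammarBuilder("computer")) With {.Name = "wake"}
            commandGrammar = New Grammar(New GrammarBuilder(New Choices("start pump", "stop pump"))) With {.Name = "commands"}
            commandGrammar.Enabled = False   ' commands are ignored until the wake word is heard

            sre.LoadGrammar(wakeGrammar)
            sre.LoadGrammar(commandGrammar)
            AddHandler sre.SpeechRecognized, AddressOf OnSpeechRecognized
            sre.RecognizeAsync(RecognizeMode.Multiple)

            Console.WriteLine("Listening. Press <enter> to quit.")
            Console.ReadLine()
            sre.RecognizeAsyncCancel()
        End Using
    End Sub

    Private Sub OnSpeechRecognized(sender As Object, e As SpeechRecognizedEventArgs)
        If e.Result.Confidence < 0.6F Then Return

        If e.Result.Grammar Is wakeGrammar Then
            commandGrammar.Enabled = True    ' open the command grammar for the next phrase
            Console.WriteLine("Wake word heard, waiting for a command")
        ElseIf e.Result.Grammar Is commandGrammar Then
            Console.WriteLine("Command: " & e.Result.Text)
            commandGrammar.Enabled = False   ' back to wake-word-only listening
        End If
    End Sub
End Module
```

This version has no timeout, so after the wake word the command grammar stays enabled until a command is recognized; a timer or a timestamp check would be needed to close it again after a few seconds of silence.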