SocketException on OriginateAction

Mar 27, 2014 at 4:04 PM
I've been struggling with an ongoing reliability problem connecting to AMI, first with Asterisk.Net and now with AsterNET (beta3-r2). Our Asterisk (v1.4) server supports a call centre with about 150 people and fairly high volumes. I've created a program to listen to events (originate & status) which will happily run without exception. Shortly after I add sync or async originate actions into the mix, the SocketConnection dies and won't auto-reconnect. I have to create a new instance of the ManagerConnection to get things running again. The connection remains active for between 2 and 5 minutes, listening to events and originating calls, before it dies.

I've looked at other wrapper projects and seen some interesting things. A Java wrapper uses a pool of ManagerConnections, for example, something I've tried to do but without success. I've seen a Perl wrapper which suggested that if you don't fire/monitor all call events then your AMI connection can die (probably a Perl-specific buffer problem). I've also seen other projects mention that if a user hangs up without answering the originated call, a channel can be left in an inconsistent state (FastAGI). Another proxy played with the receive and send buffer sizes to work around some problems, but that doesn't seem to help me.

Ping doesn't help. The ManagerConnection is unusable after the SocketException.

I've tried sync/async both in the originate action and in the way my program handles the incoming call requests.

I'm hoping someone has some fresh ideas. Should I try creating a new ManagerConnection for each originate action and have a separate connection for listening? At the moment (though I have experimented with other options) my ManagerConnection is wrapped in a singleton. I would still like to understand why the connection is dropping in the first place. Of course, I can't rule out that I've just done something incredibly dumb in my code.

Thanks
Mar 27, 2014 at 4:40 PM
I've been struggling with the same problem for years. Same as you, I also have a 150-agent call centre and have run into all the things you describe.

I now use a Windows service for the events, with a timer that checks for connection drops and initiates a login as required. For the originates, I use a web service that creates a new ManagerConnection for each async originate action, which I log off and dispose straight after. It doesn't cause any performance issues that I can make out and is very light on server resources.
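Roughly what the web-service side does (a simplified sketch: the host, port, credentials and channel/context values are placeholders, exact AsterNET property names may differ slightly between versions, and error handling is omitted):

using AsterNET.Manager;
using AsterNET.Manager.Action;

public class OriginateService
{
    public void Originate(string channel, string exten)
    {
        // A brand new connection per originate request.
        var connection = new ManagerConnection("asterisk-host", 5038, "amiuser", "amisecret");
        try
        {
            connection.Login();

            var action = new OriginateAction
            {
                Channel = channel,          // e.g. "SIP/1001"
                Context = "from-internal",  // placeholder dialplan context
                Exten = exten,
                Priority = "1",
                Async = true,               // async originate, as in the workaround
                Timeout = 30000
            };
            connection.SendAction(action);
        }
        finally
        {
            // Throw the connection away immediately afterwards.
            connection.Logoff();
        }
    }
}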

Works well, but it's just a workaround.
Mar 28, 2014 at 8:40 AM
I'm glad it's not just me.

Have you tried a proxy like AstManProxy hosted directly on the Asterisk server? I was wondering if going that route would be better. I suppose, at the end of the day, it also connects to the AMI and would probably have the same problems. In the meantime I think I'll implement your workarounds; they seem like the most practical solution.
Mar 28, 2014 at 9:24 AM
I looked at astmanproxy, but the last release was 6 years ago. I even got as far as installing it on a VM and connecting it to an Asterisk box, but I found that a few events were not firing. I gave up after that (no free time to play with it).
Mar 28, 2014 at 12:02 PM
Hi Guys,

I've seen SocketExceptions myself with AsterNET when using AMI, but have never had the time to track down the cause. Could you give me some more detailed information, steps to reproduce, etc., that might help me track this down?

It sounds like the exception isn't being handled correctly. Any additional information you can provide would be useful.
Mar 31, 2014 at 10:36 AM
I haven't been able to isolate the cause of the problem. Let me do some logging with the latest version of AsterNET and see if anything turns up over the course of this week. I've tried loads of options across different versions, so I don't want to confuse the issue with my existing findings.

I've seen SocketExceptions caused by the send/receive buffers being too small in another proxy program. It might be related - I'll adjust those values as part of my experiments.
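For reference, this is where those knobs live on a raw .NET socket (not AsterNET-specific; the 32 KB values below are arbitrary examples, not recommendations):

using System.Net.Sockets;

class BufferTuningExample
{
    static void Main()
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        socket.ReceiveBufferSize = 32 * 1024;   // default is 8192 bytes
        socket.SendBufferSize = 32 * 1024;      // default is 8192 bytes
        socket.Close();
    }
}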
Mar 31, 2014 at 5:25 PM
So far...

ManagerReader.cs
while (true)
{
    try
    {
        while (!die)
        {
            ...
            Thread.Sleep(50);
            if (mrConnector.TraceCallerThread
                && mrConnector.CallerThread != null
                && mrConnector.CallerThread.ThreadState == ThreadState.Stopped)
            {
                die = true;
                break;
            }
            ...
        }
        if (mrSocket != null)
        {
            mrSocket.Close();
        }
        break;
        ...
ThreadState becomes Stopped and the while loop exits, causing the socket to be closed. TraceCallerThread defaults to true.
Apr 2, 2014 at 9:03 AM
I've not tried this myself yet, but you could turn on debugging to show any debug output from AsterNET.

Add the following to your app.config:

<system.diagnostics>
  <trace autoflush="false" indentsize="4">
    <listeners>
      <add name="EventLogListener" type="System.Diagnostics.XmlWriterTraceListener"
           initializeData="c:\asternet.svclog" />
    </listeners>
  </trace>
</system.diagnostics>

This will create a log file at the path above; of course, it might be very large depending on how busy your server is.
Apr 2, 2014 at 10:16 AM
Thanks. I've got logging running at the moment (massive file).

In ManagerConnection.cs I changed this line, and reliability has improved dramatically this morning:
private bool traceCallerThread = true;
to
private bool traceCallerThread = false;
In the ChangeLog there is a note about this feature: "09.18.2008 Fix error with Close Windows.Forms without Logoff() - to disable this feature set TraceCallerThread property to false."

The calling thread gets monitored to see whether it has stopped, and when it has, the "while (!die)" loop is terminated, causing the socket to be closed.

Under my Windows service the calling thread sometimes has a ThreadState of Stopped. I'm trying to work out now why the calling thread is stopped, since the service continues to run without problems.
Apr 2, 2014 at 2:08 PM
OK, cool! Please keep me posted; it would be nice to find out the cause of this problem.

Appreciate your help and time.
Apr 2, 2014 at 3:11 PM
Sorry I cannot join in at the moment, but I've got deadlines with a different tech.
Thank you both for taking the time to follow this up, and to skrusty for reviving this really useful and underestimated DLL.
May 13, 2014 at 9:33 AM
I think I finally understand what is happening. I've got a Windows service which starts up and creates a connection to MSMQ. The BeginReceive() method on the queue is asynchronous and pulls a thread from the ThreadPool. I've got the ManagerConnection wrapped in a singleton, and it's only initialised on receipt of the first message from the queue. With the ManagerConnection traceCallerThread flag set, it monitors the thread that initialised it, but after a while the ThreadPool recycles that thread, causing the ManagerConnection to set 'die = true'. That eventually leads to the SocketConnection being closed. Setting traceCallerThread = false, or initialising the ManagerConnection on the main Windows service thread, solves the problem.
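A minimal sketch of the second fix, assuming a standard ServiceBase/MSMQ setup (the queue path, host, port and credentials are placeholders): the ManagerConnection is created on the service's main thread in OnStart(), so the thread it traces is never a recycled ThreadPool thread.

using System.Messaging;
using System.ServiceProcess;
using AsterNET.Manager;

public class AmiService : ServiceBase
{
    private MessageQueue queue;
    private ManagerConnection manager;   // the "singleton" connection, simplified

    protected override void OnStart(string[] args)
    {
        // Created and logged in on the main service thread, which stays alive
        // for the lifetime of the service.
        manager = new ManagerConnection("asterisk-host", 5038, "amiuser", "amisecret");
        manager.Login();

        queue = new MessageQueue(@".\private$\originate-requests");
        queue.ReceiveCompleted += OnMessageReceived;
        queue.BeginReceive();
    }

    private void OnMessageReceived(object sender, ReceiveCompletedEventArgs e)
    {
        // This callback runs on a ThreadPool thread that may be recycled at any
        // time, so it must not be the thread that created the ManagerConnection.
        queue.EndReceive(e.AsyncResult);
        // ... build and send the OriginateAction using the shared connection ...
        queue.BeginReceive();
    }

    protected override void OnStop()
    {
        manager.Logoff();
        queue.Close();
    }
}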
May 13, 2014 at 10:52 AM
Great,

Thanks for the run-down on the issue. I am wondering if the default should be false for the monitor flag.
May 14, 2014 at 3:46 PM
I've been wondering about the implications of doing that. It's strange that this property is internal while everything else is public.
        internal bool TraceCallerThread
        {
            get { return traceCallerThread; }
            set { traceCallerThread = value; }
        }
I tried to reproduce the exception that TraceCallerThread was put in for, but I'm not getting the same behaviour. What I am seeing, though, is that when the WinForm is closed without calling Logoff(), the mrReaderThread continues to run. So the application is no longer visible on the taskbar, but I can see the process running in Task Manager.

The TraceCallerThread logic should probably be removed completely and the way the mrReaderThread gets set up looked at instead. Marking the thread as a background thread causes it to close automatically when the main program thread exits:
A managed thread is either a background thread or a foreground thread. Background threads are identical to foreground threads with one exception: a background thread does not keep the managed execution environment running. Once all foreground threads have been stopped in a managed process (where the .exe file is a managed assembly), the system stops all background threads and shuts down.
Setting IsBackground = true in ThreadClass.cs has the desired result of ending the mrReaderThread when the WinForm is closed. The only problem is that it isn't a clean shutdown, so you could then get an exception. That behaviour might be more expected, though, and would prompt you to do a clean shutdown by calling Logoff().
        public ThreadClass(ThreadStart start, string name)
        {
            thread = new Thread(start);
            // Background threads don't keep the process alive once all foreground threads have stopped.
            thread.IsBackground = true;
            this.Name = name;
        }
May 15, 2014 at 5:06 PM
You could always fire up a fork and propose a pull request? :) All contributions are, as ever, very welcome!
May 15, 2014 at 9:17 PM
Sure - sounds good.