Computer and HAL crash, hardware failure ensues

08 Jul 2014 03:08 #48508 by Dao
Hi,

I am running a three-degree-of-freedom motion table with EMC2 on Ubuntu 10.04.4 LTS. The interface was built using Motenc driver control boards. The issue I am having is that the HAL program and the computer crash occasionally, and then the motion table moves until it approaches a singularity and gets stuck. I am wondering if anybody could suggest where and how I should begin troubleshooting this issue, perhaps a way to investigate a crash report if the crash event was logged somewhere.

I have attached all the code and components required to run it. One more important thing to mention is that I have not experienced any crashes when running the program alone, at least not yet. So far the crashes only occur occasionally when the hardware is running simultaneously.

Thanks,
Dao.


File Attachments:
- MotionBase2New.hal (4 KB)
- KinematicsV2.comp (2 KB)
- Controller.comp (0 KB)
- deg2rad.comp (0 KB)
- BaseConsoleNew.xml (2 KB)

09 Jul 2014 05:06 #48548 by andypugh

The issue that I am having is that the HAL program and the computer crash occasionally, then the motion table moves until it approaches singularity and gets stuck.


That's not good.

Do you have a charge-pump / watchdog circuit? That's something that kills the power if a pin stops toggling (so no matter whether it stops high or low, it still trips the power).

The logic is that the software has to be running properly, not crashed, in order to keep toggling the pin.
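To make the idea concrete, here is a minimal Python sketch of what the external circuit does. It is a toy model, not real hardware or LinuxCNC code: the class name, the 1 ms period and the 5 ms timeout are all assumptions picked for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChargePumpMonitor:
    """Toy model of the external circuit that watches a toggling pin."""
    timeout_us: int            # assumed: time without an edge before tripping
    last_level: bool = False
    last_edge_us: int = 0
    tripped: bool = False

    def sample(self, level: bool, now_us: int) -> bool:
        """Feed the pin level at time now_us; return True while drives stay enabled."""
        if level != self.last_level:
            # An edge was seen, so the software must still be alive.
            self.last_level = level
            self.last_edge_us = now_us
        elif now_us - self.last_edge_us > self.timeout_us:
            # No edge for too long: latch the trip, whatever the pin level is.
            self.tripped = True
        return not self.tripped


def simulate(n_periods, crash_at=None, period_us=1000, timeout_us=5000):
    """Toggle the pin once per period until crash_at, then freeze it."""
    monitor = ChargePumpMonitor(timeout_us=timeout_us)
    level = False
    enabled = True
    for i in range(n_periods):
        if crash_at is None or i < crash_at:
            level = not level  # healthy software toggles every period
        enabled = monitor.sample(level, now_us=i * period_us)
    return enabled
```

With these assumed numbers, `simulate(100)` keeps the drives enabled, while `simulate(100, crash_at=10)` and `simulate(100, crash_at=11)` both end with the drives disabled, even though the pin freezes low in one case and high in the other.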

09 Jul 2014 21:58 #48584 by Dao
Hi Andy,

Thanks for your reply! No, I do not have a charge-pump / watchdog circuit. You can imagine how troublesome it is when the motion table gets stuck, so I always stand by the power source, ready to kill it whenever the program crashes, to prevent any linkage damage.

I studied the example here: (wiki.linuxcnc.org/cgi-bin/wiki.pl?About_Charge_Pumps), where there's only one output pin and no dependent signals connected to it. But I have three physical outputs: (1) motenc.0.dac-00-value, (2) motenc.0.dac-01-value, (3) motenc.0.dac-02-value, and they have dependent signals feeding into them, such as limits, as can be seen in MotionBase2New.hal. Could you guide me on how I can connect these pins to "charge-pump.out", or tell me whether that is even possible?

All that being said, the software still needs to be running properly, as you mentioned. Is there a way I can verify this?

Thanks,
Dao.

09 Jul 2014 22:50 #48586 by andypugh

I studied the example here: (wiki.linuxcnc.org/cgi-bin/wiki.pl?About_Charge_Pumps) where there's only one output pin and no dependent signals connected to it. But I have three physical outputs: (1)motenc.0.dac-00-value, (2)motenc.0.dac-01-value, (3)motenc.0.dac-02-value.


You may be misunderstanding the charge pump idea. You connect the charge pump HAL component to a physical IO pin. That physical oscillating voltage is connected to a mechanical interlock which disables the drives if the charge_pump HAL component stops operating.

I am actually surprised that the Motenc hardware doesn't have such a feature. All the Mesa cards go into shut-down if they don't see a write from HAL for a certain length of time.

Of course, it is possible that this is exactly your problem: the Motenc card shuts down, and that sends the drives to a bad position.

12 Jul 2014 09:48 #48681 by jmelson

I studied the example here: (wiki.linuxcnc.org/cgi-bin/wiki.pl?About_Charge_Pumps) where there's only one output pin and no dependent signals connected to it. But I have three physical outputs: (1)motenc.0.dac-00-value, (2)motenc.0.dac-01-value, (3)motenc.0.dac-02-value.


Well, you SHOULD get the charge pump working so that it shuts down your drives when the program stops.

But a totally separate issue is WHY the computer is crashing. Possibly you have the real-time threads running too fast, and you need to back off a bit. If the real-time thread ever overruns to the point that it is still running when the next period comes up, the system can crash. That is rare on setups using hardware-assisted motion control. If it is not that, then you should run the memtest86 program to check for memory or cache problems, and then reseat the CPU and memory sticks to eliminate flaky connections.
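As a rough illustration of the overrun condition, the Python sketch below flags any cycle whose execution time reaches the thread period and reports how much headroom the worst cycle leaves. The function names are hypothetical, and the timings in the usage note are made-up numbers, not measurements from this machine.

```python
def find_overruns(exec_times_ns, period_ns):
    """Return the indices of cycles that were still running when the next period began."""
    return [i for i, t in enumerate(exec_times_ns) if t >= period_ns]


def headroom(exec_times_ns, period_ns):
    """Fraction of the period left unused in the worst observed cycle."""
    return 1.0 - max(exec_times_ns) / period_ns
```

For example, with an assumed 1 ms period (1_000_000 ns), `find_overruns([200_000, 950_000, 1_100_000], 1_000_000)` flags only the third cycle. A headroom value close to zero would suggest lengthening the period.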

I've run various versions of EMC(1), EMC2 and LinuxCNC, and have not had a crash since 1998! The package is EXTREMELY reliable, but you have to select good PC hardware to run it on. (I run a lot of desktop Linux systems, and have had some run over 400 days before a power failure got to them.)

Jon

15 Jul 2014 04:52 #48781 by Dao
Thanks for your reply, Jon! I ran the memtest86 program and got the error "too small lower memory (0x99100 > 0x96000)". Does this imply that I have a memory issue? I tried reseating the CPU and memory sticks, but that didn't help. How should I troubleshoot this?

Thanks,
Dao.

15 Jul 2014 05:24 #48784 by Dao
Andy:

Sorry for the late response. Thanks for the clarifications. I guess I am still slightly unclear as to what counts as a physical IO pin between HAL and Motenc.

By consulting the Motenc driver info in the LinuxCNC documentation, I can see that "motenc.<board>.out-<channel>" are the output pins, but they default to FALSE when I display the pins in the terminal. The only pins that are TRUE are "motenc.<board>.in-<channel>-not". Does that mean I should set up the charge pump on these pins, corresponding to whichever "motenc.0.dac-<channel>" channel it may be? This brings me back to the same question again: does that also imply that I still have three physical IO pins? And how do I configure multiple charge pumps in that case? (You must be shaking your head while reading my questions. I know I'm rather slow, so please bear with me if you can...)

Thanks,
Dao.

15 Jul 2014 05:48 #48785 by andypugh

Sorry for the late response. Thanks for the clarifications. I guess I am still slightly unclear as to what counts as a physical IO pin between HAL and Motenc.


I think you might have it backwards. The physical pins are the pins that come out of the Motenc card. I was suggesting connecting a charge-pump from a Motenc output pin to the hardware.

However, according to the manpage:
www.linuxcnc.org/docs/html/drivers/motenc.html
The Motenc card has a built-in watchdog. How do you have that configured?
Not that the manpage makes it especially clear what the effect of the watchdog is, but I would expect it to set the DAC outputs to zero. Would that be safe in your case?
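For what it is worth, the behaviour I would expect can be sketched in Python as below. This is only a model of my guess, not the documented Motenc behaviour: the class name, the 10 ms timeout and the "force outputs to zero" reaction are all assumptions, and a real watchdog may latch until it is reset rather than recover on the next write.

```python
class WriteWatchdog:
    """Toy model: DAC outputs fall back to zero when writes stop arriving."""

    def __init__(self, timeout_us, channels=3):
        self.timeout_us = timeout_us      # assumed timeout, not a Motenc value
        self.last_write_us = 0
        self.dac = [0.0] * channels

    def write(self, values, now_us):
        """Called by the (simulated) realtime thread on every servo cycle."""
        self.dac = list(values)
        self.last_write_us = now_us

    def outputs(self, now_us):
        """What the drives would see at time now_us."""
        if now_us - self.last_write_us > self.timeout_us:
            # No write for too long: assume the card forces all outputs to zero.
            return [0.0] * len(self.dac)
        return list(self.dac)
```

In this model, a write at 1 ms followed by a read at 5 ms returns the commanded values, while a read at 20 ms with no further writes returns all zeros.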

15 Jul 2014 06:09 #48786 by Dao
I wasn't aware of the watchdog, so I didn't have it configured. I have just added it to my program by simply enabling it and setting watchdog reset to TRUE. I guess I'll have to run some tests to see if it crashes. Zero DAC output is most definitely safe in my case; it should just shut everything down whenever a problem occurs.

Another detail that could potentially contribute to the problem is the GUI that I created in the .xml file, because the program almost always crashes while I am changing parameters in real time (I can't believe I left this detail out...). A colleague of mine suggested using a stop/kill switch whenever I want to change parameters.

Thanks,
Dao.

15 Jul 2014 06:33 #48787 by andypugh

I wasn't aware of the usage of the watchdog so I didn't have that configured. I just added it to my program though by simply enabling it and setting watchdog reset to TRUE. I guess I'll have to run some tests to see if it crashes. Having zero DAC output is most definitely safe in my case, it should just shut everything down whenever a problem occurs.


The way to test it would be to deliberately stop realtime and see what happens.

With LinuxCNC up and running, open a terminal window and type
    halcmd -kf
You should now be in a command-line environment where all the commands that are familiar from the HAL file will work, and also a few others.
Try
    show thread
and you should see the current threads and their execution times. Try it a few times and you should see a different number each time, which shows that the threads are running.
Now you can kill realtime with a simple command:
    stop
When that happens the watchdog should trip. I am not 100% sure how to tell that it tripped, though.
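If you would rather script the "try it a few times" check, something like the following Python sketch might do. It assumes a Python 3 interpreter is available on the machine and that `halcmd show thread` prints the same information from a normal shell. It deliberately compares the raw text of successive outputs instead of parsing it, since I am not going to guess at the exact column layout. The function names are made up for this example.

```python
import subprocess
import time


def halcmd_show_thread():
    """Return the raw text printed by `halcmd show thread`."""
    result = subprocess.run(
        ["halcmd", "show", "thread"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def threads_appear_running(sample=halcmd_show_thread, samples=5, delay=0.5):
    """Return True if any two consecutive samples differ.

    Changing output suggests the execution times are still being updated,
    i.e. the realtime threads are running. Identical output is only a hint
    that they have stopped, not proof.
    """
    previous = sample()
    for _ in range(samples - 1):
        time.sleep(delay)
        current = sample()
        if current != previous:
            return True
        previous = current
    return False
```

Calling `threads_appear_running()` before and after `stop` should, if the above holds, give True and then False. The `sample` argument is there so the logic can be tried without LinuxCNC by passing in a function that returns canned text.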
