EtherCAT communication issues with miniPCs

10 Oct 2023 20:28 #282701 by JiiPee
Hello all!

I have not been using LinuxCNC for very long, but I have already received a lot of great help getting up to speed and tackling problems, so I decided to share back some solutions to issues that might plague others as well. I did not find a similar case with the search, but then again I might have missed it.

So I have been working on a problem that I encountered with my small CNC router running on a Minisforum GK41 miniPC. The system consists of 4 RTelligent ECT86 EtherCAT drives and a Beckhoff EK1100 with a couple of IO modules. I have a dedicated NIC for the EtherCAT fieldbus to avoid any traffic interference on the communication bus. The GK41 is a great little device in this regard, as it has dual NICs out of the box and was quite cheap when I got it last spring (120€).

My issue has been that the whole EtherCAT bus stopped communicating intermittently, which in turn led to numerous errors: limit switch signals got tripped, and my HAL configuration initiates an estop as soon as any of the EtherCAT devices loses communication with the master.

Whenever EtherCAT crashed, I would receive the following in the machine log:

Warning: Spoiler!


It took me a while to figure out that the problem was neither electrical noise nor limit switches seeing ghosts.

Bringing up the dmesg log immediately revealed that the issue was the NIC dropping the connection for a few seconds and then bringing the port back up again.
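For anyone wanting to check their own machine for the same symptom, here is a rough sketch of how the link up/down events could be filtered out of the kernel log. The function names are my own, and the grep pattern is an assumption: the exact wording of the link messages differs between drivers, so adjust it to whatever your dmesg actually prints.

```shell
#!/bin/sh
# Sketch only: filters kernel log text read from stdin, so it works with
# dmesg, journalctl -k, or a saved log file.
# ASSUMPTION: the driver logs link changes with wording like
# "Link is Down" / "link up"; other drivers may phrase it differently.

# Print only the lines that report the Ethernet link going up or down.
link_events() {
    grep -iE 'link( is)? (up|down)' || true
}

# Print how many link-down events appear in the log text.
count_link_drops() {
    grep -ciE 'link( is)? down' || true
}
```

Typical use would be `dmesg -T | link_events` to see the timestamps of each drop, or `dmesg | count_link_drops` after a session to see whether the count still grows.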
 

The solution for this kind of issue seems to be to check that you are indeed running the dkms driver for Realtek RTL8168 or similar series Ethernet interfaces. Great instructions for installing it are already in RodW's awesome LinuxCNC installation instructions here:

linuxcnc.org/docs/2.9/html/getting-start...etting-linuxcnc.html

To check which driver you are currently running, run the command below and look for the "Kernel driver in use" line under the Ethernet controller:
lspci -v
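As an alternative to reading through the lspci output, the driver bound to one specific interface can be read from sysfs. This is just a minimal sketch: the function name and the interface name `enp2s0` in the usage note are placeholders of mine, and the second argument exists only so the function can be pointed at a different sysfs root for testing.

```shell
#!/bin/sh
# Sketch only: print the kernel driver bound to a network interface,
# for example r8168 (dkms driver) or r8169 (in-kernel driver).
# $1: interface name (hypothetical example: enp2s0)
# $2: sysfs root, defaults to /sys (overridable for testing)
nic_driver() {
    iface=$1
    sysroot=${2:-/sys}
    link="$sysroot/class/net/$iface/device/driver"
    # The driver entry is a symlink whose target ends in the driver name.
    if [ ! -L "$link" ]; then
        echo "no driver link found for interface '$iface'" >&2
        return 1
    fi
    basename "$(readlink "$link")"
}
```

Running `nic_driver enp2s0` (with your own interface name from `ip link`) should print the driver name; if it prints r8169 rather than r8168, the dkms driver is not the one in use.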

This alone did not help, though. I had to add these parameters to the kernel parameter list:
r8168.aspm=0 r8168.eee_enable=0 pcie_aspm=off loglevel=3

One way to add these to the kernel parameters is to use the grub-customizer tool.
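If you prefer not to use a GUI tool, the usual manual route on Debian-based systems is to append the parameters to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and then run update-grub. Either way, it is worth confirming after the reboot that the parameters really reached the kernel. Below is a small sketch of such a check; the function name is my own, and loglevel=3 is left out of the list on purpose since it only changes console verbosity.

```shell
#!/bin/sh
# Sketch only: print each expected kernel parameter that is missing
# from the kernel command line. Prints nothing if all are present.
# $1: file holding the command line, defaults to /proc/cmdline
missing_params() {
    cmdline_file=${1:-/proc/cmdline}
    # Pad with spaces so only whole space-separated words match.
    cmdline=" $(cat "$cmdline_file") "
    for want in r8168.aspm=0 r8168.eee_enable=0 pcie_aspm=off; do
        case "$cmdline" in
            *" $want "*) ;;          # parameter present, nothing to report
            *) echo "$want" ;;       # parameter missing
        esac
    done
}
```

Calling `missing_params` with no arguments checks the running kernel; empty output means all three parameters are active.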

I did not test them individually to see which one helped, but I suspect that the "Energy Efficient Ethernet" mode is the likely culprit.

Anyway, I used to lose the EtherCAT bus a couple of times an hour, and now the system has been running without a hitch for a few hours straight. Time will tell whether this helped or whether it is just plain luck.
 