Can the OPI5 be Configured to Run LCNC?

21 Jun 2023 00:28 - 21 Jun 2023 00:30 #273974 by royka
Perhaps moving the irq of the NIC to another cpu will help.

cat /proc/interrupts
check which line belongs to the NIC, then as root (105 here stands for the NIC's IRQ number):
echo <CPU mask in hex> > /proc/irq/105/smp_affinity
                Binary     Hex
CPU 0       0001        1
CPU 1       0010        2
CPU 2       0100        4
CPU 3       1000        8
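
The steps above can be sketched in the shell. The interface name and the IRQ number (105, following the example above) vary per board, so treat them as placeholders:

```shell
# List per-CPU interrupt counts; the NIC shows up under its interface
# name (e.g. eth0 or end1):
grep -iE 'eth|end' /proc/interrupts || true

# Each CPU n contributes bit (1 << n) to the affinity mask.
# For CPUs 1, 2 and 3:
mask=0
for cpu in 1 2 3; do
    mask=$(( mask | (1 << cpu) ))
done
printf '%x\n' "$mask"    # prints "e" (binary 1110)

# As root, write the mask (IRQ 105 is a placeholder):
# echo e > /proc/irq/105/smp_affinity
```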

22 Jun 2023 01:00 #274017 by echristley
end1 was using interrupts 86 and 87. Both were set to ff. I changed both to 1, and hit the error very quickly.

22 Jun 2023 01:30 - 22 Jun 2023 01:33 #274021 by royka
You could try moving them to CPUs 1, 2 and 3, which would be "e" (binary 1110).
But there's a big chance it won't work then. I'll see if I can find something on Friday.

Have you already tried to run the gcode on a simulation machine?

22 Jun 2023 03:00 #274025 by rodw
If the limiting factor is network latency, getting amazing scores with latency-histogram is not helpful.
Without a base thread for software stepping, there is basically only one real-time thread running on just one of the isolated cores, so having 3 or 4 cores isolated probably does not help at all.

Best practice according to the RT kernel guys would be to isolate just one core and use CPU affinity to force the NIC onto the same core. This seems to have been confirmed by some testing by PCW. There are command-line tools to identify which interrupts are in use and the cores they are running on. Once you can identify the NIC interrupt, you can change its CPU affinity to be on the servo thread core...
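
A hedged sketch of those steps, assuming the servo thread runs on an isolated core 3 and the NIC uses IRQ 86 (both numbers are examples, not values from any particular board):

```shell
# Per-CPU interrupt counts show which core each IRQ is firing on:
head /proc/interrupts

# Affinity mask for core 3 alone is 1 << 3 = 0x8:
printf '%x\n' $(( 1 << 3 ))    # prints "8"

# As root, pin the NIC IRQ onto the servo-thread core:
# echo 8 > /proc/irq/86/smp_affinity
```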

22 Jun 2023 09:54 #274033 by royka
"getting amazing scores with latency-histogram is not helpful"
Do any of my comments make you think it's about getting amazing scores with the latency-histogram?
Most IRQs are on core 0, so moving it from 0-7 to 1-3 ....
Wouldn't moving it to an isolated core that's dedicated to rtapi cause it to get interrupted more often?

Although I don't really believe this is the cause, perhaps giving networking a higher priority (rt) would be a better option?

But it could be various other things of course, which I'll look at tomorrow.

22 Jun 2023 10:21 #274034 by rodw
No, but some pretty amazing latencies have been shared in this thread. That, however, is no longer the limiting factor.
Note comment by PCW here.
forum.linuxcnc.org/38-general-linuxcnc-q...read?start=50#272641

And also use of a single core here a bit earlier in that thread.
forum.linuxcnc.org/38-general-linuxcnc-q...read?start=40#271748

When you think about it, the NIC is controlled by the hm2_eth driver, which is a real-time component on the rtapi thread.
So in theory (right or wrong), it keeps the RT comms all together, so there is no inter-core communication happening.

Peter's posts above were written after the feedback we got from the RT kernel guys, as he joined the email discussion.

I might add that in the kernel traces suggested by the RT kernel guys, the rtapi thread slept for about 800 usec, so all of that time could be used by the NIC without any ill effect. (That was on the Bookworm 6.1 kernel, which has much higher performance than the 5.1 kernel I believe you are using.) This is a good reason to rebuild your kernel with Debian Bookworm.

22 Jun 2023 17:57 - 22 Jun 2023 18:30 #274064 by echristley
I've been running the gcode on an actual machine, but doubling the servo thread time period was enough to account for the network latency and eliminate the failures.

I have isolcpus set to 5,6,7. 
'end1', the ethernet, is on interrupts 86 and 87 (not sure why two interrupts).
What I gather here is that I should:
echo "e0"  > /proc/irq/86/smp_affinity
echo "e0"  > /proc/irq/87/smp_affinity

I'll run the test in a couple of hours if that is correct.

23 Jun 2023 01:33 #274105 by echristley
I've got wonderful news.

IT WORKS.

After moving the mill's base, I changed the smp_affinity. My isolcpus parameter is set to 2,4,5,6 and the end1 ethernet adapter is using IRQs 86 and 87. For some reason it didn't let me do it with sudo; I had to sudo su into root, and then:

echo "32">/proc/irq/86/smp_affinity
echo "32">/proc/irq/87/smp_affinity

Ran the cds.ngc program in the examples folder, and got about halfway through the top before it errored out. Disappointing, but further than it had been getting without the modifications.
Then, I saw that 32 was not the correct hex for the bitmap from 2,4,5,6. I should have used "3c" (0011 1100). Go easy on me. I had just moved the heavy part of a 3500lb mill in the NC summer humidity.

Back to root to make the change,

echo "3c">/proc/irq/86/smp_affinity
echo "3c">/proc/irq/87/smp_affinity

then into PncConf to make sure I had the servo period set to 1000000. All good, then back to LinuxCNC.
It ran through cds.ngc with no problem at 32 ipm, so I ran it again. Halfway through, I got bored, so I loaded up a couple more at random. The snowflake one had the motors shaking the wireframe shelf I have them sitting on. The round drill-pattern one is cool to watch.

I'm going to do more testing (and of course some tuning), but it looks like setting the affinity is the answer.

23 Jun 2023 11:28 #274125 by royka
That's great news indeed!

sudo echo indeed doesn't work directly, because the redirection is done by your unprivileged shell rather than by sudo. Instead you can type:
echo 3c | sudo tee /proc/irq/86/smp_affinity

"3c" (0011 1100) = cpu 2,3,4,5

Setting the IRQ affinity this way is temporary, so after a reboot you'll have to type it again. You could add the commands to /etc/rc.local before the last line "exit 0", then check with "cat /proc/irq/86/smp_affinity" whether it worked.
In rc.local you don't have to use sudo tee, since rc.local runs as root.
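
A minimal /etc/rc.local along those lines might look like this (the IRQ numbers and mask are taken from the posts above; adjust them to your board):

```shell
#!/bin/sh -e
# /etc/rc.local runs as root at boot, so plain redirection works here.
echo 3c > /proc/irq/86/smp_affinity
echo 3c > /proc/irq/87/smp_affinity
exit 0
```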

I'm not sure if isolcpus=2,4,5,6 is optimal, but at least it works. I've just done a fresh install on my Opi5 and will do some tests too.

23 Jun 2023 14:04 #274145 by royka
After loading a 42 MB gcode file it immediately threw an hmread error. After moving the IRQ affinity of eth0 (only the IRQ line that actually shows interrupts) to core 5 (hex 20), it runs fine.
isolcpus=5,7

TBH I don't know why isolcpus=5,7 runs better than isolcpus=7, but somehow it seems to. My first thought was that they might share the same timer, but after looking at the dtb this isn't the case. Perhaps the timers come from the same mcu?
