Foundational questions about my setup

01 May 2024 17:51 - 01 May 2024 18:06 #299484 by hachi
Apologies if I forget some detail... I tried to include everything but the first copy of this post got lost and I'm writing it again.

I have a tabletop router made from a 6090 frame that I was running 2.9.0-pre1 on last year. I kinda fell off the earth for a while, and coming back to it now I'd like to freshen the machine, customize the interface a bit, and make sure my original conclusions were actually correct. That's where this post comes in.

My setup uses a Mesa ethernet FPGA card, so I have no base thread. I ran latency-test long ago and got the machine to a quite stable maximum jitter of 60µs on a 1ms servo thread. The only tuning I did was BIOS settings and idle=poll on the kernel command line. I did always notice an error saying:

linuxcnc@linuxcnc:~$ latency-test
Note: Using POSIX realtime
Unexpected realtime delay on task 0 with period 25000
This Message will only display once per session.
Run the Latency Test and resolve before continuing.
Note: Using POSIX realtime


This message almost always fired at startup, but latency-* would never show anything beyond my 60µs even with the machine playing video, browsing internet, or doing heavy network operations.

Is/was this a problem I should pay attention to in itself, or just a sign that thread startup is a little heavy?

Now, today I upgraded the machine to 2.9.2 and kernel 6.6.13 (from 6.0.something) and immediately I'm getting:

hm2/hm2_7i95.0: error finishing read! iter=1563

I haven't tried downgrading anything yet, but I see the pinned thread "Latency, error finishing read, and IRQ affinity".

This seems directly connected to my problem, and I'm going to start poring over that and trying things from there.

Please let me know if you think I'm barking up the wrong tree though, or if my earlier conclusions were wrong.

Now for the sake of completeness here's my hardware details if needed:

Dell Wyse 5070 thin client modified to have some active cooling on the cpu
Intel(R) Celeron(R) J4105 CPU @ 1.50GHz w/ 4 cores
4GB RAM
2x Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
- One for the Mesa card connection
- One for network connection

Thanks for reading and any input
hachi
Last edit: 01 May 2024 18:06 by hachi. Reason: Fix code and url block extending to the end of the post incorrectly

01 May 2024 18:02 #299485 by hachi
Quick fix,

I have no clue where my brain was, but I meant 2.9.2, not 2.9.6, of course.

02 May 2024 02:59 #299523 by blazini36
Well hopefully it wasn't this specific error.....
Unexpected realtime delay on task 0 with period 25000

....cuz then you'd be trying to run a 25us thread, which most newer PCs can't handle. If you're seeing that error with a period of 1ms, as you should if you actually set the servo period to 1ms, it's not a big deal at startup; it's pretty common, as long as that's the only time it happens. PCW generally points people to 2 hm2 pins, which I forget ATM, but comparing those 2 pins gives a better idea of the actual latency on an ethernet card.

02 May 2024 03:18 #299524 by PCW
To get an idea of where you are, run the command:

halcmd show param *.tmax

When LinuxCNC is running.

These times will be in CPU clocks on X86 CPUs

Also the ping times (pinging the Mesa card) will help
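For anyone collecting those ping times, here is a minimal sketch that pulls just the worst-case round trip out of ping's summary line. The function name `ping_max_ms` is made up for illustration, and it assumes the iputils-style summary format (`rtt min/avg/max/mdev = ...`).

```shell
#!/bin/sh
# Hypothetical helper: read ping output on stdin and print the max
# round-trip time in ms, taken from the "min/avg/max" summary line.
ping_max_ms() {
    # Splitting on "/" puts the max value in the 6th field:
    # "rtt min" / "avg" / "max" / "mdev = MIN" / "AVG" / "MAX" / "MDEV ms"
    awk -F/ '$0 ~ "min/avg/max" { print $6 }'
}
```

Usage might look like `sudo chrt 99 ping -c 1000 -q <card address> | ping_max_ms`, where the count of 1000 is an arbitrary choice and the address is whatever the Mesa card is configured for.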

02 May 2024 08:01 #299530 by rodw
Try
sudo apt install r8168-dkms
You need the linux headers for your RT kernel installed

More info here 
docs.google.com/document/d/1jeV_4VKzVmOI...diY/edit?usp=sharing

02 May 2024 17:34 - 02 May 2024 17:35 #299573 by hachi

Well hopefully it wasn't this specific error.....Unexpected realtime delay on task 0 with period 25000
....cuz then you'd be trying to run a 25us thread which most newer PCs can't handle.


It is exactly that message, but I think that's because latency-test wants to start a base thread on 25µs periods by default... either way it's not relevant anymore. Thank you.
Last edit: 02 May 2024 17:35 by hachi. Reason: Add quoted text

02 May 2024 17:40 #299575 by hachi

Try
sudo apt install r8168-dkms
You need the linux headers for your RT kernel installed

More info here 
docs.google.com/document/d/1jeV_4VKzVmOI...diY/edit?usp=sharing
 

Wonderful hint, thank you.

02 May 2024 17:42 #299576 by hachi

To get an idea of where you are, run the command:

halcmd show param *.tmax

When LinuxCNC is running.

These times will be in CPU clocks on X86 CPUs

Also the ping times (pinging the Mesa card) will help
 

Thank you very much PCW, you've been very helpful with this machine in private emails in the past as well.

02 May 2024 19:04 - 02 May 2024 19:05 #299590 by hachi
Small note for anyone in the future: the DKMS driver actually made my latency worse and LinuxCNC unusable. I guess I should always test.

Kernel 6.6.13 In-tree driver:

linuxcnc@linuxcnc:~$ sudo taskset -c 3 chrt 99 ping 192.168.1.121
PING 192.168.1.121 (192.168.1.121) 56(84) bytes of data.
64 bytes from 192.168.1.121: icmp_seq=1 ttl=64 time=0.170 ms
64 bytes from 192.168.1.121: icmp_seq=2 ttl=64 time=0.103 ms
64 bytes from 192.168.1.121: icmp_seq=3 ttl=64 time=0.103 ms
64 bytes from 192.168.1.121: icmp_seq=4 ttl=64 time=0.103 ms
64 bytes from 192.168.1.121: icmp_seq=5 ttl=64 time=0.105 ms
64 bytes from 192.168.1.121: icmp_seq=6 ttl=64 time=0.101 ms
64 bytes from 192.168.1.121: icmp_seq=7 ttl=64 time=0.101 ms

linuxcnc@linuxcnc:~$ halcmd show param hm2*.tmax
Parameters:
Owner   Type  Dir         Value  Name
    45  s32   RW              0  hm2_7i95.0.read-request.tmax
    45  s32   RW         479510  hm2_7i95.0.read.tmax
    45  s32   RW          93556  hm2_7i95.0.write.tmax

DKMS Realtek Driver 8.053:

linuxcnc@linuxcnc:~$ sudo taskset -c 3 chrt 99 ping 192.168.1.121
PING 192.168.1.121 (192.168.1.121) 56(84) bytes of data.
64 bytes from 192.168.1.121: icmp_seq=1 ttl=64 time=1.17 ms
64 bytes from 192.168.1.121: icmp_seq=2 ttl=64 time=1.18 ms
64 bytes from 192.168.1.121: icmp_seq=3 ttl=64 time=1.19 ms
64 bytes from 192.168.1.121: icmp_seq=4 ttl=64 time=1.18 ms
64 bytes from 192.168.1.121: icmp_seq=5 ttl=64 time=1.18 ms

linuxcnc@linuxcnc:~$ halcmd show param hm2*.tmax
Parameters:
Owner   Type  Dir         Value  Name
    45  s32   RW              0  hm2_7i95.0.read-request.tmax
    45  s32   RW        1290884  hm2_7i95.0.read.tmax
    45  s32   RW          85734  hm2_7i95.0.write.tmax

And I got a joint following error and a read error almost instantly from hostmot2.
Last edit: 02 May 2024 19:05 by hachi. Reason: Fix kernel version
The following user(s) said Thank You: rodw

02 May 2024 20:38 - 02 May 2024 20:39 #299594 by PCW
The listed read.tmax (479510) is not enough to cause a problem (~300 usec),
but if you get read errors, it means you have had a burst of timeout errors
with read times > 800 usec (tmax > 1200000 with a 1.5 GHz CPU).
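The arithmetic behind those figures can be sketched as below. This assumes the tmax counts tick at the CPU's nominal 1.5 GHz, as the estimate above does; `clocks_to_us` is a made-up helper name, not a LinuxCNC tool.

```shell
#!/bin/sh
# Hypothetical helper: convert a hostmot2 tmax value (CPU clocks on x86)
# to microseconds. usage: clocks_to_us CLOCKS CPU_MHZ
clocks_to_us() {
    # clocks divided by clocks-per-microsecond (MHz); integer division rounds down
    echo $(( $1 / $2 ))
}

clocks_to_us 479510 1500     # in-tree driver read.tmax -> 319
clocks_to_us 1290884 1500    # DKMS driver read.tmax    -> 860
clocks_to_us 1200000 1500    # read-error threshold     -> 800
```

On those assumptions the in-tree driver's worst read sits well under the 800 usec mark, while the DKMS driver's worst read is past it, which lines up with the read errors reported with that driver.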

I would first verify that you have the usual suspects eliminated by
disabling all power management in the BIOS (Turbo modes, EIST, C states > C1,
PCIE and network power management) and disable any Intel spyware like AMT
and hyperthreading if applicable.

If this does not help, the next step is to pin the Ethernet IRQ to the last processor (3)
and set isolcpus=3 in grub
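
A rough sketch of the IRQ pinning step, assuming the interface's IRQs show up by name in /proc/interrupts. The function names and the interface name `enp2s0` are placeholders, so substitute whatever `ip link` reports for the port wired to the Mesa card.

```shell
#!/bin/sh
# Hex CPU mask for a single CPU number, in the form that
# /proc/irq/N/smp_affinity expects (CPU 3 -> 8).
cpu_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

# Print the IRQ numbers whose /proc/interrupts line mentions the interface.
irqs_for() {
    awk -v ifc="$1" '$0 ~ ifc { sub(":", "", $1); print $1 }' /proc/interrupts
}

# Pin every IRQ of the interface to one CPU. Must be run as root.
pin_irqs() {
    ifc=$1; cpu=$2
    for irq in $(irqs_for "$ifc"); do
        cpu_mask "$cpu" > "/proc/irq/$irq/smp_affinity"
    done
}

# Example (hypothetical interface name): pin_irqs enp2s0 3
```

The setting does not survive a reboot, and a running irqbalance may move the IRQ again, so this is only a starting point. On Debian-based systems isolcpus=3 normally goes into GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by update-grub.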
Last edit: 02 May 2024 20:39 by PCW.
