Mesa hm2/hm2_7i96s.0: error finishing read

16 Oct 2022 23:07 #254292 by bensttech
GRUB_CMDLINE_LINUX_DEFAULT="quiet lapic=notscdeadline hpet=disable i915.i915_enable_rc6=0 i915.powersave=0 intel_idle.max_cstate=1 processor.max_cstate=1 isolcpus=1 idle=poll"

That little lot significantly improves things: it makes the system much more stable and reduces both the frequency and the magnitude of the big delay spikes.

17 Oct 2022 01:40 #254301 by PCW
You might also try running Intel's powertop utility and setting all power-saving options to "bad".

17 Oct 2022 06:09 #254312 by bensttech
So extending the servo period does not seem to overcome the issue, as read.tmax also grows when the timing spikes occur. I left it on soak test overnight and, sure enough, I got an "error finishing read! iter=". My hm2_7i95.0.read.tmax hit a record 3376032 and servo .tmax hit 3597792, both with the servo period set to SERVO_PERIOD = 2000000. I will try the exact same hardware with my older 2.8 build, and also some different hardware with this setup, and post the results. Is it possible to make the "error finishing read" non-fatal?

17 Oct 2022 09:36 #254320 by bensttech
The issue follows the OS/kernel or LinuxCNC version, not the hardware (the hardware has only a small effect). I will post the stats later.

17 Oct 2022 15:21 - 17 Oct 2022 15:28 #254334 by PCW
"Error finishing read" means that too many successive timeouts have occurred.

With the default settings, a timeout means a read took more than 80% of the servo period,
and 5 timeouts in a row result in a fatal I/O error: "error finishing read".

You can tune the controlling parameters:

hm2_7i96s.0.packet-read-timeout (default = 80%)
hm2_7i96s.0.packet-error-increment (default = 2)
hm2_7i96s.0.packet-error-decrement (default = 1)
hm2_7i96s.0.packet-error-limit (default = 10)
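
To see how those defaults add up to "5 timeouts in a row", here is a rough Python model of the error counter. It is only a sketch of the behaviour described above, not the hm2_eth driver code: the function name is made up, and it assumes the level goes up by the increment on each timed-out read, down by the decrement on each good read (never below zero), and that the error becomes fatal once the level reaches the limit.

```python
def simulate_packet_errors(timeouts, increment=2, decrement=1, limit=10):
    """Model the packet error level over a run of servo cycles.

    timeouts: iterable of booleans, True when that cycle's read timed out.
    Returns (levels, exceeded_at), where exceeded_at is the index of the
    first cycle at which the level reached the limit, or None.
    Assumption: this mirrors the forum description, not the driver source.
    """
    level = 0
    levels = []
    exceeded_at = None
    for cycle, timed_out in enumerate(timeouts):
        if timed_out:
            level = min(level + increment, limit)  # a timeout raises the level
        else:
            level = max(level - decrement, 0)      # a good read lowers it
        levels.append(level)
        if exceeded_at is None and level >= limit:
            exceeded_at = cycle
    return levels, exceeded_at


# Five consecutive timeouts with the default settings.
print(simulate_packet_errors([True] * 5))

# Four timeouts followed by a good read: the level drops back before the limit.
print(simulate_packet_errors([True] * 4 + [False]))

# Alternating timeout / good read: the level still creeps up by 1 per pair.
print(simulate_packet_errors([True, False] * 10))
```

In this model, raising packet-error-limit or lowering packet-error-increment buys more tolerance for isolated spikes, while a steady stream of timeouts (even alternating ones) still trips the limit eventually.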
Last edit: 17 Oct 2022 15:28 by PCW.

17 Oct 2022 17:22 #254342 by bensttech
Many thanks. Is the packet error exposed to HAL at all? I was struggling to see it.

17 Oct 2022 17:28 - 17 Oct 2022 17:37 #254343 by PCW
Yes:
hm2_7i96s.0.packet-error
hm2_7i96s.0.packet-error-exceeded
hm2_7i96s.0.packet-error-level

Note that packet-error can be useful to prevent bogus velocity corrections
in the case of lost/dropped packets:

# position command and feedback
# (mux2 must be loaded and mux2.0 added to the servo thread before pid.0.do-pid-calcs)
net emcmot.00.pos-cmd joint.0.motor-pos-cmd => pid.0.command mux2.0.in1
net stepgen0fb hm2_[HOSTMOT2](BOARD).0.stepgen.00.position-fb mux2.0.in0
net motor.00.pos-fb <= mux2.0.out joint.0.motor-pos-fb pid.0.feedback
net motor.00.command pid.0.output hm2_[HOSTMOT2](BOARD).0.stepgen.00.velocity-cmd
# packet-error selects in1 (the commanded position) while a packet is missing
net ioerror hm2_[HOSTMOT2](BOARD).0.packet-error mux2.0.sel

This HAL section uses the packet-error pin to select the commanded position
rather than the [stale] feedback position for loop feedback in the event of packet
loss.
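
The effect of that wiring can be illustrated with a few lines of Python. This is only a sketch of the selection logic, not HAL code: the function names and the numbers are made up for illustration, and it relies on mux2 passing in0 when sel is false and in1 when sel is true.

```python
def mux2(sel, in0, in1):
    """Same selection rule as the HAL mux2 component: in1 if sel else in0."""
    return in1 if sel else in0


def pid_feedback(packet_error, pos_cmd, stepgen_fb):
    """Feedback as wired above: in0 = stepgen feedback, in1 = position command."""
    return mux2(packet_error, in0=stepgen_fb, in1=pos_cmd)


pos_cmd = 10.50   # current commanded position (illustrative value)
stale_fb = 10.20  # last feedback received before the packet was lost (illustrative)

for packet_error in (False, True):
    fb = pid_feedback(packet_error, pos_cmd, stale_fb)
    # Following error seen by the PID loop for this cycle.
    print(f"packet_error={packet_error}: feedback={fb}, error={pos_cmd - fb}")
```

With packet_error true, the PID sees feedback equal to the command, so its error is zero for that cycle and it does not try to "correct" against a stale position.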
 
Last edit: 17 Oct 2022 17:37 by PCW.

17 Oct 2022 19:14 #254351 by bensttech
OK, so what are the next steps? There are still too many variables to figure out whether it's the kernel or LinuxCNC itself.
Has anyone got 2.9 running stably with a Mesa Ethernet card (7i95, 7i96, etc.)?

I looked at building 2.8 for Debian 11 to see whether the problem is in LinuxCNC or the OS, but it looks like this is not really possible.

I looked at getting the 4.19 series kernel onto Debian 11, and that is not easy either; it would have to be configured and built from source.

17 Oct 2022 20:30 #254355 by tommylight

17 Oct 2022 21:04 #254366 by PCW
Yes, 4.19 is a better choice on most hardware, but
5.10 and 6.0 work for me on my test CPU (HP Elite 800).

I don't think the LinuxCNC version is related, as there have been no
significant upper-level driver changes to hm2_eth for quite a while.

(Though this reminds me that I should add an accumulated error count.)
