Segfault on 2.9 with kernel 5.4.258-rtai-amd64
18 Jan 2024 02:58 #290985
by natester
Segfault on 2.9 with kernel 5.4.258-rtai-amd64 was created by natester
Hi,
I'm setting up a new machine with a parallel port output. I tested both the preempt-rt and RTAI kernels, and RTAI gives about 1/10th of the jitter I see on preempt-rt.
I set up a new Debian 12 installation (GNOME) and installed LinuxCNC as described here: gist.github.com/nathantsoi/e892965b2d97f735a51b707d4a3b1eff
Latency test and `halrun -I -f ptest.hal` both work fine.
However, when I try to launch a basic config (generated by stepconf), I get a segfault.
I've attached both the dmesg output and the output of the launch command: `linuxcnc -d -v /home/cnc/linuxcnc/configs/router-axis/router.ini`
I'm not sure what I'm missing here. Any direction is greatly appreciated.
18 Jan 2024 05:07 #290991
by tommylight
Replied by tommylight on topic Segfault on 2.9 with kernel 5.4.258-rtai-amd64
It would be nice to have the debug.txt file as well, but this looks like a clue:
qtvcp -ini /home/cnc/linuxcnc/configs/router/router.ini qtdragon
munmap_chunk(): invalid pointer
That said, my guess is that Qt is having issues with Wayland, since you are using GNOME, so switching to XFCE would be the first thing to try.
Desktop environments can be installed without reinstalling the system, and you can switch between them at any time via logout/login.
18 Jan 2024 07:22 #291000
by rodw
Replied by rodw on topic Segfault on 2.9 with kernel 5.4.258-rtai-amd64
Don't use GNOME (or run it in an X11 session instead of Wayland).
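To confirm which kind of session the desktop is actually running, something like the following minimal sketch may help. It assumes the login manager sets the usual `XDG_SESSION_TYPE` variable and falls back to `WAYLAND_DISPLAY` / `DISPLAY`. The function name `display_session` is made up for this example and is not part of LinuxCNC.

```python
import os


def display_session(env=None):
    """Best-effort guess at the display session type ("wayland", "x11", ...).

    `env` defaults to the current process environment; passing a dict
    makes the function easy to try out without changing the real session.
    """
    env = os.environ if env is None else env

    # Most login managers export XDG_SESSION_TYPE (e.g. "wayland" or "x11").
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session:
        return session

    # Fallbacks (assumption: these variables are only set for a live display).
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print(display_session())
```

Run it from a terminal inside the desktop session you launch LinuxCNC from. If it prints `wayland`, logging out and picking an X11 session (or XFCE) at the login screen is the thing to try.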
18 Jan 2024 13:02 - 18 Jan 2024 14:08 #291021
by natester
Replied by natester on topic Segfault on 2.9 with kernel 5.4.258-rtai-amd64
Good idea.
I switched to xfce and also from qtdragon to axis, but I am still getting the segfault (logs attached).
"/usr/bin/linuxcnc: line 977: 4507 Segmentation fault $EMCDISPLAY -ini "$INIFILE" $EMCDISPLAYARGS $EXTRA_ARGS"
Also, I'm not sure why, but about 50% of the time, I also get a kernel panic along with the segfault:
```
Jan 18 07:55:06 router kernel: axis[4507]: segfault at 28 ip 00007fe522d03573 sp 00007ffc68980250 error 4 in libc.so.6[7fe522c92000+155000]
Jan 18 07:55:06 router kernel: Code: 41 f0 49 39 f0 0f 84 a4 05 00 00 4d 8b 49 08 48 83 c8 01 4d 8b 69 08 41 f6 c5 04 0f 85 d1 0a 00 00 4c 39 e8 0f 83 29 06 00 00 <48> 8b 46 28 66 48 0f 6e c6 66 48 0f 6e c8 66 0f 6c c1 0f 11 42 20
Jan 18 07:55:06 router kernel: hal_manualtoolc[4495]: segfault at 10 ip 00007f00bfbf7489 sp 00007ffd74306c10 error 4 in libc.so.6[7f00bfb86000+155000]
Jan 18 07:55:06 router kernel: Code: 8b 4e 08 49 39 c8 0f 82 7d 03 00 00 48 83 f9 0f 0f 86 73 03 00 00 4c 8b 06 49 83 e0 f8 49 39 c0 0f 85 db 05 00 00 4c 8b 42 18 <49> 3b 50 10 0f 85 fd 04 00 00 4c 39 7a 10 0f 85 f3 04 00 00 f6 c1
```
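For anyone reading log lines like these: the `ip` value and the `[start+size]` pair after the library name can be combined into an offset inside the mapped region of libc, which is a starting point for looking up the faulting function. Here is a minimal sketch of that arithmetic. The helper name `decode_segfault` and the regular expression are illustrative only, and the offset is relative to the mapping the kernel reports, which is not necessarily the file offset within `libc.so.6`.

```python
import re

# Matches the x86 kernel "segfault at ..." line format seen in the log above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing one segfault line, or None if it doesn't match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction within the reported mapping.
        "offset_in_mapping": ip - base,
        # x86 page-fault error code bits: 0 = protection fault (else page
        # not present), 1 = write (else read), 2 = user mode.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    line = ("Jan 18 07:55:06 router kernel: axis[4507]: segfault at 28 "
            "ip 00007fe522d03573 sp 00007ffc68980250 error 4 "
            "in libc.so.6[7fe522c92000+155000]")
    info = decode_segfault(line)
    print(info["process"], hex(info["offset_in_mapping"]))
```

For the two lines above this gives offsets `0x71573` (axis) and `0x71489` (hal_manualtoolc). Both are user-mode reads of a tiny address (`0x28`, `0x10`) from nearby code in libc, which is consistent with a corrupted heap structure being dereferenced rather than two unrelated bugs.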
Would it help if I built linuxcnc from source? I could also test the 4.x RTAI kernel this way. I think I can also get the trace for the segfault if I do this, right?
Update: I also built from source and tried again, but now I'm getting a different (malloc) error. Any suggestions on how I debug this?
```
Starting TASK program: milltask
+ program_available milltask
+ type -path milltask
+ NUM=1
+ halcmd loadusr -Wn inihal milltask -ini /home/cnc/linuxcnc/configs/router-axis/router-axis.ini
++ inivar -ini /home/cnc/linuxcnc/configs/router-axis/router-axis.ini -var HALCMD -sec HAL -num 1
+ HALCOMMAND=
+ ''
+ halcmd start
+ run_applications
+ NUM=1
malloc(): unsorted double linked list corrupted
+ APPFILE=
+ ''
+ return
+ echo 'Starting DISPLAY program: axis'
Starting DISPLAY program: axis
```
Last edit: 18 Jan 2024 14:08 by natester.
18 Jan 2024 19:30 #291054
by tommylight
Replied by tommylight on topic Segfault on 2.9 with kernel 5.4.258-rtai-amd64
OK, this now looks like memory or memory controller issues, so if the PC has more than one memory module, yank them out and test them one by one.
21 Jan 2024 13:38 #291257
by natester
Replied by natester on topic Segfault on 2.9 with kernel 5.4.258-rtai-amd64
OK, I tried each of the 2 modules on its own, in each socket. This didn't help.
I ended up reverting to 2.8 with the 4.x preempt-rt kernel and it's working OK: jitter occasionally spikes to 60k, but is usually around 25k.
For reference, on 2.9 with the 5.x RTAI kernel I was getting ~1k jitter.
I'll try 2.8 with the 4.x RTAI kernel next.
22 Jan 2024 01:24 #291298
by MiniDemi
Replied by MiniDemi on topic Segfault on 2.9 with kernel 5.4.258-rtai-amd64
Hello, for what it's worth I've been having the same issue, although I followed a different installation procedure.
First, I installed the LinuxCNC 2.9.2 ISO. Then I ran apt update and upgrade, which gave me some errors related to raspi-firmware; I solved those by following another post. Next, I removed the linuxcnc-uspace package and the documentation, and used gdebi to install the packages from the downloads page (LinuxCNC for RTAI, the RTAI kernel and the RTAI modules).
With everything installed, I rebooted the system, and during boot I saw that systemd-binfmt.service fails to start properly.
Finally, when I launch a Stepconf-generated LinuxCNC configuration, it crashes with the same segmentation fault as yours.
However, when I go to the LinuxCNC Configuration Selector and load the axis sample configuration, it loads up perfectly.
I will keep looking for solutions, although I believe the crash is related to the binfmt service not working properly.