ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
18 Aug 2022 10:22 #250009
by TOLP2
Replied by TOLP2 on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
It is correct that packets have to travel through the network, but that travel takes most of the time in the loop. I've cross-checked my code for the EtherBone communication against the other drivers and there is no difference; it is implemented the same way everywhere. Therefore it seems strange to me that my driver performs worse than the others.
However, there is a remote chance that the size of the packet adversely affects performance. In my case the packet is larger (60 bytes of data + headers) than in the other drivers (20-ish bytes of data + headers) for the same functionality. The core difference is the implementation of the stepgen, where more variables can be changed during the loop (acceleration and timings can be variable in my implementation).
Tonight I'm going to vary the size of the packet to determine how it affects the speed of sending and receiving the data. Two scenarios are possible:
1. The time scales linearly with the amount of data. In this case I either accept the timings or revamp the driver to send less data.
2. The time scales non-linearly with the amount of data (e.g. exponentially). In this case it might be beneficial to send several smaller packets instead of one bigger one.
We will see.
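One way the two scenarios could be told apart once timings are available is to fit a straight line to the measurements and look at the residuals. This is only a minimal sketch: the function names and the sample numbers are hypothetical and are not measurements from any of the drivers.

```python
def linear_fit(sizes, times):
    """Least-squares fit of times = slope * size + offset."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx
    offset = mean_y - slope * mean_x
    return slope, offset


def max_relative_residual(sizes, times):
    """Largest deviation from the fitted line, relative to the measured time."""
    slope, offset = linear_fit(sizes, times)
    return max(abs(y - (slope * x + offset)) / y for x, y in zip(sizes, times))


# Hypothetical loop times in ns for payloads of 20..80 bytes (made-up data).
sizes = [20, 40, 60, 80]
times_linear = [100_000, 102_000, 104_000, 106_000]
times_nonlinear = [100_000, 104_000, 116_000, 148_000]

# A residual near zero points to scenario 1, a clearly larger one to scenario 2.
print(max_relative_residual(sizes, times_linear))
print(max_relative_residual(sizes, times_nonlinear))
```

With real data the threshold between "near zero" and "clearly larger" would have to be chosen with the measurement noise in mind.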
18 Aug 2022 17:56 #250027
by romanetz
Replied by romanetz on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
@TOLP2 please pay attention to colorlight.py, lines 43-49:
    self.add_etherbone(
        phy=self.ethphy,
        mac_address=config.etherbone.mac_address,
        ip_address=str(config.etherbone.ip_address),
        buffer_depth=255,
        data_width=32
    )
This increases the achievable working frequency in the timing report from 43 MHz to 115 MHz in the RGMII clock domain.
18 Aug 2022 21:37 #250040
by TOLP2
Replied by TOLP2 on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
Changing the data size of the packets did not affect the timing of the loop that much. Splitting the data into multiple chunks will definitely not optimize the code. Setting the data_width to 32 bits seems to improve the read speed from the device; the write speed is unaffected.
The timings I posted earlier are affected by the time required to take the measurements themselves. PWM does not read anything back from the FPGA and therefore has an empty read function. I assume this function is optimized away by the compiler (I should really test this assumption). This gives an estimate of the time used by the timing function itself, which is on the order of 3000000 ns per 1000 loops.
About the uncommanded movement: I was finally able to reproduce it this evening. I have to investigate the cause further (I would expect the machine to be at standstill).
As soon as LinuxCNC runs, packets will be sent to the FPGA; thus the watchdog does not bite when the machine goes into E-Stop (why did I think it would?). This means the E-Stop should be communicated to the FPGA by disabling all movement at the FPGA (safety first).
20 Aug 2022 21:19 #250129
by muvideo
Replied by muvideo on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
@svb:
svb please post your hal file. I can try testing Fabio's firmware
I was finally able to put together a working machine around my board and the firmware.
I'm not sure about the source of your disconnection problem, but I have a couple of ideas.
The error messaging surely needs to be updated in the future, because right now it is too crude to be meaningful.
In the meantime I've been comparing your ini and hal files with mine, and I have noticed a couple of things.
First, I would strongly suggest not using joint.n.vel-cmd directly as the input for the stepgen velocity command.
I have had problems with this because joint.n.vel-cmd can have some discontinuities, in my case for example because of backlash compensation.
The correct approach should be to use joint.n.motor-pos-cmd as input.
Since this is a position signal, the solution is to use a pid that takes the joint motor position as input and gives back the requested motor velocity as output.
So you will have:
joint.n.motor-pos-cmd -> pid.n.command
pid.n.output -> Lcnc.00.stepgen.nn.vel-cmd
A basic pid setup, just to get the system going, should be simple:
- first-order feed-forward: FF1=1, all others zero;
- Pgain high, but significantly lower than the value at which the motor starts to self-oscillate;
- I term: some value that zeroes the residual error fast enough.
This removed the strange random joint following error that I was experiencing.
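To illustrate why this arrangement works, here is a minimal sketch of a position-in, velocity-out loop with FF1=1. It is a simplified model written for this illustration, not the LinuxCNC pid component; the class name, the gains and the ideal stepgen (which follows the velocity command exactly) are all assumptions.

```python
class SimplePid:
    """Simplified position-in, velocity-out PID with first-order feed-forward."""

    def __init__(self, pgain, igain, ff1, period):
        self.pgain = pgain
        self.igain = igain
        self.ff1 = ff1
        self.period = period      # servo period in seconds
        self.prev_command = None
        self.integral = 0.0
        self.error = 0.0

    def update(self, pos_cmd, pos_fb):
        # Feed-forward term: derivative of the commanded position.
        if self.prev_command is None:
            cmd_vel = 0.0
        else:
            cmd_vel = (pos_cmd - self.prev_command) / self.period
        self.prev_command = pos_cmd
        # P and I terms work on the remaining position error.
        self.error = pos_cmd - pos_fb
        self.integral += self.error * self.period
        return (self.ff1 * cmd_vel
                + self.pgain * self.error
                + self.igain * self.integral)


period = 0.001
pid = SimplePid(pgain=100.0, igain=10.0, ff1=1.0, period=period)
pos_fb = 0.0
vel_cmd = 0.0
for cycle in range(1, 2001):
    pos_cmd = 10.0 * cycle * period      # constant move at 10 units/s
    vel_cmd = pid.update(pos_cmd, pos_fb)
    pos_fb += vel_cmd * period           # assumed ideal stepgen

print(pid.error, vel_cmd)
```

In this model the feed-forward term supplies almost all of the velocity, while the P and I terms only remove the small residual error, so a step in the position command is corrected smoothly instead of being passed on as a velocity jump.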
Second is the enabling and disabling of the system. I've done it somewhat differently than you.
I honestly don't know if this has any influence on your system's behavior, but:
in my system I'm using iocontrol.0.user-enable-out connected to Lcnc.00.enable,
then iocontrol.0.user-request-enable connected to Lcnc.00.enable-request to request the enabling of the system.
Lcnc.00.enabled is the feedback that I'm using for powering the hardware; it is also used
(negated) for resetting the stepgens in case of problems.
In my case I also have an alarm input from the external drivers that will stop the machine and reset the stepgens; it is ANDed with Lcnc.00.enabled.
You can use the files I've uploaded here as a reference:
github.com/faeboli/Lcnc/tree/master/support_files/LcncMill
23 Aug 2022 18:52 #250276
by svb
Replied by svb on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
Hello Fabio.
I'm sorry, but with your configs the maximum working time before an error is only 25 minutes. With my configs this time is more than 4 hours.
24 Aug 2022 14:03 - 24 Aug 2022 14:05 #250325
by muvideo
Replied by muvideo on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
Hello Fabio.
I'm sorry, but with your configs the maximum working time before an error is only 25 minutes. With my configs this time is more than 4 hours.
Hi,
I found a mistake in my handling of incoming data: after a receive timeout I kept reading the data, even if it was garbled.
This results in joint following errors in case of packet loss.
I've updated the git repository with a fix, and I also made some small changes to the error messaging.
Can you try the latest version of the driver at github.com/faeboli/Lcnc ?
If it behaves better, you can change the tx_retry parameter back to some sensible number, depending on the robustness of your link.
I'm working with 5, but you have to monitor the number of "normal" packet losses you see in your setup and use a value reasonably higher than that, so that you don't get an error in normal use but are still notified in case of a major link disconnection.
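The idea behind the fix and the retry threshold can be sketched as follows. This is a hypothetical model, not the code of the Lcnc driver: the class name, the method name and the exact meaning given to tx_retry here (the number of consecutive lost packets that is tolerated) are assumptions made for the illustration.

```python
class LinkMonitor:
    """Tracks consecutive packet losses and keeps only valid data."""

    def __init__(self, tx_retry):
        self.tx_retry = tx_retry         # tolerated consecutive losses (assumed)
        self.consecutive_losses = 0
        self.last_good = None
        self.link_error = False

    def on_cycle(self, reply):
        """`reply` is the received payload, or None on a receive timeout."""
        if reply is None:
            # Timeout: do not touch the receive buffer, it may be garbled.
            self.consecutive_losses += 1
            if self.consecutive_losses > self.tx_retry:
                self.link_error = True
        else:
            self.consecutive_losses = 0
            self.last_good = reply
        return self.last_good


monitor = LinkMonitor(tx_retry=5)
for reply in [b"pos1", None, None, b"pos2"]:
    data = monitor.on_cycle(reply)
print(data, monitor.link_error)          # occasional losses are tolerated

for _ in range(6):
    monitor.on_cycle(None)
print(monitor.link_error)                # a long outage is reported
```

Returning the last valid data during a short outage is one possible choice; a real driver might prefer to extrapolate or to hold the stepgens instead.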
Please let me know what changes you notice in the behavior.
Fabio
Last edit: 24 Aug 2022 14:05 by muvideo.
The following user(s) said Thank You: svb
25 Aug 2022 20:06 - 25 Aug 2022 20:08 #250443
by svb
Replied by svb on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
Hello Fabio.
I tried the new version.
After the first error the machine goes to the OFF state.
All errors after the first one occur in IDLE mode (machine OFF).
Last edit: 25 Aug 2022 20:08 by svb.
26 Aug 2022 11:54 #250505
by TOLP2
Replied by TOLP2 on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
For Litex-CNC I've found that the uncommanded movement is not due to the driver. Somehow, when a joint following error occurs, LinuxCNC goes into E-Stop. However, the motion planner (motmod) keeps commanding new positions to the driver, so the machine keeps moving. This is a really weird error. The only way I see to solve this is by preventing a joint following error. But if an error does occur, I don't know what will happen.
I think the source of the joint following error is that the acceleration of the driver is the same as the acceleration of the motion planner. Due to rounding errors, the commanded speed cannot be reached within one cycle, and thus the driver is trying its best to compensate. There are two possible solutions to this problem:
- Add a contingency on the acceleration in the driver (let's say 10%).
- Follow the hostmot2 (Mesa cards) approach and allow for an acceleration of 0, which means that the acceleration is completely handled by the motion planner.
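The effect of the first option can be illustrated with a small sketch. The rounding model is an assumption made for this example (the speed change per cycle is rounded down to a multiple of a hypothetical resolution `quantum`); the actual fixed-point format of the driver is not described here, and the function names are made up.

```python
import math


def max_speed_change(accel, period, quantum):
    """Largest speed change per cycle after rounding down to the resolution."""
    return math.floor(accel * period / quantum) * quantum


def can_follow(planner_accel, driver_accel, period, quantum):
    """True if the driver can match the planner's speed change in one cycle."""
    return max_speed_change(driver_accel, period, quantum) >= planner_accel * period


planner_accel = 100.0   # units/s^2, hypothetical
period = 0.001          # s
quantum = 0.003         # units/s, hypothetical speed resolution

# Same acceleration on both sides: rounding makes the driver fall short.
same = can_follow(planner_accel, planner_accel, period, quantum)
# 10% contingency in the driver: the planner's demand fits again.
with_margin = can_follow(planner_accel, planner_accel * 1.10, period, quantum)
print(same, with_margin)
```

Whether 10% is enough depends on how coarse the rounding is compared to the speed change per cycle.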
Backup plan
I still like the flexibility of my version of the driver, in that you can tailor it to the machine based on a config file. If the above does not work, my plan is to:
- base a template on LCNC or ColorCNC;
- create a Python script which generates the source code of a driver based on the template and the settings in the JSON file;
- compile the dedicated driver.
The following user(s) said Thank You: svb
09 Sep 2022 20:17 #251570
by TOLP2
Replied by TOLP2 on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
Quick update: I found that the previous stepgen had an error in the algorithm, which prevented errors from being smoothed out. Hence, a moving machine.
I've now developed a new algorithm, which tries to smooth out any error in two cycles. Imagine a machine which is at standstill and has an error of 1 mm. In the first cycle the machine is accelerated to compensate for half of the error. At the start of the second cycle the machine thus has a non-zero speed. In the second cycle we decelerate to standstill while correcting the second half.
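The 1 mm example can be worked through numerically. This is a minimal sketch, assuming constant acceleration within each cycle, a start from standstill and no acceleration limit; the function names are hypothetical and this is not the Litex-CNC implementation. Accelerating with a for one period T and decelerating with -a for the next covers a distance of a*T^2 in total, so a = error / T^2.

```python
def correction_accel(error, period):
    """Acceleration that removes `error` in two periods, starting at rest."""
    return error / (period * period)


def run_two_cycles(error, period):
    """Return [(position, speed)] at the end of cycle 1 and of cycle 2."""
    accel = correction_accel(error, period)
    pos, vel = 0.0, 0.0
    trace = []
    for a in (accel, -accel):            # ramp up, then ramp down
        pos += vel * period + 0.5 * a * period * period
        vel += a * period
        trace.append((pos, vel))
    return trace


# Error of 1 mm, servo period of 1 ms (the period is an assumed value).
trace = run_two_cycles(error=1.0, period=0.001)
for pos, vel in trace:
    print(pos, vel)
```

In this idealized model each cycle covers half of the error and the machine is back at standstill after the second one. The accelerations involved are large for short periods, so a real driver would presumably still have to respect the configured acceleration limit.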
In theory the above works; in practice I ran into some trouble:
- the stepgen sometimes reports erroneous values for the speed and position back (much higher than commanded);
- while not running a program, the motion planner of LinuxCNC suddenly starts to send out oscillating motion commands (which are followed by the stepgen). Why these oscillations occur is unknown to me; maybe it has something to do with the feedback from the stepgen. Any help or suggestions on this subject are more than welcome.
16 Sep 2022 20:45 #252120
by TOLP2
Replied by TOLP2 on topic ColorCNC Colorlight 5A-75E/5A-75B as FPGA controller board
FINALLY
I have resolved the bugs in the stepgen of Litex-CNC. It was a bit harder than I initially thought it would be. The things I have changed:
- There is now a config phase which is executed during the first loop. During this phase the timings for the stepgen are sent to the FPGA and all related parameters are calculated. The positive effect is that the speed of especially the write function is improved. The downside is that the timings (like step length, dir hold time, etc.) cannot be changed at runtime, but who does that anyway?
- During the config phase it is checked whether the read and write functions are in the recommended order (first read, do something with the data, then write the commands back). A warning is generated when the write function is executed before the read function.
- The stepgen component is essentially completely rewritten:
  - The contributions to the required acceleration are split between the commanded movement and the movement required to compensate for errors (if any).
  - Errors are compensated in two equal steps (theoretically). The first cycle ramps up the speed, the second one ramps down to the speed demanded by the position command. The speed at the end of the second step matches the desired speed seamlessly.
  - To calculate the feedback correctly, the resolution of the counters is taken into account. The calculated values, including the speed, are converted to binary (as required by the FPGA) and then converted back to float for the feedback values. Especially at low speeds and in situations with a small number of steps per unit, this improved the accuracy of the feedback loop tremendously.
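The last point can be illustrated with a small round-trip sketch. The fixed-point format used here (steps per clock tick with a number of fractional bits) and all parameter values are assumptions chosen for the illustration; they are not taken from the Litex-CNC source.

```python
def speed_to_fpga(speed, steps_per_unit, clock_hz, frac_bits):
    """Convert a speed in units/s to an integer step increment per clock tick."""
    steps_per_clock = speed * steps_per_unit / clock_hz
    return round(steps_per_clock * (1 << frac_bits))


def fpga_to_speed(raw, steps_per_unit, clock_hz, frac_bits):
    """Convert the integer increment back to a speed in units/s."""
    return raw / (1 << frac_bits) * clock_hz / steps_per_unit


# Hypothetical machine: few steps per unit and a low speed.
params = dict(steps_per_unit=10, clock_hz=40_000_000, frac_bits=32)
commanded = 0.0123                               # units/s

raw = speed_to_fpga(commanded, **params)         # what the FPGA receives
feedback = fpga_to_speed(raw, **params)          # what the FPGA will really do
print(raw, feedback, commanded - feedback)
```

Reporting `feedback` instead of `commanded` keeps the values seen by LinuxCNC consistent with what the hardware actually executes, and the gap between the two is largest at low speeds and with few steps per unit.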
If you want to try it out, the code is available on GitHub. For now the working stepgen is located in the branch stepgen_improvement, but I will merge it after some more testing.