Ok so we don't want to solve a quadratic equation every time we make a step.

We want all of the moving axis to be move in a synchronized way, and be able to implement acceleration on top of that with a minimal additional cost.

The idea here is to decouple the acceleration from the linear motion, if we pick a fixed step rate and quantize all of the axis motion to that rate, then we could just treat the motion itself as linear and then change that rate independently to implement acceleration.

The nice thing about this is that acceleration calculation is completely decoupled from the step interrupt, this is nice because the acceleration calculation still involves a divide which isn't cheap on the microcontroller.

The requirement of all the axis being synchronized is now also easy, the problem become analogous to a simple line drawing algorithm, it just happens to be in N dimensions which as we'll see later doesn't actually make it any more complicated.

So what should we select as our initial "fixed step rate"?

It must be high enough so that you are not required to do more than one step on any given axis per interrupt.

We could pick some arbitrarily high number, but this would result in interrupts where no steps are emitted and that would limit our maximum speed. And starve the rest of the system of CPU resources.

Instead it's better to pick the longest axis in our multidimensional line, compute a step rate based on than and it's initial velocity, such that it would output exactly one step per interrupt.

We then just need to determine if for any given step on that axis we need to output a step on the other axis as well.

There are a couple of ways to approach this, we could implement a DDA with fixed or floating point math, though this would accumulate error over the length of the line because of the limited number of bits in the representation, or we could just use the Bresenham line algorithm which uses only integer math, and therefore introduces no error in addition to the necessary quantization.

I'll quickly derive the bits of the Bresenham algorithm we need here, but Wikipedia has a pretty nice description if you're interested, though it only discusses two dimensions extending it to n is trivial.

The basic DDA is something like this for a stepper driver

// Primary axis is trivially the longest one
select a primary axis P

// Initialize
foreach axis A
Ad = (Aend -Astart)/(Pend-Pstart)
Apos = Astart+0.5; //Center of the "pixel"

// Steps
foreach step in P
foreach axis A
Aposnew = Apos + Ad
if (Aposnew != Apos)
output step for this axis
Apos = Aposnew
It's a relatively easy to realize you don't actually need to track positions, just the fractional portions of them and you end up with.

// Primary axis is trivially the longest one
select a primary axis P

// Initialize
foreach axis A
Ad = abs((Aend -Astart)/(Pend-Pstart))
Aerror = 0.5;

// Steps
foreach step in P
foreach axis A
Aerror += Ad
if (Aerror >= 1)
output step for this axis
Aerror -= 1;
Looking at the above algoritm the only error we introduce is computing Ad where we have to divide by (Pend-Pstart), so all Bresenham does is multiply everything through by that and you end up with.

// Primary axis is trivially the longest one
select a primary axis P

// Initialize
Plength = abs(Pend-Pstart)
foreach axis A
Ad = abs(Aend -Astart)
Aerror = Plength >> 1;

// Steps
foreach step in P
foreach axis A
Aerror += Ad
if (Aerror >= Plength)
output step for this axis
Aerror -= Plength

Now we can do everything with integers, no multiplies, our only divide is a divide by two which can be done as a trivial shift (the bottom bit doesn't really matter). All dropping the bottom bit in the divide by two above does is offset the concept of the pixel center very slightly. It would in fact be entirely "correct" to pick any pixel center, but picking 0.5. 0.5 results im more natural looking lines. If its something that you think is significant you can multiply everything through by two.

What's our accuracy with the new approach?

Since we're quantizing steps in the none primary axis to steps on the primary axis, we are accurate to ~1/2 a step.

In practice I think this ends up being more accurate that the original implementations, because of rounding error introduced in the original when converting velocities to frequencies.