Currently, the model more or less forces the output to a fixed sequence length of 15, yet it can parse input data of different sequence lengths. You could record some new dynamic gestures at different speeds (maybe some quick ones with a short detection window) and see whether the current model behaves well enough. If not, play around with the sequence length and see how well it performs.
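In case it helps with the experiment, here is a rough sketch of one way a recording of any length could be squeezed or stretched to 15 frames before it goes to the model. This is only an assumption about how the fixed length could be handled, not necessarily what the current pipeline does: `resample_sequence` and `TARGET_LEN` are made-up names, and each frame is assumed to be a flat list of feature values (e.g. landmark coordinates).

```python
from typing import List, Sequence

TARGET_LEN = 15  # fixed sequence length mentioned above (hypothetical constant name)


def resample_sequence(
    frames: Sequence[Sequence[float]], target_len: int = TARGET_LEN
) -> List[List[float]]:
    """Linearly interpolate a variable-length gesture recording to target_len frames.

    Assumes every frame is a flat list of floats with the same length.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("need at least one frame")
    if target_len < 1:
        raise ValueError("target_len must be positive")
    if n == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first frame.
        return [list(frames[0]) for _ in range(target_len)]

    out = []
    for i in range(target_len):
        # Fractional position of output frame i on the input time axis.
        pos = i * (n - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


# A quick gesture (2 frames) and a slow one (40 frames) both end up with 15 frames.
quick = [[0.0], [1.0]]
slow = [[float(i)] for i in range(40)]
print(len(resample_sequence(quick)), len(resample_sequence(slow)))
```

Changing `target_len` is an easy way to try other sequence lengths on the same recordings, though the model itself would have to accept the new length too.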