I was describing my old M99 algorithm to someone via email recently and decided to put together a working demo by a complete rewrite (the original code is lost on some broken drive in a shoebox somewhere and it's also over a decade old so it was time to re-write and document publicly - finally). The souce code is here on github.
M99 might be almost two decades old now but it's still very fast and efficient compared to more modern works. Plus it's a worthwhile alternative to Huffman in a lot of cases and achieves way better compression. I still believe that a specialized version designed for 8x8 DCT blocks would be very successful both in terms of encoding/decoding speed as well as overall compression. But that's a story for another day. Now I can get back to MSufsort 4 and also a clean re-write of M03 - if only there were enough time!
It took a lot longer than I wanted, although not long than I had anticipated, to produce a version of MSufSort which was fully parallel. There is still work to be done to make it worthy of 'Return of The King' status but I'm certainly happy that a lot of the road map has been achieved. There is a single component of MSufSort 4 remaining before the algorithm can clearly differentiate itself as best in class again. Specifically, induction sorting. This is a bit odd, really, considering that induced sorting is the single characteristic trait that every version of MSufSort has in common as a core algorithm. At it's heart the entire MSufSort family is a direct comparison, induced sort SACA. So, it is unexpected that it should be the only component of MSufSort 4 which remains on the road map.
The theoretical work has been done for a long time - how to do parallel induced sorting. But there are many variations to try out in practice and I'm not sure that I want to invest my time in that work at this point. I'm much more enthusiastic about Glimpse to be honest. It's not easy to decide that my personal project of a decade and a half is no longer a priority, but it just isn't. I guess it is true that with age comes wisdom. But I've come to realize that Glimpse has far more potential in the long term. It will become a product and not a project.
Anyhow, MSufSort 4/demo is posted on https://github.com/michaelmaniscalco/msufsort.
The chart below graphs the performance of MSufSort 4 on enwik8 using an increasing number of CPUs. The test machine has 4 physical cores. Using virtual cores (hyper threading) has very little effect on performance and so are not presented in the graph. The red line line represents the 'theoretical' best performance assuming that increasing the number of cores from 1 to N would result in an overall decrease in time required to compute the suffix array by a factor of N. The blue line represents the 'actual' performance of MSufSort 4. The chart demonstrates that MSufSort scales almost exactly at the theoretical best as the number of cores increases. I'm reasonably pleased with this performance.
Road map: (* indicates new for version 4)