The Advisory Boar

By Abhijit Menon-Sen <ams@toroid.org>

Who invented the slicing-by-N CRC32 algorithm?

2017-01-20

One of my contributions to Postgres 9.5 (back in 2015) was a two-stage optimisation of the CRC computation code. First, switching to a faster algorithm; and second, to use the Intel SSE4.2 CRC instructions where available. I was delighted to have the opportunity to implement such a dramatic performance improvement (CRC computation used to be at the top of the profile on every streaming replica by some distance).

Optimising something by writing assembly (even if it was only a couple of instructions, later replaced by compiler intrinsics) is always fun, but here the algorithm change was also a substantial improvement, in that it used a lookup table to process eight input bytes at a time. This technique is known as “slicing-by-N” (where N depends on the size of the lookup table), and was originally described here:

Frank L. Berry, Michael E. Kounavis, "Novel Table Lookup-Based Algorithms for High-Performance CRC Generation", IEEE Transactions on Computers, vol. 57, no. , pp. 1550-1560, November 2008, doi:10.1109/TC.2008.85

This paper, having been published in a prestigious IEEE journal, is of course not easily available for download (not when I looked in 2015, and apparently not today). I was able to find what I needed to implement the technique thanks to other reference materials, notably including Stephan Brumme's Fast CRC32 page (now considerably expanded since 2015), but I never actually got to read what Messrs. Kounavis and Berry had to say about their technique.

Recently, I had occasion to look at CRC32 implementations again, and I found a different paper that I had looked at briefly the last time: Cyclic Redundancy Check Generation Using Multiple Lookup Table Algorithms by Indu I. and Manu T.S. from TKM Institute of Technology, Kollam, in Kerala (my mother's home state in South India). I remember noting that there was something odd about the paper, but not having time enough to give it more than a passing glance. This time, I spent a while reading it, and it's certainly very odd.

ABSTRACT: The primary goal of this paper is to generate cyclic redundancy check (CRC) using multiple lookup table algorithms. A compact architecture of CRC algorithm (Slicing-by-N algorithm) based on multiple lookup tables (LUT) approach is proposed. This algorithm can ideally read large amounts of data at a time, while optimizing their memory requirement to meet the constraints of specific computer architectures. The focus of this paper is the comparison of two algorithms. These two algorithms are Slicing by-N-algorithm and Sarwate algorithm, in which slicing by-N-algorithm can read arbitrarily 512 bits at a time, but Sarwate algorithm, which can read only 8 bits at a time. This paper proposes the generation of CRC using slicing by 8 algorithm. In this, message bits are chunked to 8 blocks. All are processed at a time. Proposed Slicing-by-8 algorithm can read 64 bits of input data at a time and it doubles the performance of existing implementations of Sarwate algorithm.

Is this paper claiming to have invented the slicing-by-N algorithm? It's hard to tell from the blather in the abstract, but going through the remaining blather (and effort that, in retrospect, I cannot in good conscience commend to the reader) suggests that this is indeed the case.

Recently time is the major concern. So in order to process large amount of data at a time, Multiple Lookup based approach is more efficient. Multiple Lookup based approach contains five CRC algorithms, called Slicing by-N algorithm (N ϵ 4, 8, 16, 32, 64), which is used to read up to 512 bits at a time. So performance of the system should be increased. Here proposing Slicing by-8 algorithm to read 64 bits at a time. Here proposed an efficient design of CRC generator using Slicing by-N algorithm (N=8). In this algorithm, input message stream is sliced into N slices and each slice has 8 bits. So using this Slicing by-8 algorithm, it can read 64 bits at a time and it triples the performance of existing implementation of Sarwate algorithm.

Oho, so it triples the performance of existing implementations of the Sarwate algorithm, does it? Funny the abstract claims a paltry doubling in performance then. The paper goes on to describe CRC computation with block diagrams, and then has some more blather about VHDL and MATLAB and some screenshots of “simulation waveforms”, all of which seems to amount to showing that the various CRC algorithms produce the same results and that processing more input bytes at a time is faster than not doing so. I made judicious use of the fast-forward button to reach the conclusion, which begins with

The design of CRC generator using Multiple Look Up based approach is proposed. In this paper, slicing by-8 algorithm is designed, and compares this algorithm with the existing algorithms, that is, with Sarwate algorithm and LFSR method.

So yeah, they're claiming in a slightly roundabout way to have invented the slicing-by-8 CRC algorithm. However, the authors cite the Kounavis and Berry paper anyway, perhaps so that any criticism can be blamed on some sort of unfortunate misunderstanding. I didn't find any citations of this paper in the minute or two I spent on the search, but Citeseer and Researchgate link to it, and it's quite prominent in search results, so it's probably only a matter of time before someone cites it.

The paper was published in "International Journal of Modern Engineering Research” (ijmer.com) in 2012; the journal's name alone reminded me of the scammers, Academic Journals Online, whom I encountered a few years ago. IJMER does not, however, appear to be one of the AJO journals. Perhaps it's a second cousin.

Unfortunately, the authors include no contact information in the paper, so I was not able to send them a link to this page.

Strange pricing for books on the Amazon marketplace

2015-12-16

Lars Svensson's classic “Identification Guide to European Passerines” was first published a few decades ago. It is no longer available from Amazon, but I have been keeping an eye on copies from other sellers on the Amazon marketplace, and I am increasingly puzzled by their proposed prices.

Amazon pricing screenshot

The absurdly high price isn't because the book is new, because there's a used copy for sale at $1847.20. Even the cheapest used copy right now is priced at $458.60, and that's still far more than I can imagine anyone wanting to pay.

The sellers don't look shady at first glance, and many are highly rated over a significant period. Maybe they didn't notice that the book is available elsewhere for twenty-odd euro? But no, it's probably an “algorithm” (note: those are scare quotes) at work.

A crocodile selfie

2015-12-04

I was at the UP Bird Festival in Chambal, tempted mostly by the memory of seeing many crocodiles. There were very few crocodiles this time, but we found a medium-sized Mugger Crocodylus porosus sunning itself on the bank towards the end of our trip.

I had promised Hassath that I would take a crocodile selfie but alas, I managed to omit the actual crocodile. The crocodile-shaped object up on the bank is (what else?) a log.

Failed crocodile selfie

The crocodile was there, though, just beyond the edge of the frame.

Actual crocodile

Big brother, the tourist attraction

2015-11-24

Every morning, children stream past our house in both directions on their way to school. There are the nearly grown-up, very self-conscious young ladies on the way to the inter-college, dressed in blue and white with neatly plaited and be-ribboned hair. There are groups of brown-and-white children, always squabbling over some snack. There are tiny red-and-blue primary school kids who drift past like tumbleweed—so easily distracted that it's a marvel that they ever make it to school.

And then there are the troublemakers, the wretched blue-and-brown boys who derive entertainment from pinching the valve-caps off our car tyres, or snapping off the occasional windshield wiper. We stuck a webcam in the window overlooking the car to keep an eye on these miscreants. It worked pretty well. A few of the smaller children still write their names on the windows when the car is dusty, but we haven't lost any more valve caps.

Webcam view

But now the webcam has become a local attraction, and we hear children of every colour walking past talking about the “CCTV”, bringing their friends around to point it out, and waving or posing (or dancing!) for the camera. A blue-and-white pair—not yet as serious as their elder sisters–recently made faces at it and ran away horrified but giggling when I replied with a cheerful “Hi”.

Ubiquitous surveillance? What fun!

On Sarah Sharp leaving Linux development

2015-11-05

A month ago, Sarah Sharp posted to say I'm not a Linux kernel developer any more and I am no longer a part of the Linux kernel community.

She added USB 3.0 support to Linux (via the new xHCI host driver) and contributed in many other ways to the USB stack. She also did her best, in the face of an absurd level of outright hostility, to ask the kernel development community to adopt a more personally respectful tone of communication (which has nothing to do with political correctness, but is just about basic human decency).

It's a real shame that Linux has lost a valuable contributor and member of the (newly-formed) Technical Advisory Board. A shame, but hardly a surprise. If anything, it's surprising that she stuck around long enough to achieve as much as she did.

Incidentally, a Documentation/CodeOfConflict file was added to the kernel in February 2015. Among other (fairly vague and conciliatory) things, it says:

If however, anyone feels personally abused, threatened, or otherwise uncomfortable due to this process, that is not acceptable.

I'm sure the people who wrote this had good intentions, but I wish they could have found wording to make it clear that it's the personal abuse, threats, and other such behaviour that are unacceptable, not feel[ing] … abused, threatened, or otherwise uncomfortable.

Or maybe the wording is fine. Hard to tell, really.

Academic Journals Online is a scam

2013-02-18

Some weeks ago, I received a very spammy-looking email Request for Papers from the editorial team of Academic Journals Online (cfp@academicjournalsonline.co.in). Nine times out of ten, I would have deleted it without a second thought, but something about it annoyed me enough to investigate further.

Academic Journals Online (AJO) is a peer-reviewed online International journals [sic] that publishes manuscripts monthly.

I went to their web site (academicjournalsonline.co.in) and had a quick look around. The journals are all named "International Journal of Trends in …" (Computer Science, Multidisciplinary Engineering, Medical Science, etc.). The site claims repeatedly that they are open-access, but charges INR 500 to see more than the abstract of any paper. (The list of papers was accessible in mid-January, but has been made "Members-only" now.) I found no credible independent references to any of these journals.

Read more…

Translations or linkbait?

2013-02-04

I get more email from readers about my git-website-howto article than about anything else on my web site. I'm a little surprised by this (I always thought people would like the git-central-repo-howto better), but I'm happy to hear from people who found the article useful. (Aside: many of the notes say "Thanks, this was helpful"; some also ask a question or two. A few are incomprehensible requests for assistance, but I can remember only one of those ever turning unpleasant.)

A few people have contacted me to ask for permission to translate the article—into Belarusian, Bulgarian, Brazilian Portuguese, and Ukrainian so far. I know none of these languages, but the requests were polite and did not set off any mental alarms, so I gave permission and added links to the resulting translations.

Last year, a Bulgarian reader wrote to (say thanks for the howto and) tell me that the quality of the Bulgarian translation was terrible. He thought it was probably done mechanically (e.g. using Google Translate), and pointed out that the site where the translation was hosted also had many other dubious translations of technical articles. I took the link down (though it may have been too late to reverse whatever SEO benefit they had enjoyed in the meantime).

Last week, I received a request for permission to translate the article into Ukrainian. It just so happens that I can read the Cyrillic alphabet quite fluently (but I know next to no Russian), so I compared the output from translate.google.com and the translated version I was requested to link to. Surprise! It was almost identical. I wanted to do the same with the Belarusian translation, but the link was dead and redirected to the webhostingrating.net index page.

On the other hand, the Brazilian Portuguese translation by Thiago Belem appears to be genuine. Translating it back to English with Google Translate reveals certain changes in the article, which are certainly not the result of mechanical translation. So it seems unfair to reject all such offers, and I would certainly like to acknowledge the work put in by people doing genuine translations.

This leaves me with many questions:

Comments are welcome, especially from technical authors who have had their work widely translated.

Update (2013-02-05): Viacheslav Tykhanovskyi was kind enough to read through the Ukrainian translation and confirm that it's 100% crap.

Buying a textbook

2012-04-02

I walked down to the market this evening to buy some textbooks for my daughter's new school year. I bought three out of the six needed at the first shop, and another one at the next. The third shop I went to had an old man slowly adding up the prices of a stack of items, while a crowd of customers gathered around.

In the shop was also the man's young daughter, who was doing much of the fetching and carrying. She darted out to ask for some change from a shop nearby, then returned and started dealing with the waiting customers. I asked for the two remaining textbooks, and she checked and said they had only one. Another customer asked for a certain kind of pen, and she went to see if they had any. A third customer gave up on the man and stepped around to her side of the counter.

Throughout all this, the father was (while still adding up numbers) grumbling about her. When she left to find change, he complained that she had not braced the flip-top counter correctly. When she asked him where something was, he would reply as if greatly put upon (paying no heed to the customers). He ignored some of her questions, and snapped at her when she repeated herself to double-check if they had the other textbook.

She brought the textbook from a shelf to the counter. Then she took a sheet of plastic and a roll of sellotape, and covered the book with a few swift, well-practiced movements. She took my hundred-rupee note, and politely asked another customer what they wanted while looking for change in the till. When she couldn't find change, I offered to come back for it later; but she asked if I was sure I didn't need anything else (pen? notebook? file?), and I realised that I could use a new pen.

By this time, her father had finished with the stack, and moved on to the next customer. Then he scolded her for being in his way.

I've always been bad at judging ages, but the girl looked only a couple of years older than my fifteen-year-old daughter. Or perhaps it was her spectacles that made her look older. I smiled and thanked her when she handed me the neatly-wrapped textbook and pen, but she was already turning away to attend to another order.

I never did get that last textbook.

Newton's laws of motion

2011-12-22

Ammu is studying Newton's laws of motion this year in Physics, but she can never remember what the three laws are (partly because they don't seem to be stated clearly in her textbook).

I learned the three laws from a very old-fashioned British textbook that belonged to my great-grandfather—long before I had internet access—so it was a treat to be able to look up Newton's original formulation in the "Philosophiæ naturalis Principia Mathematica" on Gutenberg. After the preface and a series of definitions followed by explanatory notes, the laws of motion are presented in a section entitiled “Axiomata sive Leges Motus” (“Axioms; or, The Laws of Motion”).

Lex. I.

Corpus omne perseverare in statu suo quiescendi vel movendi uniformiter in directum, nisi quatenus a viribus impressis cogitur statum illum mutare.

In other words, “Every body perseveres in its state of rest or uniform motion in a straight line, unless a force upon it compels it to change that state.”

Read more…

Typing faster

2011-10-04

I learned how to type by trial and error on my mother's green Olivetti typewriter, and after a few years of using computers, my four-fingered technique served me well enough that I didn't think about learning how to touch-type "properly".

Over the past few months, however, I've tried to use more fingers efficiently, so that I don't have to move my hands as much. I adopted the conventional finger positions for a week, found they didn't suit my big hands (reaching non-alphanumeric keys on the keyboard was painful). Now I've found a more comfortable position, and I can type accurately and reasonably fast (though not as fast as before) without hurting my hands.

Unfortunately, the price I paid was in losing the ability to fall back to my earlier technique, and the loss of speed—though not an impediment in practice—has rankled.

Today I discovered the perfect cure: ztype

This is a "Space Invaders"-style game where you have to type falling words to eliminate them before they reach you. It has a few annoying bugs (sometimes the word you're typing is hidden behind others), but it's extremely addictive nevertheless.