Agreed, although perhaps I was not sufficiently clear in the article when I discussed libraries. The challenge for libraries is that they have to address a whole range of use cases, e.g. intersecting just a few arrays of 20 items each (where implementations will show almost no performance difference) or dealing with 10 arrays of 10,000 items each.
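To make the two ends of that range concrete, here is a minimal sketch of a general-purpose intersection in Python. It is not the implementation from the article; the function name `intersect` and the choice to start from the smallest input are my own assumptions for illustration.

```python
from typing import Hashable, List, Sequence


def intersect(*arrays: Sequence[Hashable]) -> List[Hashable]:
    """Return the items present in every input array, without duplicates.

    Hypothetical helper for illustration only. Results follow the order
    of the smallest input array.
    """
    if not arrays:
        return []
    # Iterate over the smallest array: it bounds the size of the result,
    # so it is the cheapest one to walk item by item.
    ordered = sorted(arrays, key=len)
    smallest, rest = ordered[0], ordered[1:]
    # Build a set per remaining array so each membership test is O(1) on average.
    lookups = [set(a) for a in rest]
    seen = set()
    result = []
    for item in smallest:
        if item not in seen and all(item in s for s in lookups):
            seen.add(item)
            result.append(item)
    return result


# Small case: a few short arrays, where almost any approach is fast enough.
small = intersect([1, 2, 3, 4], [2, 4, 6], [4, 2, 0])

# Large case: 10 arrays of 10,000 items each, where the choice of
# data structure starts to matter.
large = intersect(*[list(range(i, 10_000 + i)) for i in range(10)])
```

The same function has to serve both calls, which is the library author's problem in a nutshell: the set-building overhead is irrelevant for the small case, while for the large case it is what avoids repeated linear scans.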
Like others, I have found that micro-optimization of applications is rarely useful; it's the architecture that counts. However, frameworks and libraries are different.