LLVM Weekly - #153, Dec 5th 2016

Welcome to the one hundred and fifty-third issue of LLVM Weekly, a weekly newsletter (published every Monday) covering developments in LLVM, Clang, and related projects. LLVM Weekly is brought to you by Alex Bradbury. Subscribe to future issues at https://llvmweekly.org and pass it on to anyone else you think may be interested. Please send any tips or feedback to asb@asbradbury.org, or @llvmweekly or @asbradbury on Twitter.

I was at the fifth RISC-V Workshop in Mountain View last week. If you're interested in what went on, you may want to check out my notes from day one and day two.

News and articles from around the web

Farzad Sadeghi has announced mutator, a MISRA-C 2004 checker. It is not yet complete, but a number of rules have been implemented. The hope is that in the future it will support code transformations as well.

The Zurich LLVM social will be held on Thursday December 8th at ETH Zurich and will feature a talk 'Transparent Live Code Offloading on FPGA' by Alberto Dassatti. Please register if you would like to attend.

On the mailing lists

Hal Finkel posted an RFC that attempts to clarify LLVM/Clang's current linkage behaviour and propose changes to make it more consistent. In short, Clang will quite happily inline functions with global symbols, and this means that techniques such as LD_PRELOAD to insert an alternative implementation may not always work. Hal suggests keeping this behaviour as the default.
Dean Michael Berris has kicked off an RFC discussing what functionality an XRay library should expose. The main questions revolve around trace formats and backwards/forwards compatibility requirements. Renato Golin has a thoughtful response with some good examples of where other file formats have gone wrong.
Bekket McClane has started a discussion on parallelizing target-independent instruction selection. He proposes an approach to parallelise selection in a fine-grained manner, although the results so far are not promising. Some of the following responses suggest there may not be much to be gained from parallelising this part of the compilation process.
Oren Ben Simhon has posted an RFC on supporting the x86-64 vectorcall calling convention. This uses more registers for arguments than the default or __fastcall conventions.
Francesco Petrogalli has shared details of some recent proposed patches to enable the vectorisation of loops that have function calls to routines marked with #pragma omp declare simd.
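
For readers unfamiliar with the pragma mentioned in the last item, here is a minimal sketch of the kind of source the proposed patches target. The function names scale_and_offset and transform_all are hypothetical and chosen only for illustration; whether the call is actually vectorised depends on the compiler, the target, and flags such as -fopenmp or -fopenmp-simd. Without those flags the pragmas are ignored and the code simply runs as a scalar loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper. The pragma asks the compiler to also provide
// SIMD variants of this function that can be called from vector loops.
#pragma omp declare simd
float scale_and_offset(float x) {
    return 2.0f * x + 1.0f;
}

// Hypothetical caller: a loop whose body calls the annotated routine.
// This is the pattern the patches aim to vectorise.
std::vector<float> transform_all(const std::vector<float>& in) {
    const std::size_t n = in.size();
    std::vector<float> out(n);
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = scale_and_offset(in[i]);
    }
    return out;
}
```

The result is the same whether or not the loop is vectorised; only the generated code differs.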

LLVM commits

Compilation time of Chromium with CFI and UBSan has been substantially improved through a new heuristic to handle regular expressions consisting of simple wildcards. Compilation of Chromium's content_message_generator.cc with CFI previously took 44 seconds and now takes only 23 seconds. r288303.
Support for optimisation remarks was added to Global Value Numbering. r288370.
llvm-objdump now supports the wasm (Web Assembly) file format. r288251.
SystemZ has improved its use of conditional instructions and added support for load-and-trap instructions. r288028, r288030.
The llvm-modextract tool was added. This is a simple tool for testing features that rely on multi-module bitcode files. It will extract one of the modules and write it to the output file. r288201.
Support has been committed for profile-based loop peeling. If the profiled trip count is low, the first several iterations of the loop are peeled. r288274.
The rules for reserved registers have been clarified. r288277.
FrameInstruction has been moved from MachineModuleInfo to MachineFunction. This will require a trivial change for out-of-tree backends. r288291.
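
To illustrate what the loop peeling item above means, here is a source-level sketch of the transformation done by hand. It is not the LLVM pass itself, which operates on IR and picks the peel count from profile data; the peel count of two and the function names sum_original and sum_peeled are assumptions made for this example.

```cpp
#include <cstddef>

// The loop as originally written.
int sum_original(const int* data, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return sum;
}

// The same computation with the first two iterations peeled out of the
// loop. Each peeled copy keeps its own bounds check, so the result is
// unchanged for any n, including n == 0 and n == 1.
int sum_peeled(const int* data, std::size_t n) {
    int sum = 0;
    std::size_t i = 0;
    if (i < n) { sum += data[i]; ++i; }  // peeled iteration 1
    if (i < n) { sum += data[i]; ++i; }  // peeled iteration 2
    for (; i < n; ++i) {                 // remainder loop
        sum += data[i];
    }
    return sum;
}
```

When the trip count is usually small, most executions finish in the straight-line peeled code and never enter the remainder loop.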

Clang commits

Clang gained a whole bunch of unit tests for Linux distribution detection. r288062.
Support was added for constant expression evaluation of wchar_t versions of simple string functions. r288193.

Other project commits

Identical code folding (ICF) in LLD has been parallelised. Because of this, LLD can now link a 1.59GB Chromium binary with ICF enabled in 10.28s, vs 40.94s for gold. The commit message includes a detailed discussion of the algorithm used. This parallel ICF has also been ported to COFF, where the Clang benchmark goes from 11.73s to 6.94s vs 83.02s for MSVC link. r288373, r288487.
The Scudo Hardened Allocator in compiler-rt now supports 32-bit architectures. r288255.
libcxx gained an implementation of C++17 <variant>. r288547.
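
As a quick taste of the <variant> header mentioned in the last item, here is a minimal sketch of a type-safe union using only standard C++17 calls. The alias Value and the function describe are hypothetical names for this example, and it needs a compiler and standard library in C++17 mode (for example -std=c++17).

```cpp
#include <string>
#include <variant>

// Hypothetical alias: a value that holds either an int or a string.
using Value = std::variant<int, std::string>;

// Report which alternative is currently held, along with its contents.
std::string describe(const Value& v) {
    if (std::holds_alternative<int>(v)) {
        return "int: " + std::to_string(std::get<int>(v));
    }
    return "string: " + std::get<std::string>(v);
}
```

Calling std::get with an alternative that is not currently held throws std::bad_variant_access, which is why the sketch checks with std::holds_alternative first.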