Use LLVM Polly for better optimizations
LLVM opt has additional Polly options:
Polly is a high-level loop and data-locality optimizer and optimization infrastructure for LLVM. It uses an abstract mathematical representation based on integer polyhedra to analyze and optimize the memory access pattern of a program. We currently perform classical loop transformations, especially tiling and loop fusion, to improve data-locality. Polly can also exploit OpenMP-level parallelism and expose SIMDization opportunities. Work has also been done in the area of automatic GPU code generation.
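For anyone unfamiliar with tiling, here is a rough hand-written sketch of the idea in C. It is not Polly's output, only an illustration of the kind of loop restructuring described above; the names `N`, `TILE`, `matmul_naive`, `matmul_tiled` and `tiling_matches_naive` are made up for this example, and the sizes are arbitrary.

```c
#include <assert.h>
#include <string.h>

#define N 64     /* matrix dimension, chosen only for illustration */
#define TILE 16  /* tile edge, chosen only for illustration; must divide N */

static double A[N][N], B[N][N], C_naive[N][N], C_tiled[N][N];

/* Straightforward i-j-k matrix multiply: B is walked column-wise,
 * which touches a new cache line on almost every k step. */
static void matmul_naive(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                C_naive[i][j] += A[i][k] * B[k][j];
}

/* Same computation with all three loops tiled by hand: each
 * TILE x TILE block of A, B and C is reused while it is still hot. */
static void matmul_tiled(void) {
    for (int ii = 0; ii < N; ii += TILE)
        for (int jj = 0; jj < N; jj += TILE)
            for (int kk = 0; kk < N; kk += TILE)
                for (int i = ii; i < ii + TILE; i++)
                    for (int j = jj; j < jj + TILE; j++)
                        for (int k = kk; k < kk + TILE; k++)
                            C_tiled[i][j] += A[i][k] * B[k][j];
}

/* Fills the inputs, runs both versions, and returns 1 if the
 * two result matrices are identical, 0 otherwise. */
int tiling_matches_naive(void) {
    assert(N % TILE == 0);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            A[i][j] = (double)((i + j) % 7);
            B[i][j] = (double)((i * 3 + j) % 5);
        }
    }
    memset(C_naive, 0, sizeof C_naive);
    memset(C_tiled, 0, sizeof C_tiled);
    matmul_naive();
    matmul_tiled();
    return memcmp(C_naive, C_tiled, sizeof C_naive) == 0;
}
```

Both versions add the products for each output element in the same order of `k`, so the results should agree exactly; only the order in which memory is visited changes. Whether a tiled version is actually faster depends on matrix size, tile size and the cache hierarchy.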
Polly Options:
Configure the polly loop optimizer
--polly - Enable the polly optimizer (with -O1, -O2 or -O3)
--polly-2nd-level-tiling - Enable a 2nd level loop of loop tiling
--polly-ast-print-accesses - Print memory access functions
--polly-context=<isl parameter set> - Provide additional constraints on the context parameters
--polly-dce-precise-steps=<int> - The number of precise steps between two approximating iterations. (A value of -1 schedules another approximation stage before the actual dead code elimination.)
--polly-delicm-max-ops=<int> - Maximum number of isl operations to invest for lifetime analysis; 0=no limit
--polly-detect-full-functions - Allow the detection of full functions
--polly-dump-after - Dump module after Polly transformations into a file suffixed with "-after"
--polly-dump-after-file=<string> - Dump module after Polly transformations to the given file
--polly-dump-before - Dump module before Polly transformations into a file suffixed with "-before"
--polly-dump-before-file=<string> - Dump module before Polly transformations to the given file
--polly-enable-simplify - Simplify SCoP after optimizations
--polly-ignore-func=<string> - Ignore functions that match a regex. Multiple regexes can be comma separated. Scop detection will ignore all functions that match ANY of the regexes provided.
--polly-isl-arg=<argument> - Option passed to ISL
--polly-on-isl-error-abort - Abort if an isl error is encountered
--polly-only-func=<string> - Only run on functions that match a regex. Multiple regexes can be comma separated. Scop detection will run on all functions that match ANY of the regexes provided.
--polly-only-region=<identifier> - Only run on certain regions (The provided identifier must appear in the name of the region's entry block)
--polly-only-scop-detection - Only run scop detection, but no other optimizations
--polly-optimized-scops - Polly - Dump polyhedral description of Scops optimized with the isl scheduling optimizer and the set of post-scheduling transformations is applied on the schedule tree
--polly-parallel - Generate thread parallel code (isl codegen only)
--polly-parallel-force - Force generation of thread parallel code ignoring any cost model
--polly-pattern-matching-based-opts - Perform optimizations based on pattern matching
--polly-postopts - Apply post-rescheduling optimizations such as tiling (requires -polly-reschedule)
--polly-pragma-based-opts - Apply user-directed transformation from metadata
--polly-pragma-ignore-depcheck - Skip the dependency check for pragma-based transformations
--polly-process-unprofitable - Process scops that are unlikely to benefit from Polly optimizations.
--polly-register-tiling - Enable register tiling
--polly-report - Print information about the activities of Polly
--polly-reschedule - Optimize SCoPs using ISL
--polly-show - Highlight the code regions that will be optimized in a (CFG BBs and LLVM-IR instructions)
--polly-show-only - Highlight the code regions that will be optimized in a (CFG only BBs)
--polly-stmt-granularity=<value> - Algorithm to use for splitting basic blocks into multiple statements
    =bb - One statement per basic block
    =scalar-indep - Scalar independence heuristic
    =store - Store-level granularity
--polly-target=<value> - The hardware to target
    =cpu - generate CPU code
--polly-tiling - Enable loop tiling
--polly-vectorizer=<value> - Select the vectorization strategy
    =none - No Vectorization
    =polly - Polly internal vectorizer
    =stripmine - Strip-mine outer loops for the loop-vectorizer to trigger
It would be great if c3c would also use Polly.
Polly is pretty new (comparatively speaking) and the LTO/ThinLTO part isn't integrated yet, nor is lld currently working correctly on all CI targets. I would prefer not to include Polly until it's actually used in the compiler.
Polly is pretty new
Hmm?
commit 758053788bde4747953f5f276ded345cd01323b1
Author: Tobias Grosser [email protected]
Date: Fri Apr 29 06:27:02 2011 +0000
Add initial version of Polly
This version is equivalent to commit ba26ebece8f5be84e9bd6315611d412af797147e in the old git repository.
llvm-svn: 130476
Yeah, it's still not part of the main LLVM libraries I believe, and not all benchmarks necessarily show improvements, although some do.
clang has -mllvm <value> option. It would be nice to have it in c3c. :)
Unfortunately -mllvm is implemented by Clang, so all that functionality would need to be implemented by hand if added.
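To give a rough idea of what the "by hand" part could look like, here is a minimal sketch in C of just the argument-gathering step. Everything in it is hypothetical: `LlvmArgs` and `collect_mllvm_args` are names invented for this example and are not part of c3c, and the `"c3c"` placeholder in slot 0 is an assumption about what a later option parser would expect as the program name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the options to forward to LLVM. */
typedef struct {
    const char **argv; /* NULL-terminated, argv[0] is a placeholder name */
    int argc;          /* number of entries, not counting the NULL */
} LlvmArgs;

/* Hypothetical helper: collects the value following each "-mllvm"
 * on the command line into an argv-style array. A trailing "-mllvm"
 * with no value is ignored. The caller frees out.argv. */
LlvmArgs collect_mllvm_args(int argc, const char *const *argv) {
    LlvmArgs out;
    /* Upper bound: every argument forwarded, plus slot 0 and the NULL. */
    out.argv = malloc(sizeof(*out.argv) * (size_t)(argc + 2));
    assert(out.argv != NULL);
    out.argc = 0;
    out.argv[out.argc++] = "c3c"; /* placeholder program name (assumption) */
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-mllvm") == 0) {
            out.argv[out.argc++] = argv[++i]; /* take the value, skip past it */
        }
    }
    out.argv[out.argc] = NULL;
    return out;
}
```

The LLVM C API has `LLVMParseCommandLineOptions` in `llvm-c/Support.h`, which takes an `argc`/`argv` pair like this one, so the array could presumably be handed to it. Whether the Polly options are registered at all would still depend on how the linked LLVM was built, and the rest of the driver integration is not covered here.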