The top-level directory contains scripts used to build, run, and compare the results of the Java benchmarks and the APK compilation process statistics. Other tools are available under tools/, for example to gather memory statistics or profiling information. See the [Tools][] section.
All scripts must include a `--help` or `-h` command-line option that displays a useful help message.
The statistical t-test and Wilcoxon tests require scipy. On Ubuntu 14.04, install the following apt packages: python3-numpy python3-scipy.
Statistics can be obtained on host with the `run.py` script:

    ./run.py
To obtain the results on target, `dx` and `adb` need to be available in your `PATH`. This will be the case if you run from your Android environment.

    ./run.py --target
    ./run.py --target=<adb target device>
`run.py` provides multiple options.

    ./run.py --target --iterations=5
./build.sh
On host
    cd build/classes
    java org/linaro/bench/RunBench --help
    # Run all the benchmarks.
    java org/linaro/bench/RunBench
    # Run a specific benchmark.
    java org/linaro/bench/RunBench benchmarks/micro/Base64
    # Run a specific sub-benchmark.
    java org/linaro/bench/RunBench benchmarks/micro/Base64.Encode
    # Run the specified class directly without auto-calibration.
    java benchmarks/micro/Base64
And similarly on target
    cd build/
    adb push bench.apk /data/local/tmp
    adb shell "cd /data/local/tmp && dalvikvm -cp /data/local/tmp/bench.apk org/linaro/bench/RunBench"
    adb shell "cd /data/local/tmp && dalvikvm -cp /data/local/tmp/bench.apk org/linaro/bench/RunBench benchmarks/micro/Base64"
    adb shell "cd /data/local/tmp && dalvikvm -cp /data/local/tmp/bench.apk org/linaro/bench/RunBench benchmarks/micro/Base64.Encode"
    adb shell "cd /data/local/tmp && dalvikvm -cp /data/local/tmp/bench.apk benchmarks/micro/Base64"
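The last command in each list runs a benchmark class directly, without auto-calibration. As a rough illustration of what auto-calibration can mean, the sketch below doubles the iteration count until a single run lasts at least a target duration. This shows the general technique only and is not a description of how `RunBench` calibrates; the class name `CalibrationSketch`, the doubling strategy, and the `targetNanos` and `maxIters` parameters are all assumptions made for this example.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch of iteration auto-calibration; not the RunBench implementation.
class CalibrationSketch {

  // Doubles the iteration count until one run of `timeMethod` takes at least
  // `targetNanos`, or until `maxIters` is reached. Returns the chosen count.
  static int calibrate(IntConsumer timeMethod, long targetNanos, int maxIters) {
    int iters = 1;
    while (iters < maxIters) {
      long start = System.nanoTime();
      timeMethod.accept(iters);
      long elapsed = System.nanoTime() - start;
      if (elapsed >= targetNanos) {
        break;
      }
      // Widen to long before doubling so the multiplication cannot overflow.
      iters = (int) Math.min((long) iters * 2, maxIters);
    }
    return iters;
  }

  public static void main(String[] args) {
    // Stand-in for a `time...(int iters)` benchmark method.
    IntConsumer work = iters -> {
      long sum = 0;
      for (int i = 0; i < iters; i++) {
        sum += i;
      }
      if (sum < 0) {
        System.out.println("unreachable");
      }
    };
    // Aim for runs of at least one millisecond, capped at about one million iterations.
    System.out.println("iterations: " + calibrate(work, 1_000_000L, 1 << 20));
  }
}
```

The chosen count depends on the machine, so the printed value varies from run to run.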
The results of `run.py` can be compared using `compare.py`.

    ./run.py --target --iterations=10 --output-json=/tmp/res1.json
    ./run.py --target --iterations=10 --output-json=/tmp/res2.json
    ./compare.py /tmp/res1.json /tmp/res2.json
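To give an idea of what comparing two result sets involves, here is a minimal sketch that computes the relative change of the mean and Welch's t statistic for two sets of run times. It is not how `compare.py` is implemented (that script relies on scipy for its statistical tests); the class name `CompareSketch` and the sample values are made up for illustration.

```java
// Hypothetical sketch of comparing two sets of run times; not the compare.py implementation.
class CompareSketch {

  static double mean(double[] xs) {
    double sum = 0.0;
    for (double x : xs) {
      sum += x;
    }
    return sum / xs.length;
  }

  // Unbiased sample variance (divides by n - 1); needs at least two samples.
  static double variance(double[] xs) {
    double m = mean(xs);
    double squares = 0.0;
    for (double x : xs) {
      squares += (x - m) * (x - m);
    }
    return squares / (xs.length - 1);
  }

  // Relative change of the mean, in percent, going from `base` to `patched`.
  static double meanDiffPercent(double[] base, double[] patched) {
    return (mean(patched) - mean(base)) / mean(base) * 100.0;
  }

  // Welch's t statistic for two samples with possibly unequal variances.
  static double welchT(double[] a, double[] b) {
    double standardError = Math.sqrt(variance(a) / a.length + variance(b) / b.length);
    return (mean(a) - mean(b)) / standardError;
  }

  public static void main(String[] args) {
    // Made-up run times, for illustration only.
    double[] base = {10.0, 12.0, 14.0};
    double[] patched = {13.0, 15.0, 17.0};
    System.out.println("mean diff (%): " + meanDiffPercent(base, patched));
    System.out.println("Welch t: " + welchT(patched, base));
  }
}
```

A large absolute t value suggests the difference between the two runs is unlikely to be noise; turning it into a p-value requires the t distribution, which is where a library such as scipy comes in.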
This repository includes other development tools and utilities.
The `run.py` and `compare.py` scripts in `tools/benchmarks` collect and compare the run times of the Java benchmarks. Their options are similar to those of the top-level scripts. See `tools/benchmarks/run.py --help` and `tools/benchmarks/compare.py --help`.
The `run.py` and `compare.py` scripts in `tools/compilation_statistics` collect and compare statistics about the APK compilation process on target. Their options are similar to those of the top-level scripts. See `tools/compilation_statistics/run.py --help` and `tools/compilation_statistics/compare.py --help`.
The `tools/perf` directory includes tools to profile the Java benchmarks on target and generate an HTML output. See `tools/perf/PERF.README` for details.
The `convert.py` Python script converts the `.json` output of the `run.py` scripts into the format required by bm-plotter. `bm-plotter` is a tool that presents results graphically. You can generate the result image, for example, with:
    ./run.py --target --iterations=10 --output-json=base.json
    git checkout patch_1
    ./run.py --target --iterations=10 --output-json=patch_1.json
    git checkout patch_2
    ./run.py --target --iterations=10 --output-json=patch_2.json
    ./tools/bm-plotter/convert.py base.json patch_1.json patch_2.json > /tmp/bm_out
    <path/to/bm-plotter>/plot /tmp/bm_out
Each set of related benchmarks is implemented as a Java class and kept in the benchmarks/ folder.
Before contributing, make sure that `test/test.py` passes.
The guidelines above for writing a benchmark also apply to porting an existing benchmark. In addition, developers should note the following:
Licenses: Make sure the benchmark has an appropriate license for us to integrate it freely into our test framework. Apache-v2.0, BSD, and MIT licenses are well-known and preferred. Check with the gatekeepers for other licenses. The original license header in the ported benchmark MUST be preserved and unmodified.
Porting a Java benchmark should be done in two commits: (1) Add the untouched original file with its license and copyright header. (2) Modify the benchmark as necessary. This makes it easy to show (`git diff <first commit> <second commit>`) what modifications have been made to the original benchmark.
Keep the original code as it is: This includes indents, spaces, tabs, etc. Only change the original code when you have to (e.g. to fit into our framework), and keep the changes minimal. When we have to investigate why we're getting different results than other projects or developers using the same benchmark, a 'diff' should show as few changes as possible. If the original code's style cannot pass our 'checkstyle' script, use 'CHECKSTYLE.OFF' to bypass it.
Header comment: When you have modified the code, make sure you comply with the license terms. In the header comment, provide a full copy of the license (Apache2, BSD, MIT, etc.) and notices stating that you changed the files (required by Apache2, etc.). Also put a description in the header saying where you found the benchmark source code, with a link to the original source.
`verify` methods should not depend on the benchmark having run before they are called.

To run on target without auto-calibration, use `tools/benchmarks/run.py --target --dont-auto-calibrate`.

    public class MyBenchmark {
      private static final int N = 1000;
      private int[] a;

      public static void main(String[] args) {
        MyBenchmark b = new MyBenchmark();
        b.setupArray();
        long before = System.currentTimeMillis();
        b.timeSumArray(1000);
        b.timeTestAdd(1000);
        b.timeSfib(600);
        long after = System.currentTimeMillis();
        System.out.println("MyBenchmark: " + (after - before));
      }

      public void setupArray() {
        a = new int[N];
        for (int i = 0; i < N; ++i) {
          a[i] = i;
        }
      }

      private int sumArray(int[] a) {
        int n = a.length;
        int result = 0;
        for (int i = 0; i < n; ++i) {
          result += a[i];
        }
        return result;
      }

      public int timeSumArray(int iters) {
        int result = 0;
        for (int i = 0; i < iters; ++i) {
          result += sumArray(a);
        }
        return result;
      }

      // - The test method prefix should be "time...".
      // - The return value is ignored.
      // - No need to set iterations. The test framework will try to fill in a
      //   reasonable value automatically.
      public int timeTestAdd(int iters) {
        int result = 0;
        for (int i = 0; i < iters; i++) {
          // test code
          result += i;
        }
        return result;
      }

      // timeTestAdd(n) sums 0..n-1, i.e. n * (n - 1) / 2.
      public static boolean verifyTestAdd() {
        MyBenchmark b = new MyBenchmark();
        return b.timeTestAdd(0) == 0 && b.timeTestAdd(1) == 0 && b.timeTestAdd(2) == 1
            && b.timeTestAdd(100) == 4950 && b.timeTestAdd(123) == 7503;
      }

      // If you want to fill iterations with your own value, write a method like
      // the one below. `noWarmup=true` means the test is not warmed up;
      // `iterations=600` is your choice. @IterationsAnnotation is provided by the
      // test framework; import it as the existing benchmarks do.
      @IterationsAnnotation(noWarmup=true, iterations=600)
      public long timeSfib(int iters) {
        long sum = 0;
        for (int i = 0; i < iters; i++) {
          sum += sfib(20);
        }
        return sum;
      }

      // Naive recursive Fibonacci, included only to keep this example self-contained.
      private static int sfib(int n) {
        return n < 2 ? n : sfib(n - 1) + sfib(n - 2);
      }
    }
    // Please refer to existing benchmarks for further examples.
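To make the naming conventions concrete, here is a minimal sketch of how a launcher could discover and invoke `time...` and `verify...` methods through reflection. It is an illustration only, not the actual `org/linaro/bench/RunBench` implementation; the names `LauncherSketch` and `Sample`, the method-matching criteria, and the fixed iteration count are assumptions made for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical launcher sketch; not the real RunBench implementation.
class LauncherSketch {

  // Hypothetical benchmark that follows the naming conventions.
  static class Sample {
    public int timeAdd(int iters) {
      int result = 0;
      for (int i = 0; i < iters; i++) {
        result += i;
      }
      return result;
    }

    public boolean verifyAdd() {
      return timeAdd(0) == 0 && timeAdd(4) == 6;
    }

    public void helper() {}  // Ignored: neither a time nor a verify method.
  }

  // Returns the sorted names of public methods shaped like `time...(int iters)`.
  static List<String> findTimeMethods(Class<?> cls) {
    List<String> names = new ArrayList<>();
    for (Method m : cls.getDeclaredMethods()) {
      Class<?>[] params = m.getParameterTypes();
      if (Modifier.isPublic(m.getModifiers()) && m.getName().startsWith("time")
          && params.length == 1 && params[0] == int.class) {
        names.add(m.getName());
      }
    }
    Collections.sort(names);
    return names;
  }

  // Runs every public boolean `verify...()` method; true only if all of them pass.
  static boolean runVerifyMethods(Object benchmark) {
    boolean ok = true;
    for (Method m : benchmark.getClass().getDeclaredMethods()) {
      if (Modifier.isPublic(m.getModifiers()) && m.getName().startsWith("verify")
          && m.getReturnType() == boolean.class && m.getParameterCount() == 0) {
        try {
          ok &= (Boolean) m.invoke(benchmark);
        } catch (ReflectiveOperationException e) {
          throw new RuntimeException(e);
        }
      }
    }
    return ok;
  }

  // Invokes one time method with `iters` iterations; returns the elapsed nanoseconds.
  static long runOnce(Object benchmark, String name, int iters) {
    try {
      Method m = benchmark.getClass().getDeclaredMethod(name, int.class);
      long start = System.nanoTime();
      m.invoke(benchmark, iters);
      return System.nanoTime() - start;
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    Sample sample = new Sample();
    System.out.println("verify passed: " + runVerifyMethods(sample));
    for (String name : findTimeMethods(Sample.class)) {
      System.out.println(name + ": " + runOnce(sample, name, 1000) + " ns");
    }
  }
}
```

Because the verify step creates no dependency on a prior timed run, it can be executed before or after the `time...` methods.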
The performance history of AOSP ART Tip running this benchmark suite is tracked at https://art-reports.linaro.org/.
To maintain the performance history data and allow the team to track the performance of ART easily, the following existing benchmarks should have no new changes:
The following benchmarks are allowed to have new changes (e.g. new cases introduced):
TODO: Detail all benchmarks here, especially what they are intended to achieve.
Description, License (if any), Main Focus, Secondary Focus, Additional Comments
Control flow recursive is ported from: https://github.com/WebKit/webkit/blob/main/PerformanceTests/SunSpider/tests/sunspider-1.0.2/controlflow-recursive.js
License is Revised BSD licence: http://benchmarksgame.alioth.debian.org/license.html
Benchmark for hash map, which is converted from: http://browserbench.org/JetStream/sources/hash-map.js
License is Apache 2.0.
Large portions Copyright (c) 2000-2015 The Legion of the Bouncy Castle Inc. (http://www.bouncycastle.org)
See BitfieldRotate.java header for license text.
License is BSD-like.