Setup
There are two ways to run the tools, both described below.
Using a VM
Both the tools and the subject programs are included in the VM image. You need to install the following software:
- VirtualBox 5.2.22
- vagrant 2.2.2
To set up and enter the VM, run the following:
vagrant up
vagrant ssh
cd /vagrant
Running on a Linux Machine
You are required to install the following software, described below.
- Stack 1.9.3 (Installation instructions here)
- After installation, run
stack upgrade --binary-version 1.9.3
- Java 8
- Git
- Python 3
Setup
Some of the tools require setup. You can run the setup for all the projects with the following command:
./jdebloat.py setup
Running the tools
The tools can be executed through the interface provided by the jdebloat.py script.
The usage for the script can be listed with the help (-h) option as follows.
./jdebloat.py -h
usage: jdebloat.py [-h] {clean,setup,run}
positional arguments:
{clean,setup,run}
optional arguments:
-h, --help show this help message and exit
The script accepts one of three commands, each available for every tool in the package:
- setup - performs setup and compilation for the tool
- run - executes the tool on the benchmark projects
- clean - performs cleanup for the tool
Examples:
To run all three debloating tools in sequence, run:
./jdebloat.py setup
./jdebloat.py run
./jdebloat.py clean
To run the JReduce tool, run:
./jdebloat.py setup jreduce
./jdebloat.py run jreduce
To run the JShrink tool, run:
./jdebloat.py setup jshrink
./jdebloat.py run jshrink
To run the JInline tool, run:
./jdebloat.py setup jinline
./jdebloat.py run jinline
Directory Structure
- results [Directory containing the benchmark results]
- data [Contains misc. data used by the tools]
- jdebloat.py [The script which runs JDebloat]
- output [The output directory]
- README.mkd [The setup README]
- scripts [Contains scripts used by jdebloat.py to run the tools]
- tools [Contains the JShrink, JReduce, and JInline tools]
- javaq [Contains the javaq tool, used for data collection]
- jinline [Contains the JInline tool]
- README.md [The JInline tool README file]
- jshrink [Contains the JShrink tool]
- README.md [The JShrink README file]
- jreduce [Contains the JReduce tool]
- README.md [The JReduce README file]
Benchmark Results
We tested JDebloat on 25 benchmarks and found the following reductions:
Name | Reduction |
---|---|
aragozin/jvm-tools | 64.20% |
ata4/disunity | 25.64% |
Bukkit/Bukkit | 66.49% |
eirslett/frontend-maven-plugin | 99.99% |
google/gson | 30.05% |
JakeWharton/DiskLruCache | 20.20% |
JakeWharton/retrofit1-okhttp3-client | 22.70% |
JakeWharton/RxReplayingShare | 47.70% |
JCTools/JCTools | 90.70% |
junit-team/junit4 | 20.21% |
kevinsawicki/http-request | 19.80% |
mabe02/lanterna | 24.99% |
pagehelper/Mybatis-PageHelper | 30.25% |
pedrovgs/Algorithms | 36.74% |
qiujiayu/AutoLoadCache | 71.02% |
square/javapoet | 20.51% |
square/moshi | 99.56% |
takari/maven-wrapper | 74.45% |
alibaba/TProfiler | 97.15% |
dieforfree/qart4j | 100.00% |
dubboclub/dubbokeeper | 80.09% |
JakeWharton/RxRelay | 27.80% |
sockeqwe/fragmentargs | 23.73% |
tomighty/tomighty | 29.13% |
zeroturnaround/zt-zip | 26.61% |
The links to all of these repositories, as well as the commits we used, are listed in data/benchmarks.csv.
JShrink
JShrink takes a Java project as input and removes uninvoked methods and classes based on static and dynamic call graph analysis. While this functionality is similar to JRed, it differs in three major ways. First, to identify call targets invoked using Java reflection, JShrink uses TamiFlex reflection call analysis, which improves the safety of method removal. We also use JMtrace, a native profiling agent built on the JVM TI API, which captures the use of dynamic features in Java code and augments the static reachability analysis in JShrink. Second, we remove the body of each uninvoked method and allow a custom warning message to be inserted to indicate where debloating has been applied. Third, we allow various options for entry points: all main methods, all public methods (excluding tests), and/or all JUnit tests.
Warning: The current release is a first prototype and is still in active development. Over the course of the ONR-TPCP project, we will make continuous improvements and release upgraded versions in a timely manner.
Technical Details
JShrink works by generating a static call graph of an input program. It proceeds to remove methods that are not used based on static call graph analysis. When using JShrink, the user is required to specify entry points for constructing the call graph. JShrink provides three pre-programmed options: (1) all main methods, (2) all public methods (excluding tests), and/or (3) all JUnit Tests. The user may also specify custom entry points if required.
Using the Soot Bytecode optimization framework, we remove unused Java bytecode methods. The user has the option of either completely removing the method, removing the method's body, or replacing the method's body with a RuntimeException.
Because of Java's reflection functionality, standard call graph analysis libraries alone cannot produce a complete call graph. To overcome this, we use TamiFlex. TamiFlex observes the execution of a Java program under the given test suite and records the reflective method invocations: where the reflective calls are made within the application, and what their call targets are.
JShrink runs TamiFlex with the target Java project's existing test cases as input. We then extract all method invocations that were made via reflection. JMtrace is used to extract any additional method invocations which might have resulted from the use of Java dynamic features such as dynamic classloading, dynamic proxy, JNI, etc.
We set these as additional entry points for the static call graph analysis, which results in safer debloating.
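The effect of adding dynamically observed targets as entry points can be sketched as a reachability computation. This is a minimal illustration, not JShrink's implementation: the call graph is a hypothetical dictionary from method names to callees, and `reachable_methods`, `static_entries`, and `dynamic_entries` are names invented for this example.

```python
from collections import deque


def reachable_methods(call_graph, entry_points):
    """Return every method reachable from the entry points (breadth-first)."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical static call graph: caller -> callees visible to static analysis.
call_graph = {
    "App.main": ["App.run"],
    "App.run": ["Util.format"],
    "Plugin.load": ["Plugin.init"],  # assumed to be invoked only via reflection
    "Legacy.unused": [],
}
all_methods = set(call_graph) | {c for cs in call_graph.values() for c in cs}

static_entries = {"App.main"}
# Call targets observed while running the tests (hypothetical tracer output).
dynamic_entries = {"Plugin.load"}

# Methods that would be considered unused under each set of entry points.
static_only = all_methods - reachable_methods(call_graph, static_entries)
with_dynamic = all_methods - reachable_methods(
    call_graph, static_entries | dynamic_entries
)

print(sorted(static_only))   # includes the reflectively used Plugin methods
print(sorted(with_dynamic))  # only the genuinely unused method remains
```

With static entry points alone, the reflectively invoked methods look unused and would be removed; adding the observed targets keeps them.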
Current Restrictions and Limitations
- JShrink works only with Java 1.8.
- It requires a user to specify an entry point.
- Handling reflective calls and other dynamic features is enabled for Maven projects only. In other words, the --tamiflex and --jmtrace options only work when targeting a Maven project.
- If the --tamiflex option is specified, the --test-entry option is automatically set, since TamiFlex uses tests as entry points to analyze reflective calls.
- --use-spark will use the Spark call graph analysis. Spark is not as conservative as the default call graph analysis (CHA) and may cause errors (we know of instances where Spark does not produce a complete call graph).
Usage
To execute the JShrink tool with the benchmarks, simply run ./jdebloat.py run jshrink in the VM provided. The debloated programs can be found in output/JShrink, along with a summary of the size reduction achieved in output/JShrink/<BENCHMARK>/size_info.dat.
If running the tool independently is required, please read the following usage notes:
usage: JShrink.jar [-a <arg>] [-c <arg>] [-ch <path>] [-d] [-e <Exception Message>]
[-f <TamiFlex Jar>] [-h] [-i <arg>] [-jm <path>] [-k] [-l <arg>] [-m] [-n <arg>]
[-o] [-p] [-r] [-s] [-t <arg>] [-u] [--usecache] [-v]
An application to get the call-graph analysis of an application and to
wipe unused methods
-a,--app-classpath <arg> Specify the application
classpath
-c,--custom-entry <arg> Specify custom entry points
in syntax of
'<[classname]:[public?]
[static?] [returnType]
[methodName]([args...?])>'
-ch,--checkpoint <path> Maintain and revert to checkpoints in
case a transformation leads to test failure
-d,--debug Run JShrink in 'debug'
mode. Used for testing
-e,--include-exception <Exception Message> Specify if an exception
message should be included
in a wiped method (Optional
argument: the message)
-f,--tamiflex <TamiFlex Jar> Enable TamiFlex
-jm,--jmtrace <path/to/jmtrace/folder> Enable Dynamic Profiling
-h,--help Help
-i,--ignore-classes <arg> Specify classes that should
not be delete or modified
-k,--use-spark Use Spark call graph
analysis (Uses CHA by
default)
-l,--lib-classpath <arg> Specify the classpath for
libraries
-m,--main-entry Include the main method as
an entry point
-n,--maven-project <arg> Instead of targeting using
lib/app/test classpaths, a
Maven project directory may
be specified
-o,--remove-classes Remove unused classes
-p,--prune-app Prune the application
classes as well
-r,--remove-methods Remove methods header and
body (by default, the bodies
are wiped)
-s,--test-entry Include the test methods as
entry points
-t,--test-classpath <arg> Specify the test classpath
-u,--public-entry Include public methods as
entry points
--use-cache Cache static analysis call graph of project
-v,--verbose Run JShrink in 'verbose'
mode. Outputs analysed
methods and touched methods
Example usage case 1: Use a Maven project as an application, specify entry points as all main methods, all public methods, and all existing test cases, and consider Java reflective calls using TamiFlex
java -jar jshrink.jar --maven-project <PROJECT_DIR> --public-entry \
--main-entry --test-entry --prune-app --remove-methods --tamiflex \
<TAMIFLEX_JAR>
- --maven-project <PROJECT_DIR> specifies the Maven project to be debloated.
- --public-entry --main-entry --test-entry states that all public methods, main methods, and test methods should be used as entry points to generate the call graph.
- --prune-app specifies that the application code should be debloated as well as the dependency code.
- --remove-methods specifies that methods should be removed in their entirety. By default, only their bodies are removed.
- --tamiflex <TAMIFLEX_JAR> specifies that TamiFlex should be used to find reflective calls. The argument is the location of the TamiFlex JAR.
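When JShrink is driven from a script, the invocation in example 1 can be assembled as an argument list. The sketch below uses only flags listed in the usage notes; the helper name `jshrink_command` and the paths passed to it are hypothetical.

```python
import shlex


def jshrink_command(project_dir, tamiflex_jar=None, jar="jshrink.jar"):
    """Build the argument list for the Maven invocation shown in example 1."""
    argv = [
        "java", "-jar", jar,
        "--maven-project", project_dir,
        "--public-entry", "--main-entry", "--test-entry",
        "--prune-app", "--remove-methods",
    ]
    # TamiFlex is optional; the flag takes the location of the TamiFlex JAR.
    if tamiflex_jar is not None:
        argv += ["--tamiflex", tamiflex_jar]
    return argv


# Placeholder paths, for illustration only.
argv = jshrink_command("my-project", tamiflex_jar="tools/tamiflex.jar")
print(shlex.join(argv))
# To actually launch the tool: subprocess.run(argv, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when paths contain spaces.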
Example usage case 2: Use a non-Maven project as an application, specify main methods as an entry point, and do not consider reflective calls using TamiFlex
java -jar jshrink.jar --app-classpath <APP_CLASSPATH> --lib-classpath \
<LIBRARY_CLASSPATH> --test-classpath <TEST_CLASSPATH> \
--include-exception "ERROR, METHOD REMOVED"
- --app-classpath <APP_CLASSPATH> --lib-classpath <LIBRARY_CLASSPATH> --test-classpath <TEST_CLASSPATH> specifies the application, library, and test classpaths of the target.
- --include-exception "ERROR, METHOD REMOVED" specifies that when a method's body is wiped, it should be replaced with a RuntimeException carrying the message "ERROR, METHOD REMOVED".
Example usage case 3: Use a Maven project as an application, perform call graph analysis with Spark, and remove unused classes
java -jar jshrink.jar --maven-project <PROJECT_DIR> --main-entry \
--remove-classes --use-spark
- --remove-classes specifies that classes whose methods are all removed, and which contain no accessible static methods, are to be removed completely.
- --use-spark specifies that the Spark call graph analysis should be used.
Results
Running our tool on the benchmarks yields the following results:
Benchmark | Size Before Debloat (Bytes) | Size after Debloat (Bytes) | Reduction |
---|---|---|---|
JavaPoet | 234746 | 230375 | 1.86% |
JavaVerbalExpressions | 14746 | 14746 | 0.00% |
Curator | 10427613 | 8071252 | 22.60% |
RxRelay | 5108491 | 4574410 | 10.45% |
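The Reduction column follows directly from the two size columns. The short check below recomputes it from the byte counts in the table; the function name `reduction_percent` is ours, and the formula (bytes removed divided by original size) is the one the table's figures are consistent with.

```python
def reduction_percent(size_before, size_after):
    """Percentage of bytes removed, relative to the original size."""
    if size_before <= 0:
        raise ValueError("size_before must be positive")
    return (size_before - size_after) / size_before * 100


# Byte counts copied from the table above.
sizes = {
    "JavaPoet": (234746, 230375),
    "JavaVerbalExpressions": (14746, 14746),
    "Curator": (10427613, 8071252),
    "RxRelay": (5108491, 4574410),
}
for name, (before, after) in sizes.items():
    print(f"{name}: {reduction_percent(before, after):.2f}%")
```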
Descriptions of benchmark applications
- JavaPoet is a Java API for generating .java source files.
- DiskLruCache is a library that provides a cache bounded by an amount of space on a file-system.
- JavaVerbalExpressions is a Java library that helps in the construction of difficult regular expressions.
- Curator is a set of Java libraries to improve Apache ZooKeeper.
- JUnit4 is a framework to write repeatable tests for Java.
- RxRelay is a Relay library for RxJava.
Results on other projects.
Benchmark | Reduction |
---|---|
alibaba_TProfiler | 10.17% |
aragozin_jvm-tools | 4.20% |
Bukkit_Bukkit | 18.54% |
dieforfree_qart4j | 46.82% |
dubboclub_dubbokeeper | 17.32% |
eirslett_frontend-maven-plugin | 22.44% |
google_gson | 5.52% |
JakeWharton_DiskLruCache | 1.65% |
JakeWharton_retrofit1-okhttp3-client | 11.46% |
JakeWharton_RxRelay | 17.47% |
JakeWharton_RxReplayingShare | 22.13% |
junit-team_junit4 | 6.93% |
kevinsawicki_http-request | 6.55% |
mabe02_lanterna | 1.96% |
notnoop_java-apns | 18.88% |
pagehelper_Mybatis-PageHelper | 23.91% |
pedrovgs_Algorithms | 5.46% |
qiujiayu_AutoLoadCache | 20.19% |
sockeqwe_fragmentargs | 11.59% |
square_moshi | 0.22% |
tomighty_tomighty | 20.10% |
zeroturnaround_zt-zip | 11.32% |
Method wiping
In our tool, the default behavior is to wipe the method body of each uninvoked method. Below is an example of a Java method, shown as Jasmin-style bytecode assembly:
.method public static staticShortMethodNoParams()Ljava/lang/Short;
.limit stack 2
.limit locals 1
getstatic java/lang/System/out Ljava/io/PrintStream;
astore_0
aload_0
ldc "staticShortMethodNoParams touched"
invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
iconst_3
invokestatic java/lang/Short/valueOf(S)Ljava/lang/Short;
astore_0
aload_0
areturn
.end method
Wiping the body leaves the method header in place and reduces the body to the minimum the JVM permits. The result is shown below:
.method public static staticShortMethodNoParams()Ljava/lang/Short;
.limit stack 1
.limit locals 0
aconst_null
areturn
.end method
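The wiped body above returns null because the method's return type is a reference. As a rough sketch of how a minimal body could be chosen for other return types, the helper below maps a method descriptor to a short instruction sequence. Only the reference case is shown in this README; the other cases are our assumption based on the JVM instruction set, and `minimal_body` is a hypothetical name, not part of JShrink.

```python
def minimal_body(descriptor):
    """Pick a short instruction sequence that satisfies the return type."""
    # The return type is whatever follows the closing parenthesis.
    return_type = descriptor[descriptor.index(")") + 1:]
    if return_type == "V":
        return ["return"]
    if return_type in ("Z", "B", "C", "S", "I"):
        # boolean, byte, char, short and int are all returned with ireturn
        return ["iconst_0", "ireturn"]
    if return_type == "J":
        return ["lconst_0", "lreturn"]
    if return_type == "F":
        return ["fconst_0", "freturn"]
    if return_type == "D":
        return ["dconst_0", "dreturn"]
    if return_type.startswith(("L", "[")):
        # objects and arrays: return null, as in the wiped method above
        return ["aconst_null", "areturn"]
    raise ValueError(f"unrecognized return type: {return_type!r}")


print(minimal_body("()Ljava/lang/Short;"))
print(minimal_body("(I)V"))
```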
JReduce
JReduce is a tool that uses a variant of delta debugging to reduce the classes of a project while preserving a given property. The tool was originally built to reduce bytecode that caused bugs in decompilers. We categorized a bug as any set of classes with no external dependencies that could not be decompiled and then compiled again. The goal was therefore to find the smallest set of classes that still had all their dependencies.
In this case, we use the test suite as the property that we want to preserve. Then we reduce the number of classes to the smallest possible where the tests still succeed.
JReduce is the focus of a paper accepted at FSE'19, which showed a 12x faster reduction of Java bytecode than previous techniques. JReduce is currently in active development, and its progress can be followed in the open source repository.
Technical Details
JReduce works by calculating a dependency graph from classes to other classes. We build the graph by adding an edge from one class to another whenever the first class mentions the second.
Using the graph, we calculate all the strongly connected components (SCCs). If we include one of the classes in an SCC, all classes in the SCC need to be included. This means that we can reduce the program by reducing this list of SCCs.
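As an illustration of the SCC step, the following sketch runs Tarjan's algorithm over a small, hypothetical class dependency graph. The choice of Tarjan's algorithm and the graph contents are ours; the README does not say which SCC algorithm JReduce uses.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each class to the classes it mentions."""
    index = {}
    lowlink = {}
    stack = []
    on_stack = set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        # node is the root of a component: pop the whole component off the stack
        if lowlink[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


# Hypothetical dependency graph: Main and Helper mention each other.
graph = {
    "Main": ["Helper", "Util"],
    "Helper": ["Main"],
    "Util": [],
    "Orphan": ["Util"],
}
components = strongly_connected_components(graph)
print(components)
```

Main and Helper end up in one component, so neither can be kept without the other, while Util and Orphan each form a component of their own. The recursive form is fine for a sketch; very deep graphs would need an iterative version.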
It is rare that a program runs classes outside the SCC that contains the main class, but it can happen if the program uses reflection. We have therefore developed a new reduction technique called Binary Reduction, which can quickly search the list for the few SCCs needed to satisfy the predicate.
In our tool, it is also possible to provide a set of core classes, which should not be removed. If an SCC contains a class from the core, it will not be removed. In our case, we set the test cases as the core.
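The role of the core can be sketched as a simple partition of the SCC list: components that touch the core are always kept, and only the rest are candidates for removal. The function name `split_by_core` and the class names are hypothetical.

```python
def split_by_core(components, core):
    """Separate SCCs that touch the core (always kept) from removable ones."""
    core = set(core)
    fixed = [scc for scc in components if scc & core]
    candidates = [scc for scc in components if not (scc & core)]
    return fixed, candidates


# Hypothetical SCCs; the test class is assumed to be the core.
components = [{"Main", "Helper"}, {"Util"}, {"MainTest"}, {"Orphan"}]
core = {"MainTest"}
fixed, candidates = split_by_core(components, core)
print(fixed)
print(candidates)
```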
Usage
To run JReduce on the benchmarks of this project, first set up the tool by running ./jdebloat.py setup jreduce. Then run ./jdebloat.py run jreduce; the output can be found in the output/jreduce folder.
You can also run the tool on your own benchmarks. Either use the scripts/runjreduce.sh script or run JReduce directly:
jreduce -v -o output --cp test.jar -t app.jar -c @classes-in-core.txt \
<runpredicate> <args..>
Here, runpredicate.sh is a script that takes a reduced app.jar and exits with code 0 if the predicate succeeded. In the runjreduce.sh script we use runtest.sh with the test.jar and test.classes.txt.
If you are feeling adventurous, consult the help notes:
Usage: jreduce [-v] [-q] [-D|--log-depth ARG] [-c|--core CORE] [--cp CLASSPATH]
[--stdlib] [--jre JRE] (-t|--target FILE) (-o|--output FILE)
[-R|--reducer ARG] [-W|--work-folder ARG] [-K|--keep-folders]
[-E|--exit-code CODE] [--stdout] [--stderr] [-T|--timelimit SECS]
CMD [ARG..]
A command line tool for reducing java programs.
Available options:
-v make it more verbose.
-q make it more quiet.
-D,--log-depth ARG set the log depth. (default: -1)
-c,--core CORE the core classes to not reduce.
--cp CLASSPATH the library classpath, of things not reduced.
--stdlib load the standard library.
--jre JRE the location of the stdlib.
-t,--target FILE the path to the jar or folder to reduce.
-o,--output FILE the path output folder.
-R,--reducer ARG the reducing algorithm to use. (default: Binary)
-W,--work-folder ARG the work folder.
-K,--keep-folders keep the work folders after use?
-E,--exit-code CODE preserve exit-code (default: 0)
--stdout preserve stdout.
--stderr preserve stderr.
-T,--timelimit SECS the maximum number of seconds to run the process,
negative means no timelimit. (default: -1.0)
CMD the command to run
ARG.. arguments to the command.
-h,--help Show this help text
JInline
JInline takes a Java program and statically inlines methods read from a database.
Technical Details
We first provide aggressive inline parameters to the JVM. While these parameters are not suitable for running programs, they provide better inlining information. We extract the inlining decisions from the JVM into a database for later use.
We use our customized database to inform our static inliner. First, we filter out aggressive inlinings that would cause the Java program to miscompile. Using this information, our Inliner tool uses the Soot Bytecode optimization framework to statically inline method calls without affecting the semantics of the program.
Our technique finds inline targets that might not otherwise be detectable by purely static approaches. Our tool produces a new JAR with our modified class files containing inlined methods. We successfully ran our modified JAR on the original test cases without errors.
Usage
To run JInline on the provided benchmarks, simply run ./jdebloat.py run jinline.
The output programs will be found in output/jinline as JARs. If running the tool independently is required, please read the following usage notes:
usage: run-jinline.py [-h] [-o OUTPUT_JAR]
test_jar test_classes app_lib_jar output_dir
Run inliner tool.
positional arguments:
test_jar JAR containing the test suite
test_classes Text file of test classes
app_lib_jar JAR containing application and libraries
output_dir Output directory
optional arguments:
-h, --help show this help message and exit
-o OUTPUT_JAR Modified JAR file path