Difference between pages "Update Site Optimization" and "Pack200"

 
 
Line 1: Line 1:
== The Problem ==
+
==Overview==
The Eclipse Install/Update design concept includes grouping artifacts called features which are published on an Update Site located on a remote server. A feature consists of the feature manifest file and other resources placed in a single JAR archive. When directed at the update site, Eclipse Update Manager must download each of these JARs and parse the manifest in order to perform activities such as site browsing, searching, dependency checking, etc.
+
<p>Pack200 is a compression technology included in Java 1.5.0. It was designed for compressing jars and works most efficiently on Java class files.  Using Pack200 compression can reduce the size of a jar by about 60%. By packing the jars placed on an update site and enabling update to unpack those jars after download, the amount of data downloaded during an update can be greatly reduced.</p>
 
+
<br>
This approach works reasonably well for moderate update sites, but does not scale well for large sites like [http://www.eclipse.org/projects/callisto.php Callisto]. Each of the feature JARs is small, but opening a connection and downloading this small JAR is costly and adds up. Even worse, users need to pay this price BEFORE they even decide if they want to install anything from the site. A solution is needed to reduce the number of connections simply to browse or search the update site.
+
===Signing===
 +
<p>Pack200 is not a lossless compression. Packing and unpacking will produce a jar that is semantically the same as the original, but classfile structures will be rearranged; the resulting jar will not be identical to the original.  However, this reordering is idempotent so a second pack-unpack will not further change the jar.</p>
 +
<p>Signing a jar hashes the contents and stores the hash codes in the manifest. Since packing and unpacking a jar will modify the contents, the jar must be normalized prior to signing.  Normalizing the jar will also be referred to as repacking the jar.</p>
  
Once the features to install have been selected, Update needs to physically download plug-in JARs onto the user's machine. At this point, payload size ceases to be trivial: a full Callisto download is several hundred megabytes. A technique to reduce the payload size would benefit users who are downloading the full Callisto set.
+
== Jar Processor ==
 +
<p>The Jar Processor is a tool provided by the org.eclipse.update.core bundle that will recursively run the pack200, signing, and unpack200 tools on a jar and its nested jars.  The jar processor can be used during a build to repack, sign and pack jars for an update site. It is also used by eclipse itself to unpack compressed jars downloaded from an update site.</p>
  
== The Solution ==
+
<p>The jar processor can be exported into a self-contained jar using the <code>org.eclipse.update.internal.jarprocessor/jarprocessor.jardesc</code> jar description file. It can also be accessed through the <code>org.eclipse.update.core.siteOptimizer</code> application. The jar processor requires a 1.5 JRE to perform the pack and unpack.</p>
The solution comes in two parts: the site digest, and the use of [[Pack200]].  The site digest is produced by merging all the information needed for browsing and searching a site into one file that is archived for size and can be downloaded using one connection instead of the many separate connections needed to download the features. Pack200 is a jar compression utility that is part of J2SE 5.0 and reduces the size of the jars significantly.
+
<pre>java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor [options] [input] </pre>
 
+
The jarprocessor understands the following command-line options:
Both these solutions require enhancements of the Install/Update code to make Update capable of consuming these artifacts. However these performance enhancements are optional and Install/Update should continue to perform as normal in their absence.
+
{| border="1" cellpadding="2"
 
+
!width="150"|Option
 
+
!width="500"|Effect
== Builds, Update Sites and the Site Optimizer ==
+
|-
<p>There are two sides to this solution, steps that must be taken during a component's build, and steps that are taken on the update site itself.</p>
+
|&ndash;repack
<p>
+
|Normalize the jar by calling the pack200 tool with the --repack option.
To ensure that the jars downloaded from an update site are the same as jars downloaded in a zip distribution, the jars need to be normalized (or repacked) during the build process (see the [[Pack200|Pack200 wiki page]]). This is especially true if the jars will be signed. If the jars are being [[JAR Signing|sent to the Eclipse Foundation to be signed]], then this repacking will be done at that time.  The actual build of the digest and packing of the jars can be considered a separate step and can be done on the update site itself.</p>
+
|-
 +
|&ndash;pack
 +
|Pack the jar using the pack200 tool
 +
|-
 +
|&ndash;sign <command>
 +
|Sign the jar using the provided command.  The sign command will be provided the name of the jar to sign as its first argument.
 +
|-
 +
|&ndash;unpack
 +
|Unpack a pack.gz file into a jar using the unpack200 tool.
 +
|-
 +
|&ndash;outputDir <directory>
 +
|The directory in which to place the results.
 +
|-
 +
|&ndash;verbose
 +
|Use verbose mode
 +
|}
 +
<p>The repack, sign and pack options can be specified together.  When specifying all 3, the input jar will first be normalized, then signed, then packed.  The output will be the signed jar and a packed jar.pack.gz file.</p>
 
<br>
 
<br>
===The Site Optimizer===
+
===Input===
The org.eclipse.update.core bundle provides an application extension named org.eclipse.update.core.siteOptimizer which can be invoked from the command line.
+
<p>The jar processor takes as input either a .zip file, a .jar (or .pack.gz) file, or a directory.
 +
*If the input is a zip file, then each contained .jar (or .pack.gz if unpacking) will be processed.  A new .zip file will be created in the output directory containing the results.
 +
*If the input is a single .jar (or .pack.gz file) then that file is processed.
 +
*If the input is a directory then all .jar (or .pack.gz files) in that directory, and its subdirectories, will be processed.
 +
</p><br>
 +
<p>If the input is a zip file, then additional options may be specified by placing a <code>pack.properties</code> file in the root of the zip.  This file is a java properties file and the following properties are supported:
 +
*pack.excludes : A comma-delimited list of JARs that should not be packed or repacked.
 +
*sign.excludes : A comma-delimited list of JARs that should not be signed.
 +
*<jarname>.pack.args : A comma-delimited list of additional arguments that should be passed to pack200 when packing any jar with name <jarname>.
 +
</p>
 +
<br>
 +
===Examples===
 
<pre>
 
<pre>
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer [options]
+
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 +
-repack -sign signing-script.sh -outputDir ./out eclipse-SDK.zip
 
</pre>
 
</pre>
The site optimizer application exposes the digest builder and the jar processor.  The digest builder is the tool that creates the actual site digest, the jar processor is a tool that can repack, sign, pack or unpack a jar and all its nested jars recursively.
+
<p>This will run the jar processor using the siteOptimizer application. For each jar file in eclipse-SDK.zip, the jar processor will normalize the jar by repacking it, then sign it by executing <code>signing-script.sh <jar></code>.  A zip ./out/eclipse-SDK.zip will be created containing the repacked signed jars. Any non-jar file in the input eclipse-SDK.zip will be copied over to the ./out/eclipse-SDK.zip as is.</p>
<p>
+
The site optimizer can be used during a build to do the repacking of the jars. Exactly when it should be called depends on how the build is organized. If the build first builds update jars that are repackaged into the download zips, then the optimizer should be run on those update jars before they are repackaged. If the build produces the download zips first, then the optimizer should be run on the download zips. In both cases, we have either a zip full of jars, or a zip full of directories that contain jars. The site optimizer can take this zip as input and output a similarly shaped zip containing the repacked (and optionally signed) jars:
+
 
<pre>
 
<pre>
 
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
  -repack -outputDir ./out sdk.zip
+
-repack -sign signing-script.sh -pack -outputDir ./out eclipse-SDK.zip
 +
</pre>
 +
<p>This command will do the same as the first example, but will also pack the signed jars.  The output ./out/eclipse-SDK.zip file will contain both the signed jars and the .jar.pack.gz versions.</p>
 +
<pre>
 +
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 +
-pack myJar.jar
  
 
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
  -repack -sign sign_script.sh -outputDir ./out  sdk.zip
+
-unpack myJar.jar.pack.gz
 
</pre>
 
</pre>
</p>
+
<p>These two commands are the inverses of each other.  The first will pack <code>myJar.jar</code> and produce a <code>myJar.jar.pack.gz</code>.  Since no outputDir is specified, the .pack.gz file will be created in the current directory.  If <code>myJar.jar</code> contained a nested jar, then that nested jar will be packed first and the resulting <code>myJar.jar.pack.gz</code> would contain a nested <code>nested.jar.pack.gz</code>.  The second command will unpack the <code>myJar.jar.pack.gz</code> file and produce a <code>myJar.jar</code>.  The nested <code>nested.jar.pack.gz</code> will also be unpacked.  Again, because no outputDir is specified, the output will be in the current directory, overwriting the original <code>myJar.jar</code>.</p>
<p>See the [[Pack200#Jar Processor|jar processor]] page for details on the options available for the jar processor.</p>
+
 
<br>
 
<br>
===The Update Site===
+
===Pack200 Executable location===
<p>
+
<p>By default, the jar processor will look for the pack200 and unpack200 executables first in the <code>${java.home}/bin</code> directory and then on the system search path. However, the location of these tools can also be specified using the java system property <code>org.eclipse.update.jarprocessor.pack200</code>. The value is expected to be the directory containing the pack200 and unpack200 executables or one of the following special values:
If the update site is going to contain packed jars, then the site.xml file should specify that it supports pack200 by setting the pack200 attribute: <code><site pack200="true"></code>.  This lets the Update Manager know that the site contains packed jars, and it will look for a .jar.pack.gz file beside the .jar file that it would normally download. If the .jar.pack.gz file is found, it will be downloaded and unpacked, otherwise the .jar file is downloaded as normal.</p>
+
*"@jre" - find unpack200 in ${java.home}/bin
<p>
+
*"@path" - find unpack200 on the search path
The site optimizer is used on the update site to build the digest and do the actual packing of the jars:
+
*"@none" - pack200 not supported, download normal jars from update sites.
<pre>
+
</p>
java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -digestBuilder
+
  -digestOutputDir=/eclipse/digest -siteXML=/eclipse/site/site.xml  -jarProcessor -pack -outputDir /eclipse/site /eclipse/site
+
</pre>
+
This command will build the digest and traverse the /eclipse/site directory structure and pack all the jars it finds.  The output of a pack is a .pack.gz file, so the result is that beside each jar, there will be a jar.pack.gz file.</p>
+
 
<br>
 
<br>
 
== Related Pages ==
 
== Related Pages ==
 +
*[[Update Site Optimization]]
 
*[[Callisto Coordinated Update Sites]]
 
*[[Callisto Coordinated Update Sites]]
 
*[[JAR Signing]]
 
*[[JAR Signing]]
*[[Pack200|Pack200 and the Jar Processor]]
+
===External Links===
 +
[http://java.sun.com/j2se/1.5.0/docs/guide/deployment/deployment-guide/pack200.html Pack200 and Compression]<br>
 +
[http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/pack200.html JAR Packing tool]<br>
 +
[http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/unpack200.html JAR Unpacking tool]<br>
 +
[http://java.sun.com/j2se/1.5.0/docs/tooldocs/solaris/jarsigner.html JAR Signing and Verification tool]<br>

Revision as of 17:51, 21 April 2006

Overview

Pack200 is a compression technology included in Java 1.5.0. It was designed for compressing jars and works most efficiently on Java class files. Using Pack200 compression can reduce the size of a jar by about 60%. By packing the jars placed on an update site and enabling update to unpack those jars after download, the amount of data downloaded during an update can be greatly reduced.


Signing

Pack200 is not a lossless compression. Packing and unpacking will produce a jar that is semantically the same as the original, but classfile structures will be rearranged; the resulting jar will not be identical to the original. However, this reordering is idempotent so a second pack-unpack will not further change the jar.

Signing a jar hashes the contents and stores the hash codes in the manifest. Since packing and unpacking a jar will modify the contents, the jar must be normalized prior to signing. Normalizing the jar will also be referred to as repacking the jar.
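The ordering described above (normalize, then sign, then pack) can be sketched with the plain JDK tools. This is a minimal sketch, not the jar processor's own implementation: it assumes a JDK that still ships the pack200 tool (it was removed in JDK 14), and KEYSTORE, ALIAS and the RUN switch are hypothetical names introduced for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: normalize -> sign -> pack for a single jar using the JDK tools.
# KEYSTORE and ALIAS are hypothetical placeholders; adjust to your setup.
KEYSTORE="${KEYSTORE:-my.keystore}"
ALIAS="${ALIAS:-myalias}"

# Print the command; execute it only when RUN=1 (dry run by default).
run() {
  echo "$@"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

normalize_sign_pack() {
  jar="$1"
  # 1. Normalize (repack) in place so a later pack/unpack leaves the bytes unchanged.
  run pack200 --repack "$jar"
  # 2. Sign the normalized jar; the hashes now match what unpack200 will produce.
  run jarsigner -keystore "$KEYSTORE" "$jar" "$ALIAS"
  # 3. Pack the signed jar into a .pack.gz file next to it.
  run pack200 "$jar.pack.gz" "$jar"
}

normalize_sign_pack "${1:-myJar.jar}"
```

Signing before normalizing would leave the stored hashes describing bytes that no longer exist after the first pack and unpack cycle, which is why the repack step comes first.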

Jar Processor

The Jar Processor is a tool provided by the org.eclipse.update.core bundle that will recursively run the pack200, signing, and unpack200 tools on a jar and its nested jars. The jar processor can be used during a build to repack, sign and pack jars for an update site. It is also used by Eclipse itself to unpack compressed jars downloaded from an update site.

The jar processor can be exported into a self-contained jar using the org.eclipse.update.internal.jarprocessor/jarprocessor.jardesc jar description file. It can also be accessed through the org.eclipse.update.core.siteOptimizer application. The jar processor requires a 1.5 JRE to perform the pack and unpack.

java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor [options] [input] 

The jarprocessor understands the following command-line options:

Option                   Effect
-repack                  Normalize the jar by calling the pack200 tool with the --repack option.
-pack                    Pack the jar using the pack200 tool.
-sign <command>          Sign the jar using the provided command. The sign command will be provided the name of the jar to sign as its first argument.
-unpack                  Unpack a pack.gz file into a jar using the unpack200 tool.
-outputDir <directory>   The directory in which to place the results.
-verbose                 Use verbose mode.

The repack, sign and pack options can be specified together. When all three are specified, the input jar is first normalized, then signed, then packed. The output is the signed jar and a packed .jar.pack.gz file.
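Since the command given to -sign receives the jar to sign as its first argument, a signing script can be quite small. The following is a sketch of what such a signing-script.sh might look like, assuming signing is done with the JDK's jarsigner; KEYSTORE, ALIAS and SIGN_TOOL are hypothetical settings of this sketch, not options understood by the jar processor.

```shell
#!/bin/sh
# signing-script.sh: sketch of a script usable as "-sign signing-script.sh".
# The jar processor passes the path of the jar to sign as the first argument.
# KEYSTORE, ALIAS and SIGN_TOOL are hypothetical settings for this sketch.
KEYSTORE="${KEYSTORE:-/path/to/keystore}"
ALIAS="${ALIAS:-mykey}"
SIGN_TOOL="${SIGN_TOOL:-jarsigner}"

sign_jar() {
  if [ -z "${1:-}" ]; then
    echo "usage: signing-script.sh <jar>" >&2
    return 2
  fi
  # jarsigner signs the jar in place; it prompts for the keystore
  # password unless one is supplied some other way.
  "$SIGN_TOOL" -keystore "$KEYSTORE" "$1" "$ALIAS"
}

# Sign only when a jar was actually passed on the command line.
if [ "$#" -gt 0 ]; then
  sign_jar "$1"
fi
```

Keeping the tool name in a variable makes it easy to substitute a different signer, or a stub when trying the script out without a keystore.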


Input

The jar processor takes as input either a .zip file, a .jar (or .pack.gz) file, or a directory.

  • If the input is a zip file, then each contained .jar (or .pack.gz if unpacking) will be processed. A new .zip file will be created in the output directory containing the results.
  • If the input is a single .jar (or .pack.gz file) then that file is processed.
  • If the input is a directory then all .jar (or .pack.gz files) in that directory, and its subdirectories, will be processed.


If the input is a zip file, then additional options may be specified by placing a pack.properties file in the root of the zip. This file is a java properties file and the following properties are supported:

  • pack.excludes : A comma-delimited list of JARs that should not be packed or repacked.
  • sign.excludes : A comma-delimited list of JARs that should not be signed.
  • <jarname>.pack.args : A comma-delimited list of additional arguments that should be passed to pack200 when packing any jar with name <jarname>.
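As an illustration of the properties listed above, the following sketch writes a pack.properties file into a directory so it can be added to the root of the input zip. The jar names are hypothetical, and whether entries should be bare names or paths inside the zip is an assumption to verify against your build; -E4 is the pack200 effort option.

```shell
#!/bin/sh
# Sketch: write a pack.properties file into the given directory.
# All jar names below are hypothetical examples.
write_pack_properties() {
  dir="$1"
  cat > "$dir/pack.properties" <<'EOF'
# JARs that should not be packed or repacked (comma-delimited)
pack.excludes=already-normalized.jar,native-launcher.jar
# JARs that should not be signed (comma-delimited)
sign.excludes=third-party.jar
# Extra pack200 arguments for one jar; more arguments would be comma-delimited
big.jar.pack.args=-E4
EOF
}

# Usage (not run here): write the file, then add it to the root of the zip
# before handing the zip to the jar processor, for example:
#   write_pack_properties . && zip eclipse-SDK.zip pack.properties
```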


Examples

java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 -repack -sign signing-script.sh -outputDir ./out eclipse-SDK.zip

This will run the jar processor using the siteOptimizer application. For each jar file in eclipse-SDK.zip, the jar processor will normalize the jar by repacking it, then sign it by executing signing-script.sh <jar>. A zip ./out/eclipse-SDK.zip will be created containing the repacked signed jars. Any non-jar file in the input eclipse-SDK.zip will be copied over to the ./out/eclipse-SDK.zip as is.

java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 -repack -sign signing-script.sh -pack -outputDir ./out eclipse-SDK.zip

This command will do the same as the first example, but will also pack the signed jars. The output ./out/eclipse-SDK.zip file will contain both the signed jars and the .jar.pack.gz versions.

java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 -pack myJar.jar

java -jar /eclipse/startup.jar -application org.eclipse.update.core.siteOptimizer -jarProcessor
 -unpack myJar.jar.pack.gz

These two commands are the inverses of each other. The first will pack myJar.jar and produce a myJar.jar.pack.gz. Since no outputDir is specified, the .pack.gz file will be created in the current directory. If myJar.jar contained a nested jar, then that nested jar will be packed first and the resulting myJar.jar.pack.gz would contain a nested nested.jar.pack.gz. The second command will unpack the myJar.jar.pack.gz file and produce a myJar.jar. The nested nested.jar.pack.gz will also be unpacked. Again, because no outputDir is specified, the output will be in the current directory, overwriting the original myJar.jar.


Pack200 Executable location

By default, the jar processor will look for the pack200 and unpack200 executables first in the ${java.home}/bin directory and then on the system search path. However, the location of these tools can also be specified using the java system property org.eclipse.update.jarprocessor.pack200. The value is expected to be the directory containing the pack200 and unpack200 executables or one of the following special values:

  • "@jre" - find unpack200 in ${java.home}/bin
  • "@path" - find unpack200 on the search path
  • "@none" - pack200 not supported, download normal jars from update sites.
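Because the location is read from a Java system property, one way to set it is with a -D argument on the java command line. The sketch below only prints the resulting command; it assumes the property is picked up when passed to the JVM that runs startup.jar, and the ECLIPSE_HOME default and the helper name jarprocessor_cmd are hypothetical.

```shell
#!/bin/sh
# Sketch: build a jar processor command line that sets the pack200 location.
# ECLIPSE_HOME is a hypothetical install location used for illustration.
ECLIPSE_HOME="${ECLIPSE_HOME:-/eclipse}"

# First argument: a directory containing pack200/unpack200, or one of
# the special values @jre, @path, @none. Remaining arguments are passed
# through to the jar processor.
jarprocessor_cmd() {
  location="$1"
  shift
  echo java "-Dorg.eclipse.update.jarprocessor.pack200=$location" \
    -jar "$ECLIPSE_HOME/startup.jar" \
    -application org.eclipse.update.core.siteOptimizer -jarProcessor "$@"
}

# Print a command that looks for the tools on the system search path.
jarprocessor_cmd @path -pack myJar.jar
```

Passing a directory instead of a special value works the same way, for example `jarprocessor_cmd /opt/jdk1.5.0/bin -unpack myJar.jar.pack.gz`, where the JDK path is again only an example.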


Related Pages

External Links

Pack200 and Compression
JAR Packing tool
JAR Unpacking tool
JAR Signing and Verification tool
