@startuml
!include jb-plantuml-theme.puml

skinparam linetype ortho

top to bottom direction

header
A [[https://en.wikipedia.org/wiki/ZIP_(file_format) ZIP]] file format with optimized metadata.
endheader

component "File Entry 1" as FE1
component "File Entry N" as FE2

note right of FE1
The "relative offset of local header" field in the central directory does not point directly to the data,
but to the local file header. Because the size of the local file header varies,
locating the actual data takes two seeks: one to the header and one to the data that follows it.

As an optimization, the data offset can be precomputed
when reading the central directory file header.
This optimization is implemented in the HashMapZipFile class.
ImmutableZipFile instead uses a special index for this purpose, as explained below.
end note
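/'
The variable-size local file header described in the note above can be resolved as sketched here.
This is a minimal illustration, assuming the whole archive is available as a ByteBuffer.
The class and method names are hypothetical and are not taken from HashMapZipFile or ImmutableZipFile;
only the field positions (name length at byte 26, extra field length at byte 28, fixed part of 30 bytes)
come from the ZIP format itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class LocalHeaderOffsets {
  // Size of the fixed part of a local file header.
  static final int LOCAL_HEADER_FIXED_SIZE = 30;
  // Local file header signature, read as a little-endian int.
  static final int LOCAL_HEADER_SIGNATURE = 0x04034b50;

  // Returns the absolute position of the entry data.
  // headerOffset is the "relative offset of local header" value
  // taken from the central directory file header.
  static int dataOffset(ByteBuffer zip, int headerOffset) {
    // Work on a little-endian view so the caller's buffer state is untouched.
    ByteBuffer view = zip.duplicate().order(ByteOrder.LITTLE_ENDIAN);
    if (view.getInt(headerOffset) != LOCAL_HEADER_SIGNATURE) {
      throw new IllegalArgumentException("no local file header at " + headerOffset);
    }
    // Both lengths are unsigned 16-bit values.
    int nameLength = view.getShort(headerOffset + 26) & 0xffff;
    int extraLength = view.getShort(headerOffset + 28) & 0xffff;
    return headerOffset + LOCAL_HEADER_FIXED_SIZE + nameLength + extraLength;
  }
}
```

Reading the two length fields is the extra access that a precomputed data offset avoids.
'/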

FE1 -- FE2

component "File entry ~__index__" as INDEX {
component "A list of keys along with their corresponding offsets and sizes." as INDEX_M
note right of INDEX_M
A list of pairs of long values.
Each pair consists of a key, which is the 64-bit XXH3 hash of an entry name,
and a value, which holds the entry's offset and size as two ints packed into a single long.
This list makes it possible to retrieve the data locations of all entries in a single bulk read operation.
It contains no file names or other unnecessary metadata.
end note
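/'
The packing of offset and size into one long, as described in the note above, could look like this.
This is only a sketch: which half of the long holds the offset, and the interleaved
key/value array layout used by find, are assumptions made for illustration.
The actual layout is defined by ImmutableZipFile. All names here are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class PackedLocation {
  // ASSUMPTION: offset in the high 32 bits, size in the low 32 bits.
  static long pack(int offset, int size) {
    return ((long) offset << 32) | (size & 0xffffffffL);
  }

  static int offset(long packed) {
    return (int) (packed >>> 32);
  }

  static int size(long packed) {
    return (int) packed;
  }

  // ASSUMPTION: pairs are stored interleaved as key0, value0, key1, value1, ...
  // A linear scan keeps the sketch short; returns -1 when the key is absent.
  static long find(long[] pairs, long keyHash) {
    for (int i = 0; i + 1 < pairs.length; i += 2) {
      if (pairs[i] == keyHash) {
        return pairs[i + 1];
      }
    }
    return -1L;
  }
}
```

Because every pair has a fixed width of two longs, the whole list can be copied out of the file in one read.
'/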

component "class package hashes" as INDEX_PC
note right of INDEX_PC
A list of long values, each the 64-bit XXH3 hash of a package name.
This list is not used by the ZipFile implementation but is consumed by the class loader.
It allows quickly determining whether a ZIP file can contain a given class.
While it provides little benefit for a single ZIP file, where a name lookup is a single map lookup,
it enables the clustering of multiple ZIP files.
This clustering helps avoid a linear search across all ZIP files in a classpath.
end note
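/'
The clustering idea from the note above can be sketched as a single map from package hash to the ZIP files that contain that package.
This is not the class loader's actual code. The hash below is a 64-bit FNV-1a STAND-IN,
used only because XXH3 is not part of the JDK, and the dotted class-name format and all names are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class PackageClusters {
  // STAND-IN for XXH3: plain 64-bit FNV-1a over the UTF-8 bytes.
  static long hash(String packageName) {
    long h = 0xcbf29ce484222325L;
    for (byte b : packageName.getBytes(StandardCharsets.UTF_8)) {
      h ^= (b & 0xff);
      h *= 0x100000001b3L;
    }
    return h;
  }

  // Merges the per-ZIP package hash lists into one map:
  // package hash -> ZIP files that contain this package.
  static Map<Long, List<String>> build(Map<String, long[]> packageHashesByZip) {
    Map<Long, List<String>> clusters = new HashMap<>();
    for (Map.Entry<String, long[]> entry : packageHashesByZip.entrySet()) {
      for (long h : entry.getValue()) {
        clusters.computeIfAbsent(h, k -> new ArrayList<>()).add(entry.getKey());
      }
    }
    return clusters;
  }

  // One map lookup instead of probing every ZIP file on the classpath.
  static List<String> candidates(Map<Long, List<String>> clusters, String className) {
    int dot = className.lastIndexOf(".");
    String packageName = dot < 0 ? "" : className.substring(0, dot);
    return clusters.getOrDefault(hash(packageName), Collections.emptyList());
  }
}
```

An empty result means that no ZIP file on the classpath lists the package, so none of them needs to be opened.
'/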

component "resource package hashes" as INDEX_PR
note right of INDEX_PR
The same concept applies to resource package hashes.
However, there are two separate sets of hashes, since there is no correlation
between class packages and resource packages.
end note

component names {
component "name lengths" as INDEX_NL
note right of INDEX_NL
A list of name lengths represented as shorts.
This list makes it possible to read all lengths in a single bulk read operation,
directly from native memory.
end note
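/'
The bulk read of name lengths mentioned in the note above can be sketched with a ShortBuffer view over a (possibly direct) ByteBuffer.
The little-endian byte order, the treatment of lengths as unsigned 16-bit values,
and the assumption that names are stored back to back without separators are illustrative guesses.
The names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class NameLengths {
  // Reads count lengths starting at the buffer's current position
  // with a single bulk get, then advances the position past them.
  // ASSUMPTION: values are stored in little-endian order.
  static short[] read(ByteBuffer index, int count) {
    short[] lengths = new short[count];
    index.duplicate().order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(lengths);
    index.position(index.position() + count * Short.BYTES);
    return lengths;
  }

  // ASSUMPTION: names are stored contiguously, so the byte offset of
  // name i within the names block is the sum of the preceding lengths.
  static int nameOffset(short[] lengths, int i) {
    int offset = 0;
    for (int k = 0; k < i; k++) {
      offset += lengths[k] & 0xffff;
    }
    return offset;
  }
}
```

With the lengths in hand, a single name can be decoded on demand without touching the other names.
'/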

component "names" as INDEX_NS
note right of INDEX_NS
List of strings.
end note

INDEX_NL -down- INDEX_NS
}

note bottom of names
Entry names. They are not loaded into memory when the ZipFile is opened;
instead, they are loaded only when requested.
This is useful, for instance, when you want to process entries based on their names,
such as finding entries by a specific prefix.
end note

INDEX_M -- INDEX_PC
INDEX_PC -- INDEX_PR
INDEX_PR -- names
}

note top of INDEX
The ZIP specification is not violated.
The index data is stored as a regular file entry.
end note

FE2 -- INDEX

component "Central directory" as CD
note right of CD
The index format version is stored in the 'File comment' field.
Only the latest format is supported.
If a ZIP file does not have a comment or the index version is not equal to the latest,
a fallback implementation is used that is capable of reading any ZIP file.
end note
INDEX -- CD

@enduml