Diff: rfc9841v1.txt - rfc9841.txt

	rfc9841v1.txt		rfc9841.txt

	Internet Engineering Task Force (IETF) J. Alakuijala		Internet Engineering Task Force (IETF) J. Alakuijala
	Request for Comments: 9841 T. Duong		Request for Comments: 9841 T. Duong
	Updates: 7932 E. Kliuchnikov		Updates: 7932 E. Kliuchnikov
	Category: Informational Z. Szabadka		Category: Informational Z. Szabadka
	ISSN: 2070-1721 L. Vandevenne, Ed.		ISSN: 2070-1721 L. Vandevenne, Ed.

	Google, Inc		Google, Inc.
	August 2025		September 2025

	Shared Brotli Compressed Data Format		Shared Brotli Compressed Data Format

	Abstract		Abstract

	This specification defines a data format for shared brotli		This specification defines a data format for shared brotli
	compression, which adds support for shared dictionaries, large		compression, which adds support for shared dictionaries, large
	window, and a container format to brotli (RFC 7932). Shared		window, and a container format to brotli (RFC 7932). Shared
	dictionaries and large window support allow significant compression		dictionaries and large window support allow significant compression

	gains compared to regular brotli. This document updates RFC 7932.		gains compared to regular brotli. This document specifies an
			extension to the method defined in RFC 7932.

	Status of This Memo		Status of This Memo

	This document is not an Internet Standards Track specification; it is		This document is not an Internet Standards Track specification; it is
	published for informational purposes.		published for informational purposes.

	This document is a product of the Internet Engineering Task Force		This document is a product of the Internet Engineering Task Force
	(IETF). It represents the consensus of the IETF community. It has		(IETF). It represents the consensus of the IETF community. It has
	received public review and has been approved for publication by the		received public review and has been approved for publication by the
	Internet Engineering Steering Group (IESG). Not all documents		Internet Engineering Steering Group (IESG). Not all documents

	skipping to change at line 163		skipping to change at line 164
	For this specification, a byte is exactly 8 bits, even on machines		For this specification, a byte is exactly 8 bits, even on machines
	that store a character on a number of bits different from eight.		that store a character on a number of bits different from eight.
	See below for the numbering of bits within a byte.		See below for the numbering of bits within a byte.

	String: A sequence of arbitrary bytes.		String: A sequence of arbitrary bytes.

	Bytes stored within a computer do not have a "bit order" since they		Bytes stored within a computer do not have a "bit order" since they
	are always treated as a unit. However, a byte considered as an		are always treated as a unit. However, a byte considered as an
	integer between 0 and 255 does have a most significant bit (MSB) and		integer between 0 and 255 does have a most significant bit (MSB) and
	least significant bit (LSB), and since we write numbers with the most		least significant bit (LSB), and since we write numbers with the most

	significant digit on the left, bytes with the MSB are also written on		significant digit on the left, we also write bytes with the MSB on
	the left. In the diagrams below, the bits of a byte are written so		the left. In the diagrams below, the bits of a byte are written so
	that bit 0 is the LSB, i.e., the bits are numbered as follows:		that bit 0 is the LSB, i.e., the bits are numbered as follows:

	+--------+		+--------+
	|76543210|		|76543210|
	+--------+		+--------+

	Within a computer, a number may occupy multiple bytes. All multi-		Within a computer, a number may occupy multiple bytes. All multi-
	byte numbers in the format described here are unsigned and stored		byte numbers in the format described here are unsigned and stored
	with the least significant byte first (at the lower memory address).		with the least significant byte first (at the lower memory address).
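
The byte-order rule above (unchanged between the two revisions) can be illustrated with a minimal sketch. The helper name read_le is hypothetical and does not come from either version of the document.

```python
def read_le(data: bytes) -> int:
    """Interpret `data` as an unsigned integer, least significant byte first."""
    value = 0
    for index, byte in enumerate(data):
        # Byte 0 contributes bits 0-7, byte 1 contributes bits 8-15, and so on.
        value |= byte << (8 * index)
    return value
```

For the two bytes 0x34, 0x12 this yields 0x1234, which matches Python's built-in int.from_bytes(data, "little").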

	skipping to change at line 264		skipping to change at line 265
	original dictionary in the custom dictionary.		original dictionary in the custom dictionary.

	If no shared dictionary is set, the decoder behaves the same as in		If no shared dictionary is set, the decoder behaves the same as in
	[RFC7932] on a brotli stream.		[RFC7932] on a brotli stream.

	If a shared dictionary is set, then it can set LZ77 dictionaries,		If a shared dictionary is set, then it can set LZ77 dictionaries,
	override static dictionary words, and/or override transforms.		override static dictionary words, and/or override transforms.

	3.1. Custom Static Dictionaries		3.1. Custom Static Dictionaries


	If a custom word list is set, then the following behavior of the RFC		If a custom word list is set, then the following behaviors of the
	7932 decoder [RFC7932] is overridden:		decoder defined in [RFC7932] are overridden:

	Instead of the Static Dictionary Data from Appendix A of		Instead of the Static Dictionary Data from Appendix A of
	[RFC7932], one or more word lists from the custom static		[RFC7932], one or more word lists from the custom static
	dictionary data are used.		dictionary data are used.

	Instead of NDBITS at the end of Appendix A of [RFC7932], a custom		Instead of NDBITS at the end of Appendix A of [RFC7932], a custom
	SIZE_BITS_BY_LENGTH per custom word list is used.		SIZE_BITS_BY_LENGTH per custom word list is used.

	The copy length for a static dictionary reference must be between		The copy length for a static dictionary reference must be between
	4 and 31 and may not be a value for which SIZE_BITS_BY_LENGTH of		4 and 31 and may not be a value for which SIZE_BITS_BY_LENGTH of
	this dictionary is 0.		this dictionary is 0.

	If a custom transforms list is set without context dependency, then		If a custom transforms list is set without context dependency, then

	the following behavior of the RFC 7932 decoder [RFC7932] is		the following behaviors of the decoder defined in [RFC7932] are
	overridden:		overridden:

	The "List of Word Transformations" from Appendix B of [RFC7932] is		The "List of Word Transformations" from Appendix B of [RFC7932] is
	overridden by one or more lists of custom prefixes, suffixes, and		overridden by one or more lists of custom prefixes, suffixes, and
	transform operations.		transform operations.

	The transform_id must be smaller than the number of transforms		The transform_id must be smaller than the number of transforms
	given in the custom transforms list.		given in the custom transforms list.

	If the dictionary is context dependent, it includes a lookup table of		If the dictionary is context dependent, it includes a lookup table of

	a 64-word list and transform list combinations. When resolving a		64 word list and transform list combinations. When resolving a
	static dictionary word, the decoder computes the literal Context ID		static dictionary word, the decoder computes the literal Context ID
	as described in Section 7.1 of [RFC7932]. The literal Context ID is		as described in Section 7.1 of [RFC7932]. The literal Context ID is
	used as the index in the lookup tables to select the word list and		used as the index in the lookup tables to select the word list and
	transforms to use. If the dictionary is not context dependent, this		transforms to use. If the dictionary is not context dependent, this
	ID is implicitly 0 instead.		ID is implicitly 0 instead.

	If a distance goes beyond the dictionary for the current ID and		If a distance goes beyond the dictionary for the current ID and

	multiple word/transform list combinations are defined, then a next		multiple word/transform list combinations are defined, then the next
	dictionary is used in the following order: if not context dependent,		dictionary is used in the following order:
	the same order as defined in the shared dictionary. If context
	dependent, the index matching the current context is used first, the		* If context dependent:
	same order as defined in the shared dictionary excluding the current
	context are used next.		- use the index matching the current context first, and then

			- use the same order as defined in the shared dictionary
			(excluding the current context) next.

			* If not context dependent:

			- use the same order as defined in the shared dictionary.
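
One possible reading of the reworded fallback order, sketched in Python. The function name and the convention of passing None for the non-context-dependent case are assumptions made for illustration; neither revision defines them.

```python
def dictionary_order(num_dictionaries, context_dictionary=None):
    """Return dictionary indices in the order they would be tried.

    context_dictionary is the index selected for the current literal
    Context ID, or None when the dictionary is not context dependent
    (hypothetical convention for this sketch).
    """
    declared = list(range(num_dictionaries))
    if context_dictionary is None:
        # Not context dependent: same order as defined in the shared dictionary.
        return declared
    # Context dependent: matching index first, then the declared order
    # with the current context's dictionary excluded.
    return [context_dictionary] + [d for d in declared if d != context_dictionary]
```

With four dictionaries and a context that selects index 2, the sketch tries 2, 0, 1, 3.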

	3.1.1. Transform Operations		3.1.1. Transform Operations

	A shared dictionary may include custom word transformations to		A shared dictionary may include custom word transformations to
	replace those specified in Section 8 and Appendix B of [RFC7932]. A		replace those specified in Section 8 and Appendix B of [RFC7932]. A

	transform consists of a possible prefix, a transform operation, for		transform consists of a possible prefix, a transform operation, a
	some operations a parameter, and a possible suffix. In the shared		parameter (for some operations), and a possible suffix. In the
	dictionary format, the transform operation is represented by a		shared dictionary format, the transform operation is represented by a
	numerical ID, which is listed in the table below.		numerical ID, which is listed in the table below.

	+====+===========================+		+====+===========================+
	| ID | Operation                 |		| ID | Operation                 |
	+====+===========================+		+====+===========================+
	| 0  | Identity                  |		| 0  | Identity                  |
	+----+---------------------------+		+----+---------------------------+
	| 1  | OmitLast1                 |		| 1  | OmitLast1                 |
	+----+---------------------------+		+----+---------------------------+
	| 2  | OmitLast2                 |		| 2  | OmitLast2                 |

	skipping to change at line 464		skipping to change at line 472
	4. Varint Encoding		4. Varint Encoding

	A varint is encoded in base 128 in one or more bytes as follows:		A varint is encoded in base 128 in one or more bytes as follows:

	+--------+--------+             +--------+		+--------+--------+             +--------+
	|1xxxxxxx|1xxxxxxx| {0-8 times} |0xxxxxxx|		|1xxxxxxx|1xxxxxxx| {0-8 times} |0xxxxxxx|
	+--------+--------+             +--------+		+--------+--------+             +--------+

	where the "x" bits of the first byte are the LSBs of the value and		where the "x" bits of the first byte are the LSBs of the value and
	the "x" bits of the last byte are the MSBs of the value. The last		the "x" bits of the last byte are the MSBs of the value. The last

	byte must have its MSB set to 0, all other bytes to 1 to indicate		byte must have its MSB set to 0 and all other bytes must have their
	there is a next byte.		MSBs set to 1 to indicate there is a next byte.

	The maximum allowed amount of bits to read is 63 bits; if the 9th		The maximum allowed amount of bits to read is 63 bits; if the 9th
	byte is present and has its MSB set, then the stream must be		byte is present and has its MSB set, then the stream must be
	considered as invalid.		considered as invalid.
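
The varint rules quoted above can be sketched as a small decoder. This is an illustration only; the name decode_varint and the choice to raise ValueError on invalid input are assumptions, not something either revision prescribes.

```python
def decode_varint(data: bytes, pos: int = 0):
    """Decode one base-128 varint starting at `pos`.

    Returns (value, next_pos). At most 9 bytes (63 bits) are read.
    """
    value = 0
    for index in range(9):
        if pos + index >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos + index]
        # The first byte carries the least significant 7 bits.
        value |= (byte & 0x7F) << (7 * index)
        if (byte & 0x80) == 0:
            # MSB clear marks the last byte.
            return value, pos + index + 1
    # The 9th byte still had its MSB set: the stream is invalid.
    raise ValueError("varint exceeds 9 bytes")
```

For example, the bytes 0xAC, 0x02 decode to 300: 0x2C from the first byte plus 2 << 7 from the second.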

	5. Shared Dictionary Stream		5. Shared Dictionary Stream

	The shared dictionary stream encodes a custom dictionary for brotli,		The shared dictionary stream encodes a custom dictionary for brotli,
	including custom words and/or custom transformations. A shared		including custom words and/or custom transformations. A shared
	dictionary may appear as a standalone or as contents of a resource in		dictionary may appear as a standalone or as contents of a resource in
	a framing format container.		a framing format container.

	A compliant shared brotli dictionary stream must have the following		A compliant shared brotli dictionary stream must have the following
	format:		format:


	2 bytes: File signature, in hexadecimal the bytes 91, 0.		2 bytes: File signature in hexadecimal format (bytes 91 and 0).


	varint: LZ77_DICTIONARY_LENGTH. The number of bytes for an LZ77		varint: LZ77_DICTIONARY_LENGTH. The number of bytes for an LZ77
	dictionary or 0 if there is none. The maximum allowed value is		dictionary, or 0 if there is none. The maximum allowed value is
	the maximum possible sliding window size of brotli or large window		the maximum possible sliding window size of brotli or large window
	brotli.		brotli.

	LZ77_DICTIONARY_LENGTH bytes: Contents of the LZ77 dictionary.		LZ77_DICTIONARY_LENGTH bytes: Contents of the LZ77 dictionary.


	1 byte: NUM_CUSTOM_WORD_LISTS. May have a value of 0 to 64.		1 byte: NUM_CUSTOM_WORD_LISTS. May have a value in range 0 to 64.

	NUM_CUSTOM_WORD_LISTS times a word list with the following format		NUM_CUSTOM_WORD_LISTS times a word list with the following format
	for each word list:		for each word list:

	28 bytes: SIZE_BITS_BY_LENGTH. An array of 28 unsigned 8-bit		28 bytes: SIZE_BITS_BY_LENGTH. An array of 28 unsigned 8-bit
	integers, indexed by word lengths 4 to 31. The value		integers, indexed by word lengths 4 to 31. The value
	represents log2(number of words of this length), with the		represents log2(number of words of this length), with the
	exception of 0 meaning 0 words of this length. The max allowed		exception of 0 meaning 0 words of this length. The max allowed
	length value is 15 bits. OFFSETS_BY_LENGTH is computed from		length value is 15 bits. OFFSETS_BY_LENGTH is computed from
	this as OFFSETS_BY_LENGTH[i + 1] = OFFSETS_BY_LENGTH[i] +		this as OFFSETS_BY_LENGTH[i + 1] = OFFSETS_BY_LENGTH[i] +
	(SIZE_BITS_BY_LENGTH[i] ? (i << SIZE_BITS_BY_LENGTH[i]) : 0).		(SIZE_BITS_BY_LENGTH[i] ? (i << SIZE_BITS_BY_LENGTH[i]) : 0).

	N bytes: Words dictionary data, where N is OFFSETS_BY_LENGTH[31]		N bytes: Words dictionary data, where N is OFFSETS_BY_LENGTH[31]
	+ (SIZE_BITS_BY_LENGTH[31] ? (31 << SIZE_BITS_BY_LENGTH[31]) :		+ (SIZE_BITS_BY_LENGTH[31] ? (31 << SIZE_BITS_BY_LENGTH[31]) :
	0), with all the words of shortest length first, then all words		0), with all the words of shortest length first, then all words
	of the next length, and so on, where there are either 0 or a		of the next length, and so on, where there are either 0 or a
	positive power of two number of words for each length.		positive power of two number of words for each length.
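
The OFFSETS_BY_LENGTH recurrence can be worked through with a short sketch. It assumes that the 28 stored bytes map to word lengths 4 through 31 in order, that the offset for length 4 is 0, and that "15 bits" bounds each SIZE_BITS_BY_LENGTH value. These are readings of the text above, and the function name is hypothetical.

```python
def word_list_layout(size_bits: bytes):
    """Return (offsets, total_bytes) for one custom word list.

    offsets maps each word length (4..31) to the byte offset where the
    words of that length start; total_bytes is N, the size of the words data.
    """
    if len(size_bits) != 28:
        raise ValueError("SIZE_BITS_BY_LENGTH must be 28 bytes")
    offsets = {4: 0}  # assumption: the shortest length starts at offset 0
    for length in range(4, 32):
        bits = size_bits[length - 4]  # assumption: index 0 holds length 4
        if bits > 15:
            raise ValueError("size bits above 15")
        # bits == 0 means no words; otherwise 2**bits words of `length` bytes.
        block = (length << bits) if bits else 0
        offsets[length + 1] = offsets[length] + block
    total = offsets.pop(32)  # equals OFFSETS_BY_LENGTH[31] plus the length-31 block
    return offsets, total
```

With 4 words of length 4 and 8 words of length 5, the data occupies 16 + 40 = 56 bytes and the length-5 words start at offset 16.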


	1 byte: NUM_CUSTOM_TRANSFORM_LISTS. May have a value of 0 to 64.		1 byte: NUM_CUSTOM_TRANSFORM_LISTS. May have a value in range 0 to
			64.

	NUM_CUSTOM_TRANSFORM_LISTS times a transform list with the		NUM_CUSTOM_TRANSFORM_LISTS times a transform list with the
	following format for each transform list:		following format for each transform list:

	2 bytes: PREFIX_SUFFIX_LENGTH. The length of prefix/suffix data.		2 bytes: PREFIX_SUFFIX_LENGTH. The length of prefix/suffix data.
	Must be at least 1 because the list must always end with a		Must be at least 1 because the list must always end with a
	zero-length stringlet even if it is empty.		zero-length stringlet even if it is empty.

	NUM_PREFIX_SUFFIX times: Prefix/suffix stringlet.		NUM_PREFIX_SUFFIX times: Prefix/suffix stringlet.
	NUM_PREFIX_SUFFIX is the number of stringlets parsed and must		NUM_PREFIX_SUFFIX is the number of stringlets parsed and must

	skipping to change at line 533		skipping to change at line 542
	for the last (terminating) entry of the transform list. For		for the last (terminating) entry of the transform list. For
	other entries, STRING_LENGTH must be in range 1..255. The 0		other entries, STRING_LENGTH must be in range 1..255. The 0
	entry must be present and must be the last byte of the		entry must be present and must be the last byte of the
	PREFIX_SUFFIX_LENGTH bytes of prefix/suffix data, else the		PREFIX_SUFFIX_LENGTH bytes of prefix/suffix data, else the
	stream must be rejected as invalid.		stream must be rejected as invalid.

	STRING_LENGTH bytes: Contents of the prefix/suffix.		STRING_LENGTH bytes: Contents of the prefix/suffix.

	1 byte: NTRANSFORMS. Number of transformation triplets.		1 byte: NTRANSFORMS. Number of transformation triplets.


	NTRANSFORMS times: Data for each transform:		NTRANSFORMS times the data for each transform listed below:

	1 byte: Index of prefix in prefix/suffix data; must be less		1 byte: Index of prefix in prefix/suffix data; must be less
	than NUM_PREFIX_SUFFIX.		than NUM_PREFIX_SUFFIX.

	1 byte: Index of suffix in prefix/suffix data; must be less		1 byte: Index of suffix in prefix/suffix data; must be less
	than NUM_PREFIX_SUFFIX.		than NUM_PREFIX_SUFFIX.

	1 byte: Operation index; must be an index in the table of		1 byte: Operation index; must be an index in the table of
	operations listed in Section 3.1.1.		operations listed in Section 3.1.1.

	If and only if at least one transform has operation index		If and only if at least one transform has operation index

	ShiftFirst or ShiftAll:		ShiftFirst or ShiftAll, then NTRANSFORMS times the following:

	NTRANSFORMS times:


	2 bytes: Parameters for the transform. If the transform		2 bytes: Parameters for the transform. If the transform does
	does not have type ShiftFirst or ShiftAll, the value must		not have type ShiftFirst or ShiftAll, the value must be 0.
	be 0. ShiftFirst and ShiftAll interpret these bytes as		ShiftFirst and ShiftAll interpret these bytes as an unsigned
	an unsigned 16-bit integer.		16-bit integer.
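
A rough sketch of how the prefix/suffix data of a transform list might be split into stringlets. Part of the field description is elided in this diff, so the sketch assumes each stringlet is a 1-byte STRING_LENGTH followed by that many bytes. It returns only the non-empty stringlets, because the visible text does not show whether the terminating entry counts toward NUM_PREFIX_SUFFIX. The function name is hypothetical.

```python
def parse_stringlets(data: bytes):
    """Split PREFIX_SUFFIX_LENGTH bytes of prefix/suffix data into stringlets."""
    stringlets = []
    pos = 0
    while True:
        if pos >= len(data):
            raise ValueError("missing zero-length terminator")
        length = data[pos]  # assumed 1-byte STRING_LENGTH
        pos += 1
        if length == 0:
            # The 0 entry must be the last byte of the data.
            if pos != len(data):
                raise ValueError("terminator is not the last byte")
            return stringlets
        if pos + length > len(data):
            raise ValueError("stringlet overruns the data")
        stringlets.append(data[pos:pos + length])
        pos += length
```

An empty list is therefore the single byte 0x00, which is why PREFIX_SUFFIX_LENGTH must be at least 1.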

	If NUM_CUSTOM_WORD_LISTS > 0 or NUM_CUSTOM_TRANSFORM_LISTS > 0		If NUM_CUSTOM_WORD_LISTS > 0 or NUM_CUSTOM_TRANSFORM_LISTS > 0
	(else implicitly NUM_DICTIONARIES is 1 and points to the brotli		(else implicitly NUM_DICTIONARIES is 1 and points to the brotli
	built-in and there is no context map):		built-in and there is no context map):


	1 byte: NUM_DICTIONARIES. May have value 1 to 64. Each		1 byte: NUM_DICTIONARIES. May have a value in range 1 to 64.
	dictionary is a combination of a word list and a transform		Each dictionary is a combination of a word list and a transform
	list. Each next dictionary is used when the distance goes		list. Each next dictionary is used when the distance goes
	beyond the previous. If a CONTEXT_MAP is enabled, then the		beyond the previous. If a CONTEXT_MAP is enabled, then the
	dictionary matching the context is moved to the front in the		dictionary matching the context is moved to the front in the
	order for this context.		order for this context.


	NUM_DICTIONARIES times: The DICTIONARY_MAP:		NUM_DICTIONARIES times the DICTIONARY_MAP, which contains:

	1 byte: Index into a custom word list or value		1 byte: Index into a custom word list or value
	NUM_CUSTOM_WORD_LISTS to indicate using the brotli [RFC7932]		NUM_CUSTOM_WORD_LISTS to indicate using the brotli [RFC7932]
	built-in default word list.		built-in default word list.

	1 byte: Index into a custom transform list or value		1 byte: Index into a custom transform list or value
	NUM_CUSTOM_TRANSFORM_LISTS to indicate using the brotli		NUM_CUSTOM_TRANSFORM_LISTS to indicate using the brotli
	[RFC7932] built-in default transform list.		[RFC7932] built-in default transform list.

	1 byte: CONTEXT_ENABLED. If 0, there is no context map. If 1, a		1 byte: CONTEXT_ENABLED. If 0, there is no context map. If 1, a

	skipping to change at line 592		skipping to change at line 599
	first dictionary to use for this context.		first dictionary to use for this context.

	6. Large Window Brotli Compressed Data Stream		6. Large Window Brotli Compressed Data Stream

	Large window brotli allows a sliding window beyond the 24-bit maximum		Large window brotli allows a sliding window beyond the 24-bit maximum
	of regular brotli [RFC7932].		of regular brotli [RFC7932].

	The compressed data stream is backwards compatible to brotli		The compressed data stream is backwards compatible to brotli
	[RFC7932] and may optionally have the following differences:		[RFC7932] and may optionally have the following differences:


	Encoding of WBITS in the stream header: The following new pattern of		In the encoding of WBITS in the stream header, the following new
	14 bits is supported:		pattern of 14 bits is supported:

	8 bits: Value 00010001 to indicate a large window brotli stream.		8 bits: Value 00010001 to indicate a large window brotli stream.

	6 bits: WBITS. Must have value in range 10 to 62.		6 bits: WBITS. Must have value in range 10 to 62.

	Distance alphabet: If the stream is a large window brotli stream,		Distance alphabet: If the stream is a large window brotli stream,
	the maximum number of extra bits is 62 and the theoretical maximum		the maximum number of extra bits is 62 and the theoretical maximum
	size of the distance alphabet is (16 + NDIRECT + (124 <<		size of the distance alphabet is (16 + NDIRECT + (124 <<
	NPOSTFIX)). This overrides the value for the distance alphabet		NPOSTFIX)). This overrides the value for the distance alphabet
	size given in Section 3.3 of [RFC7932] and affects the number of		size given in Section 3.3 of [RFC7932] and affects the number of

	skipping to change at line 638		skipping to change at line 645
	* The stream may have the format of regular brotli [RFC7932] or the		* The stream may have the format of regular brotli [RFC7932] or the
	format of large window brotli as described in Section 6.		format of large window brotli as described in Section 6.

	8. Shared Brotli Framing Format Stream		8. Shared Brotli Framing Format Stream

	A compliant shared brotli framing format stream has the format		A compliant shared brotli framing format stream has the format
	described below.		described below.

	8.1. Main Format		8.1. Main Format


	4 bytes: File signature, in hexadecimal the bytes 0x91, 0x0a, 0x42,		4 bytes: File signature in hexadecimal format (bytes 0x91, 0x0a,
	0x52. The first byte contains the invalid WBITS combination for		0x42, and 0x52). The first byte contains the invalid WBITS
	brotli [RFC7932] and large window brotli.		combination for brotli [RFC7932] and large window brotli.

	1 byte: Container flags that are 8 bits and have the following		1 byte: Container flags that are 8 bits and have the following
	meanings:		meanings:


	bit 0 and 1: Version indicator that must be b'00. Otherwise, the		bits 0 and 1: Version indicator that must be b'00. Otherwise,
	decoder must reject the data stream as invalid.		the decoder must reject the data stream as invalid.

	bit 2: If 0, the file contains no final footer, may not contain		bit 2: If 0, the file contains no final footer, may not contain
	any metadata chunks, may not contain a central directory, and		any metadata chunks, may not contain a central directory, and
	may encode only a single resource (using one or more data		may encode only a single resource (using one or more data
	chunks). If 1, the file may contain one or more resources,		chunks). If 1, the file may contain one or more resources,
	metadata, and a central directory, and it must contain a final		metadata, and a central directory, and it must contain a final
	footer.		footer.

	multiple times: A chunk, each with the format specified in		multiple times: A chunk, each with the format specified in
	Section 8.2.		Section 8.2.
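
The start of the main format can be checked with a few lines of code. The sketch below covers only the fields visible in this excerpt (the signature and flag bits 0 to 2) and does not interpret the remaining flag bits. The names are hypothetical.

```python
FRAMING_SIGNATURE = bytes([0x91, 0x0A, 0x42, 0x52])

def parse_container_header(data: bytes):
    """Return (multi_resource, chunks_offset) for a framing format stream."""
    if len(data) < 5:
        raise ValueError("truncated header")
    if data[:4] != FRAMING_SIGNATURE:
        raise ValueError("bad file signature")
    flags = data[4]
    if flags & 0b11:
        # Bits 0 and 1 are the version indicator and must be b'00.
        raise ValueError("unsupported version")
    # Bit 2: 1 allows multiple resources, metadata, and a central
    # directory, and requires a final footer.
    multi_resource = bool(flags & 0b100)
    return multi_resource, 5
```

The returned offset is where the first chunk would begin under these assumptions.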

	skipping to change at line 707		skipping to change at line 714
	can be 1 serialized dictionary and 15 prefix dictionaries		can be 1 serialized dictionary and 15 prefix dictionaries
	maximum (a serialized dictionary may already contain one of		maximum (a serialized dictionary may already contain one of
	those). Circular references are not allowed (any dictionary		those). Circular references are not allowed (any dictionary
	reference that directly or indirectly uses this chunk itself as		reference that directly or indirectly uses this chunk itself as
	dictionary).		dictionary).

	Per dictionary reference:		Per dictionary reference:

	1 byte: Flags:		1 byte: Flags:


	bit 0 and 1: Dictionary source:		bits 0 and 1: Dictionary source:

	00: Internal dictionary reference to a full resource by		00: Internal dictionary reference to a full resource by
	pointer, which can span one or more chunks. Must		pointer, which can span one or more chunks. Must
	point to a full data chunk or a first partial data		point to a full data chunk or a first partial data
	chunk.		chunk.

	01: Internal dictionary reference to single chunk		01: Internal dictionary reference to single chunk
	contents by pointer. May point to any chunk with		contents by pointer. May point to any chunk with
	content (data or metadata). If a partial data		content (data or metadata). If a partial data
	chunk, only this part is the dictionary. In this		chunk, only this part is the dictionary. In this

	skipping to change at line 731		skipping to change at line 738
	10: Reference to a dictionary by hash code of a		10: Reference to a dictionary by hash code of a
	resource. The dictionary can come from an external		resource. The dictionary can come from an external
	source, such as a different container. The user of		source, such as a different container. The user of
	the decoder must be able to provide the dictionary		the decoder must be able to provide the dictionary
	contents given its hash code (even if it comes from		contents given its hash code (even if it comes from
	this container itself) or treat it as an error when		this container itself) or treat it as an error when
	the user does not have it available.		the user does not have it available.

	11: Invalid bit combination		11: Invalid bit combination


	bit 2 and 3: Dictionary type:		bits 2 and 3: Dictionary type:

	00: Prefix dictionary, set in front of the sliding		00: Prefix dictionary, set in front of the sliding
	window		window

	01: Serialized dictionary in the shared brotli format as		01: Serialized dictionary in the shared brotli format as
	specified in Section 5.		specified in Section 5.

	10: Invalid bit combination		10: Invalid bit combination

	11: Invalid bit combination		11: Invalid bit combination


	bit 4-7: Must be 0		bits 4-7: Must be 0

	If hash-based:		If hash-based:

	1 byte: Type of hash used. Only supported value: 3,		1 byte: Type of hash used. Only supported value: 3,
	indicating 256-bit HighwayHash [HWYHASH].		indicating 256-bit HighwayHash [HWYHASH].

	32 bytes: 256-bit HighwayHash checksum to refer to		32 bytes: 256-bit HighwayHash checksum to refer to
	dictionary.		dictionary.

	If pointer based: Varint-encoded pointer to its chunk in this		If pointer based: Varint-encoded pointer to its chunk in this
	container. The chunk must come in the container earlier		container. The chunk must come in the container earlier
	than the current chunk.		than the current chunk.
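
The flags byte of a dictionary reference can be unpacked as sketched below. The string labels are hypothetical names chosen for readability; only the bit layout comes from the text above.

```python
DICTIONARY_SOURCES = {
    0b00: "internal_resource",  # pointer to a full resource
    0b01: "internal_chunk",     # pointer to a single chunk's contents
    0b10: "hash",               # reference by hash code
}
DICTIONARY_TYPES = {
    0b00: "prefix",      # set in front of the sliding window
    0b01: "serialized",  # shared brotli dictionary stream (Section 5)
}

def parse_dictionary_flags(flags: int):
    """Return (source, type) labels for a dictionary reference flags byte."""
    if flags >> 4:
        raise ValueError("bits 4-7 must be 0")
    source = flags & 0b11
    kind = (flags >> 2) & 0b11
    if source not in DICTIONARY_SOURCES:
        raise ValueError("invalid dictionary source bits")
    if kind not in DICTIONARY_TYPES:
        raise ValueError("invalid dictionary type bits")
    return DICTIONARY_SOURCES[source], DICTIONARY_TYPES[kind]
```

A "hash" source would be followed by the hash type byte and the 32-byte checksum; the other two sources would be followed by a varint pointer.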

	X bytes: Extra header bytes, depending on CHUNK_TYPE. If present,		X bytes: Extra header bytes, depending on CHUNK_TYPE. If present,
	they are specified in the subsequent sections.		they are specified in the subsequent sections.


	remaining bytes: The chunk contents. The uncompressed data in		remaining bytes: The chunk contents. The uncompressed data in the
	the chunk content depends on CHUNK_TYPE and is specified in the		chunk content depends on CHUNK_TYPE and is specified in the
	subsequent sections. The compressed data has following format		subsequent sections. The compressed data has following format
	depending on CODEC:		depending on CODEC:


	* uncompressed: The raw bytes.		* uncompressed: The raw bytes.


	* If "keep decoder", the continuation of the compressed stream		* If "keep decoder", the continuation of the compressed stream
	that was interrupted at the end of the previous chunk. The		that was interrupted at the end of the previous chunk. The
	decoder from the previous chunk must be used and its state		decoder from the previous chunk must be used and its state it
	it had at the end of the previous chunk must be kept at the		had at the end of the previous chunk must be kept at the start
	start of the decoding of this chunk.		of the decoding of this chunk.


	* brotli: The bytes are in brotli format [RFC7932].		* brotli: The bytes are in brotli format [RFC7932].


	* shared brotli: The bytes are in the shared brotli format		* shared brotli: The bytes are in the shared brotli format
	specified in Section 7.		specified in Section 7.

	8.3. Metadata Format		8.3. Metadata Format

	All the metadata chunk types use the following format for the		All the metadata chunk types use the following format for the
	uncompressed content:		uncompressed content:

	Per field:		Per field:
	2 bytes: Code to identify this metadata field. This must be two		2 bytes: Code to identify this metadata field. This must be two
	lowercase or two uppercase alpha ASCII characters. If the		lowercase or two uppercase alpha ASCII characters. If the
	decoder encounters a lowercase field that it does not recognize		decoder encounters a lowercase field that it does not recognize

	skipping to change at line 828		skipping to change at line 835

	This chunk contains metadata that applies to the resource whose		This chunk contains metadata that applies to the resource whose
	beginning is encoded in the subsequent data chunk or first partial		beginning is encoded in the subsequent data chunk or first partial
	data chunk.		data chunk.

	The contents of this chunk follows the format described in		The contents of this chunk follows the format described in
	Section 8.3.		Section 8.3.

	The following field types are recognized:		The following field types are recognized:


	id: Name field. May appear 0 or 1 times. Has the following format:		id (N bytes): Name field. May appear 0 or 1 times. Has the
			following format: name in UTF-8 encoding, length determined by the
	N bytes: Name in UTF-8 encoding, length determined by the field		field length. Treated generically but may be used as a filename.
	length. Treated generically but may be used as a filename. If		If used as a filename, forward slashes '/' should be used as
	used as a filename, forward slashes '/' should be used as		directory separators, relative paths should be used, and filenames
	directory separators, relative paths should be used, and		ending in a slash with 0-length content in the matching data chunk
	filenames ending in a slash with 0-length content in the		should be treated as an empty directory.
	matching data chunk should be treated as an empty directory.

	mt: Modification type. May appear 0 or 1 times. Has the following
	format:


	8 bytes: Microseconds since epoch, as a little-endian, signed		mt (8 bytes): Modification type. May appear 0 or 1 times. Has the
	two's complement 64-bit integer.		following format: contains microseconds since epoch, as a little-
			endian, signed two's complement 64-bit integer.

	custom user field: Any two uppercase ASCII characters.		custom user field: Any two uppercase ASCII characters.
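
Decoding the mt field might look like the sketch below. The text says only "epoch"; the sketch assumes the Unix epoch (1970-01-01 UTC), which is not stated in this excerpt. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "epoch" means the Unix epoch in UTC.
ASSUMED_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def decode_mt(field: bytes) -> datetime:
    """Decode the 8-byte mt field into a timezone-aware datetime."""
    if len(field) != 8:
        raise ValueError("mt field must be 8 bytes")
    # Little-endian, signed two's complement 64-bit integer.
    micros = int.from_bytes(field, "little", signed=True)
    return ASSUMED_EPOCH + timedelta(microseconds=micros)
```

Because the integer is signed, values before the epoch are representable. Very large magnitudes fall outside the range of Python's datetime and would raise OverflowError in this sketch.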

	8.4.3. Data Chunk (Type 2)		8.4.3. Data Chunk (Type 2)

	A data chunk contains the actual data of a resource.		A data chunk contains the actual data of a resource.

	This chunk has the following extra header bytes:		This chunk has the following extra header bytes:

	1 byte: Flags:		1 byte: Flags:

	bit 0: If true, indicates this is not a resource that should be		bit 0: If true, indicates this is not a resource that should be
	output implicitly as part of extracting resources from this		output implicitly as part of extracting resources from this
	container. Instead, it may be referred to only explicitly,		container. Instead, it may be referred to only explicitly,
	e.g., as a dictionary reference by hash code or offset. This		e.g., as a dictionary reference by hash code or offset. This
	flag should be set for data used as dictionary to improve		flag should be set for data used as dictionary to improve
	compression of actual resources.		compression of actual resources.


	bit 1: If true, hash code is given		bit 1: If true, hash code is given.

	bits 2-7: Must be zero.		bits 2-7: Must be zero.

	If hash code is given:		If hash code is given:

	1 byte: Type of hash used. Only supported value: 3, indicating		1 byte: Type of hash used. Only supported value: 3, indicating
	256-bit HighwayHash [HWYHASH].		256-bit HighwayHash [HWYHASH].

	32 bytes: 256-bit HighwayHash checksum of the uncompressed data.		32 bytes: 256-bit HighwayHash checksum of the uncompressed data.
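
The extra header bytes listed so far can be read as sketched below. The sketch handles only the fields visible in this excerpt (the flags byte and the optional hash) and does not verify the checksum. The function name and return shape are assumptions.

```python
def parse_data_chunk_extra_header(data: bytes, pos: int = 0):
    """Return (explicit_only, checksum, next_pos) for a data chunk header."""
    if pos >= len(data):
        raise ValueError("truncated flags byte")
    flags = data[pos]
    pos += 1
    if flags >> 2:
        raise ValueError("bits 2-7 must be zero")
    explicit_only = bool(flags & 0b01)  # bit 0: not output implicitly
    has_hash = bool(flags & 0b10)       # bit 1: hash code is given
    checksum = None
    if has_hash:
        if pos + 33 > len(data):
            raise ValueError("truncated hash")
        if data[pos] != 3:
            # 3 (256-bit HighwayHash) is the only supported hash type.
            raise ValueError("unsupported hash type")
        checksum = data[pos + 1:pos + 33]
        pos += 33
    return explicit_only, checksum, pos
```

A dictionary-only resource with a checksum would therefore start with flags 0x03, hash type 0x03, and 32 checksum bytes.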


	skipping to change at line 1003		skipping to change at line 1007
	8.4.10. Central Directory Chunk (Type 9)		8.4.10. Central Directory Chunk (Type 9)

	The central directory chunk along with the repeat metadata chunks		The central directory chunk along with the repeat metadata chunks
	allow quickly finding and listing compressed resources in the		allow quickly finding and listing compressed resources in the
	container file.		container file.

	The central directory chunk is always uncompressed and does not have		The central directory chunk is always uncompressed and does not have
	the codec byte. It instead has the following format:		the codec byte. It instead has the following format:

	varint: Pointer into the file where the repeat metadata chunks are		varint: Pointer into the file where the repeat metadata chunks are

	located or 0 if they are not present per chunk listed:		located or 0 if they are not present.

			per chunk listed:

	varint: Pointer into the file where this chunk begins.		varint: Pointer into the file where this chunk begins.

	varint: Number of header bytes N used below.		varint: Number of header bytes N used below.

	N bytes: Copy of all the header bytes of the pointed at chunk,		N bytes: Copy of all the header bytes of the pointed at chunk,
	including total size, chunk type byte, codec, uncompressed		including total size, chunk type byte, codec, uncompressed
	size, dictionary references, and X extra header bytes. The		size, dictionary references, and X extra header bytes. The
	content is not repeated here.		content is not repeated here.

	The last listed chunk is reached when the end of the contents of the		The last listed chunk is reached when the end of the contents of the
	central directory are reached. If the end does not match the last		central directory are reached. If the end does not match the last
	byte of the central directory, the decoder must reject the data		byte of the central directory, the decoder must reject the data
	stream as invalid.		stream as invalid.

	If present, the central directory must list all data and metadata		If present, the central directory must list all data and metadata
	chunks of all types.		chunks of all types.
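
The central directory layout above, including the rule that the last entry must end exactly at the end of the contents, can be sketched as below. The varint encoding is defined elsewhere in the specification and is not part of this excerpt; the sketch assumes a base-128 encoding with the least significant group first and the high bit of each byte as a continuation flag. The names read_varint and parse_central_directory are hypothetical.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read one varint at pos; return (value, next position).

    Assumption: 7 payload bits per byte, least significant group first,
    high bit set when more bytes follow.
    """
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_central_directory(contents: bytes) -> Tuple[int, List[Tuple[int, bytes]]]:
    """Return (repeat metadata pointer, [(chunk pointer, header bytes), ...])."""
    repeat_metadata_ptr, pos = read_varint(contents, 0)
    entries: List[Tuple[int, bytes]] = []
    while pos < len(contents):
        chunk_ptr, pos = read_varint(contents, pos)
        n, pos = read_varint(contents, pos)
        # An entry that runs past the end means the end of the last entry does
        # not match the end of the central directory: reject the stream.
        if pos + n > len(contents):
            raise ValueError("entry does not end at the end of the central directory")
        entries.append((chunk_ptr, contents[pos : pos + n]))
        pos += n
    return repeat_metadata_ptr, entries
```

The copied header bytes of each entry are returned unparsed here. A real decoder would feed them to the same chunk header parser it uses for the chunks themselves.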

	8.4.11. Final Footer Chunk (Type 10)		8.4.11. Final Footer Chunk (Type 10)


	The final footer chunk closes the file and is only present if in the		The final footer chunk closes the file and is only present if bit 2
	initial container header flags bit 2 was set.		of the initial container flags was set.

	This chunk has the following content, which is always uncompressed:		This chunk has the following content, which is always uncompressed:

	reversed varint: Size of this entire framing format file, including		reversed varint: Size of this entire framing format file, including
	these bytes themselves, or 0 if this size is not given.		these bytes themselves, or 0 if this size is not given.

	reversed varint: Pointer to the start of the central directory, or 0		reversed varint: Pointer to the start of the central directory, or 0
	if there is none.		if there is none.

	A reversed varint has the same format as a varint but its bytes are		A reversed varint has the same format as a varint but its bytes are

	skipping to change at line 1092 ¶		skipping to change at line 1098 ¶
	The dictionary must be treated with the same security precautions as		The dictionary must be treated with the same security precautions as
	the content because a change to the dictionary can result in a change		the content because a change to the dictionary can result in a change
	to the decompressed content.		to the decompressed content.

	The CRIME attack [CRIME] shows that it's a bad idea to compress data		The CRIME attack [CRIME] shows that it's a bad idea to compress data
	from mixed (e.g., public and private) sources -- the data sources		from mixed (e.g., public and private) sources -- the data sources
	include not only the compressed data but also the dictionaries. For		include not only the compressed data but also the dictionaries. For
	example, if you compress secret cookies using a public-data-only		example, if you compress secret cookies using a public-data-only
	dictionary, you still leak information about the cookies.		dictionary, you still leak information about the cookies.


	Not only can the dictionary reveal information about the compressed		The dictionary can reveal information about the compressed data and
	data, but vice versa; data compressed with the dictionary can reveal		vice versa. That is, data compressed with the dictionary can reveal
	the contents of the dictionary when an adversary can control parts of		contents of the dictionary when an adversary can control parts of the
	data to compress and see the compressed size. On the other hand, if		data to compress and see the compressed size. On the other hand, if
	the adversary can control the dictionary, the adversary can learn		the adversary can control the dictionary, the adversary can learn
	information about the compressed data.		information about the compressed data.

	The most robust defense against CRIME is not to compress private		The most robust defense against CRIME is not to compress private
	data, e.g., sensitive headers like cookies or any content with		data, e.g., sensitive headers like cookies or any content with
	personally identifiable information (PII). The challenge has been to		personally identifiable information (PII). The challenge has been to
	identify secrets within a vast amount of data to be compressed.		identify secrets within a vast amount of data to be compressed.
	Cloudflare uses a regular expression [CLOUDFLARE]. Another idea is		Cloudflare uses a regular expression [CLOUDFLARE]. Another idea is
	to extend existing web template systems (e.g., Soy [SOY]) to allow		to extend existing web template systems (e.g., Soy [SOY]) to allow

	skipping to change at line 1173 ¶		skipping to change at line 1179 ¶
	[CRIME] CVE Program, "CVE-2012-4929",		[CRIME] CVE Program, "CVE-2012-4929",
	<https://www.cve.org/CVERecord?id=CVE-2012-4929>.		<https://www.cve.org/CVERecord?id=CVE-2012-4929>.

	[LZ77] Ziv, J. and A. Lempel, "A Universal Algorithm for		[LZ77] Ziv, J. and A. Lempel, "A Universal Algorithm for
	Sequential Data Compression", IEEE Transactions on		Sequential Data Compression", IEEE Transactions on
	Information Theory, vol. 23, no. 3, pp. 337-343,		Information Theory, vol. 23, no. 3, pp. 337-343,
	DOI 10.1109/TIT.1977.1055714, May 1977,		DOI 10.1109/TIT.1977.1055714, May 1977,
	<https://doi.org/10.1109/TIT.1977.1055714>.		<https://doi.org/10.1109/TIT.1977.1055714>.

	[SOY] Google Developers, "Closure Tools",		[SOY] Google Developers, "Closure Tools",

	<https://developers.google.com/closure/templates/>.		<https://developers.google.com/closure>.

	Acknowledgments		Acknowledgments

	The authors would like to thank Robert Obryk for suggesting		The authors would like to thank Robert Obryk for suggesting
	improvements to the format and the text of the specification.		improvements to the format and the text of the specification.

	Authors' Addresses		Authors' Addresses

	Jyrki Alakuijala		Jyrki Alakuijala
	Google, Inc.		Google, Inc.

End of changes. 36 change blocks.
	79 lines changed or deleted		85 lines changed or added