oasis-root

Compiled tree of Oasis Linux based on own branch at <https://hacktivis.me/git/oasis/> git clone https://anongit.hacktivis.me/git/oasis-root.git

gitformat-commit-graph.5 (12234B)


  1. '\" t
  2. .\" Title: gitformat-commit-graph
  3. .\" Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author]
  4. .\" Generator: DocBook XSL Stylesheets v1.79.2 <http://docbook.sf.net/>
  5. .\" Date: 2025-03-14
  6. .\" Manual: Git Manual
  7. .\" Source: Git 2.49.0
  8. .\" Language: English
  9. .\"
  10. .TH "GITFORMAT\-COMMIT\-GRAPH" "5" "2025-03-14" "Git 2\&.49\&.0" "Git Manual"
  11. .\" -----------------------------------------------------------------
  12. .\" * Define some portability stuff
  13. .\" -----------------------------------------------------------------
  14. .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  15. .\" http://bugs.debian.org/507673
  16. .\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
  17. .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  18. .ie \n(.g .ds Aq \(aq
  19. .el .ds Aq '
  20. .\" -----------------------------------------------------------------
  21. .\" * set default formatting
  22. .\" -----------------------------------------------------------------
  23. .\" disable hyphenation
  24. .nh
  25. .\" disable justification (adjust text to left margin only)
  26. .ad l
  27. .\" -----------------------------------------------------------------
  28. .\" * MAIN CONTENT STARTS HERE *
  29. .\" -----------------------------------------------------------------
  30. .SH "NAME"
  31. gitformat-commit-graph \- Git commit\-graph format
  32. .SH "SYNOPSIS"
  33. .sp
  34. .nf
  35. $GIT_DIR/objects/info/commit\-graph
  36. $GIT_DIR/objects/info/commit\-graphs/*
  37. .fi
  38. .SH "DESCRIPTION"
  39. .sp
  40. The Git commit\-graph stores a list of commit OIDs and some associated metadata, including:
  41. .sp
  42. .RS 4
  43. .ie n \{\
  44. \h'-04'\(bu\h'+03'\c
  45. .\}
  46. .el \{\
  47. .sp -1
  48. .IP \(bu 2.3
  49. .\}
  50. The generation number of the commit\&.
  51. .RE
  52. .sp
  53. .RS 4
  54. .ie n \{\
  55. \h'-04'\(bu\h'+03'\c
  56. .\}
  57. .el \{\
  58. .sp -1
  59. .IP \(bu 2.3
  60. .\}
  61. The root tree OID\&.
  62. .RE
  63. .sp
  64. .RS 4
  65. .ie n \{\
  66. \h'-04'\(bu\h'+03'\c
  67. .\}
  68. .el \{\
  69. .sp -1
  70. .IP \(bu 2.3
  71. .\}
  72. The commit date\&.
  73. .RE
  74. .sp
  75. .RS 4
  76. .ie n \{\
  77. \h'-04'\(bu\h'+03'\c
  78. .\}
  79. .el \{\
  80. .sp -1
  81. .IP \(bu 2.3
  82. .\}
  83. The parents of the commit, stored using positional references within the graph file\&.
  84. .RE
  85. .sp
  86. .RS 4
  87. .ie n \{\
  88. \h'-04'\(bu\h'+03'\c
  89. .\}
  90. .el \{\
  91. .sp -1
  92. .IP \(bu 2.3
  93. .\}
  94. The Bloom filter of the commit carrying the paths that were changed between the commit and its first parent, if requested\&.
  95. .RE
  96. .sp
  97. These positional references are stored as unsigned 32\-bit integers corresponding to the array position within the list of commit OIDs\&. Due to some special constants we use to track parents, we can store at most (1 << 30) + (1 << 29) + (1 << 28) \- 1 (around 1\&.8 billion) commits\&.
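As a quick check of the arithmetic above, the following Python sketch evaluates the limit. The constant names are illustrative assumptions rather than identifiers taken from Git; the value 0x70000000 is the "no parent" marker described under the Commit Data chunk.

```python
# Illustrative names; only the numeric values come from the format description.
MAX_COMMITS = (1 << 30) + (1 << 29) + (1 << 28) - 1
NO_PARENT = 0x70000000   # marker for "no parent in that position"
FLAG_BIT = 0x80000000    # most-significant bit of a 32-bit position

print(MAX_COMMITS)                   # 1879048191
# The commit limit is exactly one less than the "no parent" marker.
print(MAX_COMMITS == NO_PARENT - 1)  # True
# The marker itself does not collide with the most-significant-bit flag.
print(NO_PARENT & FLAG_BIT)          # 0
```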
  98. .SH "COMMIT\-GRAPH FILES HAVE THE FOLLOWING FORMAT:"
  99. .sp
  100. In order to allow extensions that add extra data to the graph, we organize the body into "chunks" and provide a binary lookup table at the beginning of the body\&. The header includes certain values, such as number of chunks and hash type\&.
  101. .sp
  102. All multi\-byte numbers are in network byte order\&.
  103. .SS "HEADER:"
  104. .sp
  105. .if n \{\
  106. .RS 4
  107. .\}
  108. .nf
  109. 4\-byte signature:
  110. The signature is: {\*(AqC\*(Aq, \*(AqG\*(Aq, \*(AqP\*(Aq, \*(AqH\*(Aq}
  111. .fi
  112. .if n \{\
  113. .RE
  114. .\}
  115. .sp
  116. .if n \{\
  117. .RS 4
  118. .\}
  119. .nf
  120. 1\-byte version number:
  121. Currently, the only valid version is 1\&.
  122. .fi
  123. .if n \{\
  124. .RE
  125. .\}
  126. .sp
  127. .if n \{\
  128. .RS 4
  129. .\}
  130. .nf
  131. 1\-byte Hash Version
  132. We infer the hash length (H) from this value:
  133. 1 => SHA\-1
  134. 2 => SHA\-256
  135. If the hash type does not match the repository\*(Aqs hash algorithm, the
  136. commit\-graph file should be ignored with a warning presented to the
  137. user\&.
  138. .fi
  139. .if n \{\
  140. .RE
  141. .\}
  142. .sp
  143. .if n \{\
  144. .RS 4
  145. .\}
  146. .nf
  147. 1\-byte number (C) of "chunks"
  148. .fi
  149. .if n \{\
  150. .RE
  151. .\}
  152. .sp
  153. .if n \{\
  154. .RS 4
  155. .\}
  156. .nf
  157. 1\-byte number (B) of base commit\-graphs
  158. We infer the length (H*B) of the Base Graphs chunk
  159. from this value\&.
  160. .fi
  161. .if n \{\
  162. .RE
  163. .\}
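The five header fields above add up to 8 bytes. A minimal Python sketch of reading them might look like the following; the function name `parse_header` and the returned dictionary keys are hypothetical, and the hash lengths (20 bytes for SHA-1, 32 bytes for SHA-256) are the standard digest sizes rather than values stated in this page.

```python
import struct

# Hash version -> hash length H in bytes (standard digest sizes).
HASH_LENGTHS = {1: 20, 2: 32}

def parse_header(data: bytes) -> dict:
    """Decode the 8-byte commit-graph header (hypothetical helper)."""
    if len(data) < 8:
        raise ValueError("header is shorter than 8 bytes")
    # 4-byte signature followed by four 1-byte fields.
    signature, version, hash_version, num_chunks, num_base = struct.unpack(
        ">4sBBBB", data[:8]
    )
    if signature != b"CGPH":
        raise ValueError("bad signature")
    if version != 1:
        raise ValueError("unsupported version")
    if hash_version not in HASH_LENGTHS:
        raise ValueError("unknown hash version")
    return {
        "hash_len": HASH_LENGTHS[hash_version],  # H
        "num_chunks": num_chunks,                # C
        "num_base_graphs": num_base,             # B
    }
```

A caller would still need to compare the hash version with the repository's own hash algorithm and ignore the file with a warning on mismatch, as described above.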
  164. .SS "CHUNK LOOKUP:"
  165. .sp
  166. .if n \{\
  167. .RS 4
  168. .\}
  169. .nf
  170. (C + 1) * 12 bytes listing the table of contents for the chunks:
  171. First 4 bytes describe the chunk id\&. Value 0 is a terminating label\&.
  172. Other 8 bytes provide the byte\-offset in current file for chunk to
  173. start\&. (Chunks are ordered contiguously in the file, so you can infer
  174. the length using the next chunk position if necessary\&.) Each chunk
  175. ID appears at most once\&.
  176. .fi
  177. .if n \{\
  178. .RE
  179. .\}
  180. .sp
  181. .if n \{\
  182. .RS 4
  183. .\}
  184. .nf
  185. The CHUNK LOOKUP matches the table of contents from
  186. the chunk\-based file format, see \fBgitformat\-chunk\fR(5)
  187. .fi
  188. .if n \{\
  189. .RE
  190. .\}
  191. .sp
  192. .if n \{\
  193. .RS 4
  194. .\}
  195. .nf
  196. The remaining data in the body is described one chunk at a time, and
  197. these chunks may be given in any order\&. Chunks are required unless
  198. otherwise specified\&.
  199. .fi
  200. .if n \{\
  201. .RE
  202. .\}
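The table of contents can be read with a short loop. The sketch below is an illustration under two assumptions: the table starts directly after the 8-byte header, and the terminating entry's offset marks the end of the last chunk, so every chunk length can be inferred from the next entry. The function name `parse_chunk_lookup` is hypothetical.

```python
import struct

def parse_chunk_lookup(data: bytes, num_chunks: int, table_offset: int = 8) -> dict:
    """Return {chunk_id: (offset, length)} from the (C + 1) * 12-byte table."""
    entries = []
    for i in range(num_chunks + 1):
        # Each entry: 4-byte chunk id, then 8-byte offset, network byte order.
        chunk_id, offset = struct.unpack_from(">4sQ", data, table_offset + 12 * i)
        entries.append((chunk_id, offset))
    if entries[-1][0] != b"\0\0\0\0":
        raise ValueError("missing terminating label")
    chunks = {}
    # Pair each entry with its successor to infer the chunk length.
    for (chunk_id, start), (_, end) in zip(entries, entries[1:]):
        if chunk_id in chunks:
            raise ValueError("chunk id appears more than once")
        chunks[chunk_id] = (start, end - start)
    return chunks
```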
  203. .SS "CHUNK DATA:"
  204. .sp
  205. .it 1 an-trap
  206. .nr an-no-space-flag 1
  207. .nr an-break-flag 1
  208. .br
  209. .ps +1
  210. \fBOID Fanout (ID: {O, I, D, F}) (256 * 4 bytes)\fR
  211. .RS 4
  212. .sp
  213. .if n \{\
  214. .RS 4
  215. .\}
  216. .nf
  217. The ith entry, F[i], stores the number of OIDs with first
  218. byte at most i\&. Thus F[255] stores the total
  219. number of commits (N)\&.
  220. .fi
  221. .if n \{\
  222. .RE
  223. .\}
  224. .RE
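To illustrate the cumulative counts, here is a small Python sketch that builds a fanout table from a list of OIDs. The function name `build_fanout` is hypothetical, and the OIDs are represented as `bytes` objects for convenience.

```python
def build_fanout(oids: list) -> list:
    """Return F, where F[i] is the number of OIDs whose first byte is <= i."""
    counts = [0] * 256
    for oid in oids:
        counts[oid[0]] += 1
    fanout = []
    total = 0
    for count in counts:
        total += count          # running total makes the table cumulative
        fanout.append(total)
    return fanout
```

With this layout, F[255] equals the number of OIDs passed in, matching the definition of N above.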
  225. .sp
  226. .it 1 an-trap
  227. .nr an-no-space-flag 1
  228. .nr an-break-flag 1
  229. .br
  230. .ps +1
  231. \fBOID Lookup (ID: {O, I, D, L}) (N * H bytes)\fR
  232. .RS 4
  233. .sp
  234. .if n \{\
  235. .RS 4
  236. .\}
  237. .nf
  238. The OIDs for all commits in the graph, sorted in ascending order\&.
  239. .fi
  240. .if n \{\
  241. .RE
  242. .\}
  243. .RE
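Because the OIDs are sorted, the fanout table can narrow a binary search to the OIDs that share a first byte. The following is a minimal sketch of that idea, not Git's implementation; `find_commit_position` is a hypothetical name, and the OID Lookup chunk is assumed to have already been split into a list of H-byte `bytes` values.

```python
import bisect

def find_commit_position(oid: bytes, fanout: list, oids: list):
    """Return the array position of oid, or None if it is not in the graph."""
    first = oid[0]
    # OIDs with this first byte occupy positions [lo, hi).
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    pos = bisect.bisect_left(oids, oid, lo, hi)
    if pos < hi and oids[pos] == oid:
        return pos
    return None
```

The returned position is the same array position that parent references in the Commit Data chunk use.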
  244. .sp
  245. .it 1 an-trap
  246. .nr an-no-space-flag 1
  247. .nr an-break-flag 1
  248. .br
  249. .ps +1
  250. \fBCommit Data (ID: {C, D, A, T }) (N * (H + 16) bytes)\fR
  251. .RS 4
  252. .sp
  253. .RS 4
  254. .ie n \{\
  255. \h'-04'\(bu\h'+03'\c
  256. .\}
  257. .el \{\
  258. .sp -1
  259. .IP \(bu 2.3
  260. .\}
  261. The first H bytes are for the OID of the root tree\&.
  262. .RE
  263. .sp
  264. .RS 4
  265. .ie n \{\
  266. \h'-04'\(bu\h'+03'\c
  267. .\}
  268. .el \{\
  269. .sp -1
  270. .IP \(bu 2.3
  271. .\}
  272. The next 8 bytes are for the positions of the first two parents of the ith commit\&. A position stores the value 0x70000000 if there is no parent in that position\&. If there are more than two parents, the second value has its most\-significant bit on and the other bits store an array position into the Extra Edge List chunk\&.
  273. .RE
  274. .sp
  275. .RS 4
  276. .ie n \{\
  277. \h'-04'\(bu\h'+03'\c
  278. .\}
  279. .el \{\
  280. .sp -1
  281. .IP \(bu 2.3
  282. .\}
  283. The next 8 bytes store the topological level (generation number v1) of the commit and the commit time in seconds since EPOCH\&. The generation number uses the upper 30 bits of the first 4 bytes\&. The commit time uses all 32 bits of the second 4 bytes, plus the lowest 2 bits of the first 4 bytes, which store the 33rd and 34th bits of the commit time\&.
  284. .RE
  285. .RE
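The bit layout of one Commit Data record can be sketched in Python as follows. This is an illustration of the description above, not Git's code; the function name, constant names, and dictionary keys are hypothetical.

```python
import struct

NO_PARENT = 0x70000000         # "no parent in that position"
EXTRA_EDGES_FLAG = 0x80000000  # most-significant bit of the second parent value

def parse_commit_data(record: bytes, hash_len: int) -> dict:
    """Decode one (H + 16)-byte Commit Data record."""
    tree_oid = record[:hash_len]
    parent1, parent2, gen_and_high, date_low = struct.unpack_from(
        ">IIII", record, hash_len
    )
    extra_edge_index = None
    if parent2 & EXTRA_EDGES_FLAG:
        # More than two parents: the low 31 bits index the Extra Edge List.
        extra_edge_index = parent2 & 0x7FFFFFFF
        parent2 = None
    elif parent2 == NO_PARENT:
        parent2 = None
    return {
        "tree_oid": tree_oid,
        "parent1": None if parent1 == NO_PARENT else parent1,
        "parent2": parent2,
        "extra_edge_index": extra_edge_index,
        # Upper 30 bits of the first word.
        "generation": gen_and_high >> 2,
        # Lowest 2 bits of the first word are bits 33 and 34 of the time.
        "commit_time": ((gen_and_high & 0x3) << 32) | date_low,
    }
```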
  286. .sp
  287. .it 1 an-trap
  288. .nr an-no-space-flag 1
  289. .nr an-break-flag 1
  290. .br
  291. .ps +1
  292. \fBGeneration Data (ID: {G, D, A, 2 }) (N * 4 bytes) [Optional]\fR
  293. .RS 4
  294. .sp
  295. .RS 4
  296. .ie n \{\
  297. \h'-04'\(bu\h'+03'\c
  298. .\}
  299. .el \{\
  300. .sp -1
  301. .IP \(bu 2.3
  302. .\}
  303. This list of 4\-byte values stores corrected commit date offsets for the commits, arranged in the same order as the Commit Data chunk\&.
  304. .RE
  305. .sp
  306. .RS 4
  307. .ie n \{\
  308. \h'-04'\(bu\h'+03'\c
  309. .\}
  310. .el \{\
  311. .sp -1
  312. .IP \(bu 2.3
  313. .\}
  314. If the corrected commit date offset cannot be stored within 31 bits, the value has its most\-significant bit on and the other bits store the position of the corrected commit date offset in the Generation Data Overflow chunk\&.
  315. .RE
  316. .sp
  317. .RS 4
  318. .ie n \{\
  319. \h'-04'\(bu\h'+03'\c
  320. .\}
  321. .el \{\
  322. .sp -1
  323. .IP \(bu 2.3
  324. .\}
  325. The Generation Data chunk is present only when the commit\-graph file was written by a compatible version of Git and, in the case of split commit\-graph chains, the topmost layer also has a Generation Data chunk\&.
  326. .RE
  327. .RE
  328. .sp
  329. .it 1 an-trap
  330. .nr an-no-space-flag 1
  331. .nr an-break-flag 1
  332. .br
  333. .ps +1
  334. \fBGeneration Data Overflow (ID: {G, D, O, 2 }) [Optional]\fR
  335. .RS 4
  336. .sp
  337. .RS 4
  338. .ie n \{\
  339. \h'-04'\(bu\h'+03'\c
  340. .\}
  341. .el \{\
  342. .sp -1
  343. .IP \(bu 2.3
  344. .\}
  345. This list of 8\-byte values stores the corrected commit date offsets for commits with corrected commit date offsets that cannot be stored within 31 bits\&.
  346. .RE
  347. .sp
  348. .RS 4
  349. .ie n \{\
  350. \h'-04'\(bu\h'+03'\c
  351. .\}
  352. .el \{\
  353. .sp -1
  354. .IP \(bu 2.3
  355. .\}
  356. The Generation Data Overflow chunk is present only when the Generation Data chunk is present and at least one corrected commit date offset cannot be stored within 31 bits\&.
  357. .RE
  358. .RE
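The interplay between the Generation Data and Generation Data Overflow chunks can be sketched as below. The function name `corrected_date_offset` is hypothetical, and the two chunks are assumed to be available as raw `bytes`.

```python
import struct

OVERFLOW_FLAG = 0x80000000  # most-significant bit of a 4-byte entry

def corrected_date_offset(gda2: bytes, gdo2: bytes, pos: int) -> int:
    """Return the corrected commit date offset for the commit at position pos."""
    (value,) = struct.unpack_from(">I", gda2, 4 * pos)
    if value & OVERFLOW_FLAG:
        # The low 31 bits are a position into the list of 8-byte values.
        index = value & 0x7FFFFFFF
        (value,) = struct.unpack_from(">Q", gdo2, 8 * index)
    return value
```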
  359. .sp
  360. .it 1 an-trap
  361. .nr an-no-space-flag 1
  362. .nr an-break-flag 1
  363. .br
  364. .ps +1
  365. \fBExtra Edge List (ID: {E, D, G, E}) [Optional]\fR
  366. .RS 4
  367. .sp
  368. .if n \{\
  369. .RS 4
  370. .\}
  371. .nf
  372. This list of 4\-byte values stores the second through nth parents for
  373. all octopus merges\&. The second parent value in the commit data stores
  374. an array position within this list along with the most\-significant bit
  375. on\&. Starting at that array position, iterate through this list of commit
  376. positions for the parents until reaching a value with the most\-significant
  377. bit on\&. The other bits correspond to the position of the last parent\&.
  378. .fi
  379. .if n \{\
  380. .RE
  381. .\}
  382. .RE
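The iteration described above can be sketched in a few lines of Python. The name `octopus_parents` is hypothetical; `start` is the array position taken from the second parent value in the Commit Data chunk after clearing its most-significant bit.

```python
import struct

LAST_PARENT_FLAG = 0x80000000  # set on the final entry of a parent run

def octopus_parents(edge_list: bytes, start: int) -> list:
    """Return the second through nth parent positions of an octopus merge."""
    parents = []
    index = start
    while True:
        # unpack_from raises struct.error if a malformed list runs off the end.
        (value,) = struct.unpack_from(">I", edge_list, 4 * index)
        parents.append(value & 0x7FFFFFFF)
        if value & LAST_PARENT_FLAG:
            return parents
        index += 1
```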
  383. .sp
  384. .it 1 an-trap
  385. .nr an-no-space-flag 1
  386. .nr an-break-flag 1
  387. .br
  388. .ps +1
  389. \fBBloom Filter Index (ID: {B, I, D, X}) (N * 4 bytes) [Optional]\fR
  390. .RS 4
  391. .sp
  392. .RS 4
  393. .ie n \{\
  394. \h'-04'\(bu\h'+03'\c
  395. .\}
  396. .el \{\
  397. .sp -1
  398. .IP \(bu 2.3
  399. .\}
  400. The ith entry, BIDX[i], stores the number of bytes in all Bloom filters from commit 0 to commit i (inclusive) in lexicographic order\&. The Bloom filter for the i\-th commit spans from BIDX[i\-1] to BIDX[i] (plus header length), where BIDX[\-1] is 0\&.
  401. .RE
  402. .sp
  403. .RS 4
  404. .ie n \{\
  405. \h'-04'\(bu\h'+03'\c
  406. .\}
  407. .el \{\
  408. .sp -1
  409. .IP \(bu 2.3
  410. .\}
  411. The BIDX chunk is ignored if the BDAT chunk is not present\&.
  412. .RE
  413. .RE
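As a sketch of how the cumulative BIDX values translate into byte ranges, the helper below returns the span of one filter relative to the start of the BDAT chunk. It assumes the header length is 12 bytes (three unsigned 32-bit integers, per the Bloom Filter Data description); the function name is hypothetical.

```python
import struct

BDAT_HEADER_LEN = 12  # three unsigned 32-bit integers

def bloom_filter_span(bidx: bytes, i: int) -> tuple:
    """Return (start, end) byte offsets of commit i's filter within BDAT."""
    (end,) = struct.unpack_from(">I", bidx, 4 * i)
    if i > 0:
        (start,) = struct.unpack_from(">I", bidx, 4 * (i - 1))
    else:
        start = 0  # BIDX[-1] is defined as 0
    return (start + BDAT_HEADER_LEN, end + BDAT_HEADER_LEN)
```

A commit whose span has equal start and end offsets has an empty filter.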
  414. .sp
  415. .it 1 an-trap
  416. .nr an-no-space-flag 1
  417. .nr an-break-flag 1
  418. .br
  419. .ps +1
  420. \fBBloom Filter Data (ID: {B, D, A, T}) [Optional]\fR
  421. .RS 4
  422. .sp
  423. .RS 4
  424. .ie n \{\
  425. \h'-04'\(bu\h'+03'\c
  426. .\}
  427. .el \{\
  428. .sp -1
  429. .IP \(bu 2.3
  430. .\}
  431. It starts with a header consisting of three unsigned 32\-bit integers:
  432. .sp
  433. .RS 4
  434. .ie n \{\
  435. \h'-04'\(bu\h'+03'\c
  436. .\}
  437. .el \{\
  438. .sp -1
  439. .IP \(bu 2.3
  440. .\}
  441. Version of the hash algorithm being used\&. We currently support value 2 which corresponds to the 32\-bit version of the murmur3 hash implemented exactly as described in
  442. \m[blue]\fBhttps://en\&.wikipedia\&.org/wiki/MurmurHash#Algorithm\fR\m[]
  443. and the double hashing technique using seed values 0x293ae76f and 0x7e646e2c as described in
  444. \m[blue]\fBhttps://doi\&.org/10\&.1007/978\-3\-540\-30494\-4_26\fR\m[]
  445. "Bloom Filters in Probabilistic Verification"\&. Version 1 Bloom filters have a bug that appears when char is signed and the repository has path names that have characters >= 0x80; Git supports reading and writing them, but this ability will be removed in a future version of Git\&.
  446. .RE
  447. .sp
  448. .RS 4
  449. .ie n \{\
  450. \h'-04'\(bu\h'+03'\c
  451. .\}
  452. .el \{\
  453. .sp -1
  454. .IP \(bu 2.3
  455. .\}
  456. The number of times a path is hashed and hence the number of bit positions that cumulatively determine whether a file is present in the commit\&.
  457. .RE
  458. .sp
  459. .RS 4
  460. .ie n \{\
  461. \h'-04'\(bu\h'+03'\c
  462. .\}
  463. .el \{\
  464. .sp -1
  465. .IP \(bu 2.3
  466. .\}
  467. The minimum number of bits
  468. \fIb\fR
  469. per entry in the Bloom filter\&. If the filter contains
  470. \fIn\fR
  471. entries, then the filter size is the minimum number of 64\-bit words that contain n*b bits\&.
  472. .RE
  473. .RE
  474. .sp
  475. .RS 4
  476. .ie n \{\
  477. \h'-04'\(bu\h'+03'\c
  478. .\}
  479. .el \{\
  480. .sp -1
  481. .IP \(bu 2.3
  482. .\}
  483. The rest of the chunk is the concatenation of all the computed Bloom filters for the commits in lexicographic order\&.
  484. .RE
  485. .sp
  486. .RS 4
  487. .ie n \{\
  488. \h'-04'\(bu\h'+03'\c
  489. .\}
  490. .el \{\
  491. .sp -1
  492. .IP \(bu 2.3
  493. .\}
  494. Note: Commits with no changes or more than 512 changes have Bloom filters of length one, with either all bits set to zero or one respectively\&.
  495. .RE
  496. .sp
  497. .RS 4
  498. .ie n \{\
  499. \h'-04'\(bu\h'+03'\c
  500. .\}
  501. .el \{\
  502. .sp -1
  503. .IP \(bu 2.3
  504. .\}
  505. The BDAT chunk is present if and only if BIDX is present\&.
  506. .RE
  507. .RE
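The BDAT header and the size rule can be sketched as follows. Both function names are hypothetical, the size calculation follows the wording above (whole 64-bit words covering n*b bits), and the sample values 7 and 10 in the usage lines are illustrative only.

```python
import struct

def parse_bdat_header(chunk: bytes) -> dict:
    """Decode the three unsigned 32-bit integers at the start of BDAT."""
    hash_version, num_hashes, bits_per_entry = struct.unpack_from(">III", chunk, 0)
    return {
        "hash_version": hash_version,
        "num_hashes": num_hashes,          # times each path is hashed
        "bits_per_entry": bits_per_entry,  # minimum bits b per entry
    }

def filter_words(n: int, b: int) -> int:
    """Minimum number of 64-bit words that contain n * b bits."""
    return (n * b + 63) // 64  # ceiling division

header = parse_bdat_header(struct.pack(">III", 2, 7, 10))
print(header["bits_per_entry"])                    # 10
print(filter_words(32, header["bits_per_entry"]))  # 5
```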
  508. .sp
  509. .it 1 an-trap
  510. .nr an-no-space-flag 1
  511. .nr an-break-flag 1
  512. .br
  513. .ps +1
  514. \fBBase Graphs List (ID: {B, A, S, E}) [Optional]\fR
  515. .RS 4
  516. .sp
  517. .if n \{\
  518. .RS 4
  519. .\}
  520. .nf
  521. This list of H\-byte hashes describes a set of B commit\-graph files that
  522. form a commit\-graph chain\&. The graph position for the ith commit in this
  523. file\*(Aqs OID Lookup chunk is equal to i plus the number of commits in all
  524. base graphs\&. If B is non\-zero, this chunk must exist\&.
  525. .fi
  526. .if n \{\
  527. .RE
  528. .\}
  529. .RE
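Two small helpers illustrate the Base Graphs List: splitting the H*B-byte chunk into hashes, and converting a local index into a graph position across the chain. Both names are hypothetical, and the per-layer commit counts are assumed to have been read from each base graph beforehand.

```python
def split_base_hashes(chunk: bytes, hash_len: int, num_base: int) -> list:
    """Split the Base Graphs List chunk into B hashes of H bytes each."""
    if len(chunk) != hash_len * num_base:
        raise ValueError("chunk length is not H * B")
    return [chunk[i * hash_len:(i + 1) * hash_len] for i in range(num_base)]

def graph_position(local_index: int, base_commit_counts: list) -> int:
    """Position of this file's ith commit: i plus all commits in base graphs."""
    return local_index + sum(base_commit_counts)
```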
  530. .SS "TRAILER:"
  531. .sp
  532. .if n \{\
  533. .RS 4
  534. .\}
  535. .nf
  536. H\-byte HASH\-checksum of all of the above\&.
  537. .fi
  538. .if n \{\
  539. .RE
  540. .\}
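Checking the trailer amounts to hashing everything before the final H bytes and comparing the result. The sketch below uses Python's `hashlib` and assumes the checksum algorithm is the one selected by the header's Hash Version; `verify_trailer` is a hypothetical name.

```python
import hashlib

# Hash version -> (hashlib algorithm name, hash length H in bytes).
ALGORITHMS = {1: ("sha1", 20), 2: ("sha256", 32)}

def verify_trailer(data: bytes, hash_version: int) -> bool:
    """Return True if the last H bytes are the checksum of all preceding bytes."""
    name, hash_len = ALGORITHMS[hash_version]
    if len(data) < hash_len:
        return False
    body, trailer = data[:-hash_len], data[-hash_len:]
    return hashlib.new(name, body).digest() == trailer
```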
  541. .SH "HISTORICAL NOTES:"
  542. .sp
  543. The Generation Data (GDA2) and Generation Data Overflow (GDO2) chunks have the number \fI2\fR in their chunk IDs because a previous version of Git wrote possibly erroneous data in these chunks with the IDs "GDAT" and "GDOV"\&. By changing the IDs, newer versions of Git will silently ignore those older chunks and write the new information without trusting the incorrect data\&.
  544. .SH "GIT"
  545. .sp
  546. Part of the \fBgit\fR(1) suite