summarylogtreecommitdiffstats
path: root/CHANGELOG.md
blob: c6e1463e1f58854ee976330749f492ff2c3265ed (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
1691
1692
1693
1694
1695
1696
1697
1698
1699
1700
1701
1702
1703
1704
1705
# librdkafka v2.6.1

librdkafka v2.6.1 is a maintenance release:

* Fix for a Fetch regression when connecting to Apache Kafka < 2.7 (#4871).
* Fix for an infinite loop happening with cooperative-sticky assignor
  under some particular conditions (#4800).
* Fix for retrieving offset commit metadata when it contains
  zeros and configured with `strndup` (#4876)


## Fixes

### Consumer fixes

* Issues: #4870
  Fix for a Fetch regression when connecting to Apache Kafka < 2.7, causing
  fetches to fail.
  Happening since v2.6.0 (#4871)
* Issues: #4783.
  A consumer configured with the `cooperative-sticky` partition assignment
  strategy could get stuck in an infinite loop, with corresponding spike of
  main thread CPU usage.
  That happened with some particular orders of members and potential 
  assignable partitions.
  Solved by removing the infinite loop cause.
  Happening since: 1.6.0 (#4800).
* Issues: #4649.
  When retrieving offset metadata, if the binary value contained zeros
  and librdkafka was configured with `strndup`, part of
  the buffer after first zero contained uninitialized data
  instead of rest of metadata. Solved by avoiding to use
  `strndup` for copying metadata.
  Happening since: 0.9.0 (#4876).



# librdkafka v2.6.0

librdkafka v2.6.0 is a feature release:

 * [KIP-460](https://cwiki.apache.org/confluence/display/KAFKA/KIP-460%3A+Admin+Leader+Election+RPC) Admin Leader Election RPC (#4845)
 * [KIP-714] Complete consumer metrics support (#4808).
 * [KIP-714] Produce latency average and maximum metrics support for parity with Java client (#4847).
 * [KIP-848] ListConsumerGroups Admin API now has an optional filter to return only groups
   of given types.
 * Added Transactional id resource type for ACL operations (@JohnPreston, #4856).
 * Fix for permanent fetch errors when using a newer Fetch RPC version with an older
   inter broker protocol (#4806).



## Fixes

### Consumer fixes

 * Issues: #4806
   Fix for permanent fetch errors when brokers support a Fetch RPC version greater than 12 
   but cluster is configured to use an inter broker protocol that is less than 2.8.
   In this case returned topic ids are zero valued and Fetch has to fall back
   to version 12, using topic names.
   Happening since v2.5.0 (#4806)



# librdkafka v2.5.3

librdkafka v2.5.3 is a feature release.

* Fix an assert being triggered during push telemetry call when no metrics matched on the client side. (#4826)

## Fixes

### Telemetry fixes

* Issue: #4833
Fix a regression introduced with [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) support in which an assert is triggered during **PushTelemetry** call. This happens when no metric is matched on the client side among those requested by broker subscription.
Happening since 2.5.0 (#4826).

*Note: there were no v2.5.1 and v2.5.2 librdkafka releases*


# librdkafka v2.5.0

> [!WARNING]
This version has introduced a regression in which an assert is triggered during **PushTelemetry** call. This happens when no metric is matched on the client side among those requested by broker subscription. 
>
> You won't face any problem if:
> * Broker doesn't support [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability).
> * [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) feature is disabled on the broker side.
> * [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) feature is disabled on the client side. This is enabled by default. Set configuration `enable.metrics.push` to `false`.
> * If [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) is enabled on the broker side and there is no subscription configured there.
> * If [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) is enabled on the broker side with subscriptions that match the [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) metrics defined on the client.
> 
> Having said this, we strongly recommend using `v2.5.3` and above to not face this regression at all.

librdkafka v2.5.0 is a feature release.

* [KIP-951](https://cwiki.apache.org/confluence/display/KAFKA/KIP-951%3A+Leader+discovery+optimisations+for+the+client)
  Leader discovery optimisations for the client (#4756, #4767).
* Fix segfault when using long client id because of erased segment when using flexver. (#4689)
* Fix for an idempotent producer error, with a message batch not reconstructed
  identically when retried (#4750)
* Removed support for CentOS 6 and CentOS 7 (#4775).
* [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) Client 
  metrics and observability (#4721).

## Upgrade considerations

 * CentOS 6 and CentOS 7 support was removed as they reached EOL
   and security patches aren't publicly available anymore.
   ABI compatibility from CentOS 8 on is maintained through pypa/manylinux,
   AlmaLinux based.
   See also [Confluent supported OSs page](https://docs.confluent.io/platform/current/installation/versions-interoperability.html#operating-systems) (#4775).

## Enhancements

  * Update bundled lz4 (used when `./configure --disable-lz4-ext`) to
    [v1.9.4](https://github.com/lz4/lz4/releases/tag/v1.9.4), which contains
    bugfixes and performance improvements (#4726).
  * [KIP-951](https://cwiki.apache.org/confluence/display/KAFKA/KIP-951%3A+Leader+discovery+optimisations+for+the+client)
    With this KIP leader updates are received through Produce and Fetch responses
    in case of errors corresponding to leader changes and a partition migration
    happens before refreshing the metadata cache (#4756, #4767).


## Fixes

### General fixes

*  Issues: [confluentinc/confluent-kafka-dotnet#2084](https://github.com/confluentinc/confluent-kafka-dotnet/issues/2084)
   Fix segfault when a segment is erased and more data is written to the buffer.
   Happens since 1.x when a portion of the buffer (segment) is erased for flexver or compression.
   More likely to happen since 2.1.0, because of the upgrades to flexver, with certain string sizes like a long client id (#4689).

### Idempotent producer fixes

 * Issues: #4736
   Fix for an idempotent producer error, with a message batch not reconstructed
   identically when retried. Caused the error message "Local: Inconsistent state: Unable to reconstruct MessageSet".
   Happening on large batches. Solved by using the same backoff baseline for all messages
   in the batch.
   Happens since 2.2.0 (#4750).



# librdkafka v2.4.0

librdkafka v2.4.0 is a feature release:

 * [KIP-848](https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol): The Next Generation of the Consumer Rebalance Protocol.
   **Early Access**: This should be used only for evaluation and must not be used in production. Features and contract of this KIP might change in future (#4610).
 * [KIP-467](https://cwiki.apache.org/confluence/display/KAFKA/KIP-467%3A+Augment+ProduceResponse+error+messaging+for+specific+culprit+records): Augment ProduceResponse error messaging for specific culprit records (#4583).
 * [KIP-516](https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers)
   Continue partial implementation by adding a metadata cache by topic id
   and updating the topic id corresponding to the partition name (#4676)
 * Upgrade OpenSSL to v3.0.12 (while building from source) with various security fixes,
   check the [release notes](https://www.openssl.org/news/cl30.txt).
 * Integration tests can be started in KRaft mode and run against any
   GitHub Kafka branch other than the released versions.
 * Fix pipeline inclusion of static binaries (#4666)
 * Fix to main loop timeout calculation leading to a tight loop for a
   max period of 1 ms (#4671).
 * Fixed a bug causing duplicate message consumption from a stale
   fetch start offset in some particular cases (#4636)
 * Fix to metadata cache expiration on full metadata refresh (#4677).
 * Fix for a wrong error returned on full metadata refresh before joining
   a consumer group (#4678).
 * Fix to metadata refresh interruption (#4679).
 * Fix for an undesired partition migration with stale leader epoch (#4680).
 * Fix hang in cooperative consumer mode if an assignment is processed
   while closing the consumer (#4528).
 * Upgrade OpenSSL to v3.0.13 (while building from source) with various security fixes,
   check the [release notes](https://www.openssl.org/news/cl30.txt)
   (@janjwerner-confluent, #4690).
 * Upgrade zstd to v1.5.6, zlib to v1.3.1, and curl to v8.8.0 (@janjwerner-confluent, #4690).



## Upgrade considerations

 * With KIP 467, INVALID_MSG (Java: CorruptRecordExpection) will
   be retried automatically. INVALID_RECORD (Java: InvalidRecordException) instead
   is not retriable and will be set only to the records that caused the
   error. Rest of records in the batch will fail with the new error code
   _INVALID_DIFFERENT_RECORD (Java: KafkaException) and can be retried manually,
   depending on the application logic (#4583).


## Early Access

### [KIP-848](https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol): The Next Generation of the Consumer Rebalance Protocol
 * With this new protocol the role of the Group Leader (a member) is removed and
   the assignment is calculated by the Group Coordinator (a broker) and sent
   to each member through heartbeats.

   The feature is still _not production-ready_.
   It's possible to try it in a non-production enviroment.

   A [guide](INTRODUCTION.md#next-generation-of-the-consumer-group-protocol-kip-848) is available
   with considerations and steps to follow to test it (#4610).


## Fixes

### General fixes

 * Issues: [confluentinc/confluent-kafka-go#981](https://github.com/confluentinc/confluent-kafka-go/issues/981).
   In librdkafka release pipeline a static build containing libsasl2
   could be chosen instead of the alternative one without it.
   That caused the libsasl2 dependency to be required in confluent-kafka-go
   v2.1.0-linux-musl-arm64 and v2.3.0-linux-musl-arm64.
   Solved by correctly excluding the binary configured with that library,
   when targeting a static build.
   Happening since v2.0.2, with specified platforms,
   when using static binaries (#4666).
 * Issues: #4684.
   When the main thread loop was awakened less than 1 ms
   before the expiration of a timeout, it was serving with a zero timeout,
   leading to increased CPU usage until the timeout was reached.
   Happening since 1.x.
 * Issues: #4685.
   Metadata cache was cleared on full metadata refresh, leading to unnecessary
   refreshes and occasional `UNKNOWN_TOPIC_OR_PART` errors. Solved by updating
   cache for existing or hinted entries instead of clearing them.
   Happening since 2.1.0 (#4677).
 * Issues: #4589.
   A metadata call before member joins consumer group,
   could lead to an `UNKNOWN_TOPIC_OR_PART` error. Solved by updating
   the consumer group following a metadata refresh only in safe states.
   Happening since 2.1.0 (#4678).
 * Issues: #4577.
   Metadata refreshes without partition leader change could lead to a loop of
   metadata calls at fixed intervals. Solved by stopping metadata refresh when
   all existing metadata is non-stale. Happening since 2.3.0 (#4679).
 * Issues: #4687.
   A partition migration could happen, using stale metadata, when the partition
   was undergoing a validation and being retried because of an error.
   Solved by doing a partition migration only with a non-stale leader epoch.
   Happening since 2.1.0 (#4680).

### Consumer fixes

 * Issues: #4686.
   In case of subscription change with a consumer using the cooperative assignor
   it could resume fetching from a previous position.
   That could also happen if resuming a partition that wasn't paused.
   Fixed by ensuring that a resume operation is completely a no-op when
   the partition isn't paused.
   Happening since 1.x (#4636).
 * Issues: #4527.
   While using the cooperative assignor, given an assignment is received while closing the consumer
   it's possible that it gets stuck in state WAIT_ASSIGN_CALL, while the method is converted to
   a full unassign. Solved by changing state from WAIT_ASSIGN_CALL to WAIT_UNASSIGN_CALL
   while doing this conversion.
   Happening since 1.x (#4528).



# librdkafka v2.3.0

librdkafka v2.3.0 is a feature release:

 * [KIP-516](https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers)
   Partial support of topic identifiers. Topic identifiers in metadata response
   available through the new `rd_kafka_DescribeTopics` function (#4300, #4451).
 * [KIP-117](https://cwiki.apache.org/confluence/display/KAFKA/KIP-117%3A+Add+a+public+AdminClient+API+for+Kafka+admin+operations) Add support for AdminAPI `DescribeCluster()` and `DescribeTopics()`
  (#4240, @jainruchir).
 * [KIP-430](https://cwiki.apache.org/confluence/display/KAFKA/KIP-430+-+Return+Authorized+Operations+in+Describe+Responses):
   Return authorized operations in Describe Responses.
   (#4240, @jainruchir).
 * [KIP-580](https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients): Added Exponential Backoff mechanism for
   retriable requests with `retry.backoff.ms` as minimum backoff and `retry.backoff.max.ms` as the
   maximum backoff, with 20% jitter (#4422).
 * [KIP-396](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97551484): completed the implementation with
   the addition of ListOffsets (#4225).
 * Fixed ListConsumerGroupOffsets not fetching offsets for all the topics in a group with Apache Kafka version below 2.4.0.
 * Add missing destroy that leads to leaking partition structure memory when there
   are partition leader changes and a stale leader epoch is received (#4429).
 * Fix a segmentation fault when closing a consumer using the
   cooperative-sticky assignor before the first assignment (#4381).
 * Fix for insufficient buffer allocation when allocating rack information (@wolfchimneyrock, #4449).
 * Fix for infinite loop of OffsetForLeaderEpoch requests on quick leader changes. (#4433).
 * Fix to add leader epoch to control messages, to make sure they're stored
   for committing even without a subsequent fetch message (#4434).
 * Fix for stored offsets not being committed if they lacked the leader epoch (#4442).
 * Upgrade OpenSSL to v3.0.11 (while building from source) with various security fixes,
   check the [release notes](https://www.openssl.org/news/cl30.txt)
   (#4454, started by @migarc1).
 * Fix to ensure permanent errors during offset validation continue being retried and
   don't cause an offset reset (#4447).
 * Fix to ensure max.poll.interval.ms is reset when rd_kafka_poll is called with
   consume_cb (#4431).
 * Fix for idempotent producer fatal errors, triggered after a possibly persisted message state (#4438).
 * Fix `rd_kafka_query_watermark_offsets` continuing beyond timeout expiry (#4460).
 * Fix `rd_kafka_query_watermark_offsets` not refreshing the partition leader
   after a leader change and subsequent `NOT_LEADER_OR_FOLLOWER` error (#4225).


## Upgrade considerations

 * `retry.backoff.ms`:
   If it is set greater than `retry.backoff.max.ms` which has the default value of 1000 ms then it is assumes the value of `retry.backoff.max.ms`.
   To change this behaviour make sure that `retry.backoff.ms` is always less than `retry.backoff.max.ms`.
   If equal then the backoff will be linear instead of exponential.

 * `topic.metadata.refresh.fast.interval.ms`:
   If it is set greater than `retry.backoff.max.ms` which has the default value of 1000 ms then it is assumes the value of `retry.backoff.max.ms`.
   To change this behaviour make sure that `topic.metadata.refresh.fast.interval.ms` is always less than `retry.backoff.max.ms`.
   If equal then the backoff will be linear instead of exponential.


## Fixes

### General fixes

 * An assertion failed with insufficient buffer size when allocating
   rack information on 32bit architectures.
   Solved by aligning all allocations to the maximum allowed word size (#4449).
 * The timeout for `rd_kafka_query_watermark_offsets` was not enforced after
   making the necessary ListOffsets requests, and thus, it never timed out in
   case of broker/network issues. Fixed by setting an absolute timeout (#4460).

### Idempotent producer fixes

 * After a possibly persisted error, such as a disconnection or a timeout, next expected sequence
   used to increase, leading to a fatal error if the message wasn't persisted and
   the second one in queue failed with an `OUT_OF_ORDER_SEQUENCE_NUMBER`.
   The error could contain the message "sequence desynchronization" with
   just one possibly persisted error or "rewound sequence number" in case of
   multiple errored messages.
   Solved by treating the possible persisted message as _not_ persisted,
   and expecting a `DUPLICATE_SEQUENCE_NUMBER` error in case it was or
   `NO_ERROR` in case it wasn't, in both cases the message will be considered
   delivered (#4438).

### Consumer fixes

  * Stored offsets were excluded from the commit if the leader epoch was
    less than committed epoch, as it's possible if leader epoch is the default -1.
    This didn't happen in Python, Go and .NET bindings when stored position was
    taken from the message.
    Solved by checking only that the stored offset is greater
    than committed one, if either stored or committed leader epoch is -1 (#4442).
  * If an OffsetForLeaderEpoch request was being retried, and the leader changed
    while the retry was in-flight, an infinite loop of requests was triggered,
    because we weren't updating the leader epoch correctly.
    Fixed by updating the leader epoch before sending the request (#4433).
  * During offset validation a permanent error like host resolution failure
    would cause an offset reset.
    This isn't what's expected or what the Java implementation does.
    Solved by retrying even in case of permanent errors (#4447).
  * If using `rd_kafka_poll_set_consumer`, along with a consume callback, and then
    calling `rd_kafka_poll` to service the callbacks, would not reset
    `max.poll.interval.ms.` This was because we were only checking `rk_rep` for
    consumer messages, while the method to service the queue internally also
    services the queue forwarded to from `rk_rep`, which is `rkcg_q`.
    Solved by moving the `max.poll.interval.ms` check into `rd_kafka_q_serve` (#4431).
  * After a leader change a `rd_kafka_query_watermark_offsets` call would continue
    trying to call ListOffsets on the old leader, if the topic wasn't included in
    the subscription set, so it started querying the new leader only after
    `topic.metadata.refresh.interval.ms` (#4225).



# librdkafka v2.2.0

librdkafka v2.2.0 is a feature release:

 * Fix a segmentation fault when subscribing to non-existent topics and
   using the consume batch functions (#4273).
 * Store offset commit metadata in `rd_kafka_offsets_store` (@mathispesch, #4084).
 * Fix a bug that happens when skipping tags, causing buffer underflow in
   MetadataResponse (#4278).
 * Fix a bug where topic leader is not refreshed in the same metadata call even if the leader is
   present.
 * [KIP-881](https://cwiki.apache.org/confluence/display/KAFKA/KIP-881%3A+Rack-aware+Partition+Assignment+for+Kafka+Consumers):
   Add support for rack-aware partition assignment for consumers
   (#4184, #4291, #4252).
 * Fix several bugs with sticky assignor in case of partition ownership
   changing between members of the consumer group (#4252).
 * [KIP-368](https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate):
   Allow SASL Connections to Periodically Re-Authenticate
   (#4301, started by @vctoriawu).
 * Avoid treating an OpenSSL error as a permanent error and treat unclean SSL
   closes as normal ones (#4294).
 * Added `fetch.queue.backoff.ms` to the consumer to control how long
   the consumer backs off next fetch attempt. (@bitemyapp, @edenhill, #2879)
 * [KIP-235](https://cwiki.apache.org/confluence/display/KAFKA/KIP-235%3A+Add+DNS+alias+support+for+secured+connection):
   Add DNS alias support for secured connection (#4292).
 * [KIP-339](https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+IncrementalAlterConfigs+API):
   IncrementalAlterConfigs API (started by @PrasanthV454, #4110).
 * [KIP-554](https://cwiki.apache.org/confluence/display/KAFKA/KIP-554%3A+Add+Broker-side+SCRAM+Config+API): Add Broker-side SCRAM Config API (#4241).


## Enhancements

 * Added `fetch.queue.backoff.ms` to the consumer to control how long
   the consumer backs off next fetch attempt. When the pre-fetch queue
   has exceeded its queuing thresholds: `queued.min.messages` and
   `queued.max.messages.kbytes` it backs off for 1 seconds.
   If those parameters have to be set too high to hold 1 s of data,
   this new parameter allows to back off the fetch earlier, reducing memory
   requirements.


## Fixes

### General fixes

 * Fix a bug that happens when skipping tags, causing buffer underflow in
   MetadataResponse. This is triggered since RPC version 9 (v2.1.0),
   when using Confluent Platform, only when racks are set,
   observers are activated and there is more than one partition.
   Fixed by skipping the correct amount of bytes when tags are received.
 * Avoid treating an OpenSSL error as a permanent error and treat unclean SSL
   closes as normal ones. When SSL connections are closed without `close_notify`,
   in OpenSSL 3.x a new type of error is set and it was interpreted as permanent
   in librdkafka. It can cause a different issue depending on the RPC.
   If received when waiting for OffsetForLeaderEpoch response, it triggers
   an offset reset following the configured policy.
   Solved by treating SSL errors as transport errors and
   by setting an OpenSSL flag that allows to treat unclean SSL closes as normal
   ones. These types of errors can happen it the other side doesn't support `close_notify` or if there's a TCP connection reset.


### Consumer fixes

  * In case of multiple owners of a partition with different generations, the
    sticky assignor would pick the earliest (lowest generation) member as the
    current owner, which would lead to stickiness violations. Fixed by
    choosing the latest (highest generation) member.
  * In case where the same partition is owned by two members with the same
    generation, it indicates an issue. The sticky assignor had some code to
    handle this, but it was non-functional, and did not have parity with the
    Java assignor. Fixed by invalidating any such partition from the current
    assignment completely.



# librdkafka v2.1.1

librdkafka v2.1.1 is a maintenance release:

 * Avoid duplicate messages when a fetch response is received
   in the middle of an offset validation request (#4261).
 * Fix segmentation fault when subscribing to a non-existent topic and
   calling `rd_kafka_message_leader_epoch()` on the polled `rkmessage` (#4245).
 * Fix a segmentation fault when fetching from follower and the partition lease
   expires while waiting for the result of a list offsets operation (#4254).
 * Fix documentation for the admin request timeout, incorrectly stating -1 for infinite
   timeout. That timeout can't be infinite.
 * Fix CMake pkg-config cURL require and use
   pkg-config `Requires.private` field (@FantasqueX, @stertingen, #4180).
 * Fixes certain cases where polling would not keep the consumer
   in the group or make it rejoin it (#4256).
 * Fix to the C++ set_leader_epoch method of TopicPartitionImpl,
   that wasn't storing the passed value (@pavel-pimenov, #4267).

## Fixes

### Consumer fixes

 * Duplicate messages can be emitted when a fetch response is received
   in the middle of an offset validation request. Solved by avoiding
   a restart from last application offset when offset validation succeeds.
 * When fetching from follower, if the partition lease expires after 5 minutes,
   and a list offsets operation was requested to retrieve the earliest
   or latest offset, it resulted in segmentation fault. This was fixed by
   allowing threads different from the main one to call
   the `rd_kafka_toppar_set_fetch_state` function, given they hold
   the lock on the `rktp`.
 * In v2.1.0, a bug was fixed which caused polling any queue to reset the
   `max.poll.interval.ms`. Only certain functions were made to reset the timer,
   but it is possible for the user to obtain the queue with messages from
   the broker, skipping these functions. This was fixed by encoding information
   in a queue itself, that, whether polling, resets the timer.



# librdkafka v2.1.0

librdkafka v2.1.0 is a feature release:

* [KIP-320](https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation)
  Allow fetchers to detect and handle log truncation (#4122).
* Fix a reference count issue blocking the consumer from closing (#4187).
* Fix a protocol issue with ListGroups API, where an extra
  field was appended for API Versions greater than or equal to 3 (#4207).
* Fix an issue with `max.poll.interval.ms`, where polling any queue would cause
  the timeout to be reset (#4176).
* Fix seek partition timeout, was one thousand times lower than the passed
  value (#4230).
* Fix multiple inconsistent behaviour in batch APIs during **pause** or **resume** operations (#4208).
  See **Consumer fixes** section below for more information.
* Update lz4.c from upstream. Fixes [CVE-2021-3520](https://github.com/advisories/GHSA-gmc7-pqv9-966m)
  (by @filimonov, #4232).
* Upgrade OpenSSL to v3.0.8 with various security fixes,
  check the [release notes](https://www.openssl.org/news/cl30.txt) (#4215).

## Enhancements

 * Added `rd_kafka_topic_partition_get_leader_epoch()` (and `set..()`).
 * Added partition leader epoch APIs:
   - `rd_kafka_topic_partition_get_leader_epoch()` (and `set..()`)
   - `rd_kafka_message_leader_epoch()`
   - `rd_kafka_*assign()` and `rd_kafka_seek_partitions()` now supports
     partitions with a leader epoch set.
   - `rd_kafka_offsets_for_times()` will return per-partition leader-epochs.
   - `leader_epoch`, `stored_leader_epoch`, and `committed_leader_epoch`
     added to per-partition statistics.


## Fixes

### OpenSSL fixes

 * Fixed OpenSSL static build not able to use external modules like FIPS
   provider module.

### Consumer fixes

 * A reference count issue was blocking the consumer from closing.
   The problem would happen when a partition is lost, because forcibly
   unassigned from the consumer or if the corresponding topic is deleted.
 * When using `rd_kafka_seek_partitions`, the remaining timeout was
   converted from microseconds to milliseconds but the expected unit
   for that parameter is microseconds.
 * Fixed known issues related to Batch Consume APIs mentioned in v2.0.0
   release notes.
 * Fixed `rd_kafka_consume_batch()` and `rd_kafka_consume_batch_queue()`
   intermittently updating `app_offset` and `store_offset` incorrectly when
   **pause** or **resume** was being used for a partition.
 * Fixed `rd_kafka_consume_batch()` and `rd_kafka_consume_batch_queue()`
   intermittently skipping offsets when **pause** or **resume** was being
   used for a partition.


## Known Issues

### Consume Batch API

 * When `rd_kafka_consume_batch()` and `rd_kafka_consume_batch_queue()` APIs are used with
   any of the **seek**, **pause**, **resume** or **rebalancing** operation, `on_consume`
   interceptors might be called incorrectly (maybe multiple times) for not consumed messages.

### Consume API

 * Duplicate messages can be emitted when a fetch response is received
   in the middle of an offset validation request.
 * Segmentation fault when subscribing to a non-existent topic and
   calling `rd_kafka_message_leader_epoch()` on the polled `rkmessage`.



# librdkafka v2.0.2

librdkafka v2.0.2 is a maintenance release:

* Fix OpenSSL version in Win32 nuget package (#4152).



# librdkafka v2.0.1

librdkafka v2.0.1 is a maintenance release:

* Fixed nuget package for Linux ARM64 release (#4150).



# librdkafka v2.0.0

librdkafka v2.0.0 is a feature release:

 * [KIP-88](https://cwiki.apache.org/confluence/display/KAFKA/KIP-88%3A+OffsetFetch+Protocol+Update)
   OffsetFetch Protocol Update (#3995).
 * [KIP-222](https://cwiki.apache.org/confluence/display/KAFKA/KIP-222+-+Add+Consumer+Group+operations+to+Admin+API)
   Add Consumer Group operations to Admin API (started by @lesterfan, #3995).
 * [KIP-518](https://cwiki.apache.org/confluence/display/KAFKA/KIP-518%3A+Allow+listing+consumer+groups+per+state)
   Allow listing consumer groups per state (#3995).
 * [KIP-396](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97551484)
   Partially implemented: support for AlterConsumerGroupOffsets
   (started by @lesterfan, #3995).
 * OpenSSL 3.0.x support - the maximum bundled OpenSSL version is now 3.0.7 (previously 1.1.1q).
 * Fixes to the transactional and idempotent producer.


## Upgrade considerations

### OpenSSL 3.0.x

#### OpenSSL default ciphers

The introduction of OpenSSL 3.0.x in the self-contained librdkafka bundles
changes the default set of available ciphers, in particular all obsolete
or insecure ciphers and algorithms as listed in the
OpenSSL [legacy](https://www.openssl.org/docs/man3.0/man7/OSSL_PROVIDER-legacy.html)
manual page are now disabled by default.

**WARNING**: These ciphers are disabled for security reasons and it is
highly recommended NOT to use them.

Should you need to use any of these old ciphers you'll need to explicitly
enable the `legacy` provider by configuring `ssl.providers=default,legacy`
on the librdkafka client.

#### OpenSSL engines and providers

OpenSSL 3.0.x deprecates the use of engines, which is being replaced by
providers. As such librdkafka will emit a deprecation warning if
`ssl.engine.location` is configured.

OpenSSL providers may be configured with the new `ssl.providers`
configuration property.

### Broker TLS certificate hostname verification

The default value for `ssl.endpoint.identification.algorithm` has been
changed from `none` (no hostname verification) to `https`, which enables
broker hostname verification (to counter man-in-the-middle
impersonation attacks) by default.

To restore the previous behaviour, set `ssl.endpoint.identification.algorithm` to `none`.

## Known Issues

### Poor Consumer batch API messaging guarantees

The Consumer Batch APIs `rd_kafka_consume_batch()` and `rd_kafka_consume_batch_queue()`
are not thread safe if `rkmessages_size` is greater than 1 and any of the **seek**,
**pause**, **resume** or **rebalancing** operation is performed in parallel with any of
the above APIs. Some of the messages might be lost, or erroneously returned to the
application, in the above scenario.

It is strongly recommended to use the Consumer Batch APIs and the mentioned
operations in sequential order in order to get consistent result.

For **rebalancing** operation to work in sequencial manner, please set `rebalance_cb`
configuration property (refer [examples/rdkafka_complex_consumer_example.c]
(examples/rdkafka_complex_consumer_example.c) for the help with the usage) for the consumer.

## Enhancements

 * Self-contained static libraries can now be built on Linux arm64 (#4005).
 * Updated to zlib 1.2.13, zstd 1.5.2, and curl 7.86.0 in self-contained
   librdkafka bundles.
 * Added `on_broker_state_change()` interceptor
 * The C++ API no longer returns strings by const value, which enables better move optimization in callers.
 * Added `rd_kafka_sasl_set_credentials()` API to update SASL credentials.
 * Setting `allow.auto.create.topics` will no longer give a warning if used by a producer, since that is an expected use case.
  Improvement in documentation for this property.
 * Added a `resolve_cb` configuration setting that permits using custom DNS resolution logic.
 * Added `rd_kafka_mock_broker_error_stack_cnt()`.
 * The librdkafka.redist NuGet package has been updated to have fewer external
   dependencies for its bundled librdkafka builds, as everything but cyrus-sasl
   is now built-in. There are bundled builds with and without linking to
   cyrus-sasl for maximum compatibility.
 * Admin API DescribeGroups() now provides the group instance id
   for static members [KIP-345](https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances) (#3995).


## Fixes

### General fixes

 * Windows: couldn't read a PKCS#12 keystore correctly because binary mode
   wasn't explicitly set and Windows defaults to text mode.
 * Fixed memory leak when loading SSL certificates (@Mekk, #3930)
 * Load all CA certificates from `ssl.ca.pem`, not just the first one.
 * Each HTTP request made when using OAUTHBEARER OIDC would leak a small
   amount of memory.

### Transactional producer fixes

 * When a PID epoch bump is requested and the producer is waiting
   to reconnect to the transaction coordinator, a failure in a find coordinator
   request could cause an assert to fail. This is fixed by retrying when the
   coordinator is known (#4020).
 * Transactional APIs (except `send_offsets_for_transaction()`) that
   timeout due to low timeout_ms may now be resumed by calling the same API
   again, as the operation continues in the background.
 * For fatal idempotent producer errors that may be recovered by bumping the
   epoch the current transaction must first be aborted prior to the epoch bump.
   This is now handled correctly, which fixes issues seen with fenced
   transactional producers on fatal idempotency errors.
 * Timeouts for EndTxn requests (transaction commits and aborts) are now
   automatically retried and the error raised to the application is also
   a retriable error.
 * TxnOffsetCommitRequests were retried immediately upon temporary errors in
   `send_offsets_to_transactions()`, causing excessive network requests.
   These retries are now delayed 500ms.
 * If `init_transactions()` is called with an infinite timeout (-1),
   the timeout will be limited to 2 * `transaction.timeout.ms`.
   The application may retry and resume the call if a retriable error is
   returned.


### Consumer fixes

 * Back-off and retry JoinGroup request if coordinator load is in progress.
 * Fix `rd_kafka_consume_batch()` and `rd_kafka_consume_batch_queue()` skipping
   other partitions' offsets intermittently when **seek**, **pause**, **resume**
   or **rebalancing** is used for a partition.
 * Fix `rd_kafka_consume_batch()` and `rd_kafka_consume_batch_queue()`
   intermittently returing incorrect partitions' messages if **rebalancing**
   happens during these operations.

# librdkafka v1.9.2

librdkafka v1.9.2 is a maintenance release:

 * The SASL OAUTHBEAR OIDC POST field was sometimes truncated by one byte (#3192).
 * The bundled version of OpenSSL has been upgraded to version 1.1.1q for non-Windows builds. Windows builds remain on OpenSSL 1.1.1n for the time being.
 * The bundled version of Curl has been upgraded to version 7.84.0.



# librdkafka v1.9.1

librdkafka v1.9.1 is a maintenance release:

 * The librdkafka.redist NuGet package now contains OSX M1/arm64 builds.
 * Self-contained static libraries can now be built on OSX M1 too, thanks to
   disabling curl's configure runtime check.



# librdkafka v1.9.0

librdkafka v1.9.0 is a feature release:

 * Added KIP-768 OUATHBEARER OIDC support (by @jliunyu, #3560)
 * Added KIP-140 Admin API ACL support (by @emasab, #2676)


## Upgrade considerations

 * Consumer:
   `rd_kafka_offsets_store()` (et.al) will now return an error for any
   partition that is not currently assigned (through `rd_kafka_*assign()`).
   This prevents a race condition where an application would store offsets
   after the assigned partitions had been revoked (which resets the stored
   offset), that could cause these old stored offsets to be committed later
   when the same partitions were assigned to this consumer again - effectively
   overwriting any committed offsets by any consumers that were assigned the
   same partitions previously. This would typically result in the offsets
   rewinding and messages to be reprocessed.
   As an extra effort to avoid this situation the stored offset is now
   also reset when partitions are assigned (through `rd_kafka_*assign()`).
   Applications that explicitly call `..offset*_store()` will now need
   to handle the case where `RD_KAFKA_RESP_ERR__STATE` is returned
   in the per-partition `.err` field - meaning the partition is no longer
   assigned to this consumer and the offset could not be stored for commit.


## Enhancements

 * Improved producer queue scheduling. Fixes the performance regression
   introduced in v1.7.0 for some produce patterns. (#3538, #2912)
 * Windows: Added native Win32 IO/Queue scheduling. This removes the
   internal TCP loopback connections that were previously used for timely
   queue wakeups.
 * Added `socket.connection.setup.timeout.ms` (default 30s).
   The maximum time allowed for broker connection setups (TCP connection as
   well as SSL and SASL handshakes) is now limited to this value.
   This fixes the issue with stalled broker connections in the case of network
   or load balancer problems.
   The Java clients has an exponential backoff to this timeout which is
   limited by `socket.connection.setup.timeout.max.ms` - this was not
   implemented in librdkafka due to differences in connection handling and
   `ERR__ALL_BROKERS_DOWN` error reporting. Having a lower initial connection
   setup timeout and then increase the timeout for the next attempt would
   yield possibly false-positive `ERR__ALL_BROKERS_DOWN` too early.
 * SASL OAUTHBEARER refresh callbacks can now be scheduled for execution
   on librdkafka's background thread. This solves the problem where an
   application has a custom SASL OAUTHBEARER refresh callback and thus needs to
   call `rd_kafka_poll()` (et.al.) at least once to trigger the
   refresh callback before being able to connect to brokers.
   With the new `rd_kafka_conf_enable_sasl_queue()` configuration API and
   `rd_kafka_sasl_background_callbacks_enable()` the refresh callbacks
   can now be triggered automatically on the librdkafka background thread.
 * `rd_kafka_queue_get_background()` now creates the background thread
   if not already created.
 * Added `rd_kafka_consumer_close_queue()` and `rd_kafka_consumer_closed()`.
   This allow applications and language bindings to implement asynchronous
   consumer close.
 * Bundled zlib upgraded to version 1.2.12.
 * Bundled OpenSSL upgraded to 1.1.1n.
 * Added `test.mock.broker.rtt` to simulate RTT/latency for mock brokers.


## Fixes

### General fixes

 * Fix various 1 second delays due to internal broker threads blocking on IO
   even though there are events to handle.
   These delays could be seen randomly in any of the non produce/consume
   request APIs, such as `commit_transaction()`, `list_groups()`, etc.
 * Windows: some applications would crash with an error message like
   `no OPENSSL_Applink()` written to the console if `ssl.keystore.location`
   was configured.
   This regression was introduced in v1.8.0 due to use of vcpkgs and how
   keystore file was read. #3554.
 * Windows 32-bit only: 64-bit atomic reads were in fact not atomic and could
   in rare circumstances yield incorrect values.
   One manifestation of this issue was the `max.poll.interval.ms` consumer
   timer expiring even though the application was polling according to profile.
   Fixed by @WhiteWind (#3815).
 * `rd_kafka_clusterid()` would previously fail with timeout if
   called on cluster with no visible topics (#3620).
   The clusterid is now returned as soon as metadata has been retrieved.
 * Fix hang in `rd_kafka_list_groups()` if there are no available brokers
   to connect to (#3705).
 * Millisecond timeouts (`timeout_ms`) in various APIs, such as `rd_kafka_poll()`,
   was limited to roughly 36 hours before wrapping. (#3034)
 * If a metadata request triggered by `rd_kafka_metadata()` or consumer group rebalancing
   encountered a non-retriable error it would not be propagated to the caller and thus
   cause a stall or timeout, this has now been fixed. (@aiquestion, #3625)
 * AdminAPI `DeleteGroups()` and `DeleteConsumerGroupOffsets()`:
   if the given coordinator connection was not up by the time these calls were
   initiated and the first connection attempt failed then no further connection
   attempts were performed, ulimately leading to the calls timing out.
   This is now fixed by keep retrying to connect to the group coordinator
   until the connection is successful or the call times out.
   Additionally, the coordinator will be now re-queried once per second until
   the coordinator comes up or the call times out, to detect change in
   coordinators.
 * Mock cluster `rd_kafka_mock_broker_set_down()` would previously
   accept and then disconnect new connections, it now refuses new connections.


### Consumer fixes

 * `rd_kafka_offsets_store()` (et.al) will now return an error for any
   partition that is not currently assigned (through `rd_kafka_*assign()`).
   See **Upgrade considerations** above for more information.
 * `rd_kafka_*assign()` will now reset/clear the stored offset.
   See **Upgrade considerations** above for more information.
 * `seek()` followed by `pause()` would overwrite the seeked offset when
   later calling `resume()`. This is now fixed. (#3471).
   **Note**: Avoid storing offsets (`offsets_store()`) after calling
   `seek()` as this may later interfere with resuming a paused partition,
   instead store offsets prior to calling seek.
 * A `ERR_MSG_SIZE_TOO_LARGE` consumer error would previously be raised
   if the consumer received a maximum sized FetchResponse only containing
   (transaction) aborted messages with no control messages. The fetching did
   not stop, but some applications would terminate upon receiving this error.
   No error is now raised in this case. (#2993)
   Thanks to @jacobmikesell for providing an application to reproduce the
   issue.
 * The consumer no longer backs off the next fetch request (default 500ms) when
   the parsed fetch response is truncated (which is a valid case).
   This should speed up the message fetch rate in case of maximum sized
   fetch responses.
 * Fix consumer crash (`assert: rkbuf->rkbuf_rkb`) when parsing
   malformed JoinGroupResponse consumer group metadata state.
 * Fix crash (`cant handle op type`) when using `consume_batch_queue()` (et.al)
   and an OAUTHBEARER refresh callback was set.
   The callback is now triggered by the consume call. (#3263)
 * Fix `partition.assignment.strategy` ordering when multiple strategies are configured.
   If there is more than one eligible strategy, preference is determined by the
   configured order of strategies. The partitions are assigned to group members according
   to the strategy order preference now. (#3818)
 * Any form of unassign*() (absolute or incremental) is now allowed during
   consumer close rebalancing and they're all treated as absolute unassigns.
   (@kevinconaway)


### Transactional producer fixes

 * Fix message loss in idempotent/transactional producer.
   A corner case has been identified that may cause idempotent/transactional
   messages to be lost despite being reported as successfully delivered:
   During cluster instability a restarting broker may report existing topics
   as non-existent for some time before it is able to acquire up to date
   cluster and topic metadata.
   If an idempotent/transactional producer updates its topic metadata cache
   from such a broker the producer will consider the topic to be removed from
   the cluster and thus remove its local partition objects for the given topic.
   This also removes the internal message sequence number counter for the given
   partitions.
   If the producer later receives proper topic metadata for the cluster the
   previously "removed" topics will be rediscovered and new partition objects
   will be created in the producer. These new partition objects, with no
   knowledge of previous incarnations, would start counting partition messages
   at zero again.
   If new messages were produced for these partitions by the same producer
   instance, the same message sequence numbers would be sent to the broker.
   If the broker still maintains state for the producer's PID and Epoch it could
   deem that these messages with reused sequence numbers had already been
   written to the log and treat them as legit duplicates.
   This would seem to the producer that these new messages were successfully
   written to the partition log by the broker when they were in fact discarded
   as duplicates, leading to silent message loss.
   The fix included in this release is to save the per-partition idempotency
   state when a partition is removed, and then recover and use that saved
   state if the partition comes back at a later time.
 * The transactional producer would retry (re)initializing its PID if a
   `PRODUCER_FENCED` error was returned from the
   broker (added in Apache Kafka 2.8), which could cause the producer to
   seemingly hang.
   This error code is now correctly handled by raising a fatal error.
 * If the given group coordinator connection was not up by the time
   `send_offsets_to_transactions()` was called, and the first connection
   attempt failed then no further connection attempts were performed, ulimately
   leading to `send_offsets_to_transactions()` timing out, and possibly
   also the transaction timing out on the transaction coordinator.
   This is now fixed by keep retrying to connect to the group coordinator
   until the connection is successful or the call times out.
   Additionally, the coordinator will be now re-queried once per second until
   the coordinator comes up or the call times out, to detect change in
   coordinators.


### Producer fixes

 * Improved producer queue wakeup scheduling. This should significantly
   decrease the number of wakeups and thus syscalls for high message rate
   producers. (#3538, #2912)
 * The logic for enforcing that `message.timeout.ms` is greather than
   an explicitly configured `linger.ms` was incorrect and instead of
   erroring out early the lingering time was automatically adjusted to the
   message timeout, ignoring the configured `linger.ms`.
   This has now been fixed so that an error is returned when instantiating the
   producer. Thanks to @larry-cdn77 for analysis and test-cases. (#3709)


# librdkafka v1.8.2

librdkafka v1.8.2 is a maintenance release.

## Enhancements

 * Added `ssl.ca.pem` to add CA certificate by PEM string. (#2380)
 * Prebuilt binaries for Mac OSX now contain statically linked OpenSSL v1.1.1l.
   Previously the OpenSSL version was either v1.1.1 or v1.0.2 depending on
   build type.

## Fixes

 * The `librdkafka.redist` 1.8.0 package had two flaws:
   - the linux-arm64 .so build was a linux-x64 build.
   - the included Windows MSVC 140 runtimes for x64 were infact x86.
   The release script has been updated to verify the architectures of
   provided artifacts to avoid this happening in the future.
 * Prebuilt binaries for Mac OSX Sierra (10.12) and older are no longer provided.
   This affects [confluent-kafka-go](https://github.com/confluentinc/confluent-kafka-go).
 * Some of the prebuilt binaries for Linux were built on Ubuntu 14.04,
   these builds are now performed on Ubuntu 16.04 instead.
   This may affect users on ancient Linux distributions.
 * It was not possible to configure `ssl.ca.location` on OSX, the property
   would automatically revert back to `probe` (default value).
   This regression was introduced in v1.8.0. (#3566)
 * librdkafka's internal timers would not start if the timeout was set to 0,
   which would result in some timeout operations not being enforced correctly,
   e.g., the transactional producer API timeouts.
   These timers are now started with a timeout of 1 microsecond.

### Transactional producer fixes

 * Upon quick repeated leader changes the transactional producer could receive
   an `OUT_OF_ORDER_SEQUENCE` error from the broker, which triggered an
   Epoch bump on the producer resulting in an InitProducerIdRequest being sent
   to the transaction coordinator in the middle of a transaction.
   This request would start a new transaction on the coordinator, but the
   producer would still think (erroneously) it was in current transaction.
   Any messages produced in the current transaction prior to this event would
   be silently lost when the application committed the transaction, leading
   to message loss.
   This has been fixed by setting the Abortable transaction error state
   in the producer. #3575.
 * The transactional producer could stall during a transaction if the transaction
   coordinator changed while adding offsets to the transaction (send_offsets_to_transaction()).
   This stall lasted until the coordinator connection went down, the
   transaction timed out, transaction was aborted, or messages were produced
   to a new partition, whichever came first. #3571.



*Note: there was no v1.8.1 librdkafka release*


# librdkafka v1.8.0

librdkafka v1.8.0 is a security release:

 * Upgrade bundled zlib version from 1.2.8 to 1.2.11 in the `librdkafka.redist`
   NuGet package. The updated zlib version fixes CVEs:
   CVE-2016-9840, CVE-2016-9841, CVE-2016-9842, CVE-2016-9843
   See https://github.com/confluentinc/librdkafka/issues/2934 for more information.
 * librdkafka now uses [vcpkg](https://vcpkg.io/) for up-to-date Windows
   dependencies in the `librdkafka.redist` NuGet package:
   OpenSSL 1.1.1l, zlib 1.2.11, zstd 1.5.0.
 * The upstream dependency (OpenSSL, zstd, zlib) source archive checksums are
   now verified when building with `./configure --install-deps`.
   These builds are used by the librdkafka builds bundled with
   confluent-kafka-go, confluent-kafka-python and confluent-kafka-dotnet.


## Enhancements

 * Producer `flush()` now overrides the `linger.ms` setting for the duration
   of the `flush()` call, effectively triggering immediate transmission of
   queued messages. (#3489)

## Fixes

### General fixes

 * Correctly detect presence of zlib via compilation check. (Chris Novakovic)
 * `ERR__ALL_BROKERS_DOWN` is no longer emitted when the coordinator
   connection goes down, only when all standard named brokers have been tried.
   This fixes the issue with `ERR__ALL_BROKERS_DOWN` being triggered on
   `consumer_close()`. It is also now only emitted if the connection was fully
   up (past handshake), and not just connected.
 * `rd_kafka_query_watermark_offsets()`, `rd_kafka_offsets_for_times()`,
   `consumer_lag` metric, and `auto.offset.reset` now honour
   `isolation.level` and will return the Last Stable Offset (LSO)
   when `isolation.level` is set to `read_committed` (default), rather than
   the uncommitted high-watermark when it is set to `read_uncommitted`. (#3423)
 * SASL GSSAPI is now usable when `sasl.kerberos.min.time.before.relogin`
   is set to 0 - which disables ticket refreshes (by @mpekalski, #3431).
 * Rename internal crc32c() symbol to rd_crc32c() to avoid conflict with
   other static libraries (#3421).
 * `txidle` and `rxidle` in the statistics object was emitted as 18446744073709551615 when no idle was known. -1 is now emitted instead. (#3519)


### Consumer fixes

 * Automatically retry offset commits on `ERR_REQUEST_TIMED_OUT`,
   `ERR_COORDINATOR_NOT_AVAILABLE`, and `ERR_NOT_COORDINATOR` (#3398).
   Offset commits will be retried twice.
 * Timed auto commits did not work when only using assign() and not subscribe().
   This regression was introduced in v1.7.0.
 * If the topics matching the current subscription changed (or the application
   updated the subscription) while there was an outstanding JoinGroup or
   SyncGroup request, an additional request would sometimes be sent before
   handling the response of the first. This in turn lead to internal state
   issues that could cause a crash or malbehaviour.
   The consumer will now wait for any outstanding JoinGroup or SyncGroup
   responses before re-joining the group.
 * `auto.offset.reset` could previously be triggered by temporary errors,
   such as disconnects and timeouts (after the two retries are exhausted).
   This is now fixed so that the auto offset reset policy is only triggered
   for permanent errors.
 * The error that triggers `auto.offset.reset` is now logged to help the
   application owner identify the reason of the reset.
 * If a rebalance takes longer than a consumer's `session.timeout.ms`, the
   consumer will remain in the group as long as it receives heartbeat responses
   from the broker.


### Admin fixes

 * `DeleteRecords()` could crash if one of the underlying requests
   (for a given partition leader) failed at the transport level (e.g., timeout).
   (#3476).



# librdkafka v1.7.0

librdkafka v1.7.0 is feature release:

 * [KIP-360](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89068820) - Improve reliability of transactional producer.
   Requires Apache Kafka 2.5 or later.
 * OpenSSL Engine support (`ssl.engine.location`) by @adinigam and @ajbarb.


## Enhancements

 * Added `connections.max.idle.ms` to automatically close idle broker
   connections.
   This feature is disabled by default unless `bootstrap.servers` contains
   the string `azure` in which case the default is set to <4 minutes to improve
   connection reliability and circumvent limitations with the Azure load
   balancers (see #3109 for more information).
 * Bumped to OpenSSL 1.1.1k in binary librdkafka artifacts.
 * The binary librdkafka artifacts for Alpine are now using Alpine 3.12.
   OpenSSL 1.1.1k.
 * Improved static librdkafka Windows builds using MinGW (@neptoess, #3130).
 * The `librdkafka.redist` NuGet package now has updated zlib, zstd and
   OpenSSL versions (from vcpkg).


## Security considerations

 * The zlib version bundled with the `librdkafka.redist` NuGet package has now been upgraded
   from zlib 1.2.8 to 1.2.11, fixing the following CVEs:
   * CVE-2016-9840: undefined behaviour (compiler dependent) in inflate (decompression) code: this is used by the librdkafka consumer. Risk of successfully exploitation through consumed messages is eastimated very low.
   * CVE-2016-9841: undefined behaviour (compiler dependent) in inflate code: this is used by the librdkafka consumer. Risk of successfully exploitation through consumed messages is eastimated very low.
   * CVE-2016-9842: undefined behaviour in inflateMark(): this API is not used by librdkafka.
   * CVE-2016-9843: issue in crc32_big() which is called from crc32_z(): this API is not used by librdkafka.

## Upgrade considerations

 * The C++ `oauthbearer_token_refresh_cb()` was missing a `Handle *`
   argument that has now been added. This is a breaking change but the original
   function signature is considered a bug.
   This change only affects C++ OAuth developers.
 * [KIP-735](https://cwiki.apache.org/confluence/display/KAFKA/KIP-735%3A+Increase+default+consumer+session+timeout) The consumer `session.timeout.ms`
   default was changed from 10 to 45 seconds to make consumer groups more
   robust and less sensitive to temporary network and cluster issues.
 * Statistics: `consumer_lag` is now using the `committed_offset`,
   while the new `consumer_lag_stored` is using `stored_offset`
   (offset to be committed).
   This is more correct than the previous `consumer_lag` which was using
   either `committed_offset` or `app_offset` (last message passed
   to application).
 * The `librdkafka.redist` NuGet package is now built with MSVC runtime v140
   (VS 2015). Previous versions were built with MSVC runtime v120 (VS 2013).


## Fixes

### General fixes

 * Fix accesses to freed metadata cache mutexes on client termination (#3279)
 * There was a race condition on receiving updated metadata where a broker id
   update (such as bootstrap to proper broker transformation) could finish after
   the topic metadata cache was updated, leading to existing brokers seemingly
   being not available.
   One occurrence of this issue was query_watermark_offsets() that could return
   `ERR__UNKNOWN_PARTITION` for existing partitions shortly after the
   client instance was created.
 * The OpenSSL context is now initialized with `TLS_client_method()`
   (on OpenSSL >= 1.1.0) instead of the deprecated and outdated
   `SSLv23_client_method()`.
 * The initial cluster connection on client instance creation could sometimes
   be delayed up to 1 second if a `group.id` or `transactional.id`
   was configured (#3305).
 * Speed up triggering of new broker connections in certain cases by exiting
   the broker thread io/op poll loop when a wakeup op is received.
 * SASL GSSAPI: The Kerberos kinit refresh command was triggered from
   `rd_kafka_new()` which made this call blocking if the refresh command
   was taking long. The refresh is now performed by the background rdkafka
   main thread.
 * Fix busy-loop (100% CPU on the broker threads) during the handshake phase
   of an SSL connection.
 * Disconnects during SSL handshake are now propagated as transport errors
   rather than SSL errors, since these disconnects are at the transport level
   (e.g., incorrect listener, flaky load balancer, etc) and not due to SSL
   issues.
 * Increment metadata fast refresh interval backoff exponentially (@ajbarb, #3237).
 * Unthrottled requests are no longer counted in the `brokers[].throttle`
   statistics object.
 * Log CONFWARN warning when global topic configuration properties
   are overwritten by explicitly setting a `default_topic_conf`.

### Consumer fixes

 * If a rebalance happened during a `consume_batch..()` call the already
   accumulated messages for revoked partitions were not purged, which would
   pass messages to the application for partitions that were no longer owned
   by the consumer. Fixed by @jliunyu. #3340.
 * Fix balancing and reassignment issues with the cooperative-sticky assignor.
   #3306.
 * Fix incorrect detection of first rebalance in sticky assignor (@hallfox).
 * Aborted transactions with no messages produced to a partition could
   cause further successfully committed messages in the same Fetch response to
   be ignored, resulting in consumer-side message loss.
   A log message along the lines `Abort txn ctrl msg bad order at offset
   7501: expected before or at 7702: messages in aborted transactions may be delivered to the application`
   would be seen.
   This is a rare occurrence where a transactional producer would register with
   the partition but not produce any messages before aborting the transaction.
 * The consumer group deemed cached metadata up to date by checking
   `topic.metadata.refresh.interval.ms`: if this property was set too low
   it would cause cached metadata to be unusable and new metadata to be fetched,
   which could delay the time it took for a rebalance to settle.
   It now correctly uses `metadata.max.age.ms` instead.
 * The consumer group timed auto commit would attempt commits during rebalances,
   which could result in "Illegal generation" errors. This is now fixed, the
   timed auto committer is only employed in the steady state when no rebalances
   are taking places. Offsets are still auto committed when partitions are
   revoked.
 * Retriable FindCoordinatorRequest errors are no longer propagated to
   the application as they are retried automatically.
 * Fix rare crash (assert `rktp_started`) on consumer termination
   (introduced in v1.6.0).
 * Fix unaligned access and possibly corrupted snappy decompression when
   building with MSVC (@azat)
 * A consumer configured with the `cooperative-sticky` assignor did
   not actively Leave the group on unsubscribe(). This delayed the
   rebalance for the remaining group members by up to `session.timeout.ms`.
 * The current subscription list was sometimes leaked when unsubscribing.

### Producer fixes

 * The timeout value of `flush()` was not respected when delivery reports
   were scheduled as events (such as for confluent-kafka-go) rather than
   callbacks.
 * There was a race conditition in `purge()` which could cause newly
   created partition objects, or partitions that were changing leaders, to
   not have their message queues purged. This could cause
   `abort_transaction()` to time out. This issue is now fixed.
 * In certain high-thruput produce rate patterns producing could stall for
   1 second, regardless of `linger.ms`, due to rate-limiting of internal
   queue wakeups. This is now fixed by not rate-limiting queue wakeups but
   instead limiting them to one wakeup per queue reader poll. #2912.

### Transactional Producer fixes

 * KIP-360: Fatal Idempotent producer errors are now recoverable by the
   transactional producer and will raise a `txn_requires_abort()` error.
 * If the cluster went down between `produce()` and `commit_transaction()`
   and before any partitions had been registered with the coordinator, the
   messages would time out but the commit would succeed because nothing
   had been sent to the coordinator. This is now fixed.
 * If the current transaction failed while `commit_transaction()` was
   checking the current transaction state an invalid state transaction could
   occur which in turn would trigger a assertion crash.
   This issue showed up as "Invalid txn state transition: .." crashes, and is
   now fixed by properly synchronizing both checking and transition of state.



# librdkafka v1.6.1

librdkafka v1.6.1 is a maintenance release.

## Upgrade considerations

 * Fatal idempotent producer errors are now also fatal to the transactional
   producer. This is a necessary step to maintain data integrity prior to
   librdkafka supporting KIP-360. Applications should check any transactional
   API errors for the is_fatal flag and decommission the transactional producer
   if the flag is set.
 * The consumer error raised by `auto.offset.reset=error` now has error-code
   set to `ERR__AUTO_OFFSET_RESET` to allow an application to differentiate
   between auto offset resets and other consumer errors.


## Fixes

### General fixes

 * Admin API and transactional `send_offsets_to_transaction()` coordinator
   requests, such as TxnOffsetCommitRequest, could in rare cases be sent
   multiple times which could cause a crash.
 * `ssl.ca.location=probe` is now enabled by default on Mac OSX since the
   librdkafka-bundled OpenSSL might not have the same default CA search paths
   as the system or brew installed OpenSSL. Probing scans all known locations.

### Transactional Producer fixes

 * Fatal idempotent producer errors are now also fatal to the transactional
   producer.
 * The transactional producer could crash if the transaction failed while
   `send_offsets_to_transaction()` was called.
 * Group coordinator requests for transactional
   `send_offsets_to_transaction()` calls would leak memory if the
   underlying request was attempted to be sent after the transaction had
   failed.
 * When gradually producing to multiple partitions (resulting in multiple
   underlying AddPartitionsToTxnRequests) subsequent partitions could get
   stuck in pending state under certain conditions. These pending partitions
   would not send queued messages to the broker and eventually trigger
   message timeouts, failing the current transaction. This is now fixed.
 * Committing an empty transaction (no messages were produced and no
   offsets were sent) would previously raise a fatal error due to invalid state
   on the transaction coordinator. We now allow empty/no-op transactions to
   be committed.

### Consumer fixes

 * The consumer will now retry indefinitely (or until the assignment is changed)
   to retrieve committed offsets. This fixes the issue where only two retries
   were attempted when outstanding transactions were blocking OffsetFetch
   requests with `ERR_UNSTABLE_OFFSET_COMMIT`. #3265





# librdkafka v1.6.0

librdkafka v1.6.0 is feature release:

 * [KIP-429 Incremental rebalancing](https://cwiki.apache.org/confluence/display/KAFKA/KIP-429%3A+Kafka+Consumer+Incremental+Rebalance+Protocol) with sticky
   consumer group partition assignor (KIP-54) (by @mhowlett).
 * [KIP-480 Sticky producer partitioning](https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner) (`sticky.partitioning.linger.ms`) -
   achieves higher throughput and lower latency through sticky selection
   of random partition (by @abbycriswell).
 * AdminAPI: Add support for `DeleteRecords()`, `DeleteGroups()` and
   `DeleteConsumerGroupOffsets()` (by @gridaphobe)
 * [KIP-447 Producer scalability for exactly once semantics](https://cwiki.apache.org/confluence/display/KAFKA/KIP-447%3A+Producer+scalability+for+exactly+once+semantics) -
   allows a single transactional producer to be used for multiple input
   partitions. Requires Apache Kafka 2.5 or later.
 * Transactional producer fixes and improvements, see **Transactional Producer fixes** below.
 * The [librdkafka.redist](https://www.nuget.org/packages/librdkafka.redist/)
   NuGet package now supports Linux ARM64/Aarch64.


## Upgrade considerations

 * Sticky producer partitioning (`sticky.partitioning.linger.ms`) is
   enabled by default (10 milliseconds) which affects the distribution of
   randomly partitioned messages, where previously these messages would be
   evenly distributed over the available partitions they are now partitioned
   to a single partition for the duration of the sticky time
   (10 milliseconds by default) before a new random sticky partition
   is selected.
 * The new KIP-447 transactional producer scalability guarantees are only
   supported on Apache Kafka 2.5 or later, on earlier releases you will
   need to use one producer per input partition for EOS. This limitation
   is not enforced by the producer or broker.
 * Error handling for the transactional producer has been improved, see
   the **Transactional Producer fixes** below for more information.


## Known issues

 * The Transactional Producer's API timeout handling is inconsistent with the
   underlying protocol requests, it is therefore strongly recommended that
   applications call `rd_kafka_commit_transaction()` and
   `rd_kafka_abort_transaction()` with the `timeout_ms` parameter
   set to `-1`, which will use the remaining transaction timeout.


## Enhancements

 * KIP-107, KIP-204: AdminAPI: Added `DeleteRecords()` (by @gridaphobe).
 * KIP-229: AdminAPI: Added `DeleteGroups()` (by @gridaphobe).
 * KIP-496: AdminAPI: Added `DeleteConsumerGroupOffsets()`.
 * KIP-464: AdminAPI: Added support for broker-side default partition count
   and replication factor for `CreateTopics()`.
 * Windows: Added `ssl.ca.certificate.stores` to specify a list of
   Windows Certificate Stores to read CA certificates from, e.g.,
   `CA,Root`. `Root` remains the default store.
 * Use reentrant `rand_r()` on supporting platforms which decreases lock
   contention (@azat).
 * Added `assignor` debug context for troubleshooting consumer partition
   assignments.
 * Updated to OpenSSL v1.1.1i when building dependencies.
 * Update bundled lz4 (used when `./configure --disable-lz4-ext`) to v1.9.3
   which has vast performance improvements.
 * Added `rd_kafka_conf_get_default_topic_conf()` to retrieve the
   default topic configuration object from a global configuration object.
 * Added `conf` debugging context to `debug` - shows set configuration
   properties on client and topic instantiation. Sensitive properties
   are redacted.
 * Added `rd_kafka_queue_yield()` to cancel a blocking queue call.
 * Will now log a warning when multiple ClusterIds are seen, which is an
   indication that the client might be erroneously configured to connect to
   multiple clusters which is not supported.
 * Added `rd_kafka_seek_partitions()` to seek multiple partitions to
   per-partition specific offsets.


## Fixes

### General fixes

 * Fix a use-after-free crash when certain coordinator requests were retried.
 * The C++ `oauthbearer_set_token()` function would call `free()` on
   a `new`-created pointer, possibly leading to crashes or heap corruption (#3194)

### Consumer fixes

 * The consumer assignment and consumer group implementations have been
   decoupled, simplified and made more strict and robust. This will sort out
   a number of edge cases for the consumer where the behaviour was previously
   undefined.
 * Partition fetch state was not set to STOPPED if OffsetCommit failed.
 * The session timeout is now enforced locally also when the coordinator
   connection is down, which was not previously the case.


### Transactional Producer fixes

 * Transaction commit or abort failures on the broker, such as when the
   producer was fenced by a newer instance, were not propagated to the
   application resulting in failed commits seeming successful.
   This was a critical race condition for applications that had a delay after
   producing messages (or sendings offsets) before committing or
   aborting the transaction. This issue has now been fixed and test coverage
   improved.
 * The transactional producer API would return `RD_KAFKA_RESP_ERR__STATE`
   when API calls were attempted after the transaction had failed, we now
   try to return the error that caused the transaction to fail in the first
   place, such as `RD_KAFKA_RESP_ERR__FENCED` when the producer has
   been fenced, or `RD_KAFKA_RESP_ERR__TIMED_OUT` when the transaction
   has timed out.
 * Transactional producer retry count for transactional control protocol
   requests has been increased from 3 to infinite, retriable errors
   are now automatically retried by the producer until success or the
   transaction timeout is exceeded. This fixes the case where
   `rd_kafka_send_offsets_to_transaction()` would fail the current
   transaction into an abortable state when `CONCURRENT_TRANSACTIONS` was
   returned by the broker (which is a transient error) and the 3 retries
   were exhausted.


### Producer fixes

 * Calling `rd_kafka_topic_new()` with a topic config object with
   `message.timeout.ms` set could sometimes adjust the global `linger.ms`
   property (if not explicitly configured) which was not desired, this is now
   fixed and the auto adjustment is only done based on the
   `default_topic_conf` at producer creation.
 * `rd_kafka_flush()` could previously return `RD_KAFKA_RESP_ERR__TIMED_OUT`
   just as the timeout was reached if the messages had been flushed but
   there were now no more messages. This has been fixed.




# librdkafka v1.5.3

librdkafka v1.5.3 is a maintenance release.

## Upgrade considerations

 * CentOS 6 is now EOL and is no longer included in binary librdkafka packages,
   such as NuGet.

## Fixes

### General fixes

 * Fix a use-after-free crash when certain coordinator requests were retried.
 * Coordinator requests could be left uncollected on instance destroy which
   could lead to hang.
 * Fix rare 1 second stalls by forcing rdkafka main thread wakeup when a new
   next-timer-to-be-fired is scheduled.
 * Fix additional cases where broker-side automatic topic creation might be
   triggered unexpectedly.
 * AdminAPI: The operation_timeout (on-broker timeout) previously defaulted to 0,
   but now defaults to `socket.timeout.ms` (60s).
 * Fix possible crash for Admin API protocol requests that fail at the
   transport layer or prior to sending.


### Consumer fixes

 * Consumer would not filter out messages for aborted transactions
   if the messages were compressed (#3020).
 * Consumer destroy without prior `close()` could hang in certain
   cgrp states (@gridaphobe, #3127).
 * Fix possible null dereference in `Message::errstr()` (#3140).
 * The `roundrobin` partition assignment strategy could get stuck in an
   endless loop or generate uneven assignments in case the group members
   had asymmetric subscriptions (e.g., c1 subscribes to t1,t2 while c2
   subscribes to t2,t3).  (#3159)
 * Mixing committed and logical or absolute offsets in the partitions
   passed to `rd_kafka_assign()` would in previous released ignore the
   logical or absolute offsets and use the committed offsets for all partitions.
   This is now fixed. (#2938)




# librdkafka v1.5.2

librdkafka v1.5.2 is a maintenance release.


## Upgrade considerations

 * The default value for the producer configuration property `retries` has
   been increased from 2 to infinity, effectively limiting Produce retries to
   only `message.timeout.ms`.
   As the reasons for the automatic internal retries vary (various broker error
   codes as well as transport layer issues), it doesn't make much sense to limit
   the number of retries for retriable errors, but instead only limit the
   retries based on the allowed time to produce a message.
 * The default value for the producer configuration property
   `request.timeout.ms` has been increased from 5 to 30 seconds to match
   the Apache Kafka Java producer default.
   This change yields increased robustness for broker-side congestion.


## Enhancements

 * The generated `CONFIGURATION.md` (through `rd_kafka_conf_properties_show())`)
   now include all properties and values, regardless if they were included in
   the build, and setting a disabled property or value through
   `rd_kafka_conf_set()` now returns `RD_KAFKA_CONF_INVALID` and provides
   a more useful error string saying why the property can't be set.
 * Consumer configs on producers and vice versa will now be logged with
   warning messages on client instantiation.

## Fixes

### Security fixes

 * There was an incorrect call to zlib's `inflateGetHeader()` with
   unitialized memory pointers that could lead to the GZIP header of a fetched
   message batch to be copied to arbitrary memory.
   This function call has now been completely removed since the result was
   not used.
   Reported by Ilja van Sprundel.


### General fixes

 * `rd_kafka_topic_opaque()` (used by the C++ API) would cause object
   refcounting issues when used on light-weight (error-only) topic objects
   such as consumer errors (#2693).
 * Handle name resolution failures when formatting IP addresses in error logs,
   and increase printed hostname limit to ~256 bytes (was ~60).
 * Broker sockets would be closed twice (thus leading to potential race
   condition with fd-reuse in other threads) if a custom `socket_cb` would
   return error.

### Consumer fixes

 * The `roundrobin` `partition.assignment.strategy` could crash (assert)
   for certain combinations of members and partitions.
   This is a regression in v1.5.0. (#3024)
 * The C++ `KafkaConsumer` destructor did not destroy the underlying
   C `rd_kafka_t` instance, causing a leak if `close()` was not used.
 * Expose rich error strings for C++ Consumer `Message->errstr()`.
 * The consumer could get stuck if an outstanding commit failed during
   rebalancing (#2933).
 * Topic authorization errors during fetching are now reported only once (#3072).

### Producer fixes

 * Topic authorization errors are now properly propagated for produced messages,
   both through delivery reports and as `ERR_TOPIC_AUTHORIZATION_FAILED`
   return value from `produce*()` (#2215)
 * Treat cluster authentication failures as fatal in the transactional
   producer (#2994).
 * The transactional producer code did not properly reference-count partition
   objects which could in very rare circumstances lead to a use-after-free bug
   if a topic was deleted from the cluster when a transaction was using it.
 * `ERR_KAFKA_STORAGE_ERROR` is now correctly treated as a retriable
   produce error (#3026).
 * Messages that timed out locally would not fail the ongoing transaction.
   If the application did not take action on failed messages in its delivery
   report callback and went on to commit the transaction, the transaction would
   be successfully committed, simply omitting the failed messages.
 * EndTxnRequests (sent on commit/abort) are only retried in allowed
   states (#3041).
   Previously the transaction could hang on commit_transaction() if an abortable
   error was hit and the EndTxnRequest was to be retried.


*Note: there was no v1.5.1 librdkafka release*




# librdkafka v1.5.0

The v1.5.0 release brings usability improvements, enhancements and fixes to
librdkafka.

## Enhancements

 * Improved broker connection error reporting with more useful information and
   hints on the cause of the problem.
 * Consumer: Propagate errors when subscribing to unavailable topics (#1540)
 * Producer: Add `batch.size` producer configuration property (#638)
 * Add `topic.metadata.propagation.max.ms` to allow newly manually created
   topics to be propagated throughout the cluster before reporting them
   as non-existent. This fixes race issues where CreateTopics() is
   quickly followed by produce().
 * Prefer least idle connection for periodic metadata refreshes, et.al.,
   to allow truly idle connections to time out and to avoid load-balancer-killed
   idle connection errors (#2845)
 * Added `rd_kafka_event_debug_contexts()` to get the debug contexts for
   a debug log line (by @wolfchimneyrock).
 * Added Test scenarios which define the cluster configuration.
 * Added MinGW-w64 builds (@ed-alertedh, #2553)
 * `./configure --enable-XYZ` now requires the XYZ check to pass,
   and `--disable-XYZ` disables the feature altogether (@benesch)
 * Added `rd_kafka_produceva()` which takes an array of produce arguments
   for situations where the existing `rd_kafka_producev()` va-arg approach
   can't be used.
 * Added `rd_kafka_message_broker_id()` to see the broker that a message
   was produced or fetched from, or an error was associated with.
 * Added RTT/delay simulation to mock brokers.


## Upgrade considerations

 * Subscribing to non-existent and unauthorized topics will now propagate
   errors `RD_KAFKA_RESP_ERR_UNKNOWN_TOPIC_OR_PART` and
   `RD_KAFKA_RESP_ERR_TOPIC_AUTHORIZATION_FAILED` to the application through
   the standard consumer error (the err field in the message object).
 * Consumer will no longer trigger auto creation of topics,
   `allow.auto.create.topics=true` may be used to re-enable the old deprecated
   functionality.
 * The default consumer pre-fetch queue threshold `queued.max.messages.kbytes`
   has been decreased from 1GB to 64MB to avoid excessive network usage for low
   and medium throughput consumer applications. High throughput consumer
   applications may need to manually set this property to a higher value.
 * The default consumer Fetch wait time has been increased from 100ms to 500ms
   to avoid excessive network usage for low throughput topics.
 * If OpenSSL is linked statically, or `ssl.ca.location=probe` is configured,
   librdkafka will probe known CA certificate paths and automatically use the
   first one found. This should alleviate the need to configure
   `ssl.ca.location` when the statically linked OpenSSL's OPENSSLDIR differs
   from the system's CA certificate path.
 * The heuristics for handling Apache Kafka < 0.10 brokers has been removed to
   improve connection error handling for modern Kafka versions.
   Users on Brokers 0.9.x or older should already be configuring
   `api.version.request=false` and `broker.version.fallback=...` so there
   should be no functional change.
 * The default producer batch accumulation time, `linger.ms`, has been changed
   from 0.5ms to 5ms to improve batch sizes and throughput while reducing
   the per-message protocol overhead.
   Applications that require lower produce latency than 5ms will need to
   manually set `linger.ms` to a lower value.
 * librdkafka's build tooling now requires Python 3.x (python3 interpreter).


## Fixes

### General fixes

 * The client could crash in rare circumstances on ApiVersion or
   SaslHandshake request timeouts (#2326)
 * `./configure --LDFLAGS='a=b, c=d'` with arguments containing = are now
   supported (by @sky92zwq).
 * `./configure` arguments now take precedence over cached `configure` variables
   from previous invocation.
 * Fix theoretical crash on coord request failure.
 * Unknown partition error could be triggered for existing partitions when
   additional partitions were added to a topic (@benesch, #2915)
 * Quickly refresh topic metadata for desired but non-existent partitions.
   This will speed up the initial discovery delay when new partitions are added
   to an existing topic (#2917).


### Consumer fixes

 * The roundrobin partition assignor could crash if subscriptions
   where asymmetrical (different sets from different members of the group).
   Thanks to @ankon and @wilmai for identifying the root cause (#2121).
 * The consumer assignors could ignore some topics if there were more subscribed
   topics than consumers in taking part in the assignment.
 * The consumer would connect to all partition leaders of a topic even
   for partitions that were not being consumed (#2826).
 * Initial consumer group joins should now be a couple of seconds quicker
   thanks expedited query intervals (@benesch).
 * Fix crash and/or inconsistent subscriptions when using multiple consumers
   (in the same process) with wildcard topics on Windows.
 * Don't propagate temporary offset lookup errors to application.
 * Immediately refresh topic metadata when partitions are reassigned to other
   brokers, avoiding a fetch stall of up to `topic.metadata.refresh.interval.ms`. (#2955)
 * Memory for batches containing control messages would not be freed when
   using the batch consume APIs (@pf-qiu, #2990).


### Producer fixes

 * Proper locking for transaction state in EndTxn handler.



# librdkafka v1.4.4

v1.4.4 is a maintenance release with the following fixes and enhancements:

 * Transactional producer could crash on request timeout due to dereferencing
   NULL pointer of non-existent response object.
 * Mark `rd_kafka_send_offsets_to_transaction()` CONCURRENT_TRANSACTION (et.al)
   errors as retriable.
 * Fix crash on transactional coordinator FindCoordinator request failure.
 * Minimize broker re-connect delay when broker's connection is needed to
   send requests.
 * Proper locking for transaction state in EndTxn handler.
 * `socket.timeout.ms` was ignored when `transactional.id` was set.
 * Added RTT/delay simulation to mock brokers.

*Note: there was no v1.4.3 librdkafka release*



# librdkafka v1.4.2

v1.4.2 is a maintenance release with the following fixes and enhancements:

 * Fix produce/consume hang after partition goes away and comes back,
   such as when a topic is deleted and re-created.
 * Consumer: Reset the stored offset when partitions are un-assign()ed (fixes #2782).
    This fixes the case where a manual offset-less commit() or the auto-committer
    would commit a stored offset from a previous assignment before
    a new message was consumed by the application.
 * Probe known CA cert paths and set default `ssl.ca.location` accordingly
   if OpenSSL is statically linked or `ssl.ca.location` is set to `probe`.
 * Per-partition OffsetCommit errors were unhandled (fixes #2791)
 * Seed the PRNG (random number generator) by default, allow application to
   override with `enable.random.seed=false` (#2795)
 * Fix stack overwrite (of 1 byte) when SaslHandshake MechCnt is zero
 * Align bundled c11 threads (tinycthreads) constants to glibc and musl (#2681)
 * Fix return value of rd_kafka_test_fatal_error() (by @ckb42)
 * Ensure CMake sets disabled defines to zero on Windows (@benesch)


*Note: there was no v1.4.1 librdkafka release*





# Older releases

See https://github.com/confluentinc/librdkafka/releases