Experiencing rollback and partial writes

Hi, my team uses RocksDB heavily in a popular iOS application (3m users daily).
We’ve noticed that RocksDB might sometimes lose writes or return partial updates (violating atomicity guarantee) upon re-open.
We use Wal-recovery = kPointInTimeRecovery. Some logs show that PointInTime recovery was triggered due to corruption. However, majority of cases don’t have anything special in the logs.
This issue happens to 20-30 users (daily), but affects consistency of our application.
Any tips or pointers would be welcome.
Config below. RocksDB version is 6.2.4

2021/04/18-23:37:07.731635 700005fb1000 DB SUMMARY
2021/04/18-23:37:07.731826 700005fb1000 CURRENT file: CURRENT
2021/04/18-23:37:07.731856 700005fb1000 IDENTITY file: IDENTITY
2021/04/18-23:37:07.731893 700005fb1000 MANIFEST file: MANIFEST-000006 size: 320 Bytes
2021/04/18-23:37:07.731924 700005fb1000 SST files in /Users/myeyesareblind/Library/Developer/CoreSimulator/Devices/FAC7D101-8BCC-4702-8867-DCDF31339031/data/Containers/Data/Application/99698604-5CEA-4D34-86FB-D6447A4779C4/Library/Databases/events_main.db dir, Total Num: 2, files: 000005.sst 000008.sst
2021/04/18-23:37:07.731945 700005fb1000 Write Ahead Log file in /Users/myeyesareblind/Library/Developer/CoreSimulator/Devices/FAC7D101-8BCC-4702-8867-DCDF31339031/data/Containers/Data/Application/99698604-5CEA-4D34-86FB-D6447A4779C4/Library/Databases/events_main.db: 000007.log size: 154833 ;
2021/04/18-23:37:07.731965 700005fb1000 Options.error_if_exists: 0
2021/04/18-23:37:07.731980 700005fb1000 Options.create_if_missing: 1
2021/04/18-23:37:07.731996 700005fb1000 Options.paranoid_checks: 1
2021/04/18-23:37:07.732010 700005fb1000 Options.env: 0x1163b32e0
2021/04/18-23:37:07.732025 700005fb1000 Options.info_log: 0x600002ca3c00
2021/04/18-23:37:07.732052 700005fb1000 Options.max_file_opening_threads: 16
2021/04/18-23:37:07.732074 700005fb1000 Options.statistics: 0x0
2021/04/18-23:37:07.732091 700005fb1000 Options.use_fsync: 0
2021/04/18-23:37:07.732108 700005fb1000 Options.max_log_file_size: 0
2021/04/18-23:37:07.732124 700005fb1000 Options.max_manifest_file_size: 1073741824
2021/04/18-23:37:07.732139 700005fb1000 Options.log_file_time_to_roll: 0
2021/04/18-23:37:07.732155 700005fb1000 Options.keep_log_file_num: 10
2021/04/18-23:37:07.732178 700005fb1000 Options.recycle_log_file_num: 0
2021/04/18-23:37:07.732193 700005fb1000 Options.allow_fallocate: 1
2021/04/18-23:37:07.732213 700005fb1000 Options.allow_mmap_reads: 0
2021/04/18-23:37:07.732233 700005fb1000 Options.allow_mmap_writes: 0
2021/04/18-23:37:07.732250 700005fb1000 Options.use_direct_reads: 0
2021/04/18-23:37:07.732265 700005fb1000 Options.use_direct_io_for_flush_and_compaction: 0
2021/04/18-23:37:07.732281 700005fb1000 Options.create_missing_column_families: 0
2021/04/18-23:37:07.732297 700005fb1000 Options.db_log_dir:
2021/04/18-23:37:07.732315 700005fb1000 Options.wal_dir: /Users/myeyesareblind/Library/Developer/CoreSimulator/Devices/FAC7D101-8BCC-4702-8867-DCDF31339031/data/Containers/Data/Application/99698604-5CEA-4D34-86FB-D6447A4779C4/Library/Databases/events_main.db
2021/04/18-23:37:07.732331 700005fb1000 Options.table_cache_numshardbits: 6
2021/04/18-23:37:07.732346 700005fb1000 Options.max_subcompactions: 1
2021/04/18-23:37:07.732364 700005fb1000 Options.max_background_flushes: -1
2021/04/18-23:37:07.732382 700005fb1000 Options.WAL_ttl_seconds: 0
2021/04/18-23:37:07.732398 700005fb1000 Options.WAL_size_limit_MB: 0
2021/04/18-23:37:07.732413 700005fb1000 Options.manifest_preallocation_size: 4194304
2021/04/18-23:37:07.732429 700005fb1000 Options.is_fd_close_on_exec: 1
2021/04/18-23:37:07.732445 700005fb1000 Options.advise_random_on_open: 1
2021/04/18-23:37:07.732460 700005fb1000 Options.db_write_buffer_size: 0
2021/04/18-23:37:07.732476 700005fb1000 Options.write_buffer_manager: 0x6000001dde90
2021/04/18-23:37:07.732496 700005fb1000 Options.access_hint_on_compaction_start: 1
2021/04/18-23:37:07.732516 700005fb1000 Options.new_table_reader_for_compaction_inputs: 0
2021/04/18-23:37:07.732533 700005fb1000 Options.random_access_max_buffer_size: 1048576
2021/04/18-23:37:07.732575 700005fb1000 Options.use_adaptive_mutex: 0
2021/04/18-23:37:07.732593 700005fb1000 Options.rate_limiter: 0x0
2021/04/18-23:37:07.732609 700005fb1000 Options.sst_file_manager.rate_bytes_per_sec: 0
2021/04/18-23:37:07.732625 700005fb1000 Options.wal_recovery_mode: 2
2021/04/18-23:37:07.732641 700005fb1000 Options.enable_thread_tracking: 0
2021/04/18-23:37:07.732656 700005fb1000 Options.enable_pipelined_write: 0
2021/04/18-23:37:07.732671 700005fb1000 Options.allow_concurrent_memtable_write: 1
2021/04/18-23:37:07.732692 700005fb1000 Options.enable_write_thread_adaptive_yield: 1
2021/04/18-23:37:07.732712 700005fb1000 Options.write_thread_max_yield_usec: 100
2021/04/18-23:37:07.732734 700005fb1000 Options.write_thread_slow_yield_usec: 3
2021/04/18-23:37:07.732750 700005fb1000 Options.row_cache: None
2021/04/18-23:37:07.732773 700005fb1000 Options.avoid_flush_during_recovery: 0
2021/04/18-23:37:07.732791 700005fb1000 Options.allow_ingest_behind: 0
2021/04/18-23:37:07.732807 700005fb1000 Options.preserve_deletes: 0
2021/04/18-23:37:07.732823 700005fb1000 Options.two_write_queues: 0
2021/04/18-23:37:07.732841 700005fb1000 Options.manual_wal_flush: 0
2021/04/18-23:37:07.732858 700005fb1000 Options.atomic_flush: 0
2021/04/18-23:37:07.732874 700005fb1000 Options.avoid_unnecessary_blocking_io: 0
2021/04/18-23:37:07.732889 700005fb1000 Options.max_background_jobs: 2
2021/04/18-23:37:07.732905 700005fb1000 Options.max_background_compactions: -1
2021/04/18-23:37:07.732921 700005fb1000 Options.avoid_flush_during_shutdown: 0
2021/04/18-23:37:07.732936 700005fb1000 Options.writable_file_max_buffer_size: 1048576
2021/04/18-23:37:07.732955 700005fb1000 Options.delayed_write_rate : 16777216
2021/04/18-23:37:07.732974 700005fb1000 Options.max_total_wal_size: 0
2021/04/18-23:37:07.732992 700005fb1000 Options.delete_obsolete_files_period_micros: 21600000000
2021/04/18-23:37:07.733007 700005fb1000 Options.stats_dump_period_sec: 600
2021/04/18-23:37:07.733023 700005fb1000 Options.stats_persist_period_sec: 600
2021/04/18-23:37:07.733038 700005fb1000 Options.stats_history_buffer_size: 1048576
2021/04/18-23:37:07.733056 700005fb1000 Options.max_open_files: -1
2021/04/18-23:37:07.733072 700005fb1000 Options.bytes_per_sync: 0
2021/04/18-23:37:07.733087 700005fb1000 Options.wal_bytes_per_sync: 0
2021/04/18-23:37:07.733109 700005fb1000 Options.strict_bytes_per_sync: 0
2021/04/18-23:37:07.733127 700005fb1000 Options.compaction_readahead_size: 0
2021/04/18-23:37:07.733143 700005fb1000 Compression algorithms supported:
2021/04/18-23:37:07.733163 700005fb1000 kZSTD supported: 0
2021/04/18-23:37:07.733181 700005fb1000 kZlibCompression supported: 0
2021/04/18-23:37:07.733197 700005fb1000 kXpressCompression supported: 0
2021/04/18-23:37:07.733213 700005fb1000 kSnappyCompression supported: 0
2021/04/18-23:37:07.733229 700005fb1000 kZSTDNotFinalCompression supported: 0
2021/04/18-23:37:07.733247 700005fb1000 kLZ4HCCompression supported: 1
2021/04/18-23:37:07.733264 700005fb1000 kLZ4Compression supported: 1
2021/04/18-23:37:07.733280 700005fb1000 kBZip2Compression supported: 0
2021/04/18-23:37:07.733299 700005fb1000 Fast CRC32 supported: Not supported on x86
2021/04/18-23:37:07.733614 700005fb1000 [db/version_set.cc:4053] Recovering from manifest file: /Users/myeyesareblind/Library/Developer/CoreSimulator/Devices/FAC7D101-8BCC-4702-8867-DCDF31339031/data/Containers/Data/Application/99698604-5CEA-4D34-86FB-D6447A4779C4/Library/Databases/events_main.db/MANIFEST-000006
2021/04/18-23:37:07.733749 700005fb1000 [db/column_family.cc:482] --------------- Options for column family [default]:
2021/04/18-23:37:07.733773 700005fb1000 Options.comparator: leveldb.BytewiseComparator
2021/04/18-23:37:07.733792 700005fb1000 Options.merge_operator: None
2021/04/18-23:37:07.733808 700005fb1000 Options.compaction_filter: None
2021/04/18-23:37:07.733824 700005fb1000 Options.compaction_filter_factory: None
2021/04/18-23:37:07.733840 700005fb1000 Options.memtable_factory: SkipListFactory
2021/04/18-23:37:07.733856 700005fb1000 Options.table_factory: BlockBasedTable
2021/04/18-23:37:07.733902 700005fb1000 table_factory options: flush_block_policy_factory: FlushBlockBySizePolicyFactory (0x600000dacbf0)
cache_index_and_filter_blocks: 0
cache_index_and_filter_blocks_with_high_priority: 0
pin_l0_filter_and_index_blocks_in_cache: 0
pin_top_level_index_and_filter: 1
index_type: 0
data_block_index_type: 0
index_shortening: 1
data_block_hash_table_util_ratio: 0.750000
hash_index_allow_collision: 1
checksum: 1
no_block_cache: 0
block_cache: 0x600003ea0ab8
block_cache_name: LRUCache
block_cache_options:
capacity : 8388608
num_shard_bits : 4
strict_capacity_limit : 0
memory_allocator : None
high_pri_pool_ratio: 0.000
block_cache_compressed: 0x0
persistent_cache: 0x0
block_size: 4096
block_size_deviation: 10
block_restart_interval: 16
index_block_restart_interval: 1
metadata_block_size: 4096
partition_filters: 0
use_delta_encoding: 1
filter_policy: nullptr
whole_key_filtering: 1
verify_compression: 0
read_amp_bytes_per_bit: 0
format_version: 2
enable_index_compression: 1
block_align: 0
2021/04/18-23:37:07.733921 700005fb1000 Options.write_buffer_size: 1048576
2021/04/18-23:37:07.733938 700005fb1000 Options.max_write_buffer_number: 2
2021/04/18-23:37:07.733954 700005fb1000 Options.compression: LZ4
2021/04/18-23:37:07.733970 700005fb1000 Options.bottommost_compression: Disabled
2021/04/18-23:37:07.733987 700005fb1000 Options.prefix_extractor: nullptr
2021/04/18-23:37:07.734006 700005fb1000 Options.memtable_insert_with_hint_prefix_extractor: nullptr
2021/04/18-23:37:07.734021 700005fb1000 Options.num_levels: 7
2021/04/18-23:37:07.734037 700005fb1000 Options.min_write_buffer_number_to_merge: 1
2021/04/18-23:37:07.734052 700005fb1000 Options.max_write_buffer_number_to_maintain: 0
2021/04/18-23:37:07.734070 700005fb1000 Options.bottommost_compression_opts.window_bits: -14
2021/04/18-23:37:07.734087 700005fb1000 Options.bottommost_compression_opts.level: 32767
2021/04/18-23:37:07.734103 700005fb1000 Options.bottommost_compression_opts.strategy: 0
2021/04/18-23:37:07.734117 700005fb1000 Options.bottommost_compression_opts.max_dict_bytes: 0
2021/04/18-23:37:07.734132 700005fb1000 Options.bottommost_compression_opts.zstd_max_train_bytes: 0
2021/04/18-23:37:07.734148 700005fb1000 Options.bottommost_compression_opts.enabled: false
2021/04/18-23:37:07.734163 700005fb1000 Options.compression_opts.window_bits: -14
2021/04/18-23:37:07.734178 700005fb1000 Options.compression_opts.level: 32767
2021/04/18-23:37:07.734193 700005fb1000 Options.compression_opts.strategy: 0
2021/04/18-23:37:07.734208 700005fb1000 Options.compression_opts.max_dict_bytes: 0
2021/04/18-23:37:07.734223 700005fb1000 Options.compression_opts.zstd_max_train_bytes: 0
2021/04/18-23:37:07.734239 700005fb1000 Options.compression_opts.enabled: false
2021/04/18-23:37:07.734254 700005fb1000 Options.level0_file_num_compaction_trigger: 4
2021/04/18-23:37:07.734268 700005fb1000 Options.level0_slowdown_writes_trigger: 20
2021/04/18-23:37:07.734283 700005fb1000 Options.level0_stop_writes_trigger: 36
2021/04/18-23:37:07.734298 700005fb1000 Options.target_file_size_base: 67108864
2021/04/18-23:37:07.734340 700005fb1000 Options.target_file_size_multiplier: 1
2021/04/18-23:37:07.734360 700005fb1000 Options.max_bytes_for_level_base: 268435456
2021/04/18-23:37:07.734377 700005fb1000 Options.snap_refresh_nanos: 0
2021/04/18-23:37:07.734393 700005fb1000 Options.level_compaction_dynamic_level_bytes: 0
2021/04/18-23:37:07.734409 700005fb1000 Options.max_bytes_for_level_multiplier: 10.000000
2021/04/18-23:37:07.734425 700005fb1000 Options.max_bytes_for_level_multiplier_addtl[0]: 1
2021/04/18-23:37:07.734441 700005fb1000 Options.max_bytes_for_level_multiplier_addtl[1]: 1
2021/04/18-23:37:07.734457 700005fb1000 Options.max_bytes_for_level_multiplier_addtl[2]: 1
2021/04/18-23:37:07.734472 700005fb1000 Options.max_bytes_for_level_multiplier_addtl[3]: 1
2021/04/18-23:37:07.734489 700005fb1000 Options.max_bytes_for_level_multiplier_addtl[4]: 1
2021/04/18-23:37:07.734505 700005fb1000 Options.max_bytes_for_level_multiplier_addtl[5]: 1
2021/04/18-23:37:07.734526 700005fb1000 Options.max_bytes_for_level_multiplier_addtl[6]: 1
2021/04/18-23:37:07.734543 700005fb1000 Options.max_sequential_skip_in_iterations: 8
2021/04/18-23:37:07.734559 700005fb1000 Options.max_compaction_bytes: 1677721600
2021/04/18-23:37:07.734575 700005fb1000 Options.arena_block_size: 131072
2021/04/18-23:37:07.734591 700005fb1000 Options.soft_pending_compaction_bytes_limit: 68719476736
2021/04/18-23:37:07.734610 700005fb1000 Options.hard_pending_compaction_bytes_limit: 274877906944
2021/04/18-23:37:07.734627 700005fb1000 Options.rate_limit_delay_max_milliseconds: 100
2021/04/18-23:37:07.734643 700005fb1000 Options.disable_auto_compactions: 0
2021/04/18-23:37:07.734676 700005fb1000 Options.compaction_style: kCompactionStyleLevel
2021/04/18-23:37:07.734695 700005fb1000 Options.compaction_pri: kMinOverlappingRatio
2021/04/18-23:37:07.734711 700005fb1000 Options.compaction_options_universal.size_ratio: 1
2021/04/18-23:37:07.734727 700005fb1000 Options.compaction_options_universal.min_merge_width: 2
2021/04/18-23:37:07.734747 700005fb1000 Options.compaction_options_universal.max_merge_width: 4294967295
2021/04/18-23:37:07.734766 700005fb1000 Options.compaction_options_universal.max_size_amplification_percent: 200
2021/04/18-23:37:07.734782 700005fb1000 Options.compaction_options_universal.compression_size_percent: -1
2021/04/18-23:37:07.734799 700005fb1000 Options.compaction_options_universal.stop_style: kCompactionStopStyleTotalSize
2021/04/18-23:37:07.734815 700005fb1000 Options.compaction_options_fifo.max_table_files_size: 1073741824
2021/04/18-23:37:07.734830 700005fb1000 Options.compaction_options_fifo.allow_compaction: 0
2021/04/18-23:37:07.734846 700005fb1000 Options.table_properties_collectors:
2021/04/18-23:37:07.734861 700005fb1000 Options.inplace_update_support: 0
2021/04/18-23:37:07.734885 700005fb1000 Options.inplace_update_num_locks: 10000
2021/04/18-23:37:07.734903 700005fb1000 Options.memtable_prefix_bloom_size_ratio: 0.000000
2021/04/18-23:37:07.734919 700005fb1000 Options.memtable_whole_key_filtering: 0
2021/04/18-23:37:07.734934 700005fb1000 Options.memtable_huge_page_size: 0
2021/04/18-23:37:07.734951 700005fb1000 Options.bloom_locality: 0
2021/04/18-23:37:07.734970 700005fb1000 Options.max_successive_merges: 0
2021/04/18-23:37:07.734989 700005fb1000 Options.optimize_filters_for_hits: 0
2021/04/18-23:37:07.735006 700005fb1000 Options.paranoid_file_checks: 0
2021/04/18-23:37:07.735022 700005fb1000 Options.force_consistency_checks: 0
2021/04/18-23:37:07.735038 700005fb1000 Options.report_bg_io_stats: 0
2021/04/18-23:37:07.735053 700005fb1000 Options.ttl: 0
2021/04/18-23:37:07.735070 700005fb1000 Options.periodic_compaction_seconds: 0

Do you enable the sync option in the write? regardless I think you should use kAbsoluteConsistency which stop the recovery at the first error and not point in time which is try to salvage ad much data as possible

1 Like

Reply from Siying from the Meta team:
The first step of troubleshooting might be:
Look at RocksDB logs (by default LOG and LOG*) and see whether it replays all the log files. If not try to figure out a reason.
To analyze the last log file replayed, use “ldb dump_wal” and see what log entries can be retrieved. It will tell you offset of each log entry and some metadata. You would know whether RocksDB is able to read the log file to the end.
Unfortunately, the tool doesn’t tell why RocksDB can only read a part of the log file. Let me know if you need help with that and we’ll see what’s the best we can do.
Regarding the issue itself, do you have a power restart? If not, we aren’t supposed to loss any data no matter what WAL recovery mode you use (unless you use some uncommon command like manual_wal_flush=true). So definitely something went wrong.

Same answer (power restart, without sync after each write, may create corrupted data in the log). Sync on every write is required to ensure consistency . kAbsoluteConsistency is required so no data will be read after the first error…