ZIM file format - How do I read the directory entries to the end?

I am reading ZIM files (binary) and have processed the header and the MIME type list. Everything fits so far.
The next step would be to read the directory entries, but here I have trouble understanding the format.

According to “ZIM file format - openZIM”, the entries of the URL Pointer List are stored at the location urlPtrPos with 8 bytes each. But how do I recognize the end of the list?

There doesn’t seem to be a stop character or byte like in the MIME type list (\0\0).

Hi,

I have looked into this issue and found this information which might be helpful for you.

The URL Pointer List has no terminator. Its end is determined by the number of entries, which the file header stores as articleCount (a 4-byte little-endian integer at offset 24). Each URL pointer is a fixed 8 bytes, so the list occupies articleCount * 8 bytes starting at urlPtrPos. The header does not store the list length itself; you derive it from the entry count.

Python Example:

# Sketch: read the URL Pointer List (header offsets as given in the openZIM spec)
import struct

def read_url_pointers(path):
    with open(path, "rb") as f:
        header = f.read(80)
        # articleCount: 4-byte little-endian integer at offset 24
        (article_count,) = struct.unpack_from("<I", header, 24)
        # urlPtrPos: 8-byte little-endian integer at offset 32
        (url_ptr_pos,) = struct.unpack_from("<Q", header, 32)

        # The list holds exactly article_count entries of 8 bytes each
        f.seek(url_ptr_pos)
        data = f.read(article_count * 8)
        if len(data) != article_count * 8:
            raise ValueError("URL Pointer List is truncated")

        # Each pointer is the file offset of one directory entry
        return list(struct.unpack(f"<{article_count}Q", data))
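
Once you have the URL pointers, each one is the file offset of a single directory entry, so reading "to the end" just means following every pointer. Below is a minimal sketch of parsing one entry from a bytes buffer. The field layout reflects my reading of the openZIM spec (2-byte mimetype, 1-byte parameter length, 1-byte namespace, 4-byte revision, then either cluster and blob numbers or a redirect index, followed by a zero-terminated URL and title), so please check it against the spec version of your file. The names parse_dirent and read_cstring are just illustrative helpers, not part of any library.

```python
import struct

REDIRECT = 0xFFFF                     # mimetype value that marks a redirect entry
NO_CONTENT = (0xFFFE, 0xFFFD)         # linktarget / deleted entries (deprecated)

def read_cstring(buf, pos):
    """Return (text, position just past the terminating zero byte)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1

def parse_dirent(buf, offset):
    """Parse one directory entry that starts at `offset` in `buf`."""
    # Common 8-byte prefix: mimetype, parameter length, namespace, revision
    mimetype, _param_len, namespace, revision = struct.unpack_from("<HBcI", buf, offset)
    entry = {
        "mimetype": mimetype,
        "namespace": namespace.decode("ascii"),
        "revision": revision,
    }
    pos = offset + 8
    if mimetype == REDIRECT:
        # Redirects store the index of the target entry instead of a blob
        (entry["redirect_index"],) = struct.unpack_from("<I", buf, pos)
        pos += 4
    elif mimetype not in NO_CONTENT:
        # Regular entries point to a blob inside a cluster
        entry["cluster"], entry["blob"] = struct.unpack_from("<II", buf, pos)
        pos += 8
    entry["url"], pos = read_cstring(buf, pos)
    entry["title"], pos = read_cstring(buf, pos)
    return entry

# Tiny hand-built entry to show the call; in practice `buf` is the file
# content and `offset` is one of the URL pointers.
sample = struct.pack("<HBcIII", 3, 0, b"A", 0, 5, 7) + b"index.html\x00Home\x00"
print(parse_dirent(sample, 0))
```

As far as I can tell from the spec, an empty title means the title is the same as the URL, so you may want to fall back to the URL in that case.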
