Program hangs forever when adding two items at same path #300

@benoit74

Given the following program:

from zimscraperlib.zim import Creator
from pathlib import Path
from zimscraperlib.zim import metadata

creator = Creator(Path("tests.zim"), "index.html").config_metadata(
    std_metadata=metadata.DEFAULT_DEV_ZIM_METADATA
)

creator.start()

creator.add_item_for("index.html", "Main Page", content="Main", is_front=True)

for i in range(10):
    creator.add_item_for(f"content{i}", content="P" * 1000000)

creator.add_item_for("conflict", content="Foo")
creator.add_item_for("conflict", content="Bar")

I get the following logs:

Traceback (most recent call last):
  File "/home/benoit/Tools/zimcreateconflict.py", line 17, in <module>
    creator.add_item_for("conflict", content="Bar")
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<@beartype(zimscraperlib.zim.creator.Creator.add_item_for) at 0x7b7c6f51f5e0>", line 248, in add_item_for
  File "/home/benoit/Tools/venv3.14/lib/python3.14/site-packages/zimscraperlib/zim/creator.py", line 365, in add_item_for
    self.add_item(
    ~~~~~~~~~~~~~^
        StaticItem(
        ^^^^^^^^^^^
    ...<10 lines>...
        duplicate_ok=duplicate_ok,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "<@beartype(zimscraperlib.zim.creator.Creator.add_item) at 0x7b7c6f51f740>", line 83, in add_item
  File "/home/benoit/Tools/venv3.14/lib/python3.14/site-packages/zimscraperlib/zim/creator.py", line 410, in add_item
    raise exc
  File "/home/benoit/Tools/venv3.14/lib/python3.14/site-packages/zimscraperlib/zim/creator.py", line 407, in add_item
    super().add_item(item)
    ~~~~~~~~~~~~~~~~^^^^^^
  File "libzim/libzim.pyx", line 517, in libzim._Creator.add_item
RuntimeError: Impossible to add C/conflict
  dirent's title to add is : conflict
  existing dirent's title is : conflict

This error is expected: a scraper should not be doing this kind of silly thing.
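For illustration, here is a minimal sketch of how a scraper could avoid adding the same path twice in the first place. `PathGuard` and `claim` are hypothetical names invented for this example; they are not part of zimscraperlib or libzim, and the comment marks where the real `add_item_for` call would go.

```python
class PathGuard:
    """Remember which paths were already added (hypothetical helper, not a zimscraperlib API)."""

    def __init__(self):
        self._seen = set()

    def claim(self, path: str) -> bool:
        """Return True the first time `path` is claimed, False on any repeat."""
        if path in self._seen:
            return False
        self._seen.add(path)
        return True


guard = PathGuard()
added = []
for path, content in [("conflict", "Foo"), ("conflict", "Bar")]:
    if guard.claim(path):
        # In a real scraper, creator.add_item_for(path, content=content) would be called here.
        added.append((path, content))
    else:
        print(f"skipping duplicate path: {path}")
```

This only sidesteps the duplicate on the scraper side; it says nothing about why the process hangs once the error has been raised.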

What is abnormal is that the program then hangs forever, waiting for something to complete. This should not happen: the program should stop almost immediately.

I don't know if this is a pure libzim problem or something related to python-libzim (or python-scraperlib), so I'm asking for help: can you confirm whether you manage to reproduce the problem in pure C++?

What seems to be key to reproducing the issue is to fill at least one cluster; the more clusters we fill, the harder it is to stop the program. With only range(10) in the code above, Python still responds to Ctrl-C; with range(100), I had to kill the program. Without this initial content filling a cluster, everything works as expected (the program terminates).
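To check for the hang without killing the program by hand, the reproduction script can be run in a child process with a timeout. The sketch below is only an assumption about how one might automate this: `exits_within` is a hypothetical helper, and the two `-c` commands are stand-ins for the real script (one raises immediately, one sleeps to simulate a hang).

```python
import subprocess
import sys


def exits_within(cmd: list[str], timeout: float) -> bool:
    """Run `cmd`; return True if it exits within `timeout` seconds.

    On timeout, subprocess.run kills the child and raises TimeoutExpired,
    which is reported here as False.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # a non-zero exit code (e.g. an uncaught exception) still counts as "exited"
        )
    except subprocess.TimeoutExpired:
        return False
    return True


# Stand-ins for the real reproduction script (assumptions for this example).
quick = [sys.executable, "-c", "raise RuntimeError('boom')"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]

print(exits_within(quick, timeout=10))
print(exits_within(stuck, timeout=1))
```

Replacing the stand-in commands with `[sys.executable, "zimcreateconflict.py"]` and trying different `range(...)` values would show at which point the script stops terminating on its own.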

This has been discovered in openzim/maps#110
