kiwix-build icon indicating copy to clipboard operation
kiwix-build copied to clipboard

setup.py broken and README.md outdated

Open cm8 opened this issue 6 years ago • 7 comments

Since presumably a commit from March 2018 that renamed/refactored kiwix-build.py to __init__.py and introduced __main__.py, the documentation in README.md does not reflect current state of the package anymore.

In particular, running pip3 install . fails to create a console script.

I was able to work around the problem doing python3 -m kiwixbuild instead of running kiwix-build after changing into the cloned directory. However, I'd think that a proper fix is to make setup.py create the proper console script.

For build machines with less than 16gb of RAM SKIP_BIG_MEMORY_TEST=1 python3 -m kiwixbuild is advisable to avoid swapping conditions and finish compilation in a sane amount of time -- maybe this can be added to README.md when/if it is reworked. If the test is run with swap to disk expect an added 10-20 minutes of compilation time, the comment below has a howto.

cm8 avatar Jun 15 '18 16:06 cm8

Here is how the big memory test may be run on low memory build hosts (12, 8 gig and maybe less):

  • do not run tests in parallel, if swap is likely to be used i.e. use MESON_TESTTHREADS=1 python3 -m kiwixbuild (might not be necessary if swapping to SSD)
  • increase the test timeout in libzim/test/meson.build (included in patch below)
  • patch libzim/test/cluster.cpp to use less memory
diff --git a/test/cluster.cpp b/test/cluster.cpp
index 3db4c36..d6b3ae9 100644
--- a/test/cluster.cpp
+++ b/test/cluster.cpp
@@ -220,8 +220,9 @@ TEST(CluterTest, read_write_extended_cluster)
   std::string blob1("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
   std::string blob2("abcdefghijklmnopqrstuvwxyz");
   zim::size_type bigger_than_4g = 1024LL*1024LL*1024LL*4LL+1024LL;
+  zim::size_type str_content_size = bigger_than_4g + 1024LL;
+  char* str_content = nullptr;
 
-  std::string str_content;
   {
     std::stringstream stream;
 
@@ -252,39 +253,20 @@ TEST(CluterTest, read_write_extended_cluster)
       delete[] blob3;
       // MEM = 4GiB
 
-      // This is a nasty hack.
-      // We preallocate the stream internal buffer to handle the content of our
-      // cluster. Without this, stringstream will reallocate its internal
-      // buffer as needed. As it tries to preallocate more memory than needed,
-      // it ends to allocate 12GiB. Plus our 4GiB cluster, we end to need 16GiB
-      // of memory.
-      // By allocating a buffer bigger than what we need (but not to big),
-      // we avoid a too big allocation and limit our global need to 12GiB.
-      // We would be nice to avoid the creation of the tmp string and limit
-      // ourselves to a global allocation of 8GiB but I don't know how to do
-      // it without creating a custom streambuf :/
-      {
-        std::string tmp;
-        tmp.reserve(bigger_than_4g + 1024LL);
-        // MEM = 8GiB
-        tmp.resize(bigger_than_4g + 1024LL);
-        stream.str(tmp);
-        // MEM = 12GiB
-        stream.str("");
-      }
-      // MEM = 8GiB (tmp is deleted)
+      str_content = new char[str_content_size];
+      stream.rdbuf()->pubsetbuf(str_content, str_content_size);
       cluster.dump(stream);
       // MEM = 8GiB
+
+      ASSERT_GT(stream.tellp(), 0);
+      ASSERT_GT(zim::size_type(stream.tellp()), bigger_than_4g);
+      str_content_size = zim::size_type(stream.tellp());
     }
     // MEM = 4GiB (cluster is deleted)
-
-    str_content = stream.str();
-    // MEM = 8GiB
   }
-  // MEM = 4 GiB (stream is deleted)
-  auto size = str_content.size();
+  auto size = str_content_size;
   auto buffer = std::shared_ptr<zim::Buffer>(
-    new zim::MemoryBuffer<false>(str_content.data(), zim::zsize_t(size)));
+    new zim::MemoryBuffer<true>(str_content, zim::zsize_t(size)));
   auto reader = std::shared_ptr<zim::Reader>(new zim::BufferReader(buffer));
   zim::CompressionType comp;
   bool extended;
diff --git a/test/meson.build b/test/meson.build
index 4887b05..de409a0 100644
--- a/test/meson.build
+++ b/test/meson.build
@@ -17,7 +17,7 @@ foreach test_name : tests
                           link_with : libzim,
                           dependencies : deps + [gtest_dep],
                           build_rpath : '$ORIGIN')
-    test(test_name, test_exe, timeout : 60)
+    test(test_name, test_exe, timeout : 2000)
 endforeach
 
 if get_option('default_library') != 'static'

cm8 avatar Jun 16 '18 05:06 cm8

@cm8 Would you be able to propose a PR for these problems?

kelson42 avatar Jun 26 '18 05:06 kelson42

Yes please @cm8, the patch is not related to the issue but it is very valuable, especially the part making libzim/test/cluster.cpp to use less memory.

mgautierfr avatar Jun 26 '18 08:06 mgautierfr

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Dec 14 '19 15:12 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar May 04 '20 06:05 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Sep 02 '20 12:09 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.

stale[bot] avatar Jun 02 '21 17:06 stale[bot]