Garming Sam via samba-technical
2018-02-21 22:45:08 UTC
Hi,
This is our current set of patches for implementing an LMDB-based
backend for LDB. The work is based on a prototype I wrote around this
time last year inspired by Jakub's efforts. In saying that, the approach
I took was completely different. The idea was to refactor ldb_tdb to be
agnostic about which database backend was being used. The advantage has
been that only a minimal amount of code was required to implement a
functional 64-bit database backend. Many of the performance optimizations
made for ldb_tdb can simply be reused, but conversely, for now we have
deferred re-thinking the overall architecture, e.g. consolidating the
partitions into a single file using LMDB sub-databases.
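
To illustrate the refactoring idea (a generic layer that does not care which key-value store sits underneath), here is a minimal sketch in Python. It is not the patch set's code: the class and method names (KVBackend, HashBackend, SortedBackend, GenericLayer) are hypothetical, and both backends are in-memory stand-ins rather than real TDB or LMDB bindings.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Optional, Tuple


class KVBackend(ABC):
    """Hypothetical minimal interface that the generic layer codes against."""

    @abstractmethod
    def store(self, key: bytes, value: bytes) -> None: ...

    @abstractmethod
    def fetch(self, key: bytes) -> Optional[bytes]: ...

    @abstractmethod
    def iterate(self) -> Iterator[Tuple[bytes, bytes]]: ...


class HashBackend(KVBackend):
    """In-memory stand-in for a hash-style store (iterates in insertion order)."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def store(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def fetch(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def iterate(self) -> Iterator[Tuple[bytes, bytes]]:
        return iter(list(self._data.items()))


class SortedBackend(HashBackend):
    """In-memory stand-in for an ordered store (iterates in key order)."""

    def iterate(self) -> Iterator[Tuple[bytes, bytes]]:
        return iter(sorted(self._data.items()))


class GenericLayer:
    """Record logic written once, independent of the backend in use."""

    def __init__(self, backend: KVBackend) -> None:
        self._backend = backend

    @staticmethod
    def _key(dn: str) -> bytes:
        # Illustrative key scheme only: case-folded DN with a prefix.
        return b"DN=" + dn.upper().encode("utf-8")

    def add(self, dn: str, record: bytes) -> None:
        self._backend.store(self._key(dn), record)

    def search(self, dn: str) -> Optional[bytes]:
        return self._backend.fetch(self._key(dn))
```

Swapping HashBackend for SortedBackend requires no change to GenericLayer, which is the property the refactoring is after.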
Currently this backend must have keys restricted to less than 511 bytes,
which is fine for our new GUID indexing scheme, but can run into issues
with our indexes. Gary is currently working on using truncated keys to
bypass this limit.
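
One possible way to live with a key length limit is sketched below, purely as an illustration and not necessarily the scheme Gary is working on. Long keys are cut to the limit, and the full key is kept alongside the value so that keys sharing a truncated prefix can still be told apart. TruncatingStore and MAX_KEY_LEN are hypothetical names, the dict is a stand-in for the backend, and 510 is simply the largest length that satisfies "less than 511 bytes".

```python
from typing import Dict, Optional

# Assumption: largest key length allowed by the "less than 511 bytes" limit.
MAX_KEY_LEN = 510


class TruncatingStore:
    """Stores entries under truncated keys, keeping full keys for disambiguation."""

    def __init__(self, max_key_len: int = MAX_KEY_LEN) -> None:
        self._max = max_key_len
        # Stand-in for the backend: truncated key -> {full key: value}.
        self._db: Dict[bytes, Dict[bytes, bytes]] = {}

    def _short(self, key: bytes) -> bytes:
        return key[: self._max]

    def put(self, key: bytes, value: bytes) -> None:
        # Colliding long keys land in the same bucket under one short key.
        self._db.setdefault(self._short(key), {})[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        # Look up the bucket by short key, then match on the full key.
        return self._db.get(self._short(key), {}).get(key)

    def backend_keys(self):
        """Keys as the backend would actually see them."""
        return list(self._db.keys())
```

The cost of this approach is an extra comparison against the full key on every lookup that hits a truncated entry.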
This current set of patches passes autobuild while still running with
the TDB backend. We have patches to pass the testsuite using the LMDB
backend, but a few of them still need tidying up and have been omitted
here for now.
Performance numbers at this point seem a bit tricky to obtain. Our
existing perf testing infrastructure relies on the test environment;
however, LMDB appears to run noticeably slower under cwrap due to some
calls being intercepted. Basic testing indicates better baseline figures
and better concurrency (reads no longer block writes, and writes
effectively only block reads during commit - not prepare commit), but
there needs to be a lot more testing to properly understand the
performance characteristics.
Somewhat related, I am currently investigating running our tests without
socket wrapper by using network namespaces. There are still areas where
being able to run socket wrapper is definitely useful, but performance
testing is definitely not one of them.
Noteworthy fixes required or bugs found:
- metadata.tdb needs to be committed last during a transaction
(getncchanges tests were causing replication errors as reads occurred
between the commit of metadata.tdb and the rest of the partitions).
- schema loading during a read lock needs a cached value (as writes can
happen during reads, long running read-locked operations could read new
metadata.tdb values).
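
The two fixes above can be sketched roughly as follows. This is a toy model, not the actual ldb code: Partition, commit_all, ReadLockedView and the "schema_seq" key are hypothetical names chosen for illustration. The first part commits the data partitions before the metadata, so a reader that sees new metadata also sees the new partition contents. The second part captures the metadata value once when the read lock is taken and keeps using that cached value.

```python
from typing import Dict, List


class Partition:
    """Toy partition with pending (uncommitted) and committed state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.committed: Dict[str, int] = {}
        self.pending: Dict[str, int] = {}

    def write(self, key: str, value: int) -> None:
        self.pending[key] = value

    def commit(self, log: List[str]) -> None:
        self.committed.update(self.pending)
        self.pending.clear()
        log.append(self.name)  # record commit order for inspection


def commit_all(partitions: List[Partition], metadata: Partition,
               log: List[str]) -> None:
    """Commit every data partition first and the metadata store last."""
    for partition in partitions:
        partition.commit(log)
    metadata.commit(log)


class ReadLockedView:
    """Caches the metadata value at read-lock time and never re-reads it."""

    def __init__(self, metadata: Partition) -> None:
        # "schema_seq" is a hypothetical key used only for this sketch.
        self._cached_seq = metadata.committed.get("schema_seq", 0)

    def schema_seq(self) -> int:
        return self._cached_seq
```

A long-running read that holds a ReadLockedView keeps seeing the value it started with, even if a write transaction commits in the meantime.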
Any thoughts or comments would be much appreciated. There is definitely
more to come in this space and using LMDB allows us to effectively
implement a number of further improvements, such as indexing for >=, which
would make replication much faster.
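
As a rough illustration of why an ordered store helps with >= searches: if index entries are kept sorted, a >= lookup becomes one seek to the first matching entry followed by a forward scan, instead of a walk over every record. The sketch below uses Python's bisect module over an in-memory list as a stand-in for an ordered backend. OrderedIndex and the integer "sequence number" values are hypothetical and are not taken from the patches.

```python
import bisect
from typing import List, Tuple


class OrderedIndex:
    """Sorted (value, dn) pairs supporting efficient >= lookups."""

    def __init__(self) -> None:
        self._entries: List[Tuple[int, str]] = []

    def add(self, value: int, dn: str) -> None:
        # Keep entries sorted on insert, as an ordered store would.
        bisect.insort(self._entries, (value, dn))

    def search_ge(self, threshold: int) -> List[str]:
        # (threshold,) sorts before any (threshold, dn), so this finds the
        # first entry whose value is >= threshold.
        start = bisect.bisect_left(self._entries, (threshold,))
        return [dn for _, dn in self._entries[start:]]
```

For example, after adding entries with values 10, 20 and 30, search_ge(20) returns only the entries for 20 and 30 without touching the entry for 10.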
Cheers,
Garming