Skip to main content

kafka: Topic & Partition

Step 1: understand Topic & Partition

1. one topic can be distributed to one or multiple brokers
2. each broker hold different partitions for topic
3. one broker hold multiple partitions for one or more topics
4. One partition is impl by one physical folder with multiple files (segment).


Step 2: check the physical file folder on /tmp/kafka-logs  (the default folder for kafka logs files)



1. the folder is named as {topic}-{partition no}
2. the same {filename}.index, .log and .timeindex construct one segment. 
3. 00000000000000000000 here = the offset of this segment
4. how to locate the message:
    1. decide the partition. 
    2. decide the segment by offset number of message.
    3. decide the offset inside of segment. 
5. for kafka log, it is clear that the SLM mechanism is used which provides the max write/read turnover. 

Step 3: Customize the partition no based on message





Comments

Popular posts from this blog

How to fix "ValueError when trying to compile python module with VC Express"

When I tried to compile the python, I always get compile issue as following: ------------ ... File "C:\Python26\lib\ distutils\msvc9compiler.py ", line 358, in initialize vc_env = query_vcvarsall(VERSION, plat_spec) File "C:\Python26\lib\ distutils\msvc9compiler.py ", line 274, in query_vcvarsall raise ValueError(str(list(result.keys()))) ValueError: [u'path'] --------------------- Python community discussed a lot but no solution: http://bugs.python.org/issue7511 The root cause is because the latest visual studio change the *.bat file a lot especially on 64bit env. The python 2.7 didn't update the path accordingly. Based on the assumption above, the following solution worked for me. To install Visual Studio 2008 Express Edition with all required components: 1. Install Microsoft Visual Studio 2008 Express Edition. The main Visual Studio 2008 Express installer is available from (the C++ installer name is vcsetup.exe): https://ww...

How to convert the ResultSet to Stream

Java 8 provided the Stream family and easy operation of it. The way of pipeline usage made the code clear and smart. However, ResultSet is still go with very legacy way to process. Per actual ResultSet usage, it is really helpful if converted as Stream. Here is the simple usage of above: StreamUtils.uncheckedConsumer is required to convert the the SQLException to runtimeException to make the Lamda clear.

How to run odoo(openerp8) in IDE from source on windows

1. install python 2.7 (per openerp8's official doc, python 27 is required.) 2. download get-pip.py from https://bootstrap.pypa.io/get-pip.py , execute the command: python get-pip.py 3. get source of openerp8 from https://github.com/odoo/odoo.git 4. execute the command: pip install -r D:\source_code\odoo\openerp8/requirements.txt . (requirements.txt contains all dependencies. ) The pip will install the python module automatically. However, the real world always bring us the issues because our C++ compile environment is not setup correctly.  we will get the link error when pip try to install psycopg2 (driver to access postgresql db.). Go to  http://www.stickpeople.com/projects/python/win-psycopg/  and choose the compiled binary file directly. For Python-ldap, go to  http://www.lfd.uci.edu/~gohlke/pythonlibs/ 5. Finally, go to http://sourceforge.net/projects/pywin32/files/pywin32 and choose correct version for python-win32service. 6. If you are family with...