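According to official statistics, by the end of 2007 the number of privately owned cars in China had exceeded 18 million; Tianjin and Guangzhou had 17.4 and 12.7 private cars per hundred households, respectively. Owning a car means maintaining and repairing it. Yet a recent survey shows that 71.6% of respondents do not trust the car-repair industry, and only 11.4% find it basically trustworthy. Although so many people are dissatisfied with the service they get, there is nothing they can do about it, because they have no idea how many shady practices the industry hides...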
1. lrf2lrs http://blog.bigcomic.com/upload/lrf2lrs_03.zip
Converts LRF to LRS. A small tool written in Python.
2. pylrs http://blog.bigcomic.com/upload/pylrs-1.0.0.zip
A Python module that can generate LRF files.
The LRF format is a proprietary format used by Sony in its e-readers. The format is still undocumented. Enthusiasts have put a lot of effort into understanding it and into writing LRF conversion utilities ever since the Sony Librie (Japanese version) reader appeared on the market. This review is my personal view of the history and is limited to the resources I used in my own work. LRF conversion is a rapidly growing area of content-generation development (inspired by the new Sony Reader PRS-500). There are several other resources and wiki pages devoted to the LRF format and conversion tools; the goal of this page is to review the major achievements for beginners. This section reflects work done before February 2007, and some links may no longer work. I'm not going to update this page for new resources; please visit the forums listed in the Link section for up-to-date information.
The goal of this page is to provide a reference to the programs that can be used for understanding the LRF format; the programs listed here are not the only (and not necessarily the best) way to make LRF content.
Formats
LRF format. This binary format is used in the Sony Librie and Sony PRS-500 readers. The format is barely documented. Some very limited information can be found at http://www.sven.de/librie/Librie/BBeB . The LRF format can be understood from the Python code of the lrf2lrs converter and from the utilities for extracting objects from LRF files (see below). An LRF file can contain text or images; homemade programs sometimes use the ability to show images and refer to the result as a "picture" LRF (this is actually just a simple use of one of the features of the LRF format).
LRS (Librie Reader Source) format. LRS is a "source" XML format that describes the objects used in LRF. It was introduced in the Book Creator program by Canon (a commercial program with a Japanese interface; see below). The format is documented: http://www.y-adagio.com/public/committees/iec_pt62448/1_np(0509)/100_1017e_DC.pdf . LRS files can be generated in any of three ways: 1) with the commercial Book Creator program by Canon; 2) with the freeware Book Designer program (see below for more details); 3) with the Python lrf2lrs converter by roxfan and Igor Skochinsky, or with several newly developed utilities available at www.mobileread.com.
BBeB ("Broadband eBook") format by Sony. Another name for the LRF and, sometimes, LRS formats.
Programs useful for understanding the LRF format
Major programs for understanding the LRF format, listed by date:
1. Book Creator. Commercial program by Canon, first mentioned in the Yahoo Librie group in June 2004 (see post #18). The program has a Japanese interface. The development of homemade LRF content started from that point. First, many efforts were made to produce an English version of Book Creator (see the Librie Yahoo group); second, some homebrew programs use XYLogParser.dll (with the Lrs2lrf wrapper by roxfan) to create LRF files from LRS source; third, reverse engineering of the LRF format began, in order to develop programs independent of XYLogParser.dll.
2. Makelrf by scythic (first introduced in the Yahoo Librie group in October 2004). The program comes with C source code and is available in the Files section of the Librie group. It creates LRF files from text files, with support for images. It cannot create rich LRF files (only basic objects are supported), but it is hard to overestimate the progress achieved with this program. Makelrf is still widely used in many other programs as an engine for creating either simple text LRF or image-based LRF files. Until recently it was used in the BookDesigner program (in "simple" conversion mode) and in JAP programs to generate image-based LRFs.
3. LRFParser by roxfan (Yahoo Librie group, November 2004; comes with C source). The program decompiles LRF files into objects, including the TOC, the header and the compressed streams. To my knowledge, LRFParser is the first program in which complex LRF objects and the majority of tags were understood.
4. lrf2lrs (first version signed by roxfan, February 2006, Yahoo Librie group); the latest versions (signed by roxfan, Igor Skochinsky) are available at www.mobileread.com (Sony Portable Reader - Reader Developer's Corner - Lrf2Lrs thread, posts by igorsk). (I believe that roxfan = Igor Skochinsky = igorsk.) The program converts LRF files to LRS files. To date the source of this Python program is the main source of information on LRF objects and tags. Almost all possible LRF objects and tags are supported by the program.
5. LRFunpack (by me, available in the download section of this site, November 2006). A .NET 2.0 C# application for extracting objects from LRF files, with some translation of the tags and description of the streams. It creates a hex dump file for each object together with text files containing some decryption. I used this program to understand the LRF format. Its advantage is that it never stops decompiling, even when a tag is unknown. Some examples of decompiled LRF files can be found in the download section. Sorry, no source is available.
-----------
I did not mention Java tools because I have never tried them myself. Please have a look at the forums of the sites listed in the Link section. I believe the most significant contributions are flatLrf (http://monalipse.sourceforge.jp/tmp/lrf/, January 2005) and LRFParser.java (by Scotty1024, May 2005, Librie Yahoo group). FlatLrf is a Java application that creates LRF from HTML files; LRFParser.java is a Java version of LRFParser by roxfan.
BookDesigner (Wiki page) (by Valerii Woizechovsky, aka vvv). This is a universal program for conversion between many formats, reading, and ebook creation. The program creates an LRS file, then the LRS file is converted to LRF with external third-party converters (makelrf, lrs2lrf, LRSparser by AlexXF; the latest version uses my MSH_Lrsparser). BookDesigner's strength for understanding the LRF format is its ability to generate a custom LRS file together with the corresponding LRF file for further analysis.
LRF conversion utilities
Please refer to the Wiki page and the Link section for recent progress on content generation. Sony recently launched a site for developers at www.prslabs.com. Unfortunately there is almost no new information. It provides the LRS format specification (the one mentioned above), XYLogParser.dll (almost the same as the one discussed above, and also very slow), and a description of the interface around XYLogParser.dll.
An LRF file consists of a header, a number of objects and an object index. All values are in Intel (LSB first, i.e. little-endian) order.
| Offset (hex) | Size(bytes) | Name/meaning | Example value |
| 0 | 8 | LRF Signature | 4C 00 52 00 46 00 00 00 = "LRF" in Unicode |
| 8 | 2 | version? | 999 in most files |
| A | 2 | "Pseudo-Encryption" key byte | 48 |
| 0C | 4 | RootObjectID | 0x0044 |
| 10 | 8 | NumberOfObjects | 342 |
| 18 | 8 | ObjectIndexOffset | 0x00093440 |
| 20 | 4 | unknown | 0 |
| 24 | 1 | Flags (16 = back to front, 1 = front to back) | 16 |
| 25 | 1 | unknown (padding?) | 0 |
| 26 | 2 | unknown | 1600 |
| 28 | 2 | unknown (padding?) | 0 |
| 2A | 2 | Height? | 600 |
| 2C | 2 | Width? | 800 |
| 2E | 1 | unknown | 24 |
| 2F | 1 | unknown (padding?) | 0 |
| 30 | 0x14 | unknown | zeroes |
| 44 | 4 | Object ID of only PlaneStream (0x1E) object | 0x0042 |
| 48 | 4 | unknown | 0x1536 |
| 4C | 2 | XMLCompSize | 0x035C |
Next two fields are only present if version>=800.
| 4E | 2 | unknown | 0x0014 |
| 50 | 4 | GifSize | 0x03F2 |
The compressed XML metainfo, of size XMLCompSize, follows immediately after the header. Its first dword is the size of the uncompressed data; the rest is zlib-compressed Unicode XML.
If version>=800, the GIF thumbnail follows, of size GifSize.
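To make the header layout concrete, here is a minimal Python sketch that decodes the named fields from the table above and unpacks the XML metainfo. The function names `parse_lrf_header` and `read_metainfo` are my own, the fields marked unknown in the table are skipped, and the metainfo is returned as raw bytes because the table only says "unicode" without naming an encoding. Treat it as an illustration of the table, not as a tested LRF reader.

```python
import struct
import zlib

# "LRF" followed by a NUL, two bytes per character: 4C 00 52 00 46 00 00 00
LRF_SIGNATURE = "LRF\0".encode("utf-16-le")


def parse_lrf_header(data):
    """Decode the named header fields; all values are little-endian."""
    (signature, version, key, root_id,
     num_objects, index_offset) = struct.unpack_from("<8sHHIQQ", data, 0)
    if signature != LRF_SIGNATURE:
        raise ValueError("not an LRF file")
    (xml_comp_size,) = struct.unpack_from("<H", data, 0x4C)
    header = {
        "version": version,
        "key": key,
        "root_object_id": root_id,
        "number_of_objects": num_objects,
        "object_index_offset": index_offset,
        "xml_comp_size": xml_comp_size,
        # Without the two extra fields the metainfo starts at 0x4E.
        "metainfo_offset": 0x4E,
    }
    if version >= 800:
        # Two extra fields (unknown word at 0x4E, GifSize at 0x50).
        (header["gif_size"],) = struct.unpack_from("<I", data, 0x50)
        header["metainfo_offset"] = 0x54
    return header


def read_metainfo(data, header):
    """Return the decompressed XML metainfo as raw bytes."""
    start = header["metainfo_offset"]
    blob = data[start:start + header["xml_comp_size"]]
    (uncompressed_size,) = struct.unpack_from("<I", blob, 0)
    xml = zlib.decompress(blob[4:])
    if len(xml) != uncompressed_size:
        raise ValueError("metainfo size does not match its length prefix")
    return xml
```

Typical use would be to read the whole file into `data`, call `parse_lrf_header(data)`, and pass the result to `read_metainfo`.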
Offset to the index is specified by the ObjectIndexOffset in the header, and number of entries is NumberOfObjects.
Each index entry has the following layout:
| Offset (hex) | Size(bytes) | Name/meaning | Example value |
| 00 | 4 | id | 0x32 |
| 04 | 4 | offset | 0x07B0 |
| 08 | 4 | size | 0x44 |
| 0C | 4 | reserved? | 0 |
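The index layout above can be read with a few lines of Python. This is a sketch under the assumption that the entries are packed back to back with no padding, as the 16-byte layout suggests; `read_object_index` is a name made up for this example, and the offset and count are expected to come from ObjectIndexOffset and NumberOfObjects in the header.

```python
import struct

# One index entry: id, offset, size, reserved (four little-endian dwords).
INDEX_ENTRY = struct.Struct("<4I")


def read_object_index(data, index_offset, num_objects):
    """Return a list of (object id, file offset, size) tuples."""
    entries = []
    for i in range(num_objects):
        obj_id, offset, size, _reserved = INDEX_ENTRY.unpack_from(
            data, index_offset + i * INDEX_ENTRY.size
        )
        entries.append((obj_id, offset, size))
    return entries
```

Each tuple tells you where to slice the object out of the file: `data[offset:offset + size]`.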
See LrfObject
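The installation steps are basically the same as on Ubuntu, except that pam_userdb also has to be installed. I installed it with yum: yum install db4-utils
For Ubuntu, see the earlier post: http://blog.bigcomic.com/post/241.html
One difference: on Ubuntu the PAM log is /var/log/auth.log, while on CentOS 5.2 it is /var/log/secure.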
EOF
| Colors | 640x480 | 800x600 | 1024x768 | 1280x1024 |
| 256 | 0x301 | 0x303 | 0x305 | 0x307 |
| 32k | 0x310 | 0x313 | 0x316 | 0x319 |
| 64k | 0x311 | 0x314 | 0x317 | 0x31A |
| 16M | 0x312 | 0x315 | 0x318 | 0x31B |
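--auto-generate-sql, -a
Automatically generate the test table and data.
--auto-generate-sql-load-type=type
Type of test statements. Possible values are read, key, write, update and mixed (the default).
--number-char-cols=N, -x N
Number of character-type columns in the auto-generated test table; default 1.
--number-int-cols=N, -y N
Number of integer-type columns in the auto-generated test table; default 1.
--number-of-queries=N
Total number of test queries.
--query=name, -q
Run the test with a custom script, for example a call to your own stored procedure or an SQL statement.
--create-schema
Schema to run the test in; in MySQL a schema is the same thing as a database.
--commit=N
Commit after every N DML statements.
--compress, -C
Compress the traffic if both the server and the client support compression.
--concurrency=N, -c N
Concurrency, i.e. how many clients to simulate running select at the same time. Several values can be given, separated by commas or by the value of the --delimiter option.
--engine=engine_name, -e engine_name
Storage engine to use for the test table; several can be given.
--iterations=N, -i N
Number of times to run the test.
--detach=N
Disconnect and reconnect after every N statements.
--debug-info, -T
Print memory and CPU information.
--only-print
Only print the test statements without actually running them.
A test run needs a test table populated with test data. mysqlslap can generate both automatically; by default it creates a schema named mysqlslap, dropping it first if it already exists. Be careful here: do not point --create-schema at an existing database, or the consequences may be severe. Use --only-print to see what a test run would actually do:
$ mysqlslap -a --only-print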
DROP SCHEMA IF EXISTS `mysqlslap`;
CREATE SCHEMA `mysqlslap`;
use mysqlslap;
CREATE TABLE `t1` (intcol1 INT(32) ,charcol1 VARCHAR(128));
INSERT INTO t1 VALUES (1804289383,'mxvtvmC9127qJNm06sGB8R92q2j7vTiiITRDGXM9ZLzkdekbWtmXKwZ2qG1llkRw5m9DHOFilEREk3q7oce8O3BEJC0woJsm6uzFAEynLH2xCsw1KQ1lT4zg9rdxBL');
...
SELECT intcol1,charcol1 FROM t1;
INSERT INTO t1 VALUES (364531492,'qMa5SuKo4M5OM7ldvisSc6WK9rsG9E8sSixocHdgfa5uiiNTGFxkDJ4EAwWC2e4NL1BpAgWiFRcp1zIH6F1BayPdmwphatwnmzdwgzWnQ6SRxmcvtd6JRYwEKdvuWr');
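DROP SCHEMA IF EXISTS `mysqlslap`;
As you can see, the last step drops the schema created at the start, so the test leaves no trace in the database once it completes. Suppose we run one test with 50 and then 100 concurrent clients and 1000 queries in total:
$ mysqlslap -a --concurrency=50,100 --number-of-queries 1000 --debug-info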
Benchmark
Average number of seconds to run all queries: 0.375 seconds
Minimum number of seconds to run all queries: 0.375 seconds
Maximum number of seconds to run all queries: 0.375 seconds
Number of clients running queries: 50
Average number of queries per client: 20
Benchmark
Average number of seconds to run all queries: 0.453 seconds
Minimum number of seconds to run all queries: 0.453 seconds
Maximum number of seconds to run all queries: 0.453 seconds
Number of clients running queries: 100
Average number of queries per client: 10
User time 0.29, System time 0.11
Maximum resident set size 0, Integral resident set size 0
Non-physical pagefaults 4032, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
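Voluntary context switches 7319, Involuntary context switches 681
The results above show one benchmark result each for 50 and 100 concurrent clients; the more concurrent clients, the longer it takes to run all the queries. For better accuracy, run several iterations: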
$ mysqlslap -a --concurrency=50,100 --number-of-queries 1000 --iterations=5 --debug-info
Benchmark
Average number of seconds to run all queries: 0.380 seconds
Minimum number of seconds to run all queries: 0.377 seconds
Maximum number of seconds to run all queries: 0.385 seconds
Number of clients running queries: 50
Average number of queries per client: 20
Benchmark
Average number of seconds to run all queries: 0.447 seconds
Minimum number of seconds to run all queries: 0.444 seconds
Maximum number of seconds to run all queries: 0.451 seconds
Number of clients running queries: 100
Average number of queries per client: 10
User time 1.44, System time 0.67
Maximum resident set size 0, Integral resident set size 0
Non-physical pagefaults 17922, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 36796, Involuntary context switches 4093
The same run can also compare the performance of different storage engines:
$ mysqlslap -a --concurrency=50,100 --number-of-queries 1000 --iterations=5 --engine=myisam,innodb --debug-info
Benchmark
Running for engine myisam
Average number of seconds to run all queries: 0.200 seconds
Minimum number of seconds to run all queries: 0.188 seconds
Maximum number of seconds to run all queries: 0.210 seconds
Number of clients running queries: 50
Average number of queries per client: 20
Benchmark
Running for engine myisam
Average number of seconds to run all queries: 0.238 seconds
Minimum number of seconds to run all queries: 0.228 seconds
Maximum number of seconds to run all queries: 0.251 seconds
Number of clients running queries: 100
Average number of queries per client: 10
Benchmark
Running for engine innodb
Average number of seconds to run all queries: 0.375 seconds
Minimum number of seconds to run all queries: 0.370 seconds
Maximum number of seconds to run all queries: 0.379 seconds
Number of clients running queries: 50
Average number of queries per client: 20
Benchmark
Running for engine innodb
Average number of seconds to run all queries: 0.443 seconds
Minimum number of seconds to run all queries: 0.440 seconds
Maximum number of seconds to run all queries: 0.447 seconds
Number of clients running queries: 100
Average number of queries per client: 10
User time 2.83, System time 1.66
Maximum resident set size 0, Integral resident set size 0
Non-physical pagefaults 34692, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 87306, Involuntary context switches 10326
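When comparing several runs it helps to turn the report into numbers. The Python sketch below parses the Benchmark blocks shown above and derives a throughput figure. The function name `parse_benchmarks` is made up for this example, and "queries per second" is my own derived value (the total passed to --number-of-queries divided by the average run time); mysqlslap does not print it.

```python
import re

# Report lines of interest, written as they appear in the output above.
PATTERNS = {
    "engine": (re.compile(r"Running for engine (\S+)"), str),
    "avg_seconds": (
        re.compile(r"Average number of seconds to run all queries: ([\d.]+) seconds"),
        float,
    ),
    "clients": (re.compile(r"Number of clients running queries: (\d+)"), int),
}


def parse_benchmarks(report, total_queries):
    """Return one dict per Benchmark block, with a derived throughput."""
    results = []
    for raw_line in report.splitlines():
        line = raw_line.strip()
        if line == "Benchmark":
            results.append({})
            continue
        if not results:
            continue  # ignore anything before the first block
        for key, (pattern, convert) in PATTERNS.items():
            match = pattern.match(line)
            if match:
                results[-1][key] = convert(match.group(1))
    for result in results:
        # Derived figure, not part of mysqlslap's own output.
        result["queries_per_second"] = total_queries / result["avg_seconds"]
    return results
```

Fed the last report above with `total_queries=1000`, this yields four entries, one per engine and concurrency level, which can then be sorted or tabulated.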
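I. Main Functions
DRBD is in effect a block-device implementation, used mainly in high-availability (HA) setups on Linux. It consists of a kernel module and companion programs, and mirrors an entire device over the network, somewhat like a network RAID. In other words, when you write data to the file system on the local DRBD device, the data is sent at the same time to another host on the network and recorded there in exactly the same form (in fact, even the creation of the file system is replicated by DRBD's synchronization). Data on the local node (host) and the remote node (host) is kept in sync in real time, and I/O consistency is guaranteed. So when the local host fails, the remote host still holds an identical copy of the data that can continue to be used, which is how high availability is achieved.
In an HA solution, DRBD can take the place of a shared disk-array storage device. Because the data exists on both the local and the remote host, when a failover is needed the remote host simply uses its own copy of the data and continues to provide service.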
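II. Supported Lower-Level Devices
DRBD is built on top of a lower-level device and presents a block device of its own. To the user, a DRBD device looks just like a physical disk on which a file system can be created. DRBD supports the following kinds of lower-level devices:
1. A disk, or a partition of a disk;
2. A software RAID device;
3. An LVM logical volume;
4. An EVMS (Enterprise Volume Management System) volume;
5. Any other block device.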
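III. Configuration Overview
1. Global section (global)
About the only thing to configure here is whether usage-count is yes or no. The usage-count parameter merely lets LINBIT collect statistics on how DRBD is used: when DRBD is installed or upgraded, it sends information over HTTP to LINBIT's servers.
2. Common section (common)
Here common means common to the multiple resources managed by DRBD. This section mainly holds parameters that can be set identically for all DRBD resources, such as protocol, syncer and so on.
3. Resource section (resource)
The resource sections configure all the resources managed by DRBD, including each node's IP information, the name of the lower-level storage device, the device size, how the meta-data is stored, the device name DRBD exposes, and so on. Each resource must contain the information for every node, not just for the local node. In fact, the drbd.conf file must be exactly identical on every node of a DRBD cluster.
A resource also has a number of other internal sections:
net: network-related settings, for example whether two primary nodes are allowed (allow-two-primaries).
startup: settings applied at startup, for example which node becomes primary after startup (or both: become-primary-on both).
syncer: synchronization settings. Here you can set the re-synchronization rate (rate) and whether to enable online verification of data consistency between the nodes (verify-alg; the available algorithms include md5, sha1 and crc32c). Data verification can be quite important. Once online verification is enabled, it is started with drbdadm verify resource_name. During verification DRBD records the blocks that differ between the nodes but does not block anything, not even I/O requests to those inconsistent blocks. Once inconsistent blocks exist, DRBD needs to re-synchronize, and the rate setting in syncer applies mainly to that re-synchronization: with a large amount of inconsistent data we cannot hand all the bandwidth to DRBD for re-synchronization, because that would hurt the service being provided. The rate also has to take I/O capacity into account. If we have a gigabit network link but the disks can only sustain 50 MB/s, the effective throughput is only 50 MB/s. As a rule of thumb, giving re-synchronization 30% of the smaller of the network and disk I/O capacities is appropriate (per the official documentation). DRBD also provides a command to change the rate temporarily: drbdsetup /dev/drbd0 syncer -r 100M sets the re-synchronization rate to 100M for the time being. After the re-synchronization has finished, run drbdadm adjust resource_name to make DRBD go back to the rate in the configuration.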
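The 30% rule of thumb is simple arithmetic; a small Python sketch makes it explicit. The function name `suggested_sync_rate` is made up for this example, and the 125 MB/s figure for a gigabit link is my own rough assumption (1000 Mbit/s divided by 8, ignoring protocol overhead).

```python
def suggested_sync_rate(network_mb_s, disk_mb_s, fraction=0.3):
    """Return a re-synchronization rate in MB/s.

    The effective throughput is limited by the slower of the network
    and the disks; only a fraction of it is given to re-synchronization.
    """
    return min(network_mb_s, disk_mb_s) * fraction


# Gigabit link (assumed ~125 MB/s) with disks that sustain 50 MB/s:
# the disks are the bottleneck, so the result is 30% of 50 MB/s.
rate = suggested_sync_rate(125, 50)
print(f"suggested rate: about {rate:.0f} MB/s")
```

For the example in the text this comes out at roughly 15 MB/s, which would be the value to put into the rate setting.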
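V. Resource Management
1. Growing a resource:
When a DRBD resource runs out of capacity and the lower-level device supports online growth (for example with LVM), we can first grow the lower-level device and then run drbdadm resize resource_name to grow the resource. Note that this is only possible in single-primary mode, and the lower-level device must first be grown on all nodes. Then run the resize command on the primary node only. The resize triggers a re-synchronization from the current primary to all secondary nodes.
If the lower-level device was grown while DRBD was not running, no resize command is needed when DRBD is started again (provided, of course, that no size is specified for the disk parameter in the configuration file); DRBD notices the larger capacity by itself.
When growing the lower-level device, be very careful not to modify the data already on it, especially DRBD's meta-data; otherwise all data may be destroyed.
2. Shrinking a resource:
Shrinking is far more dangerous than growing, because it can much more easily cause data loss. Before shrinking a resource, you must first shrink what sits on top of the DRBD device, i.e. the file system. If the file system does not support shrinking, the resource cannot be shrunk either.
If the meta-data was configured as internal, then when shrinking do not count only the space needed by your own data; add the space needed by DRBD's meta-data as well.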
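To get a feel for how much space internal meta-data needs, the Python sketch below uses the estimate I remember from LINBIT's DRBD 8 user's guide: ceil(Cs / 2^18) * 8 + 72 sectors, where Cs is the device size in 512-byte sectors. The formula is not part of this post and may not hold for other DRBD versions, so check it against the documentation for your release; the function name is made up for this example.

```python
SECTOR_BYTES = 512  # DRBD counts sizes in 512-byte sectors


def internal_metadata_sectors(device_bytes):
    """Estimate the internal meta-data size in sectors (DRBD 8 formula, see above)."""
    device_sectors = device_bytes // SECTOR_BYTES
    chunks = -(-device_sectors // 2**18)  # ceiling division
    return chunks * 8 + 72


one_gib = 2**30
sectors = internal_metadata_sectors(one_gib)
print(sectors, "sectors =", sectors * SECTOR_BYTES // 1024, "KiB")
```

The lower-level device therefore has to stay at least that much larger than the shrunken file system.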
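Once the file system has been shrunk, the resource can be resized online with: drbdadm -- --size=***G resize resource_name. After the resource has been shrunk, you can shrink the lower-level device yourself to release the space (if it supports that).
To shrink with DRBD stopped, proceed as follows:
a. Shrink the file system while the resource is still online
b. Take the DRBD resource down: drbdadm down resource_name
c. Export the DRBD meta-data (on all nodes): drbdadm dump-md resource_name > /path_you_want_to_save/file_name
d. Shrink the lower-level device on all nodes
e. Change the la-size-sect item in the dumped meta-data to the new size (expressed as a number of sectors)
f. If the meta-data is internal, re-create it: drbdadm create-md resource_name
g. Import the dumped and edited meta-data back into DRBD (an import snippet taken from the LINBIT web site):
drbdmeta_cmd=$(drbdadm -d dump-md test-disk)
${drbdmeta_cmd/dump-md/restore-md} /path_you_want_to_save/file_name
h. Bring the resource up: drbdadm up resource_name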
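Step e asks for the new size as a number of sectors. A short Python sketch of that conversion, assuming 512-byte sectors (the unit DRBD uses, to my knowledge) and a size given in GiB; `size_to_sectors` is a name made up for this example.

```python
SECTOR_BYTES = 512  # assumed sector size for la-size-sect


def size_to_sectors(size_gib):
    """Convert a size in GiB to a count of 512-byte sectors."""
    return size_gib * 2**30 // SECTOR_BYTES


# A resource shrunk to 10 GiB:
print("la-size-sect", size_to_sectors(10))
```

The printed number is what goes into the la-size-sect item of the dump before it is restored in step g.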
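VI. Disk Failure
1. Detaching the resource
If on-io-error is set to pass_on in the resource's disk section, DRBD will not detach the lower-level device by itself after a disk failure. We then have to detach it manually (drbdadm detach resource_name) and check the ds (disk state) of each node, either with cat /proc/drbd or with the dedicated command drbdadm dstate resource_name. Once the failed side shows Diskless, we are done. If on-io-error is not configured, or is set to detach, the steps above happen automatically.
Also, if the node with the failed disk is the current primary, we need to switch roles first and then perform the steps above.
2. Replacing the disk
With the resource detached, the disk can be replaced. With internal meta-data, after the new disk is in place we only need to re-create the meta-data (drbdadm create-md resource_name) and attach the resource again (drbdadm attach resource_name); DRBD then immediately starts re-synchronizing from the current primary to this node. The progress can be followed in /proc/drbd.
If the meta-data is not internal, i.e. it is stored outside the resource, then after the steps above (re-creating the meta-data) one more step is needed to trigger the re-synchronization: drbdadm invalidate resource_name.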
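VII. Node Crash (or Planned Maintenance)
1. Secondary node
If the secondary node crashes, the primary temporarily loses its connection to the secondary and its cs (connection state) should change to WFConnection, i.e. waiting for a connection. The primary keeps serving and records in its meta-data every block that changes after the connection is lost. When the secondary restarts and reconnects, the primary-to-secondary re-synchronization starts automatically. During re-synchronization, however, the data on the primary and the secondary is inconsistent. If the primary crashes as well at this point, the secondary cannot be promoted to primary; without another backup, all data would be lost.
2. Primary node
In general, a primary crash affects DRBD in much the same way as a secondary crash. The only difference is one extra step: the secondary must first be switched to primary so that service continues. DRBD does not perform this switch by itself; it takes manual intervention. Once the crashed former primary has been repaired, restarted and connected to the current primary, it runs as a secondary and starts re-synchronizing the data that changed in the meantime.
When the primary crashes, DRBD guarantees the consistency of the data already replicated to the former secondary. This avoids the situation where, after a primary crash, the secondary cannot be switched to primary because of inconsistent data, or can be switched but cannot serve properly.
3. Permanent node failure (the machine must be replaced or the software reinstalled)
When a node can no longer be readily repaired because of a hardware (or software) problem, we are facing a host replacement (or a reinstall from the OS up). What we need to do is provide a machine similar to the original node, install the OS and the required software, copy the DRBD configuration file (/etc/drbd.conf) from the node currently providing service, create the meta-data, and start the DRBD service. The node then connects to the existing primary as a secondary, and re-synchronization starts automatically.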
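VIII. Handling Split Brain
Split brain means that, for some reason, the two DRBD nodes lost their connection and both are running as primary. When a primary connects to its peer to send data and finds that the peer is also primary, it immediately drops the connection and concludes that split brain has occurred; it then writes the following to the system log: "Split-Brain detected, dropping connection!" After split brain, at least one node shows the connection state StandAlone; the other may also be StandAlone (if both detected the split brain at the same time) or WFConnection.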
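The connection states mentioned above can be checked from a script. The Python sketch below pulls the cs, ro and ds fields out of /proc/drbd-style text and lists the devices in StandAlone state. The sample status text is my own reconstruction of the DRBD 8.3 layout rather than output from this setup, and both function names are made up for this example; a StandAlone device is only a hint that split brain may have occurred, not proof.

```python
import re

# A device line starts with the minor number, e.g. " 0: cs:Connected ro:..."
DEVICE_LINE = re.compile(r"^\s*(\d+):\s+(.*)$")


def parse_proc_drbd(text):
    """Map each device minor number to its key:value status fields."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if not match:
            continue  # version banner, counters and progress lines
        fields = dict(
            token.split(":", 1)
            for token in match.group(2).split()
            if ":" in token
        )
        devices[int(match.group(1))] = fields
    return devices


def standalone_devices(devices):
    """Return the minor numbers whose connection state is StandAlone."""
    return [
        minor
        for minor, fields in sorted(devices.items())
        if fields.get("cs") == "StandAlone"
    ]


# Assumed sample, modelled on DRBD 8.3 output.
SAMPLE = """version: 8.3.11 (api:88/proto:86-96)
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
"""

print(standalone_devices(parse_proc_drbd(SAMPLE)))
```

On a real node the input would be the contents of /proc/drbd, and the system log should be checked for the split-brain message before any data is discarded.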
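If automatic split-brain recovery is configured (LINBIT apparently does not recommend this), DRBD resolves the split brain by itself, following the policies set in the configuration.
Without automatic recovery, we can resolve it manually. First we must decide which side is to be the primary after recovery; having decided that, we also accept losing every change made on the other node after the split brain. Then we recover as follows:
a. On the node chosen to become secondary, switch to secondary and discard that node's data for the resource:
drbdadm secondary resource_name
drbdadm -- --discard-my-data connect resource_name
b. On the node chosen to be primary, reconnect to the secondary (this can be skipped if the node's connection state is already WFConnection):
drbdadm connect resource_name
After these steps, re-synchronization from the new primary to the secondary starts automatically.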
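IX. Comparing Meta-Data Locations
1. Internal meta-data (meta-data and data stored on the same lower-level device)
Advantage: once created, the meta-data is tied to the actual data, which makes maintenance simpler; there is no worry about the meta-data being lost through some operation. Also, when a disk fails and data is lost, the meta-data is lost with it; after replacing the disk we only need to re-create the meta-data, and the lost data is easily synchronized from the other node.
Disadvantage: if the lower-level device is a single disk, with no RAID and no LVM, performance may suffer. Every write also has to update the meta-data, so each write turns into two writes, and the disk head is bound to make long seeks because the meta-data is stored at the very end of the disk device. This lowers write performance.
2. External meta-data (meta-data stored on a separate device from the data)
Advantage: the exact opposite of the internal meta-data disadvantage; it removes the write I/O contention.
Disadvantage: because the meta-data is stored apart from the data device, a full synchronization must be started manually after a failed disk has been replaced. That makes administration slightly, only very slightly, more cumbersome.
If we want to create a DRBD resource on a device that already holds data, without losing that data, and we can neither grow the lower-level device nor shrink the file system on top of it, then the only option is external meta-data.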