Binary-parser

Binary-parser is a parser builder library for Node.js that lets you write efficient binary parsers in a simple and declarative way.

It supports all the common data types needed to analyze structured binary data. Binary-parser dynamically generates and compiles the parser code on the fly, so the result runs as fast as a hand-written parser (which takes much more time and effort to write). Supported data types are:

  • Integers (supports 8, 16 and 32 bit signed and unsigned integers)
  • Floating point numbers (supports 32 and 64 bit floating point values)
  • Bit fields (supports bit fields with length from 1 to 32 bits)
  • Strings (supports various encodings, fixed-length and variable-length, zero-terminated strings)
  • Arrays (supports user-defined element type, fixed-length and variable-length)
  • Choices
  • User defined types

This library's features are inspired by BinData, and its syntax by binary.

Installation

Binary-parser can be installed with npm:

$ npm install binary-parser

Quick Start

  1. Create an empty Parser object with new Parser().
  2. Chain builder methods to build the desired parser. (See API for detailed documentation of each method.)
  3. Call Parser.prototype.parse with a Buffer object passed as the argument.
  4. The parsed result is returned as an object.
// Module import
var Parser = require('binary-parser').Parser;

// Build an IP packet header Parser
var ipHeader = new Parser()
    .endianess('big')
    .bit4('version')
    .bit4('headerLength')
    .uint8('tos')
    .uint16('packetLength')
    .uint16('id')
    .bit3('offset')
    .bit13('fragOffset')
    .uint8('ttl')
    .uint8('protocol')
    .uint16('checksum')
    .array('src', {
        type: 'uint8',
        length: 4
    })
    .array('dst', {
        type: 'uint8',
        length: 4
    });

// Prepare buffer to parse.
var buf = Buffer.from('450002c5939900002c06ef98adc24f6c850186d1', 'hex');

// Parse buffer and show result
console.log(ipHeader.parse(buf));
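For comparison, here is a rough sketch of what a hand-written parser for the same header could look like, using only Node's built-in Buffer methods. It is not the code that binary-parser generates; parseIpHeader is a hypothetical name, and the sketch assumes that bit fields are taken most significant bit first, as the version and headerLength fields above suggest.

```javascript
// Hand-written equivalent of the ipHeader parser above (illustrative only).
// parseIpHeader is a hypothetical helper, not part of binary-parser.
function parseIpHeader(buf) {
    // Bytes 6-7 hold a 3-bit field followed by a 13-bit field.
    const flagsAndOffset = buf.readUInt16BE(6);
    return {
        version: buf[0] >> 4,                 // upper 4 bits of byte 0
        headerLength: buf[0] & 0x0f,          // lower 4 bits of byte 0
        tos: buf.readUInt8(1),
        packetLength: buf.readUInt16BE(2),
        id: buf.readUInt16BE(4),
        offset: flagsAndOffset >> 13,         // top 3 bits
        fragOffset: flagsAndOffset & 0x1fff,  // low 13 bits
        ttl: buf.readUInt8(8),
        protocol: buf.readUInt8(9),
        checksum: buf.readUInt16BE(10),
        src: Array.from(buf.subarray(12, 16)),
        dst: Array.from(buf.subarray(16, 20))
    };
}

const ipBuf = Buffer.from('450002c5939900002c06ef98adc24f6c850186d1', 'hex');
const ipResult = parseIpHeader(ipBuf);
console.log(ipResult.version, ipResult.packetLength, ipResult.src.join('.'));
// prints: 4 709 173.194.79.108
```

Every offset and mask here has to be kept in sync by hand, which is the bookkeeping the declarative builder takes over.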

API

new Parser()

Constructs a Parser object. Returned object represents a parser which parses nothing.

parse(buffer[, callback])

Parse a Buffer object buffer with this parser and return the resulting object. When parse(buffer) is called for the first time, parser code is compiled on-the-fly and internally cached.

create(constructorFunction)

Set the constructor function that should be called to create the object returned from the parse method.

[u]int{8, 16, 32}{le, be}(name [,options])

Parse bytes as an integer and store it in a variable named name. name should consist only of alphanumeric characters and start with a letter. The number of bits can be 8, 16 or 32. Byte ordering is selected with the suffix le for little endian or be for big endian. Without the u prefix the value is parsed as a signed number; with the u prefix it is parsed as an unsigned number.

var parser = new Parser()
    // Signed 32-bit integer (little endian)
    .int32le('a')
    // Unsigned 8-bit integer
    .uint8('b')
    // Signed 16-bit integer (big endian)
    .int16be('c')
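To make the signed/unsigned and byte-order distinctions concrete, the sketch below reads the same three fields with Node's built-in Buffer methods. It is independent of binary-parser, and the sample bytes are made up for illustration.

```javascript
// Sample bytes chosen for illustration: fe ff ff ff | 80 | 12 34
const intBuf = Buffer.from([0xfe, 0xff, 0xff, 0xff, 0x80, 0x12, 0x34]);

const a = intBuf.readInt32LE(0);  // signed 32-bit, little endian
const b = intBuf.readUInt8(4);    // unsigned 8-bit
const c = intBuf.readInt16BE(5);  // signed 16-bit, big endian

console.log(a, b, c);             // prints: -2 128 4660

// The same byte 0x80 read as a signed 8-bit integer instead:
console.log(intBuf.readInt8(4));  // prints: -128
```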

bit[1-32](name [,options])

Parse bytes as a bit field and store it in a variable named name. There are 32 methods, bit1 through bit32, one for each field width from 1 to 32 bits.
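As an illustration of what a bit field read involves, here is a small sketch that extracts fields by absolute bit position. readBits is a hypothetical helper, not a binary-parser function, and it assumes most-significant-bit-first ordering, which matches the big-endian IP header example in Quick Start.

```javascript
// readBits is a hypothetical helper: it reads `length` bits (1 to 32)
// starting at absolute bit position `bitOffset`, most significant bit first.
function readBits(buf, bitOffset, length) {
    let value = 0;
    for (let i = 0; i < length; i++) {
        const pos = bitOffset + i;
        const byte = buf[pos >> 3];                 // byte containing this bit
        const bit = (byte >> (7 - (pos & 7))) & 1;  // bit 0 is the MSB of the byte
        value = value * 2 + bit;  // multiply rather than shift so 32-bit values stay unsigned
    }
    return value;
}

const bitBuf = Buffer.from([0x45, 0x00]);
console.log(readBits(bitBuf, 0, 4)); // prints: 4
console.log(readBits(bitBuf, 4, 4)); // prints: 5
```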

{float, double}{le, be}(name [,options])

Parse bytes as a floating-point value and store it in a variable named name. name should consist only of alphanumeric characters and start with a letter.

var parser = new Parser()
    // 32-bit floating value (big endian)
    .floatbe('a')
    // 64-bit floating value (little endian)
    .doublele('b')

string(name [,options])

Parse bytes as a string. name should consist only of alphanumeric characters and start with a letter. options is an object; the following options are available:

  • encoding - (Optional, defaults to utf8) Specify which encoding to use. 'utf8', 'ascii', 'hex' and others are valid. See Buffer.toString for more info.
  • length - (Optional) Length of the string. Can be a number, string or a function. Use a number for statically sized strings, a string to reference another variable, and a function to do some calculation.
  • zeroTerminated - (Optional, defaults to false) If true, then this parser reads until it reaches a zero byte.
  • stripNull - (Optional, must be used with length) If true, then null characters are stripped from the end of the string.
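The difference between a fixed-length read, a fixed-length read with trailing nulls stripped, and a zero-terminated read can be sketched with plain Buffer calls. This only illustrates the behavior the options describe; how binary-parser implements them internally is not shown here, and the sample data is made up.

```javascript
// "hello" padded with NULs to 8 bytes, followed by a zero-terminated "world".
const strBuf = Buffer.from('hello\0\0\0world\0', 'utf8');

// Fixed length of 8 bytes: the padding NULs are part of the result.
const fixed = strBuf.toString('utf8', 0, 8);

// Same read with trailing NULs removed, as stripNull describes.
const stripped = fixed.replace(/\0+$/, '');

// Zero-terminated: read from offset 8 up to, but not including, the next zero byte.
const end = strBuf.indexOf(0, 8);
const terminated = strBuf.toString('utf8', 8, end);

console.log(fixed.length, stripped.length); // prints: 8 5
console.log(stripped, terminated);          // prints: hello world
```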

buffer(name [,options])

Parse bytes as a buffer. name should consist only of alphanumeric characters and start with a letter. options is an object; the following options are available:

  • clone - (Optional, defaults to false) By default, buffer(name [,options]) returns a new buffer which references the same memory as the parser input, offset and cropped to the requested range. If this option is true, the input is copied and the returned buffer references separate memory.
  • length - (either length or readUntil is required) Length of the buffer. Can be a number, string or a function. Use a number for statically sized buffers, a string to reference another variable, and a function to do some calculation.
  • readUntil - (either length or readUntil is required) If 'eof', then this parser reads until it reaches the end of the Buffer object.
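Why clone matters can be seen with Node's Buffer alone: a view shares memory with the input, so later writes to the input show through, while a copy does not. The sketch below uses subarray and Buffer.from to stand in for the two behaviors; it does not call binary-parser.

```javascript
const input = Buffer.from([1, 2, 3, 4, 5, 6]);

const view = input.subarray(2, 4);               // shares memory with input
const copy = Buffer.from(input.subarray(2, 4));  // independent copy of the same range

input[2] = 0xff; // mutate the original after "parsing"

console.log(view[0]); // prints: 255 (the view sees the change)
console.log(copy[0]); // prints: 3   (the copy does not)
```

Leaving clone off avoids a copy, which is cheaper, but it is only safe when the input buffer is not reused or modified while the parsed result is still in use.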

array(name [,options])

Parse bytes as an array. options is an object; following options are available:

  • type - (Required) Type of the array element. Can be a string or a user-defined Parser object. If it's a string, you have to choose from [u]int{8, 16, 32}{le, be}.
  • length - (either length or readUntil is required) Length of the array. Can be a number, string or a function. Use a number for statically sized arrays, a string to reference another variable, and a function to do some calculation.
  • readUntil - (either length or readUntil is required) If 'eof', then this parser reads until the end of the Buffer object. If it is a function, the parser reads until the function returns true.
var parser = new Parser()
    // Statically sized array
    .array('data', {
        type: 'int32',
        length: 8
    })

    // Dynamically sized array (reference another variable)
    .uint8('dataLength')
    .array('data2', {
        type: 'int32',
        length: 'dataLength'
    })

    // Dynamically sized array (with some calculation)
    .array('data3', {
        type: 'int32',
        length: function() { return this.dataLength - 1; } // other fields are available through this
    })

    // Dynamically sized array (with stop-check on parsed item)
    .array('data4', {
        type: 'int32',
        readUntil: function(item, buffer) { return item === 42 } // stop when specific item is parsed. buffer can be used to perform a read-ahead.
    })

    // Use user defined parser object
    .array('data5', {
        type: userDefinedParser,
        length: 'dataLength'
    });
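The three forms that length accepts can be summarized in a few lines. resolveLength below is a hypothetical helper written for this explanation, not binary-parser's internal code; it assumes, as the data3 example does, that a function is called with this bound to the fields parsed so far.

```javascript
// resolveLength is a hypothetical helper that mirrors the documented rules:
// number -> used as is, string -> name of an earlier field, function -> computed.
function resolveLength(length, parsed) {
    if (typeof length === 'number') return length;
    if (typeof length === 'string') return parsed[length];
    if (typeof length === 'function') return length.call(parsed);
    throw new Error('length must be a number, string or function');
}

const parsedSoFar = { dataLength: 5 }; // fields parsed before the array

console.log(resolveLength(8, parsedSoFar));             // prints: 8
console.log(resolveLength('dataLength', parsedSoFar));  // prints: 5
console.log(resolveLength(function () { return this.dataLength - 1; }, parsedSoFar)); // prints: 4
```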

choice(name [,options])

Choose one parser from several choices according to a field value. Combining choice with array is useful for parsing a typical Type-Length-Value styled format.

  • tag - (Required) The value used to determine which parser to use from the choices. Can be a string pointing to another field or a function.
  • choices - (Required) An object whose keys are integers and whose values are the parsers to execute when tag equals the key.
  • defaultChoice - (Optional) The parser to use when the tag value doesn't match any of the choices.
var parser1 = ...;
var parser2 = ...;
var parser3 = ...;

var parser = new Parser()
    .uint8('tagValue')
    .choice('data', {
        tag: 'tagValue',
        choices: {
            1: parser1, // When tagValue == 1, execute parser1
            4: parser2, // When tagValue == 4, execute parser2
            5: parser3  // When tagValue == 5, execute parser3
        }
    });
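The tag dispatch that choice expresses declaratively looks roughly like the following when written by hand for a Type-Length-Value stream. The record layout (1-byte tag, 1-byte length, then the value), the decoders table, and parseTlv are all assumptions made for this sketch; none of them come from binary-parser.

```javascript
// Hypothetical record layout: 1-byte tag, 1-byte length, then `length` bytes of value.
// The decoders table plays the role of the `choices` object.
const decoders = {
    1: function (v) { return v.readUInt8(0); },       // tag 1: unsigned 8-bit integer
    4: function (v) { return v.readUInt16BE(0); },    // tag 4: unsigned 16-bit, big endian
    5: function (v) { return v.toString('ascii'); }   // tag 5: ASCII string
};

function parseTlv(buf) {
    const records = [];
    let offset = 0;
    while (offset < buf.length) {
        const tag = buf.readUInt8(offset);
        const length = buf.readUInt8(offset + 1);
        const value = buf.subarray(offset + 2, offset + 2 + length);
        const decode = decoders[tag];
        if (!decode) throw new Error('Unknown tag: ' + tag); // no default choice in this sketch
        records.push({ tag: tag, data: decode(value) });
        offset += 2 + length;
    }
    return records;
}

const tlvBuf = Buffer.from([1, 1, 0x2a, 4, 2, 0x01, 0x00, 5, 2, 0x68, 0x69]);
console.log(parseTlv(tlvBuf));
```

With the builder, the loop corresponds to array, the lookup to choice, and the error branch to the case where no defaultChoice is given.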

nest(name [,options])

Nest a parser in this position. Parse result of the nested parser is stored in the variable name.

  • type - (Required) A Parser object.

skip(length)

Skip parsing for length bytes.

endianess(endianess)

Define what endianess to use in this parser. endianess can be either 'little' or 'big'. The default endianess of Parser is set to big-endian.

var parser = new Parser()
    .endianess('little')
    // You can specify endianess explicitly
    .uint16be('a')
    .uint32le('b')
    // Or you can omit endianess (in this case, little-endian is used)
    .uint16('c')
    .int32('d')

compile()

Compile this parser on-the-fly and cache its result. Usually, there is no need to call this method directly, since it's called when parse(buffer) is executed for the first time.

getCode()

Dynamically generates the code for this parser and returns it as a string. Usually used for debugging.

Common options

These are common options that can be specified in all parsers.

  • formatter - Function that transforms the parsed value into a more desired form.
var parser = new Parser()
  .array('ipv4', {
    type: 'uint8',
    length: 4,
    formatter: function(arr) { return arr.join('.'); }
  });
  • assert - Assert on the parsed result (useful for checking magic numbers and so on). If assert is a string or number, the parsed result is compared with it using === (strict equality), and an exception is thrown if they differ. If assert is a function, it is called with one argument (the parsed result), and an exception is thrown if it returns false.

    // Simple magic number validation
    var ClassFile =
        Parser.start()
        .endianess('big')
        .uint32('magic', {assert: 0xcafebabe});
    
    // Doing more complex assertion with a predicate function
    var parser = new Parser()
        .int16le('a')
        .int16le('b')
        .int16le('c', {
            assert: function(x) {
                return this.a + this.b === x;
            }
        });
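The two assert modes can be sketched as a single check. checkAssert is a hypothetical helper written to illustrate the rule described above, not binary-parser's implementation; in particular, it treats any falsy return value as a failure and assumes the predicate runs with this bound to the fields parsed so far, as the example with this.a and this.b implies.

```javascript
// checkAssert is a hypothetical helper illustrating the documented rule.
// assertion: expected string/number, or a predicate called with the parsed value.
// parsed: fields parsed so far, exposed to the predicate as `this` (an assumption).
function checkAssert(assertion, value, parsed) {
    const ok = typeof assertion === 'function'
        ? assertion.call(parsed, value)  // predicate form
        : assertion === value;           // strict equality form
    if (!ok) throw new Error('Assertion failed for value: ' + value);
}

// Magic number check: passes silently.
const magic = Buffer.from('cafebabe', 'hex').readUInt32BE(0);
checkAssert(0xcafebabe, magic, {});

// Predicate check with access to earlier fields: 1 + 2 === 3, so it passes.
checkAssert(function (x) { return this.a + this.b === x; }, 3, { a: 1, b: 2 });

// A mismatch throws.
let threw = false;
try { checkAssert(0xcafebabe, 0xdeadbeef, {}); } catch (e) { threw = true; }
console.log(threw); // prints: true
```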

Examples

See the example directory for more complex parsers.

Support

Please use the issue tracker if you have difficulty using this module, find a bug, or want to request a new feature.

Pull requests with fixes and improvements are welcome!

License

The MIT License (MIT)

Copyright (c) 2013-2014 Keichi Takahashi keichi.t@me.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


