|
|
# faultreiber
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber?ref=badge_shield)
`faultreiber` generates a parser library in C for a structured (binary) file format. The input is an XML file that describes the format.<br/>
The C source code will be in the form of multiple source and header files.<br/>
The headers have header guards and are already `extern "C"`ed.<br/>
## Demo
For a practical example, look at the example XML file under `resources`. The XML file describes the format of a WASM object file:<br/>
To run the demo, run `run.sh`, go to the `test` direcotory and run `make test` to run the executable.<br/>
To run `valgrind --leak-check=yes` run `make valgrind`.<br/>
## Memory Leaks
Code generated by faultreiber should not leak any memory if everything went according to plan during code-gen. If that's not the case let me knkow.<br/>
## How to Use
A function named `read_aggr_{name}` will be generated that takes an `int _fd` file descriptor for the file that it will read.`{name}` is what you pass to faultreiber with the `--name` option.<br/>
The return type will be a C structure with type `{name}_lib_ret_t`. The struct is defined as:<br/>
```C
typedef struct {
name_obj_t obj;
void** void_train;
uint64_t* current_void_size;
uint64_t* current_void_count;
}name_lib_ret_t
```
`{name}_obj_t` is a C structure defined in `aggregate.h` that holds all the read modules.<br/>
A function named `release_all_{name}` will be generated in `aggregate.c` that releases almost all the memory.<br/>
The proper order of realeasing the memory in the client code will be like below assuming the return value of `read_aggr_{name}` is stored in `lib_ret` and `--name` was passed a value of `wasm`:<br/>
```C
release_all_wasm(lib_ret->void_train, lib_ret->current_void_count);
free(lib_ret->obj);
free(lib_ret);
```
## faultreiber XML file
The root node should have two childs, named exactly `READ` and `DEFINITION`(order not important).<br/>
The `READ` node will include the actual structures that the parser will read and can return.<br/>
The `DEFINITION` node includes the definitions for the structures that are aggregate.<br/>
## Rules:
Any child node of either `DEFINITION` or `READ` will have to at least have the attributes `name` and `type` defined. The presence of the attribute `count` is optional but if it's not present faultreiber will assume that the count is one.<br/>
The presence of the attribute `isaggregate` signifies the fact that the data structure is composed of other smaller parts. faultreiber will only read the children of a node that is the child of either the `DEFINITION` or `READ` node(unless a child node has the attribute `conditional` set). If a data structure requires more children then you should add a new node under `DEFINITION` and reference that node from it's parent. In other words, an aggregate node can't itself have child nodes that are aggregate.<br/>
`count`, `size`, `type` and `condition` attributes can reference a child node of the `DEFINITION` node. To do that, you should use `self::TAG`.<br/>
the tag names of the nodes that are on the same level should be unique. The `name` attribute of the nodes on the same level need to be unique as well.<br/>
The order of the nodes that appear as children of the `DEFINITION` node, even when the child nodes are referencing each other, is unimportant to faultreiber.<br/>
Tags should follow the naming convention for naming XML nodes. The `name` attributes should follow the C identifier naming convention(if the value of the `name` attribute is invalid in C as as identifier you're going to end up with code that won't even build).<br/>
The following values are valid values for the `type` attribute:<br/>
* int8
* uint8
* int16
* uint16
* int32
* uint32
* int64
* uint64
* int128
* uint128
* float
* double
* string
* FT::conditional
* self::TAG
For string nodes, the node should either have a non-empty `size` attribute or have a `delimiter` attribute. In case a `delimiter` attribute is selected the value of the delimiter should be provided as the value of the `delimiter` attribute to the node.<br/>
Strings read through a `delimiter` node will have their delimiter attached to the end of the string(null-terminated or otherwise). String reads that have a `size` attribute will be forcefully null-terminated even if the original string was not null-terminated.<br/>
Child nodes of `READ` node that have the `unordered` attribute set, will be regarded as such, meaning they can appear in the file sporaically. Such nodes will have to have a child node with attriute `sign`.The value of the sign attribute is used to check for the presence of the parent node in the file.<br/>
`unorderedbegin` and `unorderedend` attributes denote the begenning and end of an unordered section in the `READ` node. For every unordered section, only one node needs to define the begin and end attributes. All the other nodes, including the nodes that define the `unorderedbegin` and `unorderedend` attributes, shall have the `unordered` attribute defined.<br/>
Any child of the `READ` node that is not inside an unordered block or doesnt have the `unordered` attribute set, will be regarded as ordered.<br/>
Whether `int128` or `uint128` are defined depends on your the C implementation you are using on your host. If 128-bit integers are not supported or you need to read in bigger integers, you can simply use a smaller int type and increase the `count` attribute accordingly.<br/>
The `FT::conditional` tag for a type means that the actual content of the node will depend on a value. The attribute `condition` will provide what that condition is. The value for the condition should be provided as text for the different nodes that define what the actual contents should be.<br/>
`size` attribute is currently only meaningful when the `type` attribute is set as `string` in which case it denotes the size of the string.<br/>
## Options
```bash
-h, --help show this help message and exit
--targetname TARGETNAME
main target name
--outdir OUTDIR path to output dir
--structs STRUCTS the structs json file
--structsinclude STRUCTSINCLUDE
the path to the header that's going to be included by
structs.h before structure declarations.
--xml XML paht to the xml file
--dbg debug
--datetime print date and time in autogen files
--inline inlines reader funcs
--static statics reader funcs
--verbose verbose
--forcenullterm terminate all strings with null even if they are not
originally null-terminated
--strbuffersize STRBUFFERSIZE
the size of the buffer for string reads
--strbuffgrowfactor STRBUFFGROWFACTOR
the factor by which the strbuffer will grow
--voidbuffersize VOIDBUFFERSIZE
the size of the buffer for void* buffer
--voidbuffgrowfactor VOIDBUFFGROWFACTOR
the factor by which the voidbuffer will grow
--singlefile the generated code will be put in a single file
--singlefilename SINGLEFILENAME
name of the single file
--name will be used in generating some code identifiers
```
## limitations
Big-Endian reads are not supported.<br/>
None-byte-sized raw reads are not supported.<br/>
## makefile
That would be on you but there is an example makefile in the `test` directory so you can take a look if you want.<br/>
You can also get generic ones from [here](https://github.com/bloodstalker/lazymakefiles). They're licensed under the Unlicense.<br/>
## TODO
All the items under limitations.<br/>
Figure out what the license of the generated code is.<br/>
## Projects
The list of the projects that use faulreiber:<br/>
* [bruiser](https://github.com/bloodstalker/mutator/tree/master/bruiser)<br/>
## License
`faultreiber` is provided under MIT. I'm assuming(I'm not a lawyer) the generated code is considered "derived work". If it is, then the generated code will also fall under MIT.<br/>
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber.svg?type=large)](https://app.fossa.io/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber?ref=badge_large)
|